Netflix: From Zero to Production-Ready in Minutes (QCon 2017)

Slides from Tim Bozarth's (@timbozarth) QCon 2017 presentation (https://qconnewyork.com/ny2017/presentation/zero-production-ready-minutes)

Abstract:
The fabric of Netflix's approach to building new highly-available services is evolving. The Runtime Platform Team is focused on improving developer productivity while simultaneously making it simpler to build and maintain the high-availability services that Netflix expects. Starting with application generation, and leveraging a new approach to communication between services (RPC), we're simplifying what's needed to build a fast, reliable, and optimized service capable of delivering a fantastic customer experience.

We'll be sharing how Netflix is enabling engineers to go from "zero" to "production ready" in minutes, incorporating best practices learned through years in the cloud. We will also share the story of transitioning from our home-grown RPC machinery to open-source standards, how we recognized when it was the right time to walk away from our own creations, and how our new approach is improving team velocity across Netflix engineering.

Published in: Engineering

  1. 1. From Zero to Production-Ready in Minutes Tim Bozarth @timbozarth
  2. 2. Dev Experience: Level up your Eng Effectiveness
  3. 3. Agenda: (1) It was the best of times...; (2) Best practices made easy; (3) Goodbye hand-written clients; (4) From NIH to OSS
  4. 4. Part 1: It was the best of times… (i.e., the story of the skeletons in our closet)
  5. 5. About Netflix: 100M+ members; 1,000+ developers; 190+ countries; 1/3 of US download traffic; 500+ microservices; over 100,000 VMs
  6. 6. Runtime Platform Enable developers to productively create and integrate software in the Netflix ecosystem.
  7. 7. Major Investments in Platform
  8. 8. High Availability = Winning Moments of Truth
  9. 9. High Availability =
  10. 10. Challenges: Hard to take advantage of evolving best practices; Owning client-side logic is complex and stressful; Non-Java experience is hard
  11. 11. Challenges:
  12. 12. Productivity++ (availability is table stakes)
  13. 13. Complexity is the mind killer.
  14. 14. Runtime Platform Enable developers to productively create and integrate software in the Netflix ecosystem.
  15. 15. Part 2: Best practices made easy (Better living through less complexity)
  16. 16. Generators
  17. 17. Generators What: Gives you a deployed app on the “paved road” in minutes.
  18. 18. Generators Why: To make it easy to adopt, understand, and build production-ready apps.
  19. 19. + Best Practices
  20. 20. Historically: “Let’s go!”
  21. 21. With Generators: “Let’s go!”
  22. 22. + + + + + =
  23. 23. But wait! There’s more! (Consistency)
  24. 24. Components != PaaS
  25. 25. Part 3: Goodbye hand-written client libraries
  26. 26. @Netflix every service owner is responsible for a client
  27. 27. Clients defend themselves from failure (and are the foundation of much of Netflix’s micro-service success)
  28. 28. Diagram: Your service; Their service; Your Client
  29. 29. Diagram: Your service; Your Client. Your Client (RPC Internals): Platform Integration; Serialization & Deserialization; Bespoke business logic. Includes integration with Metrics, Caching, Discovery, Fallbacks, etc.
  30. 30. Diagram: Your Service; Dependencies. Your Client (RPC Internals): Platform Integration; Serialization & Deserialization; Bespoke business logic. Their Server (RPC Internals): Server Logic; Platform Integration; Serialization & Deserialization
  31. 31. Problems: Server-API changes are a nightmare; So much hand-written RPC-related code; No cross-language client story
  32. 32. These are solvable problems
  33. 33. +
  34. 34. 2 big wins: Code Generation; New Abstraction Layer
  35. 35. PROTO
  36. 36. Diagram: Your Service; Dependencies; Service Proto. Your Client: gRPC Generated Client; Bespoke business logic (please no). Their Server: gRPC generated interfaces; Server Logic
  37. 37. Caching, Circuit-breakers, Fallbacks, Failure Injection, Discovery, Request-context-tracing, Metrics, Retries, Hedged Requests
  38. 38. Interceptors encapsulate common patterns (outside the user’s typical concern domain)
  39. 39. Client Defense Examples: • Fallbacks • Advanced Caching • Retries • Failure Injection • Hedged Requests • Circuit Breakers (Hystrix) • Common analytics & event-logs • ... and much more
  40. 40. Complex, multi-tier caching took a lot of code.
  41. 41. (In proto) (In client config)
  42. 42. gRPC ❤ languages!
  43. 43. Part 4: NIH → OSS
  44. 44. Value Effort
  45. 45. Value Effort
  46. 46. With every step comes the decision to take another.
  47. 47. Inertia is a powerful force, and a terrible strategy.
  48. 48. Favor commodity when it’s not our core competency (oh right! AWS!)
  49. 49. Wrapping up… Ω
  50. 50. Everything discussed is done: gRPC = 10%+ of Netflix RPC; 800+ projects made with generators; 100+ services currently deployed from generators; This stuff = Default for 6-12 months
  51. 51. Code generation is the short & long term solution; IDLs = micro-services’ best friend; Don’t build stuff you don’t need to
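
Slides 37-39 describe interceptors that wrap a generated client with common defensive patterns such as retries, fallbacks, and caching. The slides do not show any code, so the following is only a minimal sketch of that general idea in plain Python. It does not use the gRPC interceptor API, and it is not Netflix's implementation. Every name in it (Call, Interceptor, retry, fallback, cache, build_client, FlakyTransport) is hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List

# A "call" takes a request and returns a response. It stands in for a
# generated client stub method; this is an assumption made for illustration.
Call = Callable[[str], str]
# An interceptor takes a call and returns a new call with extra behavior.
Interceptor = Callable[[Call], Call]


def retry(max_attempts: int) -> Interceptor:
    """Retry the wrapped call on ConnectionError, up to max_attempts tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def wrap(next_call: Call) -> Call:
        def call(request: str) -> str:
            for attempt in range(max_attempts):
                try:
                    return next_call(request)
                except ConnectionError:
                    # Re-raise only when the final attempt has failed.
                    if attempt == max_attempts - 1:
                        raise
            raise AssertionError("unreachable")
        return call
    return wrap


def fallback(default: str) -> Interceptor:
    """Return a fixed default response if the wrapped call fails."""
    def wrap(next_call: Call) -> Call:
        def call(request: str) -> str:
            try:
                return next_call(request)
            except ConnectionError:
                return default
        return call
    return wrap


def cache(store: Dict[str, str]) -> Interceptor:
    """Serve repeated requests from an in-memory dict (single tier only)."""
    def wrap(next_call: Call) -> Call:
        def call(request: str) -> str:
            if request not in store:
                store[request] = next_call(request)
            return store[request]
        return call
    return wrap


def build_client(transport: Call, interceptors: List[Interceptor]) -> Call:
    """Compose interceptors; the first in the list is the outermost wrapper."""
    call = transport
    for interceptor in reversed(interceptors):
        call = interceptor(call)
    return call


class FlakyTransport:
    """Hypothetical remote service that fails a fixed number of times first."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def __call__(self, request: str) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("simulated outage")
        return f"response:{request}"
```

As a usage example, build_client(FlakyTransport(failures=2), [fallback("default"), retry(3)]) yields a client whose calls are retried up to three times before the fallback value is returned. The calling code contains none of that logic, which is the separation of concerns slide 38 points at.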
