
Cloud-Native Application Debugging with Envoy and Service Mesh

Microservices have been great for accelerating software innovation and delivery, but they also present new challenges: abstractions and automated orchestration at every layer can make pinpointing an issue feel like walking through a maze blindfolded. Existing tools weren’t designed for distributed environments, and new tools need to leverage these abstraction layers to better observe, test, and troubleshoot issues.
Christian Posta walks you through Envoy Proxy and service mesh architecture for the L7 data plane, the key features in Envoy that can help in debugging and troubleshooting, chaos engineering as a testing methodology for microservices, how to approach a testing and debugging framework for microservices, and new open source tools that address these areas. You’ll explore a workflow to discover and resolve microservices issues, including injecting experiments for stress testing the applications, gathering requests in flight, recording and replaying them, and debugging them step by step without affecting production traffic.

  1. Cloud-native Application Debugging with Envoy and Service Mesh. Christian Posta, Field CTO, Solo.io
  2. 2 | Copyright © 2020 CHRISTIAN POSTA, Global Field CTO, Solo.io. @christianposta, christian@solo.io, https://blog.christianposta.com, https://slideshare.net/ceposta
  3. Approximate flow of workshop: 01 Challenges of microservices, debugging; 02 Introduction to our lab environment; 03 Distributed tracing with a service mesh; 04 Debugging microservices; 05 Debugging in production with record and replay; 06 Proactive debugging with chaos experimentation
  4. Moving to microservices?
  5. Microservices and Kubernetes
  6. Move fast, safely. https://puppet.com/resources/whitepaper/state-of-devops-report
  7. SERVICE MESH JOURNEY INNOVATION MODERNIZE TO MICROSERVICES SERVICE MESH MANAGEMENT ANY MESH - ANYWHERE ADAPTIVE SERVICE MESH
  8. Award Winning Innovation. Key Industry Collaborations. December 11, 2018: 2018 TOP WOMEN ENTREPRENEURS IN CLOUD INNOVATION. Seventh Annual Award Honors Women Founders for Outstanding Accomplishments in Cloud and Emerging Technologies, Sponsored by Facebook, Intel, and Google.
  9. The problem
  10. As we move to services architectures on cloud-native deployment platforms, we increase the complexity between our services.
  11. Cloud application networking challenges • Service discovery • Retries • Timeouts • Load balancing • Rate limiting • Thread bulkheading • Circuit breaking
  12. Cloud application networking challenges • Edge/DMZ routing • Surgical / fine / per-request routing • A/B rollout • Traffic shaping • Request racing • Internal releases / dark launches • Request shadowing • Fault injection
  13. Cloud application networking challenges • Adaptive, zone-aware routing • Deadlines • Health checking • Stats and metrics collection • Logging • Distributed tracing • Security
  14. How do we begin to understand what’s happening so we can debug?
  15. How we typically like to solve this problem:
  16. Decentralized, language-independent observability in the network. Foundational technology to help solve these challenges in a cloud-native application architecture
  17. Envoy is to Application Networking what Kubernetes is to Container Deployment. http://envoyproxy.io
  18. Envoy implements: • zone-aware, least-request load balancing • circuit breaking • outlier detection • retries, retry policies • timeouts (including budgets) • traffic shadowing • request racing • rate limiting • access logging, statistics collection • Many other features!
  19. Envoy to do application networking heavy lifting
  20. Why Envoy? • C++ • Built from the ground up for services environments • Large, diverse, vibrant community • Dynamic configuration model • Highly extensible (in C++; we’ll come back to this) • Many out-of-the-box L7 filters (HTTP, HTTP2, gRPC, Redis, MySQL, DynamoDB, Thrift, ZooKeeper, Kafka, et al.) • Incredible trove of telemetry and tracing out of the box • Very versatile deployment options (as we’ll see)
  21. Versatility of Envoy: Edge proxy
  22. Versatility of Envoy: Middle proxy
  23. Versatility of Envoy: Service proxy
  24. Control plane for managing mesh of service proxies
  25. Service proxy lives with application instance
  26. Service mesh technologies provide the following: • Service discovery / Load balancing • Secure service-to-service communication • Traffic control / shaping / shifting • Policy / intention-based access control • Traffic metric collection • Service resilience • API / programmable interface
  27. These application-networking technologies provide a nice API for programming our network
  28. Setting up the lab environment
  29. http://bit.ly/debug-microservices
  30. http://bit.ly/debug-microservices
  31. http://bit.ly/debug-microservices
  32. Consul Service Mesh
  33. Consul Service Mesh: connect = { proxy = { config = { upstreams = [ { destination_name = "mysql", local_bind_port = 8001 } ] } } }
  34. Consul Service Mesh
  35. Tracing with a service mesh
  36. DEBUGGING IN PRODUCTION (diagram labels: CLUSTER; POD 1; POD 2; POD 3; POD 4; DB; S3)
  37. DEBUGGING IN PRODUCTION (diagram labels: CLUSTER; POD 1; POD 2; POD 3; POD 4; P P P P; DB; S3)
  38. Lab: Distributed Tracing
  39. Debugging
  40. THE PROBLEM
  41. THE PROBLEM: A MONOLITHIC APPLICATION CONSISTS OF A SINGLE PROCESS. AN ATTACHED DEBUGGER ALLOWS VIEWING THE COMPLETE STATE OF THE APPLICATION DURING RUNTIME. A MICROSERVICES APPLICATION CONSISTS OF POTENTIALLY HUNDREDS OF PROCESSES. IS IT POSSIBLE TO GET A COMPLETE VIEW OF THE STATE OF SUCH AN APPLICATION?
  42. Demo: multi-language, distributed debugging with Squash
  43. SQUASH: DEFAULT MODE; SECURE MODE
  44. SQUASH DEFAULT MODE (diagram labels: Node; Namespace: ns-a; Namespace: squash; s-dlv; c1)
  45. -> ls -l /proc/self/ns
      total 0
      lrwxrwxrwx 1 idit idit 0 Dec 7 01:14 cgroup -> cgroup:[4026531835]
      lrwxrwxrwx 1 idit idit 0 Dec 7 01:14 ipc -> ipc:[4026531839]
      lrwxrwxrwx 1 idit idit 0 Dec 7 01:14 mnt -> mnt:[4026531840]
      lrwxrwxrwx 1 idit idit 0 Dec 7 01:14 net -> net:[4026532009]
      lrwxrwxrwx 1 idit idit 0 Dec 7 01:14 pid -> pid:[4026531836]
      lrwxrwxrwx 1 idit idit 0 Dec 7 01:14 pid_for_children -> pid:[4026531836]
      lrwxrwxrwx 1 idit idit 0 Dec 7 01:14 user -> user:[4026531837]
      lrwxrwxrwx 1 idit idit 0 Dec 7 01:14 uts -> uts:[4026531838]
      -> inode of the mnt namespace (a unique identifier for the container namespace), obtained via the CRI API call ExecSyncRequest
      We need to translate the PID of the process (the application that runs in the container) to the host PID namespace to allow the debugger to attach.
      (diagram labels: Node; Namespace: ns-a; Namespace: Squash; s-dlv; CRI; c1)
  46. SQUASH SECURE MODE (diagram labels: Node; Namespace: ns-a; Namespace: squash; s-dlv; c1; CRD; Intent; squash)
  47. DOCS: HTTPS://SQUASH.SOLO.IO GITHUB: HTTPS://GITHUB.COM/SOLO-IO/SQUASH COMMUNITY: HTTPS://SLACK.SOLO.IO
  48. Break: 3:00p – 3:30p. When we come back: Debugging microservices lab. NOTE: Make sure to charge your devices!
  49. Lab: Squash
  50. Debugging in production
  51. DEBUGGING IN PRODUCTION (diagram labels: CLUSTER; POD 1; POD 2; POD 3; POD 4; DB; S3) > ONLY HEADER WILL BE SENT > SAMPLING
  52. DEBUGGING IN PRODUCTION (diagram labels: CLUSTER; POD 1; POD 2; POD 3; POD 4; P P P P; DB; S3) > ONLY HEADER WILL BE SENT > SAMPLING
  53. DEBUGGING IN PRODUCTION (diagram labels: CLUSTER; P P P P; DB; S3)
  54. DEBUGGING IN PRODUCTION (diagram labels: CLUSTER; P P P P; DB; S3)
  55. DEBUGGING IN PRODUCTION ++
  56. Getting traffic into your mesh. Workflow-specific APIs for Envoy Proxy
  57. Versatility of Envoy: Edge proxy
  58. Envoy needs a control plane.
  59. API Gateway built on Envoy. https://github.com/solo-io/gloo
  60. Gloo Data Plane and Control Plane (diagram labels: EXTERNAL AUTH; RATE LIMITING; GLOO FILTERS; ROUTER; UPSTREAM; EXTERNAL AUTH SERVER; RATE LIMITING SERVER; CACHING; DATA LOSS PREVENTION; LAMBDA; NATS.IO; TRANSFORMATION; WEB APPLICATION FIREWALL (WAF))
  61. API Gateway built on Envoy (diagram labels: ENVIRONMENT; SECRET; CONFIGURATION; Data Plane; Upstream; gRPC-JSON transcoder; Rate limiting; External AUTH; …; Control Plane: Configure and manage Envoy’s plugins; Router)
  62. Gloo API Gateway • Unify backend APIs running in Kubernetes, VMs, physical machines, FaaS, etc. • Decentralized configuration: allow service teams to move fast • Declarative configuration • Provides a control plane for Envoy • Security (OAuth/OIDC, API Key, TLS, SNI, OPA, HMAC, custom) • Kubernetes native; runs outside Kubernetes as well • Highly pluggable/extensible • “If you know Kubernetes, you know Gloo” (user quote)
  63. Lab: Using Loop with Gloo
  64. DOCS: COMING REAL SOON … GITHUB: COMING REAL SOON … COMMUNITY: HTTPS://SLACK.SOLO.IO
  65. Demo: Loop with service mesh
  66. Proactive debugging
  67. CHAOS ENGINEERING: THINK OF A VACCINE OR A FLU SHOT: INJECT YOURSELF WITH SOMETHING HARMFUL IN ORDER TO PREVENT A FUTURE ISSUE. CAREFULLY INJECT THIS HARM INTO YOUR SYSTEMS TO TEST THE SYSTEM’S ABILITY TO RESPOND TO IT. “BREAK THINGS ON PURPOSE” IN ORDER TO LEARN HOW TO BUILD MORE RESILIENT SYSTEMS.
  68. PROBLEMS WITH CHAOS ENGINEERING TODAY? 1. LANGUAGE SPECIFIC 2. CODE MODIFICATION
  69. NETWORK ABSTRACTION (diagram labels: EAST-WEST TRAFFIC; NORTH-SOUTH TRAFFIC; SERVICE I; SERVICE II; SERVICE III; SERVICE IV; SERVICE V)
  70. CONTROL EXPERIMENT • DEFINE EXPERIMENTS (SET OF: MESSAGE DELAYS, NETWORK FAULTS) • RUN EVERY INTERVAL (E.G. EVERY FRIDAY AT 9PM) • GATHER METRICS AND COMPARE AGAINST BASELINE • STOP EXPERIMENT IF CONDITION REACHED
  71. GLOOSHOT: GLOOSHOT ALLOWS YOU TO PERFORM CHAOS EXPERIMENTS AT THE SERVICE MESH LEVEL. DEFINE ERROR CONDITIONS IN TERMS OF FAILURE MODES SUCH AS: • MESSAGE DELAYS • NETWORK FAULTS. RUN EXPERIMENTS UNTIL A STOP CONDITION IS MET. GLOOSHOT INTERFACES WITH ALL MAJOR SERVICE MESHES THROUGH SERVICE MESH INTERFACE (SMI).
  72. Demo: Glooshot
  73. DOCS: HTTPS://GLOOSHOT.SOLO.IO GITHUB: HTTPS://GITHUB.COM/SOLO-IO/GLOOSHOT COMMUNITY: HTTPS://SLACK.SOLO.IO
  74. What to watch for: upcoming improvements to keep an eye out for
  75. WebAssembly shaking up the data plane
  76. WebAssembly shaking up the data plane. https://github.com/envoyproxy/envoy-wasm
  77. WebAssembly shaking up the data plane. https://webassemblyhub.io
  78. THANK YOU FOR COMING OUT! @christianposta, christian@solo.io, https://blog.christianposta.com, https://slideshare.net/ceposta
  79. • https://solo.io • https://slack.solo.io • https://gloo.solo.io • https://envoyproxy.io • https://istio.io • https://webassemblyhub.io • https://servicemeshhub.io • https://blog.christianposta.com
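
Slide 45 identifies a container by the inode of its mnt namespace, as printed by `ls -l /proc/self/ns`. A minimal Python sketch of reading that identifier is shown here. The helper names `parse_ns_link` and `mnt_namespace_inode` are hypothetical and are not part of Squash, and the second helper assumes a Linux host with /proc mounted.

```python
import os
import re
from typing import Tuple

# Namespace symlink targets look like "mnt:[4026531840]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target: str) -> Tuple[str, int]:
    """Split a /proc/<pid>/ns/* symlink target into (namespace kind, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link target: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def mnt_namespace_inode(pid: str = "self") -> int:
    """Return the mnt namespace inode for a process (Linux only)."""
    kind, inode = parse_ns_link(os.readlink(f"/proc/{pid}/ns/mnt"))
    if kind != "mnt":
        raise ValueError(f"unexpected namespace kind: {kind!r}")
    return inode
```

Two processes that report the same inode share a mount namespace, which is what makes the value usable as a container identifier when matching a container to a host PID.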

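
The control-experiment workflow on slide 70 (define faults, gather metrics, compare against a baseline, stop when a condition is reached) can be sketched in Python. This is not Glooshot's API: `Experiment`, `Outcome`, `run_experiment`, and the error-rate threshold are hypothetical names chosen for illustration, and the metric samples are passed in rather than scraped from a mesh.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Experiment:
    name: str
    delay_ms: int          # injected message delay (assumed fault type)
    max_error_rate: float  # stop condition: abort once this rate is exceeded


@dataclass
class Outcome:
    stopped_early: bool  # True if the stop condition was reached
    samples_seen: int    # number of metric samples consumed
    worst_delta: float   # largest observed increase over the baseline


def run_experiment(exp: Experiment, baseline: float,
                   samples: Iterable[float]) -> Outcome:
    """Consume error-rate samples until they run out or the stop condition hits."""
    seen = 0
    worst = 0.0
    for rate in samples:
        seen += 1
        worst = max(worst, rate - baseline)
        if rate > exp.max_error_rate:
            return Outcome(stopped_early=True, samples_seen=seen, worst_delta=worst)
    return Outcome(stopped_early=False, samples_seen=seen, worst_delta=worst)
```

Scheduling ("run every interval") is left out of the sketch; a cron job or similar scheduler could call `run_experiment` with fresh samples each time.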