This document discusses Kubernetes usage at VMware SAAS. It covers dynamic provisioning of applications on Kubernetes, monitoring tools such as Datadog and Log Insight, and best practices for upgrades on Kubernetes clusters. Key points include using stateless applications where possible, service discovery using Kubernetes services, dynamic provisioning using an onboarding service, and performing rolling upgrades for stateful applications to minimize downtime.
2. VMware SAAS
1 K8S Usage
2 Dynamic Provisioning
3 Monitoring & Upgrade
VMware SAAS in the current context refers to CMBU initiative alone
3. VMware SAAS: Overview
• MCM solution combining application orchestration, monitoring, management, costing, etc.
• SaaS first, platform not a product
• Enables DevOps engineers and developers to be more agile and leverage resources from any cloud.
• Built using a varied tech stack
– Distributed micro-services
– Java, Scala, Spring, Guice, Xenon, etc.
– Relational databases, key-value stores, document stores, etc.
4. Types of Application
• Green Field (New) Apps
– Xenon based Micro-services
– Distributed
– Soft-state apps, often categorized as stateless apps
• Brown Field (Existing) Apps
– Existing applications follow an n-tier architecture
– Containerization
– Leverages K8S Pod
– Separate Stateful & Stateless apps
5. Deployments
• Core business logic is modelled as stateless containers
– Java & Scala applications
– Load-based replica sets with autoscaling
• Soft state applications
– Xenon services & distributed task processing
– Distributed applications store data on remote nodes rather than on disk
– Spread pod across nodes (anti-affinity)
• Configure Liveness Probe
– Health Checks API
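The deployment practices above (replicas, anti-affinity, a liveness probe backed by a health-check API) can be sketched as a manifest built in Python. This is a minimal sketch, not the team's actual manifest: the name `build_deployment`, the `/health` path, port 8080, the probe timings, and the `apps/v1` API version are all assumptions for illustration.

```python
import json


def build_deployment(name, image, replicas=3, health_path="/health", port=8080):
    """Build a Deployment manifest as a plain dict (all values are illustrative)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",  # assumed; older clusters used beta API groups
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Prefer placing replicas of the same app on different nodes.
                    "affinity": {
                        "podAntiAffinity": {
                            "preferredDuringSchedulingIgnoredDuringExecution": [
                                {
                                    "weight": 100,
                                    "podAffinityTerm": {
                                        "labelSelector": {"matchLabels": labels},
                                        "topologyKey": "kubernetes.io/hostname",
                                    },
                                }
                            ]
                        }
                    },
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            # The kubelet restarts the container if this check fails.
                            "livenessProbe": {
                                "httpGet": {"path": health_path, "port": port},
                                "initialDelaySeconds": 30,
                                "periodSeconds": 10,
                            },
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical service name and image, printed as JSON for inspection.
    print(json.dumps(build_deployment("task-service", "example/task-service:1.0"), indent=2))
```

The soft anti-affinity rule (`preferred...`) is used here so that pods still schedule when there are fewer nodes than replicas; a hard rule would be a different trade-off.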
6. Stateful Sets
• Persistence layer & distributed applications
• MongoDB
– Containerized, replicated
– Entry point initializes and configures the replica set
– Customized storage class: aws-ebs (type, iops, zone)
• Postgres DB
– Containerized, active-standby
– Streaming replication
– Pgpool
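The customized `aws-ebs` storage class mentioned above (type, iops, zone) might look like the following when built as a manifest dict. This is a sketch under assumptions: the function name, the `io1` volume type, the IOPS figure, and the zone are placeholders, and it uses the in-tree `kubernetes.io/aws-ebs` provisioner parameters, which newer clusters may have replaced.

```python
def build_ebs_storage_class(name, volume_type="io1", iops_per_gb=10, zone="us-west-2a"):
    """Build a StorageClass manifest for EBS-backed volumes (illustrative values)."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "kubernetes.io/aws-ebs",
        # StorageClass parameters are always strings.
        "parameters": {
            "type": volume_type,
            "iopsPerGB": str(iops_per_gb),
            "zone": zone,
        },
    }


if __name__ == "__main__":
    # Hypothetical class name for a MongoDB data volume.
    print(build_ebs_storage_class("mongo-fast"))
```

A StatefulSet would then reference the class by name in its volume claim templates.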
7. Best Practices
• Cloud agnostic
– Avoid AWS specific services
• Prefer stateless
• Service Discovery
– Internal communication
• Uses FQDN of K8S service name
• HTTP, TCP
– External Service communication
• NGINX / HAProxy
• CI/CD pipeline using vRealize Code Stream
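Internal service discovery through the FQDN of a K8S service, as described above, follows the standard cluster DNS pattern `<service>.<namespace>.svc.<cluster-domain>`. The helper names and the example service and namespace below are hypothetical, and `cluster.local` is only the common default domain.

```python
def service_fqdn(service, namespace="default", cluster_domain="cluster.local"):
    """Return the cluster-internal DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def service_url(service, port, namespace="default", scheme="http"):
    """Build a URL that another pod in the cluster could use to reach the Service."""
    return f"{scheme}://{service_fqdn(service, namespace)}:{port}"


if __name__ == "__main__":
    # Hypothetical service "mongo" in namespace "tenant-a".
    print(service_fqdn("mongo", "tenant-a"))
    print(service_url("task-service", 8080, "tenant-a"))
```

Using the full name rather than the short service name keeps calls unambiguous when the caller and the target live in different namespaces.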
9. Dynamic Provisioning
• Use cases
– A single instance of an N-tier or distributed application supports “X” tenants
– Use-case demands such as compliance or data isolation requirements
– Horizontal scaling of application (collection of heterogeneous K8S resources)
• Onboarding Service
– Runs as a deployment inside cluster
– Uses JSON templates to create K8S resources
– Provisions lazily and avoids shuffling
– Orchestrates updates
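One way an onboarding service could turn JSON templates into K8S resources is to parse the template and substitute per-tenant values into its string fields. The sketch below assumes a `${name}` placeholder syntax and a hypothetical Service template; the source does not describe the actual template format.

```python
import json
from string import Template

# Hypothetical template; the placeholders ${app} and ${tenant} are assumptions.
SERVICE_TEMPLATE = """
{
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {"name": "${app}-${tenant}", "labels": {"tenant": "${tenant}"}},
  "spec": {
    "selector": {"app": "${app}", "tenant": "${tenant}"},
    "ports": [{"port": 80, "targetPort": 8080}]
  }
}
"""


def render(node, params):
    """Recursively substitute placeholders in every string value of a parsed template."""
    if isinstance(node, str):
        # substitute() raises KeyError if a placeholder has no value.
        return Template(node).substitute(params)
    if isinstance(node, list):
        return [render(item, params) for item in node]
    if isinstance(node, dict):
        return {key: render(value, params) for key, value in node.items()}
    return node  # numbers, booleans, and null pass through unchanged


def render_template(template_text, params):
    """Parse a JSON template and fill in the per-tenant parameters."""
    return render(json.loads(template_text), params)


if __name__ == "__main__":
    manifest = render_template(SERVICE_TEMPLATE, {"app": "billing", "tenant": "acme"})
    print(json.dumps(manifest, indent=2))
```

Substituting after parsing, rather than on the raw text, means a parameter value containing quotes cannot break the JSON structure.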
10. Dynamic Provisioning
• Control plane
– Built using Xenon
– K8S REST spec client in Java
– Business logic to scale based on the usage & load
– Wraps bootstrap logic using containers
• Challenges
– K8S Version upgrade
– Scale down
– Environment abstractions
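The scaling logic in the control plane is not spelled out in the slides. As one possible reading of "an instance supports X tenants", "lazy provisioning", and "avoid shuffling", the sketch below places each new tenant on an existing instance with spare capacity, provisions a new instance only when none has room, and never moves a tenant once placed. The class names and the naming scheme are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One provisioned application instance (a collection of K8S resources)."""
    name: str
    capacity: int
    tenants: list = field(default_factory=list)

    def has_room(self):
        return len(self.tenants) < self.capacity


class Placement:
    """Assigns tenants to instances lazily and without re-shuffling."""

    def __init__(self, capacity_per_instance):
        self.capacity = capacity_per_instance
        self.instances = []

    def locate(self, tenant):
        """Return the instance already hosting the tenant, or None."""
        for inst in self.instances:
            if tenant in inst.tenants:
                return inst
        return None

    def onboard(self, tenant):
        """Place a tenant; provision a new instance only if none has room."""
        existing = self.locate(tenant)
        if existing is not None:
            return existing  # already placed: never move it
        for inst in self.instances:
            if inst.has_room():
                inst.tenants.append(tenant)
                return inst
        inst = Instance(name=f"instance-{len(self.instances) + 1}", capacity=self.capacity)
        self.instances.append(inst)
        inst.tenants.append(tenant)
        return inst


if __name__ == "__main__":
    placement = Placement(capacity_per_instance=2)
    for tenant in ["t1", "t2", "t3"]:
        print(tenant, "->", placement.onboard(tenant).name)
```

In a real control plane, creating an `Instance` would trigger resource creation through the K8S API; here it is only an in-memory record.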
12. Monitoring
• Tools Used
– VMware vRealize Log Insight for log monitoring (Kibana is good too)
– Datadog (Grafana is good too)
• Support for all standard processes like JVM, RDBMS, Mongo, etc.
• Docker & Kubernetes
• Support for posting custom metrics
– Pingdom
• Users perspective
– PagerDuty
– Status.io
• Communicate health to stakeholders
13. Monitoring
• Application
– Pod contains Datadog agent & Log Insight agent
– Captures application metrics
• Dropwizard metrics & Xenon Stats
• Publishes to Datadog
– Custom monitoring dashboards in Datadog
• DevOps UI
– Uses Kubernetes auth
– Separate from application authn & authz
• Fault Injection
– Simulates Pod failures
– Simulates CPU Usage & network delays
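For the application metrics published to Datadog from within the pod, one common route is the agent's DogStatsD endpoint, which accepts plain-text datagrams of the form `name:value|type|#tag1,tag2` over UDP. The sketch below assumes the agent listens on the default port 8125 on localhost; the metric names and tags are made up, and the source does not say whether the team used DogStatsD or another reporter.

```python
import socket


def format_metric(name, value, metric_type="g", tags=None):
    """Format one DogStatsD datagram, e.g. 'app.queue.depth:42|g|#env:staging'."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Send the datagram to the agent over UDP (fire-and-forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical gauge for a task queue, tagged by service and environment.
    send_metric(format_metric("app.queue.depth", 42, "g", ["service:task", "env:staging"]))
```

Metric type `g` is a gauge and `c` is a counter; tags are what make per-service dashboards possible on the Datadog side.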
14. Upgrade
• Side by side Upgrade
– Preferred option for soft state (deployment) applications
– Pauses user requests momentarily to minimize downtime
• Rolling upgrade
– Preferred option for Stateful sets
– Avoid transformations and provide backward compatibility
• Backup & Restore
– Soft states are backed up to S3.
– EBS volumes are periodically snapshotted to S3
– Periodic restore to Staging
– DR scenarios
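For the rolling-upgrade option on StatefulSets listed above, Kubernetes replaces pods one at a time from the highest ordinal down, and an optional `partition` limits the update to ordinals at or above that value. The sketch below builds the corresponding patch body and lists the resulting update order; the function names and the `mongo` example are assumptions, and applying the patch through a client is left out.

```python
def rolling_update_patch(partition=0):
    """Patch body that sets a StatefulSet's update strategy to RollingUpdate."""
    return {
        "spec": {
            "updateStrategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"partition": partition},
            }
        }
    }


def update_order(name, replicas, partition=0):
    """Pods a rolling update will replace, in order (highest ordinal first)."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas - 1, partition - 1, -1)]


if __name__ == "__main__":
    # Hypothetical 3-replica MongoDB StatefulSet.
    print(rolling_update_patch())
    print(update_order("mongo", 3))
    print(update_order("mongo", 3, partition=2))
```

Setting a partition first and lowering it step by step gives a staged rollout, which pairs well with the advice to keep changes backward compatible while old and new pods coexist.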