SlideShare a Scribd company logo
1 of 25
Download to read offline
Reveal Your Deepest
Kubernetes Secrets Metrics
Kubernetes Colorado - 10/25/2017
Bob Cotton - FreshTracks.io
About Me
● Co-Founder - FreshTracks.io - A CA Accelerator Incubation
● bob@freshtracks.io
● @bob_cotton
Agenda
● Short overview of Prometheus
● Sources of metrics
○ Node
○ kubelet and containers
○ Kubernetes API
○ etcd
○ Derived metrics (kube-state-metrics)
● The new K8s metrics server
● Horizontal pod auto-scaler
● K8s cluster hierarchies and metrics aggregation
Prometheus
Prometheus - A Short Overview
● Based on monitoring patterns at Google
● Open sourced by SoundCloud in early 2015
● Second project admitted to the Cloud Native Computing Foundation after Kubernetes
● Adoption surge is tracking Kubernetes
○ 63% of teams using Kubernetes use Prometheus
● Metric discovery and collection
● High-performance timeseries database
● Alert definition and generation
Prometheus - Unique Features
● Supports “Metrics 2.0” - i.e. metrics contain any number of label/values
● “Pull based” metrics collection
● Service discovery mechanism
● Rich set of “exporters”
● Extremely high-performance TSDB
● PromQL
● Alert Manager
● Easily installable from Helm
● Grafana typically used for visualization
Labeled Series - aka Metrics 2.0
APIServiceRegistrationController_adds{
instance="172.20.107.203:443",
job="kubernetes-apiservers"} 1
apiserver_request_count{
client="kubectl/v1.7.5 (darwin/amd64)kubernetes/17d7182",
code="200",
contentType="application/json",
instance="172.20.78.248:443",
job="kubernetes-apiservers",
resource="pods",
verb="GET"} 20
apiserver_request_count{
client="kubectl/v1.7.5 (darwin/amd64) kubernetes/17d7182",
code="200",
contentType="application/json",
instance="172.20.78.248:443",
job="kubernetes-apiservers",
resource="pods",
verb="LIST"} 30
172_20_78_248.apiserver.pods.get.request_count = 20
Prometheus
K8s API Server
TSDB
Kublet
(cAdvisor)
node-exporter
kube_state_metrics
App containers
other exporters
node_exporter
App containers
Kublet
(cAdvisor)
Service Discovery
Prometheus Exposition Format and Exporters
● The Prometheus exposition format - Text over http
○ Efforts to make it a standard
● Supported by Sysdig and the TICK collector
● Close to 100 exporters for various technologies
● The jmx_exporter can cover any Java/JMX application
● https://prometheus.io/docs/instrumenting/exporters/
Demo
● “Official Exporters”
○ node_exporter
○ jmx_exporter
○ snmp_exporter
○ haproxy_exporter
○ cloudwatch_exporter
○ collectd_exporter
○ mysql_exporter
○ memcached_exporter
Sources of Metrics in Kubernetes
Host Metrics from the node_exporter
● Standard Host Metrics
○ Load Average
○ CPU
○ Memory
○ Disk
○ Network
○ Many others
● ~1000 Unique series in a typical node
Demo
Node
node_exporter
/metrics
Container Metrics from cAdvisor
● cAdvisor is embedded into the kubelet, so we
scrape the kubelet to get container metrics
● These are the so-called “core” metrics
● For each container on the node:
○ CPU Usage (user and system) and time throttled
○ Filesystem read/writes/limits
○ Memory usage and limits
○ Network transmit/receive/dropped
Demo
Node
node_exporter
/metrics
kubelet
cAdvisor
Kubernetes Metrics from the K8s API Server
● Metrics about the performance of the K8s API Server
○ Performance of controller work queues
○ Request Rates and Latencies
○ Etcd helper cache work queues and cache performance
○ General process status (File Descriptors/Memory/CPU
Seconds)
○ Golang status (GC/Memory/Threads)
Node
node_exporter
kubelet
cAdvisor
Any other Pod
/metrics
API Server
Etcd Metrics from etcd
● Etcd is “master of all truth” within a K8s cluster
○ Leader existence and leader change rate
○ Proposals committed/applied/pending/failed
○ Disk write performance
○ Network and gRPC counters
K8s Derived Metrics from kube-state-metrics
● Counts and meta-data about many K8s types
○ Counts of many “nouns”
○ Resource Limits
○ Container states
■ ready/restarts/running/terminated/waiting
○ _labels series just carries labels from Pods
● cronjob
● daemonset
● deployment
● horizontalpodautoscaler
● job
● limitrange
● namespace
● node
● persistentvolumeclaim
● pod
● replicaset
● replicationcontroller
● resourcequota
● service
● statefulset
Sources of Metrics in Kubernetes
● Node via the node_exporter
● Container metrics via the kubelet and cAdvisor
● Kubernetes API server
● etcd
● Derived metrics via kube-state-metrics
Scheduling and Autoscaling
i.e. The Metrics Pipeline
The New “Metrics Server”
● Replaces Heapster
● Standard (versioned and auth) API aggregated into the K8s API Server
● In “beta” in K8s 1.8
● Used by the scheduler and (eventually) the Horizontal Pod Autoscaler
● A stripped-down version of Heapster
● Reports on “core” metrics (CPU/Memory/Network) gathered from cAdvisor
● For internal to K8s use only.
● Pluggable for custom metrics
Feeding the Horizontal Pod Autoscaler
● Before the metrics server the HPA utilized Heapster for it’s Core metrics
○ This will be the metrics-server going forward
● API Adapter will bridge to third party monitoring system
○ e.g. Prometheus
Kubernetes
Hierarchy and Aggregation
Core Metrics Aggregation
● K8s clusters form a hierarchy
● We can aggregate the “core” metrics to any level
● This allows for some interesting monitoring
opportunities
○ Using P8s “recording rules” aggregate the core metrics
at every level
○ Insights into all levels of your Kubernetes cluster
● This also applies to any custom application metric
Demo
Namespace
Deployment
Pod
Container
FreshTracks.io is Hiring!
https://searchjobs.ca.com/go/Accelerator/3279000/
Questions?
Resources
● Prometheus.io
● Core Metrics in Kubelet
● Kubernetes monitoring architecture
● What is the new metrics-server?

More Related Content

What's hot

Kafka on Kubernetes—From Evaluation to Production at Intuit
Kafka on Kubernetes—From Evaluation to Production at Intuit Kafka on Kubernetes—From Evaluation to Production at Intuit
Kafka on Kubernetes—From Evaluation to Production at Intuit
confluent
 

What's hot (20)

Managing multiple event types in a single topic with Schema Registry | Bill B...
Managing multiple event types in a single topic with Schema Registry | Bill B...Managing multiple event types in a single topic with Schema Registry | Bill B...
Managing multiple event types in a single topic with Schema Registry | Bill B...
 
Spark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on KubernetesSpark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on Kubernetes
 
Kafka on Kubernetes—From Evaluation to Production at Intuit
Kafka on Kubernetes—From Evaluation to Production at Intuit Kafka on Kubernetes—From Evaluation to Production at Intuit
Kafka on Kubernetes—From Evaluation to Production at Intuit
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
PGConf APAC 2018 - Managing replication clusters with repmgr, Barman and PgBo...
PGConf APAC 2018 - Managing replication clusters with repmgr, Barman and PgBo...PGConf APAC 2018 - Managing replication clusters with repmgr, Barman and PgBo...
PGConf APAC 2018 - Managing replication clusters with repmgr, Barman and PgBo...
 
Sprint 42 review
Sprint 42 reviewSprint 42 review
Sprint 42 review
 
Glance Updates - Kilo Edition
Glance Updates - Kilo EditionGlance Updates - Kilo Edition
Glance Updates - Kilo Edition
 
Serverless ETL and Optimization on ML pipeline
Serverless ETL and Optimization on ML pipelineServerless ETL and Optimization on ML pipeline
Serverless ETL and Optimization on ML pipeline
 
Optimizing {Java} Application Performance on Kubernetes
Optimizing {Java} Application Performance on KubernetesOptimizing {Java} Application Performance on Kubernetes
Optimizing {Java} Application Performance on Kubernetes
 
Kafka’s New Control Plane: The Quorum Controller | Colin McCabe, Confluent
Kafka’s New Control Plane: The Quorum Controller | Colin McCabe, ConfluentKafka’s New Control Plane: The Quorum Controller | Colin McCabe, Confluent
Kafka’s New Control Plane: The Quorum Controller | Colin McCabe, Confluent
 
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
 
Virtual training Intro to Kapacitor
Virtual training  Intro to Kapacitor Virtual training  Intro to Kapacitor
Virtual training Intro to Kapacitor
 
2016 08-30 Kubernetes talk for Waterloo DevOps
2016 08-30 Kubernetes talk for Waterloo DevOps2016 08-30 Kubernetes talk for Waterloo DevOps
2016 08-30 Kubernetes talk for Waterloo DevOps
 
DevOpsDays Taipei 2019 - Mastering IaC the DevOps Way
DevOpsDays Taipei 2019 - Mastering IaC the DevOps WayDevOpsDays Taipei 2019 - Mastering IaC the DevOps Way
DevOpsDays Taipei 2019 - Mastering IaC the DevOps Way
 
Analyzing MySQL Logs with ClickHouse, by Peter Zaitsev
Analyzing MySQL Logs with ClickHouse, by Peter ZaitsevAnalyzing MySQL Logs with ClickHouse, by Peter Zaitsev
Analyzing MySQL Logs with ClickHouse, by Peter Zaitsev
 
War Stories: DIY Kafka
War Stories: DIY KafkaWar Stories: DIY Kafka
War Stories: DIY Kafka
 
Event sourcing - what could possibly go wrong ? Devoxx PL 2021
Event sourcing  - what could possibly go wrong ? Devoxx PL 2021Event sourcing  - what could possibly go wrong ? Devoxx PL 2021
Event sourcing - what could possibly go wrong ? Devoxx PL 2021
 
Architectural patterns for high performance microservices in kubernetes
Architectural patterns for high performance microservices in kubernetesArchitectural patterns for high performance microservices in kubernetes
Architectural patterns for high performance microservices in kubernetes
 
Managing Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native WayManaging Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native Way
 
Introducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task AutomationIntroducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task Automation
 

Viewers also liked

Detecting Events on the Web in Real Time with Java, Kafka and ZooKeeper - Jam...
Detecting Events on the Web in Real Time with Java, Kafka and ZooKeeper - Jam...Detecting Events on the Web in Real Time with Java, Kafka and ZooKeeper - Jam...
Detecting Events on the Web in Real Time with Java, Kafka and ZooKeeper - Jam...
JAXLondon2014
 
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
Brian Grant
 

Viewers also liked (12)

The Snake and the Butler
The Snake and the ButlerThe Snake and the Butler
The Snake and the Butler
 
oVirt CI Package Managenent
oVirt CI Package ManagenentoVirt CI Package Managenent
oVirt CI Package Managenent
 
Kubernetes and bluemix
Kubernetes  and  bluemixKubernetes  and  bluemix
Kubernetes and bluemix
 
Frontera: open source, large scale web crawling framework
Frontera: open source, large scale web crawling frameworkFrontera: open source, large scale web crawling framework
Frontera: open source, large scale web crawling framework
 
Detecting Events on the Web in Real Time with Java, Kafka and ZooKeeper - Jam...
Detecting Events on the Web in Real Time with Java, Kafka and ZooKeeper - Jam...Detecting Events on the Web in Real Time with Java, Kafka and ZooKeeper - Jam...
Detecting Events on the Web in Real Time with Java, Kafka and ZooKeeper - Jam...
 
Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...
Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...
Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...
 
StormCrawler in the wild
StormCrawler in the wildStormCrawler in the wild
StormCrawler in the wild
 
A brief study on Kubernetes and its components
A brief study on Kubernetes and its componentsA brief study on Kubernetes and its components
A brief study on Kubernetes and its components
 
Orchestrating Microservices with Kubernetes
Orchestrating Microservices with Kubernetes Orchestrating Microservices with Kubernetes
Orchestrating Microservices with Kubernetes
 
Business use of Social Media and Impact on Enterprise Architecture
Business use of Social Media and Impact on Enterprise ArchitectureBusiness use of Social Media and Impact on Enterprise Architecture
Business use of Social Media and Impact on Enterprise Architecture
 
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
 
Deep-dive into Microservice Outer Architecture
Deep-dive into Microservice Outer ArchitectureDeep-dive into Microservice Outer Architecture
Deep-dive into Microservice Outer Architecture
 

Similar to Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017

kubernetesssssssssssssssssssssssssss.pdf
kubernetesssssssssssssssssssssssssss.pdfkubernetesssssssssssssssssssssssssss.pdf
kubernetesssssssssssssssssssssssssss.pdf
bchiriamina2
 
Kubernetes for Beginners
Kubernetes for BeginnersKubernetes for Beginners
Kubernetes for Beginners
DigitalOcean
 

Similar to Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017 (20)

OSMC 2019 | Monitoring Cockpit for Kubernetes Clusters by Ulrike Klusik
OSMC 2019 | Monitoring Cockpit for Kubernetes Clusters by Ulrike KlusikOSMC 2019 | Monitoring Cockpit for Kubernetes Clusters by Ulrike Klusik
OSMC 2019 | Monitoring Cockpit for Kubernetes Clusters by Ulrike Klusik
 
kubernetesssssssssssssssssssssssssss.pdf
kubernetesssssssssssssssssssssssssss.pdfkubernetesssssssssssssssssssssssssss.pdf
kubernetesssssssssssssssssssssssssss.pdf
 
Kubernetes #2 monitoring
Kubernetes #2   monitoring Kubernetes #2   monitoring
Kubernetes #2 monitoring
 
Monitoring Cockpit for OpenShift Clusters
Monitoring Cockpit for OpenShift ClustersMonitoring Cockpit for OpenShift Clusters
Monitoring Cockpit for OpenShift Clusters
 
Kubernetes for Beginners
Kubernetes for BeginnersKubernetes for Beginners
Kubernetes for Beginners
 
Kubernetes Monitoring & Best Practices
Kubernetes Monitoring & Best PracticesKubernetes Monitoring & Best Practices
Kubernetes Monitoring & Best Practices
 
Getting started with Kubernetes and Azure DevOps
Getting started with Kubernetes and Azure DevOpsGetting started with Kubernetes and Azure DevOps
Getting started with Kubernetes and Azure DevOps
 
Introduction to Container Storage Interface (CSI)
Introduction to Container Storage Interface (CSI)Introduction to Container Storage Interface (CSI)
Introduction to Container Storage Interface (CSI)
 
Kubermatic How to Migrate 100 Clusters from On-Prem to Google Cloud Without D...
Kubermatic How to Migrate 100 Clusters from On-Prem to Google Cloud Without D...Kubermatic How to Migrate 100 Clusters from On-Prem to Google Cloud Without D...
Kubermatic How to Migrate 100 Clusters from On-Prem to Google Cloud Without D...
 
How to Migrate 100 Clusters from On-Prem to Google Cloud Without Downtime
How to Migrate 100 Clusters from On-Prem to Google Cloud Without DowntimeHow to Migrate 100 Clusters from On-Prem to Google Cloud Without Downtime
How to Migrate 100 Clusters from On-Prem to Google Cloud Without Downtime
 
Kubernetes #1 intro
Kubernetes #1   introKubernetes #1   intro
Kubernetes #1 intro
 
Docker on docker leveraging kubernetes in docker ee
Docker on docker leveraging kubernetes in docker eeDocker on docker leveraging kubernetes in docker ee
Docker on docker leveraging kubernetes in docker ee
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloud
 
Monitoring kubernetes with prometheus-operator
Monitoring kubernetes with prometheus-operatorMonitoring kubernetes with prometheus-operator
Monitoring kubernetes with prometheus-operator
 
Autoscaling in kubernetes v1
Autoscaling in kubernetes v1Autoscaling in kubernetes v1
Autoscaling in kubernetes v1
 
Implementing Observability for Kubernetes.pdf
Implementing Observability for Kubernetes.pdfImplementing Observability for Kubernetes.pdf
Implementing Observability for Kubernetes.pdf
 
OSDC 2018 | Monitoring Kubernetes at Scale by Monica Sarbu
OSDC 2018 | Monitoring Kubernetes at Scale by Monica SarbuOSDC 2018 | Monitoring Kubernetes at Scale by Monica Sarbu
OSDC 2018 | Monitoring Kubernetes at Scale by Monica Sarbu
 
Google Kubernetes Engine Deep Dive Meetup
Google Kubernetes Engine Deep Dive MeetupGoogle Kubernetes Engine Deep Dive Meetup
Google Kubernetes Engine Deep Dive Meetup
 
Monitoring Kubernetes with Elasticsearch Services - Ted Jung, Consulting Arch...
Monitoring Kubernetes with Elasticsearch Services - Ted Jung, Consulting Arch...Monitoring Kubernetes with Elasticsearch Services - Ted Jung, Consulting Arch...
Monitoring Kubernetes with Elasticsearch Services - Ted Jung, Consulting Arch...
 
ОЛЕГ МАЦЬКІВ «Crash course on Operator Framework» Lviv DevOps Conference 2019
ОЛЕГ МАЦЬКІВ «Crash course on Operator Framework» Lviv DevOps Conference 2019ОЛЕГ МАЦЬКІВ «Crash course on Operator Framework» Lviv DevOps Conference 2019
ОЛЕГ МАЦЬКІВ «Crash course on Operator Framework» Lviv DevOps Conference 2019
 

Recently uploaded

Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
HenryBriggs2
 

Recently uploaded (20)

DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdf
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stage
 

Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017

  • 1. Reveal Your Deepest Kubernetes Secrets Metrics Kubernetes Colorado - 10/25/2017 Bob Cotton - FreshTracks.io
  • 2. About Me ● Co-Founder - FreshTracks.io - A CA Accelerator Incubation ● bob@freshtracks.io ● @bob_cotton
  • 3. Agenda ● Short overview of Prometheus ● Sources of metrics ○ Node ○ kubelet and containers ○ Kubernetes API ○ etcd ○ Derived metrics (kube-state-metrics) ● The new K8s metrics server ● Horizontal pod auto-scaler ● K8s cluster hierarchies and metrics aggregation
  • 5. Prometheus - A Short Overview ● Based on monitoring patterns at Google ● Open sourced by SoundCloud in early 2015 ● Second project admitted to the Cloud Native Computing Foundation after Kubernetes ● Adoption surge is tracking Kubernetes ○ 63% of teams using Kubernetes use Prometheus ● Metric discovery and collection ● High-performance timeseries database ● Alert definition and generation
  • 6. Prometheus - Unique Features ● Supports “Metrics 2.0” - i.e. metrics contain any number of label/values ● “Pull based” metrics collection ● Service discovery mechanism ● Rich set of “exporters” ● Extremely high-performance TSDB ● PromQL ● Alert Manager ● Easily installable from Helm ● Grafana typically used for visualization
  • 7. Labeled Series - aka Metrics 2.0 APIServiceRegistrationController_adds{ instance="172.20.107.203:443", job="kubernetes-apiservers"} 1 apiserver_request_count{ client="kubectl/v1.7.5 (darwin/amd64)kubernetes/17d7182", code="200", contentType="application/json", instance="172.20.78.248:443", job="kubernetes-apiservers", resource="pods", verb="GET"} 20 apiserver_request_count{ client="kubectl/v1.7.5 (darwin/amd64) kubernetes/17d7182", code="200", contentType="application/json", instance="172.20.78.248:443", job="kubernetes-apiservers", resource="pods", verb="LIST"} 30 172_20_78_248.apiserver.pods.get.request_count = 20
  • 8. Prometheus K8s API Server TSDB Kublet (cAdvisor) node-exporter kube_state_metrics App containers other exporters node_exporter App containers Kublet (cAdvisor) Service Discovery
  • 9. Prometheus Exposition Format and Exporters ● The Prometheus exposition format - Text over http ○ Efforts to make it a standard ● Supported by Sysdig and the TICK collector ● Close to 100 exporters for various technologies ● The jmx_exporter can cover any Java/JMX application ● https://prometheus.io/docs/instrumenting/exporters/ Demo ● “Official Exporters” ○ node_exporter ○ jmx_exporter ○ snmp_exporter ○ haproxy_exporter ○ cloudwatch_exporter ○ collectd_exporter ○ mysql_exporter ○ memcached_exporter
  • 10. Sources of Metrics in Kubernetes
  • 11. Host Metrics from the node_exporter ● Standard Host Metrics ○ Load Average ○ CPU ○ Memory ○ Disk ○ Network ○ Many others ● ~1000 Unique series in a typical node Demo Node node_exporter /metrics
  • 12. Container Metrics from cAdvisor ● cAdvisor is embedded into the kubelet, so we scrape the kubelet to get container metrics ● These are the so-called “core” metrics ● For each container on the node: ○ CPU Usage (user and system) and time throttled ○ Filesystem read/writes/limits ○ Memory usage and limits ○ Network transmit/receive/dropped Demo Node node_exporter /metrics kubelet cAdvisor
  • 13. Kubernetes Metrics from the K8s API Server ● Metrics about the performance of the K8s API Server ○ Performance of controller work queues ○ Request Rates and Latencies ○ Etcd helper cache work queues and cache performance ○ General process status (File Descriptors/Memory/CPU Seconds) ○ Golang status (GC/Memory/Threads) Node node_exporter kubelet cAdvisor Any other Pod /metrics API Server
  • 14. Etcd Metrics from etcd ● Etcd is “master of all truth” within a K8s cluster ○ Leader existence and leader change rate ○ Proposals committed/applied/pending/failed ○ Disk write performance ○ Network and gRPC counters
  • 15. K8s Derived Metrics from kube-state-metrics ● Counts and meta-data about many K8s types ○ Counts of many “nouns” ○ Resource Limits ○ Container states ■ ready/restarts/running/terminated/waiting ○ _labels series just carries labels from Pods ● cronjob ● daemonset ● deployment ● horizontalpodautoscaler ● job ● limitrange ● namespace ● node ● persistentvolumeclaim ● pod ● replicaset ● replicationcontroller ● resourcequota ● service ● statefulset
  • 16. Sources of Metrics in Kubernetes ● Node via the node_exporter ● Container metrics via the kubelet and cAdvisor ● Kubernetes API server ● etcd ● Derived metrics via kube-state-metrics
  • 17. Scheduling and Autoscaling i.e. The Metrics Pipeline
  • 18. The New “Metrics Server” ● Replaces Heapster ● Standard (versioned and auth) API aggregated into the K8s API Server ● In “beta” in K8s 1.8 ● Used by the scheduler and (eventually) the Horizontal Pod Autoscaler ● A stripped-down version of Heapster ● Reports on “core” metrics (CPU/Memory/Network) gathered from cAdvisor ● For internal to K8s use only. ● Pluggable for custom metrics
  • 19.
  • 20. Feeding the Horizontal Pod Autoscaler ● Before the metrics server the HPA utilized Heapster for it’s Core metrics ○ This will be the metrics-server going forward ● API Adapter will bridge to third party monitoring system ○ e.g. Prometheus
  • 22. Core Metrics Aggregation ● K8s clusters form a hierarchy ● We can aggregate the “core” metrics to any level ● This allows for some interesting monitoring opportunities ○ Using P8s “recording rules” aggregate the core metrics at every level ○ Insights into all levels of your Kubernetes cluster ● This also applies to any custom application metric Demo Namespace Deployment Pod Container
  • 25. Resources ● Prometheus.io ● Core Metrics in Kubelet ● Kubernetes monitoring architecture ● What is the new metrics-server?