GE Digital
Monitoring Cloud Foundry
Using Sensu and Graphite
GE Digital and Predix
Predix
Cloud Platform for Industrial Internet of Things
Aviation
Transportation
Oil and Gas
Power and Water
Healthcare
Predix and Cloud Foundry
It all started simply…
From POC to Production
Built a Development CF Environment in a few months
Showed that developers could get apps to the MVP phase quickly!
Leadership challenge: Deliver 4 production apps to a paying customer in 3 months
Game On! Time to operationalize CF!
Step One: Monitor all the things!
Have you *looked* at all_the_CF_things?
Cloud Controller
NATS
Runners
HM9000
Go Routers
ETCD
UAA
UAA DB
Cloud Controller DB
Doppler
Loggregator
BOSH
External Load balancers
Consul
MUCH WOW
SUCH AMAZE!
This is going to be challenging!
CloudWatch?!?
Time to build…
Monitoring and Metrics Goals
• Automatic
• Utility Service
• Extensible
• Driven by Configuration Management
Build MVP Solution
• Monitoring Framework
• Data Visualization
• Metrics Collection
Sensu
Sensu Architecture
• Sensu Clients: Check Execution, External data input
• RabbitMQ Cluster: Message Transport
• Redis: Health Check State
• Sensu Servers: Publish Check Requests, Process Events
• Sensu APIs (Sensu API Servers): REST interface to Monitoring System
Service Check Flow
1. Sensu Servers publish check requests to subscriber queues
2. Sensu Clients listen to queue 'subscriptions' and execute commands
3. Sensu Clients publish check execution responses to the message queue
4. Sensu Servers process check responses and update check state in Redis
5. Trigger handlers if configured
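The five steps of the check flow can be sketched in a few lines of Python. This is only an illustration under loud assumptions: in-memory queues stand in for RabbitMQ, a plain dict stands in for Redis, and the function names (publish_check_request, run_client, process_results) are hypothetical, not Sensu APIs. Handlers here fire on any non-OK result, which simplifies how real Sensu decides that an event has occurred.

```python
import queue
import subprocess

# Hypothetical in-memory stand-ins for the RabbitMQ queues and the Redis state store.
subscription_queues = {"prod-DEA": queue.Queue()}
results_queue = queue.Queue()
check_state = {}


def publish_check_request(name, command, subscribers):
    """Step 1: the server publishes a check request to each subscriber queue."""
    for subscription in subscribers:
        subscription_queues[subscription].put({"name": name, "command": command})


def run_client(client_name, subscription):
    """Steps 2-3: a client takes a request, runs the command, publishes the result."""
    request = subscription_queues[subscription].get_nowait()
    proc = subprocess.run(
        request["command"], shell=True, capture_output=True, text=True
    )
    results_queue.put({
        "client": client_name,
        "check": {
            "name": request["name"],
            "status": proc.returncode,  # the exit code carries the check state
            "output": proc.stdout.strip(),
        },
    })


def process_results(handlers):
    """Steps 4-5: the server records check state, then triggers handlers."""
    while not results_queue.empty():
        result = results_queue.get_nowait()
        key = (result["client"], result["check"]["name"])
        check_state[key] = result["check"]["status"]
        if result["check"]["status"] != 0:  # simplification: non-OK means "event"
            for handler in handlers:
                handler(result)
```

Calling publish_check_request, then run_client, then process_results with a list of callables walks one check through all five steps.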
Service Check Anatomy
• Command or script to run, which outputs data to STDOUT or STDERR
• Produces exit codes to indicate state
  • 0 - OK, 1 - Warning, 2 - Critical, >=3 - Custom
• Optional response payload (JSON)
• Subscribers
  • Group of nodes that should execute the check
• Check Interval
• Handlers
  • Actions to take on event (if any)
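As a rough sketch of that anatomy, a check can be any program that prints a message and returns one of the exit codes above. The example below is an assumption-laden stand-in for a disk usage check, not the check-disk-usage.rb plugin itself; the function names and the 80/90 defaults are illustrative.

```python
import shutil

OK, WARNING, CRITICAL = 0, 1, 2


def evaluate_disk_usage(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to an (exit_code, message) pair."""
    if used_percent >= crit:
        return CRITICAL, f"CRITICAL: disk usage {used_percent:.1f}% >= {crit:.0f}%"
    if used_percent >= warn:
        return WARNING, f"WARNING: disk usage {used_percent:.1f}% >= {warn:.0f}%"
    return OK, f"OK: disk usage {used_percent:.1f}%"


def main(path="/"):
    """Measure usage for one mount point, print the message, return the exit code."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    code, message = evaluate_disk_usage(used_percent)
    print(message)  # the check output goes to STDOUT
    return code

# A standalone check script would finish with: sys.exit(main())
```

Keeping the threshold logic in a pure function makes the check easy to test without touching a real filesystem.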
Service Check Request
{
  "checks": {
    "check_disk_usage": {
      "command": "check-disk-usage.rb -w :::disk.warn|80::: -c :::disk.crit|90:::",
      "subscribers": [
        "prod-DEA"
      ],
      "handlers": [
        "pagerduty"
      ],
      "interval": 60
    }
  }
}
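The :::disk.warn|80::: syntax in the command is a token: the client fills it from its own attributes and falls back to the value after the pipe. The Python below imitates that behavior to show the idea; it is not Sensu's implementation, the substitute_tokens name is hypothetical, and raising KeyError for a token with no match and no default is an assumption.

```python
import re

# Matches :::dotted.key::: or :::dotted.key|default:::
TOKEN = re.compile(r":::([^:|]+)(?:\|([^:]*))?:::")


def substitute_tokens(command, client):
    """Replace tokens in a command using a nested dict of client attributes."""
    def resolve(match):
        path, default = match.group(1), match.group(2)
        value = client
        for key in path.split("."):
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                value = None
                break
        if value is None:
            if default is None:
                raise KeyError(f"unmatched token: {path}")
            return default
        return str(value)

    return TOKEN.sub(resolve, command)
```

With a client of {"disk": {"warn": 70}}, the command above would resolve to "check-disk-usage.rb -w 70 -c 90": the warning threshold comes from the client, the critical threshold from the default.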
Take Action! - Sensu Event Handlers
• Handlers are actions executed by a Sensu server when events are received
  • Examples: send a PagerDuty alert, send a metric to Graphite, send to IRC
• 4 Handler Types
  • Pipe - External commands that consume event data via STDIN
  • TCP / UDP - Forward event data to external TCP/UDP sockets
  • Transport - Publish event data to a named message queue
  • Handler Sets - Bundle of handlers, e.g. email, Slack, PagerDuty
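A pipe handler is the simplest of the four types to sketch: it is a command that reads one JSON event from STDIN and acts on it. The example below assumes the event carries "client" and "check" objects with "name", "status", and "output" fields; the format_alert and handle names are hypothetical, and the alert is only formatted here rather than sent anywhere.

```python
import json


def format_alert(event):
    """Build a one-line alert message from a Sensu-style event dict."""
    client = event["client"]["name"]
    check = event["check"]
    severity = {0: "OK", 1: "WARNING", 2: "CRITICAL"}.get(check["status"], "UNKNOWN")
    return f"{severity}: {client}/{check['name']}: {check['output'].strip()}"


def handle(stream):
    """Read one JSON event from a stream (STDIN in a real pipe handler)."""
    event = json.load(stream)
    return format_alert(event)

# A standalone handler script would finish with: print(handle(sys.stdin))
```

Taking the stream as a parameter keeps the handler testable with an in-memory buffer instead of a live Sensu server.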
Metrics
Graphite - Grafana
Graphite Architecture
• Carbon Relay: Metrics Ingest & Routing, Consistent Hashing - Replicas
• Carbon Cache - WhisperDB: Metrics Storage
• Graphite API: Metrics Retrieval
• Grafana: Data Visualization, Dashboards
• Sensu Graphite Handler (Sensu Servers): Process Metrics Events from message queue
• Sensu Client: Graphite Metrics, Health Checks (REST)
Monitoring Cloud Foundry
Automatic Coverage
• Created BOSH release of Sensu-Client
• Sensu-Client Job included in all BOSH deployments
• Every node belongs to 'All' Sensu Subscription by default
• Now capturing base Linux stats for ALL nodes
  • CPU, Memory, Network, Disk stored in Graphite
Metric Names
Allow easy aggregation of stats in Graphite!
uswest02-pr-cf.runner_z1.0.interface.eth0.txBytes
• BOSH Deployment: uswest02-pr-cf
• BOSH Job: runner_z1
• Index: 0
• Metric: interface.eth0.txBytes
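The naming scheme can be sketched as a pair of helpers that build and split the dotted path, plus a formatter for Carbon's plaintext protocol (one "path value timestamp" line per sample). The helper names are hypothetical, and the sketch assumes deployment and job names contain no dots.

```python
def metric_path(deployment, job, index, metric):
    """Join the four naming components into one dotted Graphite path."""
    return f"{deployment}.{job}.{index}.{metric}"


def parse_metric_path(path):
    """Split a path into its components; the metric itself may contain dots."""
    deployment, job, index, metric = path.split(".", 3)
    return {
        "deployment": deployment,
        "job": job,
        "index": int(index),
        "metric": metric,
    }


def carbon_line(path, value, timestamp):
    """Format one sample as a newline-terminated Carbon plaintext line."""
    return f"{path} {value} {timestamp}\n"
```

Because the job index is its own path segment, a wildcard target such as sumSeries(uswest02-pr-cf.runner_z1.*.interface.eth0.txBytes) aggregates across every instance of a job.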
Cloud Foundry Visibility
• CF 'Collector' - Open Source project
• Listens on NATS bus for CF subsystem announcements
• Polls each component's /healthz and /varz endpoints
• Publishes results directly to Graphite via Carbon daemon
• Being phased out in favor of Doppler / Nozzle implementation
Sweet Dashboard bro
I can haz automated pagerduty alertz pleas?!?
Sensu Graphite Data Check
$ curl "http://graphite-api/render?target=sumSeries(USW02-PR-COLLECTOR.CloudController.*.*.healthy)&from=-5min&until=now&format=json&maxDataPoints=100"
• Sensu Client: HTTP Health Checks
• Graphite API: Metrics Retrieval
• Carbon Cache - WhisperDB: Metrics Storage
{
  "target": "sumSeries(USW02-PR-COLLECTOR.CloudController.*.*.healthy)",
  "datapoints":
    [[12.0, 1463551380], [12.0, 1463551440], [12.0, 1463551500], [12.0, 1463551560], [12.0, 1463551620]]
}
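One way to turn that response into a Sensu check result is to take the most recent non-null datapoint and compare it with a threshold. The sketch below assumes a hypothetical min_healthy threshold and hypothetical function names; it accepts either a single series object as shown above or a list of series, since the Graphite render API wraps series in a JSON array.

```python
import json

OK, CRITICAL = 0, 2


def latest_value(datapoints):
    """Return the most recent non-null value from [[value, timestamp], ...]."""
    for value, _timestamp in reversed(datapoints):
        if value is not None:
            return value
    return None


def check_healthy(response_text, min_healthy):
    """Map a Graphite JSON response to an (exit_code, message) pair."""
    body = json.loads(response_text)
    if isinstance(body, list):
        if not body:
            return CRITICAL, "CRITICAL: no series returned"
        body = body[0]
    value = latest_value(body["datapoints"])
    if value is None:
        return CRITICAL, "CRITICAL: no datapoints returned"
    if value < min_healthy:
        return CRITICAL, f"CRITICAL: {value:g} healthy < {min_healthy}"
    return OK, f"OK: {value:g} healthy"
```

Treating missing data as critical is a design choice in this sketch: if the collector stops publishing, the check alerts instead of passing silently.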
Questions?
@barrows_jeff
jeff.barrows@ge.com
General Electric Company reserves the right to make changes in specifications and features, or discontinue the product or service described at
any time, without notice or obligation. These materials do not constitute a representation, warranty or documentation regarding the product or
service featured. Illustrations are provided for informational purposes, and your configuration may differ. This information does not constitute legal,
financial, coding, or regulatory advice in connection with your use of the product or service. Please consult your professional advisors for any such
advice. No part of this document may be distributed, reproduced or posted without the express written permission of General Electric Company.
GE, Predix and the GE Monogram are trademarks of General Electric Company. ©2015 General Electric Company – All rights reserved.

Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Cf summit-2016-monitoring-cf-sensu-graphite

  • 1. GE Digital: Monitoring Cloud Foundry Using Sensu and Graphite
  • 2. GE Digital and Predix
  • 3. Predix: Cloud Platform for Industrial Internet of Things
  • 7. Power and Water
  • 9. Predix and Cloud Foundry
  • 10. It all started simply… From POC to Production
      • Built a Development CF Environment in a few months
      • Leadership challenge: Deliver 4 production apps to a paying customer in 3 months
      • Game On! Time to operationalize CF!
      • Showed Developers could get apps to MVP phase quickly!
  • 11. Step One: Monitor all the things! Have you *looked* at all_the_CF_things?
  • 12. Cloud Controller, NATS, Runners, HM9000, Go Routers, ETCD, UAA, UAA DB, Cloud Controller DB, Doppler, Loggregator, BOSH, External Load Balancers, Consul. MUCH WOW. SUCH AMAZE.
  • 13. SUCH AMAZE! This is going to be challenging!
  • 15. Time to build…
  • 16. Monitoring and Metrics Goals: Automatic, Utility Service, Extensible, Driven by Configuration Management
  • 17. Build MVP Solution: Monitoring Framework, Metrics Collection, Data Visualization
  • 19. Sensu Architecture
      • Sensu Clients: check execution, external data input
      • RabbitMQ Cluster: message transport
      • Redis: health check state
      • Sensu Servers: publish check requests, process events
      • Sensu API Servers: REST interface to the monitoring system
  • 20. Service Check Flow
      1. Sensu Server publishes check request to subscriber queues
      2. Sensu Clients listen to queue ‘subscriptions’ and execute commands
      3. Sensu Clients publish check execution responses to message queue
      4. Sensu Servers process check responses and update check state in Redis
      5. Trigger handlers if configured
  • 21. Service Check Anatomy
      • Command or script to run, which outputs data to STDOUT or STDERR
          • Produces exit codes to indicate state
          • 0 - OK, 1 - Warning, 2 - Critical, >=3 - Custom
          • Optional response payload (JSON)
      • Subscribers
          • Group of nodes that should execute the check
      • Check Interval
      • Handlers
          • Actions to take on event (if any)
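As an illustration of the check anatomy on this slide, here is a minimal sketch of a check written in Python. It is not the `check-disk-usage.rb` plugin referenced in the deck; the function names, message format, and default thresholds are assumptions chosen for the example. The only convention taken from the slide is that the check prints to STDOUT and signals state through its exit code.

```python
import shutil

# Exit codes as listed on the slide: 0 OK, 1 Warning, 2 Critical.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate_disk_usage(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to (exit_code, message).

    The thresholds are hypothetical defaults for this sketch.
    """
    if used_percent >= crit:
        return CRITICAL, f"CheckDisk CRITICAL: {used_percent:.1f}% used"
    if used_percent >= warn:
        return WARNING, f"CheckDisk WARNING: {used_percent:.1f}% used"
    return OK, f"CheckDisk OK: {used_percent:.1f}% used"


def main(path="/"):
    """Measure usage of `path`, print the result, and return the exit code."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    code, message = evaluate_disk_usage(used_percent)
    print(message)  # the check output goes to STDOUT
    return code
```

A script built this way would end with `sys.exit(main())` so that the monitoring client sees the state as the process exit status.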
  • 22. Service Check Request
        {
          "checks": {
            "check_disk_usage": {
              "command": "check-disk-usage.rb -w :::disk.warn|80::: -c :::disk.crit|90:::",
              "subscribers": ["prod-DEA"],
              "handlers": ["pagerduty"],
              "interval": 60
            }
          }
        }
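The `:::disk.warn|80:::` placeholders in the command above are Sensu check tokens: the client replaces each one with the named attribute from its own configuration, falling back to the value after `|` when the attribute is absent. The following is a rough Python sketch of that substitution, not Sensu's actual implementation; `substitute_tokens`, `lookup`, and the `KeyError` for a missing token without a default are assumptions made for illustration.

```python
import re

# Matches :::dotted.path::: or :::dotted.path|default:::
TOKEN = re.compile(r":::([^:|]+?)(?:\|([^:]*))?:::")


def lookup(attributes, dotted_path):
    """Walk a nested dict by a dotted path; return None if a key is missing."""
    node = attributes
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def substitute_tokens(command, client_attributes):
    """Replace every token in `command` with a client attribute or its default."""
    def replace(match):
        path, default = match.group(1), match.group(2)
        value = lookup(client_attributes, path)
        if value is None:
            if default is None:
                # Simplification for this sketch: fail loudly on an unmatched token.
                raise KeyError(f"unmatched token: {path}")
            return default
        return str(value)

    return TOKEN.sub(replace, command)


command = "check-disk-usage.rb -w :::disk.warn|80::: -c :::disk.crit|90:::"
# A client that overrides only the warning threshold (hypothetical attributes).
print(substitute_tokens(command, {"disk": {"warn": 70}}))
```

This lets one check definition serve every subscriber while individual nodes tune their own thresholds.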
  • 23. Take Action! - Sensu Event Handlers
      • Handlers are actions executed by a Sensu server when events are received
          • Send PagerDuty alert, send metric to Graphite, send to IRC
      • 4 Handler Types
          • Pipe - External commands that consume event data via STDIN
          • TCP / UDP - Forward event data to external TCP/UDP sockets
          • Transport - Publish event data to a named message queue
          • Handler Sets - Bundle of handlers, e.g. email, Slack, PagerDuty
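To make the pipe handler type concrete, here is a small Python sketch of a handler that reads one event as JSON from STDIN and turns it into a one-line alert. The event fields used (`client.name`, `check.name`, `check.status`, `check.output`) follow the usual Sensu event layout; the function names, the alert format, and the sample event are assumptions for illustration, and the sketch only formats the message rather than delivering it anywhere.

```python
import io
import json
import sys

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def format_alert(event):
    """Build a one-line alert string from a Sensu-style event dict."""
    client = event["client"]["name"]
    check = event["check"]
    status = STATUS_NAMES.get(check["status"], "UNKNOWN")
    return f"{status} {client}/{check['name']}: {check['output'].strip()}"


def handle(stream=sys.stdin):
    """Read a single JSON event from `stream` (STDIN for a pipe handler)."""
    event = json.load(stream)
    return format_alert(event)


# Hypothetical event, fed through an in-memory stream instead of STDIN.
sample = io.StringIO(json.dumps({
    "client": {"name": "runner_z1-0"},
    "check": {
        "name": "check_disk_usage",
        "status": 2,
        "output": "CheckDisk CRITICAL: 93.0% used\n",
    },
}))
print(handle(sample))
```

A real handler would pass the resulting message on to a notification service; that delivery step is left out here on purpose.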
  • 25. Graphite Architecture
      • Carbon Relay: metrics ingest & routing, consistent hashing, replicas
      • Carbon Cache / WhisperDB: metrics storage
      • Graphite API: metrics retrieval
      • Grafana: data visualization dashboards
      • Sensu Servers with Sensu Graphite Handler: process metrics events from message queue
      • Sensu Client: Graphite metrics health checks (REST)
  • 26. Monitoring Cloud Foundry
  • 27. Automatic Coverage
      • Created BOSH release of Sensu-Client
      • Sensu-Client job included in all BOSH deployments
      • Every node belongs to the ‘All’ Sensu subscription by default
      • Now capturing base Linux stats for ALL nodes
      • CPU, Memory, Network, Disk stored in Graphite
  • 28. Metrics Names: allow easy aggregation of stats in Graphite!
      • Example: uswest02-pr-cf.runner_z1.0.interface.eth0.txBytes
      • uswest02-pr-cf = BOSH Deployment, runner_z1 = BOSH Job, 0 = Index, interface.eth0.txBytes = Metric
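The naming scheme on this slide (deployment, job, index, then the metric path) is what makes wildcard aggregation possible. As a sketch only, the Python below splits a name into those four parts and builds a Graphite target that aggregates across every index of a job. `MetricName`, `parse_metric_name`, and `aggregate_target` are hypothetical helpers; `sumSeries` is the Graphite function that also appears in the data check later in the deck.

```python
from typing import NamedTuple


class MetricName(NamedTuple):
    deployment: str  # BOSH deployment
    job: str         # BOSH job
    index: int       # BOSH job index
    metric: str      # remaining dotted metric path


def parse_metric_name(path):
    """Split 'deployment.job.index.metric...' into its four parts."""
    deployment, job, index, metric = path.split(".", 3)
    return MetricName(deployment, job, int(index), metric)


def aggregate_target(deployment, job, metric, func="sumSeries"):
    """Build a Graphite target that aggregates one metric across all job indexes."""
    return f"{func}({deployment}.{job}.*.{metric})"


name = parse_metric_name("uswest02-pr-cf.runner_z1.0.interface.eth0.txBytes")
print(name)
print(aggregate_target(name.deployment, name.job, name.metric))
```

Because the index sits at a fixed position, replacing it with `*` selects the same metric on every instance of the job.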
  • 30. Cloud Foundry Visibility
      • CF ‘Collector’ - Open Source project
      • Listens on NATS bus for CF subsystem announcements
      • Polls components' /healthz and /varz endpoints
      • Publishes results directly to Graphite via Carbon daemon
      • Being phased out in favor of Doppler / Nozzle implementation
  • 33. Sweet Dashboard bro. I can haz automated pagerduty alertz pleas?!?
  • 34. Sensu Graphite Data Check
      • Sensu Client: HTTP health checks. Graphite API: metrics retrieval. Carbon Cache / WhisperDB: metrics storage.
      • Request:
        $ curl "http://graphite-api/render?target=sumSeries(USW02-PR-COLLECTOR.CloudController.*.*.healthy)&from=-5min&until=now&format=json&maxDataPoints=100"
      • Response:
        {
          "target": "sumSeries(USW02-PR-COLLECTOR.CloudController.*.*.healthy)",
          "datapoints": [[12.0, 1463551380], [12.0, 1463551440], [12.0, 1463551500], [12.0, 1463551560], [12.0, 1463551620]]
        }
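One way such a data check could be turned into a Sensu-style result is sketched below in Python. It builds the render URL, then evaluates a response shaped like the one on the slide without making any network call. The helper names, the `minimum_healthy` threshold, and the choice of exit code 3 for "no data" are assumptions for this example, not details from the talk.

```python
import json
from urllib.parse import urlencode


def render_url(base, target, window="-5min"):
    """Build a Graphite render URL that asks for JSON output."""
    query = urlencode({"target": target, "from": window,
                       "until": "now", "format": "json"})
    return f"{base}/render?{query}"


def latest_value(datapoints):
    """Return the most recent non-null value from [value, timestamp] pairs."""
    for value, _timestamp in reversed(datapoints):
        if value is not None:
            return value
    return None


def evaluate(series, minimum_healthy):
    """Map one series to (exit_code, message); the threshold is hypothetical."""
    value = latest_value(series["datapoints"])
    if value is None:
        return 3, "UNKNOWN: no datapoints in window"
    if value < minimum_healthy:
        return 2, f"CRITICAL: {value:g} healthy, expected at least {minimum_healthy}"
    return 0, f"OK: {value:g} healthy"


# Response body as shown on the slide.
body = """{"target": "sumSeries(USW02-PR-COLLECTOR.CloudController.*.*.healthy)",
 "datapoints": [[12.0, 1463551380], [12.0, 1463551440], [12.0, 1463551500],
                [12.0, 1463551560], [12.0, 1463551620]]}"""
print(evaluate(json.loads(body), minimum_healthy=12))
```

Skipping null datapoints matters because Graphite can return gaps for intervals that have not been written yet.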
  • 35. Monitoring Cloud Foundry. Questions? @barrows_jeff, jeff.barrows@ge.com
  • 36. General Electric Company reserves the right to make changes in specifications and features, or discontinue the product or service described at any time, without notice or obligation. These materials do not constitute a representation, warranty or documentation regarding the product or service featured. Illustrations are provided for informational purposes, and your configuration may differ. This information does not constitute legal, financial, coding, or regulatory advice in connection with your use of the product or service. Please consult your professional advisors for any such advice. No part of this document may be distributed, reproduced or posted without the express written permission of General Electric Company. GE, Predix and the GE Monogram are trademarks of General Electric Company. ©2015 General Electric Company – All rights reserved.