Fact-based Monitoring
PuppetConf 2014
Alexis Lê-Quôc @alq
CTO at Datadog
Poll: Monitoring makes me…
happy
proud
cry
want to hide
Puppet brings Automation to Systems Management
Improve Monitoring the way Puppet has improved Systems Management
“The good old days”
• Your “CMDB” was Excel
• SSH in and hack away
• Little time for anything else
Then Puppet came…
• Expressive rules that capture expected result
• Using facts and classifiers, a.k.a. metadata, to figure out where to apply changes
• That freed up a lot of our time*
* on a per-machine basis
“Puppet brings immunity of configuration to change in infrastructure”
–Me (just now)
I have seen this before…
“[SQL brings] immunity of application to change in storage structure and access strategy”
–C.J. Date (1977)
http://www.cs.berkeley.edu/~brewer/cs262/SystemR.pdf
SQL
• 1974 IBM introduces System R and its Structured Query Language
• Expressive rules that capture expected result
• Using facts and predicates, a.k.a. metadata, to figure out what data to get
• That freed up a lot of development time
SQL
• From a time-consuming, imperative mess (“how”)
• … to expressive data queries (“what”)
SQL query
SELECT (desired facts)

FROM (existing facts)

WHERE (matching criteria)
Puppet
• From a time-consuming, imperative mess (“how”)
• … to expressive configuration queries (“what”)
puppet apply
CHANGE (desired facts)

FROM (existing puppet facts)

WHERE (matching puppet classes)
Is there a pattern?
“Break free from ever more complex naming conventions for hostnames as a means of identity. Use a very rich set of meta data provided by each machine to address them.”
–MCollective overview
MCollective
• From a time-consuming, imperative mess (“how”)
• … to expressive orchestration queries (“what”)
mco rpc service restart service=nginx -F webpool=A
EXEC (desired actions)

FROM (existing puppet facts)

WHERE (matching puppet classes)
Back to monitoring
• Monitoring is to behavior what Puppet is to configuration
• Monitoring is to behavior what MCollective is to orchestration
Monitoring
• From a time-consuming, imperative mess (“how”)
• … to expressive monitoring queries (“what”)
Monitoring query
MONITOR (desired behavior)

FROM (existing heartbeats/metrics)

WHERE (matching puppet facts)
Examples
• “All provisioned web servers in the production environment, datacenter ABC must respond to queries within 200ms”
• “All PostgreSQL servers must have a postgres: bgwriter process running”
• “At least one ActiveMQ server is up to support MCollective”
• Never mention a hostname
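The MONITOR / FROM / WHERE shape and the examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any monitoring tool's API: the `Node` record, the `select` and `violations` helpers, and the sample fact and metric names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A place where facts occur. The name is carried along but never queried."""
    name: str
    facts: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)


def select(nodes, **wanted):
    """WHERE: keep the nodes whose facts match every wanted key/value pair."""
    return [n for n in nodes if all(n.facts.get(k) == v for k, v in wanted.items())]


def violations(nodes, metric, predicate):
    """MONITOR: return the nodes whose metric is missing or fails the predicate."""
    return [n for n in nodes
            if metric not in n.metrics or not predicate(n.metrics[metric])]


# Hypothetical inventory; in practice the facts would come from Puppet/Facter.
inventory = [
    Node("a", {"role": "web", "environment": "production", "datacenter": "ABC"},
         {"latency_ms": 120}),
    Node("b", {"role": "web", "environment": "production", "datacenter": "ABC"},
         {"latency_ms": 250}),
    Node("c", {"role": "web", "environment": "staging", "datacenter": "ABC"},
         {"latency_ms": 900}),
]

# "All web servers in production, datacenter ABC must respond within 200ms"
scope = select(inventory, role="web", environment="production", datacenter="ABC")
slow = violations(scope, "latency_ms", lambda ms: ms < 200)
print([n.name for n in slow])
```

Adding another production web server changes the inventory, not the rule: the rule is written once against facts and keeps applying as machines come and go.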
Hosts are not the center of the monitoring universe.
Facts are!
Hosts are just places where facts occur.
The proof is in the pudding…
Hosts at the center of the universe
a.k.a. the Wrong Way
“It's fairly straightforward, so hopefully you find things easy to understand…”
–Nagios Core 4 manual on monitoring clusters
Host-centric: Monitor a DNS cluster
check_command check_service_cluster!"DNS Cluster"!0!1!$SERVICESTATEID:host1:DNS Service$,$SERVICESTATEID:host2:DNS Service$,$SERVICESTATEID:host3:DNS Service$
Where do host1, host2, host3 come from?
Host-centric: can’t use facts directly
• “Host groups solve this problem”. No, they don’t.
• Combinatorial explosion, e.g. trivially
• 4 data centers (us-1, us-2, eu, apac)
• 5 classes (web, db, cache, appserver, hadoop)
• 3 environments (test, staging, prod)
• => up to 119 materialized host groups
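The count can be checked directly. The sketch below assumes that a host group may either pin a dimension to one value or leave it unconstrained, which is one reading that reproduces the figure of 119; the `ANY` wildcard is an illustrative device, not something from a monitoring tool.

```python
from itertools import product

datacenters = ["us-1", "us-2", "eu", "apac"]
classes = ["web", "db", "cache", "appserver", "hadoop"]
environments = ["test", "staging", "prod"]

ANY = "*"  # hypothetical wildcard: this dimension is left unconstrained

# Fully specified groups: exactly one value per dimension.
full = list(product(datacenters, classes, environments))

# Every group that constrains at least one dimension
# (the all-wildcard combination is just "all hosts", so it is excluded).
groups = [g for g in product(datacenters + [ANY], classes + [ANY], environments + [ANY])
          if g != (ANY, ANY, ANY)]

print(len(full), len(groups))
```

Under this reading there are 60 fully specified combinations and (4+1) × (5+1) × (3+1) − 1 = 119 groups in total, and each additional dimension multiplies the number again.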
Nagios-bashing?
• No!
• Same fatal flaw with all host-centric monitoring tools
• Host-centric monitoring forces an extra, expensive step:
• replicate fact-based conditionals in host-centric templates
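To make the extra step concrete, here is a minimal sketch of what replicating a fact-based conditional into a host-centric template looks like: a generator that turns a fact query into the host list the cluster check needs. The inventory, the `role` fact, and the `cluster_check_command` helper are hypothetical; only the shape of the output line is taken from the DNS cluster example.

```python
def cluster_check_command(nodes, label, service, warn, crit, **wanted):
    """Build a check_service_cluster line for the hosts whose facts match."""
    hosts = sorted(n["name"] for n in nodes
                   if all(n["facts"].get(k) == v for k, v in wanted.items()))
    # One $SERVICESTATEID:<host>:<service>$ macro per matching host.
    states = ",".join(f"$SERVICESTATEID:{h}:{service}$" for h in hosts)
    return f'check_service_cluster!"{label}"!{warn}!{crit}!{states}'


# Hypothetical inventory keyed by facts rather than by naming convention.
inventory = [
    {"name": "host1", "facts": {"role": "dns"}},
    {"name": "host2", "facts": {"role": "dns"}},
    {"name": "host3", "facts": {"role": "dns"}},
    {"name": "host4", "facts": {"role": "web"}},
]

line = cluster_check_command(inventory, "DNS Cluster", "DNS Service", 0, 1, role="dns")
print(line)
```

The generated line has to be re-rendered and reloaded whenever a matching host appears or disappears, which is the cost a fact-based query avoids.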
“Please note that this module is not for the faint of heart. Even I (the author) have my head hurt each time I have to make modifications to it…”
–puppet-nagios author
Facts at the center of the universe
a.k.a. the Right Way
"De Revolutionibus manuscript p9b" by Nicolas Copernicus - www.bj.uj.edu.pl. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:De_Revolutionibus_manuscript_p9b.jpg#mediaviewer/File:De_Revolutionibus_manuscript_p9b.jpg
Earlier Examples
• “All provisioned web servers in the production environment, datacenter ABC must respond to queries within 200ms”
• “All PostgreSQL servers must have a postgres: bgwriter process running”
• “At least one ActiveMQ server is up to support MCollective”
In Sensu (heartbeats)
• “All PostgreSQL servers must have a postgres: bgwriter process running”
class postgres::monitoring::sensu {
  sensu::subscription { 'postgres': }
}
• Monitoring using a fact-based query
• Is node of class “postgres” and subscribed to “postgres” or not?
• If so, it will execute the postgres check
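The subscription idea in the bullets above can be modelled roughly as follows. This is a sketch of the concept, not Sensu's implementation or API; the check names and client names are hypothetical.

```python
# Checks are published to subscription names, never to hostnames.
checks = {
    "postgres": ["check-bgwriter-process"],  # hypothetical check name
    "web": ["check-http-latency"],           # hypothetical check name
}

# Each client declares its own subscriptions; with Puppet these follow
# from the classes applied to the node.
clients = {
    "client-1": {"postgres"},
    "client-2": {"web"},
    "client-3": {"postgres", "web"},
}


def checks_for(client):
    """Return the checks a client runs, derived only from its subscriptions."""
    return sorted(c for sub in clients[client] for c in checks.get(sub, []))


def subscribers(subscription):
    """Return the clients that would execute checks published to a subscription."""
    return sorted(name for name, subs in clients.items() if subscription in subs)


print(checks_for("client-3"))
print(subscribers("postgres"))
```

A node that gains the postgres class gains the subscription, and with it the check; nobody edits a host list.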
In Datadog (metrics)
• “All provisioned web servers in the production environment, datacenter ABC must respond to queries within 200ms”
$ puppet module install datadog-datadog_agent
class { 'datadog_agent':
  api_key       => …,
  tags          => [$environment],
  facts_to_tags => ['datacenter'],
}
include datadog_agent::integrations::nginx
In Datadog (metrics)
• Monitoring using a fact-based query
• Puppet facts directly reused
max(nginx.request.latency{production,datacenter:ABC}) < 200
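How a tag-scoped query like the one above is evaluated can be sketched as follows. This is not Datadog's query engine or API: the in-memory `series` list and the `max_over` helper are hypothetical, and each series is assumed to carry its latest value plus the tags derived from Puppet facts.

```python
def max_over(series, metric, *tags):
    """Largest value of `metric` across all series that carry every given tag."""
    values = [s["value"] for s in series
              if s["metric"] == metric and set(tags) <= s["tags"]]
    return max(values) if values else None


# Hypothetical data points, one per reporting agent.
series = [
    {"metric": "nginx.request.latency", "value": 140.0,
     "tags": {"production", "datacenter:ABC"}},
    {"metric": "nginx.request.latency", "value": 180.0,
     "tags": {"production", "datacenter:ABC"}},
    {"metric": "nginx.request.latency", "value": 950.0,
     "tags": {"staging", "datacenter:ABC"}},
]

worst = max_over(series, "nginx.request.latency", "production", "datacenter:ABC")
ok = worst is not None and worst < 200
print(worst, ok)
```

The staging series is ignored because it lacks the `production` tag, so the scope of the alert is defined entirely by tags that came from facts.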
What to take away
Fact-based monitoring
1. Hosts are not at the center of the monitoring universe
2. Expressive monitoring uses queries
3. Monitoring queries should use Puppet facts
Thank you!

Fact-Based Monitoring - PuppetConf 2014

  • 3. Poll: Monitoring makes me… happy proud cry want to hide
  • 4. Puppet brings Automation to Systems Management
  • 5. Improve Monitoring the way Puppet has improved Systems Management
  • 6. “The good old days” • Your “CMDB” was Excel • SSH in and hack away • Little time for anything else
  • 7. Then Puppet came… • Expressive rules that capture expected result • Using facts and classifiers, a.k.a. metadata to figure out where to apply changes • That freed up a lot of our time* * on a per-machine basis
  • 8. –Me (just now) “Puppet brings immunity of configuration to change in infrastructure”
  • 9. I have seen this before…
  • 10. –C.J. Date (1977) “[SQL brings] immunity of application to change in storage structure and access strategy” http://www.cs.berkeley.edu/~brewer/cs262/SystemR.pdf
  • 11. SQL • 1974 IBM introduces System R and its Structured Query Language • Expressive rules that capture expected result • Using facts and predicates, a.k.a. metadata to figure out what data to get • That freed up a lot of development time
  • 12. SQL • From a time-consuming, imperative mess (“how”) • … to expressive data queries (“what”) • SQL query: SELECT (desired facts) FROM (existing facts) WHERE (matching criteria)
  • 13. Puppet • From a time-consuming, imperative mess (“how”) • … to expressive configuration queries (“what”) • puppet apply: CHANGE (desired facts) FROM (existing puppet facts) WHERE (matching puppet classes)
  • 14. Is there a pattern?
  • 15. –MCollective overview “Break free from ever more complex naming conventions for hostnames as a means of identity. Use a very rich set of meta data provided by each machine to address them.”
  • 16. MCollective • From a time-consuming, imperative mess (“how”) • … to expressive orchestration queries (“what”) • mco rpc service restart service=nginx -F webpool=A: EXEC (desired actions) FROM (existing puppet facts) WHERE (matching puppet classes)
  • 17. Back to monitoring • Monitoring is to behavior what Puppet is to configuration • Monitoring is to behavior what MCollective is to orchestration
  • 18. Monitoring • From a time-consuming, imperative mess (“how”) • … to expressive monitoring queries (“what”) • Monitoring query: MONITOR (desired behavior) FROM (existing heartbeats/metrics) WHERE (matching puppet facts)
  • 19. Examples • “All provisioned web servers in the production environment, datacenter ABC must respond to queries within 200ms” • “All PostgreSQL servers must have a postgres: bgwriter process running” • “At least one ActiveMQ server is up to support mcollective” • Never mention a hostname
  • 20. Hosts are not the center of the monitoring universe. Facts are! Hosts are just places where facts occur.
  • 21. The proof is in the pudding…
  • 22. Hosts at the center of the universe a.k.a. the Wrong Way
  • 23. –Nagios Core 4 manual on monitoring clusters “Its fairly straightforward, so hopefully you find things easy to understand…”
  • 24. Host-centric: Monitor a DNS cluster • check_command check_service_cluster!"DNS Cluster"!0!1!$SERVICESTATEID:host1:DNS Service$,$SERVICESTATEID:host2:DNS Service$,$SERVICESTATEID:host3:DNS Service$ • Where do host1, host2, host3 come from?
  • 25. Host-centric: can’t use facts directly • “Host groups solve this problem”. No, they don’t. • Combinatorial explosion, e.g. trivially • 4 data centers (us-1, us-2, eu, apac) • 5 classes (web, db, cache, appserver, hadoop) • 3 environments (test, staging, prod) • => up to 119 materialized host groups
  • 26. Nagios-bashing? • No! • Same fatal flaw with all host-centric monitoring tools • Host-centric monitoring forces an extra, expensive step: • replicate fact-based conditionals in host-centric templates
  • 27. –puppet-nagios author “Please note that this module is not for the faint of heart. Even I (the author) have my head hurt each time I have to make modifications to it…”
  • 28. Facts at the center of the universe a.k.a. the Right Way • Image: "De Revolutionibus manuscript p9b" by Nicolas Copernicus - www.bj.uj.edu.pl. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:De_Revolutionibus_manuscript_p9b.jpg#mediaviewer/File:De_Revolutionibus_manuscript_p9b.jpg
  • 29. Earlier Examples • “All provisioned web servers in the production environment, datacenter ABC must respond to queries within 200ms” • “All PostgreSQL servers must have a postgres: bgwriter process running” • “At least one ActiveMQ server is up to support mcollective”
  • 30. In Sensu (heartbeats) • “All PostgreSQL servers must have a postgres: bgwriter process running” class postgres::monitoring::sensu { sensu::subscription { 'postgres': } } • Monitoring using a fact-based query • Is node of class “postgres” and subscribed to “postgres” or not? • If so, it will execute the postgres check
  • 31. In Datadog (metrics) • “All provisioned web servers in the production environment, datacenter ABC must respond to queries within 200ms” • $ puppet module install datadog-datadog_agent • class { 'datadog_agent': api_key => …, tags => [$environment], fact_to_tags => ["datacenter"] } • include datadog_agent::integrations::nginx
  • 32. In Datadog (metrics) • Monitoring using a fact-based query • Puppet facts directly reused max(nginx.request.latency{production,datacenter:ABC}) < 200
  • 33. What to take away
  • 34. Fact-based monitoring 1. Hosts are not at the center of the monitoring universe 2. Expressive monitoring uses queries 3. Monitoring queries should use Puppet facts
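
The MONITOR / FROM / WHERE pattern from the slides, and the host-group arithmetic from slide 25, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory, the `latency_ms` field, and the function names `select`, `monitor_max_latency`, and `count_host_groups` are all hypothetical and are not part of Puppet, Sensu, Datadog, or Nagios. The count assumes a host group may pin each dimension to one value or leave it unconstrained, excluding the group that constrains nothing.

```python
# Hypothetical inventory: a host is just a place where facts (and metrics) occur.
HOSTS = [
    {"facts": {"environment": "production", "datacenter": "ABC", "role": "web"},
     "latency_ms": 120.0},
    {"facts": {"environment": "production", "datacenter": "ABC", "role": "web"},
     "latency_ms": 180.0},
    {"facts": {"environment": "staging", "datacenter": "ABC", "role": "web"},
     "latency_ms": 450.0},
    {"facts": {"environment": "production", "datacenter": "XYZ", "role": "web"},
     "latency_ms": 90.0},
]


def select(hosts, **where):
    """WHERE clause: keep hosts whose facts match every given key/value."""
    return [
        h for h in hosts
        if all(h["facts"].get(key) == value for key, value in where.items())
    ]


def monitor_max_latency(hosts, threshold_ms, **where):
    """MONITOR max(latency) < threshold FROM metrics WHERE facts match.

    Returns None when no host matches, so "nothing to check" is not
    silently reported as healthy.
    """
    matched = select(hosts, **where)
    if not matched:
        return None
    return max(h["latency_ms"] for h in matched) < threshold_ms


def count_host_groups(*dimensions):
    """Count host groups when each dimension is either pinned to one of its
    values or left unconstrained, excluding the all-unconstrained group."""
    total = 1
    for values in dimensions:
        total *= len(values) + 1  # +1 for "any value"
    return total - 1


if __name__ == "__main__":
    # No hostname appears in the query, only facts.
    print(monitor_max_latency(HOSTS, 200, environment="production", datacenter="ABC"))
    print(count_host_groups(
        ["us-1", "us-2", "eu", "apac"],
        ["web", "db", "cache", "appserver", "hadoop"],
        ["test", "staging", "prod"],
    ))
```

Adding a fifth datacenter changes nothing in the query side of this sketch, whereas the host-group count grows multiplicatively, which is the point the slides make about host-centric templates.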