.WTF/is/sensu
A DevOps guide to monitoring
.WTF/is/monitoring
A DevOps guide to monitoring
.WTF/whois
self:
  author: 'Toby Jackson <toby.jackson@futurenet.com>'
  role: 'Operations Engineer'
  twitter: '@warmfusion'
  github: 'github.com/warmfusion'
  employer: 'www.futureplc.com/yourfuturejob/'
.WTF/is/monitoring?experience
●Developer turned Engineer
●Implemented Sensu at Future PLC
○340+ hosts, VMs, switches, etc.
●Helped shape our approach to monitoring
.WTF/is/monitoring?_index
Why do we monitor our systems?
What should we look for?
How can Sensu help us?
Questions…?
.WTF/is/monitoring?why
Part One - Why do we monitor our systems?
.WTF/is/monitoring?why
● Client - Are they down, or is it just me?
● CEO - Are we making money?
● Manager - Are we meeting SLA agreements?
● Engineer - Am I being woken up for the right reasons?
● Developer - Did my deploy work?
● Everyone...
○ What’s happening in our environment?
.WTF/is/monitoring?why_tomorrow
● Client - Is maintenance going to happen soon?
● CEO - Are we going to keep making money?
● Manager - Can we meet new SLA agreements?
● Engineer - Why might I get woken up tonight?
● Developer - When do I need to optimise?
● Everyone...
○ What’s going to happen in our environment?
.WTF/is/monitoring?what
Part Two - What should we look for?
.WTF/is/monitoring?disclaimer
Some approaches work better than others; don’t be afraid to experiment.
.WTF/is/monitoring?principles
Focus on your customers
Use a couple of monitoring systems
De-couple your checks from your code
Remember workflow events
Many simple checks > Fewer clever checks
Don’t wake me up if it can wait
.WTF/is/monitoring?first_steps
● Look for the big impact entry points
● Review past incidents for danger zones
● Don’t be afraid to admit that risky code exists
.WTF/is/monitoring?common
●Disk, RAM, Load, Network
●Patches available
●Uptime
●Logged in users
●Config Management status
.WTF/is/monitoring?services
●Create HTTP status endpoints
●JSON is great
●200 OK / 503 Service Unavailable
●Lightweight
●Downstream dependencies?
●Service metrics?
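A status endpoint along these lines could be sketched as below. This is only an illustration: the function name `build_status`, the JSON field names and the dependency map are assumptions, not part of the talk. It returns 200 when every downstream dependency is healthy and 503 otherwise, with a small JSON body.

```python
import json


def build_status(dependencies):
    """Return (http_code, json_body) for a lightweight status endpoint.

    `dependencies` maps a dependency name to True (healthy) or False.
    The field names in the JSON body are illustrative assumptions.
    """
    healthy = all(dependencies.values())
    code = 200 if healthy else 503
    body = json.dumps(
        {
            "status": "ok" if healthy else "unavailable",
            "dependencies": dependencies,
        },
        sort_keys=True,
    )
    return code, body
```

A web framework route would call `build_status` with the results of cheap dependency probes and return the code and body unchanged, so the monitoring check only needs to look at the HTTP status.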
.WTF/is/monitoring?clusters
●Aggregate checks
●Members don’t matter
●Deploys and maintenance are OK
●Avoid bypassing balancers
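An aggregate check can be sketched as a function over the health of all members, so that no single member matters on its own. The name `aggregate_status` and the default threshold are assumptions for illustration; the exit codes follow the Nagios-style convention (0 OK, 2 CRITICAL, 3 UNKNOWN) that existing check scripts use.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def aggregate_status(member_up, min_fraction=0.5):
    """Summarise a cluster as one result.

    `member_up` is a list of booleans, one per cluster member.
    Returns (exit_code, one_line_message). The cluster is healthy
    when strictly more than `min_fraction` of members are up.
    """
    if not member_up:
        return UNKNOWN, "AggregateCheck UNKNOWN: no members reported"
    up = sum(1 for m in member_up if m)
    total = len(member_up)
    if up / total > min_fraction:
        return OK, f"AggregateCheck OK: {up}/{total} members up"
    return CRITICAL, f"AggregateCheck CRITICAL: {up}/{total} members up"
```

With this shape, taking one member out for a deploy or maintenance does not raise an alert as long as enough of the cluster stays up.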
.WTF/is/monitoring?company
●Programmatic goals can be monitored
●See if revenue, purchases or direct customer interactions can be watched
●Watch for social media mentions
.WTF/is/monitoring?practise_simple
● nginx & php running
● Balancer: 200 OK
● nginx: 200 OK
● Cron: ignore for now
Diagram: Web Load Balancer in front of Web01 (nginx, php, cron) and Web02 (nginx, php)
.WTF/is/monitoring?practise_adv
● Balancer: >50% backends up
● Nginx: <200ms response
● Cron: err log empty && <1hr old
Diagram: Web Load Balancer in front of Web01 (nginx, php, cron) and Web02 (nginx, php)
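The cron rule on this slide could be sketched as follows. The slide does not say which file must be under an hour old, so this assumes a marker file touched by each successful run; the function names, paths and message text are hypothetical.

```python
import os
import time


def cron_status(err_log_size, last_run_age_s, max_age_s=3600):
    """Return (exit_code, message) for a cron job.

    OK only when the error log is empty AND the last run is recent.
    """
    if err_log_size > 0:
        return 2, "CheckCron CRITICAL: error log is not empty"
    if last_run_age_s >= max_age_s:
        return 2, f"CheckCron CRITICAL: last run was {last_run_age_s:.0f}s ago"
    return 0, "CheckCron OK: error log empty and last run is recent"


def check_cron_files(err_log_path, marker_path, now=None):
    """Apply cron_status to an error log and a marker file (assumed layout)."""
    now = time.time() if now is None else now
    return cron_status(
        os.path.getsize(err_log_path),
        now - os.path.getmtime(marker_path),
    )
```

Keeping the decision in `cron_status` and the file access in `check_cron_files` makes the rule easy to test without touching the filesystem.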
.WTF/is/monitoring?practise_clever
● Spike in traffic
● Failure counts above thresholds
● Response sizes are curiously large
● Lots of (valid) API Auth requests
Diagram: Web Load Balancer in front of Web01 (nginx, php, cron) and Web02 (nginx, php)
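One very simple way to flag a spike in traffic is to compare the current request rate with a recent average. This is a sketch under assumptions, not the approach used in the talk: the name `is_spike` and the factor of 3 are hypothetical and would need tuning per service.

```python
from statistics import mean


def is_spike(history, current, factor=3.0):
    """Return True when `current` exceeds `factor` times the recent average.

    `history` is a list of recent samples (for example requests per minute).
    """
    if not history:
        return False  # nothing to compare against yet
    return current > factor * mean(history)
```

The same comparison could be applied to failure counts or response sizes by feeding in a different series.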
.WTF/is/monitoring?what
● Your users matter
○ Know when they’re in pain
● Develop a standardised app status page
○ Conventional checks are used more frequently
● Check lots of small things
○ Scales better and helps to isolate incidents quickly
.WTF/is/sensu
Part Three - How can Sensu help us?
.WTF/is/sensu?introduction
“New generation” of monitoring solutions
Open source, with a paid-for Enterprise edition
Site: sensuapp.org
GitHub: github.com/sensu
IRC: freenode - #sensu
.WTF/is/sensu?what
Consistent way to describe a service check
Executes those checks as required
Reliably handles events (and metrics)
.WTF/is/sensu?why
●Tries to do one thing well; handle events
●Compatible with existing check scripts
●Large active open-source community
●Scales effectively
.WTF/is/sensu?experience
●Replaced Nagios, cron jobs, etc.
●Raised visibility of monitoring
●Devolved control to development
●340 (ish) hosts, VMs, switches, firewalls, etc.
●Managed exclusively through Puppet
●Developed custom plugins and extensions
.WTF/is/sensu?architecture_simple
.WTF/is/sensu?how
The Sensu Standalone Check Process:
a. Sensu-Client runs a script that produces one line of output and an exit code
b. Sensu-Client converts the result into a JSON event and puts it on RabbitMQ
c. Sensu-Server reads the event and sends it to handlers
d. Handlers process the event, performing some action
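Steps a and b can be sketched roughly as below. This is a simplified illustration, not how sensu-client is implemented: the function name `run_check`, the default client name and the exact JSON layout are assumptions, and publishing to RabbitMQ is left out.

```python
import json
import subprocess
import sys
import time


def run_check(name, command, client="example-client"):
    """Run `command` (a list of arguments) and return a JSON event string.

    The first line of stdout becomes the output and the process exit
    code becomes the status (0 OK, 1 WARNING, 2 CRITICAL).
    """
    proc = subprocess.run(command, capture_output=True, text=True)
    lines = proc.stdout.strip().splitlines()
    result = {
        "client": client,  # hypothetical client name
        "check": {
            "name": name,
            "output": lines[0] if lines else "",
            "status": proc.returncode,
            "executed": int(time.time()),
        },
    }
    return json.dumps(result)
```

For example, `run_check("demo", [sys.executable, "-c", "print('CheckDemo OK')"])` yields an event with status 0; any script that prints a line and sets an exit code fits the same pattern.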
.WTF/is/sensu?architecture_simple
You are here
.WTF/is/sensu?standalone_check
● Describes
○ what check to run
○ how to handle events
● Runs at a given interval (default 60s)
● sensu-client handles output and emits events over message brokers
● Can include custom configuration which is included in event sent to handlers
sensu::checks:
  'sensu-server':
    command: 'check-procs.rb -p bin/sensu-server -c 1'
    handlers: ['high', 'pagerduty']
    custom:
      runbook: 'https://wiki.ftr.com/x/4oqq'
      tip: 'Check /var/log/sensu-server.log'
      slack:
        channels:
          - '#craggyisland'
.WTF/is/sensu?runbook
URI to a page summarising:
● Impacted services
● Troubleshooting
● Common problems
● How to fix
● Who to talk to
● References to other information
.WTF/is/sensu?tip
Tweet length one-liner
Gets included in Pagerduty and Slack notices
Useful at 4am on a Sunday morning
.WTF/is/sensu?architecture_simple
You are here
.WTF/is/sensu?handler
● Process events
● Perform some (or no) action
● Typically used to send alerts or emails
sensu::handler:
  slack:
    type: 'pipe'
    command: 'slack.rb'
    config:
      webhook_token: 'SECRET/KEY'
      bot_name: 'sensu'
      channel: '#alerts'
  pagerduty:
    type: 'pipe'
    command: 'pagerduty.rb'
    severities: ['ok', 'critical']
    config:
      api_key: SECRET_TOKEN_HERE
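A pipe handler receives the event as JSON on standard input and decides what to do with it. The sketch below only builds the text of a notice; the function names are hypothetical, and it assumes the custom `tip` and `runbook` values from the check definition appear directly under `check` in the event.

```python
import json

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def format_notice(event):
    """Build a short notice from an event dict.

    Includes the tip and runbook link when the check defines them.
    """
    check = event["check"]
    client = event["client"]["name"]
    status = STATUS_NAMES.get(check["status"], "UNKNOWN")
    lines = [f"{status}: {client}/{check['name']} - {check['output'].strip()}"]
    if check.get("tip"):
        lines.append(f"Tip: {check['tip']}")
    if check.get("runbook"):
        lines.append(f"Runbook: {check['runbook']}")
    return "\n".join(lines)


def handle(raw):
    """Parse one JSON event (as a string) and return the notice text."""
    return format_notice(json.loads(raw))
```

A script used as a handler `command` would call `handle(sys.stdin.read())` and pass the result to whatever sends the alert, such as a chat webhook or an email.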
.WTF/is/sensu?standalone_metrics
● The same as checks but...
● handlers: ['metrics']
○ A special handler for this kind of result
● type: metric
○ Tells sensu to always send the output to the handler
sensu::checks:
  cpu-pcnt-usage-metrics:
    command: 'cpu-pcnt-usage-metrics.rb'
    handlers: ['metrics']
    type: metric
.WTF/is/sensu?metric_example
Key Value Timestamp
ix-sensu01.cpu.user 70.92 1440425049
ix-sensu01.cpu.nice 0.00 1440425049
ix-sensu01.cpu.system 8.16 1440425049
ix-sensu01.cpu.idle 19.90 1440425049
ix-sensu01.cpu.iowait 0.00 1440425049
ix-sensu01.cpu.irq 0.00 1440425049
ix-sensu01.cpu.softirq 1.02 1440425049
ix-sensu01.cpu.steal 0.00 1440425049
ix-sensu01.cpu.guest 0.00 1440425049
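Each metric line is a key, a value and a Unix timestamp separated by spaces. A minimal sketch of producing and reading such lines follows; the helper names `format_metric` and `parse_metric` are hypothetical and not part of any Sensu plugin.

```python
def format_metric(key, value, timestamp):
    """Render one metric line as 'key value timestamp'."""
    return f"{key} {value:.2f} {int(timestamp)}"


def parse_metric(line):
    """Split a metric line back into (key, value, timestamp)."""
    key, value, timestamp = line.split()
    return key, float(value), int(timestamp)
```

A metric check script would print one such line per measurement on stdout, using the host name as the leading part of the key as in the example above.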
.WTF/is/sensu?dashboards
● Uchiwa - github.com/sensu/uchiwa
● Mosaic - github.com/warmfusion/mosaic
● Sensu-Grid - github.com/alex-leonhardt/sensu-grid
.WTF/is/sensu?issues
●Uchiwa isn’t perfect
●Sensu-API can crash sometimes
●No history retained beyond the last 20 events
●Check dependencies are handled on clients
●Redis for datastore
○Redundancy is a little harder (for me at least)
.WTF/is/sensu?wins
●Alerts into Slack channels
●Handles network partitions really well
●Easy to create new checks and handlers
.WTF/is/monitoring?further_reading
Programmatic Alert Correlation - Elik Eizenberg
youtu.be/EXk19d09n54
Effective Incident Communication - Scott Klein
youtu.be/ySSdqfZlC7Y
Search for Operability 2015 on YouTube
.WTF/whois?q=
self:
  author: 'Toby Jackson <toby.jackson@futurenet.com>'
  role: 'Operations Engineer'
  twitter: '@warmfusion'
  github: 'github.com/warmfusion'
  employer: 'www.futureplc.com/yourfuturejob/'
Any Questions…?

VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort ServiceEnjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls KolkataVIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With RoomVIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
 
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With RoomVIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
 
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
 
AlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsAlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with Flows
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
 
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
 

WTF is Sensu and Monitoring

Editor's Notes

  1. Originally going to talk about Sensu, but thought there was more value in sharing some general observations about monitoring
  2. Still going to discuss Sensu, but going to start with WTF is monitoring
  3. Mixture of physical servers, virtual machines and Docker containers (prototype), plus basic checks of hardware such as switches, routers and firewalls.
  4. Broken up into three core sections. Going to discuss why you monitor, what you can look for and how best to get that information out of your code, and finally how Sensu can be used to achieve this in a scalable platform. Hold questions to the end - my timekeeping isn’t great, so try to avoid distracting me.
  5. There are quite a few people that (should?) be interested in your environment, and they each have their own motivations.
  6. Status pages are a great method of communicating with your more technical clients, such as API users or perhaps business-to-business clients. Easy to provide basic feedback to initially, but you may need to consider how you want to communicate with your paying customers. Some further reading at the end might be useful. CEOs and managers are infrequent users of monitoring, but often ask harder questions about trends or aggregate values - don’t worry about these users right away. Engineers and developers are our initial target audience and can be easily pleased. Devs can create monitors for their own needs. Engineers can quickly get grumpy and demand better alerts. But monitoring can provide more information than what’s going on with your systems right now...
  7. Simply ask yourselves: “What impact does this service have if it breaks?”
  8. Customers can mean clients, advertisers, other developers, internal staff. When developing checks for your systems, consider how to recognise impact to their workflows. Don’t rely on a single monitoring system - use a blended approach that provides different features and fault tolerance: Pingdom or StatusCake to monitor from outside your site, Sensu internally, and perhaps some crons with simple messaging as dead-man’s switches on core components. Your platform probably has some asynchronous actions, maybe with humans involved. Keep an eye out that things that should happen are happening. Try to provide back pressure on queues to prevent overloading downstream systems. Don’t run straight to PagerDuty to wake someone up when one node of your 8-node cluster fails or a server is starting to run low on disk. You don’t make friends with your on-call team like that. So where can you start?
  9. Walk before you can run by starting with the big obvious failure modes: my server’s gone offline; the website’s throwing 500 errors; Jon just logged onto the live servers. Think about how your systems have behaved in the past. You’ve hopefully fixed the issues, but what symptoms can you look for in the future? Who can help diagnose different parts of your infrastructure? Some code is simply more fragile than others. That clever broker mesh used to distribute content around the world - it’ll fail. The shared filesystem that you use to lock the cluster - it’ll time out. That legacy website using obsolete libraries and obscure databases - it’ll crash. It’s only a failure if you put your head in the sand and pretend it’ll all be fine.
  10. Watch out for checks eating up CPU and creating false positives. Uptime - do you like to reboot boxes every so often, or perhaps want to know if a machine just restarted? Configuration management state can be useful - for example, at Future we use Puppet to manage our servers, but we sometimes forget to change environments back after testing or a phased deployment. Our monitoring system alerts us after a period of time to bring the server back onto production.
  11. Avoid coupling your application stack to your monitoring software, which is what happens if you try to push messages directly. Develop a convention for status pages that can be used by your team for all services. Use JSON formatting - easy to parse, human friendly, lightweight. HTTP status codes are great for indicating basic state to simpler upstream systems, e.g. HAProxy. Keep your checks lightweight - don’t take your services offline by having a heavy status page. Caching results is acceptable, but it’s probably wise to allow a cache clear or give an indication of status age in your API.
  12. Consider what your cluster
  13. Check processes are running. External checks against balancers ensure your site is available for customers. Internal checks against each server tell you when you have a partial failure. Don’t wake me up... probably. Easy to implement - could probably do this without anything more complex than cronjobs and config management.
  14. But in a scalable system you probably want to know a little more about your environment. Balancers should have at least 50% capacity available. Web servers should be not only returning pages, but doing so in a sensible time. Cronjobs are running as required: check by inference in most cases, but ideally use a dead-man’s switch. But why only focus on the now...
  15. What about trends? Harder to measure. Starting to become relevant to managers.
  16. To highlight the important points and now for some technology...
  17. Spoken about why, spoken about what you can look for, and now for a little bit about how you can use Sensu to achieve this.
  18. Installation process is straightforward, but involves a few steps. RabbitMQ or Redis brokers need to be set up and configured as the message transport. Redis is needed for a key-value store. Sensu itself is simple to install on clients - a ruby or jvm daemon. Often excellent documentation, though sometimes features are better described in older versions for some reason. IRC community very supportive - I can often be found there helping to answer questions, or asking some of my own.
  19. Events are created in one of three ways: checks running at a defined interval managed by the client; checks run on demand from the master controller; passively accepting events in JSON format from arbitrary sources.
  20. Compatible with de-facto standards introduced by Nagios - exit codes and single-line message output. Can scale by introducing additional sensu-masters to your environment. Allows you to devolve control of monitoring checks to your development team.
  21. The previous monitoring system at Future PLC used a very clever auto-configuring Nagios system. Checks were created on the fly based on our metrics feeds. No-one really understood how it worked and even fewer wanted to. A replacement was needed… We looked at a few options, but Sensu stood out as it shared some of our existing technologies (Ruby, RabbitMQ, Nagios checks) and aligned with our goals of moving towards a hybrid scalable platform of potentially short-lived services. We could make the technology scale - that’s comparatively easy - throw in a few web balancers, broker meshes, virtual IPs and you’re pretty much done. More difficult is scaling the creation of new checks, and management of existing ones. At Future, anyone can create new check scripts and add them into our monitoring system without assistance using Puppet. They can even raise alerts to out-of-hours support from their own configuration - but I’ve recommended having a conversation before that particular power is used. We’ve created and adapted existing checks, and worked on some bespoke plugins and extensions to solve some specific use cases we have. Working on an SNMPTrap extension - find it in my GitHub repository.
  22. Sensu runs a client agent on your nodes (or in a side-car container)
  23. Sensu runs a client agent on your nodes (or in a side-car container)
  24. Described as JSON configuration files. Shown here in YAML - taken from our Puppet config.
  25. Assume the brokers have sent the message, and the sensu-servers are ready to consume those events. The sensu-server process will read the event, and if the status is deemed to require handling, the event is passed to a handler script.
  26. Typically use ‘pipe’ type handlers, where events are piped via STDIN to forked processes. Can also use TCP, UDP or transport types. Handlers can define some configuration, but as shown in our example check, some of those values can be extended or modified by the check itself. This allows for flexibility, as developers can define the chat rooms to notify, or the nuances of your handler script, from their clients without having to modify servers.
  27. But another form of event is the ‘metrics’ event. Sensu can also be used to collect quantitative data from your platform, submitting those events as ‘metrics’ types. The only real difference between metrics and normal check types is that the handlers are invoked for every check result, and the output of the event is always sent to the handler. The output of a metrics type can also include multiple lines of output - all of which are sent to the handlers. This lets you aggregate lots of key/value pairs into a single response.
  28. Uchiwa has some quirks: doesn’t like being load-balanced - lists change ordering depending on which sensu-master you hit. Sensu-API is a little odd sometimes: when Sensu-Server restarts, the API sometimes doesn’t realise and needs a kicking itself - only affects Uchiwa really; monitoring itself is unaffected. No real history of events: Sensu is a ‘router’ of events; if you want reports of the last 7 days’ uptime, you need to look towards another tool, perhaps an ELK stack (it’s what we do) collecting Sensu events. Dependencies between services are checked when a result is generated, not when it is handled - so your backend server being in maintenance may still alert from the haproxy cluster because setting up dependencies isn’t simple. Basically easy to live with, but for the enthusiastic, look at my reading list for programmatic alert correlation.
  29. Alert correlation appears to be a complicated process of identifying related events and alerts from the stream of alerts and grouping them into one or more incidents. This presentation describes the concepts more fully, and gives some suggestions on how to start correlating events. Effective Incident Communication is from one of the founders of StatusPage.io and gives some really interesting pointers on how to properly communicate service events with your customers, or even internal staff.
  30. Hiring: 2 Junior Ops - devs looking to get closer to infrastructure and architecture decisions and management. PHP Devs - working on our front and backend systems delivering content to millions of daily users. Other roles - check the link here.
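
Note 20 mentions the Nagios convention of exit codes plus a single line of output. As a rough illustration only, a check following that convention might look like the sketch below. The check name `CheckDisk`, the thresholds and the function names are assumptions made for this example and are not taken from the talk.

```python
import shutil

# Nagios plugin convention: the exit code carries the state.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_disk(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to (exit_code, single-line message)."""
    if used_percent >= crit:
        return CRITICAL, f"CheckDisk CRITICAL: {used_percent:.1f}% used"
    if used_percent >= warn:
        return WARNING, f"CheckDisk WARNING: {used_percent:.1f}% used"
    return OK, f"CheckDisk OK: {used_percent:.1f}% used"


def main(path="/"):
    """Measure one filesystem, print one line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as err:
        # Could not measure at all: report UNKNOWN rather than guess.
        print(f"CheckDisk UNKNOWN: {err}")
        return UNKNOWN
    code, message = evaluate_disk(100.0 * usage.used / usage.total)
    print(message)
    return code
```

Run as a script, the last step would be `sys.exit(main())` so that the monitoring agent sees the state as the process exit code. Keeping the threshold logic in `evaluate_disk` separate from the measurement makes the check easy to test without touching a real disk.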
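
Note 11 suggests a shared convention for JSON status pages, with the HTTP status code signalling basic state and some indication of how old a cached result is. One possible shape for such a convention is sketched below. The field names (`status`, `checks`, `age_seconds`) and the choice of 200 and 503 are assumptions for illustration, not the convention used at Future.

```python
import json
import time


def build_status(checks, generated_at, now=None):
    """Return (http_status, json_body) for a service status page.

    checks       -- dict mapping a dependency name to True (healthy) or False
    generated_at -- epoch seconds at which the (possibly cached) checks ran
    now          -- current epoch seconds; defaults to time.time()
    """
    now = time.time() if now is None else now
    healthy = all(checks.values())
    body = {
        "status": "ok" if healthy else "failing",
        "checks": {
            name: ("ok" if passed else "failing")
            for name, passed in checks.items()
        },
        # Lets a consumer decide whether a cached answer is too stale to trust.
        "age_seconds": round(now - generated_at),
    }
    # A non-2xx/3xx code lets simple upstreams (e.g. a load balancer health
    # check) react to failure without parsing the JSON body.
    return (200 if healthy else 503), json.dumps(body, sort_keys=True)
```

A web framework handler would call `build_status` and return the pair as the response code and body. A monitoring check can then poll the URL without the application knowing anything about the monitoring system.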
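
Note 26 describes ‘pipe’ handlers as forked processes that receive the event. A minimal sketch of such a handler follows, assuming the event arrives as a JSON document on standard input with `client` and `check` objects. The exact event fields vary between Sensu versions, so treat the keys used here as assumptions and check them against the events your server actually emits.

```python
import json
import sys

# Human-readable labels for the Nagios-style status codes.
LABELS = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def format_event(raw_event):
    """Turn a JSON event document into a one-line notification message."""
    event = json.loads(raw_event)
    check = event["check"]
    label = LABELS.get(check["status"], "UNKNOWN")
    client_name = event["client"]["name"]
    return f"{label} {client_name}/{check['name']}: {check['output'].strip()}"


def main(stream=sys.stdin):
    """Read one event from the stream and print the formatted message."""
    # A real handler would post this to a chat room or pager instead.
    print(format_event(stream.read()))
```

Because the handler is just a process reading a stream, it can be exercised locally by piping a saved event file into it before wiring it into the server configuration.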