This document discusses monitoring a CentOS stack with Prometheus. Prometheus is an open-source systems monitoring and alerting toolkit that collects metrics from configured targets at given intervals, evaluates rule expressions, displays time series data, and triggers alerts. It consists of multiple components including Prometheus for metrics collection and storage, exporters for exposing metrics, Grafana for visualization, and Alertmanager for alert routing.
4. Questions that come after:
It's up, but is it performant?
It's down, but for everyone?
It's degraded, but are the users impacted?
Is it even relevant?
5. Metrics Monitoring
e.g. traditionally graphite
Gather fine-grained data at frequent intervals
Make them useful by labelling them; store them
Analyze them to understand what is going on
6. Metrics ARE PART OF monitoring
Do not maintain a metrics stack plus a "traditional monitoring" stack
Alert from metrics directly!
8. We are in the cloud era.
Here are some buzzwords for you
cloud, API, openstack, devops, docker, bimodal,
stateless, kubernetes, orchestration, automation,
serverless, docker, humanops, ansible, continuous
deployment, cri-o, jenkins, agile, docker, red hat,
containers, virtualization, provisioning, monitoring,
observability...
13. We deserve better tools
Our customers ask us to respond fast, in seconds
We make hundreds of operations per second
What is your monitoring frequency... 5 minutes?
16. Cloud Native
Easy to configure, deploy, maintain
Designed as multiple services
Container ready
Orchestration ready (dynamic config)
Fuzziness
17. Data Centric
A Metric in Prometheus has metadata:
mysql_global_status_handlers_total{handler="tmp_write"} 1122
And lots of functions to filter, change, remove... that metadata while fetching it.
=> OpenMetrics.io
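To make the "metric plus metadata" idea concrete, here is a minimal sketch that splits one sample line into its name, labels, and value, then filters on a label. It is a simplified reader written for illustration, not the Prometheus parser: the helper names `parse_sample` and `matches` are hypothetical, and the sketch assumes a single sample per line with no trailing timestamp and does not unescape label values.

```python
import re

# Matches `name{labels} value`; the label block is optional.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
# Matches one `key="value"` pair inside the label block.
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<val>(?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    label_block = match.group("labels") or ""
    labels = {m.group("key"): m.group("val") for m in LABEL_RE.finditer(label_block)}
    return match.group("name"), labels, float(match.group("value"))


def matches(labels, selector):
    """True when every key/value pair of the selector is present in labels."""
    return all(labels.get(key) == val for key, val in selector.items())


name, labels, value = parse_sample(
    'mysql_global_status_handlers_total{handler="tmp_write"} 1122'
)
if matches(labels, {"handler": "tmp_write"}):
    print(name, labels, value)
```

The label dictionary is what makes the sample filterable: the same metric name can carry many `handler` values, and a selector picks the ones of interest.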
18. Open Source
Apache 2.0
Go
Support for multiple OS
Many "exporters":
https://github.com/prometheus/prometheus/wiki/Default-port-allocations
19. Simple
1 service = 1 thing
Takes care of its db (time-based retention and/or disk-space-based retention)
26. Exporters
Exporters expose metrics with an HTTP API
Bindings available for many languages
Exporters do not save data; they are not "proxies" and don't "cache" anything
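The points above can be sketched with nothing but the Python standard library. This is an illustrative toy, not one of the official exporters or client bindings: the metric names `demo_time_seconds` and `demo_cpu_seconds_total`, the `serve` helper, and port 8000 are all assumptions made for the example. The values are read at scrape time and nothing is kept between requests.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics():
    """Read current values and format them as text; nothing is stored between scrapes."""
    lines = [
        "# HELP demo_time_seconds Current Unix time on the exporter host.",
        "# TYPE demo_time_seconds gauge",
        f"demo_time_seconds {time.time()}",
        "# HELP demo_cpu_seconds_total CPU time used by this process.",
        "# TYPE demo_cpu_seconds_total counter",
        f"demo_cpu_seconds_total {time.process_time()}",
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        # Values are computed now, for this request only.
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block and answer scrapes on http://127.0.0.1:<port>/metrics."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a scrape target at `127.0.0.1:8000` would be enough to try it; in practice the language bindings mentioned above handle the formatting for you.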
41. What is the Alertmanager doing?
Receives alerts
Groups them
Inhibits them
Dispatches them
Deals with HA
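A toy model of the inhibit, group, and dispatch steps listed above may help. It is a sketch of the idea only and not how Alertmanager is implemented: alerts are plain dictionaries of labels, and the function names, label names, and sample alerts are all hypothetical.

```python
from collections import defaultdict


def is_match(alert, matcher):
    """True when the alert carries every label/value pair in matcher."""
    return all(alert.get(key) == val for key, val in matcher.items())


def inhibit(alerts, source_match, target_match, equal):
    """Drop target alerts while a source alert fires with the same `equal` labels."""
    sources = [a for a in alerts if is_match(a, source_match)]
    kept = []
    for alert in alerts:
        muted = is_match(alert, target_match) and any(
            s is not alert and all(alert.get(k) == s.get(k) for k in equal)
            for s in sources
        )
        if not muted:
            kept.append(alert)
    return kept


def group(alerts, group_by):
    """Bucket alerts by the values of the group_by labels."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


def dispatch(groups, send=print):
    """Send one notification per group instead of one per alert."""
    for key, members in groups.items():
        instances = ", ".join(a.get("instance", "?") for a in members)
        send(f"{'/'.join(key)}: {len(members)} alert(s) on {instances}")


ALERTS = [
    {"alertname": "InstanceDown", "instance": "db1", "severity": "critical"},
    {"alertname": "HighLatency", "instance": "db1", "severity": "warning"},
    {"alertname": "InstanceDown", "instance": "db2", "severity": "critical"},
    {"alertname": "HighLatency", "instance": "web1", "severity": "warning"},
]

# The latency warning on db1 is muted because db1 is already reported as down.
remaining = inhibit(ALERTS, {"severity": "critical"}, {"severity": "warning"}, ["instance"])
dispatch(group(remaining, ["alertname"]))
```

Four incoming alerts become two notifications: one for both down instances, and one for the latency warning that was not inhibited.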
42. How to alert?
Email
Some vendors: Slack, HipChat, VictorOps, PagerDuty, ...
Generic Webhook -> Plug in anything you want
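As a sketch of what "plug in anything" can look like on the receiving side, the function below turns a webhook body into one line per alert. The webhook receiver posts JSON containing an `alerts` list whose entries carry `status`, `labels`, and `annotations`; the `summary` annotation, the helper name `summarize_webhook`, and the sample payload are assumptions for illustration.

```python
import json


def summarize_webhook(body):
    """Turn a webhook JSON body into one human-readable line per alert."""
    payload = json.loads(body)
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        # "summary" is a common annotation name, assumed here; it may be absent.
        summary = alert.get("annotations", {}).get("summary", "no summary")
        status = alert.get("status", "unknown")
        lines.append(f"[{status}] {labels.get('alertname', '?')}: {summary}")
    return lines


# Illustrative payload, trimmed to the fields the function reads.
SAMPLE_BODY = json.dumps({
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "InstanceDown", "instance": "db1"},
            "annotations": {"summary": "db1 is unreachable"},
        }
    ],
})

for line in summarize_webhook(SAMPLE_BODY):
    print(line)
```

The resulting lines can then be forwarded to whatever system you want to plug in: a chat bot, a ticketing tool, an SMS gateway.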
43. High Availability
2 Prometheus servers do exactly the same job
They send alerts to Alertmanagers
Alertmanagers are clustered so that the same notification is not sent twice
45. Grafana
Open Source (Apache 2.0)
Web app
Specialized in visualization
Pluggable
Multiple datasources: Prometheus, Graphite, InfluxDB...
Has an API!
46. History of Grafana
Grafana is a fork of Kibana 3; used to be JS-driven.
Now fully featured, requires a database, multi-project/multi-user support, etc.
54. Creating Grafana Dashboards
Takes time
Requires deep knowledge of the tools
Improved over time
Easy to share (json + online library)
Try grafonnet-lib!
55. Conclusion
Lots of data that can be explored in many ways (subqueries are coming)
Trends and deviations are easy to calculate
Can monitor both business and technical metrics
Very convenient to monitor any kind of stack
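To illustrate why trends are easy to calculate from counters, here is a simplified sketch of a per-second rate over a window of samples, including the handling of a counter that restarts from zero. It is written for illustration and is not PromQL's `rate()`, which also extrapolates to the window boundaries; the function name and the sample values are hypothetical.

```python
def per_second_rate(samples):
    """samples: list of (timestamp_seconds, counter_value), oldest first."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            # The counter dropped, so it restarted from zero: count the new value in full.
            increase += current
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must increase")
    return increase / elapsed


# One scrape every 15 seconds; the process restarted before the third scrape.
window = [(0, 100), (15, 130), (30, 10), (45, 40)]
print(per_second_rate(window))
```

Because the raw counter is stored rather than a pre-computed rate, the same samples can be re-read later over a different window.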