Understanding and Extending
Prometheus AlertManager
Lee Calcote
calcotestudios.com/talks
Lee Calcote
linkedin.com/in/leecalcote
@lcalcote
blog.gingergeek.com
lee@calcotestudios.com
clouds, containers, infrastructure,
applications  and their management
Show of Hands
AlertManager
Prometheus
Alertmanager is an alert...
Purpose
ingester
grouper
de-duplicator
silencer
throttler
notifier
Nomenclature (ˈnō-mən-ˌklā-chər)
a brief Prometheus AlertManager construct review
Routes - match alerts to their receiver and determine how often to notify
Receivers - where and how to send alerts
Silencers (muting) - match alerts with specific labels and prevent them from being included in notifications.
Inhibitors (suppressing) - suppress specific notifications when other specific alerts are already firing.
Grouping (correlating) - categorizes alerts of a similar nature into a single notification.
group_by: ['alertname', 'cluster']
group_wait: 30s
group_interval: 5m
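One way to read the grouping settings: alerts that share the same values for the group_by labels land in the same notification. The Go sketch below illustrates only that bucketing step. The Alert type and the groupKey and groupAlerts functions are hypothetical names for illustration, not AlertManager internals, and the timing settings (group_wait, group_interval) are left out.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Alert is a hypothetical, simplified alert: just a set of labels.
type Alert struct {
	Labels map[string]string
}

// groupKey builds a key from the values of the group_by labels.
// Alerts with equal keys belong to the same notification group.
func groupKey(a Alert, groupBy []string) string {
	parts := make([]string, 0, len(groupBy))
	for _, name := range groupBy {
		parts = append(parts, name+"="+a.Labels[name])
	}
	return strings.Join(parts, ",")
}

// groupAlerts buckets alerts by their group key.
func groupAlerts(alerts []Alert, groupBy []string) map[string][]Alert {
	groups := make(map[string][]Alert)
	for _, a := range alerts {
		k := groupKey(a, groupBy)
		groups[k] = append(groups[k], a)
	}
	return groups
}

func main() {
	alerts := []Alert{
		{Labels: map[string]string{"alertname": "instance_down", "cluster": "a", "instance": "host1"}},
		{Labels: map[string]string{"alertname": "instance_down", "cluster": "a", "instance": "host2"}},
		{Labels: map[string]string{"alertname": "instance_down", "cluster": "b", "instance": "host3"}},
	}
	groups := groupAlerts(alerts, []string{"alertname", "cluster"})

	// Sort the keys so the output order is stable.
	keys := make([]string, 0, len(groups))
	for k := range groups {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s: %d alert(s)\n", k, len(groups[k]))
	}
}
```

With group_by set to alertname and cluster, the two alerts from cluster a would share one notification and the alert from cluster b would get its own.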
Multiple approaches to suppression
repeat_interval (per route)
vs
Inhibition (global)
vs
Silences (via UI / API)
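Of the three approaches, inhibition is the least obvious, so here is a rough Go sketch of the idea: a firing alert that matches a source matcher mutes alerts that match a target matcher, as long as both carry the same values for a list of labels. The InhibitRule type and the inhibited function are hypothetical and only loosely modeled on the source/target/equal shape of an inhibit rule; this is not AlertManager's implementation.

```go
package main

import "fmt"

// Labels is a plain label set.
type Labels map[string]string

// InhibitRule is a hypothetical model of one inhibit rule: a firing
// alert matching Source mutes alerts matching Target, provided both
// carry the same values for the labels listed in Equal.
type InhibitRule struct {
	Source Labels
	Target Labels
	Equal  []string
}

// matchAll reports whether labels contains every name/value pair in want.
func matchAll(labels, want Labels) bool {
	for name, value := range want {
		if labels[name] != value {
			return false
		}
	}
	return true
}

// inhibited reports whether candidate is muted by any firing alert.
func inhibited(candidate Labels, firing []Labels, rule InhibitRule) bool {
	if !matchAll(candidate, rule.Target) {
		return false
	}
	for _, src := range firing {
		if !matchAll(src, rule.Source) {
			continue
		}
		same := true
		for _, name := range rule.Equal {
			if src[name] != candidate[name] {
				same = false
				break
			}
		}
		if same {
			return true
		}
	}
	return false
}

func main() {
	rule := InhibitRule{
		Source: Labels{"severity": "critical"},
		Target: Labels{"severity": "warning"},
		Equal:  []string{"cluster"},
	}
	firing := []Labels{{"alertname": "cluster_down", "severity": "critical", "cluster": "a"}}

	fmt.Println(inhibited(Labels{"severity": "warning", "cluster": "a"}, firing, rule)) // true
	fmt.Println(inhibited(Labels{"severity": "warning", "cluster": "b"}, firing, rule)) // false
}
```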
Alerts
a shared construct

Prometheus
ALERT <alert name>
  IF <PromQL vector expression>
  FOR <duration>
  LABELS { ... }
  ANNOTATIONS { ... }
alert states: inactive, pending, firing

AlertManager
Supports clients other than Prometheus
Is notified when alerts transition state
state transition: inactive, firing (triggers notifications)
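A minimal Go sketch of the three Prometheus-side states, assuming the usual semantics: an alert is pending while its expression is true but the FOR duration has not yet elapsed, and firing once it has. The State type and the next function are hypothetical names for illustration, not Prometheus code.

```go
package main

import (
	"fmt"
	"time"
)

// State is the lifecycle state of an alert.
type State string

const (
	Inactive State = "inactive"
	Pending  State = "pending"
	Firing   State = "firing"
)

// next returns the alert state given whether the rule expression is
// currently true, how long it has been continuously true, and the
// rule's FOR duration.
func next(exprTrue bool, activeFor, forDuration time.Duration) State {
	switch {
	case !exprTrue:
		return Inactive
	case activeFor < forDuration:
		return Pending
	default:
		return Firing
	}
}

func main() {
	forDuration := 5 * time.Second
	fmt.Println(next(false, 0, forDuration))             // inactive
	fmt.Println(next(true, 2*time.Second, forDuration)) // pending
	fmt.Println(next(true, 5*time.Second, forDuration)) // firing
}
```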
Notification Integrations
Notifying to Multiple Destinations
Use a receiver that goes to both destinations.
route:
  receiver: email_webhook
receivers:
- name: email_webhook
  email_configs:
  - to: 'lee@example.io'
  webhook_configs:
  - url: <webhook url here>
or
Use continue to advance to the next receiver.
route:
  receiver: ops-team-all # default
  routes:
  - match:
      severity: page
    receiver: ops-team-b
    continue: true
  - match:
      severity: critical
    receiver: ops-team-a
receivers:
- name: ops-team-all
  email_configs:
  - to: ops-team-all@example.io
- name: ops-team-a
  email_configs:
  - to: ops-team-a@example.io
- name: ops-team-b
  email_configs:
  - to: ops-team-b@example.io
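How continue affects matching can be sketched in Go. This simplified, flat model walks the child routes in order, collects the receiver of each matching route, and stops at the first match unless that route sets continue; if nothing matches, the default receiver is used. The Route type and the receiversFor function are hypothetical, the team label is an assumption added so that one alert can match both routes, and real AlertManager routes form a nested tree with more matcher types.

```go
package main

import "fmt"

// Route is a hypothetical, flattened child route: equality matchers,
// a receiver name, and the continue flag.
type Route struct {
	Match    map[string]string
	Receiver string
	Continue bool
}

// matches reports whether every matcher equals the alert's label value.
func (r Route) matches(labels map[string]string) bool {
	for name, want := range r.Match {
		if labels[name] != want {
			return false
		}
	}
	return true
}

// receiversFor returns the receivers that would be notified for an alert.
func receiversFor(labels map[string]string, defaultReceiver string, routes []Route) []string {
	var out []string
	for _, r := range routes {
		if !r.matches(labels) {
			continue
		}
		out = append(out, r.Receiver)
		if !r.Continue {
			break // first match wins unless continue is set
		}
	}
	if len(out) == 0 {
		return []string{defaultReceiver}
	}
	return out
}

func main() {
	routes := []Route{
		{Match: map[string]string{"severity": "page"}, Receiver: "ops-team-b", Continue: true},
		{Match: map[string]string{"team": "a"}, Receiver: "ops-team-a"},
	}
	fmt.Println(receiversFor(map[string]string{"severity": "page", "team": "a"}, "ops-team-all", routes))
	fmt.Println(receiversFor(map[string]string{"severity": "info"}, "ops-team-all", routes))
}
```

In this model, the first alert reaches both ops-team-b and ops-team-a because the first route sets continue, while the second alert matches no child route and falls back to ops-team-all.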
Non-HA AlertManager Architecture
[Architecture diagram. Components shown: API, UI, Alert Provider, Silence Provider, Notify Provider, store, Dispatcher, Router, Inhibitor, Silencer, de-duplication, notification pipeline with Retry, Maintenance Script. Flows shown: subscribe, batched alerts, alerts.]
Dispatcher sorts incoming alerts into aggregation groups and assigns the correct notifiers to each.
Notify Provider checks for previously sent notifications.
High Availability
being introduced in 0.5
Gossip protocol, built atop Weave Mesh.
With HA, you no longer have to monitor the monitor.
 
Designed for an alert to be sent to all instances in the cluster.
 
All Prometheus instances send alerts to all Alertmanager instances.
 
Guarantees notifications to be sent at least once.
AlertManager UI
Story:
As an Operator, I would like to see not only a list of firing alerts,
but also a list of all transpired alerts, so that I may have additional
context as to the thresholding behavior for a given defined alert.
Prologue:
Alert troubleshooting is improved when operators can see what is
firing, what has recently fired, and what is normal, and can also go back
in time to see what fired an hour ago. Understanding firing order
assists in root cause analysis and in identifying problem areas.
 
Limitations:
1. AlertManager database (SQLite) is not intended to provide
long-term storage.
Acceptance Criteria:
1. Once fired, whether actively firing or not, alerts will be
displayed on the History page.
2. Optionally, fired alerts will be notified to a Slack channel.
Stretch:
Include pagination
Add a date range picker
Add a host filter
 
Environment
test setup

Random Sample Targets
Fetch and compile the client library code example.
$ git clone https://github.com/prometheus/client_golang.git
$ cd client_golang/examples/random
$ go get -d
$ go build
Start example targets in separate terminals.
$ ./random -listen-address=:8080
$ ./random -listen-address=:8081
$ ./random -listen-address=:8082

Prometheus and Alert Rules Setup
Follow the getting started instructions to download, configure and run Prometheus.
Be sure to create and run the random sample targets, and point Prometheus at your soon-to-be AlertManager:
$ ./prometheus -config.file=prometheus.yml -alertmanager.url=http://localhost:9093

/prometheus.yml
...
# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:
- "alert.rules"
...

/alert.rules
ALERT instance_down
  IF up == 0
  FOR 5s
  LABELS {severity="page"}
  ANNOTATIONS {
    DESCRIPTION="{{$labels.instance}} of job {{$labels.job}} has been down for more than 5 seconds.",
    SUMMARY="Instance {{$labels.instance}} down"}
A simple alert rule that will fire when any given target is unreachable for longer than 5 seconds.
Environment
development setup
Grab Repos
Clone AlertManager repo
$ git clone https://github.com/prometheus/alertmanager.git
Given that our user story includes making front-end changes to AlertManager,
ensure that you install go-bindata, a small utility to generate Go code from any file.
Get, build and copy go-bindata into any directory on your PATH
$ go get -u github.com/jteeuwen/go-bindata/...
$ cd $GOPATH/src/github.com/jteeuwen/go-bindata/go-bindata
$ go build
Notification Integration
Of the supported AlertManager receivers,
let’s opt for integrating Slack.
Create an alert notification receiver.
 
route:
  group_by: [cluster]
  # If an alert isn't caught by a route, send it to Slack.
  receiver: slack_general
  routes:
  # Send severity=page alerts to Slack.
  - match:
      severity: page
    receiver: slack_general
receivers:
- name: slack_general
  slack_configs:
  - api_url: '<your-web-url-here>'
    channel: '#<your-channel-name-here>'
    send_resolved: true
The visual editor can assist in building routing trees.
Build, Run, Test
Verify you have a functional development
environment by building and running the project:
$ make assets # invokes go-bindata to inject static web files
$ go build # compiles go code
$ ./alertmanager -config.file=slack.yml # runs alertmanager with the specified configuration
Reload Prometheus or AlertManager configs
$ curl -X POST http://localhost:9090/-/reload
$ kill -HUP `pgrep alertmanager`
Validate Prometheus config and alert rules
$ ./promtool check-config <config file>
$ ./promtool check-rules <rules file>
Test
If you choose to set up a Slack channel, you
should now see new alerts firing as and
when your random targets go up and down.
Changelog
/ui/app/js/app.js - Angular
/ui/app/partials/history.html - HTML
/api.go - Go
/provider/provider.go - Go & SQL
/provider/sqlite/sqlite.go - Go & SQL
/provider/boltmem/boltmem.go - Go & SQL
/api.go
All UI functionality should be addressable via API.
Let’s register a new /history API endpoint:
r.Get("/history", ihf("history", api.listAllAlerts))
With /api/v1/history now an addressable API endpoint, we’ll need to
build a function to handle requests made to it.
The api.listAllAlerts function will handle inbound
HTTP requests made to the new endpoint.
func (api *API) listAllAlerts(w http.ResponseWriter, r *http.Request) {
	alerts := api.alerts.GetAll()
	defer alerts.Close()
	// ... (the rest of the handler is not shown on the slide)
}
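A self-contained sketch of what such a handler might look like, using only the Go standard library. The Alert struct, the in-memory history slice, the serveOnce helper, and the JSON envelope with status and data fields are assumptions for illustration; the real handler reads from api.alerts and uses AlertManager's own response helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Alert is a hypothetical, trimmed-down alert record.
type Alert struct {
	Labels map[string]string `json:"labels"`
	State  string            `json:"state"`
}

// historyHandler returns a handler that writes every known alert,
// firing or not, as JSON.
func historyHandler(history []Alert) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		resp := map[string]interface{}{"status": "success", "data": history}
		if err := json.NewEncoder(w).Encode(resp); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	}
}

// serveOnce runs the handler against an in-memory request and returns
// the status code and body, without opening a network port.
func serveOnce(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	history := []Alert{
		{Labels: map[string]string{"alertname": "instance_down"}, State: "resolved"},
	}
	code, body := serveOnce(historyHandler(history), "/api/v1/history")
	fmt.Println(code, body)
}
```

Wrapping the payload in a data field keeps the sketch in line with the Angular controller, which reads data.data from the response.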
/provider
With the API endpoint in place, let’s turn our attention to the
backend for collecting the right recordset from our
data provider.
1. Add a new AlertIterator (e.g. GetAll() AlertIterator) to /provider/provider.go
2. Add a new AlertProvider and SQL query to /provider/sqlite/sqlite.go
3. Add a new AlertIterator and AlertProvider to /provider/boltmem/boltmem.go
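The iterator idea behind these steps can be sketched as follows. This is a simplification under stated assumptions: the method set (Next, Close), the sliceIterator, the memProvider and the collect helper are hypothetical stand-ins rather than the definitions in /provider/provider.go, and a slice takes the place of the SQL query.

```go
package main

import "fmt"

// Alert is a hypothetical, trimmed-down alert record.
type Alert struct {
	Name     string
	Resolved bool
}

// AlertIterator streams alerts to a consumer over a channel.
type AlertIterator interface {
	Next() <-chan Alert
	Close()
}

// sliceIterator feeds a fixed slice of alerts through a channel.
type sliceIterator struct {
	ch   chan Alert
	done chan struct{}
}

func newSliceIterator(alerts []Alert) *sliceIterator {
	it := &sliceIterator{ch: make(chan Alert), done: make(chan struct{})}
	go func() {
		defer close(it.ch)
		for _, a := range alerts {
			select {
			case it.ch <- a:
			case <-it.done: // consumer stopped early
				return
			}
		}
	}()
	return it
}

func (it *sliceIterator) Next() <-chan Alert { return it.ch }

// Close releases the producer goroutine; call it exactly once.
func (it *sliceIterator) Close() { close(it.done) }

// memProvider is a hypothetical in-memory provider.
type memProvider struct{ alerts []Alert }

// GetAll returns every stored alert, resolved or not.
func (p *memProvider) GetAll() AlertIterator { return newSliceIterator(p.alerts) }

// collect drains an iterator into a slice and closes it.
func collect(it AlertIterator) []Alert {
	defer it.Close()
	var out []Alert
	for a := range it.Next() {
		out = append(out, a)
	}
	return out
}

func main() {
	p := &memProvider{alerts: []Alert{
		{Name: "instance_down", Resolved: true},
		{Name: "instance_down", Resolved: false},
	}}
	for _, a := range collect(p.GetAll()) {
		fmt.Printf("%s resolved=%v\n", a.Name, a.Resolved)
	}
}
```

The point of GetAll in this sketch is that resolved alerts are returned alongside active ones, which is what a history view needs.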
/ui/app/js/app.js
NavCtrl for the History menu item:
angular.module('am.controllers').controller('NavCtrl',
  function($scope, $location) {
    $scope.items = [{
      name: 'History',
      url: 'history'
    },
    // ... (remaining menu items are not shown on the slide)
as well as a new History service:
angular.module('am.services').factory('History',
  function($resource) {
    return $resource('', {}, {
      'query': {
        method: 'GET',
        url: 'api/v1/history'
      }
    });
  }
);
and a new History controller:
angular.module('am.controllers').controller('HistoryCtrl',
  function($scope, History) {
    $scope.refresh = function() {
      History.query({},
        function(data) {
          $scope.groups = data.data;
          console.log($scope.groups);
        },
        function(data) {
          console.log(data.data);
        }
      );
    };
    $scope.refresh();
  }
);
Insert a new History directive:
angular.module('am.directives').directive('history',
  function() {
    return {
      restrict: 'E',
      scope: {
        alert: '=',
        group: '='
      },
      templateUrl: 'app/partials/history.html'
    };
  }
);
/ui/app/partials/history.html
Finally, we’ll need a page in which to
view the transpired alerts. So, create a
new file, history.html, under
/ui/app/partials.
 
history.html will simply format and
display a tabular recordset. A new
recordset will be retrieved from our data
provider.
Summary
This example enhancement provides a view
of transient history: that of the period that
the SQLite database holds.
 
AlertManager is not currently intended to
provide long-term storage.
 
Contributing is easier than you may think.
 
Reference
Alert History fork
Alert History tutorial
Resources
IRC: #prometheus on irc.freenode.net
Mailing lists:
prometheus-users - discussing Prometheus usage and community support
prometheus-developers - contributing to Prometheus development
@PrometheusIO
Prometheus repositories to file bugs and feature requests
Lee Calcote
Thank you.
Questions?
clouds, containers, infrastructure,
applications  and their management
linkedin.com/in/leecalcote
@lcalcote
blog.gingergeek.com
lee@calcotestudios.com
calcotestudios.com/talks
yes, we're hiring

Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
SoftTeco - Software Development Company Profile
SoftTeco - Software Development Company ProfileSoftTeco - Software Development Company Profile
SoftTeco - Software Development Company Profileakrivarotava
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 

Recently uploaded (20)

Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conference
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh IT
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
SoftTeco - Software Development Company Profile
SoftTeco - Software Development Company ProfileSoftTeco - Software Development Company Profile
SoftTeco - Software Development Company Profile
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 

Understanding and Extending Prometheus AlertManager

  • 1. Understanding and Extending Prometheus AlertManager Lee Calcote calcotestudios.com/talks
  • 2. Lee Calcote linkedin.com/in/leecalcote @lcalcote blog.gingergeek.com lee@calcotestudios.com clouds, containers, infrastructure, applications  and their management calcotestudios.com/talks
  • 6. Nomenclature (ˈnō-mən-ˌklā-chər): a brief Prometheus AlertManager construct review @lcalcote
       Routes: match alerts to their receiver and how often to notify
       Receivers: where and how to send alerts
  • 7. Nomenclature (ˈnō-mən-ˌklā-chər): a brief Prometheus AlertManager construct review @lcalcote
       Silencers (muting): match alerts with specific labels and prevent them from being included in notifications.
       Inhibitors (suppressing): suppress specific notifications when other specific alerts are already firing.
       Grouping (correlating): categorizes alerts of a similar nature into a single notification.
         group_by: ['alertname', 'cluster']
         group_wait: 30s
         group_interval: 5m
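To make the grouping idea concrete, here is a minimal Python sketch of what group_by does conceptually: alerts that share the same values for the listed labels land in one bucket and would be sent as a single notification. The function name group_alerts and the sample alerts are illustrative assumptions, not AlertManager code, and timing (group_wait, group_interval) is left out.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the labels named in group_by."""
    groups = defaultdict(list)
    for alert in alerts:
        # Assumption of this sketch: a missing label counts as an empty string.
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical alerts, using the group_by labels from the slide.
alerts = [
    {"alertname": "instance_down", "cluster": "a", "instance": "host1:8080"},
    {"alertname": "instance_down", "cluster": "a", "instance": "host2:8080"},
    {"alertname": "instance_down", "cluster": "b", "instance": "host3:8080"},
]
groups = group_alerts(alerts, ["alertname", "cluster"])
for key, members in groups.items():
    print(key, len(members))
```

Two of the three sample alerts share alertname and cluster, so they end up in one group; the third forms a group of its own.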
  • 8. Multiple approaches to suppression @lcalcote
       Inhibition (global) vs Silences (via UI / API) vs repeat_interval (per route)
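As a rough illustration of inhibition, the Python sketch below mutes a target alert when a matching source alert is firing and both agree on a set of labels. The keys source_match, target_match and equal mirror the fields of an AlertManager inhibit rule; the helper names and sample alerts are assumptions, and the real matching logic has more cases than shown here.

```python
def matches(labels, match):
    """True if every key/value pair in match is present in labels."""
    return all(labels.get(k) == v for k, v in match.items())


def is_inhibited(target, firing, rule):
    """Return True if some firing source alert suppresses the target alert."""
    if not matches(target, rule["target_match"]):
        return False
    for source in firing:
        if not matches(source, rule["source_match"]):
            continue
        # The source only inhibits targets that agree on the 'equal' labels.
        if all(source.get(l) == target.get(l) for l in rule["equal"]):
            return True
    return False


# Hypothetical rule: a critical alert mutes warnings in the same cluster.
rule = {
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["cluster"],
}
firing = [{"alertname": "cluster_down", "severity": "critical", "cluster": "a"}]
warn_a = {"alertname": "high_latency", "severity": "warning", "cluster": "a"}
warn_b = {"alertname": "high_latency", "severity": "warning", "cluster": "b"}
print(is_inhibited(warn_a, firing, rule), is_inhibited(warn_b, firing, rule))
```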
  • 9. Alerts: a shared construct @lcalcote
       Prometheus:
         ALERT <alert name>
           IF <PromQL vector expression>
           FOR <duration>
           LABELS { ... }
           ANNOTATIONS { ... }
         State transition: inactive, pending, firing
       AlertManager:
         Supports clients other than Prometheus
         Is notified when alerts transition state
         State transition: inactive, firing (notifications)
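The inactive, pending, firing progression on the Prometheus side can be sketched in a few lines of Python. This is a simplified model under stated assumptions: samples is a hypothetical list of (timestamp in seconds, condition is true) pairs, and the alert fires once the condition has held for at least for_seconds. Real rule evaluation happens at each evaluation interval and is not reproduced here.

```python
def alert_states(samples, for_seconds):
    """Map (timestamp, condition_true) samples to alert states."""
    states = []
    active_since = None  # timestamp at which the condition first became true
    for ts, condition_true in samples:
        if not condition_true:
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = ts
        held_for = ts - active_since
        states.append("firing" if held_for >= for_seconds else "pending")
    return states


# Hypothetical timeline for a rule with FOR 5s.
samples = [(0, False), (1, True), (3, True), (6, True), (7, False)]
print(alert_states(samples, 5))
```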
  • 11. Notifying to Multiple Destinations @lcalcote
        Use a receiver that goes to both destinations.
          route:
            receiver: email_webhook
          receivers:
          - name: email_webhook
            email_configs:
            - to: 'lee@example.io'
            webhook_configs:
            - url: <webhook url here>
        or
        Use continue to advance to the next receiver.
          route:
            receiver: ops-team-all # default
            routes:
            - match:
                severity: page
              receiver: ops-team-b
              continue: true
            - match:
                severity: critical
              receiver: ops-team-a
          receivers:
          - name: ops-team-all
            email_configs:
            - to: ops-team-all@example.io
          - name: ops-team-a
            email_configs:
            - to: ops-team-a@example.io
          - name: ops-team-b
            email_configs:
            - to: ops-team-b@example.io
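The effect of continue can be sketched with a small Python walk over a routing tree. This is a simplified model, not AlertManager's router: it assumes every route names its own receiver and only supports exact-match label matching. slide_tree mirrors the routes on the slide; variant_tree is a hypothetical change in which both child routes match severity=page, to show continue delivering to two receivers.

```python
def matches(labels, match):
    """True if every key/value pair in match is present in labels."""
    return all(labels.get(k) == v for k, v in match.items())


def resolve_receivers(route, labels):
    """Return the receiver names an alert with these labels would reach."""
    receivers = []
    for child in route.get("routes", []):
        if not matches(labels, child.get("match", {})):
            continue
        receivers.extend(resolve_receivers(child, labels))
        # Without continue, evaluation stops at the first matching sibling.
        if not child.get("continue", False):
            break
    # If no child matched, the alert is handled by this route's receiver.
    return receivers or [route["receiver"]]


slide_tree = {
    "receiver": "ops-team-all",
    "routes": [
        {"match": {"severity": "page"}, "receiver": "ops-team-b", "continue": True},
        {"match": {"severity": "critical"}, "receiver": "ops-team-a"},
    ],
}
# Hypothetical variant: the second route also matches severity=page.
variant_tree = {
    "receiver": "ops-team-all",
    "routes": [
        {"match": {"severity": "page"}, "receiver": "ops-team-b", "continue": True},
        {"match": {"severity": "page"}, "receiver": "ops-team-a"},
    ],
}
print(resolve_receivers(slide_tree, {"severity": "page"}))
print(resolve_receivers(variant_tree, {"severity": "page"}))
```

In this model, an alert that matches no child route falls back to the default receiver at the top of the tree.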
  • 12. Non-HA AlertManager Architecture @lcalcote
        [Architecture diagram. Labeled components: API, UI, Alert Provider, Silence Provider, store, Dispatcher, Router, Inhibitor, Silencer, de-duplication, notification pipeline, Notify Provider, Retry, Maintenance Script. Labeled flows: alerts, subscribe, batched alerts, checks for previously sent notifications.]
        Dispatcher sorts incoming alerts into aggregation groups and assigns the correct notifiers to each.
  • 13. High Availability @lcalcote
        Being introduced in 0.5.
        Gossip protocols; built atop Weave Mesh.
        With HA, you no longer have to monitor the monitor.
        Designed for an alert to be sent to all instances in the cluster.
        All Prometheus instances send alerts to all Alertmanager instances.
        Guarantees notifications to be sent at least once.
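The "at least once" guarantee can be illustrated with a toy Python model. It is a sketch under loud assumptions, not a description of how AlertManager is implemented: every instance receives the same alert, instances act in a fixed order, and a shared "already sent" record stands in for state exchanged between peers. When that exchange works, one instance notifies; when it fails, every instance notifies, so duplicates are possible but a missed notification is not.

```python
def notifying_instances(instances, gossip_healthy):
    """Return the instances that send a notification for one alert group."""
    already_sent = False  # stands in for gossiped "notification sent" state
    senders = []
    for name in instances:
        # An instance skips sending only if it has learned a peer already sent.
        if gossip_healthy and already_sent:
            continue
        senders.append(name)
        already_sent = True
    return senders


# Hypothetical three-node cluster.
cluster = ["am-0", "am-1", "am-2"]
print(notifying_instances(cluster, gossip_healthy=True))
print(notifying_instances(cluster, gossip_healthy=False))
```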
  • 17. @lcalcote
        Story: As an Operator, I would like to see not only a list of firing alerts, but also a list of all transpired alerts, so that I may have additional context as to the thresholding behavior of a given defined alert.
        Prologue: Alert troubleshooting is improved when operators have a view of what is firing, what has recently fired and what is normal, and can also go back in time to see what fired an hour ago. Understanding firing order assists in root cause analysis and in identifying problem areas.
        Limitations:
          1. AlertManager database (SQLite) is not intended to provide long-term storage.
        Acceptance Criteria:
          1. Once fired, whether actively firing or not, alerts will be displayed on the History page.
          2. Optionally, fired alerts will be notified to a Slack channel.
        Stretch:
          Include pagination
          Add a date range picker
          Add a host filter
  • 19. Random Sample Targets @lcalcote
        Be sure to create and run the random sample targets and point it at your soon-to-be AlertManager:
        Fetch and compile the client library code example.
          $ git clone https://github.com/prometheus/client_golang.git
          $ cd client_golang/examples/random
          $ go get -d
          $ go build
        Start example targets in separate terminals.
          $ ./random -listen-address=:8080
          $ ./random -listen-address=:8081
          $ ./random -listen-address=:8082
  • 20. Prometheus and Alert Rules Setup @lcalcote
        Follow the getting started instructions to download, configure and run Prometheus.
          $ ./prometheus -config.file=prometheus.yml -alertmanager.url=http://localhost:9093
        /alert.rules
        A simple alert rule that will fire when any given target is unreachable for longer than 5 seconds.
          ALERT instance_down
            IF up == 0
            FOR 5s
            LABELS {severity="page"}
            ANNOTATIONS {
              DESCRIPTION="{{$labels.instance}} of job {{$labels.job}} has been down for more than 5 seconds.",
              SUMMARY="Instance {{$labels.instance}} down"}
        /prometheus.yml
          ...
          # Load and evaluate rules in this file every 'evaluation_interval' seconds.
          rule_files:
          - "alert.rules"
          ...
  • 22. Grab Repos @lcalcote
        Clone AlertManager repo
          $ git clone https://github.com/prometheus/alertmanager.git
        Given that our user story includes making front-end changes to AlertManager, ensure that you install a small utility to generate Go code from any file.
        Get, build and copy go-bindata into any directory on your PATH
          $ go get -u github.com/jteeuwen/go-bindata/...
          $ cd $GOPATH/src/github.com/jteeuwen/go-bindata/go-bindata
          $ go build
  • 23. Notification Integration @lcalcote
        Of the supported AlertManager receivers, let’s opt for integrating Slack.
        Create an alert notification receiver.
          route:
            group_by: [cluster]
            # If an alert isn't caught by a route, send it to Slack.
            receiver: slack_general
            routes:
            # Send severity=page alerts to Slack.
            - match:
                severity: page
              receiver: slack_general
          receivers:
          - name: slack_general
            slack_configs:
            - api_url: '<your-web-url-here>'
              channel: '#<your-channel-name-here>'
              send_resolved: true
  • 24. @lcalcote The visual editor can assist in building routing trees.
  • 25. Build, Run, Test @lcalcote
        Verify you have a functional development environment by building and running the project:
          $ make assets # invokes go-bindata to inject static web files
          $ go build # compiles go code
          $ ./alertmanager -config.file=slack.yml # runs alertmanager with the specified configuration
        Reload Prometheus or AlertManager configs
          $ curl -X POST http://localhost:9090/-/reload
          $ kill -HUP `pgrep alertmanager`
        Validate Prometheus config and alert rules
          $ ./promtool check-config <config file>
          $ ./promtool check-rules <rules file>
  • 26. @lcalcote Test: If you chose to set up a Slack channel, you should now see new alerts firing as your random targets go up and down.
  • 28. @lcalcote
        All UI functionality should be addressable via API. Let’s register a new /history API endpoint:
        /api.go
          r.Get("/history", ihf("history", api.listAllAlerts))
        With /api/v1/history now an addressable API endpoint, we’ll need to build a function to handle requests made to it. The api.listAllAlerts function will handle inbound HTTP requests made to the new endpoint.
          func (api *API) listAllAlerts(w http.ResponseWriter, r *http.Request) {
              alerts := api.alerts.GetAll()
              defer alerts.Close()
              ...
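For testing the new endpoint from outside the UI, a small Python client helper might look like the sketch below. It only builds the URL and unpacks a response body; the base URL, the helper names and the sample payload are assumptions. The one detail taken from the deck is that the Angular code later reads the alerts from the response's data field.

```python
import json
from urllib.parse import urljoin


def history_url(base_url):
    """Build the URL of the /api/v1/history endpoint from the tutorial."""
    return urljoin(base_url.rstrip("/") + "/", "api/v1/history")


def parse_history(body):
    """Extract the list of alerts from a JSON response body."""
    payload = json.loads(body)
    # Assumption: alerts live under "data"; a missing or null value means none.
    return payload.get("data") or []


# Hypothetical response body, for illustration only.
sample_body = json.dumps({
    "status": "success",
    "data": [{"labels": {"alertname": "instance_down", "severity": "page"}}],
})
print(history_url("http://localhost:9093"))
print(len(parse_history(sample_body)))
```

To call a running instance, the URL from history_url could be passed to urllib.request.urlopen and the response body handed to parse_history.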
  • 29. @lcalcote /provider
        With the API endpoint in place, let’s turn our attention to the backend for collecting the right recordset from our data provider.
          1. Add a new AlertIterator (e.g. GetAll() AlertIterator) to /provider/provider.go
          2. Add a new AlertProvider and SQL query to /provider/sqlite/sqlite.go
          3. Add a new AlertIterator and AlertProvider to /provider/boltmem/boltmem.go
  • 30. @lcalcote /ui/app/js/app.js
        NavCtrl for the new History menu item:
          angular.module('am.controllers').controller('NavCtrl',
            function($scope, $location) {
              $scope.items = [{
                name: 'History',
                url: 'history'
              },
              ...
        as well as a new History service:
          angular.module('am.services').factory('History',
            function($resource) {
              return $resource('', {}, {
                'query': {
                  method: 'GET',
                  url: 'api/v1/history'
                }
              });
            }
          );
        and a new History controller:
          angular.module('am.controllers').controller('HistoryCtrl',
            function($scope, History) {
              $scope.refresh = function () {
                History.query({},
                  function(data) {
                    $scope.groups = data.data;
                    console.log($scope.groups);
                  },
                  function(data) {
                    console.log(data.data);
                  })
              }
              $scope.refresh();
            }
          );
        Insert a new History directive:
          angular.module('am.directives').directive('history',
            function() {
              return {
                restrict: 'E',
                scope: {
                  alert: '=',
                  group: '='
                },
                templateUrl: 'app/partials/history.html'
              };
            }
          );
  • 31. @lcalcote /ui/app/partials/history.html
        Finally, we’ll need a page in which to view the transpired alerts. So, create a new file, history.html, under /ui/app/partials.
        history.html simply formats and displays a tabular recordset. A new recordset will be retrieved from our data provider.
  • 32. @lcalcote Summary
        This example enhancement provides a view of transient history: only the period that the SQLite database holds.
        AlertManager is not currently intended to provide long-term storage.
        Contributing is easier than you may think.
        Reference: Alert History fork, Alert History tutorial
  • 33. Resources @lcalcote
        IRC: #prometheus on irc.freenode.net
        Mailing lists:
          prometheus-users: discussing Prometheus usage and community support
          prometheus-developers: contributing to Prometheus development
        @PrometheusIO
        Prometheus repositories: to file bugs and feature requests
  • 34. Lee Calcote Thank you. Questions? clouds, containers, infrastructure, applications  and their management linkedin.com/in/leecalcote @lcalcote blog.gingergeek.com lee@calcotestudios.com calcotestudios.com/talks yes, we're hiring