2. Why Monitor Your System?
● Monitoring a computer system is just as important as the system itself.
● Monitoring enables a proactive rather than reactive response, and it supports data
security, data gathering, and the overall health of a computer system.
● While monitoring does not fix problems, it does lead to more stable and reliable
computer systems.
● For example, monitoring may alert a system administrator that a hard drive
in a server is degraded, so the administrator can swap it out for a new one
in time. Without monitoring, the degraded hard drive could turn into a failed
hard drive, causing an extended outage and possible data loss.
3. Monitoring Tools
- Monitoring tools continuously track the status of the system in use, in order
to give the earliest possible warning of failures, defects, or problems so
that they can be fixed.
- There are monitoring tools for servers, networks, databases, security,
performance, website and internet usage, and applications.
- Features:
➔ To log real-time and historical information.
➔ To find optimal settings.
➔ To monitor the number of users on a network.
➔ To monitor network traffic (either in real time or over a given period of
operation, with the analysis performed afterwards).
➔ To identify problems and send an alert message to the administrator (e.g. the
network administrator).
4. Tools Required
● Grafana: open source visualization and analytics software.
● Prometheus: a free software application used for event monitoring and alerting.
● Node Exporter: a Linux metrics exporter for Prometheus.
5. Prometheus
- Prometheus is a free software application used for event monitoring and alerting. It
records real-time metrics in a time series database, collected using an HTTP pull model, with
flexible queries and real-time alerting.
- Prometheus was created to monitor highly dynamic container environments such as
Kubernetes and Docker Swarm. However, it can also be used in traditional non-container
infrastructure, where applications are deployed directly on bare servers.
- Features:
● A multi-dimensional data model with time series data identified by metric name
and key/value pairs
● PromQL, a flexible query language to leverage this dimensionality
● No reliance on distributed storage; single server nodes are autonomous
● Time series collection happens via a pull model over HTTP
● Pushing time series is supported via an intermediary gateway
● Targets are discovered via service discovery or static configuration
● Multiple modes of graphing and dashboarding support
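The data model in the feature list can be illustrated with a small Python sketch. It is not how Prometheus stores data internally: the `Series` type, the `make_series` and `select` helpers, and the sample values are hypothetical, chosen only for illustration. The sketch shows that a series is identified by its metric name plus its key/value labels, and that filtering on labels (similar in spirit to a PromQL selector such as `node_cpu_seconds_total{mode="idle"}`) picks out the matching series.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    """One time series: identified by a metric name plus key/value labels."""
    name: str
    labels: tuple  # sorted (key, value) pairs, so equal label sets compare equal


def make_series(name, **labels):
    """Build a Series with its labels in a canonical (sorted) order."""
    return Series(name, tuple(sorted(labels.items())))


def select(store, name, **wanted):
    """Return series with the given name whose labels include all wanted pairs.

    Roughly mirrors an equality-only selector like name{key="value"}.
    """
    matches = []
    for series in store:
        labels = dict(series.labels)
        if series.name == name and all(labels.get(k) == v for k, v in wanted.items()):
            matches.append(series)
    return matches


# Hypothetical samples: each series maps to a list of (timestamp, value) points.
store = {
    make_series("node_cpu_seconds_total", cpu="0", mode="idle"): [(1700000000, 1200.5)],
    make_series("node_cpu_seconds_total", cpu="0", mode="user"): [(1700000000, 340.2)],
    make_series("node_memory_MemFree_bytes"): [(1700000000, 2.1e9)],
}

idle = select(store, "node_cpu_seconds_total", mode="idle")
print(idle)
```

Because the labels are part of the identity, the two `node_cpu_seconds_total` entries are separate series even though they share a metric name.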
8. Node Exporter
- Prometheus pulls metrics from each target over an HTTP endpoint, which by
default is the host address followed by /metrics. For that to work, the
targets must expose a /metrics endpoint.
- Data available at the /metrics endpoint must be in a format that Prometheus
understands. Many services don't have native Prometheus endpoints, so extra
components are required.
- An exporter is a script or service that fetches metrics from a target,
converts them into a format Prometheus can understand, and exposes the
converted data at its own /metrics endpoint, where Prometheus can scrape it.
- Node Exporter is the Prometheus exporter for hardware and operating system
metrics on Linux servers.
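To make the exporter idea concrete, here is a minimal sketch in Python using only the standard library. It is not Node Exporter and not an official client library: the metric names `demo_load1` and `demo_cpu_count`, the helper names, and port 8000 are assumptions made for illustration. It fetches two host values, renders them in the Prometheus text exposition format, and serves them at /metrics.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect():
    """Gather a few host metrics as (name, help text, type, value) tuples."""
    load1, _, _ = os.getloadavg()  # available on Unix-like systems only
    return [
        ("demo_load1", "1-minute load average.", "gauge", load1),
        ("demo_cpu_count", "Number of CPUs visible to Python.", "gauge", os.cpu_count() or 0),
    ]


def render(metrics):
    """Render metrics in the Prometheus text exposition format."""
    lines = []
    for name, help_text, metric_type, value in metrics:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answer GET /metrics with freshly collected values."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve /metrics until interrupted; port 8000 is an arbitrary choice."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint; Prometheus could then scrape it if `localhost:8000` were added as a target. For real hosts, Node Exporter already covers these metrics and many more.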
10. Setup and Configuration -1 (Prometheus and Node Exporter)
● First, download and add the GPG key with the following command:
$ wget -q -O - https://s3-eu-west-1.amazonaws.com/deb.robustperception.io/41EFC99D.gpg | sudo apt-key add -
● Next, update the package index and install Prometheus with the following commands:
$ sudo apt-get update -y
$ sudo apt-get install prometheus prometheus-node-exporter prometheus-pushgateway prometheus-alertmanager -y
● Once the installation is complete, start the Prometheus service and enable it to start at boot with the
following commands:
$ sudo systemctl start prometheus
$ sudo systemctl enable prometheus
● You can also check the status of Prometheus service with the following command:
$ sudo systemctl status prometheus
11. Setup and Configuration -2 (Prometheus and Node Exporter)
● Configure the prometheus.yml file to scrape metrics from Node Exporter, which runs on port 9100.
scrape_configs:
  - job_name: node
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['localhost:9100']
● Also increase scrape_interval in the same config file to reduce the load on the server.
global:
  scrape_interval: 60s # Scrape targets every 60 seconds instead of every 15 seconds.
  evaluation_interval: 60s # Evaluate rules every 60 seconds instead of every 15 seconds.
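Once the configuration is in place, one way to confirm that the node job is being scraped is to ask Prometheus itself through its HTTP API. The sketch below is a hedged example: it assumes Prometheus is reachable at `http://localhost:9090` (its usual default, not stated in these slides), and the helper names are hypothetical. It runs the instant query `up{job="node"}` and reports which instances are up.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def targets_up(response_text):
    """Map each instance label to True/False from an `up` query response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload}")
    return {
        item["metric"].get("instance", "?"): item["value"][1] == "1"
        for item in payload["data"]["result"]
    }


def check_node_job(base_url="http://localhost:9090"):
    """Ask a running Prometheus whether the targets of job 'node' are up.

    The base URL is an assumption; adjust it to where Prometheus listens.
    """
    url = build_query_url(base_url, 'up{job="node"}')
    with urlopen(url, timeout=5) as resp:
        return targets_up(resp.read().decode("utf-8"))
```

With the scrape job above, `check_node_job()` would be expected to return a mapping that includes `localhost:9100`. A value of `False` means the last scrape of that target failed.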
12. Grafana
- Grafana is open source visualization and analytics software. It allows you to
query, visualize, alert on, and explore your metrics no matter where they are
stored. In general, it provides you with tools to turn your time-series
database (TSDB) data into beautiful graphs and visualizations.
- Grafana connects to many data sources (commonly databases), such as
Graphite, Prometheus, InfluxDB, Elasticsearch, MySQL, and PostgreSQL.
- Dashboards: dashboards offer a wide range of visualization options, such
as geo maps, heat maps, histograms, and the variety of charts and graphs that
a business typically needs to study data.
See a demo dashboard at https://play.grafana.org/
13. Setup and Configuration
● Installation
$ sudo apt-get install -y apt-transport-https
$ sudo apt-get install -y software-properties-common wget
$ wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install grafana-enterprise
● Setup
○ Start the server with systemd
$ sudo systemctl daemon-reload
$ sudo systemctl start grafana-server
$ sudo systemctl status grafana-server
○ Configure Grafana to start at boot
$ sudo systemctl enable grafana-server.service
● By default, Grafana listens on port 3000, which can be changed by editing
/etc/grafana/grafana.ini
○ Change http_port in the [server] section to your desired port, then restart grafana-server.
[server]
http_port = 8080
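As a quick check of which port a given grafana.ini would make Grafana listen on, the file can be read with Python's standard `configparser`. This is only a sketch: the `grafana_http_port` helper and the sample text are hypothetical, and it assumes the file follows plain INI syntax with `;` or `#` comment lines.

```python
import configparser


def grafana_http_port(ini_text, default=3000):
    """Return http_port from the [server] section, or the default if unset."""
    parser = configparser.ConfigParser(interpolation=None)
    # A placeholder section lets configparser accept keys that appear
    # before the first [section] header.
    parser.read_string("[__top__]\n" + ini_text)
    return parser.getint("server", "http_port", fallback=default)


sample = """
[server]
# Lines starting with ; or # are treated as comments
;protocol = http
http_port = 8080
"""
print(grafana_http_port(sample))
```

To check a real installation, pass the contents of /etc/grafana/grafana.ini instead of `sample`. When `http_port` is commented out or absent, the helper falls back to 3000.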
14. Good luck!
We hope you'll use these tips to go out and
deliver robust monitoring for your product or
service!
Reference :
● https://youtu.be/h4Sl21AKiDg
● https://prometheus.io/docs/prometheus/latest/getting_started/