The revolt against SQL continues at a steady but considerably slower pace. Bespoke database software seems to crop up daily in the name of performance or functionality. This talk will examine the ever-growing field of monitoring systems and their respective databases, and look in depth at how Postgres can be used in a number of these places. Systems of this nature are typically tasked with collecting and storing metrics from your infrastructure, drawing pretty graphs, and nagging you when things break.
The forms of data stored by these systems are nothing to be afraid of; they often include:
- Time series metrics - the history of a measurement over time, e.g. temperatures
- Logs - unstructured text emitted by applications, operating systems and hardware
- Events - schema-less but well structured notifications
An assertion of this talk is that for the majority of use cases, Postgres is more than capable of storing all of this data. We will attempt to replace numerous well-known pieces of software with just one Postgres database. Of course we are told to use the right tool for the job, but only having to learn and operate a single tool is a huge operational advantage.
We’ll get quite technical in this talk, take a look at the data models and access patterns required, and see how these can be fitted into the general-purpose environment of Postgres. It is also constructive to look at what can be problematic, and why many turn to bespoke solutions, rather than focusing only on the positives.
12. Monitoring Requirements
Fault finding and alerting
Notify me when a server or service is
unavailable, a disk needs replacing, ...
Fault post-mortem, pre-emption
Why did the outage occur, and what can we
do to prevent it next time?
13. Monitoring Requirements
Utilisation and efficiency analysis
Is all the hardware we own being used?
Is it being used efficiently?
Performance monitoring and profiling
How long do my web/database requests take?
14. Monitoring Requirements
Auditing (security, billing)
Tracking users' use of the system
Auditing access to systems or resources
Decision making, future planning
What is the expected growth in data or users?
Which parts of the current system are most used?
16. Existing Tools
Checking and Alerting
Agents check on machines or services
Report centrally, notify users via dashboard
Store history of events in database
39. Existing Tools
Commendable “right tool for the job” attitude, but…
How about Postgres?
Fewer points of failure
Fewer places to back up
Fewer redundancy protocols
One set of consistent data semantics
Re-use existing operational knowledge
51. Postgres for Metrics
Combine Series: average over all, for {series=cpu}, over a [time range/interval]
Read Series: for each {type}, for {series=cpu}, over a [time range/interval]
54. Postgres for Metrics
"metric": {
"timestamp": 1232141412,
"name": "cpu.percent",
"value": 42,
"dimensions": { "hostname": "dev-01" },
"value_meta": { … }
}
JSON Ingest Format
Known, well-defined structure
Varying set of dimension key/values
55. Postgres for Metrics
CREATE TABLE measurements (
timestamp TIMESTAMPTZ,
name VARCHAR,
value FLOAT8,
dimensions JSONB,
value_meta JSON
);
Basic Denormalised Schema
Straightforward mapping onto input data
One data model underlying all the schema variants that follow
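Ingesting the sample JSON metric above maps directly onto an INSERT; a minimal sketch:

INSERT INTO measurements
    (timestamp, name, value, dimensions, value_meta)
VALUES (
    TO_TIMESTAMP(1232141412),
    'cpu.percent',
    42,
    '{"hostname": "dev-01"}'::JSONB,
    NULL  -- value_meta elided in the sample
);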
56. Postgres for Metrics
SELECT
TIME_ROUND(timestamp, 60) AS timestamp,
AVG(value) AS avg
FROM
measurements
WHERE
timestamp BETWEEN '2015-01-01T00:00:00Z'
AND '2015-01-01T01:00:00Z'
AND name = 'cpu.percent'
AND dimensions @> '{"hostname": "dev-01"}'::JSONB
GROUP BY
TIME_ROUND(timestamp, 60)
Single Series Query
One hour window | Single hostname
Measurements every 60 second interval
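Note that TIME_ROUND is not a PostgreSQL built-in; the queries here assume a helper along these lines (a minimal sketch that rounds a timestamp down to a whole interval of secs seconds):

CREATE FUNCTION time_round (ts TIMESTAMPTZ, secs INT)
RETURNS TIMESTAMPTZ LANGUAGE SQL IMMUTABLE AS $_$
    -- Truncate the epoch to the interval, then convert back
    SELECT TO_TIMESTAMP(
        FLOOR(EXTRACT(EPOCH FROM ts) / secs) * secs
    );
$_$;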
57. Postgres for Metrics
SELECT
TIME_ROUND(timestamp, 60) AS timestamp,
AVG(value) AS avg,
dimensions ->> 'hostname' AS hostname
FROM
measurements
WHERE
timestamp BETWEEN '2015-01-01T00:00:00Z'
AND '2015-01-01T01:00:00Z'
AND name = 'cpu.percent'
GROUP BY
TIME_ROUND(timestamp, 60), hostname
Group Multi-Series Query
One hour window | Every hostname
Measurements every 60 second interval
58. Postgres for Metrics
SELECT
TIME_ROUND(timestamp, 60) AS timestamp,
AVG(value) AS avg
FROM
measurements
WHERE
timestamp BETWEEN '2015-01-01T00:00:00Z'
AND '2015-01-01T01:00:00Z'
AND name = 'cpu.percent'
GROUP BY
TIME_ROUND(timestamp, 60)
All Multi-Series Query
One hour window | Every hostname
Measurements every 60 second interval
60. Postgres for Metrics
SELECT DISTINCT
JSONB_OBJECT_KEYS(dimensions)
AS d_name
FROM
measurements
WHERE
name = 'cpu.percent'
Dimension Name List Query
(for specific metric)
61. Postgres for Metrics
SELECT DISTINCT
dimensions ->> 'hostname'
AS d_value
FROM
measurements
WHERE
name = 'cpu.percent'
AND dimensions ? 'hostname'
Dimension Value List Query
(for specific metric and dimension)
63. Postgres for Metrics
CREATE TABLE measurements (
timestamp TIMESTAMPTZ,
name VARCHAR,
value FLOAT8,
dimensions JSONB,
value_meta JSON
);
CREATE INDEX ON measurements
(name, timestamp);
CREATE INDEX ON measurements USING GIN
(dimensions);
Indexes
Covers all necessary query terms
A single GIN index saves space, but is slower
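An aside not on the slide: if dimensions were only ever filtered with the containment operator @>, the GIN index could use the smaller and typically faster jsonb_path_ops operator class; it does not, however, support the key-existence operator ? used in the dimension value list query:

CREATE INDEX ON measurements USING GIN
    (dimensions jsonb_path_ops);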
64. Postgres for Metrics
● Series Queries
● All, Group, Specific
● Varying Time Window/Interval
5m|15s, 1h|15s, 1h|300s, 6h|300s, 24h|300s
● Listing Queries
● Metric Names, Dimension Names & Values
● All, Partial
65. Postgres for Metrics
[Chart: "Denormalised" series query durations (ms) for Single, Group and All queries, across 5m (15s), 1h (15s), 1h (300s), 6h (300s) and 24h (300s) windows; x-axis 0–12000 ms]
66. Postgres for Metrics
[Chart: "Denormalised" series query durations (ms) for Single, Group and All queries, same windows as above; x-axis 0–2500 ms]
69. Postgres for Metrics
CREATE TABLE measurement_values (
timestamp TIMESTAMPTZ,
metric_id INT,
value FLOAT8,
value_meta JSON
);
CREATE TABLE metrics (
id SERIAL,
name VARCHAR,
dimensions JSONB
);
Normalised Schema
Reduces duplication of data
Pre-built set of distinct metric definitions
70. Postgres for Metrics
CREATE FUNCTION get_metric_id (in_name VARCHAR, in_dims JSONB)
RETURNS INT LANGUAGE plpgsql AS $_$
DECLARE
out_id INT;
BEGIN
SELECT id INTO out_id FROM metrics AS m
WHERE m.name = in_name AND m.dimensions = in_dims;
IF NOT FOUND THEN
INSERT INTO metrics ("name", "dimensions") VALUES
(in_name, in_dims) RETURNING id INTO out_id;
END IF;
RETURN out_id;
END; $_$;
Normalised Schema
Function to use at insert time
Finds existing metric_id or allocates new
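A caveat not on the slide: under concurrency, two sessions can both miss the SELECT and insert duplicate metric rows. Assuming a unique index on (name, dimensions), a hedged variant uses ON CONFLICT (PostgreSQL 9.5+, which the summarised schema later requires anyway):

CREATE UNIQUE INDEX ON metrics (name, dimensions);

CREATE OR REPLACE FUNCTION get_metric_id (in_name VARCHAR, in_dims JSONB)
RETURNS INT LANGUAGE plpgsql AS $_$
DECLARE
    out_id INT;
BEGIN
    -- The no-op update lets RETURNING yield the id of the existing row
    INSERT INTO metrics ("name", "dimensions")
    VALUES (in_name, in_dims)
    ON CONFLICT (name, dimensions)
    DO UPDATE SET name = EXCLUDED.name
    RETURNING id INTO out_id;
    RETURN out_id;
END; $_$;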
71. Postgres for Metrics
CREATE VIEW measurements AS
SELECT *
FROM measurement_values
INNER JOIN
metrics ON (metric_id = id);
CREATE INDEX metrics_idx ON
metrics (name, dimensions);
CREATE INDEX measurements_idx ON
measurement_values (metric_id, timestamp);
Normalised Schema
Same queries, use view to join
Extra index to help normalisation step
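Ingest against the normalised schema then looks something like this sketch, reusing the sample metric from earlier:

INSERT INTO measurement_values
    (timestamp, metric_id, value, value_meta)
VALUES (
    TO_TIMESTAMP(1232141412),
    get_metric_id('cpu.percent', '{"hostname": "dev-01"}'::JSONB),
    42,
    NULL  -- value_meta elided in the sample
);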
77. Postgres for Metrics
CREATE TABLE summary_values_5m (
timestamp TIMESTAMPTZ,
metric_id INT,
value_sum FLOAT8,
value_count FLOAT8,
value_min FLOAT8,
value_max FLOAT8,
UNIQUE (metric_id, timestamp)
);
Summarised Schema
Pre-compute every 5m (300s) interval
Aggregate functions to be applied must be known in advance
78. Postgres for Metrics
CREATE FUNCTION update_summarise () RETURNS TRIGGER
LANGUAGE plpgsql AS $_$
BEGIN
INSERT INTO summary_values_5m VALUES (
TIME_ROUND(NEW.timestamp, 300), NEW.metric_id,
NEW.value, 1, NEW.value, NEW.value)
ON CONFLICT (metric_id, timestamp)
DO UPDATE SET
value_sum = summary_values_5m.value_sum + EXCLUDED.value_sum,
value_count = summary_values_5m.value_count + EXCLUDED.value_count,
value_min = LEAST (summary_values_5m.value_min, EXCLUDED.value_min),
value_max = GREATEST(summary_values_5m.value_max, EXCLUDED.value_max);
RETURN NULL;
END; $_$;
Summarised Schema
Entry for each metric/rounded time period
Update existing entries by aggregating
79. Postgres for Metrics
CREATE TRIGGER update_summarise_trigger
AFTER INSERT ON measurement_values
FOR EACH ROW
EXECUTE PROCEDURE update_summarise ();
CREATE VIEW summary_5m AS
SELECT *
FROM
summary_values_5m INNER JOIN metrics
ON (metric_id = id);
Summarised Schema
Trigger applies row to summary table
View mainly for convenience when querying
80. Postgres for Metrics
SELECT
TIME_ROUND(timestamp, 300) AS timestamp,
AVG(value) AS avg
FROM
measurements
WHERE
timestamp BETWEEN '2015-01-01T00:00:00Z'
AND '2015-01-01T06:00:00Z'
AND name = 'cpu.percent'
GROUP BY
TIME_ROUND(timestamp, 300)
Combined Series Query
Six hour window | Every hostname
Measurements every 300 second interval
81. Postgres for Metrics
SELECT
TIME_ROUND(timestamp, 300) AS timestamp,
SUM(value_sum) / SUM(value_count) AS avg
FROM
summary_5m
WHERE
timestamp BETWEEN '2015-01-01T00:00:00Z'
AND '2015-01-01T06:00:00Z'
AND name = 'cpu.percent'
GROUP BY
TIME_ROUND(timestamp, 300)
Combined Series Query
Use pre-aggregated summary table
Mostly the same; extra fiddling for AVG
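Other pre-computed aggregates recombine just as easily; for example, a sketch following the same pattern for min and max:

SELECT
    TIME_ROUND(timestamp, 300) AS timestamp,
    MIN(value_min) AS min,
    MAX(value_max) AS max
FROM
    summary_5m
WHERE
    timestamp BETWEEN '2015-01-01T00:00:00Z'
    AND '2015-01-01T06:00:00Z'
    AND name = 'cpu.percent'
GROUP BY
    TIME_ROUND(timestamp, 300)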
87. Postgres for Metrics
● Need coarser summaries for wider
queries (e.g. 30m summaries)
● Need to partition data by day (see the
sketch after this list) to:
● Maintain the ingest rate as indexes grow
● Optimise dropping old data
● Much better ways exist to produce summaries
to optimise ingest, specifically:
● Process rows in batches of interval size
● Process asynchronously to the ingest transaction
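The talk does not show the partitioning itself; a minimal sketch using declarative partitioning (PostgreSQL 10+; older releases achieved the same with table inheritance) might look like this:

CREATE TABLE measurement_values (
    timestamp TIMESTAMPTZ,
    metric_id INT,
    value FLOAT8,
    value_meta JSON
) PARTITION BY RANGE (timestamp);

-- One partition per day; ingest only touches the current day's indexes
CREATE TABLE measurement_values_20150101
    PARTITION OF measurement_values
    FOR VALUES FROM ('2015-01-01') TO ('2015-01-02');

-- Dropping a day of expired data becomes a cheap metadata operation
DROP TABLE measurement_values_20150101;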
90. Postgres for Log Searching
Requirements
● Central log storage
● Trivially searchable
● Time bounded
● Filter ‘dimensions’
● Interactive query
times (<100ms)
91. Postgres for Log Searching
"log": {
"timestamp": 1232141412,
"message":
"Connect from 172.16.8.1:52690 to 172.16.8.10:5000 (keystone/HTTP)",
"dimensions": {
"severity": 6,
"facility": 16,
"pid": "39762",
"program": "haproxy"
"hostname": "dev-controller-0"
},
}
Log Ingest Format
Typically sourced from rsyslog
Varying set of dimension key/values
92. Postgres for Log Searching
CREATE TABLE logs (
timestamp TIMESTAMPTZ,
message VARCHAR,
dimensions JSONB
);
Basic Schema
Straightforward mapping of source data
Allow for maximum dimension flexibility
93. Postgres for Log Searching
connection AND program:haproxy
Query Example
Kibana/Elastic style using PG-FTS
SELECT *
FROM logs
WHERE
TO_TSVECTOR('english', message)
@@ TO_TSQUERY('connection')
AND dimensions @> '{"program":"haproxy"}';
94. Postgres for Log Searching
CREATE INDEX ON logs
USING GIN
(TO_TSVECTOR('english', message));
CREATE INDEX ON logs
USING GIN
(dimensions);
Indexes
Enables fast text search on ‘message’ and
fast filtering based on ‘dimensions’
103. Postgres for Log Parsing
CREATE TABLE logs (
timestamp TIMESTAMPTZ,
message VARCHAR,
dimensions JSONB
);
Log Schema – Goals:
Parse message against set of patterns
Add extracted information as dimensions
104. Postgres for Log Parsing
Patterns Table
Store pattern to match and field names
CREATE TABLE patterns (
regex VARCHAR,
field_names VARCHAR[]
);
INSERT INTO patterns (regex, field_names) VALUES (
'Connect from '
|| '(\d+\.\d+\.\d+\.\d+):(\d+) to (\d+\.\d+\.\d+\.\d+):(\d+)'
|| ' \((\w+)/(\w+)\)',
'{src_ip,src_port,dest_ip,dest_port,service,protocol}'
);
105. Postgres for Log Parsing
Log Processing
Apply all configured patterns to new rows
CREATE FUNCTION process_log () RETURNS TRIGGER
LANGUAGE plpgsql AS $_$
DECLARE
m JSONB; p RECORD;
BEGIN
FOR p IN SELECT * FROM patterns LOOP
-- REGEXP_MATCHES yields no row on a failed match, leaving m NULL
m := JSONB_OBJECT(p.field_names,
REGEXP_MATCHES(NEW.message, p.regex));
IF m IS NOT NULL THEN
NEW.dimensions := NEW.dimensions || m;
END IF;
END LOOP;
RETURN NEW;
END; $_$;
106. Postgres for Log Parsing
CREATE TRIGGER process_log_trigger
BEFORE INSERT ON logs
FOR EACH ROW
EXECUTE PROCEDURE process_log ();
Log Processing Trigger
Apply patterns to messages and extend their
dimensions as rows are inserted into the logs table
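Putting the pieces together, a sketch using the sample haproxy line from earlier:

INSERT INTO logs (timestamp, message, dimensions) VALUES (
    NOW(),
    'Connect from 172.16.8.1:52690 to 172.16.8.10:5000 (keystone/HTTP)',
    '{"program": "haproxy"}'::JSONB
);

-- The trigger has merged in the extracted fields, so this now matches
SELECT dimensions FROM logs
WHERE dimensions @> '{"service": "keystone"}';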
109. Postgres for Queueing
Requirements
● Offload data burden from producers
● Persist as soon as possible to avoid loss
● Handle high velocity burst loads
● Data does not need to be queryable
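The talk does not show its queue implementation; one common Postgres pattern (9.5+) lets many consumers dequeue in parallel using FOR UPDATE SKIP LOCKED. A hypothetical sketch:

CREATE TABLE queue (
    id BIGSERIAL PRIMARY KEY,
    payload JSONB,
    enqueued_at TIMESTAMPTZ DEFAULT NOW()
);

-- Each consumer claims up to 100 rows; locked rows are skipped,
-- so concurrent consumers never block one another
DELETE FROM queue
WHERE id IN (
    SELECT id FROM queue
    ORDER BY id
    LIMIT 100
    FOR UPDATE SKIP LOCKED
)
RETURNING payload;

Rows are durable as soon as the producer's transaction commits, which covers the persist-early requirement, and SKIP LOCKED keeps burst throughput high because consumers never wait on each other's locks.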
116. Conclusion… ?
● I view Postgres as a very flexible
“data persistence toolbox”
● ...which happens to use SQL
● Batteries not always included
● That doesn’t mean it’s hard
● Operational advantages of using
general purpose tools can be huge
● Use & deploy what you know & trust