PostgreSQL + Redis
Andrew Dunstan
andrew@dunslane.net
andrew.dunstan@pgexperts.com
Topics
● What is Redis?
● The Redis Foreign Data Wrapper
● The Redis Command wrapper for Postgres
● Case study – a high performance Ad server
using Postgres and Redis
What is Redis?
● High performance in-memory key/value data
store
Redis is easy to use
● Almost no configuration
● On Fedora
sudo yum install redis
sudo systemctl enable redis.service
sudo systemctl start redis.service
redis-cli
Redis keys
● Are just strings
Redis data values
● Values can be scalars
● Strings
● Integers
● Values can be structured
● Lists
● Sets
● Ordered (sorted) sets
● Hashes – name/value pairs
– cf. hstore
Simple command set
● Nothing like SQL; no table joins
● Command set is large, but most commands take only
2 or 3 parameters
● http://redis.io/commands
Examples - adding values
● SET mykey myvalue
● HMSET myhashkey prop1 val1 prop2 val2
● SADD mysetkey val1 val2 val3
● LPUSH mylist val1 val2
● ZADD myzsetkey 1 val1 5 val2
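To make the effect of these five commands concrete, here is a minimal Python sketch that mimics them on a plain dictionary. It is an in-memory stand-in, not a Redis client, and the function names are hypothetical. Only the behaviour shown follows the Redis command documentation, for example LPUSH pushing each value onto the head in turn.

```python
# In-memory stand-in for a Redis keyspace: one dict, keys are strings.
keyspace = {}

def set_(key, value):
    # SET: the value is a plain string
    keyspace[key] = value

def hmset(key, *pairs):
    # HMSET: alternating field/value arguments become a hash
    h = keyspace.setdefault(key, {})
    h.update(zip(pairs[0::2], pairs[1::2]))

def sadd(key, *members):
    # SADD: members go into an unordered set, duplicates collapse
    keyspace.setdefault(key, set()).update(members)

def lpush(key, *values):
    # LPUSH: each value is pushed onto the head in turn,
    # so the last argument ends up first in the list
    lst = keyspace.setdefault(key, [])
    for v in values:
        lst.insert(0, v)

def zadd(key, *score_member_pairs):
    # ZADD: alternating score/member arguments, stored as member -> score
    z = keyspace.setdefault(key, {})
    scores = score_member_pairs[0::2]
    members = score_member_pairs[1::2]
    for score, member in zip(scores, members):
        z[member] = float(score)

set_("mykey", "myvalue")
hmset("myhashkey", "prop1", "val1", "prop2", "val2")
sadd("mysetkey", "val1", "val2", "val3")
lpush("mylist", "val1", "val2")
zadd("myzsetkey", 1, "val1", 5, "val2")
```

After these calls `keyspace["mylist"]` holds `val2` before `val1`, which is the detail of LPUSH that most often surprises newcomers.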
No creation command
● You create an object by setting or adding to it
● Almost schema-less
● Can't use a command for one object type on another
Redis keys all live in a single global
namespace
● No schemas
● No separation by object type
● Very common pattern is to use fine grained
keys, like (for a web session)
web:111a7c9ff5afa0a7eb598b2c719c7975
● KEYS command can find keys by pattern:
● KEYS web:*
– Dangerous: scans the whole keyspace and can block the server on a large database
How Redis users do “tables”
● They use a prefix:
● INCR hits:2013.05.25
● They can find all these by doing
● KEYS hits:*
● Or they keep a set with all the keys for a given
type of data
● SADD hitkeyset hits:2013.05.25
● The application has to make use of these keys –
Redis itself won't
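The two ways of finding the keys of a "table" can be sketched in Python against an in-memory dictionary. This is a stand-in for the Redis keyspace, not a real client; the helper names and the hit counts are hypothetical.

```python
from fnmatch import fnmatchcase

# In-memory stand-in for a Redis keyspace (not a real client).
keyspace = {
    "hits:2013.05.24": 17,
    "hits:2013.05.25": 42,
    "web:111a7c9ff5afa0a7eb598b2c719c7975": {"browser": "firefox"},
}

def keys(pattern):
    # Like KEYS: every key in the global namespace is tested
    # against the glob pattern, so cost grows with the whole keyspace.
    return sorted(k for k in keyspace if fnmatchcase(k, pattern))

def incr_with_keyset(key, keyset_name):
    # Like INCR followed by SADD: the application maintains the
    # key set itself; nothing in the store does it automatically.
    keyspace[key] = keyspace.get(key, 0) + 1
    keyspace.setdefault(keyset_name, set()).add(key)

incr_with_keyset("hits:2013.05.26", "hitkeyset")

by_pattern = keys("hits:*")               # scans all keys
by_keyset = sorted(keyspace["hitkeyset"])  # reads one set
```

Here `by_pattern` finds all three `hits:` keys, while `by_keyset` contains only the key that was registered through `incr_with_keyset`. The key set is only as complete as the application makes it.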
Redis Client library
● “hiredis”
● Moderately simple
● https://github.com/redis/hiredis
Redis Foreign Data Wrapper
● https://github.com/pg-redis-fdw/redis_fdw
● Originally written by Dave Page
● Brought up to date and extended by me
Originally
● Only supported scalar data
● No support for segmenting namespace or use
of key sets
Updates by me
● All data types supported
● Table key prefixes supported
● Table key sets supported
● Array data returned as a PostgreSQL array
literal
Hash tables
● Most important type
● Most like PostgreSQL tables
● Best to define the table as having array of text
for second column
● Turn that into json, hstore or a record.
Example
● CREATE FOREIGN TABLE web_sessions(
    key text,
    values text[])
  SERVER localredis
  OPTIONS (tabletype 'hash',
           tablekeyprefix 'web:');
  SELECT * FROM web_sessions;
Use with hstore
● CREATE TYPE websession AS (
id text,
browser text,
username text);
SELECT populate_record(null::websession,
hstore(values))
FROM web_sessions;
Use with json_object
● https://bitbucket.org/qooleot/json_object
● CREATE EXTENSION json_object;
SELECT json_object(values)
FROM web_sessions;
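Both conversions above start from the same shape of data. Assuming the text array holds alternating field names and values (which is the form hstore accepts for a one-dimensional array), the pairing step can be sketched in Python. The session values below are made up for illustration.

```python
import json

def pairs_to_dict(values):
    # values is a flat list: [field1, value1, field2, value2, ...]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of elements")
    return dict(zip(values[0::2], values[1::2]))

# Hypothetical session row as the foreign table would expose it
row_values = ["id", "111a7c9f", "browser", "firefox", "username", "andrew"]

session = pairs_to_dict(row_values)          # record-like mapping
as_json = json.dumps(session, sort_keys=True)  # JSON object text
```

An odd-length array is rejected rather than silently dropping the last element, since a dangling field name usually signals a malformed hash.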
Key prefix vs Key Set
● Key sets are much faster
● Ad server could not meet performance goals
until it switched to using key sets
● Recommended by Redis docs
Using a key set to filter rows
● Sort of “where” clause
● Put the keys of the entries you want in a set
somehow
● Can use command wrapper
● Define a new foreign table that uses that set
as the keyset
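The filtering idea can be sketched in Python with an in-memory stand-in for the Redis hashes. The helper names, the session data and the way the set gets filled are all hypothetical; the source only says the keys are put in a set "somehow".

```python
# In-memory stand-in for Redis hashes (not a real client).
hashes = {
    "web:a1": {"browser": "firefox", "username": "alice"},
    "web:b2": {"browser": "chrome", "username": "bob"},
    "web:c3": {"browser": "firefox", "username": "carol"},
}

def build_keyset(predicate):
    # Fill the filter set with the keys of the wanted entries;
    # here that is done by testing each hash with a predicate.
    return {key for key, fields in hashes.items() if predicate(fields)}

def scan_keyset_table(keyset):
    # Rows as a hash-type foreign table would present them:
    # (key, flat list of alternating field names and values).
    rows = []
    for key in sorted(keyset):
        flat = [item for pair in hashes[key].items() for item in pair]
        rows.append((key, flat))
    return rows

firefox_keys = build_keyset(lambda f: f["browser"] == "firefox")
rows = scan_keyset_table(firefox_keys)
```

Only the hashes named in the set are read, which is what makes the set behave like a "where" clause.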
9.3 notes
● In 9.3 there is json_populate_record()
● Could avoid use of hstore
● For post 9.3, would be a good idea to have a
function converting an array of key value pairs
to a record directly
Brand new – Singleton Key tables
● Each object is a table, not a row
● Sets and lists come back as single field rows
● Ordered sets come back as one or two field
rows
– second field can be score
● Hashes come back as rows of key/value
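The row shapes listed above can be illustrated with a small Python sketch. It models each Redis object with a plain Python value and a type tag; the function name and the `with_scores` flag are hypothetical and do not correspond to FDW option names.

```python
def singleton_rows(kind, value, with_scores=False):
    """Return the rows a single Redis object might yield as a table."""
    if kind in ("set", "list"):
        # One single-field row per member. Sets are unordered in Redis;
        # sorting here only makes the sketch deterministic.
        items = sorted(value) if kind == "set" else value
        return [(item,) for item in items]
    if kind == "zset":
        # value maps member -> score; rows are ordered by score
        ordered = sorted(value.items(), key=lambda ms: ms[1])
        if with_scores:
            return [(member, score) for member, score in ordered]
        return [(member,) for member, _ in ordered]
    if kind == "hash":
        # One (field, value) row per hash entry
        return list(value.items())
    raise ValueError(f"unsupported kind: {kind}")

list_rows = singleton_rows("list", ["val2", "val1"])
zset_rows = singleton_rows("zset", {"val2": 5.0, "val1": 1.0}, with_scores=True)
hash_rows = singleton_rows("hash", {"prop1": "val1", "prop2": "val2"})
```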
Coming soon
● Writable tables
● Supported in upcoming release 9.3
Redis Command Wrapper
● Fills the gaps in FDW functionality
● Sponsored by IVC: http://www.ivc.com
● https://bitbucket.org/qooleot/redis_wrapper
Redis wrapper functionality
● Thin layer over hiredis library
● Four basic functions
● redis_connect()
● redis_disconnect()
● redis_command()
● redis_command_argv()
redis_connect()
● First argument is “handle”
● Remaining arguments are all optional
● con_host text DEFAULT '127.0.0.1'::text
● con_port integer DEFAULT 6379
● con_pass text DEFAULT ''::text
● con_db integer DEFAULT 0
● ignore_duplicate boolean DEFAULT false
Redis wrapper connections are
persistent
● Unlike FDW package, where they are made at
the beginning of each table fetch
● Makes micro operations faster
redis_command and
redis_command_argv
● Thin layers over similarly named functions in
client library
● redis_command has max 4 arguments after
command string – for more use
redis_command_argv
● Might switch from VARIADIC text[] to
VARIADIC “any”
Uses
● Push data into redis
● Redis utility statements from within Postgres
Higher level functions
● redis_push_record
● con_num integer
● data record
● push_keys boolean
● key_set text
● key_prefix text
● key_fields text[]
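One plausible reading of these parameter names is sketched below in Python. This is an assumption based only on the names, not the wrapper's documented behaviour: the record is stored as a hash whose key is the prefix plus the key field values, and the key is optionally added to a key set. The connection argument is omitted because the sketch writes to an in-memory dictionary, and the sample record is made up.

```python
store = {}  # in-memory stand-in for Redis, not a real connection

def push_record(data, push_keys, key_set, key_prefix, key_fields):
    # Assumed behaviour: build the hash key from the prefix plus
    # the values of the key fields.
    key = key_prefix + ":".join(str(data[f]) for f in key_fields)
    # Store every field of the record as a hash entry (values as text).
    store[key] = {name: str(value) for name, value in data.items()}
    # Optionally register the new key in the key set.
    if push_keys:
        store.setdefault(key_set, set()).add(key)
    return key

key = push_record(
    data={"ad_id": 7, "size": "300x250", "price": 1.25},
    push_keys=True,
    key_set="pricekeys",
    key_prefix="price:",
    key_fields=["ad_id"],
)
```

Data pushed this way would be readable through a hash-type foreign table that uses the same key set.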
Why use Redis?
● Did I mention it's FAST?
● But not safe
Our use case
● An ad server for the web
● If Redis crashes, not a tragedy
● If it's slow, it's a tragedy
Ad Server Project by IVC
http://www.ivc.com
Remaining slides are mostly info from IVC
System Goals
● Serve 10,000 ads per second per application server
cpu
● Use older existing hardware
● 5 ms for Postgres database to filter from 100k+ total
ads to ~ 30 that can fit a page and meet business
criteria
● 5 ms to filter to 1-5 best ads per page using statistics
from Redis for freshness, revenue maximization etc.
● Record ad requests, confirmations and clicks.
● 24x7 operation with automatic fail over
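The throughput and latency goals imply some simple budgets. The arithmetic below assumes the two 5 ms stages run one after the other and are the only data access per ad, which the slides do not state explicitly.

```python
ADS_PER_SECOND_PER_CPU = 10_000   # throughput goal from the list above
FILTER_STAGE_MS = 5               # Postgres: 100k+ ads down to ~30
RANKING_STAGE_MS = 5              # Redis statistics: ~30 down to 1-5

# CPU time available per ad if one CPU handles requests one at a time
cpu_budget_us = 1_000_000 / ADS_PER_SECOND_PER_CPU

# Data latency per ad if the two stages run sequentially (assumption)
latency_ms = FILTER_STAGE_MS + RANKING_STAGE_MS

# Little's law: requests in flight = arrival rate x time in system
in_flight = ADS_PER_SECOND_PER_CPU * latency_ms / 1000

print(f"CPU budget per ad: {cpu_budget_us:.0f} microseconds")
print(f"Data latency per ad: {latency_ms} ms")
print(f"Requests in flight per CPU: {in_flight:.0f}")
```

With 100 microseconds of CPU per ad but 10 ms of data latency, each application CPU has to keep about 100 requests in flight at once, so the application tier cannot wait on one request before starting the next.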
Physical View
[Diagram: physical network and host layout. Labels: 802.3ad; Cisco 3750 stacked, 1G, HSRP; Xen hosts; SLES 11.2, Dell R810, 128G, Intel E6540, 24 cores; SLES 11.2, Dell 2950, 32G, Intel E5430, 8 cores.]
Redundancy View
[Diagram: redundancy across four tiers: Tier 1 Client, Tier 2 Web, Tier 3 Application, Tier 4 Database. Labels: Cisco HSRP, Keepalived, NGINX, Node, Sentinel, Shorewall, Redis (multiple instances), Postgres 9.2, Pgpool, TransactionDB, Business DB, Data Warehouse DB, Hot Replication, Skytools Londiste3.]
Postgres databases
● 6 Postgres databases
● Two for business model – master and streaming hot
standby (small VM)
● Two for serving ads – master and streaming hot
standby (physical Dell 2950)
● Two for storing clicks and impressions – master
and hot standby (physical Dell 2950)
● Fronted by redundant Pgpool load balancers with failover
and automated database failover.
Business DB
● 30+ tables
● Example tables: ads, advertisers, publishers, ip
locations
● Small number of users that manipulate the data (<
100)
● Typical application and screens
● Joining too slow to serve ads
● Tables get materialized into 2 tables in the ad serving
database
● Two tables
● First has ip ranges so we know where the user is
coming from. Ad serving is often by country, region
etc.
● Second has ad sizes, ad types, campaigns,
keywords, channels, advertisers etc.
● The Postgres inet type and its index were a must-have for
table one
● Tsquery/tsvector, boxes and arrays were all must-haves
for table two (with associated index types)
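The lookup that table one serves, finding the location for a client address from a set of IP ranges, can be sketched with Python's standard `ipaddress` module. The networks below are documentation ranges and the location codes are made up; the real table contents are not given in the slides.

```python
import ipaddress

# Hypothetical range table: (network, location). The networks are
# documentation ranges (RFC 5737); the locations are invented.
ip_ranges = [
    (ipaddress.ip_network("192.0.2.0/24"), "AU"),
    (ipaddress.ip_network("198.51.100.0/24"), "US"),
    (ipaddress.ip_network("203.0.113.0/24"), "DE"),
]

def locate(address):
    # Linear containment test over every range. The sketch has no
    # index, so it checks each row; an indexed inet column is what
    # lets the database avoid this scan.
    ip = ipaddress.ip_address(address)
    for network, location in ip_ranges:
        if ip in network:
            return location
    return None

hit = locate("198.51.100.25")
miss = locate("10.0.0.1")
```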
Ad Serving Database
● Materialized and copied from Business
database every 3 minutes
● Indexes are created and new tables are
vacuum analyzed then renamed.
● Performance goals were met.
● We doubt this could be done without Postgres
data types and associated indexes
● Thanks
Recording Ad requests/confirmations
and clicks
● At 10k/sec/cpu recording ads one row at a time +
updates on confirmation is too slow
● Approach: record in Redis, update in Redis, and once
every six minutes batch load from Redis to
Postgres. The FDW was critical.
● Partitioning (inheritance) with constraint exclusion to
segregate data by day using nightly batch job. One
big table with a month's worth of data would not
work.
● Table partitioning is not cheap in the leading
commercial product.
● Thanks
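The record-then-batch approach can be sketched in Python with an in-memory dictionary standing in for Redis. The function and field names are hypothetical. As a simplification, rows are grouped by day at flush time, whereas the slides describe a nightly job doing the partitioning.

```python
from collections import defaultdict
from datetime import datetime

pending = {}  # in-memory stand-in for the Redis staging area

def record_request(ad_request_id, ad_id, when):
    # One entry per ad request, written at request time
    pending[ad_request_id] = {"ad_id": ad_id, "requested": when,
                              "confirmed": False, "clicked": False}

def update(ad_request_id, field):
    # Confirmations and clicks update the staged entry in place
    pending[ad_request_id][field] = True

def flush_batch():
    # Periodic batch: drain everything staged so far and group the
    # rows by calendar day, one group per daily partition.
    partitions = defaultdict(list)
    for request_id, row in sorted(pending.items()):
        day = row["requested"].date().isoformat()
        partitions[day].append((request_id, row["ad_id"],
                                row["confirmed"], row["clicked"]))
    pending.clear()
    return dict(partitions)

record_request("r1", 7, datetime(2013, 5, 24, 23, 59))
record_request("r2", 9, datetime(2013, 5, 25, 0, 1))
update("r1", "confirmed")
batch = flush_batch()
```

The relational database sees one bulk insert per interval instead of an insert plus an update for every ad served.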
Recording DB continued.
● Used heavily for reporting.
● Statistics tables (number of clicks, impressions
etc.) are calculated every few minutes on
today's data
● Calculated nightly for the whole day tables
● For reporting we needed some business data,
so we selectively replicate business tables into
the ad recording database using Skytools. Joining
against tables over a database link is too slow.
Recording DB cont'd
● Another usage is fraud detection.
● Medium- and long-term frequency fraud is one
type of fraud that this database is used to
detect.
Redis
● In memory Database.
● Rich type support.
● Multiple copies and replication.
● Real time and short term fraud detection
● Dynamic pricing
● Statistical best Ad decision making
● Initial place to record and batch to Postgres
● Runs on VM with 94 GB of dedicated RAM.
Redis cont'd
● The FDW and command wrapper dramatically reduced
the amount of code we had to write
● The FDW has good performance characteristics.
● Key success factor: in-memory Redis DB +
Postgres relational DB.
Postgres – Redis interaction
● Pricing data is pushed to Redis from Business
DB via command wrapper
● Impression and Click data is pulled from Redis
into Recording DB via Redis FDW
Current Status
● In production with 4 significant customers
since March 1
● Scaling well
Conclusions
● Postgres' rich data types and associated
indexes were absolutely essential
● Redis + Postgres with good FDW integration
was the second key success factor
● Node.js concurrency was essential in getting
good application throughput
● Open source allowed the system to be built for
less than 2% of the cost of a competing
commercial system
Questions?

 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

PostgreSQL and Redis - talk at pgcon 2013

  • 1. PostgreSQL + Redis Andrew Dunstan andrew@dunslane.net andrew.dunstan@pgexperts.com
  • 2. Topics ● What is Redis? ● The Redis Foreign Data Wrapper ● The Redis Command wrapper for Postgres ● Case study – a high performance Ad server using Postgres and Redis
  • 3. What is Redis? ● High performance in-memory key/value data store
  • 4. Redis is easy to use ● Almost no configuration ● On Fedora: sudo yum install redis; sudo systemctl enable redis.service; sudo systemctl start redis.service; redis-cli
  • 5. Redis keys ● Are just strings
  • 6. Redis data values ● Values can be scalars ● Strings ● Integers ● Values can be structured ● Lists ● Sets ● Sorted sets (ordered sets) ● Hashes – name/value pairs – cf. hstore
  • 7. Simple command set ● Nothing like SQL; no table joins ● Command set is large but most commands only take 2 or 3 parameters ● http://redis.io/commands
  • 8. Examples - adding values ● SET mykey myvalue ● HMSET myhashkey prop1 val1 prop2 val2 ● SADD mysetkey val1 val2 val3 ● LPUSH mylist val1 val2 ● ZADD myzsetkey 1 val1 5 val2
  • 9. No creation command ● You create an object by setting or adding to it ● Almost schema-less ● Can't use a command for one object type on another
  • 10. Redis keys all live in a single global namespace ● No schemas ● No separation by object type ● Very common pattern is to use fine-grained keys, like (for a web session) web:111a7c9ff5afa0a7eb598b2c719c7975 ● KEYS command can find keys by pattern: ● KEYS web:* – Dangerous: it scans the whole keyspace and blocks the server while it runs
  • 11. How Redis users do “tables” ● They use a prefix: ● INCR hits:2013.05.25 ● They can find all these by doing ● KEYS hits:* ● Or they keep a set with all the keys for a given type of data ● SADD hitkeyset hits:2013.05.25 ● The application has to make use of these keys – Redis itself won't
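The two ways of finding the rows of a Redis "table" can be modelled in a few lines. This is a minimal sketch, assuming a plain Python dict in place of the Redis keyspace; no Redis server or client library is involved, and the helper names `keys_by_pattern`, `keys_by_set` and `incr` are hypothetical.

```python
from fnmatch import fnmatchcase

# A plain dict stands in for the Redis keyspace (no Redis server involved).
keyspace = {
    "hits:2013.05.24": 17,
    "hits:2013.05.25": 42,
    "web:111a7c9ff5afa0a7eb598b2c719c7975": {"browser": "firefox"},
    "hitkeyset": {"hits:2013.05.24", "hits:2013.05.25"},
}


def keys_by_pattern(store, pattern):
    """Model of KEYS <pattern>: every key in the store is examined."""
    return sorted(k for k in store if fnmatchcase(k, pattern))


def keys_by_set(store, set_name):
    """Model of SMEMBERS <set_name>: only the members of one set are read."""
    return sorted(store[set_name])


def incr(store, key, set_name):
    """Model of INCR plus SADD: the application keeps the key set up to date."""
    store[key] = store.get(key, 0) + 1
    store[set_name].add(key)
    return store[key]


incr(keyspace, "hits:2013.05.26", "hitkeyset")
by_pattern = keys_by_pattern(keyspace, "hits:*")
by_set = keys_by_set(keyspace, "hitkeyset")
```

Both lookups return the same three keys here, but the pattern scan touches every key in the store while the set lookup reads one object. The cost of the second approach is that the application has to remember to add each key to the set.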
  • 12. Redis Client library ● “hiredis” ● Moderately simple ● https://github.com/redis/hiredis
  • 13. Redis Foreign Data Wrapper ● https://github.com/pg-redis-fdw/redis_fdw ● Originally written by Dave Page ● Brought up to date and extended by me
  • 14. Originally ● Only supported scalar data ● No support for segmenting namespace or use of key sets
  • 15. Updates by me ● All data types supported ● Table key prefixes supported ● Table key sets supported ● Array data returned as a PostgreSQL array literal
  • 16. Hash tables ● Most important type ● Most like PostgreSQL tables ● Best to define the table as having array of text for second column ● Turn that into json, hstore or a record.
  • 17. Example ● CREATE FOREIGN TABLE web_sessions (key text, values text[]) SERVER localredis OPTIONS (tabletype 'hash', tablekeyprefix 'web:'); SELECT * FROM web_sessions;
  • 18. Use with hstore ● CREATE TYPE websession AS (id text, browser text, username text); SELECT populate_record(null::websession, hstore(values)) FROM web_sessions;
  • 19. Use with json_object ● https://bitbucket.org/qooleot/json_object ● CREATE EXTENSION json_object; SELECT json_object(values) FROM web_sessions;
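The conversion these slides rely on, from a flat array of alternating names and values to a keyed structure, can be sketched outside the database. This is an illustration only, assuming the hash arrives as a list of the form name, value, name, value; `pairs_to_dict` is a hypothetical helper and the session data is made up.

```python
import json


def pairs_to_dict(values):
    """Turn [name1, value1, name2, value2, ...] into a dict."""
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of elements")
    # Even positions hold names, odd positions hold the matching values.
    return dict(zip(values[0::2], values[1::2]))


# Made-up row contents for illustration.
row_values = ["id", "abc123", "browser", "firefox", "username", "alice"]
session = pairs_to_dict(row_values)
session_json = json.dumps(session)
```

In PostgreSQL the same step is done by hstore(text[]) or json_object(), as on the slides; the sketch only shows what shape of data goes in and comes out.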
  • 20. Key prefix vs Key Set ● Key sets are much faster ● Ad server could not meet performance goals until it switched to using key sets ● Recommended by Redis docs
  • 21. Using a key set to filter rows ● Sort of “where” clause ● Put the keys of the entries you want in a set somehow ● Can use command wrapper ● Define a new foreign table that uses that set as the keyset
  • 22. 9.3 notes ● In 9.3 there is json_populate_record() ● Could avoid use of hstore ● For post 9.3, would be a good idea to have a function converting an array of key value pairs to a record directly
  • 23. Brand new – Singleton Key tables ● Each object is a table, not a row ● Sets and lists come back as single field rows ● Ordered sets come back as one or two field rows – second field can be score ● Hashes come back as rows of key/value
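The row shapes listed for singleton key tables can be modelled as follows. This is a sketch of the mapping described on the slide, not the FDW's code: `singleton_rows` is a hypothetical function, Python containers stand in for Redis values, and returning sorted-set members in score order is an assumption.

```python
def singleton_rows(kind, value, with_scores=False):
    """Return the rows one Redis object might expose as a singleton key table."""
    if kind == "list":
        # Lists keep their order: one single-field row per element.
        return [(member,) for member in value]
    if kind == "set":
        # Sets are unordered; sort here only to make the output repeatable.
        return [(member,) for member in sorted(value)]
    if kind == "zset":
        # value maps member -> score; assume rows come back in score order.
        ordered = sorted(value.items(), key=lambda item: item[1])
        if with_scores:
            return [(member, score) for member, score in ordered]
        return [(member,) for member, _ in ordered]
    if kind == "hash":
        # Hashes become two-field rows of key and value.
        return list(value.items())
    raise ValueError(f"unsupported kind: {kind}")


list_rows = singleton_rows("list", ["val2", "val1"])
set_rows = singleton_rows("set", {"val3", "val1", "val2"})
zset_rows = singleton_rows("zset", {"val2": 5, "val1": 1}, with_scores=True)
hash_rows = singleton_rows("hash", {"prop1": "val1", "prop2": "val2"})
```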
  • 24. Coming soon ● Writable tables ● Supported in upcoming release 9.3
  • 25. Redis Command Wrapper ● Fills in the gaps in functionality ● Sponsored by IVC: http://www.ivc.com ● https://bitbucket.org/qooleot/redis_wrapper
  • 26. Redis wrapper functionality ● Thin layer over hiredis library ● Four basic functions ● redis_connect() ● redis_disconnect() ● redis_command() ● redis_command_argv()
  • 27. redis_connect() ● First argument is “handle” ● Remaining arguments are all optional ● con_host text DEFAULT '127.0.0.1'::text ● con_port integer DEFAULT 6379 ● con_pass text DEFAULT ''::text ● con_db integer DEFAULT 0 ● ignore_duplicate boolean DEFAULT false
  • 28. Redis wrapper connections are persistent ● Unlike FDW package, where they are made at the beginning of each table fetch ● Makes micro operations faster
  • 29. redis_command and redis_command_argv ● Thin layers over similarly named functions in client library ● redis_command has max 4 arguments after command string – for more use redis_command_argv ● Might switch from VARIADIC text[] to VARIADIC “any”
  • 30. Uses ● Push data into Redis ● Redis utility statements from within Postgres
  • 31. Higher level functions ● redis_push_record ● con_num integer ● data record ● push_keys boolean ● key_set text ● key_prefix text ● key_fields text[]
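The slide lists only the parameters of redis_push_record, not its behaviour. The sketch below is one plausible reading, written as an in-memory model: the key is built from the prefix and the named key fields, the record is stored as a hash, and the key is optionally added to a key set. The ":" separator, the string conversion and the function name `push_record` are all assumptions, and the connection argument is replaced by a plain dict.

```python
def push_record(store, data, push_keys, key_set, key_prefix, key_fields):
    """In-memory model of pushing one record into a Redis-like store."""
    # Assumed key layout: prefix followed by the key field values joined by ":".
    key = key_prefix + ":".join(str(data[field]) for field in key_fields)
    # Store the record as a hash of field name -> string value.
    store[key] = {name: str(value) for name, value in data.items()}
    if push_keys:
        # Keep the key set current so readers can avoid scanning by prefix.
        store.setdefault(key_set, set()).add(key)
    return key


store = {}
ad = {"ad_id": 7, "price": "0.25"}  # made-up record
key = push_record(store, ad, True, "adkeyset", "ad:", ["ad_id"])
```

Whatever the real function does internally, the parameter list suggests the same design point: the writer maintains the key set at the moment it writes the record.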
  • 32. Why use Redis? ● Did I mention it's FAST? ● But not safe
  • 33. Our use case ● An ad server for the web ● If Redis crashes, not a tragedy ● If it's slow, it's a tragedy
  • 34. Ad Server Project by IVC http://www.ivc.com Remaining slides are mostly info from IVC
  • 35. System Goals ● Serve 10,000 ads per second per application server cpu ● Use older existing hardware ● 5 ms for Postgres database to filter from 100k+ total ads to ~ 30 that can fit a page and meet business criteria ● 5 ms to filter to 1-5 best ads per page using statistics from Redis for freshness, revenue maximization etc. ● Record ad requests, confirmations and clicks. ● 24x7 operation with automatic fail over
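A rough reading of these goals in numbers, as a back-of-the-envelope sketch. It assumes the two 5 ms stages run one after the other for each request and that nothing else adds latency; neither assumption comes from the slides.

```python
ADS_PER_SECOND = 10_000   # target per application server CPU
FILTER_MS = 5             # Postgres: 100k+ ads down to about 30
RANK_MS = 5               # Redis statistics: about 30 down to 1-5

# CPU time available per ad if one CPU really serves 10,000 ads a second.
cpu_budget_us = 1_000_000 / ADS_PER_SECOND

# Latency of one request under the sequential-stages assumption.
latency_ms = FILTER_MS + RANK_MS

# Little's law: requests in flight = arrival rate x time in system.
in_flight = ADS_PER_SECOND * latency_ms / 1000
```

Under these assumptions each CPU has about 100 microseconds of its own time per ad while roughly 100 requests are waiting on the databases at any moment, so the application tier has to handle many requests concurrently rather than one at a time.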
  • 36. Physical View (diagram) ● Network: Cisco 3750 stacked, 1G, HSRP, 802.3ad links ● Xen hosts: SLES 11.2, Dell R810, 128G, Intel e6540, 24 cores ● SLES 11.2, Dell 2950, 32G, Intel e5430, 8 cores
  • 37. Redundancy View (diagram) ● Tiers: 1 Client, 2 Web, 3 Application, 4 Database ● Components shown: Cisco HSRP, Shorewall, Keepalived, NGINX, Node, Sentinel, Redis (multiple instances), Pgpool, Postgres 9.2 ● Databases: Transaction DB, Business DB, Data Warehouse DB, each labelled Hot Replication ● Skytools Londiste3
  • 38. Postgres databases ● 6 Postgres databases ● Two for business model – master and streaming hot standby (small VM) ● Two for serving ads – master and streaming hot standby (physical Dell 2950) ● Two for storing clicks and impressions – master and hot standby (physical Dell 2950) ● Fronted by redundant Pgpool load balancers with failover and automated DB failover.
  • 39. Business DB ● 30+ tables ● Example tables: ads, advertisers, publishers, ip locations ● Small number of users that manipulate the data (< 100) ● Typical application and screens ● Joining too slow to serve ads ● Tables get materialized into 2 tables in the ad serving database
  • 40. Ad Serving Database ● Two tables ● First has ip ranges so we know where the user is coming from. Ad serving is often by country, region etc. ● Second has ad sizes, ad types, campaigns, keywords, channels, advertisers etc. ● Postgres inet type and index was a must have to be successful for table one ● Tsquery/tsvector, boxes, arrays were all a must have for table two (with associated index types)
  • 41. Ad serving Database ● Materialized and copied from Business database every 3 minutes ● Indexes are created and new tables are vacuum analyzed then renamed. ● Performance goals were met. ● We doubt this could be done without Postgres data types and associated indexes ● Thanks
  • 42. Recording Ad requests/confirmations and clicks ● At 10k/sec/cpu recording ads one row at a time + updates on confirmation is too slow ● Approach: record in Redis, update in Redis and once every six minutes we batch load from Redis to Postgres. - FDW was critical. ● Partitioning (inheritance) with constraint exclusion to segregate data by day using nightly batch job. One big table with a month's worth of data would not work. ● Table partitioning is not cheap in the leading commercial product. ● Thanks
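The record-then-batch approach can be sketched with in-memory stand-ins. This is a model only: a list plays the part of Redis, a dict of per-day lists plays the part of the day-partitioned Postgres tables, and `flush` and the event fields are hypothetical names. In the system described, the periodic load went through the Redis FDW.

```python
from collections import defaultdict
from datetime import datetime


def flush(buffer, partitions):
    """Move all buffered events into per-day partitions in one batch."""
    batch = list(buffer)
    buffer.clear()
    for event in batch:
        # Route each event to the partition for its calendar day.
        day = event["ts"].date().isoformat()
        partitions[day].append(event)
    return len(batch)


buffer = []                      # stands in for Redis
partitions = defaultdict(list)   # stands in for one table per day

# Made-up events: cheap appends on the hot path, no database round trip.
buffer.append({"ad_id": 7, "kind": "impression", "ts": datetime(2013, 5, 24, 23, 59)})
buffer.append({"ad_id": 7, "kind": "click", "ts": datetime(2013, 5, 25, 0, 1)})
buffer.append({"ad_id": 9, "kind": "impression", "ts": datetime(2013, 5, 25, 0, 2)})

moved = flush(buffer, partitions)  # run periodically, e.g. every six minutes
```

The hot path only appends to the fast store; the relational side sees one bulk load per interval instead of thousands of single-row inserts and updates per second.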
  • 43. Recording DB continued. ● Used heavily for reporting. ● Statistics tables (number of clicks, impressions etc.) are calculated every few minutes on today's data ● Calculated nightly for the whole day tables ● For reporting we needed some business data so we selectively replicate business tables in the ad recording database using Skytools. Joining to tables over a DB link is too slow.
  • 44. Recording DB cont'd ● Another usage is fraud detection. ● Medium- and long-term frequency fraud detection is one type of fraud detection that this database is used for.
  • 45. Redis ● In-memory database. ● Rich type support. ● Multiple copies and replication. ● Real-time and short-term fraud detection ● Dynamic pricing ● Statistical best Ad decision making ● Initial place to record and batch to Postgres ● Runs on VM with 94 GB of dedicated RAM.
  • 46. Redis cont'd ● FDW and commands dramatically reduce the amount of code we had to write ● FDW has good performance characteristics. ● Key success factor: In-memory Redis DB + Postgres relational DB.
  • 47. Postgres – Redis interaction ● Pricing data is pushed to Redis from Business DB via command wrapper ● Impression and Click data is pulled from Redis into Recording DB via Redis FDW
  • 48. Current Status ● In production with 4 significant customers since March 1 ● Scaling well
  • 49. Conclusions ● Postgres' rich data types and associated indexes were absolutely essential ● Redis + Postgres with good FDW integration was the second key success factor ● Node.js concurrency was essential in getting good application throughput ● Open source allowed the system to be built for less than 2% of the cost of a competing commercial system