A Postgres-XC
Distributed Key-Value Store
Mason Sharp
April 15, 2013
CC License: Attribution-NonCommercial-ShareAlike
Who Am I?
Mason Sharp
●
Original architect of Stado / GridSQL
●
One of original architects of Postgres-XC
●
Former architect at EnterpriseDB
●
Co-organizer of NYC PostgreSQL User Group
●
Co-founder and CTO of
Agenda
●
Why use a key-value store?
●
PostgreSQL features
●
XML
●
hstore
●
JSON
●
Postgres-XC Overview
●
Measurements: MongoDB versus Postgres-XC
Why Use a Key-Value Store?
●
Document oriented vs. row oriented
●
Unstructured data
●
Semi-structured data
●
Self-describing / schema-less
●
Uses Tags
●
Dynamic attributes for different objects
●
Dwight Merriman, CEO 10gen (paraphrasing):
●
“Some customers use MongoDB just for the schema-less features. They don't need the scalability and run on one single server” (!)
●
“Easier for developers” (...)
Why Use a Key-Value Store? (2)
●
Key-value makes for an easy distributed store
●
Multiple servers
●
In-memory
●
No complicated schema changes
●
But PostgreSQL's ALTER TABLE exclusive locks
may be brief
●
Need to be “web-scale”
●
Perception that it scales better
●
What if it no longer fits in memory?
●
A series of unfortunate anecdotes
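To illustrate the point about brief ALTER TABLE locks, here is a minimal sketch. The `product` table and its columns are hypothetical names chosen for this example. Adding a nullable column with no default changes only the catalog, so the table is not rewritten.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE product (id int, name text);

-- Nullable column, no default: a catalog-only change with no table rewrite,
-- so the exclusive lock is held only briefly once it has been acquired.
ALTER TABLE product ADD COLUMN color text;
```

The statement still has to wait for the exclusive lock, so a long-running transaction on the table can delay it.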
PostgreSQL
Document Store Capabilities
XML
●
--with-libxml at build time
●
Native data type
●
CREATE TABLE foo (myid int, data xml)
●
Validation
INSERT INTO foo VALUES (2, '<aaa');
ERROR: invalid XML content
Detail: line 1: Couldn't find end of Start Tag aaa line 1
●
XPath
●
Mapping & Export functions
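The XPath and export bullets above can be sketched as follows, assuming the `foo (myid int, data xml)` table from this slide. The inserted document is made up for illustration.

```sql
-- Hypothetical sample document.
INSERT INTO foo VALUES
  (1, '<person><name>fred</name><department>IT</department></person>');

-- xpath() returns an array of xml values; [1] picks the first match.
SELECT (xpath('/person/name/text()', data))[1]::text AS name
FROM foo
WHERE myid = 1;

-- Export the whole table as one XML document
-- (arguments: table, include nulls, tableforest, target namespace).
SELECT table_to_xml('foo', true, false, '');
```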
hstore
●
Contrib module
●
CREATE EXTENSION hstore
●
Key/value pairs
●
Data type
hstore
CREATE TABLE foo (myid int, hdata hstore);
INSERT INTO foo VALUES (10,
'"name"=>"fred", "department"=>"IT"');
hstore
SELECT hdata->'name' FROM foo WHERE myid = 10;
?column?
----------
fred
(1 row)
-- Extract all department values where it is an attribute
SELECT hdata->'department'
FROM foo
WHERE hdata ? 'department';
Hstore Manipulation
●
Concatenate
'a=>b, c=>d'::hstore || 'c=>x, d=>q'::hstore
"a"=>"b", "c"=>"x", "d"=>"q"
●
Delete element
delete('a=>1,b=>2','b')
"a"=>"1"
hstore
-- Get a list of unique keys
SELECT DISTINCT (each(hdata)).key
FROM foo;
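The operators from the last few slides can be applied to stored rows. This is a sketch against the `foo (myid int, hdata hstore)` table defined earlier. The `title` attribute and its value are made up for the example.

```sql
-- Add (or overwrite) one attribute on an existing row; no schema change needed.
UPDATE foo SET hdata = hdata || 'title=>manager'::hstore WHERE myid = 10;

-- Remove an attribute from the same row.
UPDATE foo SET hdata = delete(hdata, 'title') WHERE myid = 10;

-- Count how many rows carry each key; skeys() expands one row per key.
SELECT k AS key, count(*) AS rows_with_key
FROM (SELECT skeys(hdata) AS k FROM foo) AS s
GROUP BY k
ORDER BY rows_with_key DESC;
```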
hstore - Indexes
●
B-tree index only helps with '='
●
GIN and GiST indexes will help with operators
●
@> left operand contains right
●
? contains key
●
?& contains all keys in array
●
?| contains at least one key in array
●
Can create index on custom function
●
Extract a particular key value
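A sketch of the two indexing approaches listed above, again assuming the `foo (myid int, hdata hstore)` table. The index names are arbitrary.

```sql
-- GIN index over the whole hstore column: supports @>, ?, ?& and ?|.
CREATE INDEX foo_hdata_gin ON foo USING gin (hdata);

-- Containment query that can use the GIN index.
SELECT myid FROM foo WHERE hdata @> 'department=>IT'::hstore;

-- Expression index on one extracted key: an ordinary B-tree on the text value.
CREATE INDEX foo_department_idx ON foo ((hdata->'department'));

-- Equality on the same expression can use the expression index.
SELECT myid FROM foo WHERE hdata->'department' = 'IT';
```

The expression index is smaller and suits a key that is queried often. The GIN index covers arbitrary keys at a higher maintenance cost.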
JSON
●
JavaScript Object Notation
●
PostgreSQL 9.2 basic support
●
array_to_json
●
row_to_json
Note: Postgres-XC 1.0.2 based on PostgreSQL
9.1, will be based on 9.2 soon
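A minimal look at the two 9.2 functions named above. The column names in the second query are made up for the example.

```sql
-- An SQL array becomes a JSON array.
SELECT array_to_json(ARRAY[1, 2, 3]);
-- [1,2,3]

-- A row becomes a JSON object keyed by column name.
SELECT row_to_json(t)
FROM (SELECT 10 AS myid, 'fred'::text AS name) AS t;
-- {"myid":10,"name":"fred"}
```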
JSON – looking ahead to
PostgreSQL 9.3
●
PostgreSQL 9.3
●
json_agg
●
hstore_to_json
●
hstore_to_json_loose
●
… and much more
http://www.postgresql.org/docs/devel/static/functions-json.html
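A short sketch of the 9.3 functions listed above. It assumes PostgreSQL 9.3 with the hstore extension installed and the `foo (myid int, hdata hstore)` table from the hstore slides. The inline VALUES rows are made up.

```sql
-- Every hstore value is rendered as a JSON string.
SELECT hstore_to_json(hdata) FROM foo WHERE myid = 10;

-- The "loose" variant tries to emit numbers and booleans unquoted.
SELECT hstore_to_json_loose('name=>fred, age=>30, active=>t'::hstore);

-- json_agg collects rows into a single JSON array of objects.
SELECT json_agg(t)
FROM (VALUES (1, 'a'), (2, 'b')) AS t(id, label);
```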
Composite Type
CREATE TYPE address AS (
street TEXT,
city TEXT,
state TEXT,
zip CHAR(10));
CREATE TABLE customer (
full_name TEXT,
mail_address address);
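To populate the composite column, a row constructor can be cast to the `address` type. The values below are the ones that appear in the row_to_json output on the next slide.

```sql
-- Build the nested address with ROW(...) and cast it to the composite type.
INSERT INTO customer VALUES (
  'Joe Lee',
  ROW('100 Broad Street', 'Red Bank', 'NJ', '07701')::address
);

-- Individual fields are reached with (column).field syntax.
SELECT full_name, (mail_address).city FROM customer;
```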
row_to_json
test1=# select row_to_json(customer) from customer;
{"full_name":"Joe Lee",
"mail_address": {
"street":"100 Broad Street",
"city":"Red Bank",
"state":"NJ",
"zip":"07701 "}
}
●
PostgreSQL-based database cluster
Same API to Apps as PostgreSQL
• Same drivers
●
Symmetric Multi-headed Cluster
No master, no slave
• Not just PostgreSQL replication.
• Application can read/write to any coordinator server
Consistent database view to all the transactions
• Complete ACID property to all the transactions in the cluster
●
Scales both for Write and Read
Sep 20, 2012 Postgres-XC 20
Postgres-XC Cluster
[Diagram: several PG-XC Servers, each hosting a Coordinator and a Data Node, all connected to one Global Transaction Manager (GTM)]
●
Communication among PG-XC servers
●
Add PG-XC servers as needed
●
Application can connect to any server to have the same database view and service
Coordinator Overview
●
Based on PostgreSQL
●
Accepts connections from clients
●
Parses and plans requests
●
Interacts with Global Transaction Manager
●
Uses pooler for Data Node connections
●
Sends down XIDs and snapshots to Data Nodes
●
Collects results and returns to client
●
Uses two phase commit if necessary
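In Postgres-XC the coordinator drives two-phase commit itself, so applications do not normally issue these statements. The sketch below only shows what the two phases look like when written by hand in plain PostgreSQL. It assumes `max_prepared_transactions` is greater than zero and uses the `foo` hstore table from earlier. `demo_tx` is an arbitrary identifier chosen for the example.

```sql
BEGIN;
UPDATE foo SET hdata = hdata || 'department=>HR'::hstore WHERE myid = 10;

-- Phase 1: persist the transaction state so it survives a crash.
-- After this the transaction is no longer tied to the session.
PREPARE TRANSACTION 'demo_tx';

-- Phase 2: once every participant has prepared, commit everywhere.
-- If any participant failed to prepare, use ROLLBACK PREPARED instead.
COMMIT PREPARED 'demo_tx';
```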
Data Node Overview
●
Based on PostgreSQL
●
Where user created data is actually stored
●
Coordinators (not clients) connect to Data
Nodes
●
Accepts XID and snapshots from Coordinator
●
The rest is fairly similar to vanilla PostgreSQL
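How rows land on data nodes is declared per table. This is a sketch using the Postgres-XC `DISTRIBUTE BY` clause. It assumes the hstore extension is installed on the cluster. The `department` lookup table is hypothetical.

```sql
-- Spread rows across data nodes by a hash of the key column.
CREATE TABLE foo (myid int, hdata hstore)
DISTRIBUTE BY HASH (myid);

-- Keep a full copy of a small lookup table on every data node.
CREATE TABLE department (name text, floor int)
DISTRIBUTE BY REPLICATION;

-- A lookup on the distribution key only needs the node that owns that hash value.
SELECT hdata->'name' FROM foo WHERE myid = 10;
```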
Global Transaction Manager
[Diagram: the GTM supplies cluster nodes with XIDs, snapshots, timestamps, and sequence values]
GTM Overview
●
Issues Transaction IDs (XIDs)
●
Issues Snapshots
●
Issues Timestamps
●
Issues Sequences
●
Based on PostgreSQL procarray code
●
Multi-threaded
GTM Proxy
●
Runs on other nodes
●
Groups requests together
●
Reduces number of connections to GTM
●
Reduces traffic to GTM
Summary
● Coordinator
● Visible to apps
● SQL analysis, planning, execution
● Connection pooling
● Datanode (or simply “NODE”)
● Actual database store
● Local SQL execution
● GTM (Global Transaction Manager)
● Provides consistent database view to transactions
– GXID (Global Transaction ID)
– Snapshot (List of active transactions)
– Other global values such as SEQUENCE
● GTM Proxy: groups server-local transaction requests for performance
Postgres-XC core, based upon
vanilla PostgreSQL
Share same binary
May want to colocate
Different binaries
MongoDB vs Postgres-XC
Performance Comparison
●
Three data nodes (16GB RAM each)
●
Postgres-XC also used a coordinator
●
Adds latency
●
Out-of-the-box default configuration
●
No replicas
Insert Comparison – single thread
●
0 – 1M Rows
●
MongoDB: 7m 06s
●
Postgres-XC: 131m 1s
●
Postgres-XC COPY: 43s
●
10M – 20M Rows
●
MongoDB: 64m 48s
●
Postgres-XC: 354m 56s
GTM in XC adds a lot of latency, hurting
single-threaded performance
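Converting the 0 – 1M row timings above into throughput makes the gap easier to compare. The query only restates the slide's numbers. The column names are chosen for this example.

```sql
-- Rows per second for the 0 - 1M row insert runs.
SELECT engine,
       elapsed_s,
       round(1000000 / elapsed_s) AS rows_per_second
FROM (VALUES
        ('MongoDB',          7 * 60 + 6.0),    -- 7m 06s
        ('Postgres-XC',      131 * 60 + 1.0),  -- 131m 1s
        ('Postgres-XC COPY', 43.0)             -- 43s
     ) AS t(engine, elapsed_s);
```

That works out to roughly 2,347 rows per second for MongoDB, 127 for single-row Postgres-XC inserts, and 23,256 for Postgres-XC COPY.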
Read Comparison
(shorter is better)
[Chart: read time in seconds (y-axis, 0 to 2.5) against rows in millions (x-axis, 1 to 10) for MongoDB and Postgres-XC]
Update Comparison – single thread
50 GB, single thread
●
1000 Updates by partitioned key
●
MongoDB: 43s
●
Postgres-XC: 1m 6s
●
1000 Updates by indexed non-partitioned key
●
MongoDB: 7m 55s
●
Postgres-XC: 1m 54s
Non-partitioned index-based faster in XC
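Since each test ran 1000 updates, the elapsed seconds translate directly into milliseconds per update. The query below restates the slide's timings. The column names are chosen for this example.

```sql
-- Average latency per update, from the totals above (1000 updates per run).
SELECT access_path,
       engine,
       round(elapsed_s * 1000.0 / 1000) AS ms_per_update
FROM (VALUES
        ('partitioned key',             'MongoDB',      43),
        ('partitioned key',             'Postgres-XC',  66),   -- 1m 6s
        ('indexed non-partitioned key', 'MongoDB',     475),   -- 7m 55s
        ('indexed non-partitioned key', 'Postgres-XC', 114)    -- 1m 54s
     ) AS t(access_path, engine, elapsed_s);
```

By partitioned key, Postgres-XC is about 1.5 times slower (66 ms versus 43 ms). By indexed non-partitioned key, it is roughly 4 times faster (114 ms versus 475 ms).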
Update Concurrency on Key
Possible Future Tests
●
Insert, Select concurrency test (important)
●
Mixed workload
●
Measure in-memory and not in-memory
●
Impact of replicas for availability
●
MongoDB replicas
●
Postgres-XC streaming replication
●
Have seen about 15% perf drop for two sync slaves
●
MongoDB Write-Concern durability settings (try
journaled)
●
Hstore
Other PostgreSQL Results?
●
Christophe Pettus:
wiki.postgresql.org/images/b/b4/Pg-as-nosql-pgday-fosdem-2013.pdf
●
Single laptop-based tests, but interesting
Summary
●
PostgreSQL has schema-less functionality built-in
and can act as a key-value store
●
Postgres-XC can scale this out horizontally to
multiple servers
●
MongoDB performs much better for
low-concurrency inserts
●
In XC, use COPY or multiple threads to populate
●
Postgres-XC performs better for non-partitioned
indexed access
●
Postgres-XC can perform about the same as
MongoDB for reads
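The advice to populate with COPY can be sketched as follows, assuming the `foo (myid int, hdata hstore)` table from the hstore slides. The file path and its contents are placeholders. The schema actually used in the benchmark is not shown in the slides.

```sql
-- Example line in the CSV file (hstore text with unquoted keys and values):
--   10,"name=>fred, department=>IT"

-- Server-side bulk load: one statement instead of one round trip per row.
COPY foo (myid, hdata) FROM '/tmp/foo.csv' WITH (FORMAT csv);

-- From psql, the client-side equivalent reads a file on the client machine:
--   \copy foo (myid, hdata) FROM 'foo.csv' WITH (FORMAT csv)
```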
Summary (2)
If Postgres-XC generally performs similarly to
MongoDB, why not use XC and
●
Stick with ACID
●
Feel secure with PostgreSQL maturity
●
Leverage PostgreSQL features and community
Thank You
Mason Sharp
mason@stormdb.com
@mason_db
Content Attribution
●
Postgres-XC Development Group
●
Koichi Suzuki
●
Michael Paquier
●
Ashutosh Bapat
●
Pavan Deolasee
●
Christophe Pettus
●
Mason Sharp
●
...
