SlideShare a Scribd company logo
1 of 20
H I M A N C H A L I
T E C H L E A D D B E @ I N M O B I
PostgreSQL as NoSQL
What’s on the plate today ?
 What is the requirement for Schemaless database ?
 Data type and volume that has to be handled.
 How can we use PostgreSQL to store Schemaless
data ?
 Performance of PostgreSQL Schemaless option.
What is Schemaless data ?
 It doesn’t mean unstructured, but it’s structured over
each document
 Each Document is a pair of key-values in a
hierarchical structure of arrays
 The application needs to be aware about this
structure and handle it if it’s not getting as expected
 Modern Schemaless DBs don’t use SQL and hence
are termed NoSQL DB. (confusing term)
Requirement for Schemaless data
 Introduction of new reporting data for merchant
where dimension is not fixed
 Currently there are 15 known fields, where data
varies from 5-15 for different merchants
 In future , chances of new dimension addition
 Data size to start with is 1 million per hour
Issue with current schema DB
 Fixed schema definition
 Too many null values
 Adding new column will need schema changes
 May be updates in old rows will be required
So, should we move to NoSQL ??
If PostgreSQL can solve it !!
 3 different ways provided by PostgreSQL for
Document types
 XML
 hstore
 JSON
hstore : Smart contrib module
 A hierarchical storage type for PostgreSQL
 Key Value store with ACID compliance
 Maps String Keys to String Values or other hstore values
 Rich in its own functions like..
 h -> “a”
 h?->”a”
 h@>”a->2”
 Indexing available : GiST and GIN
 Indexes whole hierarchy , not just a key
 Expression indexes are also supported
hstore continues….
CREATE TABLE my_store
(
id character varying(1024) NOT NULL,
doc hstore,
CONSTRAINT my_store_pkey PRIMARY KEY (id)
);
CREATE INDEX my_store_doc_idx_gist
ON my_store
USING gist(doc) ;
SELECT doc -> ‘text’ as merchant, doc -> ‘created_at’ as created_at
FROM my_store
WHERE doc @> ‘created_at=>23/12/2013’;
SELECT doc -> ‘text’ as merchant, doc -> ‘created_at’ as created_at
FROM my_store
WHERE doc @> ‘is_active=>:t’ AND doc ? ‘has_address’
ORDER BY doc -> ‘created_at’ DESC;
SELECT doc -> ‘text’ as merchant, doc -> ‘created_at’ as created_at
FROM my_store
WHERE doc @> ‘is_active=>:t’ AND doc ?| ARRAY[‘has_address’, ‘has_payoption’] ;
JSON
 Storage type of famous NoSQL DBs like MongoDB ,
CouchDB
 PostgreSQL introduced JSON type in 9.2; Stored as
text but with validation for JSON (--To be fixed in
9.4)
 Function like row_to_json , array_to_json
 But JSON Production and Processing absent
 9.3 Came with many new features
 to_json(any)
 Json_agg(record)
 Many Extraction Functions and operators
JSON continues…
 JSONB being introduced in 9.4
 JSONB Indexing
 Tables can be unlogged for further performance
increase at the cost of reliability
hstore and json
 hstore_to_json(hstore)
 hstore_to_json_loose(hstore)
Some numbers..
 Data used : id number, other fields text
 Record 2 Million , average 64bytes each
 Read CSV File, parse into appropriate format , Insert into DB
 Write Speed in Records/Sec
 hstore : 4600 r/s
 hstore(GiST) : 3000 r/s
 hstore (GIN) : 1700 r/s
 JSON : 4600 r/s
 MongoDB : 4000 r/s
• DB Size (in MB table/indexes)
 hstore : 300/0
 hstore(GiST) : 300/71
 hstore(GIN) : 300/700
 JSON : 300/60
 MongoDB : 1800/200
Write Speed
0
500
1000
1500
2000
2500
3000
3500
4000
4500
5000
Records/Second
Records/Second
Data Size
0
200
400
600
800
1000
1200
1400
1600
1800
2000
Index(MB)
Table(MB)
 Select query on primary key
 Fetch Time in Milliseconds
 hstore : 320
 hstore(GiST) : 190
 hstore(GIN) : 180
 JSON : 20
 MongoDB : 40
 Select query on Name (filter for 100 names)
 Fetch Time in Milliseconds
 hstore : 350
 hstore(GiST) : 150
 Hstore(GIN) : 140
 JSON : 10000
 MongoDB : 450
Select Query Fetch on PK
0
50
100
150
200
250
300
350
MilliSeconds
Select Query Fetch on Text
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
MilliSeconds
MilliSeconds
Some conclusions..
 PostgreSQL can be used as schemaless DB
 PostgreSQL’s relational data storage is very efficient.
 Build indexes using expressions on common used fields
 In hstore GiST index is much more efficient than GIN in our
case
 GiST/GIN accelerates every field not just the Primary Key
 GIN indexes are best for static data because lookups are
faster.
 For dynamic data, GiST indexes are faster to update.
 Specifically, GiST indexes are very good for dynamic data and
fast if the number of unique words (lexemes) is under
100,000.
So what you got in PostgreSQL
 Use the features of relational DB in your Schemaless
world !
 Use Constraints
 Use Transactions
 Use Indexing
 Do Joins on keys
 And with all these get NoSQL(Schemaless data) requirement
fulfilled as well.
 Save the new DB migration cost and time
 ---
 As these are PostgreSQL specific features, migration to
other RDBMS not possible.
Questions???
No offence to NoSQL DBs 
Thank You !!
himamahi09@gmail.com

More Related Content

What's hot

Scaling a SaaS backend with PostgreSQL - A case study
Scaling a SaaS backend with PostgreSQL - A case studyScaling a SaaS backend with PostgreSQL - A case study
Scaling a SaaS backend with PostgreSQL - A case studyOliver Seemann
 
MySQL Performance Schema in Action
MySQL Performance Schema in Action MySQL Performance Schema in Action
MySQL Performance Schema in Action Mydbops
 
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...Ashnikbiz
 
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree Ashnikbiz
 
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...MongoDB
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity PlanningNorberto Leite
 
Scaling ArangoDB on Mesosphere DCOS
Scaling ArangoDB on Mesosphere DCOSScaling ArangoDB on Mesosphere DCOS
Scaling ArangoDB on Mesosphere DCOSMax Neunhöffer
 
Webinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the CloudWebinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the CloudMongoDB
 
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...MongoDB
 
MongoDB performance
MongoDB performanceMongoDB performance
MongoDB performanceMydbops
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDBMongoDB
 
In-Memory Data Grids - Ampool (1)
In-Memory Data Grids - Ampool (1)In-Memory Data Grids - Ampool (1)
In-Memory Data Grids - Ampool (1)Chinmay Kulkarni
 
MongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB
 
[Pgday.Seoul 2018] 이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
[Pgday.Seoul 2018]  이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG[Pgday.Seoul 2018]  이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
[Pgday.Seoul 2018] 이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PGPgDay.Seoul
 
Webinar: Avoiding Sub-optimal Performance in your Retail Application
Webinar: Avoiding Sub-optimal Performance in your Retail ApplicationWebinar: Avoiding Sub-optimal Performance in your Retail Application
Webinar: Avoiding Sub-optimal Performance in your Retail ApplicationMongoDB
 
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC timeHBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC timeMichael Stack
 
Asko Oja Moskva Architecture Highload
Asko Oja Moskva Architecture HighloadAsko Oja Moskva Architecture Highload
Asko Oja Moskva Architecture HighloadOntico
 

What's hot (20)

Scaling a SaaS backend with PostgreSQL - A case study
Scaling a SaaS backend with PostgreSQL - A case studyScaling a SaaS backend with PostgreSQL - A case study
Scaling a SaaS backend with PostgreSQL - A case study
 
MySQL Performance Schema in Action
MySQL Performance Schema in Action MySQL Performance Schema in Action
MySQL Performance Schema in Action
 
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
 
re:dash is awesome
re:dash is awesomere:dash is awesome
re:dash is awesome
 
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
 
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity Planning
 
Scaling ArangoDB on Mesosphere DCOS
Scaling ArangoDB on Mesosphere DCOSScaling ArangoDB on Mesosphere DCOS
Scaling ArangoDB on Mesosphere DCOS
 
Webinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the CloudWebinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the Cloud
 
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
 
MongoDB performance
MongoDB performanceMongoDB performance
MongoDB performance
 
No sql
No sqlNo sql
No sql
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
 
In-Memory Data Grids - Ampool (1)
In-Memory Data Grids - Ampool (1)In-Memory Data Grids - Ampool (1)
In-Memory Data Grids - Ampool (1)
 
Deep Dive on ArangoDB
Deep Dive on ArangoDBDeep Dive on ArangoDB
Deep Dive on ArangoDB
 
MongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB Aggregation Performance
MongoDB Aggregation Performance
 
[Pgday.Seoul 2018] 이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
[Pgday.Seoul 2018]  이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG[Pgday.Seoul 2018]  이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
[Pgday.Seoul 2018] 이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
 
Webinar: Avoiding Sub-optimal Performance in your Retail Application
Webinar: Avoiding Sub-optimal Performance in your Retail ApplicationWebinar: Avoiding Sub-optimal Performance in your Retail Application
Webinar: Avoiding Sub-optimal Performance in your Retail Application
 
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC timeHBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
 
Asko Oja Moskva Architecture Highload
Asko Oja Moskva Architecture HighloadAsko Oja Moskva Architecture Highload
Asko Oja Moskva Architecture Highload
 

Similar to PostgreSQL as NoSQL

MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaperRajesh Kumar
 
DBVersity MongoDB Online Training Presentations
DBVersity MongoDB Online Training PresentationsDBVersity MongoDB Online Training Presentations
DBVersity MongoDB Online Training PresentationsSrinivas Mutyala
 
Json to hive_schema_generator
Json to hive_schema_generatorJson to hive_schema_generator
Json to hive_schema_generatorPayal Jain
 
Elephant in the room: A DBA's Guide to Hadoop
Elephant in the room: A DBA's Guide to HadoopElephant in the room: A DBA's Guide to Hadoop
Elephant in the room: A DBA's Guide to HadoopStuart Ainsworth
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBantoinegirbal
 
2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introduction2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introductionantoinegirbal
 
At the core you will have KUSTO
At the core you will have KUSTOAt the core you will have KUSTO
At the core you will have KUSTORiccardo Zamana
 
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...NoSQLmatters
 
NoSQL Solutions - a comparative study
NoSQL Solutions - a comparative studyNoSQL Solutions - a comparative study
NoSQL Solutions - a comparative studyGuillaume Lefranc
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at HuaweiHBaseCon
 
SQL vs NoSQL, an experiment with MongoDB
SQL vs NoSQL, an experiment with MongoDBSQL vs NoSQL, an experiment with MongoDB
SQL vs NoSQL, an experiment with MongoDBMarco Segato
 

Similar to PostgreSQL as NoSQL (20)

MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaper
 
DBVersity MongoDB Online Training Presentations
DBVersity MongoDB Online Training PresentationsDBVersity MongoDB Online Training Presentations
DBVersity MongoDB Online Training Presentations
 
Mysql using php
Mysql using phpMysql using php
Mysql using php
 
Json to hive_schema_generator
Json to hive_schema_generatorJson to hive_schema_generator
Json to hive_schema_generator
 
Beyond Relational Databases
Beyond Relational DatabasesBeyond Relational Databases
Beyond Relational Databases
 
Sql Basics And Advanced
Sql Basics And AdvancedSql Basics And Advanced
Sql Basics And Advanced
 
unit 1 ppt.pptx
unit 1 ppt.pptxunit 1 ppt.pptx
unit 1 ppt.pptx
 
Elephant in the room: A DBA's Guide to Hadoop
Elephant in the room: A DBA's Guide to HadoopElephant in the room: A DBA's Guide to Hadoop
Elephant in the room: A DBA's Guide to Hadoop
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introduction2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introduction
 
Sql
SqlSql
Sql
 
MongoDB
MongoDBMongoDB
MongoDB
 
At the core you will have KUSTO
At the core you will have KUSTOAt the core you will have KUSTO
At the core you will have KUSTO
 
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
 
Mongodb Introduction
Mongodb IntroductionMongodb Introduction
Mongodb Introduction
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
NoSQL Solutions - a comparative study
NoSQL Solutions - a comparative studyNoSQL Solutions - a comparative study
NoSQL Solutions - a comparative study
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
 
MYSQL.ppt
MYSQL.pptMYSQL.ppt
MYSQL.ppt
 
SQL vs NoSQL, an experiment with MongoDB
SQL vs NoSQL, an experiment with MongoDBSQL vs NoSQL, an experiment with MongoDB
SQL vs NoSQL, an experiment with MongoDB
 

Recently uploaded

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 

Recently uploaded (20)

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 

PostgreSQL as NoSQL

  • 1. H I M A N C H A L I T E C H L E A D D B E @ I N M O B I PostgreSQL as NoSQL
  • 2. What’s on the plate today ?  What is the requirement for Schemaless database ?  Data type and volume that has to be handled.  How can we use PostgreSQL to store Schemaless data ?  Performance of PostgreSQL Schemaless option.
  • 3. What is Schemaless data ?  It doesn’t mean unstructured, but it’s structured over each document  Each Document is a pair of key-values in a hierarchical structure of arrays  The application needs to be aware about this structure and handle it if it’s not getting as expected  Modern Schemaless DBs don’t use SQL and hence are termed NoSQL DB. (confusing term)
  • 4. Requirement for Schemaless data  Introduction of new reporting data for merchant where dimension is not fixed  Currently there are 15 known fields, where data varies from 5-15 for different merchants  In future , chances of new dimension addition  Data size to start with is 1 million per hour
  • 5. Issue with current schema DB  Fixed schema definition  Too many null values  Adding new column will need schema changes  May be updates in old rows will be required So, should we move to NoSQL ??
  • 6. If PostgreSQL can solve it !!  3 different ways provided by PostgreSQL for Document types  XML  hstore  JSON
  • 7. hstore : Smart contrib module  A hierarchical storage type for PostgreSQL  Key Value store with ACID compliance  Maps String Keys to String Values or other hstore values  Rich in its own functions like..  h -> “a”  h?->”a”  h@>”a->2”  Indexing available : GiST and GIN  Indexes whole hierarchy , not just a key  Expression indexes are also supported
  • 8. hstore continues…. CREATE TABLE my_store ( id character varying(1024) NOT NULL, doc hstore, CONSTRAINT my_store_pkey PRIMARY KEY (id) ); CREATE INDEX my_store_doc_idx_gist ON my_store USING gist(doc) ; SELECT doc -> ‘text’ as merchant, doc -> ‘created_at’ as created_at FROM my_store WHERE doc @> ‘created_at=>23/12/2013’; SELECT doc -> ‘text’ as merchant, doc -> ‘created_at’ as created_at FROM my_store WHERE doc @> ‘is_active=>:t’ AND doc ? ‘has_address’ ORDER BY doc -> ‘created_at’ DESC; SELECT doc -> ‘text’ as merchant, doc -> ‘created_at’ as created_at FROM my_store WHERE doc @> ‘is_active=>:t’ AND doc ?| ARRAY[‘has_address’, ‘has_payoption’] ;
  • 9. JSON  Storage type of famous NoSQL DBs like MongoDB , CouchDB  PostgreSQL introduced JSON type in 9.2; Stored as text but with validation for JSON (--To be fixed in 9.4)  Function like row_to_json , array_to_json  But JSON Production and Processing absent  9.3 Came with many new features  to_json(any)  Json_agg(record)  Many Extraction Functions and operators
  • 10. JSON continues…  JSONB being introduced in 9.4  JSONB Indexing  Tables can be unlogged for further performance increase at the cost of reliability
  • 11. hstore and json  hstore_to_json(hstore)  hstore_to_json_loose(hstore)
  • 12. Some numbers..  Data used : id number, other fields text  Record 2 Million , average 64bytes each  Read CSV File, parse into appropriate format , Insert into DB  Write Speed in Records/Sec  hstore : 4600 r/s  hstore(GiST) : 3000 r/s  hstore (GIN) : 1700 r/s  JSON : 4600 r/s  MongoDB : 4000 r/s • DB Size (in MB table/indexes)  hstore : 300/0  hstore(GiST) : 300/71  hstore(GIN) : 300/700  JSON : 300/60  MongoDB : 1800/200
  • 15.  Select query on primary key  Fetch Time in Milliseconds  hstore : 320  hstore(GiST) : 190  hstore(GIN) : 180  JSON : 20  MongoDB : 40  Select query on Name (filter for 100 names)  Fetch Time in Milliseconds  hstore : 350  hstore(GiST) : 150  Hstore(GIN) : 140  JSON : 10000  MongoDB : 450
  • 16. Select Query Fetch on PK 0 50 100 150 200 250 300 350 MilliSeconds
  • 17. Select Query Fetch on Text 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 MilliSeconds MilliSeconds
  • 18. Some conclusions..  PostgreSQL can be used as schemaless DB  PostgreSQL’s relational data storage is very efficient.  Build indexes using expressions on common used fields  In hstore GiST index is much more efficient than GIN in our case  GiST/GIN accelerates every field not just the Primary Key  GIN indexes are best for static data because lookups are faster.  For dynamic data, GiST indexes are faster to update.  Specifically, GiST indexes are very good for dynamic data and fast if the number of unique words (lexemes) is under 100,000.
  • 19. So what you got in PostgreSQL  Use the features of relational DB in your Schemaless world !  Use Constraints  Use Transactions  Use Indexing  Do Joins on keys  And with all these get NoSQL(Schemaless data) requirement fulfilled as well.  Save the new DB migration cost and time  ---  As these are PostgreSQL specific features, migration to other RDBMS not possible.
  • 20. Questions??? No offence to NoSQL DBs  Thank You !! himamahi09@gmail.com