gluent.com 1
SQL in the Hybrid World
Tanel Poder
a long time computer performance geek
gluent.com 2
Intro: About me
• Tanel Põder
• Oracle Database Performance geek (18+ years)
• Exadata Performance geek
• Linux Performance geek
• Hadoop Performance geek
• CEO & co-founder:
Expert Oracle Exadata book (2nd edition is out now!)
Instant promotion
gluent.com 3
The Hybrid World
[Diagram labels: Gluent, Hadoop, MSSQL, Postgres, NoSQL DBs, Big Data Sources, Oracle, App X, App Y, App Z]
gluent.com 4
Why this talk?
• Various NoSQL databases and Hadoop projects are very hot!
• Is this stuff for real?
• I’m mainly talking about traditional enterprises
  • Not small startups
  • Not web/mobile giants
• This is not an academic analysis of data consistency theorems
  • A basic reasoning from a database professional’s point of view…
But hey, it’s 2016!
gluent.com 5
My data engineering career
(And why it is relevant to this talk)
• Software Developer
• Production DBA
• Software Architect
• Development DBA
• Troubleshooter, Problem solver
My problem-solving career history has given great insight into how things actually work (and fail)
Where rubber meets the road
gluent.com 6
NoSQL
gluent.com 7
Typical properties associated with NoSQL
1. Schemaless / dynamic schema
  • Dump any fields into (and under) any records
2. Doesn’t do joins
  • Everything related to an item should physically be stored with the item
3. Horizontally scalable distributed systems
  • Data in a “document” doesn’t span cluster nodes
  • So, nodes don’t coordinate with each other much
4. Not ACID compliant
  • Consistent or “eventually consistent”
  • No multi-row / multi-document transactions
Not all (NoSQL) databases are the same! No single definition of NoSQL!
As otherwise the cluster nodes would have to coordinate with each other (and two-phase commit)
gluent.com 8
Real Life: Early “NoSQL” drivers (2004)
1. Adding a column in a development RDBMS takes 2 days
  • Developer has to file a form or two for Dev DBA
2. Getting this column into production RDBMS takes 2 weeks
  • Structural table changes may break things, can’t release often
3. Not a technical issue, but a process limitation
  • App Dev not moving fast enough for business
gluent.com 9
Real Life: Early “NoSQL” drivers
4. Developers didn’t like the organizational slowness, so:
  • Reaction: Using catch-all XML columns and generic entity-attribute-value designs for “dynamic” data models
  • Result: RDBMS Performance problems with real datasets
  • Conclusion: “RDBMS are too slow for modern applications”
  • Actual reality: You should use relational databases in a relational way
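The entity-attribute-value point can be illustrated with a minimal sketch, using Python's built-in sqlite3 module. The table and column names (customer_eav, customers, cust_class) are hypothetical and chosen only for this illustration. The same question needs one self-join per attribute against the generic table, but only plain column predicates against the relational one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Generic entity-attribute-value table (hypothetical names)
    CREATE TABLE customer_eav (entity_id INTEGER, attr TEXT, value TEXT);
    INSERT INTO customer_eav VALUES
        (1, 'name', 'Tanel'), (1, 'city', 'SomeCity'), (1, 'cust_class', 'Prime'),
        (2, 'name', 'Ted'),   (2, 'city', 'SomeCity'), (2, 'cust_class', 'Basic');

    -- The same data modeled in a relational way
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT, cust_class TEXT);
    INSERT INTO customers VALUES
        (1, 'Tanel', 'SomeCity', 'Prime'), (2, 'Ted', 'SomeCity', 'Basic');
""")

# EAV: every extra attribute in the filter or output costs another self-join.
eav_rows = conn.execute("""
    SELECT n.value
    FROM customer_eav n
    JOIN customer_eav c ON c.entity_id = n.entity_id AND c.attr = 'city'
    JOIN customer_eav k ON k.entity_id = n.entity_id AND k.attr = 'cust_class'
    WHERE n.attr = 'name' AND c.value = 'SomeCity' AND k.value = 'Prime'
""").fetchall()

# Relational: one table access with ordinary column predicates.
rel_rows = conn.execute(
    "SELECT name FROM customers WHERE city = 'SomeCity' AND cust_class = 'Prime'"
).fetchall()

print(eav_rows, rel_rows)
```

Both queries return the same customer. The EAV form also stores every value as text, so the database cannot apply per-column data types or statistics, which is one plausible reason such designs struggle with real datasets.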
gluent.com 10
“Joins are slow!”
Compared to what?
gluent.com 11
Real Life: Joins are slow! (MySQL edition)
• Circa year 2008
• Claim: All joins are slow!
• Look deeper: Joining two small tables in MySQL is slow
• Look deeper: MySQL didn’t have hash joins back then
• Look deeper: Joining unindexed tables with nested loops
• Actual Reality: Optimize physical design, use a more powerful RDBMS
By now, MySQL has buffered (block) nested loop joins and MariaDB buffered hash joins
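To see why an unindexed nested loop join hurts while a hash join does not, here is a small Python sketch of both algorithms over in-memory rows. It is an illustration of the general techniques only, not of how MySQL or MariaDB implement them. The function names and sample data are made up for this example.

```python
def nested_loop_join(left, right, key):
    """Compare every left row with every right row: len(left) * len(right) comparisons."""
    comparisons = 0
    out = []
    for lrow in left:
        for rrow in right:
            comparisons += 1
            if lrow[key] == rrow[key]:
                out.append({**lrow, **rrow})
    return out, comparisons


def hash_join(left, right, key):
    """Build a hash table on one input, then probe it once per row of the other."""
    buckets = {}
    for rrow in right:
        buckets.setdefault(rrow[key], []).append(rrow)
    probes = 0
    out = []
    for lrow in left:
        probes += 1
        for rrow in buckets.get(lrow[key], []):
            out.append({**lrow, **rrow})
    return out, probes


orders = [{"cust_id": i % 100, "order_id": i} for i in range(1000)]
customers = [{"cust_id": i, "name": f"cust{i}"} for i in range(100)]

nl_rows, nl_work = nested_loop_join(orders, customers, "cust_id")
hj_rows, hj_work = hash_join(orders, customers, "cust_id")

print(len(nl_rows), nl_work)
print(len(hj_rows), hj_work)
```

Both functions produce the same joined rows. The nested loop does one comparison per pair of rows, while the hash join does one probe per order after a single pass over the customers. An index on the join column gives a nested loop join a similar shortcut.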
gluent.com 12
Joins are slow! (modern scale-out edition)
• Store all related records of a document in a single server node in a nested structure
  • Orders under a Customer document
  • Order_items stored under an Order document
• Data “pre-joined” on write: De-normalization (embedding)
  • Physical co-location to avoid extra IO + talking to other cluster nodes
• Document Schema design important for performance!
Customer (Name: Tanel, Address: Somestreet 12-34, SomeCity, ZZ)
  Order #123 (Order Date: 2015-12-01)
    Order Item #1 (Quantity: 1)
      Product (Name: Vacuum Cleaner)
    Order Item #2 (Quantity: 3)
      Product (Name: Air Filter)
All the “nested” data related to a customer contained in a single physical location & server
Plus mirroring to other server(s) of course
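The nested customer document above could be sketched in Python as follows. The field names (orders, items, product and so on) and the dictionary standing in for a document collection are assumptions made for illustration and do not reflect any particular database's API.

```python
import json

# The customer document from the slide, with orders and items embedded.
customer_doc = {
    "name": "Tanel",
    "address": "Somestreet 12-34, SomeCity, ZZ",
    "orders": [
        {
            "order_id": 123,
            "order_date": "2015-12-01",
            "items": [
                {"item_no": 1, "quantity": 1, "product": {"name": "Vacuum Cleaner"}},
                {"item_no": 2, "quantity": 3, "product": {"name": "Air Filter"}},
            ],
        }
    ],
}

# A toy "collection" keyed by customer name: one lookup returns everything.
collection = {customer_doc["name"]: json.dumps(customer_doc)}

doc = json.loads(collection["Tanel"])
products = [
    (item["product"]["name"], item["quantity"])
    for order in doc["orders"]
    for item in order["items"]
]
print(products)
```

A single key lookup returns the customer together with its orders, items and product names, so no join is needed at read time. The cost is that product data is copied into every order item that references it.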
gluent.com 13
Document data retrieval (simplified)
[Diagram labels: Customer “Tanel”, Customer “Ted”, Customer “Joe”, Order #123, Item #1, Item #2, Product Y]
Data “pre-joined” in a nested structure
gluent.com 14
Relational data retrieval (simplified)
[Diagram labels: Customers, Orders, Order Items, Products]
Retrieval must visit multiple separate (indexes) and tables
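For comparison, a minimal relational sketch of the same customer data, using Python's built-in sqlite3 module. The schema and column names are assumptions for illustration. Retrieving one customer's order lines touches four tables through joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers   (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products    (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders      (order_id INTEGER PRIMARY KEY,
                              customer_id INTEGER REFERENCES customers,
                              order_date TEXT);
    CREATE TABLE order_items (order_id INTEGER REFERENCES orders,
                              item_no INTEGER,
                              product_id INTEGER REFERENCES products,
                              quantity INTEGER,
                              PRIMARY KEY (order_id, item_no));

    INSERT INTO customers VALUES (1, 'Tanel');
    INSERT INTO products VALUES (10, 'Vacuum Cleaner'), (11, 'Air Filter');
    INSERT INTO orders VALUES (123, 1, '2015-12-01');
    INSERT INTO order_items VALUES (123, 1, 10, 1), (123, 2, 11, 3);
""")

# Each JOIN is another table (and usually another index) the engine must visit.
order_lines = conn.execute("""
    SELECT c.name, o.order_id, oi.item_no, p.name, oi.quantity
    FROM customers c
    JOIN orders o       ON o.customer_id = c.customer_id
    JOIN order_items oi ON oi.order_id = o.order_id
    JOIN products p     ON p.product_id = oi.product_id
    WHERE c.name = 'Tanel'
    ORDER BY o.order_id, oi.item_no
""").fetchall()

for line in order_lines:
    print(line)
```

Each product name is stored once and shared by all order items, at the price of extra lookups when the data is read back.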
gluent.com 15
Joins in MongoDB 3.2
• Source: https://www.mongodb.com/blog/post/joins-and-other-aggregation-enhancements-coming-in-mongodb-3-2-part-1-of-3-introduction
gluent.com 16
Queries in MongoDB 3.2
• Source: https://www.mongodb.com/blog/post/joins-and-other-aggregation-enhancements-coming-in-mongodb-3-2-part-1-of-3-introduction
gluent.com 17
Filtering in MongoDB 3.2
• Query & filter an array of elements:
• MongoDB Documentation: https://docs.mongodb.org/v2.6/reference/method/db.collection.find/#examples
SELECT
FROM bios
WHERE
  award = 'Turing Award'
  AND year > 1980 ?
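The subtle part of the question above is that both conditions must hold for the same array element. The Python sketch below shows that semantics on plain dictionaries. The sample documents and function names are invented for illustration. In MongoDB itself this kind of same-element matching is what the $elemMatch operator expresses.

```python
# Hypothetical sample data: each bio holds an array of award sub-documents.
bios = [
    {"name": "A", "awards": [{"award": "Turing Award", "year": 1977},
                             {"award": "National Medal of Science", "year": 1990}]},
    {"name": "B", "awards": [{"award": "Turing Award", "year": 2001}]},
]


def same_element_match(doc):
    """True if a single award is a Turing Award AND was given after 1980."""
    return any(a["award"] == "Turing Award" and a["year"] > 1980
               for a in doc["awards"])


def separate_conditions_match(doc):
    """True if some award is a Turing Award and some (possibly other) award is after 1980."""
    return (any(a["award"] == "Turing Award" for a in doc["awards"])
            and any(a["year"] > 1980 for a in doc["awards"]))


strict = [d["name"] for d in bios if same_element_match(d)]
loose = [d["name"] for d in bios if separate_conditions_match(d)]
print(strict, loose)
```

Person A matches only the loose version, because the Turing Award and the post-1980 year belong to different array elements. In a normalized relational model each award would be its own row, so the SQL WHERE clause on the slide has the strict meaning by default.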
gluent.com 18
Joins in MongoDB 3.2
• Source: https://www.mongodb.com/blog/post/joins-and-other-aggregation-enhancements-coming-in-mongodb-3-2-part-1-of-3-introduction
gluent.com 19
Joins in Cassandra
• Source: http://www.datastax.com/2015/03/how-to-do-joins-in-apache-cassandra-and-datastax-enterprise
gluent.com 20
Joins in Cassandra
[Diagram labels: User; Spark SQL, Tableau, etc]
The join is performed in application layer (Spark, Tableau), not pushed into DB layer
gluent.com 21
SQL on Hadoop!
gluent.com 22
Why Hadoop?
• Scalability in Software!
• Open Data Formats
• Future-proof!
• One Data, Many Engines!
Hive Impala Spark Presto Giraph
gluent.com 23
Scalability vs. Features (Hive example)
$ hive (version 1.2.1 HDP 2.3.1)
hive> SELECT SUM(duration)
> FROM call_detail_records
> WHERE
> type = 'INTERNET'
> OR phone_number IN ( SELECT phone_number
> FROM customer_details
> WHERE region = 'R04' );
FAILED: SemanticException [Error 10249]: Line 5:17
Unsupported SubQuery Expression 'phone_number':
Only SubQuery expressions that are top level
conjuncts are allowed
We Oracle users have been spoiled with a very sophisticated SQL engine for years :)
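One common way around this kind of restriction is to turn the subquery into an outer join against a de-duplicated derived table. The sketch below uses Python's built-in sqlite3 module only to check that the two forms return the same answer on a few made-up rows. SQLite accepts the original query, so this says nothing about whether Hive 1.2.1 accepts or efficiently runs the rewrite; that remains an assumption to verify on the target engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE call_detail_records (phone_number TEXT, type TEXT, duration INTEGER);
    CREATE TABLE customer_details (phone_number TEXT, region TEXT);
    INSERT INTO call_detail_records VALUES
        ('111', 'INTERNET', 10), ('222', 'VOICE', 20),
        ('333', 'VOICE', 40),    ('222', 'INTERNET', 80);
    -- Note the duplicate row for '222': the rewrite must not double count it.
    INSERT INTO customer_details VALUES
        ('222', 'R04'), ('222', 'R04'), ('333', 'R01');
""")

# Original form from the slide: subquery inside an OR predicate.
original = conn.execute("""
    SELECT SUM(duration)
    FROM call_detail_records
    WHERE type = 'INTERNET'
       OR phone_number IN (SELECT phone_number
                           FROM customer_details
                           WHERE region = 'R04')
""").fetchone()[0]

# Rewrite: outer join to the distinct matching keys, then test for a match.
rewritten = conn.execute("""
    SELECT SUM(c.duration)
    FROM call_detail_records c
    LEFT JOIN (SELECT DISTINCT phone_number
               FROM customer_details
               WHERE region = 'R04') r
           ON r.phone_number = c.phone_number
    WHERE c.type = 'INTERNET' OR r.phone_number IS NOT NULL
""").fetchone()[0]

print(original, rewritten)
```

The DISTINCT in the derived table matters: without it, a customer listed twice in customer_details would have their call durations summed twice.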
gluent.com 24
Scalability vs. Features (Impala example)
$ impala-shell
SELECT SUM(order_total)
FROM orders
WHERE order_mode='online'
OR customer_id IN (SELECT customer_id FROM customers
WHERE customer_class = 'Prime');
Query: select SUM(order_total) FROM orders WHERE
order_mode='online' OR customer_id IN (SELECT
customer_id FROM customers WHERE customer_class =
'Prime')
ERROR: AnalysisException: Subqueries in OR predicates are
not supported: order_mode = 'online' OR customer_id IN
(SELECT customer_id FROM soe.customers WHERE
customer_class = 'Prime')
Cloudera: CDH5.5 Impala 2.3.0
gluent.com 25
Scalability vs. Features (Hive example)
$ hive (version 1.2.1 HDP 2.3.1)
hive> SELECT COUNT(*) FROM sales10m
> WHERE
> cust_id IN (SELECT cust_id FROM c)
> AND prod_id IN (SELECT prod_id FROM p);
FAILED: SemanticException [Error 10249]: Line 1:85
Unsupported SubQuery Expression 'prod_id': Only 1
SubQuery expression is supported.
These are deliberately very simple examples. Not even talking about complex window functions etc
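When an engine allows only one subquery expression, the IN filters can usually be rewritten as joins. The sketch below again uses Python's built-in sqlite3 module with made-up rows to compare the results. It is a check of logical equivalence only, not a statement about what Hive 1.2.1 accepts. It also shows the classic pitfall: joining directly to tables that contain duplicate keys inflates the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales10m (cust_id INTEGER, prod_id INTEGER);
    CREATE TABLE c (cust_id INTEGER);
    CREATE TABLE p (prod_id INTEGER);
    INSERT INTO sales10m VALUES (1, 10), (1, 11), (2, 10), (3, 12);
    INSERT INTO c VALUES (1), (1), (2);      -- duplicate key 1
    INSERT INTO p VALUES (10), (10), (12);   -- duplicate key 10
""")

# Original form from the slide: two IN subqueries.
original = conn.execute("""
    SELECT COUNT(*) FROM sales10m
    WHERE cust_id IN (SELECT cust_id FROM c)
      AND prod_id IN (SELECT prod_id FROM p)
""").fetchone()[0]

# Rewrite: inner joins to de-duplicated key lists keep the semi-join meaning.
rewritten = conn.execute("""
    SELECT COUNT(*)
    FROM sales10m s
    JOIN (SELECT DISTINCT cust_id FROM c) dc ON dc.cust_id = s.cust_id
    JOIN (SELECT DISTINCT prod_id FROM p) dp ON dp.prod_id = s.prod_id
""").fetchone()[0]

# Pitfall: plain joins multiply rows when the lookup tables have duplicates.
naive = conn.execute("""
    SELECT COUNT(*)
    FROM sales10m s
    JOIN c ON c.cust_id = s.cust_id
    JOIN p ON p.prod_id = s.prod_id
""").fetchone()[0]

print(original, rewritten, naive)
```

A mature optimizer performs this kind of transformation automatically, which is the convenience the slide says Oracle users have come to expect.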
gluent.com 26
Engineering Tradeoffs
• Speed of Application Delivery
• Application Speed & Efficiency
• Workload Scalability
• Application Reliability
• Cost!
gluent.com 27
Hadoop, NoSQL, RDBMS: One way to look at it
• Hadoop Data Lake: All Data! Scalable, Flexible
• RDBMS (Complex Transactions, Analytics): Sophisticated, Out-of-the-box, Mature
• “NoSQL” (Transaction Ingest): Scalable Simple Transactions
• Unified SQL access
gluent.com 28
Thanks!
http://blog.tanelpoder.com
tanel@tanelpoder.com
@tanelpoder
We are hiring developers & data engineers!!!
http://gluent.com
