Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

NewSQL vs NoSQL for New OLTP

Historically, enterprises used traditional RDBMs for on-line transaction processing (OLTP) applications. We affectionately call these systems OldSQL, because they are largely legacy code bases, originally written decades ago. New OLTP applications have more extreme performance requirements than the Old OLTP applications of yesteryears. These are caused by web users directly submitting transactions rather than using a professional terminal operator and by mobile devices (and sensors) enabling transaction submission from many more locations. In a considerable number of modern applications (multiplayer games, risk analysis in electronic trading, gambling, social networks, etc.) OldSQL is cracking under the volume of interactions. This talk contrasts two alternatives to OldSQL in this area:

NoSQL, where both SQL and ACID transactions are jettisoned for better performance
NewSQL, where SQL and ACID are retained, and better performance is delivered through innovative architectures

NewSQL vs NoSQL for New OLTP

  1. 1. the NewSQL database you’ll never outgrowOldSQL vs. NoSQL vs. NewSQL on New OLTPMichael Stonebraker, CTOVoltDB, Inc.
  2. 2. Old OLTP  Remember how we used to buy airplane tickets in the 1980s + By telephone + Through an intermediary (professional terminal operator)  Commerce at the speed of the intermediary  In 1985, 1,000 transactions per second was considered an incredible stretch goal!!!! + HPTS (1985)VoltDB 2
  3. 3. How has OLTP Changed in 25 Years? The internet + Client is no longer a professional terminal operator + Instead Aunt Martha is using the web herself + Sends volume through the roofVoltDB 3
  4. 4. How has OLTP Changed in 25 Years? PDAs and sensors + Your cell phone is a transaction originator + Everything is being geo-positioned by sensors (marathon runners, your car, ….) + Sends volume through the roofVoltDB 4
  5. 5. How has OLTP Changed in 25 Years? The definitions + “Online” no longer exclusively means a human operator — The oncoming data tsunami is often device and system-generated + “Transaction” now transcends the traditional business transaction — High-throughput ACID write operations are a new requirement + “HA” and “durability” are now core database requirementsVoltDB 5
  6. 6. Example NewOLTP Use Cases Data High-frequency Lower-frequency Source operations operations Write/index all trades, store Show consolidated risk Capital markets tick data across traders Call initiation request Real-time authorization Fraud detection/analysis Inbound HTTP Visitor logging, analysis, Traffic pattern analytics requests alerting Rank scores: Online game •Defined intervals Leaderboard lookups •Player “bests” Real-time ad trading Match form factor, Report ad performance systems placement criteria, bid/ask from exhaust stream Mobile device location Location updates, QoS, Analytics on transactions sensor transactionsVoltDB VoltDB 6 6 6
  7. 7. New OLTP Challenges New OLTP and You You need to ingest the firehose in real time You need to process, validate, enrich and respond in real-time You often need real-time analyticsVoltDB VoltDB 7 7 7
  8. 8. Solution Choices  OldSQL + Legacy RDBMS vendors  NoSQL + Give up SQL and ACID for performance  NewSQL + Preserve SQL and ACID + Get performance from a new architectureVoltDB 8
  9. 9. OldSQL Traditional SQL vendors (the “elephants”) + Code lines dating from the 1980’s + “bloatware” + Not very good at anything — Can be beaten by at least an order of magnitude in every vertical market I know of + Mediocre performance on New OLTP — At low velocity it doesn’t matter — Otherwise you get to tear your hair outVoltDB 9
  10. 10. Reality Check  TPC-C CPU cycles  On the Shore DBMS prototype  Elephants should be similarVoltDB 10
  11. 11. The Elephants  Are slow because they spend all of their time on overhead!!! + Not on useful work  Would have to re-architect their legacy code to do betterVoltDB 11
  12. 12. To Go a Lot Faster You Have to……  Focus on overhead + Better B-trees affects only 4% of the path length  Get rid of ALL major sources of overhead + Main memory deployment – gets rid of buffer pool — Leaving other 75% of overhead intact — i.e. win is 25%VoltDB 12
  13. 13. NoSQL  Give up SQL  Give up ACIDVoltDB 13
  14. 14. Give Up SQL? Compiler translates SQL at compile time into a sequence of low level operations Similar to what the NoSQL products make you program in your application 30 years of RDBMS experience + Hard to beat the compiler + High level languages are good (data independence, less code, …) + Stored procedures are good! — One round trip from app to DBMS rather than one one round trip per record — Move the code to the data, not the other way aroundVoltDB 14
  15. 15. Give Up ACID? If you have multi-record transactions + Need ACID or have to program it in user-level code + Rolling your own is messy, brittle, error-prone Compensating transactions = pain + They’re people intensive and expensive + They cause customer/user dissatisfaction + Non-commutative single-record transactions (e.g. buy an item) simply failVoltDB 15
  16. 16. NoSQL Summary  Appropriate for non-transactional systems  Appropriate for single record transactions that are commutative  Not a good fit for New OLTP  Use the right tool for the job Interesting … Two recently-proposed NoSQL language standards – CQL and UnQL – are amazingly similar to (you guessed it!) SQLVoltDB 16
  17. 17. NewSQL  SQL  ACID  Performance and scalability through modern innovative software architectureVoltDB 17
  18. 18. NewSQL  Needs something other than traditional record level locking (1st big source of overhead) + timestamp order + MVCC + Your good idea goes hereVoltDB 18
  19. 19. NewSQL  Needs a solution to buffer pool overhead (2nd big source of overhead) + Main memory (at least for data that is not cold) + Some other way to reduce buffer pool costVoltDB 19
  20. 20. NewSQL  Needs a solution to latching for shared data structures (3rd big source of overhead) + Some innovative use of B-trees + Single-threading + Your good idea goes hereVoltDB 20
  21. 21. NewSQL  Needs a solution to write-ahead logging (4th big source of overhead) + Obvious answer is built-in replication and failover + New OLTP views this as a requirement anyway  Some details + On-line failover? + On-line failback? + LAN network partitioning? + WAN network partitioning?VoltDB 21
  22. 22. Node Speed Really Matters Application Requirement: 1MM operations/sec at scale Operations/sec/node 10,000 Operations/sec/node 100,000 Nodes (without HA) 100 Nodes (without HA) 10 HA Replication factor 2 HA Replication factor 2 Nodes (including HA) 300 Nodes (including HA) 30  Greater scale through reduced network traffic  Reduced probability of hardware failures  Lower monetary and environmental costs A high-scaling RDBMS must combine transparent distribution, built-in durability and blazing node speedVoltDB 22
  23. 23. A NewSQL Example – VoltDB  Main-memory storage  Single threaded, run Xacts to completion + No locking + No latching  Built-in HA and durability + No logVoltDB 23
  24. 24. Yabut: What About Multicore?  For A K-core CPU, divide memory into K (non overlapping) buckets  i.e. convert multi-core to K single coresVoltDB 24
  25. 25. Where all the time goes… revisited Before VoltDBVoltDB 25
  26. 26. How Scalable is VoltDB?“ VoltDB is very scalable; it should scale to 120 partitions, 39 servers, and 1.6 million complex transactions per second at over 300 CPU cores. SGI Announces Support and Record Benchmarks for VoltDB Database High Performance SGI® Rackable™ Baron Schwartz Cluster Delivers Over Three Million Chief Performance Architect Transactions per Second Percona Vanilla Hardware Pumped-up HardwareVoltDB 26
  27. 27. Current VoltDB Status  Runs a subset of SQL (which is getting larger)  On VoltDB clusters (in memory on commodity gear)  No WAN support yet + Discussions abound on exactly what to do here  45X a popular OldSQL DBMS on TPC-C  5-7X Cassandra on VoltDB K-V layer  Clearly note this is an open source system!VoltDB 27
  28. 28. Summary Old OLTP New OLTP  Too slow OldSQL for New OLTP  Does not scale  Lacks consistency guarantees NoSQL for New OLTP  Low-level interface  Fast, scalable and consistent NewSQL for New OLTP  Supports SQLVoltDB 28
  29. 29. the NewSQL database you’ll never outgrow Thank You