This presentation, given by Dave Rosenthal at NoSQL Now! 2013, makes the case that NoSQL databases will need to support ACID transactions so that developers can more easily build, deploy, and scale applications in the future.
2. NoSQL's Motivation
Make it easy to build and deploy applications.
Ease of scaling and operation
Fault tolerance
Many data models
Good price/performance
X ACID transactions
3. What if we had ACID?
Good for financial applications?
Big performance hit?
Sacrifice availability?
Nope… When NoSQL has ACID, it
opens up a very different path.
5. Bugs don't appear under concurrency
• ACID means isolation.
• Reason locally rather than globally.
– If every transaction maintains an
invariant, then multiple clients running
any combination of concurrent
transactions also maintain that invariant.
• The impact of each client is isolated.
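The local-reasoning claim above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Bank` class, the account names, and the use of a single in-process lock as a stand-in for real transaction isolation. Each `transfer` preserves the invariant "total balance is constant" on its own, so any interleaving of concurrent clients preserves it too.

```python
import threading

class Bank:
    """Toy in-memory store; one lock stands in for transaction isolation."""

    def __init__(self, balances):
        self._balances = dict(balances)
        self._lock = threading.Lock()

    def transfer(self, src, dst, amount):
        # The whole read-modify-write runs as one isolated unit.
        with self._lock:
            if self._balances[src] < amount:
                return False  # abort: would overdraw the source account
            self._balances[src] -= amount
            self._balances[dst] += amount
            return True

    def total(self):
        with self._lock:
            return sum(self._balances.values())

def run_clients(bank, n_clients=8, n_transfers=1000):
    """Start several concurrent clients that move money back and forth."""
    def client(i):
        for k in range(n_transfers):
            if (i + k) % 2 == 0:
                bank.transfer("a", "b", 1)
            else:
                bank.transfer("b", "a", 1)

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

bank = Bank({"a": 500, "b": 500})
run_clients(bank)
print(bank.total())  # the invariant holds: 1000
```

The author of `transfer` only had to reason about one transaction at a time; nothing in `run_clients` needs to know how the invariant is maintained.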
6. Isolation means strong abstractions
• Example interface:
– storeUser(name, SSN)
– getName(SSN)
– getSSN(name)
• Invariant: N == getName(getSSN(N))
– Always works with single client.
– Without ACID: Fails with concurrent clients.
– With ACID: Works with concurrent clients.
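A minimal sketch of the interface on this slide, assuming an in-memory store in which one lock stands in for transaction isolation. The class name `UserDirectory`, the snake_case method names, and the `round_trip` helper are hypothetical. The point is that `store_user` updates both lookup directions as one atomic unit, so no client can observe one map updated without the other.

```python
import threading

class UserDirectory:
    def __init__(self):
        self._name_by_ssn = {}
        self._ssn_by_name = {}
        self._lock = threading.Lock()  # stand-in for transaction isolation

    def store_user(self, name, ssn):
        with self._lock:
            # Drop stale entries first so the two maps never disagree.
            old_name = self._name_by_ssn.get(ssn)
            if old_name is not None:
                del self._ssn_by_name[old_name]
            old_ssn = self._ssn_by_name.get(name)
            if old_ssn is not None:
                del self._name_by_ssn[old_ssn]
            self._name_by_ssn[ssn] = name
            self._ssn_by_name[name] = ssn

    def get_name(self, ssn):
        with self._lock:
            return self._name_by_ssn.get(ssn)

    def get_ssn(self, name):
        with self._lock:
            return self._ssn_by_name.get(name)

    def round_trip(self, name):
        """Evaluate getName(getSSN(name)) inside a single 'transaction'."""
        with self._lock:
            ssn = self._ssn_by_name.get(name)
            return self._name_by_ssn.get(ssn)

directory = UserDirectory()
directory.store_user("alice", "111-11-1111")
print(directory.round_trip("alice") == "alice")  # True
```

Without the lock, a concurrent reader could run between the two map updates in `store_user` and see a name whose SSN maps back to a different name, which is the failure the slide describes.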
8. Examples of "easy"
SQL database in one day
Indexed table layer (3 days * 1 intern)
Fractal spatial index in 200 lines:
9. Remove/decouple features from the DB
With strong abstractions, features can be
moved from the DB to more flexible code.
Examples:
– Indexing
– More efficient data structures (e.g. using
pointers/indirection)
– Query language
10. Remove/decouple data models
• A NoSQL database with ACID can
provide polyglot data models and APIs.
– Key-value, graph, column-oriented,
document, relational, publish-subscribe,
spatial, blobs, ORMs, analytics, etc…
• Without requiring separate physical
databases. This is a huge ops win.
14. Databases in 2008
NoSQL emerges to replace scalable
sharding/caching solutions that had already
thrown out consistency.
• BigTable
• Dynamo
• Voldemort
• Cassandra
16. The CAP2008 theorem
"Data inconsistency in large-scale
reliable distributed systems has to be
tolerated … [for performance and to
handle faults]"
- Werner Vogels (CTO Amazon.com)
17. The CAP2008 theorem
"The availability property means that
the system is 'online' and the client of
the system can expect to receive a
response for its request."
- Wrong descriptions all over the web
18. CAP2008 Conclusions?
• Scaling requires distributed design
• Distributed requires high availability
• Availability requires no C
So, if we want scalability we have to
give up C, a cornerstone of ACID,
right?
20. Fast forward to CAP2013
"Why '2 out of 3' is misleading"
"CAP prohibits… perfect
availability"
- Eric Brewer
21. Fast forward to CAP2013
"Achieving strict consistency can come
at a cost in update or read latency,
and may result in lower throughput…"
- Werner Vogels (Amazon CTO)
22. Fast forward to CAP2013
"…it is better to have application
programmers deal with performance
problems due to overuse of transactions
as bottlenecks arise, rather than always
coding around the lack of transactions."
- Google (Spanner)
23. The ACID NoSQL plan
• Maintain both scalability and fault tolerance
• Leverage CAP2013 and deliver a CP system
with true global ACID transactions
• Enable abstractions and many data models
• Deliver high per-node performance
26. Bolt-on approach
Bolt transactions on top of a database
without transactions.
• Upside: Elegance.
• Downsides:
– Nerd trap
– Performance. "…integrating multiple layers has
its advantages: integrating concurrency control
with replication reduces the cost of commit wait
in Spanner, for example" - Google
[Diagram labels: NoSQL; Transactions / Locking]
28. Transactional building block approach
Use non-scalable transactional DBs as
components of a cluster.
• Upside: Local transactions are fast
• Downside: Distributed transactions
across machines are hard to make fast,
and are messy (timeouts required)
30. Decomposition approach
Decompose the processing pipeline of a
traditional ACID DB into individual stages.
• Stages:
– Accept client transactions
– Apply concurrency control
– Write to transaction logs
– Update persistent data representation
• Upside: Performance
• Downside: "Ugly" and complex architecture
needs to solve tough problems for each stage
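The four stages listed above can be sketched in a single process as follows. This is only an illustration under stated assumptions: the `Transaction` and `Pipeline` names are hypothetical, the concurrency-control stage uses a simple optimistic version check (the slides do not say which scheme is used), and the log and storage are plain in-memory objects. In a decomposed system each stage would be a separate, independently scaled component.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    read_version: int   # database version the client read at
    read_keys: set      # keys the client read
    writes: dict        # key -> new value

class Pipeline:
    def __init__(self):
        self.version = 0        # latest committed version
        self.last_write = {}    # key -> version of its most recent commit
        self.log = []           # stand-in for durable transaction logs
        self.storage = {}       # stand-in for the persistent representation

    def commit(self, txn):
        """Stage 1: accept a client transaction and run it through the stages."""
        version = self._check_conflicts(txn)
        if version is None:
            return None         # aborted; the client may retry
        self._write_log(version, txn)
        self._apply(version, txn)
        return version

    def _check_conflicts(self, txn):
        """Stage 2: concurrency control (optimistic, assumed for this sketch)."""
        for key in txn.read_keys:
            if self.last_write.get(key, 0) > txn.read_version:
                return None     # a key it read was overwritten since it read
        self.version += 1
        for key in txn.writes:
            self.last_write[key] = self.version
        return self.version

    def _write_log(self, version, txn):
        """Stage 3: record the transaction before it becomes visible."""
        self.log.append((version, dict(txn.writes)))

    def _apply(self, version, txn):
        """Stage 4: update the persistent data representation."""
        self.storage.update(txn.writes)

pipeline = Pipeline()
first = Transaction(read_version=0, read_keys={"x"}, writes={"x": 1})
second = Transaction(read_version=0, read_keys={"x"}, writes={"x": 2})
print(pipeline.commit(first))   # 1
print(pipeline.commit(second))  # None: it read "x" before the first commit
```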
32. Disconnected operation challenge
• Offline sync is a real application need
Solution:
• Doing it in the DB layer is terrible
• Can (and should) be solved by the app,
e.g. by buffering mutations and syncing
when connected
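The app-level approach above, buffering mutations and syncing when connected, might look like the following minimal sketch. The `OfflineClient` class is hypothetical, a plain dict stands in for the remote database, and replay is naive last-writer-wins; a real application would choose its own conflict handling and would ideally flush the buffer inside one transaction.

```python
class OfflineClient:
    def __init__(self, server):
        self.server = server      # dict standing in for the remote database
        self.connected = False
        self.pending = []         # mutations made while offline, in order

    def set(self, key, value):
        if self.connected:
            self.server[key] = value
        else:
            self.pending.append((key, value))

    def reconnect(self):
        self.connected = True
        # Replay buffered mutations in the order they were made.
        for key, value in self.pending:
            self.server[key] = value
        self.pending.clear()

server = {}
client = OfflineClient(server)
client.set("draft", "v1")
client.set("draft", "v2")
print(server)        # {} (nothing has been sent yet)
client.reconnect()
print(server)        # {'draft': 'v2'}
```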
33. Split brain challenge
• Any consistent database needs a fault-tolerant
source of "ground truth"
• Must prevent database from splitting into two
independent parts
Solution:
• Using thoughtfully chosen Paxos nodes can yield
high availability, even for drastic failure scenarios
• Paxos is not required for each transaction
35. Correctness challenge
• MaybeDB:
– Set(key, value) – Might set key to value
– Get(key) – Get a value that key was set to
Solution:
• The much stronger ACID contract
requires vastly more powerful tools for
testing
36. Implementation language challenge
We need new tools!
Goal: Language
• Many asynchronous communicating processes: Erlang?
• Engineering for reliability and fault tolerance of large clusters while maintaining correctness: Simulation
• Fast algorithms; efficient I/O: C++
43. Flow performance
"Write a ring benchmark. Create N processes in a
ring. Send a message round the ring M times so that
a total of N * M messages get sent. Time how long
this takes for different values of N and M. Write a
similar program in some other programming language
you are familiar with. Compare the results. Write a
blog, and publish the results on the internet!"
- Joe Armstrong (author of "Programming Erlang")
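For readers unfamiliar with the ring benchmark quoted above, here is a minimal sketch of it using Python's asyncio, with coroutines standing in for processes. It measures Python, not Flow or Erlang, and the function names and the countdown-token design are assumptions of this sketch.

```python
import asyncio
import time

async def node(inbox, outbox, stats, done):
    """One ring member: receive a countdown token and pass it along."""
    while True:
        remaining = await inbox.get()
        if remaining == 0:
            done.set()            # the token has made all of its hops
            return
        stats["sent"] += 1
        await outbox.put(remaining - 1)

async def ring(n, m):
    """Send one token round a ring of n nodes m times (n * m messages)."""
    queues = [asyncio.Queue() for _ in range(n)]
    stats = {"sent": 0}
    done = asyncio.Event()
    tasks = [
        asyncio.create_task(node(queues[i], queues[(i + 1) % n], stats, done))
        for i in range(n)
    ]
    start = time.perf_counter()
    await queues[0].put(n * m)    # inject the token at node 0
    await done.wait()
    elapsed = time.perf_counter() - start
    for task in tasks:            # the other nodes are still waiting; stop them
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return stats["sent"], elapsed

sent, elapsed = asyncio.run(ring(100, 100))
print(f"{sent} messages in {elapsed:.3f}s")
```

Varying `n` and `m` and comparing against another runtime gives the comparison the quote asks for.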
45. Flow enables testability
• "Lithium" testing framework
• Simulate all physical interfaces
• Simulate failure modes
• Deterministic (!) simulation of entire
system
Simulation is the key for correctness.
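The idea of deterministic simulation can be shown with a toy sketch. It is not the Lithium framework and does not reflect its API; the `simulate` and `check` functions are hypothetical. A simulated network delays, reorders, and drops messages, and every random choice comes from one seeded generator, so a failing run can be replayed exactly from its seed.

```python
import random

def simulate(seed, steps=50, drop_rate=0.2):
    """Run one simulated session; all randomness derives from `seed`."""
    rng = random.Random(seed)          # the only source of nondeterminism
    in_flight, delivered, dropped = [], [], []
    for msg_id in range(steps):
        in_flight.append(msg_id)       # a client sends a message
        roll = rng.random()
        if roll < drop_rate:
            # Injected fault: the network loses some in-flight message.
            dropped.append(in_flight.pop(rng.randrange(len(in_flight))))
        elif roll < 0.7:
            # Deliver any in-flight message, so reordering is possible.
            delivered.append(in_flight.pop(rng.randrange(len(in_flight))))
        # Otherwise the message stays in flight (simulated delay).
    return delivered, dropped, in_flight

def check(delivered, dropped, in_flight):
    """Properties that must hold in every simulated run."""
    assert len(set(delivered)) == len(delivered)   # nothing delivered twice
    assert not set(delivered) & set(dropped)       # dropped means never delivered

for seed in range(100):
    check(*simulate(seed))

# Replaying a seed reproduces the run exactly.
print(simulate(7) == simulate(7))  # True
```

Because a run is a pure function of its seed, a test harness can explore many seeds cheaply and report only the seed of a failing run.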
49. Layers
• An open-source ecosystem
• Common NoSQL data models
• Graph database (implements BluePrints
2.4 standard)
• Zookeeper-like coordination layer
• Celery (distributed task queue) layer
• Many others…
50. SQL Layer
• A full SQL database in a layer!
• Akiban acquisition
• Unique "table group" concept can
physically store related tables in an
efficient "object structure"
• Architecture: stateless, local server
51. Performance results
• Reads of cacheable data are ½ the
speed of memcached, with full
consistency!
• Random uncacheable reads of 4k ranges
saturate network bandwidth
• A 24-machine cluster processing 100%
cross-node transactions saturates its
SSDs at 890,000 op/s
52. The big performance result
• Vogels: “Achieving strict consistency can
come at a cost in update or read
latency, and may result in lower
throughput…”
• Ok, so, how much?
– Only ~10%!
– Transaction isolation, the "intuitive
bottleneck", is accomplished in less than
one core.
53. A vision for NoSQL
• The next generation should maintain
– Scalability and fault tolerance
– High performance
• While adding
– ACID transactions
– Data model flexibility