NoSQL overview presentation with details on Riak and CouchDB.
Presented at Qbranch CODE Night 2010-04-15.
Thanks to @frli01 for arranging and @xlson for the invitation.
1. Not only SQL
Mårten Gustafson
Qbranch CODE tech-meet @ 2010-04-15
2. What?
“NoSQL is a movement promoting a loosely
defined class of non-relational data stores that
break with a long history of relational
databases” - Wikipedia
3. What?
Not a single technique
Not a single type of data
Not a single type of use case
5. What’s out there?
Store           Storage type    License                  Implemented in
Amazon Dynamo   Key/Value       n/a                      ?
Cassandra       Column family   ASL 2.0                  Java
CouchDB         Document        ASL 2.0                  Erlang
Dynomite        Key/Value       BSD/MIT-style            Erlang
HBase           Column family   ASL 2.0                  Java
MongoDB         Document        AGPL v3.0                C++
Neo4J           Graph           AGPL v3.0 / Commercial   Java
Riak            Key/Value       ASL 2.0                  Erlang
Redis           Key/Value       BSD/MIT-style            C
Scalaris        Key/Value       ASL 2.0                  Erlang
Tokyo Cabinet   Key/Value       LGPL                     C
Voldemort       Key/Value       ASL 2.0                  Java
7. Distribution
Masterless Master/Slave Hot standby
Amazon Dynamo X
Cassandra X
CouchDB X
Dynomite X
HBase ?
MongoDB X X
Neo4J*
Riak X
Redis X
Scalaris X
Tokyo Cabinet
Voldemort X
* Neo4J HA coming “soon”
8. Distribution
This is a very simplified view
10. Of the web
“...Django may be built for the Web, but
CouchDB is built of the Web. I’ve never seen
software that so completely embraces the
philosophies behind HTTP. CouchDB
makes Django look old-school in the same way
that Django makes ASP look outdated”
- http://jacobian.org/writing/of-the-web/
11. Of the web
“...CouchDB may succeed, and it may fail; who
knows. I’m sure of one thing, though — this is
what the software of the future looks like”
- http://jacobian.org/writing/of-the-web/
32. Riak “stuff”
Bucket
Container/keyspace.
Determines number of
replicas for its contents
33. Riak “stuff”
Consistent Hashing
Key hashing technique used to distribute keys on the ring
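The consistent hashing idea on this slide can be sketched in a few lines. This is a minimal illustration, not Riak's implementation: the choice of SHA-1, the `bucket/key` encoding, the 64-partition ring and the function names `key_position`, `partition_for` and `preference_list` are all assumptions made for the example.

```python
import hashlib

RING_SIZE = 2 ** 160  # size of the SHA-1 output space, used as the ring here


def key_position(bucket: str, key: str) -> int:
    """Hash a bucket/key pair to an integer position on the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_for(position: int, num_partitions: int) -> int:
    """Index of the equally sized partition that contains a ring position."""
    return position * num_partitions // RING_SIZE


def preference_list(partition: int, num_partitions: int, replicas: int) -> list[int]:
    """The owning partition plus the next partitions clockwise, one per replica."""
    return [(partition + i) % num_partitions for i in range(replicas)]


# Hypothetical usage: a 64-partition ring and 3 replicas per object.
position = key_position("users", "marten")
owner = partition_for(position, num_partitions=64)
print(owner, preference_list(owner, num_partitions=64, replicas=3))
```

The same key always hashes to the same position, so any node that knows the ring layout can work out where an object lives without asking a master.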
34. Riak “stuff”
Gossiping
Shares state, bucket and ring knowledge in the cluster
35. Riak “stuff”
Hinted Handoff
Covering for a failed “neighbor” node while gone
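Hinted handoff, as described on this slide, can be illustrated with a toy in-memory model. The `Node` class, its `hints` dictionary and the `write` and `handoff` functions are hypothetical names for this sketch and do not mirror Riak's internals.

```python
class Node:
    """A toy node: stores its own data plus hints held for absent neighbors."""

    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data = {}   # key -> value owned by this node
        self.hints = {}  # intended owner's name -> {key: value} kept on its behalf


def write(owner: Node, fallback: Node, key: str, value: str) -> None:
    """Write to the owner, or park the write on the fallback if the owner is down."""
    if owner.up:
        owner.data[key] = value
    else:
        fallback.hints.setdefault(owner.name, {})[key] = value


def handoff(fallback: Node, owner: Node) -> None:
    """Once the owner is back, hand the parked writes over and drop the hints."""
    if owner.up:
        owner.data.update(fallback.hints.pop(owner.name, {}))
```

While the owner is gone the neighbor accepts writes on its behalf; when the owner returns, the hinted data is delivered and the neighbor forgets it.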
36. Riak “stuff”
Links
Allows retrieval of “weakly” linked objects
37. Riak “stuff”
Merkle Tree
Data structure for efficient summary about objects. Gossiped.
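A Merkle tree summarises a set of objects in a single root hash, so two replicas can find out whether they hold the same data by comparing one value. The sketch below is a generic Merkle root over key/value pairs; the SHA-256 hash, the leaf encoding and the name `merkle_root` are assumptions for the example, not Riak's format.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(objects: dict[str, bytes]) -> bytes:
    """Root hash summarising every key/value pair in `objects`."""
    # Leaves: one hash per object, in key order so replicas agree on the layout.
    level = [
        _hash(key.encode("utf-8") + b"\x00" + value)
        for key, value in sorted(objects.items())
    ]
    if not level:
        return _hash(b"")
    # Hash adjacent pairs until a single root remains.
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Equal roots mean the replicas agree; different roots mean at least one object differs, and comparing the subtrees narrows down which one without shipping all the data.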
38. Riak “stuff”
Node
One server. Runs vnodes which claim partitions.
39. Riak “stuff”
Partition
One slice (part) of the ring.
40. Riak “stuff”
Read Repair
Auto correction of out-of-date objects
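Read repair can be sketched as "read from every replica, return the newest copy, and fix the replicas that were behind". To keep the example short it uses a plain integer version per object, which is a simplification and an assumption of this sketch; the `Replica` alias and `read_with_repair` are hypothetical names.

```python
# Each replica maps key -> (version, value). Integer versions are a simplification.
Replica = dict[str, tuple[int, str]]


def read_with_repair(replicas: list[Replica], key: str):
    """Return the newest value for `key` and overwrite stale or missing copies."""
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None
    newest = max(found, key=lambda item: item[0])  # highest version wins
    for replica in replicas:
        if key not in replica or replica[key][0] < newest[0]:
            replica[key] = newest  # repair the out-of-date replica
    return newest[1]
```

The repair happens as a side effect of an ordinary read, so frequently read objects converge without a separate maintenance job.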
41. Riak “stuff”
Replica
Number of copies of the same object in the cluster
42. Riak “stuff”
Ring
The complete “space”, divided into partitions which are claimed by vnodes
43. Riak “stuff”
Vector Clock
Conflict detection technique for objects.
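How a vector clock detects conflicts can be shown with a small generic sketch. A clock is modelled as a dictionary from actor name to counter; the functions `increment`, `descends` and `compare`, and the actor names in the usage lines, are assumptions for illustration rather than Riak's API.

```python
def increment(clock: dict[str, int], actor: str) -> dict[str, int]:
    """Return a copy of `clock` with the actor's counter bumped by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a: dict[str, int], b: dict[str, int]) -> bool:
    """True if clock `a` has seen every update that clock `b` has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Classify two versions as equal, ordered, or in conflict."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"  # concurrent updates: neither version has seen the other


# Hypothetical usage: two clients update the same base version independently.
base = increment({}, "client_a")
left = increment(base, "client_a")
right = increment(base, "client_b")
print(compare(left, base), "/", compare(left, right))
```

When neither clock descends from the other, the two writes were concurrent and the store cannot pick a winner on its own, which is exactly the conflict the slide refers to.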
44. Riak “stuff”
Vnode
Runs in a node and claims one partition on the ring
45. Riak - Takeaways
• No single point of failure
• Choose your levels for:
• availability
• consistency
• partition tolerance
46. But wait, there’s more...
• Binary data + Content-Type = whatever
• MP3s, images, text, ...
• Map/Reduce
• Local data, parallel
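The "local data, parallel" point can be sketched as follows: the map phase runs once per partition against the objects stored there, the partitions are processed in parallel, and a reduce phase merges the partial results. This is a generic Python illustration using a thread pool and a word count; it does not use Riak's Map/Reduce API, and all names in it are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_partition(objects: list[str]) -> dict[str, int]:
    """Map phase: count words in the objects held by one partition."""
    counts: dict[str, int] = {}
    for text in objects:
        for word in text.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def merge_counts(left: dict[str, int], right: dict[str, int]) -> dict[str, int]:
    """Reduce step: combine two partial word counts."""
    merged = dict(left)
    for word, count in right.items():
        merged[word] = merged.get(word, 0) + count
    return merged


def map_reduce(partitions: list[list[str]]) -> dict[str, int]:
    """Run the map phase on every partition in parallel, then reduce."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_partition, partitions))
    return reduce(merge_counts, partials, {})
```

Only the small partial results travel to the reducer; the objects themselves stay where they are stored.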
50. World view
One document == JSON
One document == One record
Many documents == One database
Many databases == One instance
No schema
51. World view
Documents can
have attachments (binary + mime type)
be rendered differently (HTML, XML)
52. A document
{
  "_id": "b098445d587b1f347e48e1a79301de02",
  "_rev": "1-80bfd8302e0f08eec2396c8107cafc19",
  "platform": {
    "browser": "mozilla",
    "version": "1.9.1.8"
  },
  "timestamp": 1270131033337
}
_id: Key, either you choose it or CouchDB does it for you
_rev: Revision number
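The role of the key and the revision number can be illustrated with a toy in-memory store. An update must present the revision it was based on, otherwise it is rejected as a conflict, which is how CouchDB treats a stale `_rev`. Everything else here is an assumption for the sketch: the `TinyStore` and `Conflict` names are hypothetical, no HTTP is involved, and the way the revision hash is computed is not CouchDB's algorithm.

```python
import hashlib
import json
import uuid


class Conflict(Exception):
    """Raised when an update is based on an out-of-date revision."""


class TinyStore:
    """Toy document store that mimics the _id/_rev handshake in memory."""

    def __init__(self):
        self._docs = {}

    def put(self, doc: dict) -> dict:
        doc = dict(doc)
        # Use the caller's key, or generate one if none was chosen.
        doc_id = doc.setdefault("_id", uuid.uuid4().hex)
        current = self._docs.get(doc_id)
        if current is not None and doc.get("_rev") != current["_rev"]:
            raise Conflict(f"stale or missing _rev for {doc_id}")
        generation = 1 if current is None else int(current["_rev"].split("-")[0]) + 1
        body = {k: v for k, v in doc.items() if k != "_rev"}
        # Illustrative hash only; not how CouchDB derives its revision ids.
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()[:32]
        doc["_rev"] = f"{generation}-{digest}"
        self._docs[doc_id] = doc
        return dict(doc)
```

A writer that read revision 1 and tries to save after someone else has already produced revision 2 gets a `Conflict` instead of silently overwriting the newer document.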
63. CouchDB “stuff”
Append only
Hence, won’t corrupt
its data files
64. CouchDB “stuff”
MVCC
Multi version concurrency control.
Writers do not block readers.
Readers do not block writers.
65. CouchDB “stuff”
BDCRR
Bi-directional, conflict resolving, replication
66. CouchDB “stuff”
Compaction
Append only will cause data files to grow. Compaction to the rescue, in the background - for your pleasure.
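Why an append-only file grows, and what compaction does about it, can be sketched with a list standing in for the data file. This is a generic illustration under that assumption; `AppendOnlyLog` is a hypothetical name and the sketch says nothing about CouchDB's actual file format.

```python
class AppendOnlyLog:
    """Toy append-only store: existing records are never modified in place."""

    def __init__(self):
        self.records = []  # stands in for the data file; only ever appended to

    def put(self, key: str, value) -> None:
        """An update is a new record at the end, so old records pile up."""
        self.records.append((key, value))

    def get(self, key: str):
        """The newest record for a key wins, so scan from the end."""
        for record_key, value in reversed(self.records):
            if record_key == key:
                return value
        return None

    def compact(self) -> "AppendOnlyLog":
        """Write a fresh log holding only the latest record for each key."""
        latest = {}
        for record_key, value in self.records:
            latest[record_key] = value
        compacted = AppendOnlyLog()
        for record_key, value in latest.items():
            compacted.put(record_key, value)
        return compacted
```

Compaction writes a new, smaller log and leaves the old one untouched until the new one is complete, which is why it can run in the background while reads continue.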
67. CouchDB “stuff”
ACID
Awesome, Cool, Impressive, Dope
68. CouchDB - Takeaways
• Kick ass replication
• Views are fast
• Can host and serve complete webapps
69. Outro
• Test one or more NoSQL thingys
• Get familiar with Brewer’s CAP theorem
• Get familiar with the Dynamo paper
70. Over and out.
Mårten Gustafson
@martengustafson
http://marten.gustafson.pp.se/
marten.gustafson@gmail.com
Editor's Notes
* Relational is not always the most suitable model
* Schema-less gives freedom
* Non-relational gives interesting scalability capabilities (which most provide)
* Most provide a REST/JSON API
** Very suitable for web development
** Easy peasy to use, regardless of environment
* Hinted handoff
collation - assembling in proper numerical or logical sequence