2. Who I am
• Philip Thompson
• Software Engineer at DataStax
• Contributor to Apache Cassandra
• A maintainer of CCM, the Cassandra Cluster Manager
3. Apache Cassandra™
• Apache Cassandra™ is a massively scalable, open source, distributed NoSQL database built for modern, mission-critical online applications.
• Written in Java; a hybrid of Amazon Dynamo and Google Bigtable
•Masterless with no single point of failure
•Distributed and data centre aware
•100% uptime
•Predictable scaling
10. Cassandra - More than one server
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add a server
• Each node owns a number of tokens
• Tokens denote a range of keys
• 4 nodes? -> Key range/4
• Each node owns 1/4 the data
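The token bullets above can be sketched in a few lines. This is a simplified model, not Cassandra's implementation: it assumes a tiny ring of 100 tokens, one token per node, and an MD5-based stand-in for the partitioner. The names `assign_tokens`, `token_for` and `owner` are hypothetical.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 100  # assumption: a tiny token space so the arithmetic is easy to follow


def assign_tokens(nodes, ring_size=RING_SIZE):
    """Give each node one evenly spaced token (0, 25, 50, 75 for four nodes)."""
    step = ring_size // len(nodes)
    return {node: i * step for i, node in enumerate(nodes)}


def token_for(key, ring_size=RING_SIZE):
    """Hash a key onto the ring (MD5 is a stand-in for the real partitioner)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % ring_size


def owner(key_token, tokens):
    """A node owns the range (previous token, its own token], wrapping around."""
    ring = sorted((tok, node) for node, tok in tokens.items())
    toks = [tok for tok, _ in ring]
    idx = bisect_left(toks, key_token) % len(ring)
    return ring[idx][1]


tokens = assign_tokens(["n1", "n2", "n3", "n4"])
owners = [owner(t, tokens) for t in range(RING_SIZE)]
share = {node: owners.count(node) / RING_SIZE for node in tokens}
print(share)  # each of the four nodes owns 0.25 of the token space
```

With four nodes the key range is split four ways, so each node owns 1/4 of the data.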
11. Cassandra - Locally Distributed
• Client writes to any node
• Node coordinates with others
• Data replicated in parallel
• Replication factor (RF): How many copies of your data?
• RF = 3 here
Each node stores 3/4 of the cluster's total data.
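The 3/4 figure can be checked with a small sketch. It assumes the simplest placement rule (the owning node plus the next RF - 1 nodes clockwise on the ring), a 100-token ring, and four nodes at tokens 0, 25, 50 and 75. The names `replicas` and `stored_fraction` are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

RING_SIZE = 100  # assumption: tiny token space for illustration
TOKENS = {"n1": 0, "n2": 25, "n3": 50, "n4": 75}


def replicas(key_token, tokens, rf):
    """Return the owner of key_token plus the next rf - 1 nodes clockwise."""
    ring = sorted((tok, node) for node, tok in tokens.items())
    toks = [tok for tok, _ in ring]
    start = bisect_left(toks, key_token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]


def stored_fraction(tokens, rf, ring_size=RING_SIZE):
    """Fraction of the whole token space each node holds a copy of."""
    counts = Counter()
    for token in range(ring_size):
        counts.update(replicas(token, tokens, rf))
    return {node: counts[node] / ring_size for node in tokens}


print(replicas(10, TOKENS, rf=3))      # ['n2', 'n3', 'n4']
print(stored_fraction(TOKENS, rf=3))   # every node holds 0.75 of the data
```

With RF = 3 on four nodes, every node is a replica for three of the four ranges, which is where 3/4 comes from.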
12. Cassandra - Geographically Distributed
• Client writes to the local DC
• Data syncs across WAN
• Replication Factor per DC
Single coordinator
13. Cassandra - Replication Factor
• Replication factor (RF): How many copies of your data?
• Replication Factor is set per keyspace
• Can be altered by operator
RF = 3
14. Cassandra - Consistency
• Consistency Level (CL)
• Client specifies per read or write
• ALL = All replicas ack
• QUORUM = A majority (more than 50%) of replicas ack
• LOCAL_QUORUM = A majority of replicas in the local DC ack
• ONE = Only one replica acks
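The number of acks each level waits for follows from the replication factor. A minimal sketch, assuming a majority is computed as floor(replicas / 2) + 1 and that RF is given per DC; the function names and the `dc1`/`dc2` labels are hypothetical.

```python
def quorum(replica_count):
    """Smallest majority of replica_count replicas."""
    return replica_count // 2 + 1


def required_acks(level, rf_per_dc, local_dc):
    """Acks the coordinator waits for at a given consistency level."""
    total = sum(rf_per_dc.values())
    if level == "ALL":
        return total
    if level == "QUORUM":
        return quorum(total)
    if level == "LOCAL_QUORUM":
        return quorum(rf_per_dc[local_dc])
    if level == "ONE":
        return 1
    raise ValueError(f"unsupported consistency level: {level}")


rf = {"dc1": 3, "dc2": 3}  # hypothetical two-DC keyspace with RF = 3 in each
for level in ("ALL", "QUORUM", "LOCAL_QUORUM", "ONE"):
    print(level, required_acks(level, rf, local_dc="dc1"))
# ALL 6, QUORUM 4, LOCAL_QUORUM 2, ONE 1
```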
15. Cassandra - Transparent to the application
• A single node failure shouldn't cause an application failure
• Replication Factor + Consistency Level = Success
• This example:
• RF = 3
• CL = QUORUM
A majority (2 of 3) acks, so we are good!
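A rough sketch of why the example survives one failed node: with RF = 3, QUORUM needs 2 acks, so one replica can be down. The helper `request_succeeds` and the node names are hypothetical.

```python
def request_succeeds(replica_up, required_acks):
    """True when enough live replicas can acknowledge the request."""
    live = sum(1 for is_up in replica_up.values() if is_up)
    return live >= required_acks


RF = 3
QUORUM = RF // 2 + 1  # 2 of 3

one_down = {"n1": True, "n2": True, "n3": False}
two_down = {"n1": True, "n2": False, "n3": False}

print(request_succeeds(one_down, QUORUM))  # True: 2 live replicas meet the quorum
print(request_succeeds(two_down, QUORUM))  # False: only 1 live replica
```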
16. Cassandra - Scaling
• Take a cluster of four nodes
• Where does the fifth node go?
• Rebalancing is costly
[Figure: token ring with nodes at tokens 0, 25, 50 and 75]
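The trade-off on this slide can be put into numbers with a small sketch. It assumes a 100-token ring, one token per node, and that a key belongs to the first node whose token is at or after the key's token. The token 12 for the new node and the evenly spaced layout are illustrative choices, and `moved_fraction` is a hypothetical helper.

```python
from bisect import bisect_left

RING_SIZE = 100  # assumption: tiny token space for illustration


def owner(key_token, tokens):
    """First node whose token is >= key_token, wrapping around the ring."""
    ring = sorted((tok, node) for node, tok in tokens.items())
    toks = [tok for tok, _ in ring]
    return ring[bisect_left(toks, key_token) % len(ring)][1]


def moved_fraction(before, after, ring_size=RING_SIZE):
    """Fraction of tokens whose owner changes between two layouts."""
    moved = sum(1 for t in range(ring_size) if owner(t, before) != owner(t, after))
    return moved / ring_size


four = {"n1": 0, "n2": 25, "n3": 50, "n4": 75}

# Option A: drop the fifth node into one existing range (cheap but unbalanced).
bisected = dict(four, n5=12)

# Option B: re-space all five nodes evenly (balanced, but existing nodes move).
rebalanced = {"n1": 0, "n2": 20, "n3": 40, "n4": 60, "n5": 80}

print(moved_fraction(four, bisected))    # 0.12
print(moved_fraction(four, rebalanced))  # 0.35
```

In this model, splitting one range moves little data but leaves two nodes with about half the load of the others, while an even layout forces roughly a third of the data to change owner.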
19. Gossip
• Manages cluster state
• Nodes up/down
• Nodes joining/leaving
• Decentralized
• “Heartbeat” every second
• Every node contacts 1-3 other nodes
20. Snitch
• Responsible for determining cluster topology
• Datacenter awareness
• Tracks node responsiveness
• Many snitches provided out of the box
• SimpleSnitch
• GossipingPropertyFileSnitch (recommended for production)
• Ec2Snitch and Ec2MultiRegionSnitch
• For use with AWS
• Comparable GCE snitch has just been added
• Custom snitches can be added
22. Anti-Entropy - Hinted Handoff
• Three-hour window
• Hints are replayed when node is restored
• Stored in system.hints table on coordinator
• Cassandra does not copy Dynamo’s “sloppy quorum”
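The hinted handoff flow can be sketched as a toy model. It assumes hints live in an in-memory dict rather than the system.hints table, that the three-hour window is measured from when the replica went down, and that hints never count as acks. The classes `Replica` and `Coordinator` are hypothetical.

```python
HINT_WINDOW_SECONDS = 3 * 60 * 60  # the three-hour window from the slide


class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.down_since = None
        self.data = {}

    def apply(self, mutation):
        key, value = mutation
        self.data[key] = value


class Coordinator:
    def __init__(self):
        self.hints = {}  # replica name -> mutations waiting to be replayed

    def write(self, mutation, replicas, now):
        """Apply to live replicas; store hints for recently failed ones."""
        acks = 0
        for replica in replicas:
            if replica.up:
                replica.apply(mutation)
                acks += 1
            elif now - replica.down_since <= HINT_WINDOW_SECONDS:
                # A hint is not an ack (no "sloppy quorum").
                self.hints.setdefault(replica.name, []).append(mutation)
        return acks

    def replay(self, replica):
        """Deliver stored hints once the replica is back."""
        for mutation in self.hints.pop(replica.name, []):
            replica.apply(mutation)


r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
r3.up, r3.down_since = False, 0

coordinator = Coordinator()
acks_early = coordinator.write(("a", 1), [r1, r2, r3], now=60)            # hinted
acks_late = coordinator.write(("b", 2), [r1, r2, r3], now=4 * 60 * 60)    # window passed

r3.up = True
coordinator.replay(r3)
print(r3.data)  # {'a': 1}
```

The write that arrived after the window is not hinted, so `r3` is missing it after replay.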
23. Anti-Entropy - Repair
• nodetool repair
• Uses Merkle trees for data comparison
• Should be run weekly
• Cassandra 2.1 has drastically improved repair times, thanks to incremental repair
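A minimal sketch of comparing data with Merkle trees, assuming four fixed leaf ranges per replica and SHA-256 as the hash. The real tree layout and hashing in Cassandra differ, and the helper names are hypothetical.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return tree levels, leaf hashes first and root last.

    Assumes len(leaves) is a power of two.
    """
    levels = [[digest(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([digest(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def differing_leaves(tree_a, tree_b, level=None, index=0):
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    if level is None:
        level = len(tree_a) - 1
    if tree_a[level][index] == tree_b[level][index]:
        return []  # identical subtree: nothing to repair here
    if level == 0:
        return [index]
    left = differing_leaves(tree_a, tree_b, level - 1, 2 * index)
    right = differing_leaves(tree_a, tree_b, level - 1, 2 * index + 1)
    return left + right


replica_a = [b"k0=v0", b"k1=v1", b"k2=v2", b"k3=v3"]
replica_b = [b"k0=v0", b"k1=v1", b"k2=stale", b"k3=v3"]

out_of_sync = differing_leaves(build_tree(replica_a), build_tree(replica_b))
print(out_of_sync)  # [2]
```

Only the ranges whose hashes differ need to be streamed between replicas; matching subtrees are skipped after a single hash comparison.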
31. Debugging your data model
• Tracing
cqlsh> tracing on;
Now tracing requests.
cqlsh:foo> INSERT INTO test (a, b) VALUES (1, 'example');
Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9
activity | timestamp | source | source_elapsed
-------------------------------------+--------------+-----------+----------------
execute_cql3_query | 00:02:37,015 | 127.0.0.1 | 0
Parsing statement | 00:02:37,015 | 127.0.0.1 | 81
Preparing statement | 00:02:37,015 | 127.0.0.1 | 273
Determining replicas for mutation | 00:02:37,015 | 127.0.0.1 | 540
Sending message to /127.0.0.2 | 00:02:37,015 | 127.0.0.1 | 779
Messsage received from /127.0.0.1 | 00:02:37,016 | 127.0.0.2 | 63
Applying mutation | 00:02:37,016 | 127.0.0.2 | 220
Acquiring switchLock | 00:02:37,016 | 127.0.0.2 | 250
Appending to commitlog | 00:02:37,016 | 127.0.0.2 | 277
Adding to memtable | 00:02:37,016 | 127.0.0.2 | 378
Enqueuing response to /127.0.0.1 | 00:02:37,016 | 127.0.0.2 | 710
Sending message to /127.0.0.1 | 00:02:37,016 | 127.0.0.2 | 888
Messsage received from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 | 2334
Processing response from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 | 2550
Request complete | 00:02:37,017 | 127.0.0.1 | 2581
32. Nodetool
• Command line interface for monitoring Cassandra and performing routine database operations
• Commands for viewing detailed metrics for tables, server metrics, and compaction statistics:
• cfstats: statistics for each table and keyspace
• cfhistograms: statistics about a table, including read/write latency, row size, column count, and number of SSTables
• netstats: statistics about network operations and connections
• tpstats: statistics about the number of active, pending, and completed tasks for each stage of Cassandra operations by thread pool