@calonso
CASSANDRA WORKSHOP
Cassandra from scratch in one day.
@calonso
• Introductions
• Cassandra Core concepts
• CQL
• Data modelling
• More Cassandra Concepts
• Hardware Considerations
@calonso
CARLOS ALONSO
• Spanish Londoner
• MSc Salamanca University, Spain
• Software Engineer @MyDrive Solutions
• Cassandra certified developer
• Cassandra MVP 2015
• @calonso / http://mrcalonso.com
@calonso
MYDRIVE SOLUTIONS
• World leading driver profiling company
• Using technology and data to understand
how to improve driving behaviour
• Recently acquired by the Generali Group
• @_MyDrive / http://mydrivesolutions.com
• We are hiring!!
@calonso
AND YOU?
I’m a Db admin (ORACLE?) and I want to learn Cassandra
I’ve never heard about NoSQL
I’ve never heard about SQL
I don’t know what I’m doing here
I’ve heard about Cassandra and
want to get my hands on it
I’ve been using Cassandra for some
tests and want to go deeper
I’m evaluating Cassandra as
a potential solution
I’m rolling in production with Cassandra
@calonso
CASSANDRA
• A.k.a Alexandra or Kassandra
• Daughter of King Priam and Queen Hecuba
of Troy.
• Apollo gave her the power of prophecy to
seduce her. She refused and then Apollo spat
on her mouth cursing her never to be
believed.
• https://en.wikipedia.org/wiki/Cassandra
@calonso
CASSANDRA
• Open Source distributed database
management system
• Initially developed at Facebook
• Inspired by Amazon’s Dynamo and Google
BigTable papers
• Became Apache top-level project in Feb, 2010
• Nowadays developed by DataStax
@calonso
WHY CASSANDRA?
“Cassandra is the
cursed ORACLE”
@calonso
CASSANDRA CORE CONCEPTS
Technical introduction to Apache Cassandra
@calonso
NOSQL
@calonso
BIG DATA REQUIREMENTS
• Everywhere
• Fast
• Always available
• Consistent
+
Ingestion Consumption
@calonso
THE CAP THEOREM
@calonso
SCALING
Vertical Horizontal
@calonso
CASSANDRA
• Fast Distributed NoSQL Database
• High Availability
• Linear Scalability => Predictability
• No SPOF
• Multi-DC
• Horizontally scalable => $$$
• Not a drop in replacement for RDBMS
@calonso
CASSANDRA CLUSTER
@calonso
REPLICATION FACTOR
How many copies (replicas) for
your data
@calonso
CONSISTENCY LEVEL
How many replicas of your data
must respond ok?
@calonso
CASSANDRA DATA MODEL
• Query driven data model
• Column family non relational db
@calonso
CQL
Familiar row-column SQL-like approach.
CREATE TABLE users (
  id UUID,
  name VARCHAR,
  surname VARCHAR,
  birthdate TIMESTAMP,
  PRIMARY KEY (id)
);
INSERT INTO users (id, name, surname, birthdate)
VALUES (uuid(), 'Carlos', 'Alonso', '1985-03-19');
SELECT * FROM users WHERE
id = f81d4fae-7dec-11d0-a765-00a0c91e6bf6;
ALTER TABLE users ADD address VARCHAR;
@calonso
DISTRIBUTIONS: APACHE CASSANDRA
• Latest features
• JIRA
• Support via mailing list & IRC
• http://cassandra.apache.org
@calonso
DISTRIBUTIONS: DATASTAX ENTERPRISE
• Integrated Solr for Multi-DC Search
• Integrated Spark for Analytics
• Free Startup Program
• Expert support
• Focused on stable releases for enterprises
• http://www.datastax.com/products/datastax-enterprise
@calonso
CASSANDRA: YES
• If you need:
• No SPOF
• Linear horizontal scalability in commodity hardware
• Real-time writes
• Reliable data replication across distributed data centres
• Clearly defined schema in a NoSQL environment
@calonso
CASSANDRA: NO
• If you need:
• ACID transactions with rollback
• Justification for high-end software
@calonso
REVIEW QUESTIONS
What do consistency, availability and partition tolerance mean?
Consistency: All clients have the exact same value for the whole data set at any given point.
Availability: All clients can read and write to the system at any given point.
Partition tolerance: Whether or not the system tolerates a node being
disconnected from the system.
@calonso
REVIEW QUESTIONS
Where does Cassandra fit within the CAP Theorem?
AP: Cassandra trades off consistency in order to guarantee
availability and partition tolerance, but in a configurable way, so it’s
up to the developer where to sit for each query.
@calonso
REVIEW QUESTIONS
Which are the technological roots of Cassandra?
Google BigTable and Amazon Dynamo pulled together
by developers at Facebook
@calonso
REVIEW QUESTIONS
What technology does Cassandra use to model data?
CQL: Cassandra Query Language
@calonso
INSTALLATION
Installing, configuring and running Cassandra
@calonso
REQUIREMENTS
JAVA >= 1.7.0_25
All nodes synchronised (NTP)
@calonso
INSTALLATION
http://cassandra.apache.org/download/
@calonso
CONFIGURATION
• cluster_name
• listen_address
• rpc_address
• commitlog_directory
• data_file_directories
• saved_caches_directory
conf/cassandra.yaml
@calonso
CONFIGURATION
• MAX_HEAP_SIZE
• if system memory < 2G => 1/2 of it
• if between 2G and 4G => 1G
• if > 4G => 1/4 of it but no more than 8G
• HEAP_NEWSIZE
• 1/4 of MAX_HEAP_SIZE
conf/cassandra-env.sh
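The sizing rules above can be sketched as a small function. This is only an illustration of the rules as stated on the slide, not the actual logic of cassandra-env.sh; the function name `default_heap_mb` and the decision to treat exactly 2G and 4G as part of the middle band are assumptions.

```python
def default_heap_mb(system_memory_mb):
    """Return (MAX_HEAP_SIZE, HEAP_NEWSIZE) in MB, following the slide's rules."""
    if system_memory_mb < 2048:
        # Less than 2G of RAM: half of it.
        max_heap = system_memory_mb // 2
    elif system_memory_mb <= 4096:
        # Between 2G and 4G: a fixed 1G heap.
        max_heap = 1024
    else:
        # More than 4G: a quarter of it, capped at 8G.
        max_heap = min(system_memory_mb // 4, 8192)
    # HEAP_NEWSIZE is a quarter of MAX_HEAP_SIZE.
    heap_new = max_heap // 4
    return max_heap, heap_new


for ram in (1024, 3072, 16384, 65536):
    print(ram, default_heap_mb(ram))
```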
@calonso
START/STOP
Start:
sudo bin/cassandra [-f]
sudo service cassandra start
Stop:
ctrl-c (when started in the foreground with -f)
ps aux | grep cassandra
sudo kill <pid>
sudo service cassandra stop
@calonso
START/STOP
Node localhost/127.0.0.1 state jump to normal
@calonso
REVIEW QUESTIONS
Which setting determines a node’s cluster?
Where is it configured?
cluster_name: In conf/cassandra.yaml
@calonso
REVIEW QUESTIONS
How would you stop a Cassandra instance running
in the background on a Unix-based machine?
1. Get the PID: ps aux | grep cassandra
2. Kill the process: kill <pid>
@calonso
REVIEW QUESTIONS
Which settings would you adjust to tune how much memory
Cassandra uses?
In which file?
MAX_HEAP_SIZE in conf/cassandra-env.sh
@calonso
BASIC TOOLS
The tools required for basic Cassandra management
NODETOOL
The command line swiss army knife.
@calonso
NODETOOL
status: displays cluster state, load, host ID and token
info: displays node memory use, disk load, uptime …
ring: displays node status and cluster ring state
help: displays all possible commands and description
CQLSH
Our data management and first
exploration tool
@calonso
CQLSH
DESC[RIBE]: shows information of the arguments
SOURCE: executes a file containing CQL statements
TRACING: enables/disables the tracing mode
help: shows available cqlsh + CQL commands
SELECT, ALTER, INSERT, …
CASSANDRA-STRESS
Our tool to assess performance
@calonso
CASSANDRA-STRESS
read: to execute a read-only workload
mixed: executes mixed workload
user: user defined schema and workloads
write: to execute a write-only workload
CCM
One tool to manage them all.
@calonso
CCM
github: pcmanus/ccm
Prerequisites
• Python 2.7+
• PyYAML
• Six
• Ant
• Loopback IP aliases (Mac OS)
Limitations
• Testing tool
• Communicates with
localhost only
@calonso
CCM
start/stop: starts/stops all nodes in cluster
status: shows current cluster status
<node> <command>: runs command connecting to node
e.g. ccm node1 cqlsh
create: downloads, compiles and builds cluster
@calonso
REVIEW QUESTIONS
Which tool/command would I use to know the
start/stop status of a particular node of my cluster?
nodetool status
@calonso
REVIEW QUESTIONS
Name and describe two non CQL commands
allowed in cqlsh.
CAPTURE COPY DESCRIBE EXPAND PAGING SOURCE
CONSISTENCY DESC EXIT HELP SHOW TRACING
@calonso
REVIEW QUESTIONS
Can I manage my production cluster remotely
using CCM?
No, that’s CCM’s biggest limitation. Only connects to localhost.
@calonso
REVIEW QUESTIONS
What happens if, in a cqlsh session I type:
DESCRIBE KEY and press TAB?
@calonso
INTERNAL ARCHITECTURE
Internal processes that make Cassandra work
@calonso
CLUSTER COMPONENTS
• Column: The smallest key-value pair.
• Row: Collection of columns. Identified by a row key.
• Partition: Bucket containing several rows. Identified by
a token.
• Node: a Cassandra instance. Contains a token range.
• Rack: a logical set of nodes
• Data Center: a logical set of racks.
• Cluster: The full set of nodes. Covers a whole token
ring.
CONSISTENT HASHING
Which node holds this data?
– Wikipedia
“Hashing is the transformation of a string of characters into a usually
shorter fixed-length value or key that represents the original string.
Hashing is used to index and retrieve items in a database because it is
faster to find the item using the shorter hashed key than to find it using
the original value.”
@calonso
CONSISTENT HASHING
• Data is stored in partitions, each identified by a
unique token within the range
-2^63 to 2^63 - 1
• Nodes contain partition ranges.
@calonso
THE PARTITIONER
• System running on each node that
computes hashes through a hash function.
• Various partitioners available.
• Default is Murmur3Partitioner
• All nodes MUST use the same one!
[Diagram: hash function mapping partition keys to tokens, e.g. “Carlos” -> 185664]
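To make the idea concrete, here is a minimal sketch of a token ring. It assumes MD5 as a stand-in hash (Cassandra's default partitioner is Murmur3, which is not in the Python standard library), and the `Ring` class, the node names and the token assignments are all hypothetical.

```python
import bisect
import hashlib


def token_for(key):
    """Hash a partition key to a signed 64-bit token.

    Assumption: MD5 stands in for Murmur3 to keep the sketch stdlib-only,
    so the tokens differ from the ones a real cluster would compute.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


class Ring:
    def __init__(self, node_tokens):
        # node_tokens maps a node name to the highest token of its range.
        self._entries = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._entries]

    def owner(self, key):
        # The owner is the first node whose token is >= the key's token;
        # past the last node we wrap around to the first one.
        index = bisect.bisect_left(self._tokens, token_for(key))
        return self._entries[index % len(self._entries)][1]


ring = Ring({"node1": -(2**62), "node2": 0, "node3": 2**62, "node4": 2**63 - 1})
print(token_for("Carlos"), ring.owner("Carlos"))
```

Every node can run the same computation and reach the same answer, which is why all nodes must share one partitioner.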
@calonso
VNODES
@calonso
REQUEST COORDINATION
How are client requests coordinated?
@calonso
THE COORDINATOR
• The node designated to coordinate a particular query.
• ANY node can coordinate ANY request.
• No SPOF: one of Cassandra’s main principles.
• The driver chooses which node will coordinate
@calonso
A FULL EXAMPLE
CREATE KEYSPACE test WITH REPLICATION =
{ 'class': 'SimpleStrategy', 'replication_factor': 3 };
CREATE TABLE users (
  id UUID,
  name VARCHAR,
  surname VARCHAR,
  birthdate TIMESTAMP,
  PRIMARY KEY (id)
);
CONSISTENCY QUORUM;
SELECT * FROM users WHERE
id = f81d4fae-7dec-11d0-a765-00a0c91e6bf6;
[Diagram: Client -> Driver -> cluster; the Partitioner hashes f81d4fae-… to token 834]
REPLICATION
How many copies of your data?
@calonso
WHY REPLICATION?
• Disaster recovery
• Bring data closer to users (to reduce latencies)
• Workload segregation (analytical vs transactional)
@calonso
REPLICATION
Defined at keyspace level
CREATE KEYSPACE <my_keyspace> WITH REPLICATION =
{ 'class': 'SimpleStrategy', 'replication_factor': 2 };
CREATE KEYSPACE <my_keyspace> WITH REPLICATION =
{ 'class': 'NetworkTopologyStrategy',
'dc-east': 2, 'dc-west': 3 };
@calonso
SIMPLESTRATEGY
CREATE KEYSPACE <my_keyspace> WITH REPLICATION =
{ 'class': 'SimpleStrategy', 'replication_factor': 3 };
[Diagram: Client -> Driver -> ring; the partition with token 834 is written to three consecutive nodes around the ring]
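The placement rule can be sketched as a walk around the ring: start at the node that owns the token, then keep going clockwise until RF distinct nodes are collected. This is a simplified illustration; the five-node ring, its toy tokens (0 to 1000, to match the slide's token 834) and the function name are assumptions, and racks and data centers are ignored on purpose.

```python
import bisect


def simple_strategy_replicas(ring, token, replication_factor):
    """Return the nodes that hold a token.

    ring is a list of (token, node) pairs sorted by token; each node owns
    the range that ends at its token.
    """
    tokens = [t for t, _ in ring]
    # The first replica is the node whose token is >= the partition's token.
    start = bisect.bisect_left(tokens, token)
    replicas = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


ring = [(0, "A"), (250, "B"), (500, "C"), (750, "D"), (1000, "E")]
print(simple_strategy_replicas(ring, 834, 3))  # ['E', 'A', 'B']
```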
@calonso
NETWORKTOPOLOGYSTRATEGY
CREATE KEYSPACE <my_keyspace> WITH REPLICATION =
{ 'class': 'NetworkTopologyStrategy',
'dc-east': 2, 'dc-west': 3 };
[Diagram: Client -> Driver -> cluster with two data centers, dc-east and dc-west, each with rack-1 and rack-2; token 834]
@calonso
WHAT IF A NODE OR DC IS DOWN?
Hinted Handoff to the rescue!
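[Diagram: Client -> Driver -> cluster with one replica down (X)]
While a replica is down, the coordinator keeps a hint for the write (token 834)
and replays it once the replica is back up.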
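A toy simulation may help to picture the mechanism. It is a sketch under heavy assumptions: the `Replica` and `Coordinator` classes are hypothetical, hints live in a plain in-memory list, and real-world details such as hint expiry are left out.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        # Writes that could not be delivered: (target replica, key, value).
        self.hints = []

    def write(self, key, value):
        """Write to every live replica, store a hint for each one that is down."""
        acks = 0
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
                acks += 1
            else:
                self.hints.append((replica, key, value))
        return acks

    def replay_hints(self):
        """Deliver stored hints to replicas that are back up."""
        remaining = []
        for replica, key, value in self.hints:
            if replica.up:
                replica.data[key] = value
            else:
                remaining.append((replica, key, value))
        self.hints = remaining


replicas = [Replica("n1"), Replica("n2"), Replica("n3")]
coordinator = Coordinator(replicas)

replicas[2].up = False                 # n3 goes down
acks = coordinator.write(834, "row")   # two replicas acknowledge, one hint stored
replicas[2].up = True                  # n3 comes back
coordinator.replay_hints()             # the hint is handed off to n3
print(acks, replicas[2].data)
```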
@calonso
CONSISTENCY
How much consistency do we want?
@calonso
CONSISTENCY LEVEL
How many replicas must acknowledge a
write for it to succeed?
How many replicas must respond to a
read for it to succeed?
@calonso
CONSISTENCY LEVEL
RF = 3
ANY (only writes)
ONE, TWO, THREE
QUORUM = floor(RF / 2) + 1
ALL
LOCAL_ONE
LOCAL_QUORUM
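The QUORUM formula is easy to check with a few replication factors. A minimal sketch; the function name `quorum` is just a label chosen for this example.

```python
def quorum(replication_factor):
    """Number of replicas that form a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


for rf in (1, 2, 3, 5):
    print(f"RF = {rf} -> QUORUM = {quorum(rf)}")
```

Note that RF = 2 needs both replicas for a quorum, so it tolerates no failures at QUORUM, while RF = 3 tolerates one.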
@calonso
CONSISTENCY LEVEL
Availability / Partition tolerance <-> Consistency
@calonso
DEMO
Play with RFs, CLs and hints
REPAIR
Strengthening consistency.
@calonso
DIGEST QUERY
In consistent reads, only one node is asked
for data, the others are asked for a digest.
@calonso
READ REPAIR
What if nodes disagree?
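CL >= QUORUM
SELECT city FROM …
Replica responses (value: write timestamp):
Madrid: 123
Salamanca: 125
London: 150
The value with the latest timestamp (London) is returned to the client and written back to the out-of-date replicas.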
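The reconciliation step can be sketched as "latest timestamp wins". This is an illustration only: the replica names are hypothetical, and the numbers from the slide are assumed to be write timestamps.

```python
def reconcile(responses):
    """Pick the most recent value and list the replicas that need repairing.

    responses maps a replica name to a (value, write_timestamp) pair.
    """
    winner_value, winner_ts = max(responses.values(), key=lambda pair: pair[1])
    stale = sorted(name for name, (_, ts) in responses.items() if ts < winner_ts)
    return winner_value, stale


responses = {
    "replica1": ("Madrid", 123),
    "replica2": ("Salamanca", 125),
    "replica3": ("London", 150),
}
value, to_repair = reconcile(responses)
print(value, to_repair)  # London ['replica1', 'replica2']
```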
@calonso
READ REPAIR
And if CL < QUORUM?
The coordinator will issue a read_repair based on
read_repair_chance table property.
CREATE TABLE users (
…
) WITH read_repair_chance = 0.1;
@calonso
MANUAL REPAIR
Last defense against data entropy.
The nodetool repair command makes all data on a node
consistent with the latest replicas in the cluster.
--partitioner-range: option to restrict repair to the node’s primary range only
@calonso
MANUAL REPAIR
• Run nodetool repair:
• Recovering a failed node
• Increasing RF
• Periodically on every node
• Sequentially
• Once a week
GOSSIP
Nodes gossip between themselves
@calonso
GOSSIP
• Every second
• Three nodes
• Heartbeat + versioned information about
the whole cluster.
@calonso
GOSSIP
• Provide consistent list of seeds
• At least one per DC
Nodes prefer (10%) to gossip with their seeds
@calonso
SNITCH
• Allows the node to know its rack and data center topology.
• Enables replication in different racks
@calonso
SNITCH
• GossipingPropertyFileSnitch: config from cassandra-rackdc.properties and
propagated by gossiping
• Ec2Snitch: Amazon EC2 aware. Single region. Single DC. Availability zone = Rack
• Ec2MultiRegionSnitch: Multiple regions. Region = DC.
• …
@calonso
REVIEW QUESTIONS
Describe the relationship of nodes, racks, data
centers and clusters.
node > rack > data center > cluster
@calonso
REVIEW QUESTIONS
What is the function of the partitioner?
The partitioner’s function is to hash keys. Then the rest of the
cluster uses that output to determine where the data should live.
@calonso
REVIEW QUESTIONS
Can a node hold a partition with a token outside
its primary range?
Yes, if it’s replicating data for some other node, or if it’s holding a hint.
@calonso
REVIEW QUESTIONS
In a 3-node cluster with RF = 2, how much of the total
data volume does each node own?
66%
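The figure follows from RF / number of nodes, assuming the token ranges are evenly balanced. A short sketch of that arithmetic (the function name is hypothetical):

```python
def ownership_fraction(nodes, replication_factor):
    """Fraction of the total data each node holds on an evenly balanced ring."""
    # A node can never hold more than one full copy of the data.
    return min(replication_factor, nodes) / nodes


print(f"{ownership_fraction(3, 2):.1%}")  # 66.7%
```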
@calonso
REVIEW QUESTIONS
What is the function of the nodetool repair
operation?
Synchronising replicas.
Ensuring the node’s data is the most recent.
@calonso
REVIEW QUESTIONS
What is a remote coordinator?
When using multiple DCs and NetworkTopologyStrategy, at the point
of replicating to the second DC, the only node that receives the data in
that DC coordinates the request there. That node is the remote coordinator.
This avoids transmitting all data to all nodes from DC to DC.
@calonso
REVIEW QUESTIONS
How could RF and CL be tuned to ensure
immediate consistency?
• RF >= 3
• CL Write = ONE and CL Read = ALL
• CL Write = ALL and CL Read = ONE
• CL Write = QUORUM and CL Read = QUORUM
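All three combinations share one property: the replicas written plus the replicas read add up to more than RF, so every read overlaps with the latest write on at least one replica. A minimal sketch of that check; the `LEVELS` table and the function name are assumptions made for this example and only cover the non-local levels.

```python
# Number of replicas each consistency level waits for, as a function of RF.
LEVELS = {
    "ONE": lambda rf: 1,
    "TWO": lambda rf: 2,
    "THREE": lambda rf: 3,
    "QUORUM": lambda rf: rf // 2 + 1,
    "ALL": lambda rf: rf,
}


def is_immediately_consistent(rf, write_cl, read_cl):
    """True when read and write replica sets must overlap: W + R > RF."""
    return LEVELS[write_cl](rf) + LEVELS[read_cl](rf) > rf


for write_cl, read_cl in [("ONE", "ALL"), ("ALL", "ONE"),
                          ("QUORUM", "QUORUM"), ("ONE", "ONE")]:
    print(write_cl, read_cl, is_immediately_consistent(3, write_cl, read_cl))
```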
@calonso
CQL
The Cassandra Query Language
@calonso
PHYSICAL DATA STRUCTURE
DDL + DML
Defining our data shape
and actually using it
@calonso
DEV CENTER
@calonso
DDL
CREATE KEYSPACE musicdb WITH REPLICATION =
{ 'class': 'SimpleStrategy', 'replication_factor': 3 };
DROP KEYSPACE musicdb;
USE musicdb;
@calonso
PRACTICE TIME!
We need to build a system for an online e-book reading site.
CREATE KEYSPACE e_library WITH REPLICATION =
{ 'class': 'SimpleStrategy', 'replication_factor': 3 };
@calonso
DDL
CREATE TABLE performer (
  name VARCHAR,
  type VARCHAR,
  country VARCHAR,
  style VARCHAR,
  founded INT,
  born TIMESTAMP,
  died TIMESTAMP,
  PRIMARY KEY (name)
);
@calonso
PRIMARY KEY
PARTITION KEY + CLUSTERING COLUMN(S)
@calonso
PRIMARY KEY
• Simple partition key, no clustering columns:
• PRIMARY KEY (name)
• Composite partition key, no clustering columns:
• PRIMARY KEY ((album_title, year))
• Simple partition key and clustering columns:
• PRIMARY KEY (album_title, number)
• Composite partition key and clustering columns:
• PRIMARY KEY ((album_title, year), number)
@calonso
PRIMARY KEYS
CREATE TABLE tracks_by_album (
  album_title VARCHAR,
  year INT,
  performer VARCHAR STATIC,
  genre VARCHAR STATIC,
  number INT,
  track_title VARCHAR,
  PRIMARY KEY ((album_title, year), number)
);
CREATE TABLE albums_by_track (
  track_title VARCHAR,
  performer VARCHAR,
  year INT,
  album_title VARCHAR,
  PRIMARY KEY (
    track_title, performer, year, album_title)
);
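One way to picture the roles of the two key parts is a dictionary of sorted rows: the partition key selects the bucket, the clustering column orders and identifies rows inside it. The sketch below models tracks_by_album in memory; the `TracksByAlbum` class is hypothetical and ignores the static columns, replication and storage details.

```python
from collections import defaultdict


class TracksByAlbum:
    """In-memory model of PRIMARY KEY ((album_title, year), number)."""

    def __init__(self):
        # Partition key (album_title, year) -> {clustering column number: track_title}
        self._partitions = defaultdict(dict)

    def insert(self, album_title, year, number, track_title):
        # Same full primary key means same row: the new value overwrites the old one.
        self._partitions[(album_title, year)][number] = track_title

    def select(self, album_title, year):
        # Rows come back ordered by the clustering column.
        rows = self._partitions.get((album_title, year), {})
        return [(number, rows[number]) for number in sorted(rows)]


table = TracksByAlbum()
table.insert("Revolver", 1966, 2, "Eleanor Rigby")
table.insert("Revolver", 1966, 1, "Taxman")
table.insert("Revolver", 1966, 1, "Taxman")  # same primary key, no duplicate row
print(table.select("Revolver", 1966))
```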
CQL TYPE  | Constants          | Description
ASCII     | strings            | US-ASCII character strings
BIGINT    | integers           | 64-bit signed long
BLOB      | blobs              | Arbitrary bytes (no validation), as hexadecimal
BOOLEAN   | booleans           | true or false
COUNTER   | integers           | Distributed counter value (64-bit long)
DECIMAL   | integers or floats | Variable-precision decimal
DOUBLE    | integers, floats   | 64-bit IEEE-754 floating point
FLOAT     | integers, floats   | 32-bit IEEE-754 floating point
INET      | strings            | IP address string in IPv4 or IPv6 format
INT       | integers           | 32-bit signed integer
LIST      | n/a                | A collection of one or more ordered elements
MAP       | n/a                | A JSON-style array of literals { literal: literal, literal: literal, … }
SET       | n/a                | A collection of one or more elements
TEXT      | strings            | UTF-8 encoded text
TIMESTAMP | integers, strings  | Date + time as milliseconds since the epoch
TUPLE     | n/a                | Up to 32k fields
UUID      | uuids              | Standard UUID
VARCHAR   | strings            | UTF-8 encoded string
VARINT    | integers           | Arbitrary-precision integer
TIMEUUID  | uuids              | Type 1 UUID
@calonso
INSERT
• CQL INSERTS are:
• Atomic: Either all the values are inserted or none
• Isolated: Two inserts on the exact same PK happen one after the other, no
mixed values.
INSERT INTO albums_by_performer (performer, year, title, genre)
VALUES ('The Beatles', 1966, 'Revolver', 'Rock');
@calonso
UPDATE
• Primary Key columns cannot be changed.
• Full Primary key is required as predicate.
• CQL UPDATES are:
• Atomic: Either all the values are updated or none
• Isolated: Two updates on the exact same PK happen one after the other, no mixed values.
UPDATE albums_by_performer
SET genre = 'Rock'
WHERE performer = 'The Beatles' AND
year = 1966 AND
title = 'Revolver';
@calonso
UPSERT
INSERT INTO albums_by_performer (performer, year, title, genre)
VALUES ('The Beatles', 1966, 'Revolver', 'Rock');
==
UPDATE albums_by_performer
SET genre = 'Rock'
WHERE performer = 'The Beatles' AND
year = 1966 AND
title = 'Revolver';
@calonso
LWT
• Use at your own discretion:
• Cassandra uses Paxos algorithm to determine if the record exists or not.
• In total 6x performance penalty.
INSERT INTO albums_by_performer (performer, year, title, genre)
VALUES ('The Beatles', 1966, 'Revolver', 'Rock') IF NOT EXISTS;
@calonso
PRACTICE TIME!
We need to design a system that holds users. Users will have a name, an ID card number (unique),
a list of phones (home, mobile and work), a birth date and an email address.
NOTE: As we haven’t studied SELECT yet, use
SELECT * FROM <table name>; to inspect your data.
@calonso
PRACTICE TIME!
CREATE TABLE users (
  ID VARCHAR PRIMARY KEY,
  name VARCHAR,
  home_phone VARCHAR,
  work_phone VARCHAR,
  mobile_phone VARCHAR,
  email VARCHAR,
  birth_date TIMESTAMP
);
@calonso
MORE DDL
ALTER TABLE album ADD cover_image VARCHAR;
ALTER TABLE album DROP cover_image;
ALTER TABLE album ALTER cover_image TYPE BLOB;
@calonso
MORE DDL
CREATE TABLE albums_by_genre (
  genre VARCHAR,
  performer VARCHAR,
  year INT,
  album_title VARCHAR,
  PRIMARY KEY (
    genre, performer, year, album_title)
) WITH CLUSTERING ORDER BY
  (performer ASC, year DESC, album_title ASC);
@calonso
SECONDARY INDEXES
• Tables are indexed on columns in a PK
• Search on a partition key is very efficient
• Search on a PK and Clustering column is very efficient
• Search on other things is not supported
• Secondary indexes allow indexing other columns to be queried.
• One index per column
@calonso
SECONDARY INDEXES
CREATE TABLE performer (
  name VARCHAR,
  type VARCHAR,
  country VARCHAR,
  style VARCHAR,
  founded INT,
  born TIMESTAMP,
  died TIMESTAMP,
  PRIMARY KEY (name)
);
CREATE INDEX performers_by_style
ON performer (style);
DROP INDEX performers_by_style;
@calonso
SECONDARY INDEXES
• Same recommendations as for RDBMS
• Use indexes on low cardinality fields
• Beware of the write overhead
• Every node indexes its local data => a read hits all nodes!
• Don’t use them. Use lookup tables instead.
@calonso
PRACTICE TIME!
We need to query the users by name.
CREATE INDEX users_by_name
ON users (name);
@calonso
UUID
• Type 4 UUID
• Our way to ensure uniqueness in a distributed system.
7ffa4040-9132-4e0b-b04f-610e869d8717
@calonso
PRACTICE TIME!
Our system has another entity: Books. Books have a title and an author.
Neither field, nor even their combination, is guaranteed to be unique.
CREATE TABLE books (
  uid TIMEUUID PRIMARY KEY,
  title VARCHAR,
  author VARCHAR
);
@calonso
TIMEUUID
• Timestamp + UUID
• Type 1 UUID
• Generated with CQL now() function
• Can extract the timestamp with CQL dateof() function
c9cc9e60-711c-11e5-9d70-feff819cdc9f
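The "timestamp inside the UUID" idea can be checked outside Cassandra with Python's standard uuid module. This sketch decodes the example value above; the helper name `timeuuid_to_datetime` is an assumption for illustration, and it mirrors the idea behind dateof() rather than its implementation.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUIDs count 100-nanosecond intervals since the Gregorian epoch.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def timeuuid_to_datetime(value):
    """Extract the embedded timestamp from a type 1 (time-based) UUID."""
    if value.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    return GREGORIAN_EPOCH + timedelta(microseconds=value.time // 10)


example = uuid.UUID("c9cc9e60-711c-11e5-9d70-feff819cdc9f")
print(example.version, timeuuid_to_datetime(example))

# A type 4 UUID is random and carries no timestamp.
random_uuid = uuid.UUID("7ffa4040-9132-4e0b-b04f-610e869d8717")
print(random_uuid.version)
```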
@calonso
TIMEUUID
CREATE TABLE track_ratings_by_user (
  user UUID,
  activity TIMEUUID,
  rating INT,
  album_title VARCHAR,
  album_year INT,
  track_title VARCHAR,
  PRIMARY KEY (user, activity)
) WITH CLUSTERING ORDER BY (activity DESC);
@calonso
TTL
• Time To Live for columns, specified in seconds.
• After the TTL expires, the column is marked with a tombstone.
INSERT INTO albums_by_performer (performer, year, title, genre)
VALUES ('The Beatles', 1966, 'Revolver', 'Rock') USING TTL 30;
@calonso
PRACTICE TIME!
We are in the Big Data era and therefore we want to measure absolutely
everything our users do in our portal. Actions will be defined by a type (string)
and a receiver (int).
CREATE TABLE user_action (
  user_ID VARCHAR,
  time TIMESTAMP,
  type VARCHAR,
  receiver INT,
  PRIMARY KEY (user_ID, time)
);
@calonso
DELETE
• A whole partition:
• DELETE FROM <table> WHERE <partition_key> = value;
• A row:
• DELETE FROM <table> WHERE <primary key> = value;
• A column:
• DELETE <column name> FROM <table> WHERE <primary key> = value;
• Deleted things are marked with a tombstone, not actually removed.
@calonso
TRUNCATE
TRUNCATE albums_by_performer;
@calonso
COUNTERS
• Implements distributed counters
• The value can only be updated, never
set
• Cannot be part of the PK
• If present on a table, all non counter
columns in the same table must be part
of the PK
CREATE TABLE ratings_by_track (
  album_title VARCHAR,
  album_year INT,
  track_title VARCHAR,
  num_ratings COUNTER,
  sum_ratings COUNTER,
  PRIMARY KEY (album_title, album_year, track_title)
);
@calonso
COUNTERS
• Performance considerations
• Update requires a read before
• Accuracy considerations
• Counter update is not idempotent, so retrying false failures leads
to wrong value.
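The accuracy problem is easy to reproduce in a toy model: if the increment is applied but the acknowledgement is lost, a blind retry applies it twice. The classes and function below are hypothetical and exist only to illustrate the point.

```python
class CounterReplica:
    def __init__(self):
        self.value = 0

    def increment(self, delta, ack_lost=False):
        """Apply the delta; return False when the acknowledgement never arrives."""
        self.value += delta   # the update is applied on the replica...
        return not ack_lost   # ...but the client may see a timeout instead of an ack


def increment_with_retry(replica, delta, lose_first_ack):
    acknowledged = replica.increment(delta, ack_lost=lose_first_ack)
    if not acknowledged:
        # A "false failure": the client retries, and the delta is applied again.
        replica.increment(delta)


counter = CounterReplica()
increment_with_retry(counter, 1, lose_first_ack=True)
print(counter.value)  # 2, although the client intended a single +1
```

Setting a plain column to a fixed value does not have this problem, because repeating the same write leaves the same result.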
@calonso
COUNTERS
• No INSERT
• No value set, only update.
CREATE TABLE stats (
  performer VARCHAR,
  albums COUNTER,
  concerts COUNTER,
  PRIMARY KEY (performer)
);
UPDATE stats
SET albums = albums + 1, concerts = concerts + 10
WHERE performer = 'The Beatles';
@calonso
PRACTICE TIME!
We need to keep track of the number of times a specific book has been read
by a specific user.
CREATE TABLE books_read_by_user (
  book_uid UUID,
  user_ID VARCHAR,
  times COUNTER,
  PRIMARY KEY (book_uid, user_ID)
);
@calonso
COLLECTIONS
Our users can have several email addresses…
• Set: Uniqueness
• email_addresses SET<VARCHAR>
• List: Order
• email_addresses LIST<VARCHAR>
• Map: Key-Value pairs
• email_addresses MAP<VARCHAR, VARCHAR>
@calonso
SETS
• Insert:
• INSERT INTO band (name, members) VALUES ('The Beatles', {'John', 'Paul', 'George'});
• Union (duplicates deletion managed transparently):
• UPDATE band SET members = members + {'John', 'Ringo'} WHERE name = 'The Beatles';
• Difference:
• UPDATE band SET members = members - {'Ringo'} WHERE name = 'The Beatles';
• Deletion:
• DELETE members FROM band WHERE name = 'The Beatles';
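The union and difference semantics match those of an ordinary set, which can be checked quickly in Python. This is only an analogy for the statements above, not driver code.

```python
# INSERT ... VALUES ('The Beatles', {'John', 'Paul', 'George'})
members = {"John", "Paul", "George"}

# members = members + {'John', 'Ringo'}: 'John' is already there, so only 'Ringo' is added.
after_union = members | {"John", "Ringo"}

# members = members - {'Ringo'}
after_difference = after_union - {"Ringo"}

print(sorted(after_union))
print(sorted(after_difference))
```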
@calonso
LISTS
CREATE TABLE song (
  name VARCHAR,
  songwriters LIST<VARCHAR>,
  PRIMARY KEY (name)
);
• Insert:
• INSERT INTO song (name, songwriters) VALUES ('Hold your hand', ['John', 'Paul']);
• Append:
• UPDATE song SET songwriters = songwriters + ['Paul'] WHERE name = …;
@calonso
LISTS
• Prepend:
• UPDATE song SET songwriters = ['Paul'] + songwriters WHERE name = …;
• Update:
• UPDATE song SET songwriters[1] = 'Jonathan' WHERE name = …;
• Subtract:
• UPDATE song SET songwriters = songwriters - ['Jonathan'] WHERE name = …;
• Delete:
• DELETE songwriters[0] FROM song WHERE name = …;
@calonso
MAPS
• Insert:
• INSERT INTO album (title, tracks) VALUES ('Revolver',
{ 1: 'Taxman', 2: 'Eleanor' });
• Update:
• UPDATE album SET tracks[3] = 'Yellow Submarine' WHERE
title = …;
• Delete:
• DELETE tracks[3] FROM album WHERE title = …;
CREATE TABLE album (
title VARCHAR,
tracks MAP<INT, VARCHAR>,
PRIMARY KEY (title)
);
@calonso
PRACTICE TIME!
Our users can define a set of preferences in the portal:
TimeZone, Language and Currency
ALTER TABLE users ADD preferences MAP<VARCHAR, VARCHAR>;
@calonso
USER DEFINED TYPES
CREATE TYPE track (
album_title VARCHAR,
album_year INT,
track_title VARCHAR
);
CREATE TABLE track_ratings_by_user (
user UUID,
activity TIMEUUID,
rating INT,
song FROZEN<track>,
PRIMARY KEY (user, activity)
) WITH CLUSTERING ORDER BY (activity DESC);
FROZEN: the value has to be written as a whole; a single field (e.g. album_year) cannot be updated on its own.
INSERT INTO track_ratings_by_user (user, activity, rating, song) VALUES
(6ed4f220…, now(), 10,
{ album_title: 'Let it be', album_year: 1970, track_title: 'Let it be' });
@calonso
USER DEFINED TYPES
• Update:
• UPDATE track_ratings_by_user SET song = { album_title: 'Let it be',
album_year: 1970, track_title: 'Two of us' }
WHERE user = … AND activity = …;
• Delete:
• DELETE song FROM track_ratings_by_user WHERE user = …
AND activity = …;
@calonso
TUPLES
CREATE TABLE user (
id UUID PRIMARY KEY,
email TEXT,
name TEXT,
preferences SET<TEXT>,
equalizer FROZEN<TUPLE<FLOAT, FLOAT, FLOAT, INT, VARCHAR>>
);
INSERT INTO user (id, equalizer) VALUES
(6ed4f220…, (3.0, 1.1, 5.1, 3, 'Pop-Rock'));
@calonso
PRACTICE TIME!
Our users can have an e-reader, defined by brand and model.
CREATE TYPE e_reader (
brand VARCHAR,
model VARCHAR
);
ALTER TABLE users ADD reader FROZEN<e_reader>;
BATCH
Grouping and atomising queries.
@calonso
BATCH
• Combines multiple INSERT, UPDATE and DELETE operations into a
single logical operation:
• Saves on client - coordinator communication
• Atomic: if one operation succeeds, all will
• No isolation: other clients can read or write data affected by a
partially applied batch.
@calonso
BATCH
• All modified cells share the same timestamp, so when read they look
atomic => No order guarantee!!
• Don’t use BATCHES with operations on the same PK.
BEGIN BATCH
DELETE FROM albums WHERE name = 'Let it be';
INSERT INTO albums (name) VALUES ('Let it be');
APPLY BATCH;
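Why the order is lost can be sketched with a simplified last-write-wins model. This is a hypothetical Python illustration: `Cell` and `reconcile` are made-up names, and the tie-break rule (a delete beats a live value with the same timestamp) is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int
    value: Optional[str]   # None models a tombstone (a delete)


def reconcile(a, b):
    """Pick the winning cell: the newest timestamp wins; on a tie the
    tombstone wins (assumed tie-break rule for this sketch)."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    if a.value is None:
        return a
    if b.value is None:
        return b
    return a if a.value >= b.value else b


batch_ts = 1000                          # every statement in the batch shares it
delete = Cell(batch_ts, None)
insert = Cell(batch_ts, "Let it be")

print(reconcile(delete, insert).value)   # None
print(reconcile(insert, delete).value)   # None
```

Whichever statement comes first in the batch, the result is the same, so "delete then re-insert" does not behave as a sequence.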
@calonso
BATCH + LWT
• The whole BATCH will only run if conditions for all LWT are met.
• All operations in the BATCH will run sequentially.
BEGIN BATCH
UPDATE user SET lock = true IF lock = false;
DELETE FROM albums WHERE name = 'Let it be';
INSERT INTO albums (name) VALUES ('Let it be');
UPDATE user SET lock = false;
APPLY BATCH;
@calonso
ROLLBACK
• Not necessary
• An RDBMS cannot know, at the beginning of a transaction, whether all queries
will be able to succeed
• Cassandra can, so if they won’t, the batch doesn’t even start
SELECT
Searching for data
@calonso
SELECT
• All rows:
• SELECT * FROM album;
• Specific columns:
• SELECT performer, title, year FROM album;
• Specific field from a UDT:
• SELECT performer.lastname FROM album;
• Count:
• SELECT COUNT(*) FROM album;
@calonso
WHERE
• Equality matches:
• SELECT * FROM tracks_by_album WHERE album_title = 'Revolver' AND
year = 1966;
• SELECT * FROM tracks_by_album WHERE album_title = 'Revolver' AND
year = 1966 AND number = 6;
• IN:
• Only applicable to the last column in the WHERE clause
• SELECT * FROM tracks_by_album WHERE album_title = 'Revolver' AND
year = 1966 AND number IN (2, 3, 4);
@calonso
WHERE
• Range search:
• Only on clustering columns.
• SELECT * FROM tracks_by_album WHERE album_title = 'Revolver'
AND year = 1966 AND number >= 2 AND number < 6;
• ALLOW FILTERING:
• Allows scanning through all partitions => potentially very time consuming
• SELECT * FROM tracks_by_album WHERE number = 2 ALLOW
FILTERING;
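A rough mental model of why range searches work on clustering columns while ALLOW FILTERING is expensive, as a Python sketch. The in-memory layout, the sample rows and the function names are all hypothetical; real storage is more involved.

```python
import bisect

# Hypothetical layout: partition key -> rows kept sorted by the clustering
# column (track number). Titles are placeholders, not real data.
partitions = {
    ("Revolver", 1966): [(n, f"track {n}") for n in range(1, 8)],
    ("Help!", 1965): [(n, f"track {n}") for n in range(1, 5)],
}


def range_query(album_title, year, low, high):
    """number >= low AND number < high inside a single partition."""
    rows = partitions[(album_title, year)]       # one lookup by partition key
    numbers = [number for number, _ in rows]
    start = bisect.bisect_left(numbers, low)     # binary search on sorted keys
    end = bisect.bisect_left(numbers, high)
    return rows[start:end]                       # one contiguous slice


def filter_all(number):
    """What ALLOW FILTERING amounts to here: visit every partition."""
    return [
        (key, row)
        for key, rows in partitions.items()
        for row in rows
        if row[0] == number
    ]


print(range_query("Revolver", 1966, 2, 6))
print(len(filter_all(2)))
```

The range query touches one partition and reads a contiguous slice; the filtering query has to walk all of them.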
@calonso
DATA MODELLING
Processes and good practices to design our schema.
@calonso
DATA MODELLING
• Understand your data
• Decide how you’ll query the data
• Define column families to satisfy those queries
• Implement and optimize
@calonso
DATA MODELLING
Conceptual Model → (Query-Driven Methodology) → Logical Model → (Analysis & Validation) → Physical Model
@calonso
DATA MODELLING
E-R Diagram → (Query-Driven Methodology) → Chebotko Diagram → (Analysis & Validation) → Physical-level Chebotko Diagram
@calonso
CONCEPTUAL MODEL
@calonso
QUERY DRIVEN METHODOLOGY
• Spread data evenly around the cluster
• Minimize the number of partitions read
• Follow the mapping rules:
• Entities and relationships: map to tables
• Equality search attributes: must be at the beginning of the primary key
• Inequality search attributes: become clustering columns
• Ordering attributes: become clustering columns
• Key attributes: map to primary key columns
@calonso
LOGICAL MODEL
@calonso
ANALYSIS & VALIDATION
• Are write conflicts (overwrites) possible?
• How large are partitions?
• Ncells = Nrow × (Ncols − Npk − Nstatic) + Nstatic < 1M
• How much data duplication? (batches)
• Client side joins or new table?
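The partition size check above can be worked through numerically. A minimal Python sketch; the function name and the sample figures are made up for illustration.

```python
def partition_cells(n_rows, n_cols, n_pk, n_static):
    """Ncells = Nrow x (Ncols - Npk - Nstatic) + Nstatic"""
    return n_rows * (n_cols - n_pk - n_static) + n_static


# Hypothetical table: 10 columns, 3 of them in the primary key, 1 static,
# and 50,000 rows expected per partition.
cells = partition_cells(n_rows=50_000, n_cols=10, n_pk=3, n_static=1)
print(cells)              # 300001
print(cells < 1_000_000)  # True
```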
@calonso
PHYSICAL MODEL
@calonso
REVIEW QUESTIONS
What is the relationship between a column family
and a CQL table?
Terminologically they’re the same, but technically a
column family refers to the physical representation while
table refers to the logical tabular representation when
queried from CQL.
@calonso
REVIEW QUESTIONS
How are clustering columns ordered by default?
How can we modify it?
Ascending by default.
We can modify it by adding WITH CLUSTERING
ORDER BY… in CQL table definition.
@calonso
REVIEW QUESTIONS
Which is the biggest reason for using UUIDs in
Cassandra?
Distributed uniqueness. UUIDs can be generated on any node without coordination
and are, for practical purposes, unique across the whole distributed system.
@calonso
REVIEW QUESTIONS
What is the difference between a UUID and a
TIMEUUID?
A TIMEUUID (a version 1 UUID) embeds date and time information.
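The difference can be seen with Python's standard `uuid` module, used here only as an illustration outside Cassandra. `embedded_time` is a hypothetical helper name.

```python
import uuid
from datetime import datetime, timedelta, timezone

random_id = uuid.uuid4()   # version 4: random bits only
time_id = uuid.uuid1()     # version 1: timestamp + clock sequence + node

# Version 1 timestamps count 100 ns intervals since 1582-10-15.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def embedded_time(u):
    """Recover the creation time stored inside a version 1 UUID."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs embed a timestamp")
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


print(random_id.version, time_id.version)  # 4 1
print(embedded_time(time_id))              # roughly the current UTC time
```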
@calonso
REVIEW QUESTIONS
When should secondary indexes be used?
Very rarely. Only when the indexed column holds values with very low
cardinality and a lookup table is truly inconvenient.
@calonso
REVIEW QUESTIONS
Are CQL COUNTERS 100% accurate?
No, not 100%, because their update operations are not
idempotent, so a retried update can leave a wrong value.
@calonso
REVIEW QUESTIONS
What does it mean that Cassandra does
UPSERTs?
That INSERT and UPDATE operations are equivalent: both write the row whether or not it already exists.
@calonso
REVIEW QUESTIONS
What predicates are allowed in a CQL query?
Equality, Inequality and IN
@calonso
REVIEW QUESTIONS
When should the ALLOW FILTERING clause be
used?
Typically never. Only in development to scan through all your data.
@calonso
REVIEW QUESTIONS
How can data from two tables be combined in a
CQL query?
Cassandra doesn’t support JOIN statements, so we can:
• Nest dependent data in the same table.
• JOIN at application level.
@calonso
REVIEW QUESTIONS
What is the purpose of Chebotko Diagrams?
Capture our entities and properties as tables along with the query
access patterns expected on them.
@calonso
REVIEW QUESTIONS
Which is the most important thing to keep in
mind when designing our data models?
Minimize the number of accessed partitions.
@calonso
MORE CASSANDRA CONCEPTS
Write and read paths and compactions.
WRITE PATH
The writing process
@calonso
WRITE PATH
RDBMS CASSANDRA
@calonso
WRITE PATH
• Memtable: in-memory tables corresponding to CQL tables.
• CommitLog: append-only log to make writes durable.
• SSTables: Memtable snapshots periodically flushed to disk. Never
updated.
• Compaction: Periodic process to merge and streamline SSTables.
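How these pieces fit together on a write can be sketched with a toy model in Python. The `Node` class, its flush threshold counted in keys, and clearing the whole commit log on flush are simplifying assumptions of this sketch, not Cassandra internals.

```python
class Node:
    """Toy write path: commit log + memtable + immutable SSTables."""

    def __init__(self, memtable_limit):
        self.commit_log = []      # append-only, makes writes durable
        self.memtable = {}        # in memory, mutable
        self.sstables = []        # immutable snapshots, newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. sequential append
        self.memtable[key] = value             # 2. update in memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Dump the memtable as a sorted, read-only SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()   # flushed entries are no longer needed


node = Node(memtable_limit=2)
node.write("a", 1)
node.write("b", 2)    # reaches the limit, so the memtable is flushed
node.write("a", 3)    # the newer value lives only in the memtable for now
print(node.sstables)  # [(('a', 1), ('b', 2))]
print(node.memtable)  # {'a': 3}
```

The old value of "a" stays in the SSTable untouched; reconciling the two versions is left to reads and to compaction.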
@calonso
FLUSH PROCESS
• Dumps a Memtable to a new SSTable on disk and its summary index.
• Marks associated commit log entries as flushed
• Triggered by:
• memtable_total_space_in_mb reached
• commitlog_total_space_in_mb reached
• nodetool flush
READ PATH
The reading process
@calonso
READ PATH
• Memtable: in-memory table. Serves data as part of the merge process
• RowCache: in-memory cache. Stores recently read columns
• BloomFilter: predicts whether a partition key may be in its corresponding SSTable
• KeyCaches: map recently read partition keys to specific SSTable offsets
• Partition summaries: index the partition indexes
• Partition indexes: sorted partition keys mapped to their SSTable offsets
• SSTables: static files containing data.
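The Bloom filter idea can be sketched in a few lines of Python. This is a generic, minimal Bloom filter for illustration; the sizes, the SHA-256 based hashing and the class name are assumptions of this sketch, not how Cassandra implements it.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False: the key is definitely absent, so the SSTable can be skipped.
        # True: the key may be present, so the SSTable has to be checked.
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
for key in ("The Beatles", "Queen"):
    bf.add(key)

print(bf.might_contain("The Beatles"))  # True
```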
@calonso
READ/WRITE STATS
nodetool cfstats <keyspace>.<column family>
COMPACTIONS
Streamlining tables on disk
@calonso
DELETES
• When a column is deleted, a Tombstone is applied to the column in
its Memtable
• Tombstoned columns are ignored on read
• Tombstoned columns are kept around for gc_grace_seconds.
• gc_grace_seconds is configurable, but beware of “Zombies”
@calonso
COMPACTIONS
• Merges most recent partition keys and columns
• Evicts tombstoned columns
• Creates new SSTable
• Rebuilds partition indexes and summaries
• Deletes old SSTables
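The merge and eviction steps can be sketched as follows. A simplified Python model: SSTables are plain dicts of key -> (timestamp, value), `TOMBSTONE` is a made-up marker, and index rebuilding is left out.

```python
TOMBSTONE = object()   # hypothetical marker for a deleted column


def compact(sstables, now, gc_grace_seconds):
    """Merge SSTables into one: the newest timestamp wins per key, and
    tombstones older than gc_grace_seconds are evicted."""
    merged = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return {
        key: (ts, value)
        for key, (ts, value) in merged.items()
        if not (value is TOMBSTONE and now - ts > gc_grace_seconds)
    }


old = {"a": (10, "v1"), "b": (10, "v1")}
new = {"a": (20, "v2"), "b": (30, TOMBSTONE)}

print(compact([old, new], now=100, gc_grace_seconds=50))  # {'a': (20, 'v2')}
```

With a larger gc_grace_seconds the tombstone for "b" would survive the compaction, which is what lets other replicas still learn about the delete.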
@calonso
COMPACTIONS
• SizeTieredCompactionStrategy
• LeveledCompactionStrategy
• DateTieredCompactionStrategy
CREATE TABLE user (
id UUID PRIMARY KEY,
email TEXT,
name TEXT,
preferences SET<TEXT>
) WITH COMPACTION =
{ 'class': '<strategy>', <params> };
@calonso
SIZE TIERED COMPACTION
• Fast to complete
• SSTable sizes increase endlessly
• Potentially inconsistent read latency for updated data
• May waste disk, as we don’t know when deleted data will
be merged away
• Requires free disk space of 2x the largest table
• Recommended for write-once, read-many use cases
@calonso
LEVELED COMPACTION
• Continuously compacting (more I/O)
• 10 x sstable_size_in_mb (160 MB) as max required disk space
• Ensures low read latency
• Recommended with overwrites (updates) and tombstones
@calonso
DATE TIERED COMPACTION
• Compacts together data that was written close together in time
• Recommended for time series
@calonso
REVIEW QUESTIONS
What happens when a Memtable is flushed?
We create a new SSTable on disk. Also, the corresponding
CommitLog entries are marked as flushed.
@calonso
REVIEW QUESTIONS
What causes a Memtable to flush?
• memtable_total_space_in_mb reached
• commitlog_total_space_in_mb reached
• nodetool flush manually executed
@calonso
REVIEW QUESTIONS
Do disk seeks happen during writes?
No, during writes we only append to the commit log, which is an
append-only log. That means that writes happen
sequentially on disk.
@calonso
REVIEW QUESTIONS
What benefit do Bloom Filters provide to the read
process?
They allow us to skip reading SSTables that do not have the data we’re
looking for.
@calonso
REVIEW QUESTIONS
Is the partition summary read for partition keys
found in the key cache?
No. The key cache allows us to skip the partition summary and
partition index and go straight to the SSTable.
@calonso
REVIEW QUESTIONS
What is the relationship between the partition
summary and the partition index?
The partition summary is an index on the partition index.
@calonso
REVIEW QUESTIONS
What are zombie columns and how do you
prevent them?
Zombie columns are deleted columns that reappear after bringing up a node
that has been down for longer than gc_grace_seconds and so never saw the
tombstone. Prevent them by repairing every node within gc_grace_seconds.
@calonso
REVIEW QUESTIONS
What are the benefits of SizeTieredCompaction?
• Enable fast write operations
• Less disk I/O pressure
@calonso
REVIEW QUESTIONS
What are the benefits of LeveledCompaction?
• Predictable fast read performance
• Not necessary to have a lot of free disk space for it to happen.
@calonso
HARDWARE CONSIDERATIONS
@calonso
MEMORY
• Memory helps reads
• Recommendations
• Dedicated machines: 16GB - 64GB. Never below 8GB
• Virtual machines: 8GB - 16GB. Never below 4GB
• Testing machines: Virtual machines ~ 256 MB
@calonso
CPU
• CPU helps writes
• Recommendations
• Dedicated machines: 8 core processors
• Virtual machines: 8 cores + CPU burst
@calonso
DISK
• SizeTieredCompaction: 50% free disk space
• LeveledCompaction: 10% free disk space
• Recommendations
• 500 GB to 1 TB per node
• Two drives: One for data, one for CommitLog
• SSDs if possible
@calonso
NETWORK
• Gigabit ethernet or faster
@calonso
THANK YOU!

More Related Content

What's hot

Cassandra Distributions and Variants
Cassandra Distributions and VariantsCassandra Distributions and Variants
Cassandra Distributions and VariantsAnant Corporation
 
Migrating from a Relational Database to Cassandra: Why, Where, When and How
Migrating from a Relational Database to Cassandra: Why, Where, When and HowMigrating from a Relational Database to Cassandra: Why, Where, When and How
Migrating from a Relational Database to Cassandra: Why, Where, When and HowAnant Corporation
 
Apache Cassandra Training,Apache Cassandra Training in Bangalore india
Apache Cassandra Training,Apache Cassandra Training in Bangalore indiaApache Cassandra Training,Apache Cassandra Training in Bangalore india
Apache Cassandra Training,Apache Cassandra Training in Bangalore indiasharepointexpert
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
Case Study: Troubleshooting Cassandra performance issues as a developer
Case Study: Troubleshooting Cassandra performance issues as a developerCase Study: Troubleshooting Cassandra performance issues as a developer
Case Study: Troubleshooting Cassandra performance issues as a developerCarlos Alonso Pérez
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Cassandra Summit 2014: Apache Cassandra Best Practices at Ebay
Cassandra Summit 2014: Apache Cassandra Best Practices at EbayCassandra Summit 2014: Apache Cassandra Best Practices at Ebay
Cassandra Summit 2014: Apache Cassandra Best Practices at EbayDataStax Academy
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinChristian Johannsen
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7DataStax
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra Nikiforos Botis
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)DataStax Academy
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterDataStax Academy
 
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...DataStax
 
Cassandra@Coursera: AWS deploy and MySQL transition
Cassandra@Coursera: AWS deploy and MySQL transitionCassandra@Coursera: AWS deploy and MySQL transition
Cassandra@Coursera: AWS deploy and MySQL transitionDaniel Jin Hao Chia
 
Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overviewElifTech
 
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag JambhekarC* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag JambhekarDataStax Academy
 
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014Johnny Miller
 

What's hot (20)

Cassandra Distributions and Variants
Cassandra Distributions and VariantsCassandra Distributions and Variants
Cassandra Distributions and Variants
 
Migrating from a Relational Database to Cassandra: Why, Where, When and How
Migrating from a Relational Database to Cassandra: Why, Where, When and HowMigrating from a Relational Database to Cassandra: Why, Where, When and How
Migrating from a Relational Database to Cassandra: Why, Where, When and How
 
Apache Cassandra Training,Apache Cassandra Training in Bangalore india
Apache Cassandra Training,Apache Cassandra Training in Bangalore indiaApache Cassandra Training,Apache Cassandra Training in Bangalore india
Apache Cassandra Training,Apache Cassandra Training in Bangalore india
 
Cassandra vs Databases
Cassandra vs Databases Cassandra vs Databases
Cassandra vs Databases
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Case Study: Troubleshooting Cassandra performance issues as a developer
Case Study: Troubleshooting Cassandra performance issues as a developerCase Study: Troubleshooting Cassandra performance issues as a developer
Case Study: Troubleshooting Cassandra performance issues as a developer
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Cassandra Summit 2014: Apache Cassandra Best Practices at Ebay
Cassandra Summit 2014: Apache Cassandra Best Practices at EbayCassandra Summit 2014: Apache Cassandra Best Practices at Ebay
Cassandra Summit 2014: Apache Cassandra Best Practices at Ebay
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra Cluster
 
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
 
Cassandra@Coursera: AWS deploy and MySQL transition
Cassandra@Coursera: AWS deploy and MySQL transitionCassandra@Coursera: AWS deploy and MySQL transition
Cassandra@Coursera: AWS deploy and MySQL transition
 
Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overview
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag JambhekarC* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
 
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
Apache Cassandra For Java Developers - Why, What and How. LJC @ UCL October 2014
 

Viewers also liked

Scalable data modelling by example - Cassandra Summit '16
Scalable data modelling by example - Cassandra Summit '16Scalable data modelling by example - Cassandra Summit '16
Scalable data modelling by example - Cassandra Summit '16Carlos Alonso Pérez
 
Cassandra Summit 2014: Understanding CQL3 Inside and Out
Cassandra Summit 2014: Understanding CQL3 Inside and OutCassandra Summit 2014: Understanding CQL3 Inside and Out
Cassandra Summit 2014: Understanding CQL3 Inside and OutDataStax Academy
 
Cassandra - An Introduction
Cassandra - An IntroductionCassandra - An Introduction
Cassandra - An IntroductionMikio L. Braun
 
Introduction à Cassandra - devoxx france 2012
Introduction à Cassandra - devoxx france 2012Introduction à Cassandra - devoxx france 2012
Introduction à Cassandra - devoxx france 2012jaxio
 
Ruby closures, how are they possible?
Ruby closures, how are they possible?Ruby closures, how are they possible?
Ruby closures, how are they possible?Carlos Alonso Pérez
 
Construyendo y publicando nuestra primera app multi plataforma (II)
Construyendo y publicando nuestra primera app multi plataforma (II)Construyendo y publicando nuestra primera app multi plataforma (II)
Construyendo y publicando nuestra primera app multi plataforma (II)Carlos Alonso Pérez
 
Construyendo y publicando nuestra primera app multiplataforma
Construyendo y publicando nuestra primera app multiplataformaConstruyendo y publicando nuestra primera app multiplataforma
Construyendo y publicando nuestra primera app multiplataformaCarlos Alonso Pérez
 
Sensors (Accelerometer, Magnetometer, Gyroscope, Proximity and Luminosity)
Sensors (Accelerometer, Magnetometer, Gyroscope, Proximity and Luminosity)Sensors (Accelerometer, Magnetometer, Gyroscope, Proximity and Luminosity)
Sensors (Accelerometer, Magnetometer, Gyroscope, Proximity and Luminosity)Carlos Alonso Pérez
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Robbie Strickland
 
Always On: Building Highly Available Applications on Cassandra
Always On: Building Highly Available Applications on CassandraAlways On: Building Highly Available Applications on Cassandra
Always On: Building Highly Available Applications on CassandraRobbie Strickland
 

Viewers also liked (20)

Cassandra for impatients
Cassandra for impatientsCassandra for impatients
Cassandra for impatients
 
Scalable data modelling by example - Cassandra Summit '16
Scalable data modelling by example - Cassandra Summit '16Scalable data modelling by example - Cassandra Summit '16
Scalable data modelling by example - Cassandra Summit '16
 
Cassandra Summit 2014: Understanding CQL3 Inside and Out
Cassandra Summit 2014: Understanding CQL3 Inside and OutCassandra Summit 2014: Understanding CQL3 Inside and Out
Cassandra Summit 2014: Understanding CQL3 Inside and Out
 
Cassandra - An Introduction
Cassandra - An IntroductionCassandra - An Introduction
Cassandra - An Introduction
 
Introduction à Cassandra - devoxx france 2012
Introduction à Cassandra - devoxx france 2012Introduction à Cassandra - devoxx france 2012
Introduction à Cassandra - devoxx france 2012
 
Ruby closures, how are they possible?
Ruby closures, how are they possible?Ruby closures, how are they possible?
Ruby closures, how are they possible?
 
Construyendo y publicando nuestra primera app multi plataforma (II)
Construyendo y publicando nuestra primera app multi plataforma (II)Construyendo y publicando nuestra primera app multi plataforma (II)
Construyendo y publicando nuestra primera app multi plataforma (II)
 
Javascript - 2014
Javascript - 2014Javascript - 2014
Javascript - 2014
 
Enumerados Server
Enumerados ServerEnumerados Server
Enumerados Server
 
Swift and the BigData
Swift and the BigDataSwift and the BigData
Swift and the BigData
 
iOS Notifications
iOS NotificationsiOS Notifications
iOS Notifications
 
Construyendo y publicando nuestra primera app multiplataforma
Construyendo y publicando nuestra primera app multiplataformaConstruyendo y publicando nuestra primera app multiplataforma
Construyendo y publicando nuestra primera app multiplataforma
 
Html5
Html5Html5
Html5
 
Aplicaciones móviles - HTML5
Aplicaciones móviles - HTML5Aplicaciones móviles - HTML5
Aplicaciones móviles - HTML5
 
Javascript
JavascriptJavascript
Javascript
 
iCloud
iCloudiCloud
iCloud
 
Programacion web
Programacion webProgramacion web
Programacion web
 
Sensors (Accelerometer, Magnetometer, Gyroscope, Proximity and Luminosity)
Sensors (Accelerometer, Magnetometer, Gyroscope, Proximity and Luminosity)Sensors (Accelerometer, Magnetometer, Gyroscope, Proximity and Luminosity)
Sensors (Accelerometer, Magnetometer, Gyroscope, Proximity and Luminosity)
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015
 
Always On: Building Highly Available Applications on Cassandra
Always On: Building Highly Available Applications on CassandraAlways On: Building Highly Available Applications on Cassandra
Always On: Building Highly Available Applications on Cassandra
 

Similar to Cassandra Workshop - Cassandra from scratch in one day

Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...Helena Edelson
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at PollfishPollfish
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataVictor Coustenoble
 
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetupDataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetupVictor Coustenoble
 
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1DataStax Academy
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013jbellis
 
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014NoSQLmatters
 
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015Patrick McFadin
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideMohammed Fazuluddin
 
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkA Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkRed Hat Developers
 
Scaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosqlScaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosqlDavid Daeschler
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesDataStax Academy
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesPhil Peace
 
Cassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary DifferencesCassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary DifferencesScyllaDB
 
Tokyo Cassandra Summit 2014: Tunable Consistency by Al Tobey
Tokyo Cassandra Summit 2014: Tunable Consistency by Al TobeyTokyo Cassandra Summit 2014: Tunable Consistency by Al Tobey
Tokyo Cassandra Summit 2014: Tunable Consistency by Al TobeyDataStax Academy
 
Data stax academy
Data stax academyData stax academy
Data stax academyDuyhai Doan
 
Oracle to Cassandra Core Concepts Guid Part 1: A new hope
Oracle to Cassandra Core Concepts Guid Part 1: A new hopeOracle to Cassandra Core Concepts Guid Part 1: A new hope
Oracle to Cassandra Core Concepts Guid Part 1: A new hopeDataStax
 
cassandra_presentation_final
cassandra_presentation_finalcassandra_presentation_final
cassandra_presentation_finalSergioBruno21
 

Cassandra Workshop - Cassandra from scratch in one day