Event Sourcing with Cassandra
Cassandra Japan Meetup, Tokyo, March 2016
Luke Tillman
Technical Evangelist
@LukeTillman
Who are you?
• Evangelist with a focus on Developers
– Long-time developer on RDBMS (lots of .NET)
• I still write a lot of code, but now I also do a lot of teaching and speaking
A Quick Recap of Event Sourcing
Persistence with Event Sourcing
• Instead of keeping the current state, keep a journal of all the deltas (events)
• Append only (no UPDATE or DELETE)
• We can replay our journal of events to get the current state

Shopping Cart (id = 1345) event journal:
1. Cart Created (user_id = 4762, created_on = 7/10/2015…)
2. Item Added (item_id = 7621, quantity = 1, price = 19.99)
3. Item Added (item_id = 9134, quantity = 2, price = 16.99)
4. Item Removed (item_id = 7621)
5. Qty Changed (item_id = 9134, quantity = 1)
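To make the replay bullet concrete, here is a minimal sketch (in Scala, matching the Akka Persistence example later in the deck) that folds the five events above into the current cart state. The event and state types are illustrative, not from the talk.

sealed trait CartEvent
case class CartCreated(userId: Long) extends CartEvent
case class ItemAdded(itemId: Long, quantity: Int, price: BigDecimal) extends CartEvent
case class ItemRemoved(itemId: Long) extends CartEvent
case class QtyChanged(itemId: Long, quantity: Int) extends CartEvent

case class CartState(userId: Long = 0L, items: Map[Long, (Int, BigDecimal)] = Map.empty)

object Replay extends App {
  // Applying one event to the current state is a pure function...
  def applyEvent(state: CartState, event: CartEvent): CartState = event match {
    case CartCreated(user)           => state.copy(userId = user)
    case ItemAdded(item, qty, price) => state.copy(items = state.items + (item -> (qty, price)))
    case ItemRemoved(item)           => state.copy(items = state.items - item)
    case QtyChanged(item, qty)       =>
      val (_, price) = state.items(item)
      state.copy(items = state.items + (item -> (qty, price)))
  }

  // ...so replaying the journal is just a left fold over the events.
  val journal = Seq(
    CartCreated(4762L),
    ItemAdded(7621L, 1, BigDecimal("19.99")),
    ItemAdded(9134L, 2, BigDecimal("16.99")),
    ItemRemoved(7621L),
    QtyChanged(9134L, 1))

  val current = journal.foldLeft(CartState())(applyEvent)
  println(current) // CartState(4762, Map(9134 -> (1, 16.99)))
}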
Event Sourcing in Practice
• Typically two kinds of storage:
– Event Journal Store
– Snapshot Store
• A history of how we got to the current state can be useful
• We've also got a lot more data to store than we did before
(Diagram: the same Shopping Cart (id = 1345) event journal as above.)
Why use Cassandra for Event Sourcing?
• Transactional (OLTP) Workload
• Sequentially written, immutable data
– Looks a lot like time series data
• Easy to scale out to capture more events
Event Sourcing Example: Akka Persistence
Akka Persistence Journal API Summary
• Write Method
– For a given actor, write a group of messages
• Delete Method
– For a given actor, permanently or logically delete all messages up to a given sequence number
• Read Methods
– For a given actor, read back all the messages between two sequence numbers
– For a given actor, read the highest sequence number that's been written
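As a rough sketch of the surface area a journal plugin has to cover, here is a simplified trait mirroring the four operations above. The names and signatures are illustrative only; the real Akka Persistence plugin API (AsyncWriteJournal) differs by version.

import scala.concurrent.Future

// Illustrative shape only; not the exact Akka Persistence trait.
trait JournalPlugin {
  // Write a group of messages for one persistent actor.
  def writeMessages(persistenceId: String,
                    messages: Seq[(Long, Array[Byte])]): Future[Unit] // (seqNr, payload)

  // Permanently or logically delete all messages up to toSequenceNr.
  def deleteMessagesTo(persistenceId: String,
                       toSequenceNr: Long,
                       permanent: Boolean): Future[Unit]

  // Replay all messages with sequence numbers in [fromSequenceNr, toSequenceNr].
  def replayMessages(persistenceId: String,
                     fromSequenceNr: Long,
                     toSequenceNr: Long)(callback: (Long, Array[Byte]) => Unit): Future[Unit]

  // Read the highest sequence number that's been written for this actor.
  def readHighestSequenceNr(persistenceId: String): Future[Long]
}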
An Event Journal in Cassandra
Data Modeling for Reads and Writes
A Simple First Attempt
• Use persistence_id as partition key
– All messages for a given persistence id live together
• Use sequence_number as clustering column
– Order messages by sequence number inside a partition
• Read all messages between two sequence numbers
• Read the highest sequence number
CREATE TABLE messages (
persistence_id text,
sequence_number bigint,
message blob,
PRIMARY KEY (
persistence_id, sequence_number)
);
SELECT * FROM messages
WHERE persistence_id = ?
AND sequence_number >= ?
AND sequence_number <= ?;
SELECT sequence_number FROM messages
WHERE persistence_id = ?
ORDER BY sequence_number DESC LIMIT 1;
A Simple First Attempt
• Write a group of messages
• Use a Cassandra batch statement to ensure all messages (success) or no messages (failure) get written
• What's the problem with this data model (ignoring deletes for now)?
CREATE TABLE messages (
persistence_id text,
sequence_number bigint,
message blob,
PRIMARY KEY (
persistence_id, sequence_number)
);
BEGIN BATCH
INSERT INTO messages ... ;
INSERT INTO messages ... ;
INSERT INTO messages ... ;
APPLY BATCH;
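To make the write path concrete, here is a minimal sketch of issuing that logged batch from application code. It follows the DataStax Java driver 3.x API as I understand it; the contact point, keyspace name, persistence id, and payload are placeholders, not from the talk.

import com.datastax.driver.core.{BatchStatement, Cluster}

object WriteGroup extends App {
  val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
  val session = cluster.connect("akka_journal") // hypothetical keyspace name

  val insert = session.prepare(
    "INSERT INTO messages (persistence_id, sequence_number, message) VALUES (?, ?, ?)")

  // One logged batch for the whole group: all of it or none of it gets written.
  val batch = new BatchStatement() // defaults to a LOGGED batch
  val payload = java.nio.ByteBuffer.wrap(Array[Byte](0x00))
  for (seqNr <- 1L to 3L)
    batch.add(insert.bind("57ab...", java.lang.Long.valueOf(seqNr), payload))

  session.execute(batch)
  cluster.close()
}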
Unbounded Partition Growth
• Cassandra has a hard limit of 2 billion cells in a partition
• But there's also a practical limit
– Depends on row/cell data size, but likely not more than millions of rows
(Diagram: INSERT INTO messages ... keeps appending rows seq_nr = 1, 2, … to the single partition for persistence_id = '57ab...'. How far does it grow? ∞?)
Fixing the Unbounded Partition Growth Problem
• General strategy: add a column to the partition key
– Compound partition key
• Can be data that's already part of the model, or a "synthetic" column
• Allow users to configure a partition size in the plugin
– Partition size = number of rows per partition
– This should not be changeable once messages have been written
• The partition number for a given sequence number is then easy to calculate (see the sketch below the schema):
– (seqNr – 1) / partitionSize
– (100 – 1) / 100 = partition 0
– (101 – 1) / 100 = partition 1
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
sequence_number bigint,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
sequence_number)
);
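A minimal sketch of that mapping, assuming integer division and 1-based sequence numbers as on the slide:

object PartitionMath extends App {
  // Partition number for a 1-based sequence number; integer division floors.
  def partitionNr(seqNr: Long, partitionSize: Long): Long = (seqNr - 1) / partitionSize

  assert(partitionNr(1, 100) == 0)   // first row lands in partition 0
  assert(partitionNr(100, 100) == 0) // row 100 still fits in partition 0
  assert(partitionNr(101, 100) == 1) // row 101 rolls to partition 1
  println("partition math checks out")
}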
Fixing the Unbounded Partition Growth Problem
• Read all messages between two sequence numbers
• Read the highest sequence number
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
sequence_number bigint,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
sequence_number)
);
SELECT * FROM messages
WHERE persistence_id = ?
AND partition_number = ?
AND sequence_number >= ?
AND sequence_number <= ?;
SELECT sequence_number FROM messages
WHERE persistence_id = ?
AND partition_number = ?
ORDER BY sequence_number DESC LIMIT 1;
(first query: repeat per partition until we reach the target sequence number or run out of partitions)
(second query: repeat per partition until we run out of partitions)
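A sketch of the "repeat until we run out of partitions" loop for the highest sequence number. The in-memory table is a stand-in for running the second SELECT above once per partition; names and data are illustrative.

object HighestSeqNr extends App {
  // In-memory stand-in for the messages table, keyed by
  // (persistence_id, partition_number).
  val table: Map[(String, Long), Seq[Long]] = Map(
    ("57ab...", 0L) -> (1L to 100L).toSeq,
    ("57ab...", 1L) -> Seq(101L, 102L))

  def queryHighestInPartition(pid: String, partitionNr: Long): Option[Long] =
    table.get((pid, partitionNr)).map(_.max)

  // Walk partitions 0, 1, 2, ... until one comes back empty,
  // keeping the highest sequence number seen so far.
  def readHighestSequenceNr(pid: String): Long =
    Iterator.from(0)
      .map(p => queryHighestInPartition(pid, p.toLong))
      .takeWhile(_.isDefined)
      .flatten
      .foldLeft(0L)((_, highest) => highest)

  println(readHighestSequenceNr("57ab...")) // 102
}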
Fixing the Unbounded Partition Growth Problem
• Write a group of messages
• A Cassandra batch statement might now write to multiple partitions (if the sequence numbers cross a partition boundary)
• Is that a problem?
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
sequence_number bigint,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
sequence_number)
);
BEGIN BATCH
INSERT INTO messages ... ;
INSERT INTO messages ... ;
INSERT INTO messages ... ;
APPLY BATCH;
RTFM: Cassandra Batches Edition
"Batches are atomic by default. In the context of a Cassandra batch
operation, atomic means that if any of the batch succeeds, all of it will."
- DataStax CQL Docs
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/batch_r.html
"Although an atomic batch guarantees that if any part of the batch succeeds,
all of it will, no other transactional enforcement is done at the batch level.
For example, there is no batch isolation. Clients are able to read the first
updated rows from the batch, while other rows are still being updated on the
server."
- DataStax CQL Docs
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/batch_r.html
Atomic? That's kind of a loaded word.
Multiple Partition Batch Failure Scenario
(Journal keyspace with RF = 3; the client sends the batch at CL = QUORUM)

BEGIN BATCH
...
APPLY BATCH;

• The coordinator first writes the batch to the batch log, which is replicated to other nodes
• Once written to the batch log successfully, we know all the writes in the batch will succeed eventually (atomic?)
• If the coordinator can't finish applying the writes in time, the client gets back a WriteTimeout with writeType = BATCH, and the batch has been partially applied
• It's possible to read a partially applied batch, since there is no batch isolation
RTFM: Cassandra Batches Edition Part 2
"For example, there is no batch isolation. Clients are able to read the first
updated rows from the batch, while other rows are still being updated on the
server. However, transactional row updates within a partition key are
isolated: clients cannot read a partial update."
- DataStax CQL Docs
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/batch_r.html
What we really need is Isolation: when writing a group of messages, ensure that we write the group to a single partition.
Logic Changes to Ensure Batch Isolation
• Still use a configurable partition size
– Not a "hard limit" but a "best attempt"
• On write, see if the messages will all fit in the current partition
• If not, roll over to the next partition early (see the sketch below)
• Reading is slightly more complicated
– For a given sequence number, it might be in partition n or (n + 1)
(Diagram: PartitionSize = 100; partition_nr = 1 ends early at seq_nr = 98, and the group containing seq_nr = 99, 100, 101 is written whole to partition_nr = 2.)
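A sketch of the roll-over decision under those rules. The names are illustrative, and this only decides placement for one group against the nominal boundaries; a real plugin would also track where the previous group actually landed.

object RollOver extends App {
  val partitionSize = 100L

  // Nominal (0-based) partition for a 1-based sequence number.
  def nominalPartition(seqNr: Long): Long = (seqNr - 1) / partitionSize

  // If a group of messages would cross a partition boundary, write the
  // whole group to the next partition so the logged batch stays within
  // a single partition (and is therefore isolated).
  def partitionForGroup(firstSeqNr: Long, lastSeqNr: Long): Long = {
    val first = nominalPartition(firstSeqNr)
    if (first == nominalPartition(lastSeqNr)) first else first + 1
  }

  // After an early roll-over, rows no longer line up with the nominal
  // math, which is why a reader must check both partition n and n + 1.
  println(partitionForGroup(97, 98))  // 0: fits in the current partition
  println(partitionForGroup(99, 101)) // 1: whole group rolls over early
}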
Accounting for Deletes
Option 1: Mark Individual Messages as Deleted
• Add an is_deleted column to our messages table
• Read all messages between two sequence numbers
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
sequence_number bigint,
message blob,
is_deleted bool,
PRIMARY KEY (
(persistence_id, partition_number),
sequence_number)
);
SELECT * FROM messages
WHERE persistence_id = ?
AND partition_number = ?
AND sequence_number >= ?
AND sequence_number <= ?;
(repeat until we reach sequence number or run out of partitions)
... sequence_number message is_deleted
... 1 0x00 true
... 2 0x00 true
... 3 0x00 false
... 4 0x00 false
Option 1: Mark Individual Messages as Deleted
• Pros:
– On replay, easy to check if a message has been deleted (it comes back in the message query's data)
• Cons:
– Messages are not immutable any more
– Issue lots of UPDATEs to mark each message as deleted (see the sketch below the schema)
– Have to scan through a lot of rows to find the max deleted sequence number if we want to avoid issuing unnecessary UPDATEs
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
sequence_number bigint,
message blob,
is_deleted bool,
PRIMARY KEY (
(persistence_id, partition_number),
sequence_number)
);
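A sketch of what "lots of UPDATEs" means in practice. The execute() helper is a hypothetical stand-in for a driver call; the CQL mirrors the is_deleted schema above.

object LogicalDelete {
  // Hypothetical stand-in for a driver call.
  def execute(cql: String, params: Any*): Unit = ???

  // A logical delete-to is one UPDATE per not-yet-deleted row.
  def markDeletedTo(persistenceId: String,
                    partitionNr: Long,
                    fromSeqNr: Long, // first not-yet-deleted seqNr, found by scanning
                    toSeqNr: Long): Unit =
    for (seqNr <- fromSeqNr to toSeqNr)
      execute(
        "UPDATE messages SET is_deleted = true " +
          "WHERE persistence_id = ? AND partition_number = ? AND sequence_number = ?",
        persistenceId, partitionNr, seqNr)
}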
Option 2: Write a Marker Row for Each Deleted Row
• Add a marker column and make it a clustering column
– Messages are written with 'A'
– Deletes get written with 'D'
• Read all messages between two sequence numbers
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
sequence_number bigint,
marker text,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
sequence_number, marker)
);
SELECT * FROM messages
WHERE persistence_id = ?
AND partition_number = ?
AND sequence_number >= ?
AND sequence_number <= ?;
(repeat until we reach sequence number or run out of partitions)
... sequence_number marker message
... 1 A 0x00
... 1 D null
... 2 A 0x00
... 3 A 0x00
Option 2: Write a Marker Row for Each Deleted Row
• Pros
– On replay, easy to peek at the next row to check if deleted (it comes back in the message query's data)
– Message data stays immutable
• Cons
– Issue lots of INSERTs to mark each message as deleted
– Have to scan through a lot of rows to find the max deleted sequence number if we want to avoid issuing unnecessary INSERTs
– Potentially twice as many rows to store
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
sequence_number bigint,
marker text,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
sequence_number, marker)
);
Looking at Physical Deletes
• Physically delete messages to a given sequence number
• Still probably want to scan through rows to see what's already been deleted first
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
sequence_number bigint,
marker text,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
sequence_number, marker)
);
BEGIN BATCH
DELETE FROM messages
WHERE persistence_id = ?
AND partition_number = ?
AND marker = 'A'
AND sequence_number = ?;
...
APPLY BATCH;
• Can't range delete, so we have to do lots of individual DELETEs
Looking at Physical Deletes
• Read all messages between two sequence numbers
• With how DELETEs work in Cassandra, is there a potential problem with this query?
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
sequence_number bigint,
marker text,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
sequence_number, marker)
);
SELECT * FROM messages
WHERE persistence_id = ?
AND partition_number = ?
AND sequence_number >= ?
AND sequence_number <= ?;
(repeat until we reach sequence number or run out of partitions)
Tombstone Hell: Queue-like Data Sets

The journal partition for persistence_id = '57ab...', partition_number = 1 starts out holding message rows seq_nr = 1, 2, … all written with marker = 'A'.

Delete messages to a sequence number:

BEGIN BATCH
DELETE FROM messages
WHERE persistence_id = '57ab...'
AND partition_number = 1
AND marker = 'A'
AND sequence_number = 1;
...
APPLY BATCH;

Each DELETE leaves a tombstone next to the data: the partition now holds both the original row and a tombstone (NO DATA HERE) for every deleted seq_nr.

• At some point compaction runs and we don't have two versions any more, but tombstones don't go away immediately
– Tombstones remain for gc_grace_seconds
– The default is 10 days

Read all messages between two sequence numbers:

SELECT * FROM messages
WHERE persistence_id = '57ab...'
AND partition_number = 1
AND sequence_number >= 1
AND sequence_number <= [max value];

The range scan now has to read past a tombstone for every deleted row (seq_nr = 1, 2, 3, 4, …) before it reaches any live data.
Avoid Tombstone Hell

We need a way to avoid reading tombstones when replaying messages.

SELECT * FROM messages
WHERE persistence_id = ?
AND partition_number = ?
AND sequence_number >= ?
AND sequence_number <= ?;

If we know what sequence number we've already deleted to before we query, we could make that lower bound (the sequence_number >= ? clause) smarter. The third option below does exactly that.
A Third Option for Deletes
• Use marker as a clustering column, but change the clustering order
– Messages are still written with 'A', deletes with 'D'
• Read all messages between two sequence numbers
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
marker text,
sequence_number bigint,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
marker, sequence_number)
);
SELECT * FROM messages
WHERE persistence_id = ?
AND partition_number = ?
AND marker = 'A'
AND sequence_number >= ?
AND sequence_number <= ?;
(repeat until we reach sequence number or run out of partitions)
... sequence_number marker message
... 1 A 0x00
... 2 A 0x00
... 3 A 0x00
A Third Option for Deletes
• The message data no longer carries deleted information, so how do we know what's already been deleted?
• Get the max deleted sequence number
• Can avoid tombstones if this is done before getting message data (see the sketch below the query)
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
marker text,
sequence_number bigint,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
marker, sequence_number)
);
SELECT sequence_number FROM messages
WHERE persistence_id = ?
AND partition_number = ?
AND marker = 'D'
ORDER BY marker DESC,
sequence_number DESC
LIMIT 1;
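Putting the two queries together, a sketch of a tombstone-avoiding replay under this third model. The query helpers are hypothetical stand-ins for driver calls; the CQL they would run is the pair of SELECTs shown above (the 'D' marker row first, then the 'A' rows).

object TombstoneAvoidingReplay {
  // Hypothetical stand-ins for driver calls.
  def queryMaxDeleted(pid: String, partitionNr: Long): Option[Long] = ???
  def queryMessages(pid: String, partitionNr: Long,
                    fromSeqNr: Long, toSeqNr: Long): Seq[Array[Byte]] = ???

  def replay(pid: String, partitionNr: Long,
             fromSeqNr: Long, toSeqNr: Long): Seq[Array[Byte]] = {
    // 1. One row tells us how far physical deletes have already gone.
    val deletedTo = queryMaxDeleted(pid, partitionNr).getOrElse(0L)
    // 2. Raise the lower bound past the deleted range so the range scan
    //    never has to read the tombstones those deletes left behind.
    val from = math.max(fromSeqNr, deletedTo + 1)
    if (from > toSeqNr) Seq.empty else queryMessages(pid, partitionNr, from, toSeqNr)
  }
}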
A Third Option for Deletes
• Pros
– Message data stays immutable
– Issue a single INSERT when deleting to a sequence number
– Read a single row to find out what's been deleted (no more scanning)
– Can avoid reading tombstones created by physical deletes
• Cons
– Requires a separate query to find out what's been deleted before getting message data
CREATE TABLE messages (
persistence_id text,
partition_number bigint,
marker text,
sequence_number bigint,
message blob,
PRIMARY KEY (
(persistence_id, partition_number),
marker, sequence_number)
);
Lessons Learned
Summary
• Seemingly simple data models can get a lot more complicated
• Avoid unbounded partition growth
– Add data to your partition key
• Be aware of how Cassandra logged batches work
– If you need isolation, only write to a single partition
• Avoid queue-like data sets and be aware of how tombstones might impact your queries
– Try to query with ranges that avoid tombstones
Questions?
@LukeTillman
https://www.linkedin.com/in/luketillman/
https://github.com/LukeTillman/