5. Easy to use
• CQL is a familiar syntax
• Friendly to programmers
• Paxos for locking
CREATE TABLE users (
username varchar,
firstname varchar,
lastname varchar,
email list<varchar>,
password varchar,
created_date timestamp,
PRIMARY KEY (username)
);

INSERT INTO users (username, firstname, lastname,
email, password, created_date)
VALUES ('pmcfadin','Patrick','McFadin',
['patrick@datastax.com'],'ba27e03fd95e507daf2937c937d499ab',
'2011-06-20 13:50:00');

INSERT INTO users (username, firstname,
lastname, email, password, created_date)
VALUES ('pmcfadin','Patrick','McFadin',
['patrick@datastax.com'],
'ba27e03fd95e507daf2937c937d499ab',
'2011-06-20 13:50:00')
IF NOT EXISTS;
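The effect of IF NOT EXISTS can be sketched in plain Python. This is a toy in-memory model, not Cassandra's Paxos implementation: `ToyUsersTable` and `insert_if_not_exists` are hypothetical names, and a thread lock stands in for the agreement that Paxos reaches across replicas. The boolean result plays the role of the `[applied]` column that Cassandra returns for a conditional write.

```python
import threading


class ToyUsersTable:
    """Toy single-process stand-in for a table keyed by username (hypothetical)."""

    def __init__(self):
        self._rows = {}
        # Assumption: a local lock stands in for Paxos agreement between replicas.
        self._lock = threading.Lock()

    def insert_if_not_exists(self, username, row):
        """Insert the row only if the username is absent; return True if applied."""
        with self._lock:
            if username in self._rows:
                return False
            self._rows[username] = dict(row)
            return True

    def get(self, username):
        return self._rows.get(username)


table = ToyUsersTable()
first = table.insert_if_not_exists("pmcfadin", {"firstname": "Patrick"})
second = table.insert_if_not_exists("pmcfadin", {"firstname": "Someone else"})
print(first, second)
```

The second call leaves the stored row untouched, which is the point of the conditional insert: the first writer wins and later writers are told their write was not applied.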
6. Time series in production
• It’s all about “What’s happening”
• Data is the new currency
“Sirca, a non-profit university consortium based in Sydney, is the world’s biggest broker of
financial data, ingesting into its database 2 million pieces of information a second from every
major trading exchange.”*
* http://www.theage.com.au/it-pro/business-it/help-poverty-theres-an-app-for-that-20140120-hv948.html
7. Why Cassandra for Time Series
Scales
Resilient
Good data model
Efficient Storage Model
What about that?
8. Data Model
CREATE TABLE temperature (
weatherstation_id text,
event_time timestamp,
temperature text,
PRIMARY KEY (weatherstation_id,event_time)
);
• Weather station ID and event time together are unique
• Store as many as needed
INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:01:00','72F');

INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:02:00','73F');

INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:03:00','73F');

INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:04:00','74F');
9. Storage Model - Logical View
SELECT weatherstation_id,event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD';
weatherstation_id | event_time          | temperature
1234ABCD          | 2013-04-03 07:01:00 | 72F
1234ABCD          | 2013-04-03 07:02:00 | 73F
1234ABCD          | 2013-04-03 07:03:00 | 73F
1234ABCD          | 2013-04-03 07:04:00 | 74F
10. Storage Model - Disk Layout
SELECT weatherstation_id,event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD';
Merged, Sorted and Stored Sequentially
1234ABCD | 2013-04-03 07:01:00 | 2013-04-03 07:02:00 | 2013-04-03 07:03:00 | 2013-04-03 07:04:00 | 2013-04-03 07:05:00 | 2013-04-03 07:06:00
         | 72F                 | 73F                 | 73F                 | 74F                 | 74F                 | 75F
11. Query patterns
SELECT event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD'
AND event_time > '2013-04-03 07:01:00'
AND event_time < '2013-04-03 07:04:00';
• Range queries
• “Slice” operation on disk
Single seek on disk
1234ABCD | 2013-04-03 07:01:00 | 2013-04-03 07:02:00 | 2013-04-03 07:03:00 | 2013-04-03 07:04:00 | 2013-04-03 07:05:00 | 2013-04-03 07:06:00
         | 72F                 | 73F                 | 73F                 | 74F                 | 74F                 | 75F
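The slice idea can be sketched in Python, assuming one partition is held as a list of (event_time, temperature) cells kept sorted by event_time. `slice_partition` is a hypothetical helper, not part of any Cassandra API. A binary search plays the role of the single seek, and the matching cells are then read as one contiguous run.

```python
from bisect import bisect_left, bisect_right

# One partition (weatherstation_id='1234ABCD'), cells sorted by event_time.
# Timestamps in this fixed format sort the same way as strings and as times.
partition = [
    ("2013-04-03 07:01:00", "72F"),
    ("2013-04-03 07:02:00", "73F"),
    ("2013-04-03 07:03:00", "73F"),
    ("2013-04-03 07:04:00", "74F"),
]


def slice_partition(cells, start, end):
    """Return cells with start < event_time < end (both bounds exclusive)."""
    # Building this key list is linear; it keeps the sketch short. A real
    # storage engine would locate the start position through an index.
    times = [event_time for event_time, _ in cells]
    lo = bisect_right(times, start)  # first cell strictly after start
    hi = bisect_left(times, end)     # first cell at or after end
    return cells[lo:hi]              # one contiguous run of cells


result = slice_partition(partition, "2013-04-03 07:01:00", "2013-04-03 07:04:00")
print(result)
```

Because both bounds in the query are strict, only the 07:02:00 and 07:03:00 cells fall inside the slice.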
12. Query patterns
SELECT event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD'
AND event_time > '2013-04-03 07:01:00'
AND event_time < '2013-04-03 07:04:00';
• Range queries
• “Slice” operation on disk
Sorted by event_time
weatherstation_id | event_time          | temperature
1234ABCD          | 2013-04-03 07:01:00 | 72F
1234ABCD          | 2013-04-03 07:02:00 | 73F
1234ABCD          | 2013-04-03 07:03:00 | 73F
1234ABCD          | 2013-04-03 07:04:00 | 74F
Programmers like this
14. Dealing with data at speed
• 1 million writes per second?
• 1 insert every microsecond
• Collisions?
• Partition key (first part of the primary key) determines node placement
• Random partitioning
• Special data type - TimeUUID
Diagram labels: “Your totally killer application”, weatherstation_id='5678EFGH', weatherstation_id='1234ABCD'
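How a key maps to a node can be sketched as follows. This is a simplified illustration, not Cassandra's actual partitioner: it hashes the key with SHA-256 as a stand-in and takes the result modulo the node count, whereas Cassandra assigns token ranges on a ring. `node_for` and the node names are hypothetical.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster


def node_for(partition_key, nodes=NODES):
    """Pick a node from a hash of the partition key (simplified placement)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest, "big")
    # Assumption: modulo placement instead of token ranges on a ring.
    return nodes[token % len(nodes)]


for station in ("1234ABCD", "5678EFGH"):
    print(station, "->", node_for(station))
```

The same key always lands on the same node, while different keys are spread pseudo-randomly across the cluster, so writes for many weather stations do not pile up on one machine.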
15. TimeUUID
Timestamp to Microsecond + UUID = TimeUUID
• Also known as a Version 1 UUID
• Sortable
• Reversible
04d580b0-9412-11e3-baa8-0800200c9a66 = Wednesday, February 12, 2014 6:18:06 PM GMT
http://www.famkruithof.net/uuid/uuidgen
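The "reversible" property can be checked with Python's standard `uuid` module. The sketch below decodes the example value back to its timestamp; `timeuuid_to_datetime` is a hypothetical helper name. A version 1 UUID stores its time as a count of 100-nanosecond intervals since 1582-10-15, so the code subtracts the offset to the Unix epoch and converts to microseconds.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100-ns intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_to_datetime(value):
    """Recover the UTC timestamp embedded in a version 1 UUID string."""
    u = uuid.UUID(value)
    if u.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    micros = (u.time - UUID_EPOCH_OFFSET) // 10  # 100-ns units to microseconds
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=micros)


decoded = timeuuid_to_datetime("04d580b0-9412-11e3-baa8-0800200c9a66")
print(decoded.isoformat())  # 2014-02-12T18:18:06.267000+00:00
```

The decoded value matches the date shown on the slide, with the sub-second part that the human-readable form drops.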