This summer, coming to a server near you: Cassandra 3.0! Contributors and committers have been working hard on what is the most ambitious release to date. It’s almost too much to talk about, but we will dig into some of the most important, groundbreaking features that you’ll want to use: indexing changes that will make your applications faster and Spark jobs more efficient; storage engine changes to get even more density and efficiency from your nodes; and developer-focused features like full JSON support and User Defined Functions. And finally, one of the most requested features, Windows support, has made its arrival. There is more, but you’ll just have to come see for yourself. Get your front row seat and don’t miss it!
4. User Defined Functions
• Counter table
• User clicks on a number of stars
• rating_counter = How many clicks
• rating_total = Cumulative amount of stars
CREATE TABLE video_rating (
videoid uuid,
rating_counter counter,
rating_total counter,
PRIMARY KEY (videoid)
);
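A minimal sketch of how the two counters are meant to move together, using an in-memory stand-in rather than Cassandra itself. The class name `RatingCounters` and the 1 to 5 star range are assumptions made for illustration only.

```java
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for a single video_rating row (hypothetical, not Cassandra code).
final class RatingCounters {
    private final AtomicLong ratingCounter = new AtomicLong(); // how many clicks
    private final AtomicLong ratingTotal = new AtomicLong();   // cumulative amount of stars

    // One user click: count the click and add its stars to the running total.
    void rate(int stars) {
        if (stars < 1 || stars > 5) { // assumption: ratings use a 1-5 star scale
            throw new IllegalArgumentException("stars must be between 1 and 5");
        }
        ratingCounter.incrementAndGet();
        ratingTotal.addAndGet(stars);
    }

    long ratingCounter() {
        return ratingCounter.get();
    }

    long ratingTotal() {
        return ratingTotal.get();
    }
}
```

Three clicks of 5, 4 and 3 stars leave rating_counter at 3 and rating_total at 12, which is all that is needed to derive an average later.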
5. User Defined Functions
CREATE TABLE video_rating (
videoid uuid,
rating_counter counter,
rating_total counter,
PRIMARY KEY (videoid)
);
public double getRatingForVideo(UUID videoId) {
    BoundStatement bs = getRatingByVideoPreparedStatement.bind(videoId);
    ResultSet rs = session.execute(bs);
    Row row = rs.one();
    // No row means the video has not been rated yet
    if (row == null) {
        return 0.0;
    }
    // Get the count and total rating for the video
    long total = row.getLong("rating_total");
    long count = row.getLong("rating_counter");
    // Divide the total by the count, avoiding integer truncation and division by zero
    return count == 0 ? 0.0 : (double) total / count;
}
Application code?
7. User Defined Functions
CREATE OR REPLACE FUNCTION averageRating ( rating_counter counter, rating_total counter )
RETURNS Float
LANGUAGE java
AS '
return Float.valueOf(rating_total.floatValue() / rating_counter.floatValue());
';
Callouts: function name, CQL type, object return type, Java code
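As a rough illustration, the body of the function above behaves like the plain Java method below. The class name `AverageRatingBody` is hypothetical, and the mapping of CQL `counter` arguments to `java.lang.Long` is an assumption of this sketch.

```java
// Plain-Java equivalent of the UDF body (hypothetical wrapper class).
final class AverageRatingBody {
    // Assumption: both counter arguments arrive as boxed Long values.
    static Float averageRating(Long ratingCounter, Long ratingTotal) {
        // Float division: a zero counter yields NaN or Infinity rather than an exception.
        return Float.valueOf(ratingTotal.floatValue() / ratingCounter.floatValue());
    }
}
```

Note that arguments and the return value are objects (`Long`, `Float`), not primitives, and that the method reads nothing except its parameters.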
8. User Defined Functions
• Add to your CQL statement!
> SELECT averageRating(rating_counter, rating_total) AS avg
FROM video_rating
WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0;
 videoid                              | rating_counter | rating_total
--------------------------------------+----------------+--------------
 99051fe9-6a9c-46c2-b949-38ef78858dd0 |              3 |           12

 avg
-----
   4
9. User Defined Functions - Fine print
• “Pure” functions only
• Cannot use anything outside of the input parameters
• Return types are objects only, no primitives
• Function signatures are determined by parameter types
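The fine print can be sketched in plain Java. The class and method names here are hypothetical; the point is only the shape: boxed types in and out, no state beyond the arguments, and overloads told apart by parameter type.

```java
// Hypothetical illustration of the constraints listed above.
final class PureFunctions {
    // Pure: the result depends only on the argument. Returns an object, not a primitive.
    static Double half(Integer value) {
        return value / 2.0;
    }

    // Same name, different parameter type: a distinct signature.
    static Double half(Double value) {
        return value / 2.0;
    }
}
```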
10. User Defined Function Language Support
Primary Languages
• Java
• JavaScript
Optional Languages
• Scala
• Groovy
• Jython
• JRuby
11. JSON Support
• Table to store a video
• TYPE to store metadata
CREATE TYPE video_metadata (
height int,
width int,
video_bit_rate set<text>,
encoding text
);
CREATE TABLE videos (
videoid uuid,
userid uuid,
name varchar,
description varchar,
location text,
location_type int,
preview_thumbnails map<text,text>,
tags set<varchar>,
metadata set <frozen<video_metadata>>,
added_date timestamp,
PRIMARY KEY (videoid)
);
12. JSON Support
INSERT INTO videos (videoid, name, userid, description, location, location_type,
preview_thumbnails, tags, added_date, metadata)
VALUES (49f64d40-7d89-4890-b910-dbf923563a33,'The World''s Next Top Data Model',
9761d3d7-7fbd-4269-9988-6cfd4e188678,
'Third in a three part series for Cassandra Data Modeling',
'http://www.youtube.com/watch?v=HdJlsOZVGwM',1,
{'YouTube':'http://www.youtube.com/watch?v=HdJlsOZVGwM'},
{'cassandra','data model','examples','instruction'},'2013-06-11 11:00:00',
{{ height: 480, width: 640, encoding: 'MP4', video_bit_rate: {'1000kbs', '400kbs'}}});
Decompose into standard insert
OR!
13. JSON Support
INSERT INTO videos JSON
'{
"videoid":"99051fe9-6a9c-46c2-b949-38ef78858dd0",
"added_date":"2012-06-01 08:00:00.000",
"description":"My cat likes to play the piano! So funny.",
"location":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401",
"location_type":1,
"metadata":[
{
"height":480,
"width":640,
"video_bit_rate":[
"1000kbs",
"400kbs"
],
"encoding":"MP4"
}
],
"name":"My funny cat",
"preview_thumbnails":{
"10":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401"
},
"tags":[
"cats",
"lol",
"piano"
],
"userid":"d0f60aa8-54a9-4840-b70c-fe562b68842b"
}';
One block of JSON
OR!
14. JSON Support
INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags,
added_date, metadata)
VALUES (99051fe9-6a9c-46c2-b949-38ef78858dd0,'My funny cat',d0f60aa8-54a9-4840-b70c-fe562b68842b,
'My cat likes to play the piano! So funny.','/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401',1,
{'10':'/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401'},{'cats','piano','lol'},'2012-06-01 08:00:00',
fromJson('
[{
"height":480,
"width":640,
"video_bit_rate":[
"1000kbs",
"400kbs"
],
"encoding":"MP4"
}]
')
);
Just a block at a time
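For application code that already holds a JSON document as a string, the single-block form could be assembled roughly as follows. This is a sketch under assumptions: the helper name `insertJson` is hypothetical, and it only handles the one escaping rule visible in the CQL above (a single quote inside a string literal is doubled).

```java
// Hypothetical helper that wraps a JSON document in an INSERT ... JSON statement.
final class JsonInsert {
    static String insertJson(String table, String json) {
        // CQL string literals use single quotes, so embedded single quotes are doubled.
        String escaped = json.replace("'", "''");
        return "INSERT INTO " + table + " JSON '" + escaped + "';";
    }
}
```

The JSON itself is passed through untouched, so its keys still have to match the column names of the target table.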
20. More Indexes!
• Partial Indexes - Postponed until 3.1
• Functional Indexes - using a UDF in an index
CREATE INDEX ON user_rating averageRating(rating_counter, rating_total);
22. Hints to Raw Files
• Pre-3.0 hints were stored in a table
• Creates load on the entire write path
• …and the read path
• …and compaction
CREATE TABLE system.hints (
target_id uuid,
hint_id timeuuid,
message_version int,
mutation blob,
PRIMARY KEY (target_id, hint_id, message_version)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (hint_id ASC, message_version ASC);
23. Hints to Raw Files
• Hints now written to a local file
• Replays direct from disk
• Bulk streamed to endpoints
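The idea of appending hints to a local file and replaying them straight from disk can be sketched as below. This is a toy illustration, not Cassandra's hint file format: the class name `HintFileSketch`, the length-prefixed record layout, and the `roundTrip` helper are all assumptions.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Toy append-only hint file (hypothetical layout: 4-byte length, then the payload).
final class HintFileSketch {
    // Append one serialized mutation to the end of the local file.
    static void append(Path file, byte[] mutation) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(
                Files.newOutputStream(file, StandardOpenOption.CREATE, StandardOpenOption.APPEND)))) {
            out.writeInt(mutation.length);
            out.write(mutation);
        }
    }

    // Replay: read the records sequentially, in the order they were written.
    static List<byte[]> replay(Path file) throws IOException {
        List<byte[]> hints = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfFile) {
                    break; // clean end of file: no more hints
                }
                byte[] mutation = new byte[length];
                in.readFully(mutation);
                hints.add(mutation);
            }
        }
        return hints;
    }

    // Convenience for trying it out: write to a temp file, replay, and clean up.
    static List<String> roundTrip(String... mutations) {
        try {
            Path file = Files.createTempFile("hints", ".log");
            try {
                for (String mutation : mutations) {
                    append(file, mutation.getBytes(StandardCharsets.UTF_8));
                }
                List<String> replayed = new ArrayList<>();
                for (byte[] hint : replay(file)) {
                    replayed.add(new String(hint, StandardCharsets.UTF_8));
                }
                return replayed;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writes are sequential appends and replay is a sequential scan, so nothing in this path touches a table, the read path, or compaction.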
24. Windows Compatibility - The Problem
• Java file management on Windows is… different
• File deletes are not possible
• Hard links - Broken
• Snapshots - Broken
• Memory Mapped I/O - Broken
25. Windows Compatibility - 3.0
• Re-tooling of critical file functions
• Extensive use of FILE_SHARE_DELETE from JDK7
• Launch now in PowerShell
• CCM now supports Windows
26. Storage Engine Changes
• Now infamous CASSANDRA-8099
• Technical debt from Thrift
• Move from Thrift centric to CQL centric storage
27. Pre 3.0 Storage Engine Format
[Figure: a single partition laid out as a partition key followed by its clustering columns]
PRIMARY KEY (userId,added_date,videoId)
37. Commit Log Compression
• Segments are compressed by time interval
• Higher throughput under high writes
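A rough sketch of why compressing a batch of log data helps: repetitive mutations shrink considerably, so fewer bytes reach the disk. Deflate from the JDK is used here only because it needs no extra dependency; the class name, the sample data, and the choice of compressor are assumptions, not a description of the commit log's internals.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Toy illustration of compressing one batch of log data (hypothetical class).
final class SegmentCompressionSketch {
    // Compress a whole batch of serialized mutations in one go.
    static byte[] compress(byte[] segment) {
        Deflater deflater = new Deflater();
        deflater.setInput(segment);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int written = deflater.deflate(buffer);
            out.write(buffer, 0, written);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Restore the original bytes; the caller supplies the uncompressed length.
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] result = new byte[originalLength];
        try {
            int offset = 0;
            while (offset < originalLength && !inflater.finished()) {
                int read = inflater.inflate(result, offset, originalLength - offset);
                if (read == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // input exhausted before the expected length was reached
                }
                offset += read;
            }
            if (offset != originalLength) {
                throw new IllegalStateException("unexpected uncompressed length: " + offset);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return result;
    }

    // Made-up sample: the same mutation text repeated many times.
    static byte[] sampleSegment() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("UPDATE video_rating SET rating_counter = rating_counter + 1;\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```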
40. Smaller but significant changes
• Direct buffer decompression of reads
• Avoiding memory allocation on Index Summary search
• Repair concurrency improvements
• Optimal CRC32 implementation selected at runtime
42. Role Based Access Control
• Expands on the user-based auth introduced in 1.2
• Requires internal auth to be enabled
CREATE ROLE supervisor;
GRANT MODIFY ON user_credentials TO supervisor;
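Conceptually, the two statements above create a named role and attach a permission on a resource to it. A toy in-memory model of that idea is sketched below; the class `RoleSketch` and its methods are hypothetical and say nothing about how Cassandra stores or checks permissions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of roles and grants (hypothetical, for illustration only).
final class RoleSketch {
    private final Map<String, Set<String>> grants = new HashMap<>();

    // Mirrors CREATE ROLE: register the role with no permissions yet.
    void createRole(String role) {
        grants.putIfAbsent(role, new HashSet<>());
    }

    // Mirrors GRANT <permission> ON <resource> TO <role>.
    void grant(String permission, String resource, String role) {
        Set<String> granted = grants.get(role);
        if (granted == null) {
            throw new IllegalArgumentException("unknown role: " + role);
        }
        granted.add(permission + " ON " + resource);
    }

    // A request is allowed only if that exact permission was granted to the role.
    boolean isAllowed(String role, String permission, String resource) {
        return grants.getOrDefault(role, Collections.emptySet())
                .contains(permission + " ON " + resource);
    }
}
```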
43. When will it ship?
Maybe June
When 8099 is finished, it ships