2. This should be boring
• Talking to a database should not
be any of the following:
• Exciting
• "AH HA!"
• Confusing
git@github.com:rustyrazorblade/python-presentation.git
10. Async Queries (managed)
from cassandra.concurrent import execute_concurrent_with_args

stmt = """SELECT * FROM sensor_data WHERE sensor_id=?
          ORDER BY created_at DESC LIMIT 1"""

# prepared statement
select_statement = session.prepare(stmt)

# one parameter list per query; if sensor_id is a uuid column,
# the values likely need to be uuid.UUID objects rather than strings
sensor_ids = [["f472d5ff-0c76-404a-8044-038db416685e"],
              ["940cb741-d5b5-4c5d-82f5-bf1aa61c6d47"],
              ["497d4b2c-cba2-4d0f-bd80-42de612690fd"],
              ["1bdeac75-7e12-43ba-80b5-2d38405f9843"]]

# automatically manages concurrency
result = execute_concurrent_with_args(session, select_statement, sensor_ids)
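"Automatically manages concurrency" can be pictured as a bounded window of in-flight queries. The sketch below is a driver-free stand-in, not the driver's implementation: `run_concurrently` and `fake_query` are hypothetical names, and a thread pool substitutes for the driver's asynchronous requests. It returns one `(success, result_or_exception)` pair per argument list, in input order, mirroring the pairs that `execute_concurrent_with_args` returns.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(fn, args_list, concurrency=2):
    """Call fn(*args) for each args, at most `concurrency` at a time.

    Returns a list of (success, result_or_exception) in input order.
    """
    def attempt(args):
        try:
            return (True, fn(*args))
        except Exception as exc:  # report failures instead of aborting the run
            return (False, exc)

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map preserves the order of args_list
        return list(pool.map(attempt, args_list))


def fake_query(sensor_id):
    """Hypothetical stand-in for one SELECT; fails for an unknown sensor."""
    if sensor_id == "missing":
        raise LookupError(sensor_id)
    return {"sensor_id": sensor_id, "value": len(sensor_id)}


results = run_concurrently(fake_query, [["a"], ["missing"], ["ccc"]])
for success, outcome in results:
    print(success, outcome)
```

Capping the window keeps a large list of keys from flooding the cluster with requests all at once.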
11. Performance Considerations
• Like SQL, CQL supports IN(), but in general it's terrible for performance
• Results in more GC pressure & performance problems
• BATCH has the same issue
• Failure to get a single result causes the entire IN() or batch to retry
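The retry point can be illustrated without a cluster. In this sketch (a simulation only, with hypothetical `fetch_one` and `fetch_each` helpers and a key that is made to fail once on purpose), each key gets its own query, so a failure re-runs just that key instead of the whole set.

```python
calls = []           # records every simulated query, for illustration
failed_once = set()  # keys that have already failed one time


def fetch_one(key):
    """Hypothetical single-key query; key "b" fails on its first attempt."""
    calls.append(key)
    if key == "b" and key not in failed_once:
        failed_once.add(key)
        raise TimeoutError(key)
    return key.upper()


def fetch_each(keys, max_attempts=3):
    """Fetch keys one query at a time, retrying only the keys that failed."""
    results = {}
    pending = list(keys)
    for _ in range(max_attempts):
        still_failing = []
        for key in pending:
            try:
                results[key] = fetch_one(key)
            except TimeoutError:
                still_failing.append(key)
        pending = still_failing
        if not pending:
            break
    return results


rows = fetch_each(["a", "b", "c"])
print(rows)
print(calls)
```

Keys "a" and "c" are queried once each; only "b" is queried a second time.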
13. Defining Models
• Each model maps to a single table
• Every model inherits from cassandra.cqlengine.models.Model
• Define fields in your table programmatically
• Collections map to native Python types (list, set, dict)
• Table management included (no need to write ALTER)
14. Model with Collections
• Sets & Maps are most useful
• Use to denormalize
• Lists can have performance issues if misused
from uuid import uuid1, uuid4
from cassandra.cqlengine.columns import Map, Set, Text, TimeUUID, UUID
from cassandra.cqlengine.models import Model

class Message(Model):
    message_id = TimeUUID(primary_key=True, default=uuid1)
    subject = Text()
    body = Text()
    addressed_to = Set(UUID)

class Photo(Model):
    photo_id = UUID(primary_key=True, default=uuid4)
    title = Text()
    likes = Map(UUID, Text)
15. Clustering Keys
• Automatically determined by
ordering in model
• First primary key is partition key
• The rest are clustering keys
class UsersInGroup(Model):
    group_id = UUID(primary_key=True)  # partition key
    user_id = UUID(primary_key=True)   # clustering key
    is_admin = Boolean()

class UsersInGroupByState(Model):
    group_id = UUID(primary_key=True, partition_key=True)  # partition key, part 1
    state = Text(primary_key=True, partition_key=True)     # partition key, part 2
    user_id = UUID(primary_key=True)                       # clustering key
    is_admin = Boolean(default=False)
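The ordering rule can be sketched as a small function that turns column flags into a CQL PRIMARY KEY clause. This is an illustration of the rule as stated on the slide, not cqlengine's own code; `primary_key_clause` and the tuple layout `(name, primary_key, partition_key)` are assumptions made for the example.

```python
def primary_key_clause(columns):
    """Build a CQL PRIMARY KEY clause from (name, primary_key, partition_key) tuples.

    Columns flagged partition_key form the partition key; if none are flagged,
    the first primary key column is used. Remaining primary key columns become
    clustering keys, in declaration order.
    """
    keys = [(name, part) for name, primary, part in columns if primary]
    partition = [name for name, part in keys if part]
    if not partition:
        # no explicit partition_key flag: the first primary key column is it
        partition = [keys[0][0]]
    clustering = [name for name, _ in keys if name not in partition]
    parts = ["(" + ", ".join(partition) + ")"] + clustering
    return "PRIMARY KEY (" + ", ".join(parts) + ")"


users_in_group = [
    ("group_id", True, False),
    ("user_id", True, False),
    ("is_admin", False, False),
]
users_in_group_by_state = [
    ("group_id", True, True),
    ("state", True, True),
    ("user_id", True, False),
    ("is_admin", False, False),
]

print(primary_key_clause(users_in_group))
print(primary_key_clause(users_in_group_by_state))
```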
17. Lightweight Transactions
• Uses Paxos for consensus
• IF NOT EXISTS for INSERT
• IF FIELD=VALUE for UPDATE
• Use sparingly - requires
several round trips
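Both conditional forms behave like compare-and-set. The sketch below simulates only their semantics on an in-memory dict; it involves no Paxos and no network, and `insert_if_not_exists` and `update_if` are hypothetical helper names. Each returns whether the write was applied, much like the `[applied]` column in the result of a conditional CQL write.

```python
table = {}  # in-memory stand-in for a table keyed by primary key


def insert_if_not_exists(key, row):
    """Simulate INSERT ... IF NOT EXISTS: write only when the key is absent."""
    if key in table:
        return False
    table[key] = dict(row)
    return True


def update_if(key, field, expected, changes):
    """Simulate UPDATE ... IF field = expected: write only when the condition holds."""
    row = table.get(key)
    if row is None or row.get(field) != expected:
        return False
    row.update(changes)
    return True


first = insert_if_not_exists("jon", {"email": "old@example.com"})
second = insert_if_not_exists("jon", {"email": "other@example.com"})
stale = update_if("jon", "email", "wrong@example.com", {"email": "new@example.com"})
fresh = update_if("jon", "email", "old@example.com", {"email": "new@example.com"})
print(first, second, stale, fresh, table["jon"]["email"])
```

The caller has to check the returned flag: a rejected write is a normal outcome, not an error.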
18. Batches
• Use only to maintain multiple views (for consistency purposes)
from cassandra.cqlengine.columns import Text
from cassandra.cqlengine.models import Model
from cassandra.cqlengine.query import BatchQuery

class User(Model):
    name = Text(primary_key=True)
    twitter = Text()
    email = Text()

class TwitterToUser(Model):
    twitter = Text(primary_key=True)
    name = Text()

(twitter, name) = ("rustyrazorblade", "jon")

# both views are written in one batch so they stay consistent
with BatchQuery() as b:
    User.batch(b).create(name=name, twitter=twitter)
    TwitterToUser.batch(b).create(twitter=twitter, name=name)
19. Fetching a Row
• Model.get() can be used to
fetch a single row
• Raises a DoesNotExist exception if not found
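Since a missing row surfaces as an exception, callers often wrap the lookup. The sketch below shows that pattern with a driver-free stand-in: `FakeModel` and `get_or_none` are hypothetical, and the only detail borrowed from cqlengine is that each model class exposes its own `DoesNotExist` exception.

```python
class FakeModel:
    """Hypothetical stand-in for a cqlengine model, backed by a dict."""

    class DoesNotExist(Exception):
        pass

    _rows = {"jon": {"name": "jon", "twitter": "rustyrazorblade"}}

    @classmethod
    def get(cls, name):
        try:
            return cls._rows[name]
        except KeyError:
            raise cls.DoesNotExist(name) from None


def get_or_none(model, **filters):
    """Return the matching row, or None when the model raises DoesNotExist."""
    try:
        return model.get(**filters)
    except model.DoesNotExist:
        return None


found = get_or_none(FakeModel, name="jon")
missing = get_or_none(FakeModel, name="nobody")
print(found, missing)
```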