Talk URL: https://kafka-summit.org/sessions/kafka-102-streams-tables-way/
Video recording: https://www.confluent.io/kafka-summit-san-francisco-2019/kafka-102-streams-and-tables-all-the-way-down
Abstract: Streams and Tables are the foundation of event streaming with Kafka, and they power nearly every conceivable use case: payment processing, change data capture, streaming ETL, real-time alerting for connected cars, and even the lowly WordCount example. Tables are familiar to most of us from the world of databases, whereas streams are a rather new concept. Trying to leverage Kafka without understanding streams and tables is like building a rocket ship without understanding the laws of physics: a mission bound to fail. In this session for developers, operators, and architects alike, we take a deep dive into these two fundamental primitives of Kafka’s data model. We discuss how streams and tables, including global tables, relate to each other and to topics, partitioning, compaction, and serialization (Kafka’s storage layer), and how they interplay to process data, react to data changes, and manage state in an elastic, scalable, fault-tolerant manner (Kafka’s compute layer). Developers will better understand how to use streams and tables to build event-driven applications with Kafka Streams and KSQL, and we answer questions such as “How can I query my tables?” and “What is data co-partitioning, and how does it affect my join?”. Operators will better understand how these applications run in production, answering questions such as “How do I scale my application?” and “When my application crashes, how will it recover its state?”. At a higher level, we will explore how Kafka uses streams and tables to turn the database inside out and put it back together.
Kafka 102: Streams and Tables All the Way Down | Kafka Summit San Francisco 2019
Slide 1
Kafka 102:
Streams and Tables All the Way Down
Kafka Summit San Francisco, Sep 2019
Michael G. Noll
Technologist, Office of the CTO, Confluent
@miguno
Slide 8
Event Streaming Paradigm
Kafka is highly scalable, durable, persistent, maintains order, and is fast (low latency).
Example: Kafka as the source of truth at The New York Times, storing every article since 1851. Normalized assets (images, articles, bylines, etc.) are denormalized into a “Content View”.
https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
Streams record history, even hundreds of years of it.
Slide 9
Streams record history: “The ledger of sales.”
Tables represent state: “The sales totals.”
Slide 10
1. e4 e5
2. Nf3 Nc6
3. Bc4 Bc5
4. d3 Nf6
5. Nbd2
Streams record history: “The sequence of moves.”
Tables represent state: “The state of the board.”
Slide 12
The key to mutability is … the event.key!
                                                 Stream    Table
Has unique key constraint?                       No        Yes
First event with key ‘alice’ arrives             INSERT    INSERT
Another event with key ‘alice’ arrives           INSERT    UPDATE
Event with key ‘alice’ and value == null arrives INSERT    DELETE
Event with key == null arrives                   INSERT    <ignored>

RDBMS analogy: a Stream is ~ a Table that has no unique key and is append-only.
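The table above can be sketched in a few lines of Python. This is an illustrative model, not Kafka's actual implementation: a stream appends every event, while a table upserts by key and treats a null value as a tombstone.

```python
# A minimal sketch of how a stream and a table interpret
# the same sequence of keyed events.

def apply_to_stream(stream, key, value):
    """A stream is append-only: every event is an INSERT."""
    stream.append((key, value))

def apply_to_table(table, key, value):
    """A table upserts by key; a null value is a tombstone (DELETE);
    an event without a key is ignored."""
    if key is None:
        return                # <ignored>
    if value is None:
        table.pop(key, None)  # DELETE
    else:
        table[key] = value    # INSERT or UPDATE

stream, table = [], {}
for key, value in [("alice", "Paris"), ("alice", "Rome"), ("alice", None)]:
    apply_to_stream(stream, key, value)
    apply_to_table(table, key, value)

# The stream retains all three events; the table ends up empty because
# the final tombstone deleted key 'alice'.
```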
Slide 15
Streams record history. Tables represent state.
Duality: an aggregation (like SUM, COUNT) turns a stream into a table, and the table’s changes form a stream again.*
*See Streams and Tables: Two Sides of the Same Coin, M. Sax et al., BIRTE ’18
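The duality can be sketched in plain Python (illustrative, not the Kafka Streams API): a COUNT aggregation folds a stream into a table, each table update is emitted as a change event, and replaying that change stream reconstructs the table exactly.

```python
# Sketch of the stream-table duality: aggregate a stream into a table
# while emitting a changelog, then rebuild the table from the changelog.

def aggregate_count(stream_events):
    table, changelog = {}, []
    for key, _value in stream_events:
        table[key] = table.get(key, 0) + 1   # update the table...
        changelog.append((key, table[key]))  # ...and emit the change
    return table, changelog

def replay(changelog):
    """Replaying the stream of changes reconstructs the table."""
    table = {}
    for key, count in changelog:
        table[key] = count
    return table

events = [("alice", "e4"), ("bob", "e5"), ("alice", "Nf3")]
table, changelog = aggregate_count(events)
# table == {"alice": 2, "bob": 1}, and replay(changelog) == table
```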
Slide 19
Data storage of a Kafka topic is partitioned
which impacts data processing as we see later
[Diagram: a topic’s storage is split across partitions P1–P4]
Slide 20
Producers determine target partition of an event
through a partitioning function ƒ(event.key)
[Diagram: two producer clients write to a topic’s partitions P1–P4; an event is sent and appended to partition 1]
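A default-style partitioning function can be sketched as a key hash modulo the partition count. Kafka's default producer partitioner actually uses murmur2 on the serialized key; CRC32 below is just a stand-in to keep the sketch dependency-free.

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Map an event key deterministically to a partition
    (stand-in for Kafka's murmur2-based default partitioner)."""
    return zlib.crc32(key) % num_partitions

# The same key always maps to the same partition...
assert partition_for(b"alice", 4) == partition_for(b"alice", 4)
# ...but changing the partition count generally changes the mapping,
# which is one way the same key can end up in different partitions.
```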
Slide 21
Events with the same key should be in the same partition
to ensure proper ordering of related events
to ensure that processing by consumers returns the expected results
[Diagram: yellow events (same key) are always stored in partition 3, no matter which producer client sends them]
Slide 22
Top causes for the same key ending up in different partitions:
1. You increased or decreased the number of partitions
2. A producer uses a custom partitioner
→ Be careful in these situations!
Slide 23
Partitions play a central role in Kafka
Topics are partitioned. Partitions enable scalability, elasticity, fault-tolerance.
In the storage layer (brokers), data is stored in, replicated based on, log-compacted based on, and read from and written to partitions. In the processing layer (Kafka Streams, KSQL, etc.), data is joined and processed based on partitions.
Slide 25
Topics live in the Kafka storage layer, the ‘filesystem’ (brokers).
Streams and tables live in the Kafka processing layer (Kafka Streams, KSQL).
Slide 26
Topics vs. Streams and Tables
Storage layer (brokers): a Topic is raw bytes, e.g. 00100 11101 11000 00011 00100 00110
Processing layer (KSQL, Kafka Streams): a Stream is the topic plus a schema (serdes), e.g. alice→Paris, bob→Sydney, alice→Rome
A Table is the stream plus an aggregation, e.g. alice→Rome, bob→Sydney
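This layering can be sketched as follows (JSON stands in for the serde, and the field names are made up for illustration): the topic holds raw bytes, the serde turns them into a typed stream, and an aggregation over the stream yields a table.

```python
import json

# Storage layer: a topic is just raw bytes.
raw_topic = [b'{"key": "alice", "value": "Paris"}',
             b'{"key": "bob", "value": "Sydney"}',
             b'{"key": "alice", "value": "Rome"}']

# Processing layer: a stream is the topic read through a schema (serde).
stream = [(e["key"], e["value"]) for e in map(json.loads, raw_topic)]

# A table is the stream plus an aggregation (here: latest value per key).
table = {}
for key, value in stream:
    table[key] = value

# table == {"alice": "Rome", "bob": "Sydney"}
```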
Slide 28
Streams and Tables are partitioned, too
And so is their processing!
[Diagram: partitions P1–P4 feed Stream Tasks 1–4; each task holds its partition’s share of the KTable / TABLE state, e.g. 2 GB, 3 GB, 5 GB, and 2 GB]
Slide 29
Global Tables give complete data to every task
Great for joins without re-keying the input, or for broadcasting information
[Diagram: with a GlobalKTable, every stream task holds the full data of all partitions: 2 + 3 + 5 + 2 = 12 GB each]
Slide 30
Tables are cached in ‘state stores’ on local disk
Tables and other state don’t need to fit into RAM.
This enables large-scale state and saves $$$ on cloud instances.
But: The ‘source of truth’ of a table is its underlying topic!
[Diagram: Stream Task 1 caches its Table and Global Table on local disk under /var (default storage engine: RocksDB)]
Slide 31
Tables and other state are always fault-tolerant
because they are backed by Kafka topics (cf. event sourcing).
[Diagram: Stream Task 1 on machine A (KSQL, Kafka Streams) performs a streaming backup via the network into the table’s changelog topic (log-compacted) in Kafka storage; after failover, Stream Task 1 on machine B performs a streaming restore via the network from that topic]
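The backup/restore cycle can be sketched like this (a plain list stands in for the changelog topic; no actual Kafka involved): every local state update is also appended to the changelog, and after a crash the state is rebuilt by replaying it.

```python
changelog_topic = []  # stand-in for the table's changelog topic in Kafka

def run_task_on_machine_a(events):
    """Every local state update is also appended to the changelog
    (the streaming backup)."""
    table = {}
    for key, value in events:
        table[key] = value
        changelog_topic.append((key, value))
    return table

def recover_task_on_machine_b():
    """After failover, the state is rebuilt by replaying the changelog
    (the streaming restore)."""
    table = {}
    for key, value in changelog_topic:
        table[key] = value
    return table

state_a = run_task_on_machine_a([("alice", "Paris"), ("alice", "Rome")])
# ...machine A crashes; machine B takes over:
state_b = recover_task_on_machine_b()
assert state_a == state_b == {"alice": "Rome"}
```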
Slide 32
Standby Replicas speed up application recovery
App instances can optionally maintain copies of another instance’s local state stores to minimize failover times
[Diagram: with num.standby.replicas = 0 (default), App Instance 1 runs Stream Task 1 and App Instance 2 runs Stream Task 2; with num.standby.replicas = 1, App Instance 1 also maintains a standby copy of Stream Task 2, so on failure Stream Task 2 fails over to App Instance 1 quickly]
Slide 34
Tables <-> topic log-compaction
A table’s underlying topic is compacted by default
to save Kafka storage space
to speed up failover and recovery for processing
[Diagram: the full log contains multiple events per key (alice→Paris, alice→Bern, alice→Rome, bob→Sydney, bob→Zurich); after compaction only the latest value per key remains: alice→Rome and bob→Zurich]
Note: Compaction intentionally removes part of a table’s history. If you need the full
history and don’t have the historic data elsewhere, consider disabling compaction.
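Log compaction can be sketched as "keep only the latest event per key, and drop tombstoned keys" (an illustrative model; real compaction runs incrementally in the background on log segments):

```python
def compact(log):
    """Keep only the most recent value per key; a None value is a
    tombstone that removes the key entirely."""
    latest = {}
    for key, value in log:   # later events overwrite earlier ones
        latest[key] = value
    return [(k, v) for k, v in latest.items() if v is not None]

log = [("alice", "Paris"), ("alice", "Bern"), ("bob", "Sydney"),
       ("alice", "Rome"), ("bob", "Zurich")]
assert compact(log) == [("alice", "Rome"), ("bob", "Zurich")]
assert compact([("alice", "Rome"), ("alice", None)]) == []  # tombstone
```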
Slide 35
TL;DR for Log Compaction
Have a Stream?
→ Disable log compaction for its topic (= default)
Have a Table?
→ Enable log compaction for its topic (= default)
Disable it only when needed; see the previous slide
Slide 36
Topics vs. Streams and Tables

Concept      | Schema         | Partitioned | Unbounded | Ordering | Mutable | Unique key constraint | Fault-tolerant
Storage layer:
Topic        | No (raw bytes) | Yes         | Yes       | Yes      | No      | No                    | Yes
Processing layer:
Stream       | Yes            | Yes         | Yes       | Yes      | No      | No                    | Yes
Table        | Yes            | Yes         | No*       | No       | Yes     | Yes                   | Yes
Global Table | Yes            | No          | No*       | No       | Yes     | Yes                   | Yes

*Generally speaking the answer is Yes but, in practice, tables are almost always bounded due to a finite key space.
Slide 38
Max processing parallelism = #input partitions
[Diagram: a topic’s partitions P1–P4 are consumed by Application Instances 1–4; Application Instances 5 and 6 sit idle because there are only four partitions]
→ Need higher parallelism? Increase the original topic’s partition count.
→ Higher parallelism for just one use case? Derive a new topic from the original with a higher partition count. Lower its retention to save storage.
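The parallelism cap can be sketched with a simple partition-to-instance assignment (round-robin here; Kafka's actual partition assignors differ in detail): with more instances than partitions, the surplus instances get no work.

```python
def assign(partitions, instances):
    """Spread partitions over instances round-robin; with more
    instances than partitions, the surplus instances stay idle."""
    assignment = {inst: [] for inst in instances}
    for idx, partition in enumerate(partitions):
        assignment[instances[idx % len(instances)]].append(partition)
    return assignment

partitions = ["P1", "P2", "P3", "P4"]
instances = [f"instance-{i}" for i in range(1, 7)]   # 6 instances
work = assign(partitions, instances)
# instance-5 and instance-6 are idle: only 4 partitions exist
assert work["instance-5"] == [] and work["instance-6"] == []
```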
Slide 39
How to increase the number of partitions when needed
KSQL example: the statement below creates a new stream with the desired number of partitions.
CREATE STREAM products_repartitioned
  WITH (PARTITIONS=30) AS
  SELECT * FROM products;
Slide 40
‘Hot’ partitions can be problematic, often caused by
1. Events not being evenly distributed across partitions
2. Events being evenly distributed, but certain events taking longer to process

Strategies to address hot partitions include
1a. Ingress: find a better partitioning function ƒ(event.key) for producers
1b. Storage: re-partition the data into a new topic if you can’t change the original
2. Scale processing vertically, e.g. with more powerful CPU instances
[Diagram: partitions P1–P4, with one hot partition receiving a disproportionate share of events]
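Cause 1 can be made visible by counting events per partition: a heavily skewed key distribution concentrates load on one partition. CRC32 modulo stands in for the partitioner, and the key names are made up.

```python
import zlib
from collections import Counter

def partition_for(key: str, num_partitions: int) -> int:
    return zlib.crc32(key.encode()) % num_partitions

# 90% of events share one key, so that key's partition runs 'hot'.
keys = ["big-customer"] * 90 + [f"user-{i}" for i in range(10)]
load = Counter(partition_for(k, 4) for k in keys)
hot_partition = max(load, key=load.get)
assert load[hot_partition] >= 90
```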
Slide 41
Joining Streams and Tables
Data must be ‘co-partitioned’
[Diagram: a Stream joined with a Table produces a join output (a stream)]
Slide 42
Joining Streams and Tables
Data must be ‘co-partitioned’
[Diagram: the stream-side event (alice, Paris) from the stream’s partition P2 has a matching entry for alice in the table’s partition P2, so the join emits female]
Slide 43
Joining Streams and Tables
Data is looked up in same partition number
Scenario 2
[Diagram: key ‘alice’ exists in multiple table partitions (male in P1, female in P2); the entry in P2 (female) is used because the stream-side event (alice, Paris) comes from the stream’s partition P2]
Slide 44
Joining Streams and Tables
Data is looked up in same partition number
Scenario 3
[Diagram: key ‘alice’ exists only in the table’s P1, which is != P2; the stream-side event (alice, Paris) from the stream’s P2 finds no match in the table’s P2, so the join result is null]
Slide 45
Data co-partitioning requirements in detail
1. Same keying scheme for both input sides
2. Same number of partitions
3. Same partitioning function ƒ(event.key)
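A stream-table join under these three rules can be sketched as: compute the stream event's partition number, then look up the key in the table partition with that same number. CRC32 stands in for the shared partitioning function.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    return zlib.crc32(key.encode()) % num_partitions

def stream_table_join(stream_event, table_partitions):
    """Look up the stream event's key in the table partition with the
    SAME partition number; the match is None when the key is absent."""
    key, value = stream_event
    p = partition_for(key, len(table_partitions))
    return (key, value, table_partitions[p].get(key))

num_partitions = 3
table_partitions = [dict() for _ in range(num_partitions)]
# The table side is written with the same partitioner -> co-partitioned:
table_partitions[partition_for("alice", num_partitions)]["alice"] = "female"

assert stream_table_join(("alice", "Paris"), table_partitions) == \
    ("alice", "Paris", "female")
```

If the table side had been written with a different partitioner or partition count, the lookup would land in the wrong partition and silently return None, which is exactly the failure mode of scenario 3.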
Further Reading on Joining Streams and Tables:
https://www.confluent.io/kafka-summit-sf18/zen-and-the-art-of-streaming-joins
https://docs.confluent.io/current/ksql/docs/developer-guide/partition-data.html
Slide 46
Why is that so?
Because of how input data is mapped to stream tasks
[Diagram: Stream Task 2 reads the stream’s P2 and the table’s P2 from the storage layer via the network and maintains the corresponding processing state; each stream task is wired to the same partition number on every input]
Slide 47
How to re-partition your data when needed
KSQL example: the statement below creates a new stream with a changed number of partitions and a new field as the event.key (so that its data is now correctly co-partitioned for joining).
CREATE STREAM products_repartitioned
WITH (PARTITIONS=42) AS
SELECT * FROM products
PARTITION BY product_id;
Slide 48
Joining Streams and Global Tables
No need to worry about co-partitioning!
That’s because each stream task has the full data from all the table’s partitions:
[Diagram: a Stream joined with a Global Table produces a join output (a stream); every stream task, e.g. Stream Task 1, holds 2 + 3 + 5 + 2 = 12 GB, i.e. the entire table]
Slide 49
Capacity Planning and Sizing
Sorry, not covered here! I recommend:
KSQL Performance Tuning for Fun and Profit, by Nick Dearden
October 1, 2019 @ 2:50-3:30 pm, Stream Processing track
https://kafka-summit.org/sessions/ksql-performance-tuning-fun-profit/
Slide 50
Streams and Tables in KSQL
Process event streams to create new, continuously updated streams or tables with a push query:

CREATE TABLE OrderTotals AS SELECT * FROM ... EMIT CHANGES

[Diagram: Streams 01–03 feed a continuously running push query that maintains a Table]
Slide 51
Streams and Tables in KSQL
Query tables similar to a relational database with a pull query (new feature):

SELECT * FROM OrderTotals WHERE region = 'Europe'

[Diagram: a pull query against a Table returns a result, like a lookup in a relational database]
Slide 52
Streams and Tables in KSQL
Query tables from other apps with push or pull queries
Other applications (Java, Go, Python, etc.) can directly query tables via the network (KSQL REST API):

SELECT * FROM OrderTotals WHERE region = 'Europe'
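From Python, such a query is sent as a JSON payload to the KSQL server's REST API. The sketch below only builds the request body; the server address and exact endpoint path are deployment details and not shown here.

```python
import json

def build_query_request(sql: str) -> str:
    """Build the JSON body for a query request against the
    KSQL REST API (statement goes in the "ksql" field)."""
    return json.dumps({"ksql": sql, "streamsProperties": {}})

body = build_query_request(
    "SELECT * FROM OrderTotals WHERE region = 'Europe';")
# POST this body to the KSQL server's query endpoint over HTTP.
```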
Slide 53
Streaming import/export of Tables
KSQL integrates with Kafka Connect
CREATE SOURCE CONNECTOR my-postgres-jdbc WITH (
  connector.class = "io.confluent.connect.jdbc.JdbcSourceConnector",
  connection.url = "jdbc:postgresql://dbserver:5432/my-db", ...);
New feature
[Diagram: KSQL controls Kafka Connect, which controls the streaming import/export of tables]
Slide 54
KSQL example use case
Creating an event-driven dashboard from a customer database
[Diagram: a customers table in a database → Kafka Connect streams change events into a Stream → aggregations are computed in real time into a Table → the continuously updating results flow as a Stream into Elasticsearch, where they land as a Table (index) powering the dashboard]