Kafka Streams Windowing Behind the Curtain, Neil Buesing, Principal Solutions Architect, Rill
https://www.meetup.com/TwinCities-Apache-Kafka/events/279316299/
What Are The Drone Anti-jamming Systems Technology?
Kafka streams windowing behind the curtain
1. Kafka Streams Windowing
Behind the Curtain
Neil Buesing
Principal Solutions Architect, Rill
Confluent Meetup
July 15th, 2021
2. • Operational intelligence for data in motion
• Easy In - Easy Up - Easy Out
• Work with customers to build & modernize their pipelines
What Does Rill Do?
3. • Principal Solutions Architect
• Help customers with pipelines leveraging Apache Druid, Apache Kafka, Kafka
Streams, Apache Beam, and other technologies.
• Data Modeling and Governance
• Rill Data / Apache Druid
What Do I Do?
4. • Overview of the Windowing Options within Kafka Streams
• Windowing Use-Cases
• Examples of Aggregate Windowing
• What each windowing options does within RocksDB and the -changelog topics
• The key serialization of the -changeling topics
• Developer Tools & Ideas
Takeaways
5. • Stream / Table Duality
• Compacted Topics need stateful front-end
• Stateful Operations
• Finite datasets — tables
• Boundaries for unbounded data — windows
Why Kafka Streams
6. Windowing Options
Window Type Time boundary Examples
# records for key
@ point in time
Fixed Size
Tumbling Epoch
[8:00, 8:30)
[8:30, 9:00)
single
Yes
Hopping Epoch
[8:00, 8:30)
[8:15, 8:45)
[8:30, 8:45)
[8:45, 9:00)
constant
Yes
Sliding Record
[8:02, 8:32]
[8:20, 8:50]
[8:21, 8:51]
variable
Yes
Session Record
[8:02, 8:02]
[8:02, 8:10]
[9:10, 12:56]
single
(by tombstoning)
No
11. Windowing Options
Window Type Time boundary Examples
# records for key
@ point in time
Fixed Size
Tumbling Epoch
[8:00, 8:30)
[8:30, 9:00)
single
Yes
Hopping Epoch
[8:00, 8:30)
[8:15, 8:45)
[8:30, 8:45)
[8:45, 9:00)
constant
Yes
Sliding Record
[8:02, 8:32]
[8:20, 8:50]
[8:21, 8:51]
variable
Yes
Session Record
[8:02, 8:02]
[8:02, 8:10]
[9:10, 12:56]
single
(by tombstoning)
No
12. • Good
• Web Visitors
• Products Purchased
• Inventory Management
• IoT Sensors
• Ad Impressions*
• Bad
• Fraud Detection
• User Interactions
• Composition*
Tumbling Time Windows
* event timestamp & grace period
13. • Good
• Web Visitors
• Products Purchased
• Fraud Detection
• IoT Sensors
• Bad
• User Interactions
• Inventory Management
• Composition
• Ad Impressions
Hopping Time Windows
14. • Good
• User Interactions
• Fraud Detection
• Usage Changes
• Bad
• Composition
• IoT Sensors
Sliding Time Windows
15. • Good
• User Interactions / Click Stream
• User Behavior Analysis
• IoT device - session oriented
(running)
• Bad
• Data Analytics (Generalizations)
• IoT sensors - always on
(pacemaker)
Session Windows
16. • Good
• Composition*
• Finite Datasets
• Bad
• Fraud Detection
• Monitoring
• Unbounded data*
No Windows
* manual tombstoning
17. Order Processing
Order Analytics
Demo Applications
orders-purchase orders-pickup
repartition
attach
user
& store
attach
line item
pricing
assemble
product
analytics
pickup-order-handler-purchase-order-join-product-repartition
product-repartition product-stats
State
23. • Emitting Results
• Suppression
• Commit Time
• Window Boundaries
• Epoch vs. Event
• Long Windows*
• Join Windowing
• RocksDB Tuning
• RocksDB state store instances…
What Next
24. • Overview of the Windowing Options within Kafka Streams
• Windowing Use-Cases
• Examples of Aggregate Windowing
• What each windowing options does within RocksDB and the -changelog topics
• The key serialization of the -changeling topics
• Advance Considerations
Takeaways