SlideShare a Scribd company logo
1 of 28
Kafka Streams Windowing
Behind the Curtain
Neil Buesing
Principal Solutions Architect, Rill
Confluent Meetup
July 15th, 2021
• Operational intelligence for data in motion
• Easy In - Easy Up - Easy Out
• Work with customers to build & modernize their pipelines
What Does Rill Do?
• Principal Solutions Architect
• Help customers with pipelines leveraging Apache Druid, Apache Kafka, Kafka
Streams, Apache Beam, and other technologies.
• Data Modeling and Governance
• Rill Data / Apache Druid
What Do I Do?
• Overview of the Windowing Options within Kafka Streams
• Windowing Use-Cases
• Examples of Aggregate Windowing
• What each windowing options does within RocksDB and the -changelog topics
• The key serialization of the -changeling topics
• Developer Tools & Ideas
Takeaways
• Stream / Table Duality
• Compacted Topics need stateful front-end
• Stateful Operations
• Finite datasets — tables
• Boundaries for unbounded data — windows
Why Kafka Streams
Windowing Options
Window Type Time boundary Examples
# records for key
@ point in time
Fixed Size
Tumbling Epoch
[8:00, 8:30)
[8:30, 9:00)
single
Yes
Hopping Epoch
[8:00, 8:30)
[8:15, 8:45)
[8:30, 8:45)
[8:45, 9:00)
constant
Yes
Sliding Record
[8:02, 8:32]
[8:20, 8:50]
[8:21, 8:51]
variable
Yes
Session Record
[8:02, 8:02]
[8:02, 8:10]
[9:10, 12:56]
single
(by tombstoning)
No
001 / 12:00
002 / 12:00
003 / 12:00
001 / 12:30
002 / 12:30
003 / 12:30
12:00 12:15 12:30 12:45 1:00
01
5
Tumbling Time Windows
01
3
02
9
03
5
01
4
5 8 12
5
9
03
11
02
3
02
6
02
5
02
1
03
2
03
7
01
7
01
5
01
3
16 7 9
6 11 12
3 8 15
12
Hopping Time Windows
001/ 12:00
002 / 12:00
003 / 12:00
001 / 12:30
002 / 12:30
003 / 12:30
12:00 12:15 12:30 12:45 1:00
01
5
01
3
02
9
03
5
01
4
5 8 12
5
9
03
11
02
3
02
6
02
5
02
1
03
2
03
7
01
7
01
5
01
3
16 7 9
6 11 12
3 8 15
001 / 11:45
5 8 12
001 / 12:45
002 / 11:45 002 / 12:15 002 / 12:45
003 / 11:45 003 / 12:45
003 / 12:15
3 8 15
1
18 20
11
5
9
12
3 9 14
Sliding Time Windows
001 / 11:31:00.000
12:00 12:15 12:30 12:45 1:00
01
5
01
3
01
4
5
01
5
01
3
001 / 11:33:00.000
8
001 / 11:47.00.000
12
001 / 12:31:00.001
7
001 / 11:33.00:001
4
001 / 11:45:00.000
3
12:01
12:03
12:12
12:48
12:55
001 / 12:31:00.001
3
001 / 11:55:00.000
001 / 11:45:00.001
8
5
12:00 12:15 12:30 12:45 1:00
01
5
Session Windows
01
3
02
9
03
5
01
4
03
11
02
3
02
6
02
5
02
1
03
2
03
7
01
7
01
5
01
3
5 8 12 3 8 15
5
9
16 23 25
18 23 24
12
Windowing Options
Window Type Time boundary Examples
# records for key
@ point in time
Fixed Size
Tumbling Epoch
[8:00, 8:30)
[8:30, 9:00)
single
Yes
Hopping Epoch
[8:00, 8:30)
[8:15, 8:45)
[8:30, 8:45)
[8:45, 9:00)
constant
Yes
Sliding Record
[8:02, 8:32]
[8:20, 8:50]
[8:21, 8:51]
variable
Yes
Session Record
[8:02, 8:02]
[8:02, 8:10]
[9:10, 12:56]
single
(by tombstoning)
No
• Good
• Web Visitors
• Products Purchased
• Inventory Management
• IoT Sensors
• Ad Impressions*
• Bad
• Fraud Detection
• User Interactions
• Composition*
Tumbling Time Windows
* event timestamp & grace period
• Good
• Web Visitors
• Products Purchased
• Fraud Detection
• IoT Sensors
• Bad
• User Interactions
• Inventory Management
• Composition
• Ad Impressions
Hopping Time Windows
• Good
• User Interactions
• Fraud Detection
• Usage Changes
• Bad
• Composition
• IoT Sensors
Sliding Time Windows
• Good
• User Interactions / Click Stream
• User Behavior Analysis
• IoT device - session oriented
(running)
• Bad
• Data Analytics (Generalizations)
• IoT sensors - always on
(pacemaker)
Session Windows
• Good
• Composition*
• Finite Datasets
• Bad
• Fraud Detection
• Monitoring
• Unbounded data*
No Windows
* manual tombstoning
Order Processing
Order Analytics
Demo Applications
orders-purchase orders-pickup
repartition
attach
user
& store
attach
line item
pricing
assemble
product
analytics
pickup-order-handler-purchase-order-join-product-repartition
product-repartition product-stats
State
Materialized!<> store = Materialized.as("po").withCachingDisabled();
builder.<String, PurchaseOrder>stream(opt.getTopic())
.groupByKey(Grouped.as("groupByKey"))
.windowedBy(TimeWindows.of(Duration.ofSeconds(opt.getWindowSize()))
.grace(Duration.ofSeconds(opt.getGracePeriod())))
.aggregate(Streams!::initialize,
Streams!::aggregator,
Named.as("aggregate"),
store)
.toStream(Named.as("toStream"))
.selectKey((k, v) !-> k.key() + " " + toStr(k.window()) + "," + ")")
.mapValues(Streams!::minimize)
.to(opt.topic(), Produced.as("to"));
Materialized!<> store = Materialized.as("po").withCachingDisabled();
builder.<String, PurchaseOrder>stream(opt.getTopic())
.groupByKey(Grouped.as("groupByKey"))
.windowedBy(TimeWindows.of(Duration.ofSeconds(opt.getWindowSize()))
.grace(Duration.ofSeconds(opt.getGracePeriod())))
.aggregate(Streams!::initialize,
Streams!::aggregator,
Named.as("aggregate"),
store)
.toStream(Named.as("toStream"))
.selectKey((k, v) !-> k.key() + " " + toStr(k.window()) + "," + ")")
.mapValues(Streams!::minimize)
.to(opt.topic(), Produced.as("to"));
TimeWindows.of(Duration.ofSeconds(opt.getWindowSize()))
.advanceBy(Duration.ofSeconds(opt.getWindowSize() / 2))
.grace(Duration.ofSeconds(opt.getGracePeriod())
SlidingWindows.withTimeDifferenceAndGrace(
Duration.ofSeconds(opt.getWindowSize()),
Duration.ofSeconds(opt.getGracePeriod()))
SessionWindows.with(Duration.ofSeconds(opt.getWindowSize()))
Demo “Time”
product
analytics
product-repartition product-stats
rocksdb_ldb
console-
consumer
console-
consumer
Metrics
-changelog
* Caching disabled
• Deserializers. % ln -s {jar} /usr/local/confluent/share/java/kafka
• WindowDeserializer
• SessionDeserializer
• RocksDB
• RocksDB’s rocksdb_ldb % brew install rocksdb
• Scripts
• rocksdb_key_parser
• rocksdb_window_parser
• rocksdb_session_parser
Demo Tools
DEMO
• Emitting Results
• Suppression
• Commit Time
• Window Boundaries
• Epoch vs. Event
• Long Windows*
• Join Windowing
• RocksDB Tuning
• RocksDB state store instances…
What Next
• Overview of the Windowing Options within Kafka Streams
• Windowing Use-Cases
• Examples of Aggregate Windowing
• What each windowing options does within RocksDB and the -changelog topics
• The key serialization of the -changeling topics
• Advance Considerations
Takeaways
• Demo Application
https://github.com/nbuesing/kafka-streams-dashboards
• Kafka Summit Europe 2021
https://www.confluent.io/events/kafka-summit-europe-2021/what-is-the-state-of-
my-kafka-streams-application-unleashing-metrics/
Resources
• Nick Dearden’s
https://www.confluent.io/kafka-summit-ny19/zen-and-the-art-of-streaming-joins/
• Anna McDonald’s
https://www.confluent.io/kafka-summit-san-francisco-2019/using-kafka-to-discover-events-hidden-in-
your-database/
• Matthias Sax’s
https://www.confluent.io/kafka-summit-san-francisco-2019/whats-the-time-and-why/
https://www.confluent.io/resources/kafka-summit-2020/the-flux-capacitor-of-kafka-streams-and-ksqldb/
Additional Resources
Blooper Reel
Kafka streams windowing behind the curtain

More Related Content

What's hot

ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database Systemconfluent
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKai Wähner
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?confluent
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Araf Karsh Hamid
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
How Apache Kafka® Works
How Apache Kafka® WorksHow Apache Kafka® Works
How Apache Kafka® Worksconfluent
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Flink Forward
 
[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기NAVER D2
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafkaemreakis
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022HostedbyConfluent
 
Air traffic controller - Streams Processing meetup
Air traffic controller  - Streams Processing meetupAir traffic controller  - Streams Processing meetup
Air traffic controller - Streams Processing meetupEd Yakabosky
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
 
Microservices in the Apache Kafka Ecosystem
Microservices in the Apache Kafka EcosystemMicroservices in the Apache Kafka Ecosystem
Microservices in the Apache Kafka Ecosystemconfluent
 
Introduction to Kafka connect
Introduction to Kafka connectIntroduction to Kafka connect
Introduction to Kafka connectKnoldus Inc.
 

What's hot (20)

ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
How Apache Kafka® Works
How Apache Kafka® WorksHow Apache Kafka® Works
How Apache Kafka® Works
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
 
Apache Kafka Security
Apache Kafka Security Apache Kafka Security
Apache Kafka Security
 
[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022
 
Air traffic controller - Streams Processing meetup
Air traffic controller  - Streams Processing meetupAir traffic controller  - Streams Processing meetup
Air traffic controller - Streams Processing meetup
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
 
Microservices in the Apache Kafka Ecosystem
Microservices in the Apache Kafka EcosystemMicroservices in the Apache Kafka Ecosystem
Microservices in the Apache Kafka Ecosystem
 
Introduction to Kafka connect
Introduction to Kafka connectIntroduction to Kafka connect
Introduction to Kafka connect
 

Similar to Kafka streams windowing behind the curtain

Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingDoiT International
 
Dataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDoiT International
 
Azure Stream Analytics : Analyse Data in Motion
Azure Stream Analytics  : Analyse Data in MotionAzure Stream Analytics  : Analyse Data in Motion
Azure Stream Analytics : Analyse Data in MotionRuhani Arora
 
Data Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and FrameworksData Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and FrameworksMatthias Niehoff
 
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...Flink Forward
 
Google Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better OneGoogle Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better OneDataWorks Summit
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...Flink Forward
 
Onyx data processing the clojure way
Onyx   data processing  the clojure wayOnyx   data processing  the clojure way
Onyx data processing the clojure wayBahadir Cambel
 
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with AnalyticsWSO2
 
Streaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache BeamStreaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache BeamAll Things Open
 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsDavide Mauri
 
[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQLWSO2
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Data Con LA
 
Apache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - finalApache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - finalSub Szabolcs Feczak
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...DataWorks Summit
 
SplunkLive! Washington DC May 2013 - Splunk Security Workshop
SplunkLive! Washington DC May 2013 - Splunk Security WorkshopSplunkLive! Washington DC May 2013 - Splunk Security Workshop
SplunkLive! Washington DC May 2013 - Splunk Security WorkshopSplunk
 
Sql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.pptSql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.pptQingsong Yao
 
Google's Infrastructure and Specific IoT Services
Google's Infrastructure and Specific IoT ServicesGoogle's Infrastructure and Specific IoT Services
Google's Infrastructure and Specific IoT ServicesIntel® Software
 
Empowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with AlternatorEmpowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with AlternatorScyllaDB
 

Similar to Kafka streams windowing behind the curtain (20)

Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
 
Dataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data Processing
 
Gcp dataflow
Gcp dataflowGcp dataflow
Gcp dataflow
 
Azure Stream Analytics : Analyse Data in Motion
Azure Stream Analytics  : Analyse Data in MotionAzure Stream Analytics  : Analyse Data in Motion
Azure Stream Analytics : Analyse Data in Motion
 
Data Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and FrameworksData Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and Frameworks
 
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
 
Google Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better OneGoogle Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better One
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
 
Onyx data processing the clojure way
Onyx   data processing  the clojure wayOnyx   data processing  the clojure way
Onyx data processing the clojure way
 
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
 
Streaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache BeamStreaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache Beam
 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream Analytics
 
[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
 
Apache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - finalApache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - final
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
 
SplunkLive! Washington DC May 2013 - Splunk Security Workshop
SplunkLive! Washington DC May 2013 - Splunk Security WorkshopSplunkLive! Washington DC May 2013 - Splunk Security Workshop
SplunkLive! Washington DC May 2013 - Splunk Security Workshop
 
Sql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.pptSql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.ppt
 
Google's Infrastructure and Specific IoT Services
Google's Infrastructure and Specific IoT ServicesGoogle's Infrastructure and Specific IoT Services
Google's Infrastructure and Specific IoT Services
 
Empowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with AlternatorEmpowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with Alternator
 

More from confluent

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsconfluent
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flinkconfluent
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluentconfluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkconfluent
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloudconfluent
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Diveconfluent
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluentconfluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Meshconfluent
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservicesconfluent
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3confluent
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernizationconfluent
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataconfluent
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2confluent
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023confluent
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesisconfluent
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023confluent
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streamsconfluent
 

More from confluent (20)

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insights
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
 

Recently uploaded

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Recently uploaded (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

Kafka streams windowing behind the curtain

  • 1. Kafka Streams Windowing Behind the Curtain Neil Buesing Principal Solutions Architect, Rill Confluent Meetup July 15th, 2021
  • 2. • Operational intelligence for data in motion • Easy In - Easy Up - Easy Out • Work with customers to build & modernize their pipelines What Does Rill Do?
  • 3. • Principal Solutions Architect • Help customers with pipelines leveraging Apache Druid, Apache Kafka, Kafka Streams, Apache Beam, and other technologies. • Data Modeling and Governance • Rill Data / Apache Druid What Do I Do?
  • 4. • Overview of the Windowing Options within Kafka Streams • Windowing Use-Cases • Examples of Aggregate Windowing • What each windowing options does within RocksDB and the -changelog topics • The key serialization of the -changeling topics • Developer Tools & Ideas Takeaways
  • 5. • Stream / Table Duality • Compacted Topics need stateful front-end • Stateful Operations • Finite datasets — tables • Boundaries for unbounded data — windows Why Kafka Streams
  • 6. Windowing Options Window Type Time boundary Examples # records for key @ point in time Fixed Size Tumbling Epoch [8:00, 8:30) [8:30, 9:00) single Yes Hopping Epoch [8:00, 8:30) [8:15, 8:45) [8:30, 8:45) [8:45, 9:00) constant Yes Sliding Record [8:02, 8:32] [8:20, 8:50] [8:21, 8:51] variable Yes Session Record [8:02, 8:02] [8:02, 8:10] [9:10, 12:56] single (by tombstoning) No
  • 7. 001 / 12:00 002 / 12:00 003 / 12:00 001 / 12:30 002 / 12:30 003 / 12:30 12:00 12:15 12:30 12:45 1:00 01 5 Tumbling Time Windows 01 3 02 9 03 5 01 4 5 8 12 5 9 03 11 02 3 02 6 02 5 02 1 03 2 03 7 01 7 01 5 01 3 16 7 9 6 11 12 3 8 15 12
  • 8. Hopping Time Windows 001/ 12:00 002 / 12:00 003 / 12:00 001 / 12:30 002 / 12:30 003 / 12:30 12:00 12:15 12:30 12:45 1:00 01 5 01 3 02 9 03 5 01 4 5 8 12 5 9 03 11 02 3 02 6 02 5 02 1 03 2 03 7 01 7 01 5 01 3 16 7 9 6 11 12 3 8 15 001 / 11:45 5 8 12 001 / 12:45 002 / 11:45 002 / 12:15 002 / 12:45 003 / 11:45 003 / 12:45 003 / 12:15 3 8 15 1 18 20 11 5 9 12 3 9 14
  • 9. Sliding Time Windows 001 / 11:31:00.000 12:00 12:15 12:30 12:45 1:00 01 5 01 3 01 4 5 01 5 01 3 001 / 11:33:00.000 8 001 / 11:47.00.000 12 001 / 12:31:00.001 7 001 / 11:33.00:001 4 001 / 11:45:00.000 3 12:01 12:03 12:12 12:48 12:55 001 / 12:31:00.001 3 001 / 11:55:00.000 001 / 11:45:00.001 8 5
  • 10. 12:00 12:15 12:30 12:45 1:00 01 5 Session Windows 01 3 02 9 03 5 01 4 03 11 02 3 02 6 02 5 02 1 03 2 03 7 01 7 01 5 01 3 5 8 12 3 8 15 5 9 16 23 25 18 23 24 12
  • 11. Windowing Options Window Type Time boundary Examples # records for key @ point in time Fixed Size Tumbling Epoch [8:00, 8:30) [8:30, 9:00) single Yes Hopping Epoch [8:00, 8:30) [8:15, 8:45) [8:30, 8:45) [8:45, 9:00) constant Yes Sliding Record [8:02, 8:32] [8:20, 8:50] [8:21, 8:51] variable Yes Session Record [8:02, 8:02] [8:02, 8:10] [9:10, 12:56] single (by tombstoning) No
  • 12. • Good • Web Visitors • Products Purchased • Inventory Management • IoT Sensors • Ad Impressions* • Bad • Fraud Detection • User Interactions • Composition* Tumbling Time Windows * event timestamp & grace period
  • 13. • Good • Web Visitors • Products Purchased • Fraud Detection • IoT Sensors • Bad • User Interactions • Inventory Management • Composition • Ad Impressions Hopping Time Windows
  • 14. • Good • User Interactions • Fraud Detection • Usage Changes • Bad • Composition • IoT Sensors Sliding Time Windows
  • 15. • Good • User Interactions / Click Stream • User Behavior Analysis • IoT device - session oriented (running) • Bad • Data Analytics (Generalizations) • IoT sensors - always on (pacemaker) Session Windows
  • 16. • Good • Composition* • Finite Datasets • Bad • Fraud Detection • Monitoring • Unbounded data* No Windows * manual tombstoning
  • 17. Order Processing Order Analytics Demo Applications orders-purchase orders-pickup repartition attach user & store attach line item pricing assemble product analytics pickup-order-handler-purchase-order-join-product-repartition product-repartition product-stats State
  • 18. Materialized!<> store = Materialized.as("po").withCachingDisabled(); builder.<String, PurchaseOrder>stream(opt.getTopic()) .groupByKey(Grouped.as("groupByKey")) .windowedBy(TimeWindows.of(Duration.ofSeconds(opt.getWindowSize())) .grace(Duration.ofSeconds(opt.getGracePeriod()))) .aggregate(Streams!::initialize, Streams!::aggregator, Named.as("aggregate"), store) .toStream(Named.as("toStream")) .selectKey((k, v) !-> k.key() + " " + toStr(k.window()) + "," + ")") .mapValues(Streams!::minimize) .to(opt.topic(), Produced.as("to"));
  • 19. Materialized!<> store = Materialized.as("po").withCachingDisabled(); builder.<String, PurchaseOrder>stream(opt.getTopic()) .groupByKey(Grouped.as("groupByKey")) .windowedBy(TimeWindows.of(Duration.ofSeconds(opt.getWindowSize())) .grace(Duration.ofSeconds(opt.getGracePeriod()))) .aggregate(Streams!::initialize, Streams!::aggregator, Named.as("aggregate"), store) .toStream(Named.as("toStream")) .selectKey((k, v) !-> k.key() + " " + toStr(k.window()) + "," + ")") .mapValues(Streams!::minimize) .to(opt.topic(), Produced.as("to")); TimeWindows.of(Duration.ofSeconds(opt.getWindowSize())) .advanceBy(Duration.ofSeconds(opt.getWindowSize() / 2)) .grace(Duration.ofSeconds(opt.getGracePeriod()) SlidingWindows.withTimeDifferenceAndGrace( Duration.ofSeconds(opt.getWindowSize()), Duration.ofSeconds(opt.getGracePeriod())) SessionWindows.with(Duration.ofSeconds(opt.getWindowSize()))
  • 21. • Deserializers. % ln -s {jar} /usr/local/confluent/share/java/kafka • WindowDeserializer • SessionDeserializer • RocksDB • RocksDB’s rocksdb_ldb % brew install rocksdb • Scripts • rocksdb_key_parser • rocksdb_window_parser • rocksdb_session_parser Demo Tools
  • 22. DEMO
  • 23. • Emitting Results • Suppression • Commit Time • Window Boundaries • Epoch vs. Event • Long Windows* • Join Windowing • RocksDB Tuning • RocksDB state store instances… What Next
  • 24. • Overview of the Windowing Options within Kafka Streams • Windowing Use-Cases • Examples of Aggregate Windowing • What each windowing options does within RocksDB and the -changelog topics • The key serialization of the -changeling topics • Advance Considerations Takeaways
  • 25. • Demo Application https://github.com/nbuesing/kafka-streams-dashboards • Kafka Summit Europe 2021 https://www.confluent.io/events/kafka-summit-europe-2021/what-is-the-state-of- my-kafka-streams-application-unleashing-metrics/ Resources
  • 26. • Nick Dearden’s https://www.confluent.io/kafka-summit-ny19/zen-and-the-art-of-streaming-joins/ • Anna McDonald’s https://www.confluent.io/kafka-summit-san-francisco-2019/using-kafka-to-discover-events-hidden-in- your-database/ • Matthias Sax’s https://www.confluent.io/kafka-summit-san-francisco-2019/whats-the-time-and-why/ https://www.confluent.io/resources/kafka-summit-2020/the-flux-capacitor-of-kafka-streams-and-ksqldb/ Additional Resources