SlideShare a Scribd company logo
1 of 24
patterns
exploring
Neil is a senior engineer and technologist at Confluent, the
company founded by the creators of Apache Kafka®. He has over
20 years of expertise of working on distributed computing,
messaging and stream processing. He has built or redesigned
commercial messaging platforms, distributed caching products as
well as developed large scale bespoke systems for tier-1 banks.
After a period at ThoughtWorks, he went on to build some of the
first distributed risk engines in financial services. In 2008 he
launched a startup that specialised in distributed data analytics and
visualization. Then prior to joining Confluent he was the CTO at a
fintech consultancy.
Neil Avery
Senior Engineer and Technologist, Confluent
Declarative
Stream
Language
Processing
KSQLis a
KSQLis the
Streaming
SQL Enginefor
Apache Kafka
KSQL Concepts
KSQL Concepts
• Streams are first-class citizens
• Tables are first-class citizens
• 2 types of Queries (Pers, Trans)
• Some queries are persistent
• Persistent Queries populate Streams and Tables
• Transient Queries are for users interaction
• All queries run until terminated
CREATE STREAM clickstream
WITH (
value_format = ‘JSON’,
kafka_topic=‘my_clickstream_topic’
);
Creating a Stream
• “Data in motion”
• Let’s say we have a topic called my_clickstream_topic
• The topic contains JSON data
• KSQL now knows about that topic
Exploring that Stream
SELECT status, bytes
FROM clickstream
WHERE user_agent =
‘Mozilla/5.0 (compatible; MSIE 6.0)’;
• Now that the stream exists, we can examine its contents
• Simple, declarative filtering
• A non-persistent query
CREATE TABLE users
(user_id int, registered_At long …)
WITH (
key = ‘user_id',
kafka_topic=‘clickstream_users’,
value_format=‘JSON’
);
Creating a Table
• “A stateful view of a data in motion”
• Can be built from a kafka topic OR another KSQL Stream
• We have a topic called my_clickstream_topic
• The topic contains JSON data, The topic contains changelog data
CREATE TABLE events_per_min
AS SELECT user_id, count(*)AS events
FROM clickstream WINDOW TUMBLING(
size 10 seconds= ‘user_id'
) GROUP BY user_id;
Creating a Table
• Derived from a stream
• Windowed aggregate
Inspecting that Table
SELECT userid, username
FROM users
WHERE level = ‘Platinum’;
• Now that the table exists, we can examine its contents
• Simple, declarative filtering
• A non-persistent query
Joining a Stream to a Table
• Now that we have clickstream and users, we can join them
• This allows us to do filtering of clicks on a user attribute
CREATE STREAM vip_actions AS
SELECT userid, page, action FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
Usage Patterns
KSQL for Streaming ETL
• Kafka is popular for data pipelines.
• KSQL enables easy transformations of data within the pipe.
• Transforming data while moving from Kafka to another system.
CREATE STREAM vip_actions AS
SELECT userid, page, action FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
KSQL for Anomaly Detection
CREATE TABLE possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING count(*) > 3;
Identifying patterns or anomalies in real-time data,
surfaced in milliseconds
KSQL for Real-Time
Monitoring• Log data monitoring, tracking and alerting
• Sensor / IoT data
CREATE TABLE error_counts AS
SELECT error_code, count(*)
FROM monitoring_stream
WINDOW TUMBLING (SIZE 1 MINUTE)
WHERE type = 'ERROR'
GROUP BY error_code;
KSQL for Data Transformation
CREATE STREAM views_by_userid
WITH (PARTITIONS=6,
VALUE_FORMAT=‘JSON’,
TIMESTAMP=‘view_time’) AS
SELECT * FROM clickstream PARTITION BY user_id;
Make simple derivations of existing topics from the command line
Stream Patterns
Stream
Stream with partitioning
(scaling out)
Stream
(fork)
Table view
(windowed - from stream)
Table
(scaled out – across nodes)
Stream table join
(left-join)
Streaming Apps - Clickstream
clickstream
stream (topic)
events_per_mi
n
table (window)
users table
table (topic)
platinum users
stream-table join
Demo
• Streams, Tables and Queries
• Many streaming patterns
• The clickstream demo is a good place to ‘grok'
Recap
Resources and Next Steps
https://github.com/confluentinc/ksql
http://confluent.io/ksql
https://slackpass.io/confluentcommunity #ksql
Signup for
- Part 2: Development
- Part 3 Deployment
https://www.confluent.io/empowering-streams-through-ksql
Thank you!
Q&A
@avery_neil

More Related Content

What's hot

Streaming Transformations - Putting the T in Streaming ETL
Streaming Transformations - Putting the T in Streaming ETLStreaming Transformations - Putting the T in Streaming ETL
Streaming Transformations - Putting the T in Streaming ETLconfluent
 
Leveraging Microservice Architectures & Event-Driven Systems for Global APIs
Leveraging Microservice Architectures & Event-Driven Systems for Global APIsLeveraging Microservice Architectures & Event-Driven Systems for Global APIs
Leveraging Microservice Architectures & Event-Driven Systems for Global APIsconfluent
 
Real-Time Stream Processing with KSQL and Apache Kafka
Real-Time Stream Processing with KSQL and Apache KafkaReal-Time Stream Processing with KSQL and Apache Kafka
Real-Time Stream Processing with KSQL and Apache Kafkaconfluent
 
Live Coding a KSQL Application
Live Coding a KSQL ApplicationLive Coding a KSQL Application
Live Coding a KSQL Applicationconfluent
 
Introduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterconfluent
 
The State of Stream Processing
The State of Stream ProcessingThe State of Stream Processing
The State of Stream Processingconfluent
 
Real-time Data Streaming from Oracle to Apache Kafka
Real-time Data Streaming from Oracle to Apache Kafka Real-time Data Streaming from Oracle to Apache Kafka
Real-time Data Streaming from Oracle to Apache Kafka confluent
 
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on KubernetesStateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetesconfluent
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafkaconfluent
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processingconfluent
 
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...HostedbyConfluent
 
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!confluent
 
Live Coding a KSQL Application
Live Coding a KSQL ApplicationLive Coding a KSQL Application
Live Coding a KSQL Applicationconfluent
 
KSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use BothKSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use Bothconfluent
 
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...confluent
 
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on KubernetesDetecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on KubernetesLightbend
 
Streams, Tables, and Time in KSQL
Streams, Tables, and Time in KSQLStreams, Tables, and Time in KSQL
Streams, Tables, and Time in KSQLconfluent
 
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache KafkaKSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache KafkaKai Wähner
 
Kafka Summit SF 2017 - Query the Application, Not a Database: “Interactive Qu...
Kafka Summit SF 2017 - Query the Application, Not a Database: “Interactive Qu...Kafka Summit SF 2017 - Query the Application, Not a Database: “Interactive Qu...
Kafka Summit SF 2017 - Query the Application, Not a Database: “Interactive Qu...confluent
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 

What's hot (20)

Streaming Transformations - Putting the T in Streaming ETL
Streaming Transformations - Putting the T in Streaming ETLStreaming Transformations - Putting the T in Streaming ETL
Streaming Transformations - Putting the T in Streaming ETL
 
Leveraging Microservice Architectures & Event-Driven Systems for Global APIs
Leveraging Microservice Architectures & Event-Driven Systems for Global APIsLeveraging Microservice Architectures & Event-Driven Systems for Global APIs
Leveraging Microservice Architectures & Event-Driven Systems for Global APIs
 
Real-Time Stream Processing with KSQL and Apache Kafka
Real-Time Stream Processing with KSQL and Apache KafkaReal-Time Stream Processing with KSQL and Apache Kafka
Real-Time Stream Processing with KSQL and Apache Kafka
 
Live Coding a KSQL Application
Live Coding a KSQL ApplicationLive Coding a KSQL Application
Live Coding a KSQL Application
 
Introduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matter
 
The State of Stream Processing
The State of Stream ProcessingThe State of Stream Processing
The State of Stream Processing
 
Real-time Data Streaming from Oracle to Apache Kafka
Real-time Data Streaming from Oracle to Apache Kafka Real-time Data Streaming from Oracle to Apache Kafka
Real-time Data Streaming from Oracle to Apache Kafka
 
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on KubernetesStateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafka
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processing
 
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
 
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
 
Live Coding a KSQL Application
Live Coding a KSQL ApplicationLive Coding a KSQL Application
Live Coding a KSQL Application
 
KSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use BothKSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use Both
 
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
 
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on KubernetesDetecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
 
Streams, Tables, and Time in KSQL
Streams, Tables, and Time in KSQLStreams, Tables, and Time in KSQL
Streams, Tables, and Time in KSQL
 
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache KafkaKSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
 
Kafka Summit SF 2017 - Query the Application, Not a Database: “Interactive Qu...
Kafka Summit SF 2017 - Query the Application, Not a Database: “Interactive Qu...Kafka Summit SF 2017 - Query the Application, Not a Database: “Interactive Qu...
Kafka Summit SF 2017 - Query the Application, Not a Database: “Interactive Qu...
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 

Similar to Exploring KSQL Patterns

KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafkaconfluent
 
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and ConfluentWebinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and ConfluentKinetica
 
Unlocking the world of stream processing with KSQL, the streaming SQL engine ...
Unlocking the world of stream processing with KSQL, the streaming SQL engine ...Unlocking the world of stream processing with KSQL, the streaming SQL engine ...
Unlocking the world of stream processing with KSQL, the streaming SQL engine ...Michael Noll
 
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...Michael Noll
 
Riviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQLRiviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQLFlorent Ramiere
 
JHipster conf 2019 - Kafka Ecosystem
JHipster conf 2019 - Kafka EcosystemJHipster conf 2019 - Kafka Ecosystem
JHipster conf 2019 - Kafka EcosystemFlorent Ramiere
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKai Wähner
 
ksqlDB Workshop
ksqlDB WorkshopksqlDB Workshop
ksqlDB Workshopconfluent
 
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLKafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLconfluent
 
Тарас Кльоба "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
Тарас Кльоба  "ETL — вже не актуальна; тривалі живі потоки із системою Apache...Тарас Кльоба  "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
Тарас Кльоба "ETL — вже не актуальна; тривалі живі потоки із системою Apache...Lviv Startup Club
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsSlim Baltagi
 
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...Michael Noll
 
APAC ksqlDB Workshop
APAC ksqlDB WorkshopAPAC ksqlDB Workshop
APAC ksqlDB Workshopconfluent
 
ETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsC4Media
 
Unlock the Power of Streaming Data with Kinetica and Confluent Platform
Unlock the Power of Streaming Data with Kinetica and Confluent PlatformUnlock the Power of Streaming Data with Kinetica and Confluent Platform
Unlock the Power of Streaming Data with Kinetica and Confluent Platformconfluent
 
Event-Driven Microservices with Apache Kafka, Kafka Streams and KSQL
Event-Driven Microservices with Apache Kafka, Kafka Streams and KSQLEvent-Driven Microservices with Apache Kafka, Kafka Streams and KSQL
Event-Driven Microservices with Apache Kafka, Kafka Streams and KSQLKai Wähner
 
Data Pipelines with Kafka Connect
Data Pipelines with Kafka ConnectData Pipelines with Kafka Connect
Data Pipelines with Kafka ConnectKaufman Ng
 
Streaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaStreaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaAttunity
 
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQLRethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQLKai Wähner
 

Similar to Exploring KSQL Patterns (20)

KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafka
 
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and ConfluentWebinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
 
Unlocking the world of stream processing with KSQL, the streaming SQL engine ...
Unlocking the world of stream processing with KSQL, the streaming SQL engine ...Unlocking the world of stream processing with KSQL, the streaming SQL engine ...
Unlocking the world of stream processing with KSQL, the streaming SQL engine ...
 
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
 
Riviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQLRiviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQL
 
JHipster conf 2019 - Kafka Ecosystem
JHipster conf 2019 - Kafka EcosystemJHipster conf 2019 - Kafka Ecosystem
JHipster conf 2019 - Kafka Ecosystem
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
 
ksqlDB Workshop
ksqlDB WorkshopksqlDB Workshop
ksqlDB Workshop
 
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLKafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
 
Тарас Кльоба "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
Тарас Кльоба  "ETL — вже не актуальна; тривалі живі потоки із системою Apache...Тарас Кльоба  "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
Тарас Кльоба "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiasts
 
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
 
APAC ksqlDB Workshop
APAC ksqlDB WorkshopAPAC ksqlDB Workshop
APAC ksqlDB Workshop
 
ETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsETL Is Dead, Long-live Streams
ETL Is Dead, Long-live Streams
 
Unlock the Power of Streaming Data with Kinetica and Confluent Platform
Unlock the Power of Streaming Data with Kinetica and Confluent PlatformUnlock the Power of Streaming Data with Kinetica and Confluent Platform
Unlock the Power of Streaming Data with Kinetica and Confluent Platform
 
Event-Driven Microservices with Apache Kafka, Kafka Streams and KSQL
Event-Driven Microservices with Apache Kafka, Kafka Streams and KSQLEvent-Driven Microservices with Apache Kafka, Kafka Streams and KSQL
Event-Driven Microservices with Apache Kafka, Kafka Streams and KSQL
 
Data Pipelines with Kafka Connect
Data Pipelines with Kafka ConnectData Pipelines with Kafka Connect
Data Pipelines with Kafka Connect
 
Streaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaStreaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache Kafka
 
Jug - ecosystem
Jug -  ecosystemJug -  ecosystem
Jug - ecosystem
 
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQLRethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
 

More from confluent

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsconfluent
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flinkconfluent
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluentconfluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkconfluent
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloudconfluent
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Diveconfluent
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluentconfluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Meshconfluent
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservicesconfluent
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3confluent
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernizationconfluent
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataconfluent
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2confluent
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023confluent
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesisconfluent
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023confluent
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streamsconfluent
 

More from confluent (20)

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insights
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
 

Recently uploaded

Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 

Recently uploaded (20)

Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

Exploring KSQL Patterns

  • 2. Neil is a senior engineer and technologist at Confluent, the company founded by the creators of Apache Kafka®. He has over 20 years of expertise of working on distributed computing, messaging and stream processing. He has built or redesigned commercial messaging platforms, distributed caching products as well as developed large scale bespoke systems for tier-1 banks. After a period at ThoughtWorks, he went on to build some of the first distributed risk engines in financial services. In 2008 he launched a startup that specialised in distributed data analytics and visualization. Then prior to joining Confluent he was the CTO at a fintech consultancy. Neil Avery Senior Engineer and Technologist, Confluent
  • 6.
  • 7. KSQL Concepts • Streams are first-class citizens • Tables are first-class citizens • 2 types of Queries (Pers, Trans) • Some queries are persistent • Persistent Queries populate Streams and Tables • Transient Queries are for users interaction • All queries run until terminated
  • 8. CREATE STREAM clickstream WITH ( value_format = ‘JSON’, kafka_topic=‘my_clickstream_topic’ ); Creating a Stream • “Data in motion” • Let’s say we have a topic called my_clickstream_topic • The topic contains JSON data • KSQL now knows about that topic
  • 9. Exploring that Stream SELECT status, bytes FROM clickstream WHERE user_agent = ‘Mozilla/5.0 (compatible; MSIE 6.0)’; • Now that the stream exists, we can examine its contents • Simple, declarative filtering • A non-persistent query
  • 10. CREATE TABLE users (user_id int, registered_At long …) WITH ( key = ‘user_id', kafka_topic=‘clickstream_users’, value_format=‘JSON’ ); Creating a Table • “A stateful view of a data in motion” • Can be built from a kafka topic OR another KSQL Stream • We have a topic called my_clickstream_topic • The topic contains JSON data, The topic contains changelog data
  • 11. CREATE TABLE events_per_min AS SELECT user_id, count(*)AS events FROM clickstream WINDOW TUMBLING( size 10 seconds= ‘user_id' ) GROUP BY user_id; Creating a Table • Derived from a stream • Windowed aggregate
  • 12. Inspecting that Table SELECT userid, username FROM users WHERE level = ‘Platinum’; • Now that the table exists, we can examine its contents • Simple, declarative filtering • A non-persistent query
  • 13. Joining a Stream to a Table • Now that we have clickstream and users, we can join them • This allows us to do filtering of clicks on a user attribute CREATE STREAM vip_actions AS SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.user_id WHERE u.level = 'Platinum';
  • 15. KSQL for Streaming ETL • Kafka is popular for data pipelines. • KSQL enables easy transformations of data within the pipe. • Transforming data while moving from Kafka to another system. CREATE STREAM vip_actions AS SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.user_id WHERE u.level = 'Platinum';
  • 16. KSQL for Anomaly Detection CREATE TABLE possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING count(*) > 3; Identifying patterns or anomalies in real-time data, surfaced in milliseconds
  • 17. KSQL for Real-Time Monitoring• Log data monitoring, tracking and alerting • Sensor / IoT data CREATE TABLE error_counts AS SELECT error_code, count(*) FROM monitoring_stream WINDOW TUMBLING (SIZE 1 MINUTE) WHERE type = 'ERROR' GROUP BY error_code;
  • 18. KSQL for Data Transformation CREATE STREAM views_by_userid WITH (PARTITIONS=6, VALUE_FORMAT=‘JSON’, TIMESTAMP=‘view_time’) AS SELECT * FROM clickstream PARTITION BY user_id; Make simple derivations of existing topics from the command line
  • 19. Stream Patterns Stream Stream with partitioning (scaling out) Stream (fork) Table view (windowed - from stream) Table (scaled out – across nodes) Stream table join (left-join)
  • 20. Streaming Apps - Clickstream clickstream stream (topic) events_per_mi n table (window) users table table (topic) platinum users stream-table join
  • 21. Demo
  • 22. • Streams, Tables and Queries • Many streaming patterns • The clickstream demo is a good place to ‘grok' Recap
  • 23. Resources and Next Steps https://github.com/confluentinc/ksql http://confluent.io/ksql https://slackpass.io/confluentcommunity #ksql Signup for - Part 2: Development - Part 3 Deployment https://www.confluent.io/empowering-streams-through-ksql