SlideShare a Scribd company logo
1 of 70
1C O N F I D E N T I A L
Stream Processing with Confluent
Kafka Streams and KSQL
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de
2C O N F I D E N T I A L
3C O N F I D E N T I A L
4C O N F I D E N T I A L
Ubiquitous connectivity
Globally scalable platform for all
event producers and consumers
Immediate data access
Data accessible to all
consumers in real time
Single system of record
Persistent storage to enable
reprocessing of past events
Continuous queries
Stream processing capabilities
for in-line data transformation
Microservice
s
DBs
SaaS apps
Mobile
Customer 360
Real-time fraud
detection
Data warehouse
Producers
Consumers
Database
change
Microservices
events
SaaS
data
Customer
experience
s
Streams of real time events
Stream processing appsStream processing apps Stream processing apps
A Streaming Platform is the Underpinning
of an Event-driven Architecture
5C O N F I D E N T I A L
6C O N F I D E N T I A L
The beginning of a new Era
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
The first use case. This is why Kafka was created!
7C O N F I D E N T I A L
● Global-scale
● Real-time
● Persistent Storage
● Stream Processing
Apache Kafka: The De-facto Standard for Real-Time Event Streaming
Edge
Cloud
Data LakeDatabases
Datacenter
IoT
SaaS AppsMobile
Microservices Machine
Learning
Apache Kafka
8C O N F I D E N T I A L
Apache Kafka at Scale at Tech Giants
> 4.5 trillion messages / day > 6 Petabytes / day
“You name it”
* Kafka Is not just used by tech giants
** Kafka is not just used for big data
9C O N F I D E N T I A L
Confluents Business Value per Use Case
Improve
Customer
Experience
(CX)
Increase
Revenue
(make money)
Business
Value
Decrease
Costs
(save
money)
Core Business
Platform
Increase
Operational
Efficiency
Migrate to
Cloud
Mitigate Risk
(protect money)
Key Drivers
Strategic Objectives
(sample)
Fraud
Detection
IoT sensor
ingestion
Digital
replatforming/
Mainframe Offload
Connected Car: Navigation & improved
in-car experience: Audi
Customer 360
Simplifying Omni-channel Retail at
Scale: Target
Faster transactional
processing / analysis
incl. Machine Learning / AI
Mainframe Offload: RBC
Microservices
Architecture
Online Fraud Detection
Online Security
(syslog, log aggregation,
Splunk replacement)
Middleware
replacement
Regulatory
Digital
Transformation
Application Modernization: Multiple
Examples
Website / Core
Operations
(Central Nervous System)
The [Silicon Valley] Digital Natives;
LinkedIn, Netflix, Uber, Yelp...
Predictive Maintenance: Audi
Streaming Platform in a regulated
environment (e.g. Electronic Medical
Records): Celmatix
Real-time app
updates
Real Time Streaming Platform for
Communications and Beyond: Capital One
Developer Velocity - Building Stateful
Financial Applications with Kafka
Streams: Funding Circle
Detect Fraud & Prevent Fraud in Real
Time: PayPal
Kafka as a Service - A Tale of Security
and Multi-Tenancy: Apple
Example Use Cases
$↑
$↓
$
Example Case Studies
(of many)
10C O N F I D E N T I A L
A Modern, Distributed Platform for
Data Streams.
Messaging + Storage + Processing!
11C O N F I D E N T I A L
Stream Processing
processing-time
event-time
windowing
alice
bob
dave
12C O N F I D E N T I A L
Confluent Delivers a Mission-Critical Event Streaming Platform
Apache Kafka®
Core | Connect API | Streams API
Data Compatibility
Schema Registry
Enterprise Operations
Replicator | Auto Data Balancer | Connectors | MQTT Proxy | Kubernetes Operator
Database
Changes
Log Events IoT Data Web Events other events
Hadoop
Database
Data
Warehouse
CRM
other
DATA INTEGRATION
Transformations
Custom Apps
Analytics
Monitoring
other
REAL-TIME APPLICATIONS
COMMUNITY FEATURES COMMERCIAL FEATURES
Datacenter Public Cloud Confluent Cloud
Confluent Platform
Management & Monitoring
Control Center | Security
Development & Connectivity
Clients | Connectors | REST Proxy | KSQL
CONFLUENT FULLY-MANAGEDCUSTOMER SELF-MANAGED
13C O N F I D E N T I A L
What We Cover Today:
streams
The streaming SQL engine
for Apache Kafka® to write
real-time applications in SQL
Apache Kafka® library to write
real-time applications and
microservices in Java, Scala
KSQL
10
14C O N F I D E N T I A L
Lower the bar to enter the world of streaming
User Population
CodingSophistication
Core developers who use Java/Scala
Core developers who don’t use Java/Scala
Data engineers, architects, DevOps/SRE
BI analysts
streams
15C O N F I D E N T I A L
CREATE STREAM fraudulent_payments AS
SELECT * FROM payments
WHERE fraudProbability > 0.8
Lower the bar to enter the world of streaming
vs.
KSQL
streams
16C O N F I D E N T I A L
Confluent
● Kafka Streams and
KSQL for stream
processing
● Lower-level Kafka
Producer and Kafka
Consumer clients
for multiple
languages
Java C/C++ Go
Python.NET JMS
Kafka Streams KSQL
Java/Scala Streaming SQL
17C O N F I D E N T I A L
Microservices
Example use cases
Data enrichment Streaming ETL
Filter, cleanse, mask Real-time monitoring Anomaly detection
18C O N F I D E N T I A L
Similarities and Differences
of KSQL and Kafka Streams
19C O N F I D E N T I A L
Similarities
Enterprise
support
All you need is
Kafka
Run everywhere Elastic, scalable,
fault-tolerant
Kafka security
integration
Powerful
processing
Supports
streams & tables
Exactly-once
processing
Event-time
processing And more!
1 2 3 4 5
6 7 8 9 ...
streams
KSQL
20C O N F I D E N T I A L
Similarities
KSQL
(processing)
Kafka
(data)
JVM application
with Kafka Streams (processing)
Does not run on
Kafka brokers!
Does not run on
Kafka brokers!
21C O N F I D E N T I A L
Differences
Consumer,
Producer
KSQL
Kafka Streams
Flexibilit
y
Ease of Use
CREATE STREAM ...
CREATE TABLE ...
SELECT, JOIN, COUNT, …
KStream, KTable,
filter(), map(), flatMap(),
join(), aggregate(), …
subscribe(), poll(),
send(), flush(),
beginTransaction(), …
streams
KSQL Kafka
Clients
22C O N F I D E N T I A L
Differences
You write…
UI included for human interaction
CLI included for human interaction
Data formats
Interactive queries
KSQL statements
Yes (Enterprise)
Yes
Avro, JSON, CSV (today)
Not yet
JVM applications
No
No
Any data format, including
Avro, JSON, CSV, Protobuf, XML
Yes
streams
KSQL
Flexibility, use case coverage Limited to KSQL syntax, UDFs Full power of Java, Scala
REST API included Yes No, but you can DIY
Runtime included Yes, the KSQL server Not needed, applications run
as standard JVM processes
23C O N F I D E N T I A L
Guidance
streams
KSQL
• New to streaming and Kafka
• Prefer SQL to writing code in Java, Scala
• Prefer interactive experience with UI, CLI
• Use cases include: filtering, transforming,
masking data; enriching data, joining
data sources
• Use case is naturally expressible through
SQL, with optional help from User Defined
Functions as “get out of jail free” card
• Provides KSQL REST API for use from
Python, Go, JavaScript, shell, etc.
• At least basic Kafka experience
• Prefer writing and deploying JVM apps
• Writing microservices
• Use cases cover KSQL’s and more
• To integrate with external services or
3rd party libraries (but see KSQL UDFs)
• To customize or fine-tune a use case,
e.g. custom joins, probabilistic
counting
• Need for queryable state, which is not
yet supported by KSQL
24C O N F I D E N T I A L
KSQL and Kafka Streams
A closer look
25C O N F I D E N T I A L
Next: KSQL in more detail
streams
The streaming SQL engine
for Apache Kafka® to write
real-time applications in SQL
Apache Kafka® library to write
real-time applications and
microservices in Java, Scala
KSQL
30
26C O N F I D E N T I A L
KSQL
● You write only SQL.
No Java, Python, or
other boilerplate to
wrap around it!
● Create KSQL user
defined functions in
Java when needed.
CREATE STREAM fraudulent_payments AS
SELECT * FROM payments
WHERE fraudProbability > 0.8
27C O N F I D E N T I A L
New user experience: interactive stream processing
28C O N F I D E N T I A L
KSQL can be used interactively + programmatically
ksql>
1 UI
POST /query
2 CLI 3 REST 4 Headless
29C O N F I D E N T I A L
KSQL REST API
POST /query HTTP/1.1
{
"ksql": "SELECT * FROM users WHERE name LIKE 'a%';"
"streamsProperties": {
"your.custom.setting": "value"
}
}
Work with KSQL programmatically from other languages or the terminal
Here: run a continuous query and stream back the results
30C O N F I D E N T I A L
All you need is Kafka & KSQL
1.Build & package
2. Submit job
ksql> SELECT * FROM myStream
Without KSQL With KSQL
Processing
clusterStorage
system
required for
fault-tolerant
processing
31C O N F I D E N T I A L
KSQL is a stream processing technology
As such it is not yet a great fit for:
Ad-hoc queries
● No indexes yet in KSQL
● Kafka often configured to retain
data for only a limited span of
time
BI reports (Tableau etc.)
● No indexes yet in KSQL
● No official JDBC
● Most BI tools don’t understand
continuous, streaming results
32C O N F I D E N T I A L
Data exploration
KSQL example use cases
Data enrichment Streaming ETL
Filter, cleanse, mask Real-time monitoring Anomaly detection
33C O N F I D E N T I A L
Example: CDC from DB via Kafka to Elastic
KSQL processes table changes in
real-time to continuously
maintain aggregates of metrics,
KPI
Kafka Connect
streams data
in
Kafka Connect
streams data out
34C O N F I D E N T I A L
Example: Retail
KSQL joins the two
streams in real-time
Stream of shipments
that arrive
Stream of purchases
from online and physical
stores
35C O N F I D E N T I A L
Example: IoT, Automotive, Connected Cars
KSQL joins the stream
and table in real-time, and
spots for vehicle failures
Kafka Connect
streams data in
Cars send telemetry data
via Kafka API
Kafka Streams
application to notify
customers
36C O N F I D E N T I A L
KSQL for Streaming ETL
● Joining, filtering,
and aggregating
streams of event
data
CREATE STREAM vip_actions AS
SELECT user_id, page, action
FROM clickstream c
LEFT JOIN users u
ON c.user_id = u.user_id
WHERE u.level = 'Platinum'
37C O N F I D E N T I A L
KSQL for Data Transformation
● Easily make
derivations of
existing topics
● Change data format
● Change number of
partitions or
partitioning scheme
CREATE STREAM pageviews_avro
WITH (PARTITIONS=6,
VALUE_FORMAT='AVRO') AS
SELECT * FROM pageviews_json
PARTITION BY user_id
38C O N F I D E N T I A L
KSQL for Real-Time Monitoring
● Filtering, tracking,
and alerting
● Log data monitoring
● Syslog data
● Sensor / IoT data
● Application metrics
CREATE STREAM syslog_invalid_users AS
SELECT host, message
FROM syslog
WHERE message LIKE '%Invalid user%'
http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
39C O N F I D E N T I A L
KSQL for Anomaly Detection
● Identify patterns or
anomalies in real-
time data, surfaced
in milliseconds
CREATE TABLE possible_fraud AS
SELECT card_number, COUNT(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING COUNT(*) > 3
40C O N F I D E N T I A L
Streams and Tables
Important because most use cases need both
41C O N F I D E N T I A L
§
Do you think that’s a table
you are querying?
42C O N F I D E N T I A L
The Stream-Table Duality
aggregation
changelog “materialized view”
of the stream
(like SUM, COUNT)
Stream Table
(CDC)
43C O N F I D E N T I A L
The Stream-Table Duality
CREATE TABLE num_visited_locations_per_user AS
SELECT username, COUNT(*)
FROM location_updates
GROUP BY username
45C O N F I D E N T I A L
Scalability, Elasticity,
Fault-Tolerance
46C O N F I D E N T I A L
Fault-tolerance, powered by Kafka
Server A:
“I do stateful stream
processing, like tables,
joins, aggregations.”
“streaming
restore” of
A’s local state to B
Changelog Topic
“streaming
backup” of
A’s local state
KSQL
Kafka
State is automatically migrated
in case of server failure
Server B:
“I restore the state and
continue processing where
server A stopped.”
A key challenge of distributed stream processing is fault-tolerant state.
47C O N F I D E N T I A L
Fault-tolerance, powered by Kafka
Processing fails over automatically, without data loss or miscomputation.
1 Kafka consumer group
rebalance is triggered
2 Processing and state of #3
is migrated via Kafka to
remaining servers #1 + #2
#3 died, so #1 and #2 take over
1 Kafka consumer group
rebalance is triggered
2 Part of processing incl.
state is migrated via Kafka
from #1 + #2 to server #3
#3 is back, so work is split again
48C O N F I D E N T I A L
Elasticity and Scalability, powered by Kafka
You can add, remove, restart servers during live operations.
We need more processing power!” “Ok, we can scale down again.”
49C O N F I D E N T I A L
Deploying KSQL
50C O N F I D E N T I A L
Deploying KSQL
KSQL Server
(JVM process)
…and many more…
DEB, RPM, ZIP, TAR downloads
http://confluent.io/ksql
Docker images
confluentinc/cp-ksql-server
confluentinc/cp-ksql-cli
51C O N F I D E N T I A L
Deploying KSQL
#1 Interactive KSQL, for development & testing
ksql>
POST /query
...
KSQL
(processing)
Kafka
(data)
52C O N F I D E N T I A L
Deploying KSQL
#2 Headless KSQL, for production
servers started
with same
.sql file ...
interaction for
UI, CLI, REST
is disabled
KSQL
(processing)
Kafka
(data)
53C O N F I D E N T I A L
Deploying KSQL
read,
write
…
BookingsTeam
…
FraudTeam
…
MobileTeam
KSQL
(processing)
Kafka
(data)
More KSQL
54C O N F I D E N T I A L
Monitoring KSQL
https://www.confluent.io/blog/troubleshooting-ksql-part-2
Confluent Control Center JMX
56C O N F I D E N T I A L
Next: Kafka Streams in more detail
streams
The streaming SQL engine
for Apache Kafka® to write
real-time applications in SQL
Apache Kafka® library to write
real-time applications and
microservices in Java, Scala
KSQL
10
57C O N F I D E N T I A L
Kafka Streams
● You write standard
Java or Scala
applications to
process your data
● The Kafka Streams
library makes these
applications:
elastic, scalable,
fault-tolerant, and
more
● All you need is
58C O N F I D E N T I A L
All you need is Kafka & Kafka Streams
Without Kafka Streams With Kafka Streams
JVM application
1.Build & package
2. Submit job
required for
fault-tolerant
processing
Processing
clusterStorage
system
59C O N F I D E N T I A L
DB (key-value store) included, and queryable
Location-tracking
Application:
“I continuously track the
latest geolocation of every
customer vehicle in a table.”
Your app has its own local DB. You can also expose it to other apps, e.g. via
REST.
Other Applications
(Java, Go, Python, etc.)
can directly query this table.
Kafka Write
results
Read
results
Alternative
query
60C O N F I D E N T I A L
Microservices
Kafka Streams example use cases
Data enrichment Streaming ETL
Filter, cleanse, mask Real-time monitoring Anomaly detection
61C O N F I D E N T I A L
Writing Kafka Streams applications
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>2.1.0</version>
</dependency>
Add as dependency to your Java/Scala application
62C O N F I D E N T I A L
Writing Kafka Streams applications
DSL Processor API
And of course, you can combine the DSL and the Processor API!
API style Functional programming Imperative programming
Typically used when Starting point for most developers
and most use cases
To customize, tune, or to add functionality
beyond what’s in the DSL today
You work with KStream and KTable Processors and state stores
Example operations KStream#map(), KTable#filter(),
KStream#join(), aggregate()
Processor#init(), Processor#close(),
Processor#process(msg)
Suitable for use cases S / M / L / XL S / M / L / XL
63C O N F I D E N T I A L
Deploying Kafka Streams applications
JVM application
with Kafka Streams (processing)
Develop your
application
Build and package
(jar, container, ...)
Deploy and run
one or multiple
app instances
…and many more…
64C O N F I D E N T I A L
Elasticity and Scalability, powered by Kafka
You can add, remove, restart instances of your application during live operations.
We need more processing power!” “Ok, we can scale down again.”
65C O N F I D E N T I A L
Deploying Kafka Streams applications
read,
write
App
(processing)
Kafka
(data)
More Apps
BookingsTeam
FraudTeam
…
MobileTeam
…
66C O N F I D E N T I A L
Confluent Platform as
Central Nervous System
67C O N F I D E N T I A L
Confluent’s Streaming Maturity Model - where are you?
Value
Maturity (Investment & time)
2
Enterprise
Streaming Pilot /
Early Production
Pub + Sub Store Process
5
Central Nervous
System
1
Developer
Interest
Pre-Streaming
4
Global
Streaming
3
SLA Ready,
Integrated
Streaming
Projects
Platform
69C O N F I D E N T I A L
Kafka Connect
Kafka Cluster
CRM
Integration
Domain-Driven Design for your Event Steaming Platform
Legacy
Integration
Custom
Application
ESB Connector
Java / KSQL /
Kafka Streams
Schema
Registry
Event Streaming Platform
CRM Domain Legacy Domain Payment Domain
è Independent and loosely coupled, but scalable, highly available and reliable!
70C O N F I D E N T I A L
Confluent Schema Registry for Message Validation
Input Data
Schema
Registry
App 1
• “Kafka Benefits Under the Hood”
• Schema definition + evolution
• Forward and backward compatibility
• Multi data center deployment
App X
71C O N F I D E N T I A L
Resources and Next Steps
confluentinc/kafka-streams-examples
https://docs.confluent.io/current/streams/
http://cnfl.io/slack
72C O N F I D E N T I A L
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.confluent.io
www.kai-waehner.de
LinkedIn
Questions? Feedback?
Please contact me!
73C O N F I D E N T I A L

More Related Content

What's hot

Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®confluent
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaKai Wähner
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?Kai Wähner
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafkaconfluent
 
Streaming all over the world Real life use cases with Kafka Streams
Streaming all over the world  Real life use cases with Kafka StreamsStreaming all over the world  Real life use cases with Kafka Streams
Streaming all over the world Real life use cases with Kafka Streamsconfluent
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsGuido Schmutz
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingTill Rohrmann
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introductionchrislusf
 
Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?Kai Wähner
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Kai Wähner
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeDatabricks
 
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...Databricks
 
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...HostedbyConfluent
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Kai Wähner
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureKai Wähner
 
Introducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumIntroducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumChengKuan Gan
 
Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)
Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)
Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)confluent
 

What's hot (20)

Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafka
 
Streaming all over the world Real life use cases with Kafka Streams
Streaming all over the world  Real life use cases with Kafka StreamsStreaming all over the world  Real life use cases with Kafka Streams
Streaming all over the world Real life use cases with Kafka Streams
 
Unified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache FlinkUnified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache Flink
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introduction
 
Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
 
kafka
kafkakafka
kafka
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
 
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
 
Introducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumIntroducing Change Data Capture with Debezium
Introducing Change Data Capture with Debezium
 
Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)
Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)
Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)
 

Similar to Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka

KSQL: The Streaming SQL Engine for Apache Kafka
KSQL: The Streaming SQL Engine for Apache KafkaKSQL: The Streaming SQL Engine for Apache Kafka
KSQL: The Streaming SQL Engine for Apache KafkaChris Mueller
 
KSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use BothKSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use Bothconfluent
 
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud ServicesBuild a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Servicesconfluent
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Codemotion
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Codemotion
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZconfluent
 
ksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time EventsksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time Eventsconfluent
 
How to Build Streaming Apps with Confluent II
How to Build Streaming Apps with Confluent IIHow to Build Streaming Apps with Confluent II
How to Build Streaming Apps with Confluent IIconfluent
 
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...Kai Wähner
 
APAC ksqlDB Workshop
APAC ksqlDB WorkshopAPAC ksqlDB Workshop
APAC ksqlDB Workshopconfluent
 
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...Michael Noll
 
Using Kafka to integrate DWH and Cloud Based big data systems
Using Kafka to integrate DWH and Cloud Based big data systemsUsing Kafka to integrate DWH and Cloud Based big data systems
Using Kafka to integrate DWH and Cloud Based big data systemsconfluent
 
Introduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - MadridIntroduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - MadridPaolo Castagna
 
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...Michael Noll
 
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQLRethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQLKai Wähner
 
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertFast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertconfluent
 
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Paolo Castagna
 
Amsterdam meetup at ING June 18, 2019
Amsterdam meetup at ING June 18, 2019Amsterdam meetup at ING June 18, 2019
Amsterdam meetup at ING June 18, 2019confluent
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterPaolo Castagna
 
Kafka Vienna Meetup 020719
Kafka Vienna Meetup 020719Kafka Vienna Meetup 020719
Kafka Vienna Meetup 020719Patrik Kleindl
 

Similar to Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka (20)

KSQL: The Streaming SQL Engine for Apache Kafka
KSQL: The Streaming SQL Engine for Apache KafkaKSQL: The Streaming SQL Engine for Apache Kafka
KSQL: The Streaming SQL Engine for Apache Kafka
 
KSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use BothKSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use Both
 
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud ServicesBuild a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZ
 
ksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time EventsksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time Events
 
How to Build Streaming Apps with Confluent II
How to Build Streaming Apps with Confluent IIHow to Build Streaming Apps with Confluent II
How to Build Streaming Apps with Confluent II
 
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
 
APAC ksqlDB Workshop
APAC ksqlDB WorkshopAPAC ksqlDB Workshop
APAC ksqlDB Workshop
 
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
 
Using Kafka to integrate DWH and Cloud Based big data systems
Using Kafka to integrate DWH and Cloud Based big data systemsUsing Kafka to integrate DWH and Cloud Based big data systems
Using Kafka to integrate DWH and Cloud Based big data systems
 
Introduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - MadridIntroduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - Madrid
 
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
Big, Fast, Easy Data: Distributed Stream Processing for Everyone with KSQL, t...
 
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQLRethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
 
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertFast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
 
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!
 
Amsterdam meetup at ING June 18, 2019
Amsterdam meetup at ING June 18, 2019Amsterdam meetup at ING June 18, 2019
Amsterdam meetup at ING June 18, 2019
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matter
 
Kafka Vienna Meetup 020719
Kafka Vienna Meetup 020719Kafka Vienna Meetup 020719
Kafka Vienna Meetup 020719
 

More from Kai Wähner

Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Kai Wähner
 
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKai Wähner
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaKai Wähner
 
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareApache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareKai Wähner
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Kai Wähner
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryKai Wähner
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryKai Wähner
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryKai Wähner
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail IndustryKai Wähner
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKai Wähner
 
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Kai Wähner
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingKai Wähner
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022Kai Wähner
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesKai Wähner
 
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Kai Wähner
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Kai Wähner
 
Apache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsApache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsKai Wähner
 
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR ModernizationApache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR ModernizationKai Wähner
 
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Kai Wähner
 

More from Kai Wähner (20)

Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
 
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
 
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareApache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and Manufacturing
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
 
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
 
Apache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsApache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and Logistics
 
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR ModernizationApache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization
 
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
 

Recently uploaded

Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptxVinzoCenzo
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencessuser9e7c64
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITmanoharjgpsolutions
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
SoftTeco - Software Development Company Profile
SoftTeco - Software Development Company ProfileSoftTeco - Software Development Company Profile
SoftTeco - Software Development Company Profileakrivarotava
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingShane Coughlan
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Anthony Dahanne
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 

Recently uploaded (20)

Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptx
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conference
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh IT
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
SoftTeco - Software Development Company Profile
SoftTeco - Software Development Company ProfileSoftTeco - Software Development Company Profile
SoftTeco - Software Development Company Profile
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 

Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka

  • 1. 1C O N F I D E N T I A L Stream Processing with Confluent Kafka Streams and KSQL Kai Waehner Technology Evangelist kontakt@kai-waehner.de LinkedIn @KaiWaehner www.confluent.io www.kai-waehner.de
  • 2. 2C O N F I D E N T I A L
  • 3. 3C O N F I D E N T I A L
  • 4. 4C O N F I D E N T I A L Ubiquitous connectivity Globally scalable platform for all event producers and consumers Immediate data access Data accessible to all consumers in real time Single system of record Persistent storage to enable reprocessing of past events Continuous queries Stream processing capabilities for in-line data transformation Microservice s DBs SaaS apps Mobile Customer 360 Real-time fraud detection Data warehouse Producers Consumers Database change Microservices events SaaS data Customer experience s Streams of real time events Stream processing appsStream processing apps Stream processing apps A Streaming Platform is the Underpinning of an Event-driven Architecture
  • 5. 5C O N F I D E N T I A L
  • 6. 6C O N F I D E N T I A L The beginning of a new Era https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying The first use case. This is why Kafka was created!
  • 7. 7C O N F I D E N T I A L ● Global-scale ● Real-time ● Persistent Storage ● Stream Processing Apache Kafka: The De-facto Standard for Real-Time Event Streaming Edge Cloud Data LakeDatabases Datacenter IoT SaaS AppsMobile Microservices Machine Learning Apache Kafka
  • 8. 8C O N F I D E N T I A L Apache Kafka at Scale at Tech Giants > 4.5 trillion messages / day > 6 Petabytes / day “You name it” * Kafka Is not just used by tech giants ** Kafka is not just used for big data
  • 9. 9C O N F I D E N T I A L Confluents Business Value per Use Case Improve Customer Experience (CX) Increase Revenue (make money) Business Value Decrease Costs (save money) Core Business Platform Increase Operational Efficiency Migrate to Cloud Mitigate Risk (protect money) Key Drivers Strategic Objectives (sample) Fraud Detection IoT sensor ingestion Digital replatforming/ Mainframe Offload Connected Car: Navigation & improved in-car experience: Audi Customer 360 Simplifying Omni-channel Retail at Scale: Target Faster transactional processing / analysis incl. Machine Learning / AI Mainframe Offload: RBC Microservices Architecture Online Fraud Detection Online Security (syslog, log aggregation, Splunk replacement) Middleware replacement Regulatory Digital Transformation Application Modernization: Multiple Examples Website / Core Operations (Central Nervous System) The [Silicon Valley] Digital Natives; LinkedIn, Netflix, Uber, Yelp... Predictive Maintenance: Audi Streaming Platform in a regulated environment (e.g. Electronic Medical Records): Celmatix Real-time app updates Real Time Streaming Platform for Communications and Beyond: Capital One Developer Velocity - Building Stateful Financial Applications with Kafka Streams: Funding Circle Detect Fraud & Prevent Fraud in Real Time: PayPal Kafka as a Service - A Tale of Security and Multi-Tenancy: Apple Example Use Cases $↑ $↓ $ Example Case Studies (of many)
  • 10. 10C O N F I D E N T I A L A Modern, Distributed Platform for Data Streams. Messaging + Storage + Processing!
  • 11. 11C O N F I D E N T I A L Stream Processing processing-time event-time windowing alice bob dave
  • 12. 12C O N F I D E N T I A L Confluent Delivers a Mission-Critical Event Streaming Platform Apache Kafka® Core | Connect API | Streams API Data Compatibility Schema Registry Enterprise Operations Replicator | Auto Data Balancer | Connectors | MQTT Proxy | Kubernetes Operator Database Changes Log Events IoT Data Web Events other events Hadoop Database Data Warehouse CRM other DATA INTEGRATION Transformations Custom Apps Analytics Monitoring other REAL-TIME APPLICATIONS COMMUNITY FEATURES COMMERCIAL FEATURES Datacenter Public Cloud Confluent Cloud Confluent Platform Management & Monitoring Control Center | Security Development & Connectivity Clients | Connectors | REST Proxy | KSQL CONFLUENT FULLY-MANAGEDCUSTOMER SELF-MANAGED
  • 13. 13C O N F I D E N T I A L What We Cover Today: streams The streaming SQL engine for Apache Kafka® to write real-time applications in SQL Apache Kafka® library to write real-time applications and microservices in Java, Scala KSQL 10
  • 14. 14C O N F I D E N T I A L Lower the bar to enter the world of streaming User Population CodingSophistication Core developers who use Java/Scala Core developers who don’t use Java/Scala Data engineers, architects, DevOps/SRE BI analysts streams
  • 15. 15C O N F I D E N T I A L CREATE STREAM fraudulent_payments AS SELECT * FROM payments WHERE fraudProbability > 0.8 Lower the bar to enter the world of streaming vs. KSQL streams
  • 16. 16C O N F I D E N T I A L Confluent ● Kafka Streams and KSQL for stream processing ● Lower-level Kafka Producer and Kafka Consumer clients for multiple languages Java C/C++ Go Python.NET JMS Kafka Streams KSQL Java/Scala Streaming SQL
  • 17. 17C O N F I D E N T I A L Microservices Example use cases Data enrichment Streaming ETL Filter, cleanse, mask Real-time monitoring Anomaly detection
  • 18. 18C O N F I D E N T I A L Similarities and Differences of KSQL and Kafka Streams
  • 19. 19C O N F I D E N T I A L Similarities Enterprise support All you need is Kafka Run everywhere Elastic, scalable, fault-tolerant Kafka security integration Powerful processing Supports streams & tables Exactly-once processing Event-time processing And more! 1 2 3 4 5 6 7 8 9 ... streams KSQL
  • 20. 20C O N F I D E N T I A L Similarities KSQL (processing) Kafka (data) JVM application with Kafka Streams (processing) Does not run on Kafka brokers! Does not run on Kafka brokers!
  • 21. 21C O N F I D E N T I A L Differences Consumer, Producer KSQL Kafka Streams Flexibilit y Ease of Use CREATE STREAM ... CREATE TABLE ... SELECT, JOIN, COUNT, … KStream, KTable, filter(), map(), flatMap(), join(), aggregate(), … subscribe(), poll(), send(), flush(), beginTransaction(), … streams KSQL Kafka Clients
  • 22. 22C O N F I D E N T I A L Differences You write… UI included for human interaction CLI included for human interaction Data formats Interactive queries KSQL statements Yes (Enterprise) Yes Avro, JSON, CSV (today) Not yet JVM applications No No Any data format, including Avro, JSON, CSV, Protobuf, XML Yes streams KSQL Flexibility, use case coverage Limited to KSQL syntax, UDFs Full power of Java, Scala REST API included Yes No, but you can DIY Runtime included Yes, the KSQL server Not needed, applications run as standard JVM processes
  • 23. 23C O N F I D E N T I A L Guidance streams KSQL • New to streaming and Kafka • Prefer SQL to writing code in Java, Scala • Prefer interactive experience with UI, CLI • Use cases include: filtering, transforming, masking data; enriching data, joining data sources • Use case is naturally expressible through SQL, with optional help from User Defined Functions as “get out of jail free” card • Provides KSQL REST API for use from Python, Go, JavaScript, shell, etc. • At least basic Kafka experience • Prefer writing and deploying JVM apps • Writing microservices • Use cases cover KSQL’s and more • To integrate with external services or 3rd party libraries (but see KSQL UDFs) • To customize or fine-tune a use case, e.g. custom joins, probabilistic counting • Need for queryable state, which is not yet supported by KSQL
  • 24. 24C O N F I D E N T I A L KSQL and Kafka Streams A closer look
  • 25. 25C O N F I D E N T I A L Next: KSQL in more detail streams The streaming SQL engine for Apache Kafka® to write real-time applications in SQL Apache Kafka® library to write real-time applications and microservices in Java, Scala KSQL 30
  • 26. 26C O N F I D E N T I A L KSQL ● You write only SQL. No Java, Python, or other boilerplate to wrap around it! ● Create KSQL user defined functions in Java when needed. CREATE STREAM fraudulent_payments AS SELECT * FROM payments WHERE fraudProbability > 0.8
  • 27. 27C O N F I D E N T I A L New user experience: interactive stream processing
  • 28. 28C O N F I D E N T I A L KSQL can be used interactively + programmatically ksql> 1 UI POST /query 2 CLI 3 REST 4 Headless
  • 29. 29C O N F I D E N T I A L KSQL REST API POST /query HTTP/1.1 { "ksql": "SELECT * FROM users WHERE name LIKE 'a%';" "streamsProperties": { "your.custom.setting": "value" } } Work with KSQL programmatically from other languages or the terminal Here: run a continuous query and stream back the results
  • 30. 30C O N F I D E N T I A L All you need is Kafka & KSQL 1.Build & package 2. Submit job ksql> SELECT * FROM myStream Without KSQL With KSQL Processing clusterStorage system required for fault-tolerant processing
  • 31. 31C O N F I D E N T I A L KSQL is a stream processing technology As such it is not yet a great fit for: Ad-hoc queries ● No indexes yet in KSQL ● Kafka often configured to retain data for only a limited span of time BI reports (Tableau etc.) ● No indexes yet in KSQL ● No official JDBC ● Most BI tools don’t understand continuous, streaming results
  • 32. 32C O N F I D E N T I A L Data exploration KSQL example use cases Data enrichment Streaming ETL Filter, cleanse, mask Real-time monitoring Anomaly detection
  • 33. 33C O N F I D E N T I A L Example: CDC from DB via Kafka to Elastic KSQL processes table changes in real-time to continuously maintain aggregates of metrics, KPI Kafka Connect streams data in Kafka Connect streams data out
  • 34. 34C O N F I D E N T I A L Example: Retail KSQL joins the two streams in real-time Stream of shipments that arrive Stream of purchases from online and physical stores
  • 35. 35C O N F I D E N T I A L Example: IoT, Automotive, Connected Cars KSQL joins the stream and table in real-time, and spots for vehicle failures Kafka Connect streams data in Cars send telemetry data via Kafka API Kafka Streams application to notify customers
  • 36. 36C O N F I D E N T I A L KSQL for Streaming ETL ● Joining, filtering, and aggregating streams of event data CREATE STREAM vip_actions AS SELECT user_id, page, action FROM clickstream c LEFT JOIN users u ON c.user_id = u.user_id WHERE u.level = 'Platinum'
  • 37. 37C O N F I D E N T I A L KSQL for Data Transformation ● Easily make derivations of existing topics ● Change data format ● Change number of partitions or partitioning scheme CREATE STREAM pageviews_avro WITH (PARTITIONS=6, VALUE_FORMAT='AVRO') AS SELECT * FROM pageviews_json PARTITION BY user_id
  • 38. 38C O N F I D E N T I A L KSQL for Real-Time Monitoring ● Filtering, tracking, and alerting ● Log data monitoring ● Syslog data ● Sensor / IoT data ● Application metrics CREATE STREAM syslog_invalid_users AS SELECT host, message FROM syslog WHERE message LIKE '%Invalid user%' http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
  • 39. 39C O N F I D E N T I A L KSQL for Anomaly Detection ● Identify patterns or anomalies in real- time data, surfaced in milliseconds CREATE TABLE possible_fraud AS SELECT card_number, COUNT(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING COUNT(*) > 3
  • 40. 40C O N F I D E N T I A L Streams and Tables Important because most use cases need both
  • 41. 41C O N F I D E N T I A L § Do you think that’s a table you are querying?
  • 42. 42C O N F I D E N T I A L The Stream-Table Duality aggregation changelog “materialized view” of the stream (like SUM, COUNT) Stream Table (CDC)
  • 43. 43C O N F I D E N T I A L The Stream-Table Duality CREATE TABLE num_visited_locations_per_user AS SELECT username, COUNT(*) FROM location_updates GROUP BY username
  • 44. 45C O N F I D E N T I A L Scalability, Elasticity, Fault-Tolerance
  • 45. 46C O N F I D E N T I A L Fault-tolerance, powered by Kafka Server A: “I do stateful stream processing, like tables, joins, aggregations.” “streaming restore” of A’s local state to B Changelog Topic “streaming backup” of A’s local state KSQL Kafka State is automatically migrated in case of server failure Server B: “I restore the state and continue processing where server A stopped.” A key challenge of distributed stream processing is fault-tolerant state.
  • 46. 47C O N F I D E N T I A L Fault-tolerance, powered by Kafka Processing fails over automatically, without data loss or miscomputation. 1 Kafka consumer group rebalance is triggered 2 Processing and state of #3 is migrated via Kafka to remaining servers #1 + #2 #3 died, so #1 and #2 take over 1 Kafka consumer group rebalance is triggered 2 Part of processing incl. state is migrated via Kafka from #1 + #2 to server #3 #3 is back, so work is split again
  • 47. 48C O N F I D E N T I A L Elasticity and Scalability, powered by Kafka You can add, remove, restart servers during live operations. We need more processing power!” “Ok, we can scale down again.”
  • 48. 49C O N F I D E N T I A L Deploying KSQL
  • 49. 50C O N F I D E N T I A L Deploying KSQL KSQL Server (JVM process) …and many more… DEB, RPM, ZIP, TAR downloads http://confluent.io/ksql Docker images confluentinc/cp-ksql-server confluentinc/cp-ksql-cli
  • 50. 51C O N F I D E N T I A L Deploying KSQL #1 Interactive KSQL, for development & testing ksql> POST /query ... KSQL (processing) Kafka (data)
  • 51. 52C O N F I D E N T I A L Deploying KSQL #2 Headless KSQL, for production servers started with same .sql file ... interaction for UI, CLI, REST is disabled KSQL (processing) Kafka (data)
  • 52. 53C O N F I D E N T I A L Deploying KSQL read, write … BookingsTeam … FraudTeam … MobileTeam KSQL (processing) Kafka (data) More KSQL
  • 53. 54C O N F I D E N T I A L Monitoring KSQL https://www.confluent.io/blog/troubleshooting-ksql-part-2 Confluent Control Center JMX
  • 54. 56C O N F I D E N T I A L Next: Kafka Streams in more detail streams The streaming SQL engine for Apache Kafka® to write real-time applications in SQL Apache Kafka® library to write real-time applications and microservices in Java, Scala KSQL 10
  • 55. 57C O N F I D E N T I A L Kafka Streams ● You write standard Java or Scala applications to process your data ● The Kafka Streams library makes these applications: elastic, scalable, fault-tolerant, and more ● All you need is
  • 56. 58C O N F I D E N T I A L All you need is Kafka & Kafka Streams Without Kafka Streams With Kafka Streams JVM application 1.Build & package 2. Submit job required for fault-tolerant processing Processing clusterStorage system
  • 57. 59C O N F I D E N T I A L DB (key-value store) included, and queryable Location-tracking Application: “I continuously track the latest geolocation of every customer vehicle in a table.” Your app has its own local DB. You can also expose it to other apps, e.g. via REST. Other Applications (Java, Go, Python, etc.) can directly query this table. Kafka Write results Read results Alternative query
  • 58. 60C O N F I D E N T I A L Microservices Kafka Streams example use cases Data enrichment Streaming ETL Filter, cleanse, mask Real-time monitoring Anomaly detection
  • 59. 61C O N F I D E N T I A L Writing Kafka Streams applications <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka-streams</artifactId> <version>2.1.0</version> </dependency> Add as dependency to your Java/Scala application
  • 60. 62C O N F I D E N T I A L Writing Kafka Streams applications DSL Processor API And of course, you can combine the DSL and the Processor API! API style Functional programming Imperative programming Typically used when Starting point for most developers and most use cases To customize, tune, or to add functionality beyond what’s in the DSL today You work with KStream and KTable Processors and state stores Example operations KStream#map(), KTable#filter(), KStream#join(), aggregate() Processor#init(), Processor#close(), Processor#process(msg) Suitable for use cases S / M / L / XL S / M / L / XL
  • 61. 63C O N F I D E N T I A L Deploying Kafka Streams applications JVM application with Kafka Streams (processing) Develop your application Build and package (jar, container, ...) Deploy and run one or multiple app instances …and many more…
  • 62. 64C O N F I D E N T I A L Elasticity and Scalability, powered by Kafka You can add, remove, restart instances of your application during live operations. We need more processing power!” “Ok, we can scale down again.”
  • 63. 65C O N F I D E N T I A L Deploying Kafka Streams applications read, write App (processing) Kafka (data) More Apps BookingsTeam FraudTeam … MobileTeam …
  • 64. 66C O N F I D E N T I A L Confluent Platform as Central Nervous System
  • 65. 67C O N F I D E N T I A L Confluent’s Streaming Maturity Model - where are you? Value Maturity (Investment & time) 2 Enterprise Streaming Pilot / Early Production Pub + Sub Store Process 5 Central Nervous System 1 Developer Interest Pre-Streaming 4 Global Streaming 3 SLA Ready, Integrated Streaming Projects Platform
  • 66. 69C O N F I D E N T I A L Kafka Connect Kafka Cluster CRM Integration Domain-Driven Design for your Event Steaming Platform Legacy Integration Custom Application ESB Connector Java / KSQL / Kafka Streams Schema Registry Event Streaming Platform CRM Domain Legacy Domain Payment Domain è Independent and loosely coupled, but scalable, highly available and reliable!
  • 67. 70C O N F I D E N T I A L Confluent Schema Registry for Message Validation Input Data Schema Registry App 1 • “Kafka Benefits Under the Hood” • Schema definition + evolution • Forward and backward compatibility • Multi data center deployment App X
  • 68. 71C O N F I D E N T I A L Resources and Next Steps confluentinc/kafka-streams-examples https://docs.confluent.io/current/streams/ http://cnfl.io/slack
  • 69. 72C O N F I D E N T I A L Kai Waehner Technology Evangelist kontakt@kai-waehner.de @KaiWaehner www.confluent.io www.kai-waehner.de LinkedIn Questions? Feedback? Please contact me!
  • 70. 73C O N F I D E N T I A L