Apache™ Kafka is a fast, scalable, durable, and fault-tolerant publish-subscribe messaging system. It offers high throughput, reliability, and replication. To manage growing data volumes, many companies are leveraging Kafka for streaming data ingest and processing.
Join experts from Confluent, the creators of Apache™ Kafka, and from Attunity, a leader in data integration software, for a live webinar where you will learn how to:
-Realize the value of streaming data ingest with Kafka
-Turn databases into live feeds for streaming ingest and processing
-Accelerate data delivery to enable real-time analytics
-Reduce skill and training requirements for data ingest
The recorded webinar on slide 32 includes a demo using automation software (Attunity Replicate) to stream live changes from a database into Kafka and also includes a Q&A with our experts.
For more information, please go to www.attunity.com/kafka.
2. You will learn how to
• Realize the value of streaming data ingest with Kafka
• Turn databases into live feeds for streaming ingest and processing
• Accelerate data delivery to enable real-time analytics
• Reduce skill and training requirements for data ingest
4. About Confluent
• Founded by the creators of Apache Kafka
• Founded September 2014
• Technology developed while at LinkedIn
• 73% of active Kafka committers
Leadership:
Cheryl Dalrymple, CFO
Jay Kreps, CEO
Neha Narkhede, CTO, VP Engineering
Luanne Dauber, CMO
Todd Barnett, VP WW Sales
Jabari Norton, VP Business Dev
5. What does Kafka do?
[Diagram: Producers and Consumers are your interfaces to the world; Kafka Connect on either side of a Topic keeps Kafka connected to your systems in real time.]
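To make the Kafka Connect side concrete: a source connector is defined by a small configuration rather than custom code. The fragment below sketches a Confluent JDBC source connector; the connector name, connection URL, table, and topic prefix are hypothetical placeholders.

```json
{
  "name": "orders-jdbc-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://db-host:5432/sales",
    "table.whitelist": "orders",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "sales-"
  }
}
```

Posted to the Kafka Connect REST API, a configuration like this would stream new rows from the `orders` table into a `sales-orders` topic with no application code.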
7. Before: Many Ad Hoc Pipelines
[Diagram: point-to-point pipelines connect sources — User Tracking, Operational Logs, Operational Metrics, Espresso, Cassandra, Oracle — directly to consumers such as Hadoop, Search, Security, Fraud Detection Application, Data Warehouse, and Monitoring.]
8. After: Stream Data Platform with Kafka
[Diagram: the same sources — User Tracking, Operational Logs, Operational Metrics, Espresso, Cassandra, Oracle — publish into Kafka, which is distributed and fault tolerant, stores messages, and processes streams; Hadoop, Log Search, Security, Fraud Detection Application, Data Warehouse, and Monitoring all consume from it.]
9. People Using Kafka Today
Financial Services • Entertainment & Media • Consumer Tech • Travel & Leisure • Enterprise Tech • Telecom • Retail
10. Common Kafka Use Cases
Data transport and integration
• Log data
• Database changes
• Sensors and device data
• Monitoring streams
• Call data records
• Stock ticker data
Real-time stream processing
• Monitoring
• Asynchronous applications
• Fraud and security
11. Kafka for Integration
What is the key challenge? Making sure all data ends up in the right places.
12. Data Integration Anti-Patterns
1. Ad-hoc pipelines
2. Extreme processing
3. Loss of metadata
The result: tight coupling and lost agility.
13. Why is Kafka such a great fit?
Because at the heart of EVERY system… …there is a LOG, and Kafka is a scalable and reliable system to manage LOGs.
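The "log" here is an append-only sequence of records that many readers consume at their own pace. This toy sketch (plain Python, not the Kafka client API) illustrates the idea: producers append to an ordered log, and each consumer tracks its own offset independently.

```python
class Log:
    """A toy append-only log: records are immutable and strictly ordered."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset assigned to the new record


class Consumer:
    """Each consumer keeps its own read position (offset) in the log."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self):
        new = self.log.records[self.offset:]
        self.offset = len(self.log.records)
        return new


log = Log()
log.append({"user": "alice", "event": "click"})
log.append({"user": "bob", "event": "view"})

fast, slow = Consumer(log), Consumer(log)
print(fast.poll())  # this consumer sees the full history so far
log.append({"user": "alice", "event": "purchase"})
print(fast.poll())  # and only the new record on the next poll
```

Because the log itself never changes once written, any number of consumers — fast or slow — can read it without coordinating with each other or with producers, which is the property that makes the log such a good integration primitive.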
17. What's next?
• Database data is available for any application
• No impact on production
• Database TABLES turned into a STREAM of events
• Ready for the next challenge? Stream processing applications
18. Confluent Platform with Attunity Connectivity
[Diagram: the Confluent Platform — Apache Kafka Core, Connectors, Kafka Streams, Control Center, Clients & Developer Tools — sits between sources (database changes, mobile devices, IoT, logs, website events, plus Hadoop, ERP, CRM, Data Warehouse, and RDBMS via data integration connectors) and real-time applications: alerting, monitoring, real-time analytics, custom applications, and transformations. Source and sink connectors carry data in and out. Legend: Confluent Platform / Confluent Platform Enterprise / External Product; backed by Support, Services and Consulting.]
19. Confluent Platform: It's Kafka ++
Feature and benefit (availability: Apache Kafka / Confluent Platform 3.0 / Confluent Enterprise 3.0):
• Apache Kafka: high-throughput, low-latency, highly available, secure distributed messaging system
• Kafka Connect: advanced framework for connecting external sources and destinations into Kafka
• Java Client: provides easy integration into Java applications
• Kafka Streams: simple library that enables streaming application development within the Kafka framework
• Additional Clients: supports non-Java clients; C, C++, Python, etc.
• REST Proxy: provides universal access to Kafka from any network-connected device via HTTP
• Schema Registry: central registry for the format of Kafka data; guarantees all data is always consumable
• Pre-Built Connectors: HDFS, JDBC, and other connectors fully certified and fully supported by Confluent
• Confluent Control Center: includes connector management and stream monitoring; the command center provides advanced functionality and control
• Support: Community / Community / 24x7x365 (Free / Free / Subscription)
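As one example from the feature list above, the REST Proxy accepts produce requests over plain HTTP. The sketch below only builds such a request in Python; the proxy address and topic name are hypothetical, while the `/topics/<name>` endpoint and the JSON embedded-format content type follow the Confluent REST Proxy conventions. Actually sending it would require a running proxy, so the sketch stops at construction.

```python
import json

def build_produce_request(proxy_url, topic, records):
    """Build an HTTP POST for the REST Proxy's produce endpoint.

    Returns (url, headers, body) ready for any HTTP client.
    """
    url = f"{proxy_url}/topics/{topic}"
    headers = {"Content-Type": "application/vnd.kafka.json.v2+json"}
    body = json.dumps({"records": [{"value": r} for r in records]})
    return url, headers, body


url, headers, body = build_produce_request(
    "http://localhost:8082",              # hypothetical proxy address
    "db-changes",                         # hypothetical topic
    [{"table": "orders", "op": "INSERT"}],
)
print(url)  # http://localhost:8082/topics/db-changes
```

This is what "universal access from any network-connected device" means in practice: any client that can speak HTTP and JSON can publish to Kafka without a native Kafka library.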
23. About Attunity
Overview: global operations, US HQ; 2000 customers in 65 countries; NASDAQ traded, fast growing; global footprint.
Data Integration and Big Data Management:
1. Accelerate data delivery and availability
2. Automate data readiness for analytics
3. Optimize data management with intelligence
24. Attunity Product Suite
• Attunity Replicate: universal data availability; move data to any platform
• Attunity Compose: data warehouse automation; automate ETL/EDW
• Attunity Visibility: data usage profiling & analytics; optimize performance and cost
On premises or in the cloud, across Hadoop, files, RDBMS, EDW, SAP, and mainframe.
25. Attunity Replicate for Kafka
Stream your databases to Kafka with Attunity Replicate:
• Easily: a configurable and automated solution; with a few clicks you can turn databases into live feeds for Kafka
• Continuously: capture and stream data changes efficiently, in real time, and with low impact
• Heterogeneously: using the same platform for many source database systems (Oracle, SQL, DB2, Mainframe, many more…)
27. Kafka and real-time streaming
Demand
• Easy ingest and CDC
• Real-time processing
• Real-time monitoring
• Real-time Hadoop
• Scalable to 1000s of applications
• One publisher, multiple consumers
Attunity Replicate
• Direct integration using Kafka APIs
• In-memory optimized data streaming
• Support for multi-topic and multi-partitioned data publication
• Full load and CDC
• Integrated management and monitoring via GUI
29. Easily create and manage Kafka endpoints
Eliminate manual coding
• Drag-and-drop interface for all sources and targets
• Monitor and control data streams through the web console
• Bulk load or CDC
• Multi-topic and multi-partitioned data publication
Change records are published as JSON messages of the form:
{
  "table": "table-name",
  "schema": "schema-name",
  "op": "operation-type",
  "ts": "change-timestamp",
  "data": [{"col1": "val1"}, {"col2": "val2"}, …, {"colN": "valN"}],
  "bu_data": [{"col1": "val1"}, {"col2": "val2"}, …, {"colN": "valN"}]
}
(Attunity Replicate, via GUI or command line.)
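A downstream consumer would decode each such message and dispatch on the operation type. This is a sketch assuming the envelope shown above, with `data` holding the new column values as a list of single-column objects (the function name and sample values are hypothetical):

```python
import json

def handle_change(raw_message):
    """Decode one change record and summarize it for downstream use."""
    msg = json.loads(raw_message)
    # Flatten the list of single-column dicts into one {column: value} dict.
    row = {k: v for col in msg.get("data", []) for k, v in col.items()}
    return {
        "target": f'{msg["schema"]}.{msg["table"]}',
        "op": msg["op"],
        "row": row,
    }


raw = json.dumps({
    "table": "orders", "schema": "sales", "op": "INSERT",
    "ts": "2016-06-01T12:00:00Z",
    "data": [{"id": "42"}, {"amount": "99.90"}],
})
print(handle_change(raw))
# {'target': 'sales.orders', 'op': 'INSERT', 'row': {'id': '42', 'amount': '99.90'}}
```

For an UPDATE, the same pattern applied to `bu_data` would recover the before-image of the row, which is what makes these messages usable for auditing as well as replication.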
30. Zero-footprint architecture
Lower impact on IT
• No software agents on sources and targets for mainstream databases
• Replicate data from 100s of source systems with easy configuration
• No software upgrades required at each database source or target
[Diagram: sources (Hadoop, files, RDBMS, EDW, mainframe) feed Kafka and other targets (Hadoop, files, RDBMS, EDW) via log-based capture with source-specific optimization.]
31. Heterogeneous: broad support for sources and targets
Sources:
• RDBMS: Oracle, SQL Server, DB2 LUW, DB2 iSeries, DB2 z/OS, MySQL, Sybase ASE, Informix
• Data Warehouse: Exadata, Teradata, Netezza, Vertica, Actian Vector, Actian Matrix
• Hadoop: Hortonworks, Cloudera, MapR, Pivotal
• Legacy: IMS/DB, SQL M/P, Enscribe, RMS, VSAM
• Cloud: AWS RDS, Salesforce
Targets:
• RDBMS: Oracle, SQL Server, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix
• Data Warehouse: Exadata, Teradata, Netezza, Vertica, Pivotal DB (Greenplum), Pivotal HAWQ, Actian Vector, Actian Matrix, Sybase IQ
• Hadoop: Hortonworks, Cloudera, MapR, Pivotal
• NoSQL: MongoDB
• Cloud: AWS RDS/Redshift/EC2, Google Cloud SQL, Google Cloud Dataproc, Azure SQL Data Warehouse, Azure SQL Database
• Message Broker: Kafka