SlideShare a Scribd company logo
1 of 15
PROCESSING TRILLIONS
OF EVENTS PER DAY
WITH APACHE KAFKA ON
AZURE
LESSONS LEARNED
SIPHON
NOOR ABANI
NEGIN RAOOF
SCALE
 Use Cases:
 O365 Intrusion detection
 Bing Health Monitoring
 PowerPoint cloud-based
grammar check
 Skype call analytics
 Bing Ads campaign
reporting and billing
 Scale:
 2.6 Trillion ingress events per day
 2 PB ingress per day
 > 20 regions
 > 100k producers
 > 2k brokers
WHAT
DO
YOU
TUNE
KAFKA
FOR?
CUSTOMERS’ USE CASES
Throughput
Latency
Security and
intrusion
detection
Real-time
online spelling
and grammar
checks
Telemetry data
- Availability
monitoring
< 250 ms – 1.5 GBps< 100ms – 2 GBps
< 10ms – 250 MBps
LOW LATENCY
Real-time online
spelling and grammar checks
Configs
o Higher Partition Density
o Smaller Producer Batch Size
o Additional Disks per Broker
Security and intrusion detection
applications
Configs
o Fewer Replications
o Fewer Required Acks
o Larger Producer Batch Size
HIGH THROUGHPUT LOW RELIABILITY
Telemetry data for availability
monitoring
Configs
o Higher Partition Density
o Larger Producer Batch Size
o Additional Disks per Broker
EXPERIMENT SETUP
Kafka o Kafka version 1.1
o Java producer client
o 20 Topics x 3 replicas
CONFIGURATIONS
Hardware Configs
o CPU
o RAM
o Disks per broker
Kafka Producer Configs
o Batch size
o Linger
o Compression
o Producer required acks
o Buffer memory
Kafka Broker Configs
o Num.io.threads
o Num.network.threads
o Min.replica.fetchers
o Replica.fetch.min.bytes
o Number of topics/partitions
ADDITIONAL DISKS PER BROKER
Maximum of 16 standard HDD disks per
broker
o CPU: 10 brokers with 8 cores
o RAM: 28 GB per broker
o Disks: Azure standard S30 HDD
1 TB - Up to 60 MB/second
BROKER CONFIGURATIONS
Broker Configuration Our Setting Default
min.insync.replicas 1 1
num.io.threads 48 8
num.network.threads 40 3
num.replica.fetchers 32 1
replica.fetch.min.bytes 512 1
PARTITIONS DISTRIBUTION (NOT INCLUDING REPLICAS)
BATCH SIZE
PRODUCER REQUIRED ACKS
COMPRESSION
CONCLUSION
Throughput
Latency
Security and
intrusion
detection
Real-time
online spelling
and grammar
checks
Telemetry data
- Availability
monitoring
< 250 ms – 1.5 GBps< 100ms – 2 GBps
< 10ms – 250 MBps
• Larger batches
• Higher ingress load
• 3x replication
• Producer acks = -1
• replica.fetch.min.bytes = 512
• Smaller batches
• Lower ingress load
• 3x replication
• Producer acks = -1
• replica.fetch.min.bytes = 1
• Larger batches
• Higher ingress load
• 2x replication
• Producer acks = 1
• replica.fetch.min.bytes = 512
Q & A
Noor.Abani@microsoft.co
m
Negin.Raoof@microsoft.co
m
https://azure.microsoft.com/en-us/blog/processing-
trillions-of-events-per-day-with-apache-kafka-on-
azure/
On Azure Blog:

More Related Content

What's hot

Building scalable web socket backend
Building scalable web socket backendBuilding scalable web socket backend
Building scalable web socket backendConstantine Slisenka
 
Dhcp security #netseckh
Dhcp security #netseckhDhcp security #netseckh
Dhcp security #netseckhHEM Sothon
 
FastNetMon Advanced DDoS detection tool
FastNetMon Advanced DDoS detection toolFastNetMon Advanced DDoS detection tool
FastNetMon Advanced DDoS detection toolPavel Odintsov
 
FastNetMon - ENOG9 speech about DDoS mitigation
FastNetMon - ENOG9 speech about DDoS mitigationFastNetMon - ENOG9 speech about DDoS mitigation
FastNetMon - ENOG9 speech about DDoS mitigationPavel Odintsov
 
Nanog66 vicente de luca fast netmon
Nanog66 vicente de luca fast netmonNanog66 vicente de luca fast netmon
Nanog66 vicente de luca fast netmonPavel Odintsov
 
VoxxedDays Minsk - Building scalable WebSocket backend
VoxxedDays Minsk - Building scalable WebSocket backendVoxxedDays Minsk - Building scalable WebSocket backend
VoxxedDays Minsk - Building scalable WebSocket backendConstantine Slisenka
 
DDoS Mitigation Tools and Techniques
DDoS Mitigation Tools and TechniquesDDoS Mitigation Tools and Techniques
DDoS Mitigation Tools and TechniquesBabak Farrokhi
 
Строим ханипот и выявляем DDoS-атаки
Строим ханипот и выявляем DDoS-атакиСтроим ханипот и выявляем DDoS-атаки
Строим ханипот и выявляем DDoS-атакиPositive Hack Days
 
Blackholing from a_providers_perspektive_theo_voss
Blackholing from a_providers_perspektive_theo_vossBlackholing from a_providers_perspektive_theo_voss
Blackholing from a_providers_perspektive_theo_vossPavel Odintsov
 
FastNetMonを試してみた
FastNetMonを試してみたFastNetMonを試してみた
FastNetMonを試してみたYutaka Ishizaki
 
How to launch and defend against a DDoS
How to launch and defend against a DDoSHow to launch and defend against a DDoS
How to launch and defend against a DDoSjgrahamc
 
DNS как линия защиты/DNS as a Defense Vector
DNS как линия защиты/DNS as a Defense VectorDNS как линия защиты/DNS as a Defense Vector
DNS как линия защиты/DNS as a Defense VectorPositive Hack Days
 
BSides Rochester 2018: Chris Partridge: Turning Domain Data Into Domain Intel...
BSides Rochester 2018: Chris Partridge: Turning Domain Data Into Domain Intel...BSides Rochester 2018: Chris Partridge: Turning Domain Data Into Domain Intel...
BSides Rochester 2018: Chris Partridge: Turning Domain Data Into Domain Intel...JosephTesta9
 
DNS-SD
DNS-SDDNS-SD
DNS-SDnetvis
 
Be Mean to Your Code
Be Mean to Your CodeBe Mean to Your Code
Be Mean to Your CodeJames Wickett
 
DeiC DDoS Prevention System - DDPS
DeiC DDoS Prevention System - DDPSDeiC DDoS Prevention System - DDPS
DeiC DDoS Prevention System - DDPSPavel Odintsov
 
New DNS Traffic Analysis Techniques to Identify Global Internet Threats
New DNS Traffic Analysis Techniques to Identify Global Internet ThreatsNew DNS Traffic Analysis Techniques to Identify Global Internet Threats
New DNS Traffic Analysis Techniques to Identify Global Internet ThreatsOpenDNS
 
How to bypass an IDS with netcat and linux
How to bypass an IDS with netcat and linuxHow to bypass an IDS with netcat and linux
How to bypass an IDS with netcat and linuxKirill Shipulin
 
Distributed Denial of Service Attack - Detection And Mitigation
Distributed Denial of Service Attack - Detection And MitigationDistributed Denial of Service Attack - Detection And Mitigation
Distributed Denial of Service Attack - Detection And MitigationPavel Odintsov
 

What's hot (20)

DNS-SD Extentions
DNS-SD ExtentionsDNS-SD Extentions
DNS-SD Extentions
 
Building scalable web socket backend
Building scalable web socket backendBuilding scalable web socket backend
Building scalable web socket backend
 
Dhcp security #netseckh
Dhcp security #netseckhDhcp security #netseckh
Dhcp security #netseckh
 
FastNetMon Advanced DDoS detection tool
FastNetMon Advanced DDoS detection toolFastNetMon Advanced DDoS detection tool
FastNetMon Advanced DDoS detection tool
 
FastNetMon - ENOG9 speech about DDoS mitigation
FastNetMon - ENOG9 speech about DDoS mitigationFastNetMon - ENOG9 speech about DDoS mitigation
FastNetMon - ENOG9 speech about DDoS mitigation
 
Nanog66 vicente de luca fast netmon
Nanog66 vicente de luca fast netmonNanog66 vicente de luca fast netmon
Nanog66 vicente de luca fast netmon
 
VoxxedDays Minsk - Building scalable WebSocket backend
VoxxedDays Minsk - Building scalable WebSocket backendVoxxedDays Minsk - Building scalable WebSocket backend
VoxxedDays Minsk - Building scalable WebSocket backend
 
DDoS Mitigation Tools and Techniques
DDoS Mitigation Tools and TechniquesDDoS Mitigation Tools and Techniques
DDoS Mitigation Tools and Techniques
 
Строим ханипот и выявляем DDoS-атаки
Строим ханипот и выявляем DDoS-атакиСтроим ханипот и выявляем DDoS-атаки
Строим ханипот и выявляем DDoS-атаки
 
Blackholing from a_providers_perspektive_theo_voss
Blackholing from a_providers_perspektive_theo_vossBlackholing from a_providers_perspektive_theo_voss
Blackholing from a_providers_perspektive_theo_voss
 
FastNetMonを試してみた
FastNetMonを試してみたFastNetMonを試してみた
FastNetMonを試してみた
 
How to launch and defend against a DDoS
How to launch and defend against a DDoSHow to launch and defend against a DDoS
How to launch and defend against a DDoS
 
DNS как линия защиты/DNS as a Defense Vector
DNS как линия защиты/DNS as a Defense VectorDNS как линия защиты/DNS as a Defense Vector
DNS как линия защиты/DNS as a Defense Vector
 
BSides Rochester 2018: Chris Partridge: Turning Domain Data Into Domain Intel...
BSides Rochester 2018: Chris Partridge: Turning Domain Data Into Domain Intel...BSides Rochester 2018: Chris Partridge: Turning Domain Data Into Domain Intel...
BSides Rochester 2018: Chris Partridge: Turning Domain Data Into Domain Intel...
 
DNS-SD
DNS-SDDNS-SD
DNS-SD
 
Be Mean to Your Code
Be Mean to Your CodeBe Mean to Your Code
Be Mean to Your Code
 
DeiC DDoS Prevention System - DDPS
DeiC DDoS Prevention System - DDPSDeiC DDoS Prevention System - DDPS
DeiC DDoS Prevention System - DDPS
 
New DNS Traffic Analysis Techniques to Identify Global Internet Threats
New DNS Traffic Analysis Techniques to Identify Global Internet ThreatsNew DNS Traffic Analysis Techniques to Identify Global Internet Threats
New DNS Traffic Analysis Techniques to Identify Global Internet Threats
 
How to bypass an IDS with netcat and linux
How to bypass an IDS with netcat and linuxHow to bypass an IDS with netcat and linux
How to bypass an IDS with netcat and linux
 
Distributed Denial of Service Attack - Detection And Mitigation
Distributed Denial of Service Attack - Detection And MitigationDistributed Denial of Service Attack - Detection And Mitigation
Distributed Denial of Service Attack - Detection And Mitigation
 

Similar to Tuning Kafka for High Throughput and Low Latency on Azure

The Details That Matter: Kafka in Production, at Scale with Or Arnon and Elad...
The Details That Matter: Kafka in Production, at Scale with Or Arnon and Elad...The Details That Matter: Kafka in Production, at Scale with Or Arnon and Elad...
The Details That Matter: Kafka in Production, at Scale with Or Arnon and Elad...HostedbyConfluent
 
F_1330_Narkhede_Kafka .pptx
F_1330_Narkhede_Kafka .pptxF_1330_Narkhede_Kafka .pptx
F_1330_Narkhede_Kafka .pptxNIMITJAIN71
 
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Community
 
Apache kafka
Apache kafkaApache kafka
Apache kafkaMvkZ
 
Apache kafka
Apache kafkaApache kafka
Apache kafkaMvkZ
 
Apache kafka
Apache kafkaApache kafka
Apache kafkaMvkZ
 
Ceph Day Berlin: Ceph on All Flash Storage - Breaking Performance Barriers
Ceph Day Berlin: Ceph on All Flash Storage - Breaking Performance BarriersCeph Day Berlin: Ceph on All Flash Storage - Breaking Performance Barriers
Ceph Day Berlin: Ceph on All Flash Storage - Breaking Performance BarriersCeph Community
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBScott Mansfield
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward
 
High-performance 32G Fibre Channel Module on MDS 9700 Directors:
High-performance 32G Fibre Channel Module on MDS 9700 Directors:High-performance 32G Fibre Channel Module on MDS 9700 Directors:
High-performance 32G Fibre Channel Module on MDS 9700 Directors:Tony Antony
 
Ceph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash StorageCeph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash StorageCeph Community
 
Storage and performance, Whiptail
Storage and performance, Whiptail Storage and performance, Whiptail
Storage and performance, Whiptail Internet World
 
Ceph Day Beijing - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Beijing - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Beijing - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Beijing - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Community
 
High Frequency Trading and NoSQL database
High Frequency Trading and NoSQL databaseHigh Frequency Trading and NoSQL database
High Frequency Trading and NoSQL databasePeter Lawrey
 
(MED305) Achieving Consistently High Throughput for Very Large Data Transfers...
(MED305) Achieving Consistently High Throughput for Very Large Data Transfers...(MED305) Achieving Consistently High Throughput for Very Large Data Transfers...
(MED305) Achieving Consistently High Throughput for Very Large Data Transfers...Amazon Web Services
 
Ceph Day Taipei - Ceph on All-Flash Storage
Ceph Day Taipei - Ceph on All-Flash Storage Ceph Day Taipei - Ceph on All-Flash Storage
Ceph Day Taipei - Ceph on All-Flash Storage Ceph Community
 

Similar to Tuning Kafka for High Throughput and Low Latency on Azure (20)

The Details That Matter: Kafka in Production, at Scale with Or Arnon and Elad...
The Details That Matter: Kafka in Production, at Scale with Or Arnon and Elad...The Details That Matter: Kafka in Production, at Scale with Or Arnon and Elad...
The Details That Matter: Kafka in Production, at Scale with Or Arnon and Elad...
 
F_1330_Narkhede_Kafka .pptx
F_1330_Narkhede_Kafka .pptxF_1330_Narkhede_Kafka .pptx
F_1330_Narkhede_Kafka .pptx
 
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Ceph Day Berlin: Ceph on All Flash Storage - Breaking Performance Barriers
Ceph Day Berlin: Ceph on All Flash Storage - Breaking Performance BarriersCeph Day Berlin: Ceph on All Flash Storage - Breaking Performance Barriers
Ceph Day Berlin: Ceph on All Flash Storage - Breaking Performance Barriers
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
 
High-performance 32G Fibre Channel Module on MDS 9700 Directors:
High-performance 32G Fibre Channel Module on MDS 9700 Directors:High-performance 32G Fibre Channel Module on MDS 9700 Directors:
High-performance 32G Fibre Channel Module on MDS 9700 Directors:
 
Ceph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash StorageCeph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash Storage
 
Storage and performance, Whiptail
Storage and performance, Whiptail Storage and performance, Whiptail
Storage and performance, Whiptail
 
Ceph Day Beijing - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Beijing - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Beijing - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Beijing - Ceph on All-Flash Storage - Breaking Performance Barriers
 
High Frequency Trading and NoSQL database
High Frequency Trading and NoSQL databaseHigh Frequency Trading and NoSQL database
High Frequency Trading and NoSQL database
 
(MED305) Achieving Consistently High Throughput for Very Large Data Transfers...
(MED305) Achieving Consistently High Throughput for Very Large Data Transfers...(MED305) Achieving Consistently High Throughput for Very Large Data Transfers...
(MED305) Achieving Consistently High Throughput for Very Large Data Transfers...
 
Ceph Day Taipei - Ceph on All-Flash Storage
Ceph Day Taipei - Ceph on All-Flash Storage Ceph Day Taipei - Ceph on All-Flash Storage
Ceph Day Taipei - Ceph on All-Flash Storage
 

More from Nitin Kumar

Deep learning with kafka
Deep learning with kafkaDeep learning with kafka
Deep learning with kafkaNitin Kumar
 
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehnerNitin Kumar
 
Kafka meetup seattle 2019 mirus reliable, high performance replication for ap...
Kafka meetup seattle 2019 mirus reliable, high performance replication for ap...Kafka meetup seattle 2019 mirus reliable, high performance replication for ap...
Kafka meetup seattle 2019 mirus reliable, high performance replication for ap...Nitin Kumar
 
Ren cao kafka connect
Ren cao   kafka connectRen cao   kafka connect
Ren cao kafka connectNitin Kumar
 
Insta clustr seattle kafka meetup presentation bb
Insta clustr seattle kafka meetup presentation   bbInsta clustr seattle kafka meetup presentation   bb
Insta clustr seattle kafka meetup presentation bbNitin Kumar
 
EventHub for kafka ecosystems kafka meetup
EventHub for kafka ecosystems   kafka meetupEventHub for kafka ecosystems   kafka meetup
EventHub for kafka ecosystems kafka meetupNitin Kumar
 
Microsoft challenges of a multi tenant kafka service
Microsoft challenges of a multi tenant kafka serviceMicrosoft challenges of a multi tenant kafka service
Microsoft challenges of a multi tenant kafka serviceNitin Kumar
 
Net flix kafka seattle meetup
Net flix kafka seattle meetupNet flix kafka seattle meetup
Net flix kafka seattle meetupNitin Kumar
 
Brandon obrien streaming_data
Brandon obrien streaming_dataBrandon obrien streaming_data
Brandon obrien streaming_dataNitin Kumar
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Nitin Kumar
 
Microsoft kafka load imbalance
Microsoft   kafka load imbalanceMicrosoft   kafka load imbalance
Microsoft kafka load imbalanceNitin Kumar
 
Map r seattle streams meetup oct 2016
Map r seattle streams meetup   oct 2016Map r seattle streams meetup   oct 2016
Map r seattle streams meetup oct 2016Nitin Kumar
 
Linked in multi tier, multi-tenant, multi-problem kafka
Linked in multi tier, multi-tenant, multi-problem kafkaLinked in multi tier, multi-tenant, multi-problem kafka
Linked in multi tier, multi-tenant, multi-problem kafkaNitin Kumar
 
Seattle kafka meetup nov 2015 published siphon
Seattle kafka meetup nov 2015 published  siphonSeattle kafka meetup nov 2015 published  siphon
Seattle kafka meetup nov 2015 published siphonNitin Kumar
 

More from Nitin Kumar (16)

Deep learning with kafka
Deep learning with kafkaDeep learning with kafka
Deep learning with kafka
 
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
 
Kafka meetup seattle 2019 mirus reliable, high performance replication for ap...
Kafka meetup seattle 2019 mirus reliable, high performance replication for ap...Kafka meetup seattle 2019 mirus reliable, high performance replication for ap...
Kafka meetup seattle 2019 mirus reliable, high performance replication for ap...
 
Ren cao kafka connect
Ren cao   kafka connectRen cao   kafka connect
Ren cao kafka connect
 
Insta clustr seattle kafka meetup presentation bb
Insta clustr seattle kafka meetup presentation   bbInsta clustr seattle kafka meetup presentation   bb
Insta clustr seattle kafka meetup presentation bb
 
EventHub for kafka ecosystems kafka meetup
EventHub for kafka ecosystems   kafka meetupEventHub for kafka ecosystems   kafka meetup
EventHub for kafka ecosystems kafka meetup
 
Kafka eos
Kafka eosKafka eos
Kafka eos
 
Microsoft challenges of a multi tenant kafka service
Microsoft challenges of a multi tenant kafka serviceMicrosoft challenges of a multi tenant kafka service
Microsoft challenges of a multi tenant kafka service
 
Net flix kafka seattle meetup
Net flix kafka seattle meetupNet flix kafka seattle meetup
Net flix kafka seattle meetup
 
Avvo fkafka
Avvo fkafkaAvvo fkafka
Avvo fkafka
 
Brandon obrien streaming_data
Brandon obrien streaming_dataBrandon obrien streaming_data
Brandon obrien streaming_data
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017
 
Microsoft kafka load imbalance
Microsoft   kafka load imbalanceMicrosoft   kafka load imbalance
Microsoft kafka load imbalance
 
Map r seattle streams meetup oct 2016
Map r seattle streams meetup   oct 2016Map r seattle streams meetup   oct 2016
Map r seattle streams meetup oct 2016
 
Linked in multi tier, multi-tenant, multi-problem kafka
Linked in multi tier, multi-tenant, multi-problem kafkaLinked in multi tier, multi-tenant, multi-problem kafka
Linked in multi tier, multi-tenant, multi-problem kafka
 
Seattle kafka meetup nov 2015 published siphon
Seattle kafka meetup nov 2015 published  siphonSeattle kafka meetup nov 2015 published  siphon
Seattle kafka meetup nov 2015 published siphon
 

Recently uploaded

Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 

Recently uploaded (20)

Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 

Tuning Kafka for High Throughput and Low Latency on Azure

Editor's Notes

  1. Introduce ourselves... We will share our experience and learnings from running one of world’s largest Kafka deployments. Besides underlying infrastructure considerations, we discuss several tunable Kafka broker and client configurations that affect performance. Add siphon logo
  2. Scale and numbers
  3. Performance could potentially have orthogonal dimensions. In a realtime data pipeline, we are interested in throughput, latency and in some critical cases reliability as well. 
  4. From our experience, performance requirements fall in three categories. Telemetry data ingestion for near real-time processes like security and intrusion detection applications is one that requires high throughput but is tolerant to high latency. On the other end, real time online spelling and grammar check become obsolete with high latency hence have stringent latency requirements. There are applications that require both high throughput and low latency such as availability monitoring apps but can tolerate data loss. Add the numbers (like in blog) Spend time on each use case
  5. This is a summary of the configurations that have proved significant to achieve the requirements in the three quadrants. A larger batch size is required for high throughput.  Swap headlines give little explanation
  6. To stress-test our system in general and the Kafka clusters specifically, we developed an application which constantly generates message batches of random bytes to a cluster’s front-end. spins 100 threads to send 1,000 messages of 1 KB random data to each topic, in 5 ms intervals. Event Server is used as a front-end web server which implements Kafka producer and consumer APIs. It serves like Kafka Proxy. We provision multiple Event Servers in a cluster to balance the load and manage produce requests sent from thousands of client machines to Kafka brokers. Each Event Server application runs in a docker container on scale-sets of Azure Standard F8s Linux VMs, and is allocated 7 CPUs and 12 GB of memory with a maximum Java heap size set to 9 GB. To handle the large amount of traffic generated by our stress tool, we run 20 instances of these Event Servers. ES instantiates multiple parallel Kafka producer threads. Each thread instantiates one producer. The number of sliding queues is controlled by thread pool size. HDInsight Kafka clusters, running Kafka 1.1 with 20 topics and 3 replicas. Number of partitions varies throughout the tests For our experiments, we ran Null sink connectors which consume messages from Kafka, discard them and then commit the offsets. This allowed us to measure both producer and consumer throughput, while eliminating any potential bottlenecks introduced by sending data to specific destinations. Event Server is used as a front-end web server which implements Kafka producer and consumer APIs. We provision multiple Event Servers in a cluster to balance the load and manage produce requests sent from thousands of client machines to Kafka brokers.
  7.  Azure Standard D4 V2 Linux VMs We never ran into high CPU utilization with this setup. On the other hand, the number of disks had a direct effect on throughput. We monitored insync replicas and consumer lag as well Batch Size: Controls batching on producer client Linger: It puts a ceiling on how long producers wait. In low-load scenarios, this improves throughput by sacrificing latency Acks: The number of acknowledgments the producer requires the leader to have received before considering a request complete Buffer memory: controls the amount of memory available for the producer for buffering. to support larger batching, we increased this setting to 1 GB
  8. How efficiently we are using available resources? CPU was not utilized fully. We monitored Disk usage and utilization Storage disks have limited IOPS (Input/Output Operations Per Second) and read/write bytes per second. When creating new partitions, Kafka stores each new partition on the disk with fewest existing partitions to balance them across the available disks. Despite this, when processing hundreds of replicas on each disk, Kafka can easily saturate the available disk throughput. The results show a correlation of increasing throughput with an increasing number of attached disks. Never tested with SSD
  9. Monitoring the current performance to identify bottlenecks: Min.insync:  this configuration specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful. else: producer raises an exception Together with acks config set to all, this can guarantee that data is written to this many replicas. We monitor both request handler idle time and time spent waiting in the request queue: if high, not enough IO threads or CPU Response queue: you need more network threads Minimum bytes expected for each fetch response. If not enough bytes, wait up to replicaMaxWaitTimeMs This setting helps when high load with larger batches,
  10. the number of partitions per broker, not including replicas increasing the partition density adds an overhead related to metadata operations and per partition request/response between the partition leader and its followers. Even in the absence of data flowing through, partition replicas still fetch data from leaders, which results in extra processing for send and receive requests over the network.  More network and IO threads helps \100 partitions per topic, i.e., a total of 200 partitions per broker (we have 20 topics and 10 brokers) CPU usage also increases with a higher rate
  11. Each Kafka producer batches records for a single partition, optimizing network and IO requests issued to a partition leader. Therefore, increasing batch size could result in higher throughput. Under light load, this may increase Kafka send latency since the producer waits for a batch to be ready. The Linger.ms setting also controls batching. It puts a ceiling on how long producers wait before sending a batch, even if the batch is not full. In low-load scenarios, this improves throughput by sacrificing latency.  Using a larger batch.size makes compression more efficient. The buffer.memory controls total memory available to a producer for buffering. If records get sent faster than they can be transmitted to Kafka then and this buffer will get exceeded then additional send calls block up to max.block.ms after then Producer throws a TimeoutException.
  12. Producer required acks configuration determines the number of acknowledgments required by the partition leader before a write request is considered completed.  While the trend is obvious, it is interesting to quantify the effect of the requried ack settings. We can see that going from no reliabilty to max reliabiltiy (acks = 0 to acks = -1)  cuts the throughput in half and almost doubles the latency.
  13. Compression is beneficial and should be considered if there is a limitation on disk capacity.
  14. Blog  Contact info reiterarte