Consumer offset management in Kafka
Joel Koshy
Kafka meetup @ LinkedIn
March 24, 2015
Consumers and offsets
[Figure: partition logs for PageViewEvent-0 and EmailBounceEvent-0, message offsets 5 to 34]
Offset map (GroupId: audit-consumer)
PageViewEvent-0: 12
EmailBounceEvent-0: 35
Store offsets in ZooKeeper
/
  admin
  brokers
  config
  consumers
    audit-consumer
      ids
      owners
      offsets
        PageViewEvent
          0: 12
        EmailBounceEvent
          0: 35
  controller
  controller_epoch
(Don’t) store offsets in ZooKeeper
•  Heavy write-load on ZooKeeper
•  Especially an issue
– during 0.7 to 0.8 migration
– and before we switched to SSDs
•  Non-ideal work-arounds
– Increase offset-commit intervals
– Filter commits if offsets have not moved
– Spread large offset commits over the commit interval
Offset management (ideals)
•  Durable
•  Support high write-load
•  Consistent reads
•  Atomic offset commits
•  Fast commits/fetches
Store offsets in a replicated log
__consumer_offsets (each entry: [Group, Partition] Offset; the next commit is appended at the end)
[audit-consumer, PageViewEvent-0] 240
[audit-consumer, EmailBounceEvent-0] 232
[audit-consumer, EmailBounceEvent-0] 248
[audit-consumer, PageViewEvent-0] 323
[mirrormaker, ClickEvent-0] 54543
Store offsets in a replicated, partitioned log
__consumer_offsets, partition 3
[audit-consumer, PageViewEvent-0] 240
[audit-consumer, EmailBounceEvent-0] 232
[audit-consumer, EmailBounceEvent-0] 248
[audit-consumer, PageViewEvent-0] 323
__consumer_offsets, partition 8
[mirrormaker, ClickEvent-0] 54543
[mirrormaker, ClickEvent-1] 54444
[mirrormaker, ClickEvent-1] 54674
Partition → abs(GroupId.hashCode()) % NumPartitions
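The partition formula on this slide can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, the partition count of 50 is an assumed value, and the guard for Integer.MIN_VALUE is an assumption added because Math.abs returns a negative number for that input.

```java
/** Sketch: map a consumer group to a partition of the offsets topic. */
final class OffsetsPartitionSketch {

    /** Non-negative abs; maps Integer.MIN_VALUE to 0 (assumption, see lead-in). */
    static int safeAbs(int n) {
        return n == Integer.MIN_VALUE ? 0 : Math.abs(n);
    }

    /** Partition = abs(GroupId.hashCode()) % NumPartitions, as on the slide. */
    static int partitionFor(String groupId, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        return safeAbs(groupId.hashCode()) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 50; // assumed partition count, for illustration only
        for (String group : new String[] {"audit-consumer", "mirrormaker"}) {
            System.out.println(group + " -> partition " + partitionFor(group, numPartitions));
        }
    }
}
```

Because the partition depends only on the group id, all offsets of one group land in the same partition and are handled by the same broker.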
Store offsets in a replicated, partitioned log
Offset commits append to the offsets topic partition
Offset fetches read from the offsets topic partition
Store offsets in a replicated, partitioned log
Offsets cache
[audit-consumer, PageViewEvent-0] 323
[audit-consumer, EmailBounceEvent-0] 248
[mirrormaker, ClickEvent-0] 54543
[mirrormaker, ClickEvent-1] 54674
Offset commits append to the offsets topic partition and update the cache
Offset fetches read from the cache
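The commit and fetch paths described here can be sketched as follows. This is a minimal single-process model, not the broker implementation: the class name, the in-memory list standing in for the replicated log, and the -1 return value for "no committed offset" are all assumptions made for illustration. A real broker would also wait for the append to be acknowledged before updating the cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of one offsets-topic partition with its in-memory cache. */
final class OffsetManagerSketch {

    /** One appended log entry: key "[group, topic-partition]" and committed offset. */
    static final class Entry {
        final String key;
        final long offset;
        Entry(String key, long offset) { this.key = key; this.offset = offset; }
    }

    private final List<Entry> log = new ArrayList<>();        // stands in for the replicated log
    private final Map<String, Long> cache = new HashMap<>();  // latest offset per key

    static String key(String group, String topicPartition) {
        return "[" + group + ", " + topicPartition + "]";
    }

    /** Commit: append to the log, then update the cache. */
    void commit(String group, String topicPartition, long offset) {
        String k = key(group, topicPartition);
        log.add(new Entry(k, offset));
        cache.put(k, offset);
    }

    /** Fetch: answer from the cache only; -1 means no committed offset (sketch convention). */
    long fetch(String group, String topicPartition) {
        Long offset = cache.get(key(group, topicPartition));
        return offset == null ? -1L : offset;
    }

    int logSize() { return log.size(); }

    public static void main(String[] args) {
        OffsetManagerSketch m = new OffsetManagerSketch();
        m.commit("audit-consumer", "PageViewEvent-0", 240);
        m.commit("audit-consumer", "EmailBounceEvent-0", 232);
        m.commit("audit-consumer", "EmailBounceEvent-0", 248);
        m.commit("audit-consumer", "PageViewEvent-0", 323);
        System.out.println(m.fetch("audit-consumer", "PageViewEvent-0"));
        System.out.println(m.fetch("audit-consumer", "EmailBounceEvent-0"));
    }
}
```

Fetches never scan the log, so their cost does not grow with the number of commits.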
How do we GC older offset entries?
log.cleanup.policy = compact
Before compaction:
Offset: 0 1 2 3 4 5 6 7 8 9 10
Key: K1 K2 K1 K1 K3 K2 K4 K5 K5 K2 K6
Value: V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
Tombstone: Offset 11, Key K2, Value Ø
After compaction:
Offset: 3 4 6 8 10
Key: K1 K3 K4 K5 K6
Value: V4 V5 V7 V9 V11
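The effect of compaction on the example above can be reproduced with a short sketch. It keeps the latest record per key and drops any key whose latest record has a null value (a tombstone). The class and method names are hypothetical, and the sketch simplifies the real log cleaner: it compacts the whole log in one pass and discards tombstones immediately, whereas a broker retains them for a configurable time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of key-based log compaction. */
final class CompactionSketch {

    /** A log record; a null value is a tombstone. */
    static final class LogEntry {
        final long offset;
        final String key;
        final String value;
        LogEntry(long offset, String key, String value) {
            this.offset = offset; this.key = key; this.value = value;
        }
    }

    /** Keep only the latest record per key; drop keys whose latest record is a tombstone. */
    static List<LogEntry> compact(List<LogEntry> log) {
        Map<String, LogEntry> latest = new HashMap<>();
        for (LogEntry e : log) {
            latest.put(e.key, e);          // later records overwrite earlier ones
        }
        List<LogEntry> out = new ArrayList<>();
        for (LogEntry e : log) {           // second pass preserves offset order
            if (latest.get(e.key) == e && e.value != null) {
                out.add(e);
            }
        }
        return out;
    }

    /** Builds the example from the slide, including the K2 tombstone at offset 11. */
    static List<LogEntry> slideExample() {
        String[] keys = {"K1", "K2", "K1", "K1", "K3", "K2", "K4", "K5", "K5", "K2", "K6", "K2"};
        List<LogEntry> log = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            String value = (i == 11) ? null : "V" + (i + 1);
            log.add(new LogEntry(i, keys[i], value));
        }
        return log;
    }

    public static void main(String[] args) {
        for (LogEntry e : compact(slideExample())) {
            System.out.println(e.offset + " " + e.key + " " + e.value);
        }
    }
}
```

Surviving records keep their original offsets, which is why the compacted log has gaps.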
Store offsets in a replicated, partitioned, compacted log
Key → [Group, Topic, Partition]
Value → Offset
Before compaction:
[audit-consumer, PageViewEvent-0] 126312342
[audit-consumer, EmailBounceEvent-0] 59843
[audit-consumer, PageViewEvent-0] 126319628
[audit-consumer, EmailBounceEvent-0] 86243
[audit-consumer, PageViewEvent-0] 126398102
After compaction:
[audit-consumer, EmailBounceEvent-0] 86243
[audit-consumer, PageViewEvent-0] 126398102
Dealing with dead consumers
console-consumer-38587, console-consumer-94777, console-consumer-94774, console-consumer-31199,
console-consumer-51555, console-consumer-43182, mobileServiceConsumerDwwewewA13dafddesfasdfdee33,
console-consumer-57784, python-kafka-consumer-0959a04da7c241448beb0813f002e34b,
console-consumer-70750, console-consumer-94809, console-consumer-87470, touch-me-not,
console-consumer-43246, console-consumer-69811, python-kafka-consumer-82c2d653128840d5b6bcbfc5ac7f3abc,
console-consumer-33847, console-consumer-18217, console-consumer-87493, console-consumer-26414,
console-consumer-67299, voldemort-reader-jjkoshy, console-consumer-80245,
kafka_listener_for_comments, test-flow-staging, console-consumer-8441, console-consumer-67258,
data-processor-2, console-consumer-94869, console-consumer-55242, pinot-beta-hackday_1_2,
console-consumer-6601, cloud-host1, system-metrics-monitor-01, console-consumer-70859,
console-consumer-26477, page-view-test-flow-2, page-view-test-flow-1,
python-kafka-consumer-bf33d075b22d4ddfb82d4a055303e909, console-consumer-99768, console-consumer-45509,
console-consumer-21504, points-test_devel_l1_1686489164, console-consumer-14841, console-consumer-4098,
console-consumer-14746, console-consumer-94575, cloud-dcb-host147.company.com,
teacup_reporting_alex, console-consumer-4132, console-consumer-48171, ropod-dcb-host794.company.com,
console-consumer-63743, console-consumer-36147, console-consumer-48138, console-consumer-33595,
console-consumer-6808, console-consumer-31000, console-consumer-73064, console-consumer-18050,
console-consumer-21683, share-message, ropod-dcb-host959.company.com, ropod-dcb-host949.company.com,
sensei-test_dcb_host138.company.com_1924844804, console-consumer-38654, console-consumer-92040,
console-consumer-67052, console-consumer-82690, console-consumer-92002, console-consumer-69687,
console-consumer-31077, console-consumer-94657, console-consumer-36064, console-consumer-45675,
console-consumer-45671, console-consumer-70625, MemberSettings-dcx, console-consumer-55513,
member-links-dcx, console-consumer-85367, opportunist-company, forum-queue, console-consumer-87912,
console-consumer-75909, console-consumer-12320, sensei-test_user2_808173709,
ropod-dcb-host937.company.com, console-consumer-8710, console-consumer-48390,
python-kafka-consumer-816cebafabb34dd5be6bfce59cbee411, console-consumer-8701, console-consumer-6122,
console-consumer-6142, metrics-dcb-monitor19, console-consumer-73329, console-consumer-87942,
console-consumer-80552, console-consumer-48368, autometrics-dcb-host13, …
Dealing with dead consumers
•  For offsets older than the offset retention period:
– Append tombstone
– Remove offset entry from cache
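The two steps above can be sketched as a periodic sweep over the cache. This is an illustration only: the class, the field names, and the use of the last commit time as the age of an entry are assumptions, and the list of tombstone keys stands in for appending null-valued records to the offsets topic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Sketch: expire offsets that are older than the retention period. */
final class OffsetExpirySketch {

    /** Cached offset plus the time of the commit. */
    static final class CachedOffset {
        final long offset;
        final long commitTimeMs;
        CachedOffset(long offset, long commitTimeMs) {
            this.offset = offset; this.commitTimeMs = commitTimeMs;
        }
    }

    final Map<String, CachedOffset> cache = new HashMap<>();
    final List<String> tombstones = new ArrayList<>(); // keys written to the log with a null value

    void commit(String key, long offset, long nowMs) {
        cache.put(key, new CachedOffset(offset, nowMs));
    }

    /** For each stale entry: append a tombstone, then remove the entry from the cache. */
    int expire(long nowMs, long retentionMs) {
        int expired = 0;
        Iterator<Map.Entry<String, CachedOffset>> it = cache.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, CachedOffset> e = it.next();
            if (nowMs - e.getValue().commitTimeMs > retentionMs) {
                tombstones.add(e.getKey());
                it.remove();
                expired++;
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        OffsetExpirySketch s = new OffsetExpirySketch();
        s.commit("[console-consumer-38587, PageViewEvent-0]", 10L, 0L);
        s.commit("[audit-consumer, PageViewEvent-0]", 323L, 90_000L);
        System.out.println("expired: " + s.expire(100_000L, 60_000L));
        System.out.println("tombstones: " + s.tombstones);
    }
}
```

Once the tombstone is in the log, compaction can eventually discard every earlier record for that key.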
Recommended settings for offsets topic
Replication factor: >= 3
min.insync.replicas: >= 2
unclean.leader.election.enable: false
offsets.commit.required.acks: -1 (all)
How to commit/fetch offsets
Cluster: Broker 0, Broker 1, Broker 2, Broker 3 (controller)
__consumer_offsets-34: Leader: 2, ISR: 0, 1, 2
1. The audit-consumer instance sends a consumer metadata request through the VIP; the response names the offset manager (manager=2).
2. The consumer sends offset commits and offset fetches to Broker 2, which serves them from its cache; commits are replicated to the ISR.
When the offset manager moves
Cluster: Broker 0, Broker 1, Broker 2, Broker 3 (controller)
1. __consumer_offsets-34: Leader: 2, ISR: 0, 1, 2. Broker 0 is told to become leader and loads its cache from the offsets topic partition.
2. Broker 2 is told to become follower and drops its cache.
3. __consumer_offsets-34: Leader: 0, ISR: 0, 1, 2. Offset commits and fetches sent by the audit-consumer instance to the old offset manager fail.
4. The consumer sends a consumer metadata request through the VIP; the response names the new offset manager (manager=0).
5. The consumer sends offset commits and offset fetches to Broker 0, which serves them from its cache; commits are replicated to the ISR.
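From the client's point of view, the sequence above amounts to "retry after rediscovering the offset manager". A minimal sketch, assuming a simulated cluster: every class, field, and method name here is hypothetical, the boolean result stands in for a real error code, and the shared map stands in for replicated state.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: a client that rediscovers the offset manager after leadership moves. */
final class OffsetManagerFailoverSketch {

    /** Simulated cluster that knows the current leader of the group's offsets partition. */
    static final class Cluster {
        int leader;                                         // broker id of the offset manager
        final Map<String, Long> offsets = new HashMap<>();  // replicated state, shared for simplicity

        Cluster(int leader) { this.leader = leader; }

        /** Stands in for the consumer metadata request and response. */
        int findOffsetManager(String groupId) { return leader; }

        /** Returns false if the request reached a broker that is not the offset manager. */
        boolean commit(int brokerId, String key, long offset) {
            if (brokerId != leader) return false;
            offsets.put(key, offset);
            return true;
        }
    }

    static final class Client {
        final Cluster cluster;
        final String groupId;
        int manager = -1;   // cached offset manager; -1 means unknown
        int lookups = 0;    // number of metadata requests sent

        Client(Cluster cluster, String groupId) { this.cluster = cluster; this.groupId = groupId; }

        /** Commit, with one rediscovery attempt if the cached manager is stale. */
        boolean commit(String key, long offset) {
            for (int attempt = 0; attempt < 2; attempt++) {
                if (manager < 0) {
                    manager = cluster.findOffsetManager(groupId);
                    lookups++;
                }
                if (cluster.commit(manager, key, offset)) return true;
                manager = -1; // forget the stale manager and look it up again
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Cluster cluster = new Cluster(2);
        Client client = new Client(cluster, "audit-consumer");
        client.commit("PageViewEvent-0", 240L);
        cluster.leader = 0; // the offset manager moves from Broker 2 to Broker 0
        client.commit("PageViewEvent-0", 323L);
        System.out.println("manager=" + client.manager + ", lookups=" + client.lookups);
    }
}
```

Caching the manager keeps the common path to a single request; the metadata lookup is repeated only after a failure.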
Offset{Commit,Fetch} API
ConsumerMetadataRequest
•  Group Id: String
ConsumerMetadataResponse
•  Error code: Short
•  Offset manager: Kafka broker info
OffsetCommitRequest
•  Group Id: String
•  Offset map
   – Key → Topic-partition
   – Value → Partition-data (Offset: Long, Timestamp: Long, Metadata: String)
Note: KAFKA-1634 changes the semantics of timestamp to retention
OffsetCommitResponse
•  Response map
   – Key → Topic-partition
   – Value → Error code
OffsetFetchRequest
•  Group Id: String
•  Partitions: List<Topic-partition>
OffsetFetchResponse
•  Response map
   – Key → Topic-partition
   – Value → Partition-data (Offset: Long, Metadata: String, Error code: Short)
Code samples: http://bit.ly/1LTJBYo
Offset{Commit,Fetch} API
KafkaConsumer<K, V> consumer = new KafkaConsumer<K, V>(properties);
…
TopicPartition partition1 = new TopicPartition("topic1", 0);
TopicPartition partition2 = new TopicPartition("topic1", 1);

consumer.subscribe(partition1, partition2);

// offsets to commit, keyed by topic-partition
Map<TopicPartition, Long> offsets = new LinkedHashMap<TopicPartition, Long>();
offsets.put(partition1, 123L);
offsets.put(partition2, 4320L);
…
// commit offsets
consumer.commit(offsets, CommitType.SYNC);
…
// fetch offsets
long committedOffset = consumer.committed(partition1);
How to read the offsets topic
To read everything, use the console consumer:
./bin/kafka-console-consumer.sh --topic __consumer_offsets \
  --zookeeper localhost:2181 \
  --formatter "kafka.server.OffsetManager\$OffsetsMessageFormatter" \
  --consumer.config config/consumer.properties
(Must set exclude.internal.topics = false in consumer.properties)

To read a single partition, use the simple-consumer-shell:
./bin/kafka-simple-consumer-shell.sh --topic __consumer_offsets \
  --partition 12 --broker-list localhost:9092 \
  --formatter "kafka.server.OffsetManager\$OffsetsMessageFormatter"
Inside the offsets topic
[Group, Topic, Partition]::[Offset, Metadata, Timestamp]
[audit-consumer,PageViewEvent,7]::OffsetAndMetadata[53568,NO_METADATA,1416363620711]
[audit-consumer,service-log-event,5]::OffsetAndMetadata[168012,NO_METADATA,1416363620711]
[audit-consumer,EmailBounceEvent,4]::OffsetAndMetadata[8524676,NO_METADATA,1416363620711]
[audit-consumer,ClickEvent,0]::OffsetAndMetadata[8132292,NO_METADATA,1416363620711]
[audit-consumer,metrics-event,1]::OffsetAndMetadata[1835900,NO_METADATA,1416363620711]
[audit-consumer,CompanyEvent,0]::OffsetAndMetadata[109337,NO_METADATA,1416363620711]
[audit-consumer,test-topic,1]::OffsetAndMetadata[352989,NO_METADATA,1416363620711]
[audit-consumer,meetup-event,2]::OffsetAndMetadata[39961,NO_METADATA,1416363620711]
[audit-consumer,push-topic,6]::OffsetAndMetadata[4210366,NO_METADATA,1416363620711]
How to migrate/roll-back
Migrate from ZooKeeper to Kafka:
•  Config change
– offsets.storage=kafka
– dual.commit.enabled=true
•  Rolling bounce
•  Config change
– dual.commit.enabled=false
•  Rolling bounce
How to migrate/roll-back
Migrate from Kafka to ZooKeeper:
•  Config change
– dual.commit.enabled=true
•  Rolling bounce
•  Config change
– offsets.storage=zookeeper
– dual.commit.enabled=false
•  Rolling bounce
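The two migration directions can be written down as the consumer properties in effect during each phase, with a rolling bounce after each one. A minimal sketch using java.util.Properties; the class and method names are hypothetical, and only the two property keys shown on the slides are set.

```java
import java.util.Properties;

/** Sketch: consumer properties for each phase of an offset-storage migration. */
final class OffsetMigrationConfigSketch {

    /** Builds the two migration-related consumer properties for one phase. */
    static Properties phase(String storage, boolean dualCommit) {
        Properties p = new Properties();
        p.setProperty("offsets.storage", storage);
        p.setProperty("dual.commit.enabled", Boolean.toString(dualCommit));
        return p;
    }

    /** ZooKeeper to Kafka: commit to both stores first, then to Kafka only. */
    static Properties[] zookeeperToKafka() {
        return new Properties[] { phase("kafka", true), phase("kafka", false) };
    }

    /** Kafka to ZooKeeper (roll-back): re-enable dual commit, then switch storage. */
    static Properties[] kafkaToZookeeper() {
        return new Properties[] { phase("kafka", true), phase("zookeeper", false) };
    }

    public static void main(String[] args) {
        int step = 1;
        for (Properties p : zookeeperToKafka()) {
            System.out.println("phase " + step++ + ": " + p + " (then rolling bounce)");
        }
    }
}
```

The dual-commit phase is what makes the switch safe during a rolling bounce: instances on the old and new configuration can both find up-to-date offsets in the store they read from.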
Key metrics to monitor
•  Consumer mbeans
–  Kafka commit rate
–  ZooKeeper commit rate (during migration)
•  Broker mbeans
–  Max-dirty ratio and other log cleaner metrics
–  Offset cache size
–  Group count
–  {ConsumerMetadata, OffsetCommit, OffsetFetch} request metrics
0.8.3
•  Support compression in compacted topics (KAFKA-1734)
•  Change offset commit “timestamp” to mean retention period (KAFKA-1634)
•  Offset client
Monitor it!
Acknowledgments
Kafka team @ LinkedIn
Jay Kreps, Jun Rao, Neha Narkhede @ Confluent
Tejas (2013 intern): http://lnkdin.me/p/tejaspatil1

Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...Suhani Kapoor
 

Recently uploaded (20)

Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
Spark3's new memory model/management
Spark3's new memory model/managementSpark3's new memory model/management
Spark3's new memory model/management
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
 

Consumer offset management in Kafka

  • 1. Consumer offset management in Kafka Joel Koshy Kafka meetup @ LinkedIn March 24, 2015
  • 2. Consumers and offsets [Figure: two partition logs, PageViewEvent-0 and EmailBounceEvent-0, with offsets running from 5 to 16 and from 23 to 34; an offset map for GroupId audit-consumer holds the values 12 and 35 for the two partitions]
  • 3. Store offsets in ZooKeeper [Figure: znode tree. Root znodes: admin, brokers, config, consumers, controller, controller_epoch. Under consumers: audit-consumer, with children ids, owners, offsets. Under offsets: PageViewEvent/0 = 12 and EmailBounceEvent/0 = 35]
  • 4. (Don’t) store offsets in ZooKeeper
      •  Heavy write-load on ZooKeeper
      •  Especially an issue
          – during 0.7 to 0.8 migration
          – and before we switched to SSDs
      •  Non-ideal work-arounds
          – Increase offset-commit intervals
          – Filter commits if offsets have not moved
          – Spread large offset commits over commit interval
  • 5. Offset management (ideals)
      •  Durable
      •  Support high write-load
      •  Consistent reads
      •  Atomic offset commits
      •  Fast commits/fetches
  • 6. Store offsets in a replicated log
      __consumer_offsets: [audit-consumer, PageViewEvent-0] 240; [audit-consumer, EmailBounceEvent-0] 232; (next commit)
      Entry fields: Group, Partition, Offset
  • 7. Store offsets in a replicated log
      __consumer_offsets: [audit-consumer, PageViewEvent-0] 240; [audit-consumer, EmailBounceEvent-0] 232; [audit-consumer, EmailBounceEvent-0] 248
  • 8. Store offsets in a replicated log
      __consumer_offsets: [audit-consumer, PageViewEvent-0] 240; [audit-consumer, EmailBounceEvent-0] 232; [audit-consumer, EmailBounceEvent-0] 248; [audit-consumer, PageViewEvent-0] 323
  • 9. Store offsets in a replicated log
      __consumer_offsets: [audit-consumer, PageViewEvent-0] 240; [audit-consumer, EmailBounceEvent-0] 232; [audit-consumer, EmailBounceEvent-0] 248; [audit-consumer, PageViewEvent-0] 323; [mirrormaker, ClickEvent-0] 54543
  • 10. Store offsets in a replicated, partitioned log
      __consumer_offsets, partition 3: [audit-consumer, PageViewEvent-0] 240; [audit-consumer, EmailBounceEvent-0] 232; [audit-consumer, EmailBounceEvent-0] 248; [audit-consumer, PageViewEvent-0] 323
      __consumer_offsets, partition 8: [mirrormaker, ClickEvent-0] 54543; [mirrormaker, ClickEvent-1] 54444; [mirrormaker, ClickEvent-1] 54674
      Partition → abs(GroupId.hashCode()) % NumPartitions
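
The partitioning rule on slide 10 can be sketched in a few lines of Java. This is a minimal illustration, not Kafka's own code: the class name `OffsetsPartitioner`, the partition count of 50, and the guard for `Integer.MIN_VALUE` (whose absolute value overflows in Java) are assumptions of this sketch.

```java
// Sketch only: maps a consumer group id to a partition of the offsets topic,
// following the rule abs(GroupId.hashCode()) % NumPartitions from the slide.
final class OffsetsPartitioner {
    private OffsetsPartitioner() {}

    static int partitionFor(String groupId, int numPartitions) {
        int hash = groupId.hashCode();
        // Math.abs(Integer.MIN_VALUE) is still negative, so this sketch maps
        // that single value to 0 (an assumption, not taken from the slide).
        int nonNegative = (hash == Integer.MIN_VALUE) ? 0 : Math.abs(hash);
        return nonNegative % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 50; // hypothetical partition count for illustration
        for (String group : new String[] {"audit-consumer", "mirrormaker"}) {
            System.out.println(group + " -> partition " + partitionFor(group, numPartitions));
        }
    }
}
```

Because the partition depends only on the group id, every offset committed by one group lands in the same partition of the offsets topic.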
  • 11. Store offsets in a replicated, partitioned log
      __consumer_offsets, partition 3: [audit-consumer, PageViewEvent-0] 240; [audit-consumer, EmailBounceEvent-0] 232; [audit-consumer, EmailBounceEvent-0] 248; [audit-consumer, PageViewEvent-0] 323
      __consumer_offsets, partition 8: [mirrormaker, ClickEvent-0] 54543; [mirrormaker, ClickEvent-1] 54444; [mirrormaker, ClickEvent-1] 54674
      Offset commits append to the offsets topic partition
      Offset fetches read from the offsets topic partition
  • 12. Store offsets in a replicated, partitioned log
      __consumer_offsets, partition 3: [audit-consumer, PageViewEvent-0] 240; [audit-consumer, EmailBounceEvent-0] 232; [audit-consumer, EmailBounceEvent-0] 248; [audit-consumer, PageViewEvent-0] 323
      __consumer_offsets, partition 8: [mirrormaker, ClickEvent-0] 54543; [mirrormaker, ClickEvent-1] 54444; [mirrormaker, ClickEvent-1] 54674
      Offsets cache: [audit-consumer, PageViewEvent-0] 323; [audit-consumer, EmailBounceEvent-0] 248; [mirrormaker, ClickEvent-0] 54543; [mirrormaker, ClickEvent-1] 54674
      Offset commits append to the offsets topic partition + update the cache
      Offset fetches read from the cache (instead of the offsets topic partition)
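
The commit and fetch paths on slide 12 can be modeled with an append-only list and a map. This is a minimal in-memory sketch under stated assumptions: the class `OffsetsPartitionModel`, its method names, and the string key format are hypothetical, and replication is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: one offsets topic partition plus the cache that fronts it.
final class OffsetsPartitionModel {
    // Append-only list standing in for the offsets topic partition.
    private final List<String[]> log = new ArrayList<>();
    // Latest committed offset per "group/topic-partition" key.
    private final Map<String, Long> cache = new HashMap<>();

    // A commit appends to the log and then updates the cache.
    void commit(String group, String topicPartition, long offset) {
        String key = group + "/" + topicPartition;
        log.add(new String[] {key, Long.toString(offset)});
        cache.put(key, offset);
    }

    // A fetch is served from the cache only; returns null if nothing was committed.
    Long fetch(String group, String topicPartition) {
        return cache.get(group + "/" + topicPartition);
    }

    int logSize() {
        return log.size();
    }

    public static void main(String[] args) {
        OffsetsPartitionModel partition = new OffsetsPartitionModel();
        partition.commit("audit-consumer", "PageViewEvent-0", 240L);
        partition.commit("audit-consumer", "PageViewEvent-0", 323L);
        System.out.println(partition.fetch("audit-consumer", "PageViewEvent-0")); // prints 323
        System.out.println(partition.logSize()); // prints 2
    }
}
```

The log keeps both commits while the cache keeps only the latest one, which is why older log entries eventually need to be garbage-collected.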
  • 13. Store offsets in a replicated, partitioned log
      __consumer_offsets, partition 3: [audit-consumer, PageViewEvent-0] 240; [audit-consumer, EmailBounceEvent-0] 232; [audit-consumer, EmailBounceEvent-0] 248; [audit-consumer, PageViewEvent-0] 323
      __consumer_offsets, partition 8: [mirrormaker, ClickEvent-0] 54543; [mirrormaker, ClickEvent-1] 54444; [mirrormaker, ClickEvent-1] 54674
      Offsets cache: [audit-consumer, PageViewEvent-0] 323; [audit-consumer, EmailBounceEvent-0] 248; [mirrormaker, ClickEvent-0] 54543; [mirrormaker, ClickEvent-1] 54674
      Offset commits append to the offsets topic partition + update the cache
      Offset fetches read from the cache (instead of the offsets topic partition)
      How do we GC older offset entries?
  • 14. log.cleanup.policy = compact
      Before compaction (offset: key, value): 0: K1, V1; 1: K2, V2; 2: K1, V3; 3: K1, V4; 4: K3, V5; 5: K2, V6; 6: K4, V7; 7: K5, V8; 8: K5, V9; 9: K2, V10; 10: K6, V11
      After compaction (offset: key, value): 3: K1, V4; 4: K3, V5; 6: K4, V7; 8: K5, V9; 10: K6, V11
      Tombstone (offset: key, value): 11: K2, Ø
  • 15. Store offsets in a replicated, partitioned, compacted log
      Key → [Group, Topic, Partition]; Value → Offset
      Before compaction: [audit-consumer, PageViewEvent-0] 126312342; [audit-consumer, EmailBounceEvent-0] 59843; [audit-consumer, PageViewEvent-0] 126319628; [audit-consumer, EmailBounceEvent-0] 86243; [audit-consumer, PageViewEvent-0] 126398102
      After compaction: [audit-consumer, EmailBounceEvent-0] 86243; [audit-consumer, PageViewEvent-0] 126398102
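
The effect of compaction shown on slides 14 and 15 can be sketched as "keep only the last record for each key". The Java below is a simplified model, not the broker's log cleaner: the `CompactionSketch` and `Record` names are hypothetical, the whole log is compacted in one pass, and the tombstone record itself is kept in the output (a real cleaner works segment by segment and handles tombstones separately).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: compaction as "retain the latest record per key".
final class CompactionSketch {
    static final class Record {
        final long offset;
        final String key;
        final String value; // null marks a tombstone (delete marker)

        Record(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }
    }

    static List<Record> compact(List<Record> log) {
        // First pass: remember the offset of the last record seen for each key.
        Map<String, Long> lastOffsetForKey = new HashMap<>();
        for (Record r : log) {
            lastOffsetForKey.put(r.key, r.offset);
        }
        // Second pass: keep a record only if it is the last one for its key.
        List<Record> compacted = new ArrayList<>();
        for (Record r : log) {
            if (lastOffsetForKey.get(r.key) == r.offset) {
                compacted.add(r);
            }
        }
        return compacted;
    }

    // Rebuilds the example from slide 14: keys at offsets 0..10 plus a tombstone for K2.
    static List<Record> slideExample() {
        String[] keys = {"K1", "K2", "K1", "K1", "K3", "K2", "K4", "K5", "K5", "K2", "K6"};
        List<Record> log = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            log.add(new Record(i, keys[i], "V" + (i + 1)));
        }
        log.add(new Record(11, "K2", null));
        return log;
    }

    public static void main(String[] args) {
        for (Record r : compact(slideExample())) {
            System.out.println(r.offset + " " + r.key + " " + r.value);
        }
    }
}
```

Original offsets are preserved, so the compacted log has gaps: only offsets 3, 4, 6, 8, 10 and 11 remain for the slide's example.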
  • 16. Dealing with dead consumers console-consumer-38587, console-consumer-94777, console-consumer-94774, console-consumer-31199, console-consumer-51555, console-consumer-43182, mobileServiceConsumerDwwewewA13dafddesfasdfdee33, console-consumer-57784, python-kafka-consumer-0959a04da7c241448beb0813f002e34b, console-consumer-70750, console-consumer-94809, console-consumer-87470, touch-me-not, console-consumer-43246, console-consumer-69811, python-kafka-consumer-82c2d653128840d5b6bcbfc5ac7f3abc, console-consumer-33847, console-consumer-18217, console-consumer-87493, console-consumer-26414, console-consumer-67299, voldemort-reader-jjkoshy, console-consumer-80245, kafka_listener_for_comments, test-flow-staging, console-consumer-8441, console-consumer-67258, data-processor-2, console-consumer-94869, console-consumer-55242, pinot-beta-hackday_1_2, console-consumer-6601, cloud-host1, system-metrics-monitor-01, console-consumer-70859, console-consumer-26477, page-view-test-flow-2, page-view-test-flow-1, python-kafka-consumer-bf33d075b22d4ddfb82d4a055303e909, console-consumer-99768, console-consumer-45509, console-consumer-21504, points-test_devel_l1_1686489164, console-consumer-14841, console-consumer-4098, console-consumer-14746, console-consumer-94575, cloud-dcb-host147.company.com, teacup_reporting_alex, console-consumer-4132, console-consumer-48171, ropod-dcb-host794.company.com, console-consumer-63743, console-consumer-36147, console-consumer-48138, console-consumer-33595, console-consumer-6808, console-consumer-31000, console-consumer-73064, console-consumer-18050, console-consumer-21683, share-message, ropod-dcb-host959.company.com, ropod-dcb-host949.company.com, sensei-test_dcb_host138.company.com_1924844804, console-consumer-38654, console-consumer-92040, console-consumer-67052, console-consumer-82690, console-consumer-92002, console-consumer-69687, console-consumer-31077, console-consumer-94657, console-consumer-36064, console-consumer-45675, console-consumer-45671, console-consumer-70625, MemberSettings-dcx, console-consumer-55513, member-links-dcx, console-consumer-85367, opportunist-company, forum-queue, console-consumer-87912, console-consumer-75909, console-consumer-12320, sensei-test_user2_808173709, ropod-dcb-host937.company.com, console-consumer-8710, console-consumer-48390, python-kafka-consumer-816cebafabb34dd5be6bfce59cbee411, console-consumer-8701, console-consumer-6122, console-consumer-6142, metrics-dcb-monitor19, console-consumer-73329, console-consumer-87942, console-consumer-80552, console-consumer-48368, autometrics-dcb-host13, …
  • 17. Dealing with dead consumers
      •  For offsets older than offset retention period:
          – Append tombstone
          – Remove offset entry from cache
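
The expiry step on slide 17 can be sketched as a periodic sweep over the cache. This is an illustrative model only: the class `OffsetExpirySketch`, the `expire` method, and the use of the commit time as the age of an entry are assumptions of this sketch, not the broker's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Sketch only: expire offsets older than the retention period.
final class OffsetExpirySketch {
    static final class Committed {
        final long offset;
        final long commitTimeMs;

        Committed(long offset, long commitTimeMs) {
            this.offset = offset;
            this.commitTimeMs = commitTimeMs;
        }
    }

    private final Map<String, Committed> cache = new HashMap<>();
    // Each log entry is {key, value}; a null value stands for a tombstone.
    private final List<String[]> log = new ArrayList<>();

    void commit(String key, long offset, long nowMs) {
        log.add(new String[] {key, Long.toString(offset)});
        cache.put(key, new Committed(offset, nowMs));
    }

    // For every entry older than the retention period: append a tombstone
    // to the log and remove the entry from the cache. Returns the count expired.
    int expire(long nowMs, long retentionMs) {
        int expired = 0;
        Iterator<Map.Entry<String, Committed>> it = cache.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Committed> entry = it.next();
            if (nowMs - entry.getValue().commitTimeMs > retentionMs) {
                log.add(new String[] {entry.getKey(), null});
                it.remove();
                expired++;
            }
        }
        return expired;
    }

    boolean isCached(String key) {
        return cache.containsKey(key);
    }

    int logSize() {
        return log.size();
    }

    public static void main(String[] args) {
        OffsetExpirySketch manager = new OffsetExpirySketch();
        manager.commit("console-consumer-38587/PageViewEvent-0", 10L, 0L);
        manager.commit("audit-consumer/PageViewEvent-0", 323L, 900L);
        System.out.println(manager.expire(1_000L, 500L)); // prints 1
    }
}
```

Once the tombstone is in the log, compaction can drop the dead group's earlier commits for that key.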
  • 18. Recommended settings for offsets topic
      Replication factor: >= 3
      min.insync.replicas: >= 2
      unclean.leader.election.enable: false
      offsets.commit.required.acks: -1 (all)
  • 19. How to commit/fetch offsets [Diagram: a consumer instance in group audit-consumer; Broker 0, Broker 1, Broker 2, Broker 3 (controller); __consumer_offsets-34: Leader: 2, ISR: 0, 1, 2. Labels: VIP; Consumer metadata request; Response (manager=2)]
  • 20. How to commit/fetch offsets [Diagram: same consumer and brokers; __consumer_offsets-34: Leader: 2, ISR: 0, 1, 2. Labels: Offset fetches; Offset commits; cache; replication]
  • 21. When the offset manager moves [Diagram: same consumer and brokers; __consumer_offsets-34: Leader: 2, ISR: 0, 1, 2. Labels: cache; Become leader, load cache]
  • 22. When the offset manager moves [Diagram: same consumer and brokers; __consumer_offsets-34: Leader: 2, ISR: 0, 1, 2. Labels: cache; Become leader, load cache; Become follower (crossed out)]
  • 23. When the offset manager moves [Diagram: same consumer and brokers; __consumer_offsets-34: Leader: 0, ISR: 0, 1, 2. Labels: Offset fetches; Offset commits; cache (two); two X marks]
  • 24. When the offset manager moves [Diagram: same consumer and brokers; __consumer_offsets-34: Leader: 0, ISR: 0, 1, 2. Labels: VIP; Consumer metadata request; Response (manager=0); cache (two)]
  • 25. When the offset manager moves [Diagram: same consumer and brokers; __consumer_offsets-34: Leader: 0, ISR: 0, 1, 2. Labels: Offset commits; Offset fetches; replication; cache (two)]
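
From the consumer's side, slides 19 to 25 amount to: discover the offset manager, commit to it, and rediscover it when it moves. The Java below is a toy simulation of that loop under stated assumptions; `FakeCluster`, `Client`, the boolean "not the manager" result, and the retry limit of 3 are all hypothetical stand-ins, not Kafka classes or error codes.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a client that rediscovers the offset manager after it moves.
final class OffsetManagerFailoverSketch {
    // Hypothetical stand-in for the brokers; tracks which broker manages the group.
    static final class FakeCluster {
        int managerBroker = 2;
        int metadataRequests = 0;
        final Map<String, Long> offsets = new HashMap<>();

        // Models the consumer metadata request: returns the current manager.
        int consumerMetadata(String group) {
            metadataRequests++;
            return managerBroker;
        }

        // Returns false when the contacted broker is not the offset manager.
        boolean commit(int broker, String key, long offset) {
            if (broker != managerBroker) {
                return false;
            }
            offsets.put(key, offset);
            return true;
        }
    }

    static final class Client {
        private final FakeCluster cluster;
        private final String group;
        private int manager = -1; // -1 means "unknown, ask the cluster"

        Client(FakeCluster cluster, String group) {
            this.cluster = cluster;
            this.group = group;
        }

        void commit(String topicPartition, long offset) {
            for (int attempt = 0; attempt < 3; attempt++) {
                if (manager < 0) {
                    manager = cluster.consumerMetadata(group);
                }
                if (cluster.commit(manager, group + "/" + topicPartition, offset)) {
                    return;
                }
                manager = -1; // stale manager: forget it and rediscover
            }
            throw new IllegalStateException("commit failed after retries");
        }
    }

    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster();
        Client client = new Client(cluster, "audit-consumer");
        client.commit("PageViewEvent-0", 240L);
        cluster.managerBroker = 0; // leadership of the offsets partition moves
        client.commit("PageViewEvent-0", 323L);
        System.out.println(cluster.metadataRequests); // prints 2
    }
}
```

The second metadata request happens only after a commit to the old manager is rejected, so the steady-state path costs no extra round trip.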
  • 26. Offset{Commit,Fetch} API
      ConsumerMetadataRequest
          •  Group Id: String
      ConsumerMetadataResponse
          •  Error code: Short
          •  Offset manager: Kafka broker info
  • 27. Offset{Commit,Fetch} API
      OffsetCommitRequest
          •  groupId: String
          •  Offset map
              – Key → Topic-partition
              – Value → Partition-data (Offset: Long; Timestamp: Long; Metadata: String)
      KAFKA-1634: changes semantics of timestamp to retention
  • 28. Offset{Commit,Fetch} API
      OffsetCommitResponse
          •  Response map
              – Key → Topic-partition
              – Value → Error code
  • 29. Offset{Commit,Fetch} API
      OffsetFetchRequest
          •  Group Id: String
          •  Partitions: List<Topic-partition>
      OffsetFetchResponse
          •  Response map
              – Key → Topic-partition
              – Value → Partition-data (Offset: Long; Metadata: String; Error code: Short)
  • 30. Offset{Commit,Fetch} API Code samples: http://bit.ly/1LTJBYo
  • 31. Offset{Commit,Fetch} API
        KafkaConsumer<K, V> consumer = new KafkaConsumer<K, V>(properties);
        …
        TopicPartition partition1 = new TopicPartition("topic1", 0);
        TopicPartition partition2 = new TopicPartition("topic1", 1);

        consumer.subscribe(partition1, partition2);

        Map<TopicPartition, Long> offsets = new LinkedHashMap<TopicPartition, Long>();
        offsets.put(partition1, 123L);
        offsets.put(partition2, 4320L);
        …
        // commit offsets
        consumer.commit(offsets, CommitType.SYNC);
        …
        // fetch offsets
        long committedOffset = consumer.committed(partition1);
  • 32. How to read the offsets topic
      To read everything, use the console consumer:
        ./bin/kafka-console-consumer.sh --topic __consumer_offsets --zookeeper localhost:2181 --formatter "kafka.server.OffsetManager\$OffsetsMessageFormatter" --consumer.config config/consumer.properties
      (Must set exclude.internal.topics = false in consumer.properties)
      To read a single partition, use the simple-consumer-shell:
        ./bin/kafka-simple-consumer-shell.sh --topic __consumer_offsets --partition 12 --broker-list localhost:9092 --formatter "kafka.server.OffsetManager\$OffsetsMessageFormatter"
  • 33. Inside the offsets topic
      [Group, Topic, Partition]::[Offset, Metadata, Timestamp]
        [audit-consumer,PageViewEvent,7]::OffsetAndMetadata[53568,NO_METADATA,1416363620711]
        [audit-consumer,service-log-event,5]::OffsetAndMetadata[168012,NO_METADATA,1416363620711]
        [audit-consumer,EmailBounceEvent,4]::OffsetAndMetadata[8524676,NO_METADATA,1416363620711]
        [audit-consumer,ClickEvent,0]::OffsetAndMetadata[8132292,NO_METADATA,1416363620711]
        [audit-consumer,metrics-event,1]::OffsetAndMetadata[1835900,NO_METADATA,1416363620711]
        [audit-consumer,CompanyEvent,0]::OffsetAndMetadata[109337,NO_METADATA,1416363620711]
        [audit-consumer,test-topic,1]::OffsetAndMetadata[352989,NO_METADATA,1416363620711]
        [audit-consumer,meetup-event,2]::OffsetAndMetadata[39961,NO_METADATA,1416363620711]
        [audit-consumer,push-topic,6]::OffsetAndMetadata[4210366,NO_METADATA,1416363620711]
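
Lines in the formatted output of slide 33 are regular enough to parse with a single regular expression. The Java below is a rough sketch that assumes the exact text layout shown on the slide; the class `OffsetsEntryParser` and its `Entry` type are hypothetical helpers, and the pattern would need adjusting if group or topic names contained commas or brackets.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: parses one line of the form
// [Group,Topic,Partition]::OffsetAndMetadata[Offset,Metadata,Timestamp]
final class OffsetsEntryParser {
    private static final Pattern ENTRY = Pattern.compile(
        "\\[([^,\\]]+),([^,\\]]+),(\\d+)\\]::OffsetAndMetadata"
            + "\\[(\\d+),\\s*([^,\\]]*),\\s*(\\d+)\\]");

    static final class Entry {
        final String group;
        final String topic;
        final int partition;
        final long offset;
        final String metadata;
        final long timestamp;

        Entry(String group, String topic, int partition,
              long offset, String metadata, long timestamp) {
            this.group = group;
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
            this.metadata = metadata;
            this.timestamp = timestamp;
        }
    }

    static Entry parse(String line) {
        Matcher m = ENTRY.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized entry: " + line);
        }
        return new Entry(m.group(1), m.group(2), Integer.parseInt(m.group(3)),
            Long.parseLong(m.group(4)), m.group(5), Long.parseLong(m.group(6)));
    }

    public static void main(String[] args) {
        Entry e = parse("[audit-consumer,PageViewEvent,7]::"
            + "OffsetAndMetadata[53568,NO_METADATA,1416363620711]");
        System.out.println(e.group + " " + e.topic + "-" + e.partition + " @ " + e.offset);
    }
}
```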
  • 34. How to migrate/roll-back
      Migrate from ZooKeeper to Kafka:
      •  Config change
          – offsets.storage=kafka
          – dual.commit.enabled=true
      •  Rolling bounce
      •  Config change
          – dual.commit.enabled=false
      •  Rolling bounce
  • 35. How to migrate/roll-back
      Migrate from Kafka to ZooKeeper:
      •  Config change
          – dual.commit.enabled=true
      •  Rolling bounce
      •  Config change
          – offsets.storage=zookeeper
          – dual.commit.enabled=false
      •  Rolling bounce
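
The two migration sequences on slides 34 and 35 can be written down as ordered lists of consumer configs, with a rolling bounce after each step. The sketch below only builds `java.util.Properties` objects for the two settings named on the slides; the class `OffsetMigrationPlan` and its method names are hypothetical, and it assumes `offsets.storage` stays at `kafka` during the first roll-back step, since that slide changes only `dual.commit.enabled` there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Sketch only: the config applied at each step, in order.
// A rolling bounce of the consumers follows every step.
final class OffsetMigrationPlan {
    private OffsetMigrationPlan() {}

    static Properties consumerConfig(String offsetsStorage, boolean dualCommit) {
        Properties props = new Properties();
        props.setProperty("offsets.storage", offsetsStorage);
        props.setProperty("dual.commit.enabled", Boolean.toString(dualCommit));
        return props;
    }

    // ZooKeeper -> Kafka: commit to both stores first, then Kafka only.
    static List<Properties> zooKeeperToKafka() {
        List<Properties> steps = new ArrayList<>();
        steps.add(consumerConfig("kafka", true));
        steps.add(consumerConfig("kafka", false));
        return steps;
    }

    // Kafka -> ZooKeeper (roll-back): re-enable dual commit, then ZooKeeper only.
    static List<Properties> kafkaToZooKeeper() {
        List<Properties> steps = new ArrayList<>();
        steps.add(consumerConfig("kafka", true));
        steps.add(consumerConfig("zookeeper", false));
        return steps;
    }

    public static void main(String[] args) {
        for (Properties step : zooKeeperToKafka()) {
            System.out.println(step.getProperty("offsets.storage") + ", dual commit: "
                + step.getProperty("dual.commit.enabled"));
        }
    }
}
```

In both directions the dual-commit step comes first, so offsets exist in both stores before either one is switched off.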
  • 36. Key metrics to monitor
      •  Consumer mbeans
          – Kafka commit rate
          – ZooKeeper commit rate (during migration)
      •  Broker mbeans
          – Max-dirty ratio and other log cleaner metrics
          – Offset cache size
          – Group count
          – {ConsumerMetadata, OffsetCommit, OffsetFetch} request metrics
  • 37. 0.8.3
      •  Support compression in compacted topics (KAFKA-1734)
      •  Change offset commit “timestamp” to mean retention period (KAFKA-1634)
      •  Offset client
  • 39. Acknowledgments Kafka team @ LinkedIn Jay Kreps, Jun Rao, Neha Narkhede @ Confluent Tejas (2013 intern): http://lnkdin.me/p/tejaspatil1