Kafka at Peak Performance

SITE RELIABILITY ENGINEERING ©2016 LinkedIn Corporation. All Rights Reserved.
Todd Palino

Who Am I?

Kafka At LinkedIn
 1100+ Kafka brokers
 Over 32,000 topics
 350,000+ Partitions
 875 Billion messages per day
 185 Terabytes In
 675 Terabytes Out
 Peak Load (whole site)
– 10.5 Million messages/sec
– 18.5 Gigabits/sec Inbound
– 70.5 Gigabits/sec Outbound
 1800+ Kafka brokers
 Over 79,000 topics
 1,130,000+ Partitions
 1.3 Trillion messages per day
 330 Terabytes In
 1.2 Petabytes Out
 Peak Load (single cluster)
– 2 Million messages/sec
– 4.7 Gigabits/sec Inbound
– 15 Gigabits/sec Outbound

What Will We Talk About?
 Picking Your Hardware
 Monitoring the Cluster
 Triaging Broker Performance Problems
 Conclusion

Hardware Selection

What’s Important To You?
 Message Retention - Disk size
 Message Throughput - Network capacity
 Producer Performance - Disk I/O
 Consumer Performance - Memory

Go Wide
 Kafka is well-suited to horizontal scaling
 RAIS - Redundant Array of Inexpensive Servers
 Also helps with CPU utilization
– Kafka needs to decompress and recompress every message batch
– KIP-31 will help with this by eliminating recompression
 Don’t co-locate Kafka

Disk Layout
 RAID
– Can survive a single disk failure (not RAID 0)
– Provides the broker with a single log directory
– Eats up disk I/O
 JBOD
– Gives Kafka all the disk I/O available
– Broker is not smart about balancing partitions
– If one disk fails, the entire broker stops
 Amazon EBS performance works!

Operating System Tuning
 Filesystem Options
– EXT or XFS
– Using unsafe mount options
 Virtual Memory
– Swappiness
– Dirty Pages
 Networking

Java
 Only use JDK 8 now
 Keep heap size small
– Even our largest brokers use a 6 GB heap
– Save the rest for page cache
 Garbage Collection - G1 all the way
– Basic tuning only
– Watch for humongous allocations
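The humongous-allocation warning comes from how G1 divides the heap into fixed-size regions: an allocation larger than half a region is treated as humongous and handled outside the normal young-generation path. A minimal sketch of that threshold, assuming the 16 MB region size listed in the JDK Options appendix; the function names are hypothetical.

```python
# Sketch only: models the G1 rule that an object larger than half a
# region is "humongous". Function names are illustrative, not a JVM API.

REGION_16M = 16 * 1024 * 1024  # corresponds to -XX:G1HeapRegionSize=16M


def humongous_threshold_bytes(region_size_bytes):
    """Largest allocation size that is still NOT humongous."""
    return region_size_bytes // 2


def is_humongous(allocation_bytes, region_size_bytes):
    """True when an allocation of this size would be humongous in G1."""
    return allocation_bytes > humongous_threshold_bytes(region_size_bytes)


if __name__ == "__main__":
    for size_mb in (1, 8, 10):
        size = size_mb * 1024 * 1024
        print(size_mb, "MB humongous:", is_humongous(size, REGION_16M))
```

With 16 MB regions, buffers above 8 MB fall into this category, which is one reason very large message batches are worth watching in the GC logs.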

How Much Do You Need?

Buy The Book!
Early Access available now.
Covers all aspects of Kafka, from setup to client development to ongoing administration and troubleshooting.
Also discusses stream processing and other use cases.

Kafka Cluster Sizing
 How big for your local cluster?
– How much disk space do you have?
– How much network bandwidth do you have?
– CPU, memory, disk I/O
 How big for your aggregate cluster?
– In general, multiply the number of brokers by the number of local clusters
– May have additional concerns with lots of consumers
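The sizing questions above can be turned into rough arithmetic. This is a sketch under stated assumptions, not a sizing formula from the talk: the `headroom` fraction, the parameter names, and the idea of taking the tighter of the disk and network limits are all illustrative choices. CPU, memory, and disk I/O are not modeled.

```python
import math


def brokers_needed(retained_bytes, replication_factor, disk_bytes_per_broker,
                   peak_bytes_per_sec, net_bytes_per_sec_per_broker,
                   headroom=0.7):
    """Broker count implied by the tighter of the disk and network limits.

    headroom is the (assumed) fraction of each resource you are willing
    to use before adding hardware.
    """
    # Disk: every retained byte is stored once per replica.
    by_disk = (retained_bytes * replication_factor
               / (disk_bytes_per_broker * headroom))
    # Network: size against peak traffic rather than the daily average.
    by_net = peak_bytes_per_sec / (net_bytes_per_sec_per_broker * headroom)
    return max(1, math.ceil(max(by_disk, by_net)))


def aggregate_brokers(local_brokers, local_clusters):
    """Rule of thumb from the slide: local broker count times cluster count."""
    return local_brokers * local_clusters


if __name__ == "__main__":
    local = brokers_needed(100 * 10**12, 2, 10 * 10**12,
                           10**9, 1_250_000_000, headroom=0.5)
    print("local:", local, "aggregate:", aggregate_brokers(local, 3))
```

The aggregate figure is only a starting point; as the slide notes, a large consumer population can push it higher.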

Topic Configuration
 Partition Counts for Local
– Many theories on how to do this correctly, but the answer is “it depends”
– How many consumers do you have?
– Do you have specific partition requirements?
– Keeping partition sizes manageable
 Partition Counts for Aggregate
– Multiply the number of partitions in a local cluster by the number of local clusters
– Periodically review partition counts in all clusters
 Message Retention
– If aggregate is where you really need the messages, retain them in local only long enough to cover mirror maker problems
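Since the answer for local partition counts is "it depends", the following is only one possible heuristic, not a method from the talk. It assumes you know a target throughput and a measured per-partition throughput, and it takes the largest of three lower bounds: throughput, consumer instances in the biggest group (a partition is read by at most one member of a group), and broker count. All names are hypothetical.

```python
import math


def local_partition_count(target_bytes_per_sec, per_partition_bytes_per_sec,
                          consumer_instances, broker_count):
    """Heuristic lower bound on partitions for a topic in a local cluster."""
    by_throughput = math.ceil(target_bytes_per_sec / per_partition_bytes_per_sec)
    # Fewer partitions than consumers leaves some consumers idle;
    # fewer partitions than brokers leaves some brokers without load.
    return max(by_throughput, consumer_instances, broker_count)


def aggregate_partition_count(local_partitions, local_clusters):
    """Rule from the slide: local partitions times number of local clusters."""
    return local_partitions * local_clusters


if __name__ == "__main__":
    local = local_partition_count(50_000_000, 10_000_000, 8, 6)
    print("local:", local, "aggregate:", aggregate_partition_count(local, 4))
```

Because the aggregate count is derived from the local one, any change in a local cluster is a reason to review the aggregate cluster as well.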

Possible Broker Improvements
 Namespaces
– Namespace topics by datacenter
– Eliminate local clusters and just have aggregate
– Significant hardware savings
 JBOD Fixes
– Intelligent partition assignment
– Admin tools to move partitions between mount points
– Broker should not fail completely with a single disk failure

Administrative Improvements
 Multiple cluster management
– Topic management across clusters
– Visualization of mirror maker paths
 Better client monitoring
– Burrow for consumer monitoring
– No open source solution for producer monitoring (audit)
 End-to-end availability monitoring

Keeping An Eye On Things

Monitoring The Foundation
 CPU Load
 Network inbound and outbound
 Filehandle usage for Kafka
 Disk
– Free space - where you write logs, and where Kafka stores messages
– Free inodes
– I/O performance - at least average wait and percent utilization
 Garbage Collection
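Free space and free inodes can both be read from a single `statvfs` call. A minimal sketch using Python's standard `os.statvfs` (Unix only); the function name and the returned keys are illustrative, and a real check would run against each Kafka log directory and the application log volume.

```python
import os


def disk_headroom(path):
    """Return percent free space and percent free inodes for a mount."""
    st = os.statvfs(path)
    # f_bavail / f_favail count what an unprivileged process can still use.
    free_space_pct = 100.0 * st.f_bavail / st.f_blocks if st.f_blocks else 0.0
    # Some filesystems report zero total inodes; treat that as "not limited".
    free_inode_pct = 100.0 * st.f_favail / st.f_files if st.f_files else 100.0
    return {"free_space_pct": free_space_pct, "free_inode_pct": free_inode_pct}


if __name__ == "__main__":
    print(disk_headroom("/"))
```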

Broker Ground Rules
 Tuning
– Stick (mostly) with the defaults
– Set default cluster retention as appropriate
– Default partition count should be at least the number of brokers
 Monitoring
– Watch the right things
– Don’t try to alert on everything
 Triage and Resolution
– Solve problems, don’t mask them

Too Much Information!
 Monitoring teams hate Kafka
– Per-Topic metrics
– Per-Partition metrics
– Per-Client metrics
 Capture as much as you can
– Many metrics are useful while triaging an issue
 Clients want metrics on their own topics
 Only alert on what is needed to signal a problem

Broker Monitoring
 Bytes In and Out, Messages In
– Why not messages out?
 Partitions
– Count and Leader Count
– Under Replicated and Offline
 Threads
– Network pool, Request pool
– Max Dirty Percent
 Requests
– Rates and times - total, queue, local, and send

Topic Monitoring
 Bytes In, Bytes Out
 Messages In, Produce Rate, Produce Failure Rate
 Fetch Rate, Fetch Failure Rate
 Partition Bytes
 Log End Offset
– Why bother?
– KIP-32 will make this unnecessary
 Quota Throttling
 Provide this to your customers for them to alert on

Client Monitoring
 For consumers, use Burrow
– Monitor all partitions for all consumers
– Provides an easy to digest “good, warning, bad” state, with detail available
– Fast and free
 Producers are a little harder
– Several internal implementations of message auditing
– The community needs a good open source standard
 Cluster availability monitoring
– kafka-monitoring is coming soon from LinkedIn!

It’s Broken! Now What?

All The Best Ops People…
 Know more of what is happening than their customers
 Are proactive
 Fix bugs rather than work around them
 This applies to our developers too!

Anticipating Trouble
 Trend cluster utilization and growth over time
 Use default configurations for quotas and retention to require customers to talk to you
 Monitor request times
– If you are able to develop a consistent baseline, this is early warning
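One way to use a request-time baseline as early warning is to flag samples that sit well above the historical mean. The sketch below assumes a simple mean-plus-k-standard-deviations rule; the rule, the `sigmas` default, and the function name are illustrative choices, not something the talk prescribes.

```python
import statistics


def exceeds_baseline(history_ms, current_ms, sigmas=3.0):
    """True when current_ms is more than `sigmas` deviations above the mean.

    history_ms is a non-empty sequence of past request times (for example
    TotalTimeMs samples for one request type) collected during normal load.
    """
    mean = statistics.mean(history_ms)
    stdev = statistics.pstdev(history_ms)
    return current_ms > mean + sigmas * stdev


if __name__ == "__main__":
    history = [10, 12, 11, 9, 10, 11, 10, 12]
    print(exceeds_baseline(history, 11))  # within the usual range
    print(exceeds_baseline(history, 40))  # far above the baseline
```

The check is only as good as the baseline: if request times vary widely under normal load, the threshold becomes too loose to be useful.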

Under Replicated Partitions
 Count of partitions that are not fully replicated within the cluster
 Also referred to as “replica lag”
 Primary indicator of problems within the cluster
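A partition counts as under replicated when its in-sync replica set is smaller than its assigned replica set. The sketch below applies that rule to partition metadata held in plain dictionaries; the dictionary layout and function name are assumptions for illustration, and the broker reports this count itself through the UnderReplicatedPartitions sensor.

```python
def under_replicated(partitions):
    """Return (topic, partition) pairs whose ISR is smaller than the replica set.

    partitions: iterable of dicts with keys 'topic', 'partition',
    'replicas' (assigned broker ids) and 'isr' (in-sync broker ids).
    """
    return [(p["topic"], p["partition"])
            for p in partitions
            if len(p["isr"]) < len(p["replicas"])]


if __name__ == "__main__":
    sample = [
        {"topic": "events", "partition": 0, "replicas": [1, 2, 3], "isr": [1, 2, 3]},
        {"topic": "events", "partition": 1, "replicas": [2, 3, 1], "isr": [2]},
    ]
    print(under_replicated(sample))
```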

Broker Performance Checks
 Are you still running 0.8?
 Are all the brokers in the cluster working?
 Are the network interfaces saturated?
– Reelect partition leaders
– Rebalance partitions in the cluster
– Spread out traffic more (increase partitions or brokers)
 Is the CPU utilization high? (especially iowait)
– Is another process competing for resources?
– Look for a bad disk
 Do you have really big messages?
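For the network saturation check, it helps to express traffic as a fraction of link capacity in each direction. A minimal sketch, assuming interface-level byte rates as input and a full-duplex link; the function name and the 10 Gb/s example are illustrative.

```python
def link_utilization(bytes_in_per_sec, bytes_out_per_sec, link_gbps):
    """Return (inbound, outbound) utilization as fractions of link capacity.

    Inbound and outbound are reported separately because a full-duplex
    link can saturate in one direction while the other is nearly idle.
    """
    capacity_bytes_per_sec = link_gbps * 1_000_000_000 / 8
    return (bytes_in_per_sec / capacity_bytes_per_sec,
            bytes_out_per_sec / capacity_bytes_per_sec)


if __name__ == "__main__":
    inbound, outbound = link_utilization(625_000_000, 1_250_000_000, 10)
    print(f"in: {inbound:.0%}  out: {outbound:.0%}")
```

Outbound usually saturates first on a Kafka broker, since each message is written once but may be read by many consumers and replicas.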

Kafka’s OK, Now What?
 If Kafka is working properly, it’s probably a client issue
– Don’t throw it over the fence. Help your customers understand
 Common producer issues
– Batch size and linger time
– Receive and send buffers
– Sync vs. async, and acknowledgements
 Common consumer issues
– Garbage collection problems
– Min fetch bytes and max wait time
– Not enough partitions
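The client issues listed above map onto a handful of configuration keys in the Java producer and consumer. The sketch below collects them as plain dictionaries and renders them as properties text. The key names are the standard client configuration names; every value is a placeholder for illustration, not a recommendation from the talk, and the helper name is hypothetical.

```python
# Placeholder values only: tune against your own workload.
producer_overrides = {
    "batch.size": 65536,             # bytes buffered per partition before sending
    "linger.ms": 10,                 # how long to wait for a batch to fill
    "send.buffer.bytes": 1048576,    # TCP send buffer
    "receive.buffer.bytes": 1048576, # TCP receive buffer
    "acks": "all",                   # acknowledgements required per write
}

consumer_overrides = {
    "fetch.min.bytes": 1024,         # broker holds the fetch until this much data exists
    "fetch.max.wait.ms": 500,        # ...or until this much time has passed
}


def to_properties(config):
    """Render a config dict as sorted key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


if __name__ == "__main__":
    print(to_properties(producer_overrides))
    print(to_properties(consumer_overrides))
```

Batch size and linger time trade latency for throughput on the producer side; minimum fetch bytes and maximum wait time make the same trade on the consumer side.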

Conclusion

One Ecosystem
 Kafka can scale to millions of messages per second, and more
– Operations must scale the cluster appropriately
– Developers must use the right tuning and go parallel
 Few problems are owned by only one side
– Expanding partitions often requires coordination
– Applications that need higher reliability drive cluster configurations
 Either we work together, or we fail separately

Would You Like To Know More?
 Presentations: http://www.slideshare.net/toddpalino
– More Datacenters, More Problems
– Kafka As A Service
– Always download the originals for slide notes!
 Blog Posts: https://engineering.linkedin.com/blog
– Development and SRE blogs on Kafka and other topics
 LinkedIn Open Source: https://github.com/linkedin/streaming
– Burrow Consumer Monitoring - https://github.com/linkedin/Burrow
– Kafka Admin Tools - https://github.com/linkedin/kafka-tools

Getting Involved With Kafka
 http://kafka.apache.org
 Join the mailing lists
– users@kafka.apache.org
– dev@kafka.apache.org
 irc.freenode.net - #apache-kafka
 Meetups
– Apache Kafka - http://www.meetup.com/http-kafka-apache-org
– Bay Area Samza - http://www.meetup.com/Bay-Area-Samza-Meetup/
 Contribute code

Data @ LinkedIn is Hiring!
 Streams Infrastructure
– Kafka pub/sub ecosystem
– Stream Processing Platform built on Apache Samza
– Next Generation change capture technology (incubating)
 LinkedIn
– Strong commitment to open source
– Do cool things and work with awesome people
 Join us in working on cutting edge stream processing infrastructures
– Please contact kparamasivam@linkedin.com
– Software developers and Site Reliability Engineers at all levels

Appendix

JDK Options
 Heap Size:
-Xmx6g -Xms6g
 Metaspace:
-XX:MetaspaceSize=96m -XX:MinMetaspaceFreeRatio=50
-XX:MaxMetaspaceFreeRatio=80
 G1 Tuning:
-XX:+UseG1GC -XX:MaxGCPauseMillis=20
-XX:InitiatingHeapOccupancyPercent=35
-XX:G1HeapRegionSize=16M
 GC Logging:
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution
-Xloggc:/path/to/logs/gc.log -verbose:gc
 Error Handling:
-XX:-HeapDumpOnOutOfMemoryError
-XX:ErrorFile=/path/to/logs/hs_err.log

OS Tuning Parameters
 Networking:
net.core.rmem_default = 124928
net.core.rmem_max = 2048000
net.core.wmem_default = 124928
net.core.wmem_max = 2048000
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_max_tw_buckets = 262144
net.ipv4.tcp_max_syn_backlog = 1024

OS Tuning Parameters (cont.)
 Virtual Memory
vm.oom_kill_allocating_task = 1
vm.max_map_count = 200000
vm.swappiness = 1
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 500
vm.dirty_ratio = 60
vm.dirty_background_ratio = 5
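Settings like these tend to drift between hosts, so it can be useful to compare the intended values against what the kernel currently reports. The sketch below parses `key = value` lines like those on these slides and reads the live values from `/proc/sys` (Linux only). The function names are hypothetical, and the `reader` parameter exists only so the comparison can be exercised without touching the real system.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines into {key: [value tokens]}."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Split on whitespace so multi-value keys compare token by token.
        settings[key.strip()] = value.split()
    return settings


def proc_path(key):
    """Map a sysctl key to its /proc/sys path (dots become slashes)."""
    return "/proc/sys/" + key.replace(".", "/")


def read_live(key):
    """Read the current value tokens for a sysctl key from /proc/sys."""
    with open(proc_path(key)) as handle:
        return handle.read().split()


def drift(expected, reader=read_live):
    """Return {key: (wanted, found)} for every key that does not match."""
    mismatches = {}
    for key, wanted in expected.items():
        try:
            found = reader(key)
        except OSError:
            found = None  # key missing or unreadable on this host
        if found != wanted:
            mismatches[key] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    wanted = parse_sysctl("vm.swappiness = 1\nvm.dirty_background_ratio = 5\n")
    print(drift(wanted))
```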

Kafka Broker Sensors
kafka.server:name=BytesInPerSec,type=BrokerTopicMetrics
kafka.server:name=BytesOutPerSec,type=BrokerTopicMetrics
kafka.server:name=MessagesInPerSec,type=BrokerTopicMetrics
kafka.server:name=PartitionCount,type=ReplicaManager
kafka.server:name=LeaderCount,type=ReplicaManager
kafka.server:name=UnderReplicatedPartitions,type=ReplicaManager
kafka.server:name=RequestHandlerAvgIdlePercent,type=KafkaRequestHandlerPool
kafka.controller:name=ActiveControllerCount,type=KafkaController
kafka.controller:name=OfflinePartitionsCount,type=KafkaController
kafka.log:name=max-dirty-percent,type=LogCleanerManager
kafka.network:name=NetworkProcessorAvgIdlePercent,type=SocketServer
kafka.network:name=RequestsPerSec,request=*,type=RequestMetrics
kafka.network:name=RequestQueueTimeMs,request=*,type=RequestMetrics
kafka.network:name=LocalTimeMs,request=*,type=RequestMetrics
kafka.network:name=RemoteTimeMs,request=*,type=RequestMetrics
kafka.network:name=ResponseQueueTimeMs,request=*,type=RequestMetrics
kafka.network:name=ResponseSendTimeMs,request=*,type=RequestMetrics
kafka.network:name=TotalTimeMs,request=*,type=RequestMetrics
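Each sensor name above has the shape `domain:key=value,key=value`, where a `*` value stands for "every request type". A minimal sketch that splits such a name into its parts so a metrics collector can group or filter sensors; the function names are hypothetical and the parser handles only the simple unquoted form used on this slide.

```python
def parse_object_name(name):
    """Split 'domain:k1=v1,k2=v2' into (domain, {k1: v1, k2: v2})."""
    domain, _, props = name.partition(":")
    return domain, dict(pair.split("=", 1) for pair in props.split(","))


def wildcard_keys(name):
    """Return the property keys whose value is the '*' wildcard."""
    _, props = parse_object_name(name)
    return sorted(key for key, value in props.items() if value == "*")


if __name__ == "__main__":
    sensor = "kafka.network:name=TotalTimeMs,request=*,type=RequestMetrics"
    print(parse_object_name(sensor))
    print(wildcard_keys(sensor))
```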

Kafka Broker Sensors - Topics
kafka.server:name=BytesInPerSec,type=BrokerTopicMetrics,topic=*
kafka.server:name=BytesOutPerSec,type=BrokerTopicMetrics,topic=*
kafka.server:name=MessagesInPerSec,type=BrokerTopicMetrics,topic=*
kafka.server:name=TotalProduceRequestsPerSec,type=BrokerTopicMetrics,topic=*
kafka.server:name=FailedProduceRequestsPerSec,type=BrokerTopicMetrics,topic=*
kafka.server:name=TotalFetchRequestsPerSec,type=BrokerTopicMetrics,topic=*
kafka.server:name=FailedFetchRequestsPerSec,type=BrokerTopicMetrics,topic=*
kafka.log:type=Log,name=LogEndOffset,topic=*,partition=*