[Diagram: three ZooKeeper nodes, three Kafka brokers, and client apps in Java, C#, Go, and Python]
Planning for storage and containers
[Diagram: State in Kafka; Stateless – No Storage; State on local disk; C# app; Go app]
The Storage Questions
Throughput * Retention
• Control Center: it is complicated
• Are SSDs
• Is shared
• RAID vs JBOD
• XFS or EXT4
• ZooKeeper log
• Per topic
• Per broker
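As a rough illustration of the "Throughput * Retention" rule from the storage questions above, here is a minimal sketch of the arithmetic. The replication factor, the headroom fraction, and the function names are assumptions added for illustration; they are not figures from this deck.

```python
MB = 1000 ** 2
TB = 1000 ** 4


def required_storage_bytes(write_mb_per_sec, retention_days,
                           replication_factor=3, headroom=0.25):
    """Estimate cluster-wide disk needs as throughput * retention.

    replication_factor and headroom are illustrative assumptions:
    every replica stores a full copy, and some free space is kept
    so disks do not fill up during spikes.
    """
    retention_seconds = retention_days * 24 * 60 * 60
    raw_bytes = write_mb_per_sec * MB * retention_seconds
    return raw_bytes * replication_factor * (1 + headroom)


def per_broker_tb(total_bytes, brokers):
    """Spread the total evenly across brokers (assumes balanced partitions)."""
    return total_bytes / brokers / TB


if __name__ == "__main__":
    total = required_storage_bytes(write_mb_per_sec=10, retention_days=7)
    print(f"cluster total: {total / TB:.2f} TB")
    print(f"per broker (3 brokers): {per_broker_tb(total, 3):.2f} TB")
```

The estimate covers Kafka log data only. ZooKeeper logs and stateful applications that keep state on local disk need their own budget.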
Memory (Page Cache / JVM Heap / Off Heap Memory)
ZooKeeper: 1-4 GB
Kafka: The more the …; # partitions * max fetch + compaction buffer + …
Kafka Connect: # tasks * # partitions * memory buffer per …
Kafka Streams: 10GB Buffer Cache + 1MB per partition or 50MB per …
REST Proxy: ~ 1 GB
Schema Registry: ~ 1 GB
Clients: Java client – batching and retries; Other clients – batching and retries
Control Center: It is a streams app… / 32GB / It is a streams app…
• Keep an eye on ZooKeeper and Kafka
• High CPU is usually caused by misconfiguration
• Or by compression, encryption, or a high request rate
• 1GbE ≈ 100MB/s in practice (including replication!)
• Leave room for catching up
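To make the network bullets above concrete, the sketch below checks whether a workload fits on a 1GbE link once replication traffic and catch-up room are counted. It assumes load is spread evenly across brokers, a full-duplex link, and a 30% catch-up reserve; those assumptions and the function names are illustrative, not taken from this deck.

```python
def per_broker_traffic(produce_mb_s, replication_factor, consumer_groups, brokers):
    """Return (inbound, outbound) MB/s per broker, assuming even load.

    Inbound  = producer writes + replica copies received by followers.
    Outbound = replica copies sent by leaders + one full read per consumer group.
    """
    replication = produce_mb_s * (replication_factor - 1)
    inbound = produce_mb_s + replication
    outbound = replication + produce_mb_s * consumer_groups
    return inbound / brokers, outbound / brokers


def fits_on_link(produce_mb_s, replication_factor, consumer_groups, brokers,
                 link_mb_s=100.0, catch_up_reserve=0.3):
    """True if the busier direction stays within the link budget.

    link_mb_s=100 follows the 1GbE rule of thumb; catch_up_reserve is an
    assumed fraction held back so lagging replicas and consumers can catch up.
    """
    inbound, outbound = per_broker_traffic(
        produce_mb_s, replication_factor, consumer_groups, brokers)
    budget = link_mb_s * (1.0 - catch_up_reserve)
    return max(inbound, outbound) <= budget


if __name__ == "__main__":
    # 50 MB/s produced cluster-wide, 3 replicas, 2 consumer groups, 3 brokers
    print(per_broker_traffic(50, 3, 2, 3))
    print(fits_on_link(50, 3, 2, 3))
```

Doubling the produce rate in this example pushes outbound traffic past the budget, which is the point at which a faster network or more brokers would be needed.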
Special Considerations for Clouds
• Virtual cores are relatively weak
• Network is typically weak
• Shared storage is typically awesome
Planning a Deployment of Confluent Platform
1. Which components do I need?
2. Small or Large cluster?
Kafka just for one team and one app?
Centralized cluster for larger organization?
3. Other requirements?
Availability, retention, latency, throughput.
4. How many clusters?
Place many related
on each node
Scale each component as
Some things take time and resources to change:
• Additional partitions
• Additional brokers
• Additional ZooKeeper instances
• Additional Control Centers (impossible right now)
Monitor your cluster closely to know when to add resources
What did we learn?
1. Key components in Confluent Platform and their requirements
2. Key resources and how to use them effectively
3. Planning your deployment
4. Monitoring in order to scale
5. Want to learn more? There is a paper!