12. 12
[Diagram: Confluent Platform components. ZooKeeper (x3), Kafka (x3), client apps (Java, C#, Go, Python), REST Proxy, Connect with Connectors, Streams apps, Schema Registry, Replicator, ADB, Control Center]
13. 13
Planning for storage and containers
[Diagram: components grouped by where they keep state. Groups: State in Kafka; Stateless (no storage); State on local disk. Components shown: ZooKeeper, Kafka, Streams apps, Control Center, Java app, C# app, Go app, Python app, REST Proxy, Connect, Connectors, Schema Registry, Replicator, ADB]
15. 15
The Storage Questions
How Much?
• Kafka: Throughput * Retention
• ZooKeeper: Very little
• Control Center: Lots
• Streams: It is complicated
Hardware
• Are SSDs worth it?
• Is shared storage ok?
Configuration
• RAID vs JBOD
• XFS or EXT4
• ZooKeeper log
Partitions
• Per topic
• Per broker
• Total
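The "Throughput * Retention" rule for Kafka storage can be turned into a quick back-of-the-envelope calculation. The sketch below is only an illustration: the function name, the decimal megabyte unit, and the replication_factor parameter are assumptions that do not appear on the slide, which states only throughput times retention.

```python
def kafka_storage_bytes(throughput_mb_per_s, retention_hours, replication_factor=1):
    """Rough disk estimate in bytes: throughput * retention.

    replication_factor is an assumption added for illustration; the slide
    itself only states throughput * retention.
    """
    retention_seconds = retention_hours * 3600
    bytes_per_second = throughput_mb_per_s * 1_000_000  # decimal MB, assumed
    return bytes_per_second * retention_seconds * replication_factor


if __name__ == "__main__":
    # Hypothetical workload: 50 MB/s kept for 7 days with 3 replicas.
    total = kafka_storage_bytes(50, 7 * 24, replication_factor=3)
    print(f"{total / 1e12:.1f} TB across the cluster")
```

The result is a cluster-wide figure before any free-space headroom; dividing by the number of brokers gives a rough per-broker starting point.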
16. 16
Memory
Per component, by memory type (Page Cache, JVM Heap, Off Heap Memory):
• ZooKeeper: JVM Heap 1-4 GB
• Kafka: Page Cache "the more the merrier"; JVM Heap # partitions * max fetch + compaction buffer + ~10%
• Kafka Connect: JVM Heap # tasks * # partitions * memory buffer per partition
• Kafka Streams: JVM Heap 10GB buffer cache + 1MB per partition or 50MB per broker, ≈ 32GB; Off Heap Memory RocksDB
• REST Proxy: JVM Heap ~1 GB
• Schema Registry: JVM Heap ~1 GB
• Clients: JVM Heap for the Java client (batching and retries); Off Heap Memory for other clients (batching and retries)
• Control Center: JVM Heap 32GB; Page Cache and Off Heap Memory as for Streams ("It is a streams app…")
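The two sizing formulas in the memory table (for Kafka and for Kafka Connect) are simple enough to compute directly. The sketch below reads them as JVM heap estimates and treats "+ ~10%" as 10% headroom on top of the sum; both readings are assumptions, as are the function and parameter names.

```python
def broker_heap_bytes(partitions, max_fetch_bytes, compaction_buffer_bytes):
    """# partitions * max fetch + compaction buffer, plus ~10% headroom.

    Applying the 10% to the whole sum is an interpretation of the slide.
    """
    base = partitions * max_fetch_bytes + compaction_buffer_bytes
    return base * 1.10


def connect_heap_bytes(tasks, partitions, buffer_per_partition_bytes):
    """# tasks * # partitions * memory buffer per partition."""
    return tasks * partitions * buffer_per_partition_bytes


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Hypothetical broker: 2000 partitions, 1 MiB max fetch, 128 MiB compaction buffer.
    print(f"broker heap:  {broker_heap_bytes(2000, 1 * MIB, 128 * MIB) / 1024**3:.2f} GiB")
    # Hypothetical Connect worker: 8 tasks, 10 partitions, 1 MiB buffer per partition.
    print(f"connect heap: {connect_heap_bytes(8, 10, 1 * MIB) / MIB:.0f} MiB")
```

These numbers are lower bounds for the buffers named on the slide, not a complete heap budget.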
17. 17
• CPU
  • Keep an eye on ZooKeeper and Kafka
  • High CPU is usually caused by misconfiguration
  • Or by compression, encryption, or a high request rate
• Network
  • 1GbE ≈ 100MB/s usable rate (including replication!)
  • Leave room for catching up
  • Compress!
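To see why the network budget "including replication" shrinks quickly, here is a rough sketch. It assumes a deliberately simplified model that is not from the slide: every produced byte crosses a broker's link once per replica, consumer traffic is ignored, and a fixed fraction of the link is reserved for catching up. The function name and the default values are illustrative only.

```python
def max_ingest_mb_per_s(link_mb_per_s=100.0, replication_factor=3, headroom=0.3):
    """Approximate sustainable produce rate per broker link, in MB/s.

    Simplified model (an assumption): each produced byte uses the link once
    per replica, and `headroom` of the link is kept free for catch-up.
    """
    usable = link_mb_per_s * (1.0 - headroom)
    return usable / replication_factor


if __name__ == "__main__":
    # 1GbE at roughly 100 MB/s, 3 replicas, 30% kept free for catching up.
    print(f"{max_ingest_mb_per_s():.1f} MB/s of new data per broker link")
```

Compression raises the effective figure, since the link carries compressed bytes.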
18. 18
Special Considerations for Clouds
• Virtual cores are relatively weak
• Network is typically weak
• Shared storage is typically awesome
20. 20
Ask yourself:
1. Which components do I need?
2. Small or large cluster?
Kafka just for one team and one app?
Centralized cluster for a larger organization?
3. Other requirements?
Availability, retention, latency, throughput.
4. How many clusters?
28. 28
What did we learn?
1. Key components in Confluent Platform and their requirements
2. Key resources and how to use them effectively
3. Planning your deployment
4. Monitoring in order to scale
5. Want to learn more? There is a paper!
https://www.confluent.io/whitepaper/confluent-enterprise-reference-architecture/