Apache Kafka - Patterns anti-patterns

Devoxx France 2019 presentation

Published in: Software

  1. #DevoxxFR Apache Kafka Patterns / AntiPatterns, by Florent Ramière (@framiere) and Jean-Louis Boudart (@jlboudart)
  2. Problem?
  3. Silos explained by the Data Gravity concept: As data accumulates (builds mass), there is a greater likelihood that additional services and applications will be attracted to this data. This is the same effect gravity has on objects around a planet. As the mass or density increases, so does the strength of gravitational pull.
  4. With
  5. How
  6. Store & ETL Process Publish & Subscribe In short
  7. From a simple idea
  9. with great properties! Scalability, Retention, Durability, Replication, Security, Resiliency, Throughput, Ordering, Exactly Once Semantics, Transactions, Idempotency, Immutability, …
  10. So goooooood
  11. What could potentially go wrong?
  13. …which is true for any data system
  14. Not thinking about durability
  15. Data durability: If you didn’t think about it… it’s not durable!
  19. And you might have lost data!
  20. Data durability: Kafka does not wait for a disk flush by default. Durability is achieved through replication.
  25. Data durability: It depends on your configuration...
  30. Data durability: acks=1 (the default value before Kafka 3.0) is good for latency; acks=all is good for durability.
  32. acks=all: The leader will wait for the full set of in-sync replicas to acknowledge the record.
  36. min.insync.replicas: The minimum number of in-sync replicas that must acknowledge a write when the producer uses acks=all. Default is 1.
  38. Data durability while producing? Tune it with the parameters acks and min.insync.replicas.
  39. Defaults: The default values are optimized for availability & latency. If durability is more important, tune it!
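
The durability settings named on the slides above could be combined roughly as in the following minimal sketch. It uses plain `java.util.Properties` with string keys so that it stands alone without the Kafka client library. The class name `DurabilityConfig`, the broker address `localhost:9092`, and the value `min.insync.replicas=2` (which presumes a topic with replication factor 3) are assumptions for illustration, not values from the talk.

```java
import java.util.Map;
import java.util.Properties;

final class DurabilityConfig {

    // Producer-side settings: favor durability over latency.
    static Properties producerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Wait for the full set of in-sync replicas to acknowledge each record.
        props.setProperty("acks", "all");
        return props;
    }

    // Topic-level (or broker-level) setting; it is not a producer setting.
    // The value 2 is an assumption that presumes replication factor 3.
    static Map<String, String> topicConfig() {
        return Map.of("min.insync.replicas", "2");
    }

    public static void main(String[] args) {
        System.out.println("producer: " + producerProps());
        System.out.println("topic:    " + topicConfig());
    }
}
```

The producer properties would be handed to a `KafkaProducer` constructor, and the topic setting applied when the topic is created or altered. Under the assumed replication factor of 3, a write with acks=all still succeeds while one replica is out of sync and is rejected once only one in-sync replica remains.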
  40. Deploying across multiple datacenters?
  42. Multi-DC: It’s quite complicated... It’s easy to get it wrong on many levels. It could be a 3h talk.
  43. Multi-DC: Disaster recovery for multi-datacenter
  44. What about the consumers?
  45. Consumers: A consumer can read only committed data (records replicated to all in-sync replicas).
  47. Think about data durability and decide on the best trade-off for you
  48. Throughput, latency, durability, availability: Optimizing your Apache Kafka deployment
  49. Focusing only on the happy path
  53. retries: It will cause the client to resend any record whose send fails with a potentially transient error. Default value: 0 (Integer.MAX_VALUE since Kafka 2.1)
  55. retries: Use built-in retries! Bump it from 0 to infinity!
  56. retries: But you are exposed to a different kind of issue… (retried sends can produce duplicate records)
  58. enable.idempotence: When set to 'true', the producer will ensure that exactly one copy of each message is written. Default value: false (true since Kafka 3.0)
  61. Use built-in idempotency!
  62. But it does not save you from: managing exceptions and failures; developing an idempotent consumer
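
As a rough illustration of the retries and enable.idempotence slides above, the sketch below builds a producer configuration with string keys only, so it needs nothing beyond the JDK. The class name `ResilientProducerConfig` and the broker address are assumptions. The `acks` and `max.in.flight.requests.per.connection` lines are added because the idempotent producer requires acks=all and at most 5 in-flight requests per connection.

```java
import java.util.Properties;

final class ResilientProducerConfig {

    static Properties producerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        // Let the client retry transient send failures instead of hand-rolling a retry loop.
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        // Have the broker deduplicate retried sends so retries do not create duplicates.
        props.setProperty("enable.idempotence", "true");
        // Constraints that come with idempotence.
        props.setProperty("acks", "all");
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        producerProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

As the slides point out, this only covers the producer-to-broker hop: the application still has to handle send failures and make its consumers idempotent.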
  63. No idempotent consumer
  65. At least once (default), at most once, exactly once
  72. commit: Manually committing aggressively... adds a huge workload on Apache Kafka
  74. commit: Manually committing aggressively... does not provide exactly-once semantics
  75. Embrace at least once
  76. Rely on Kafka Streams with exactly once!
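
One way to "embrace at least once" is to make the consumer's side effect idempotent. The following is a minimal in-memory sketch, not the speakers' implementation: it remembers the highest offset already applied per topic-partition and skips redeliveries. The class `IdempotentHandler`, the `payments` topic, and the running total used as the side effect are hypothetical. A real system would have to persist the offset together with the side effect (for example in the same database transaction).

```java
import java.util.HashMap;
import java.util.Map;

final class IdempotentHandler {

    // Highest offset already applied, keyed by "topic-partition".
    private final Map<String, Long> lastApplied = new HashMap<>();
    private long total = 0;

    /** Applies the amount once per (topic, partition, offset); returns false on redelivery. */
    boolean handle(String topic, int partition, long offset, long amount) {
        String key = topic + "-" + partition;
        Long last = lastApplied.get(key);
        if (last != null && offset <= last) {
            return false; // already processed: skip the side effect
        }
        total += amount;               // the side effect
        lastApplied.put(key, offset);  // remember progress for this partition
        return true;
    }

    long total() {
        return total;
    }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        handler.handle("payments", 0, 41, 10);
        handler.handle("payments", 0, 42, 5);
        handler.handle("payments", 0, 42, 5); // redelivered after a crash or rebalance
        System.out.println(handler.total());  // prints 15
    }
}
```

Tracking only the highest offset is enough here because records within one partition are delivered in offset order.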
  77. No exception handling
  79. Future<RecordMetadata> send(ProducerRecord<K, V> record);
  80. Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback);
  81. producer.send(record, (metadata, exception) -> { /* exception is non-null if the send failed */ });
  82. Error handling: We don’t expect the unexpected until the unexpected is expected.
  84. Error handling: A message cannot be processed
  85. Error handling: A message cannot be processed. A message doesn’t have the expected schema
  86. Retry
  88. Infinite retry
  90. Write to a dead letter queue and continue
  92. Ignore and continue
  94. No silver bullet
  95. Handle the exceptions! https://eng.uber.com/reliable-reprocessing/
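
The "retry" and "write to a dead letter queue and continue" options above can be combined. The sketch below is one possible shape, written against plain JDK interfaces so it runs without Kafka: it tries a record a bounded number of times, then hands it to a dead letter sink and moves on. `RetryThenDeadLetter`, its parameters, and the in-memory list standing in for a dead letter topic are all hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class RetryThenDeadLetter<T> {

    private final Consumer<T> processor;  // business logic, may throw
    private final Consumer<T> deadLetter; // e.g. a producer writing to a dead letter topic
    private final int maxAttempts;

    RetryThenDeadLetter(Consumer<T> processor, Consumer<T> deadLetter, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.processor = processor;
        this.deadLetter = deadLetter;
        this.maxAttempts = maxAttempts;
    }

    /** Returns true if the record was processed, false if it went to the dead letter sink. */
    boolean handle(T record) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                processor.accept(record);
                return true;
            } catch (RuntimeException e) {
                // Ignored in this sketch; real code would log and back off between attempts.
            }
        }
        deadLetter.accept(record);
        return false;
    }

    public static void main(String[] args) {
        List<String> dlq = new ArrayList<>();
        RetryThenDeadLetter<String> handler = new RetryThenDeadLetter<>(
                msg -> {
                    if (msg.isEmpty()) {
                        throw new IllegalArgumentException("empty payload");
                    }
                },
                dlq::add,
                3);
        System.out.println(handler.handle("ok")); // prints true
        System.out.println(handler.handle(""));   // prints false
        System.out.println(dlq.size());           // prints 1
    }
}
```

As the "no silver bullet" slide says, the right choice depends on the use case: a bounded retry suits transient failures, while a record with an unexpected schema will never succeed and is better routed aside immediately.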
  96. No data governance
  100. Governance: Changes in producers might impact consumers
  101. Governance: Schema registry
  104. Share schemas
  105. Letting bad citizens wander around
  107. Leverage security, ACLs and quotas. See: Security; Authorization and ACLs; Enforcing Client Quotas
  108. Installing prod on Sunday night
  110. Configuration: If you use the default configuration… you will have issues!
  111. Please read the doc: Running Kafka in Production; Running ZooKeeper in Production
  112. Not configuring your OS
  114. OS: Tune at least your open file descriptors and mmap count (vm.max_map_count).
  115. Configure your OS: Running Kafka in Production
  116. Disregarding Apache ZooKeeper
  117. Not understanding ordering
  118. No monitoring
  119. Too many partitions
  120. Not enough partitions
  121. Partition key choice
  122. Topics vs partitions
  123. Calling external services in Kafka Streams
  124. Questions
