
Event streaming (Symfony World 2020)

Event Streaming: An alternative to CRUD & Batch processing. Some of the things that will go wrong with distributed systems.

Published in: Engineering

  1. Event Streaming: Some things you want to know about. @samuelroze
  2. Introduction • My name is Samuel Rozé. I am VPoE at Birdie Care and a Symfony Core Team member, for my work on Messenger. • This is an architecture talk. • We will briefly discuss the value of event streaming. • We will see the consequences of living the dream of managing distributed systems. TL;DR: plenty of things will go wrong.
  3. Part 1. Why …is event streaming even interesting?
  4. Your product works… you split your services.
  5. Your services are talking to each other.
  6. Now you need to introduce targeted discounts…
  7. Now you need to introduce targeted discounts…
  8. How will this service get its data? 1. Pull via an API • A lot of data will be moved each time the “discount” service computes discounts for a customer. • “Discounts” can work only when the 3 other services are available (cascading failures). • “Discounts” needs to know where the other services are and how to talk to them. 2. Using “batch” • Potentially contains loads of duplicated information (a full load each time, or a period of “over X days”). • Not real-time: “Wait a few days for your marketing preferences to be propagated.” • A lot can go wrong when every service has to create its exports properly every night.
  9. Event streaming • Events flow in real time, from and to multiple services. • To receive a specific event, services don’t have to know who is sending it, just that they can expect these messages. • Much higher availability, because data reaches the services that require it whenever they are online. • (When the bus persists messages) New consumers build their context by going through all the events that have happened in the system. • Writing code that works well with the nature of a distributed system is hard. • You need real governance over how the message bus is used: messages are your new API contracts.
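To make the "new consumers build their context" point concrete, here is a minimal in-memory sketch in Python. `EventLog`, `Event` and the `OrderPlaced` event are hypothetical names invented for this illustration; a real persistent bus would provide the storage and replay.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Event:
    type: str
    payload: dict


class EventLog:
    """Hypothetical in-memory bus that keeps every event (append-only)."""

    def __init__(self) -> None:
        self._events: list[Event] = []
        self._subscribers: dict[str, list[Callable[[Event], None]]] = defaultdict(list)

    def publish(self, event: Event) -> None:
        # The publisher does not know who is listening.
        self._events.append(event)
        for handler in self._subscribers[event.type]:
            handler(event)

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        # A new consumer first replays history, then receives live events.
        for event in self._events:
            if event.type == event_type:
                handler(event)
        self._subscribers[event_type].append(handler)


log = EventLog()
log.publish(Event("OrderPlaced", {"customer": "alice", "total": 40}))
log.publish(Event("OrderPlaced", {"customer": "alice", "total": 60}))

# The "discounts" service is added later and rebuilds its own view.
spend_per_customer: dict[str, int] = defaultdict(int)


def track_spend(event: Event) -> None:
    spend_per_customer[event.payload["customer"]] += event.payload["total"]


log.subscribe("OrderPlaced", track_spend)
log.publish(Event("OrderPlaced", {"customer": "bob", "total": 10}))
```

The consumer never calls the order service. It only needs to know which event types to expect.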
  10. Event streaming, as a diagram
  11. Everything we are going to talk about is true for…
  12. Part 2. What will go wrong? It’s not “if”.
  13. Let’s start with a simple use-case. Here we write to the `Basket` entity, for example.
  14. Your message is sent to a queue
  15. Problem A: Are you sure that the message was sent to the queue?
  16. A. Are you sure that your message has been sent?
  17. A. Are you sure that your message has been sent?
  18. A. Distributed transactions are not really a thing.
  19. A. What might happen if we don’t care about that? • Your local “basket” table might have the new product but no worker receives the “ProductAddedToBasket” event. • Your local “basket” table might NOT have the new product but workers have received the “ProductAddedToBasket” event (most likely if you use database transactions for the entire request). • Imagine the event being about “payment successful” or even potentially life-changing like “fall in the home has been detected”… 😬
  20. A. The outbox pattern
  21. A. Publishing messages to the bus consistently • In a nutshell, write your message and side effects to your database as part of one transaction, and then get something else to pull the message from the database and send it to your queue. • With Symfony, the simplest option is to use the Doctrine transport for Symfony Messenger, with the Doctrine transaction middleware. • Alternatively, you can use a dedicated library for this. • EventSaucePHP/DoctrineOutboxMessageDispatcher • italolelis/outboxer
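The outbox idea described above can be sketched in a few lines of Python with `sqlite3`. This is an illustration under assumptions, not the Symfony or Doctrine implementation: the `outbox` table layout, `add_product_to_basket` and `relay_outbox` are hypothetical names.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE basket (customer_id TEXT, product_id TEXT)")
db.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "body TEXT NOT NULL, "
    "sent INTEGER NOT NULL DEFAULT 0)"
)


def add_product_to_basket(customer_id: str, product_id: str) -> None:
    # One local transaction: the state change and the message are
    # committed together or not at all.
    with db:
        db.execute("INSERT INTO basket VALUES (?, ?)", (customer_id, product_id))
        body = json.dumps(
            {
                "type": "ProductAddedToBasket",
                "customer_id": customer_id,
                "product_id": product_id,
            }
        )
        db.execute("INSERT INTO outbox (body) VALUES (?)", (body,))


def relay_outbox(publish) -> int:
    """Separate process in practice: pushes unsent messages to the queue."""
    rows = db.execute(
        "SELECT id, body FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for message_id, body in rows:
        publish(json.loads(body))
        # A crash between publish and this update means the message is
        # sent again later: delivery is at least once, not exactly once.
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (message_id,))
    return len(rows)


queue: list[dict] = []  # stands in for the real message bus
add_product_to_basket("alice", "sku-42")
relay_outbox(queue.append)
```

Note that the relay can publish a message twice if it crashes before marking it as sent, so consumers still need to tolerate duplicates.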
  22. A. Using the Doctrine transport
  23. Problem B: You will receive duplicated messages.
  24. B. “At least once delivery”
  25. B. What might happen if we don’t care about that? • You consume “ProductAddedToBasket” twice: the product is added twice instead of just once (as per the user request). • Depending on your business logic, it might be very important. For example, what if it is about “Money added to bank account” or “Medication dose taken”?
  26. B. You need some idempotence. • You will receive the same message multiple times; it’s just a matter of time. • There isn’t much a framework can do. You own the business logic, so you need to handle it yourself. • Use an idempotency key: a key that represents a single message and allows you to know whether or not it has been processed already. • (By the way, this also applies to HTTP requests. Stripe’s API is a good example.)
  27. B. Using the idempotency key in the handler • One option is to have your idempotency key be part of your message. Your team needs to know why it is useful and how to use it.
  28. B. How to use your “idempotency key”
  29. B. How to use your “idempotency key”
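The code from these slides is not part of this transcript. As a rough illustration of the idea, here is a Python sketch of a handler that carries the idempotency key in the message. `BasketHandler` and the field names are hypothetical, and the in-memory set stands in for durable storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductAddedToBasket:
    idempotency_key: str  # generated once, by the producer
    basket_id: str
    product_id: str


class BasketHandler:
    def __init__(self) -> None:
        self.baskets: dict[str, list[str]] = {}
        # Assumption: in a real system this set lives in the same database
        # as the basket and is written in the same transaction.
        self.processed_keys: set[str] = set()

    def __call__(self, message: ProductAddedToBasket) -> bool:
        if message.idempotency_key in self.processed_keys:
            return False  # duplicate delivery: do nothing
        self.baskets.setdefault(message.basket_id, []).append(message.product_id)
        self.processed_keys.add(message.idempotency_key)
        return True


handler = BasketHandler()
message = ProductAddedToBasket("key-1", "basket-1", "sku-42")
handler(message)
handler(message)  # redelivered by the bus, ignored by the handler
```

The key has to be created by the producer and stay the same across redeliveries. A key generated by the consumer on receipt would not detect duplicates.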
  30. Problem C: Processing messages in parallel.
  31. C. Concurrently processing messages
  32. C. What might happen if we don’t care about that? • Sooner or later, you will lose some of the state that you update based on the events. • The easiest solution: don’t process things concurrently. But that is not really practical when things start to scale.
  33. C. Locking! Optimistic vs Pessimistic The optimist… • Assumes that everything will go right most of the time. • Validates that everything happened as expected when writing its state to consistent storage. • e.g. HTTP’s If-Match, … The pessimist… • Believes that in most cases, this won’t work. • Before doing any work, ensures nobody else is doing it. • e.g. “mutex”, “advisory locks”, etc.
  34. C. Pessimistic locking with Symfony Lock
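The slide's Symfony Lock example is not in this transcript. The sketch below shows the pessimistic idea in Python with an in-process mutex per resource. `lock_for` and `handle_product_added` are hypothetical names, and multiple workers on different machines would need a lock held in a shared store rather than `threading.Lock`.

```python
import threading
from collections import defaultdict

# One mutex per resource name. Only valid inside a single process.
_locks: dict[str, threading.Lock] = defaultdict(threading.Lock)
_registry_guard = threading.Lock()


def lock_for(resource: str) -> threading.Lock:
    with _registry_guard:  # protects the registry itself
        return _locks[resource]


basket_totals: dict[str, int] = defaultdict(int)


def handle_product_added(basket_id: str, price: int) -> None:
    # Acquire before doing any work, so nobody else touches this basket.
    with lock_for(f"basket:{basket_id}"):
        current = basket_totals[basket_id]
        basket_totals[basket_id] = current + price


threads = [
    threading.Thread(target=handle_product_added, args=("b1", 5))
    for _ in range(50)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Locking per basket rather than globally keeps unrelated baskets processing in parallel.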
  35. C. Optimistic locking with Doctrine’s “versions”
  36. C. Optimistic locking with Doctrine’s “versions” What’s happening behind the scenes with optimistic locking:
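The diagram for this slide is missing from the transcript. As an approximation of what a version column does, here is a Python sketch using `sqlite3`: the update only succeeds if the row still has the version that was read. The table layout, `load`, `save` and `StaleVersionError` are hypothetical names for this illustration.

```python
import sqlite3


class StaleVersionError(Exception):
    """Someone else updated the row since we read it."""


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE basket (id TEXT PRIMARY KEY, items INTEGER, version INTEGER)")
db.execute("INSERT INTO basket VALUES ('b1', 0, 1)")
db.commit()


def load(basket_id: str) -> tuple[int, int]:
    row = db.execute(
        "SELECT items, version FROM basket WHERE id = ?", (basket_id,)
    ).fetchone()
    return row[0], row[1]


def save(basket_id: str, items: int, expected_version: int) -> None:
    with db:
        cursor = db.execute(
            "UPDATE basket SET items = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (items, basket_id, expected_version),
        )
    if cursor.rowcount == 0:  # the version moved on: nothing was written
        raise StaleVersionError(basket_id)


# Two workers read the same state, then both try to write.
items_a, version_a = load("b1")
items_b, version_b = load("b1")
save("b1", items_a + 1, version_a)  # first writer wins
try:
    save("b1", items_b + 1, version_b)
    conflict = False
except StaleVersionError:
    conflict = True
    items_b, version_b = load("b1")  # reload and retry on fresh state
    save("b1", items_b + 1, version_b)
```

The second writer detects the conflict at write time and retries, so neither increment is lost.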
  37. Problem D: Message ordering
  38. D. Know when there is no ordering guarantee
  39. D. What might happen if we don’t care about that? • Hopefully your business logic doesn’t rely too much on the events being ordered… make sure this is true. • For example, we rely on “access_granted” and “access_revoked” events to configure some permission rules. If they are consumed in the wrong order… the outcome means something different 💥
  40. D. There are buses that guarantee order
  41. D. They scale using partitions
  42. D. Order guaranteed means blocking messages.
  43. D. For the infrastructure to guarantee ordering… • You need a message bus that supports it (Kafka, SQS FIFO, Kinesis, etc.). • You need to carefully design your partitions (or “shards”) so that all messages of a specific aggregate always go to the same partition (a.k.a. routing keys). • You need to carefully manage all the errors. You can’t afford a bad message blocking an entire topic. But you can’t really postpone just a single message…
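The routing-key idea above can be sketched as a stable hash of the aggregate identifier, taken modulo the number of partitions. This is an illustration in Python. `partition_for` is a hypothetical helper, and real buses apply their own partitioning function.

```python
import hashlib


def partition_for(routing_key: str, partition_count: int) -> int:
    # hashlib gives a hash that is stable across processes and restarts,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


PARTITION_COUNT = 4
partitions: list[list[tuple[str, str]]] = [[] for _ in range(PARTITION_COUNT)]

events = [
    ("user-1", "access_granted"),
    ("user-2", "access_granted"),
    ("user-1", "access_revoked"),
]
for aggregate_id, event_type in events:
    # The aggregate id is the routing key, so all events for one user
    # land in the same partition and keep their relative order.
    index = partition_for(aggregate_id, PARTITION_COUNT)
    partitions[index].append((aggregate_id, event_type))
```

Ordering is only preserved within a partition. Events for different users may still be consumed in any relative order, and changing the partition count moves keys to other partitions.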
  44. To wrap up… A few learnings (hopefully).
  45. We’ve seen a few ways it can go wrong. • When publishing a message to a bus. Outbox pattern FTW. • When receiving the same message multiple times. Idempotence FTW. • When concurrently consuming messages. You need to use optimistic or pessimistic locking. • You can request ordering from your infrastructure. But it needs careful partition design and error management.
  46. Thank you! @samuelroze
  47. Want to read more? • Martin Kleppmann’s book: https://dataintensive.net • https://multithreaded.stitchfix.com/blog/2017/06/26/patterns-of-soa-idempotency-key/ • https://microservices.io/patterns/data/transactional-outbox.html