2. @samuelroze
Introduction
• My name is Samuel Rozé. I am VPoE at Birdie Care and a
Core Team member of Symfony, for my work on Messenger.
• This is an architecture talk.
• We will briefly discuss the value of using event streaming.
• We will see the consequences of living the dream of managing distributed
systems. TL;DR: plenty of things will go wrong.
8. @samuelroze
How will this service get its data?
1. Pull via an API
• A lot of data will be moved each
time the “discount” service computes
discounts for a customer.
• “Discounts” is able to work only
when the 3 other services are
available (cascading failures).
• “Discounts” needs to know where
the other services are and
how to talk to them.
2. Using “batch”
• Potentially contains loads of duplicated
information (full load each time or the
period is “over X days”)
• Not real-time. “Wait a few days for your
marketing preferences to be propagated”
• A lot can go wrong when every service has
to create its exports properly every night.
9. @samuelroze
Event streaming
• Events are flowing in real-time, from and
to multiple services.
• To receive a specific event, services don’t
have to know who is sending events, just
that they can expect these messages.
• Much higher availability because data
goes to the service that requires it when it
is online.
• (When the bus persists messages) New
consumers create their context by going
through all the events that have happened in
the system.
• Writing code that works well with the
nature of the distributed system is hard.
• You need real governance over how
the message bus is used: the messages are
your new API contracts.
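A minimal sketch of how a new consumer could build its context by replaying a persisted event log, as described above. The event names (`MarketingOptedIn`, `MarketingOptedOut`), the payload fields and the in-memory `EVENT_LOG` are assumptions made for illustration; a real bus would supply the stored events.

```python
from dataclasses import dataclass, field


@dataclass
class MarketingPreferences:
    """Local view that a new consumer builds purely from events."""

    opted_in: dict = field(default_factory=dict)

    def apply(self, event: dict) -> None:
        # Event types and fields are hypothetical, not taken from the talk.
        if event["type"] == "MarketingOptedIn":
            self.opted_in[event["customer_id"]] = True
        elif event["type"] == "MarketingOptedOut":
            self.opted_in[event["customer_id"]] = False
        # Any other event type is irrelevant to this view and is ignored.


def replay(event_log) -> MarketingPreferences:
    """Rebuild the view from scratch by applying every event in order."""
    view = MarketingPreferences()
    for event in event_log:
        view.apply(event)
    return view


# Stand-in for the events a persistent bus would hand to a new consumer.
EVENT_LOG = [
    {"type": "MarketingOptedIn", "customer_id": "c1"},
    {"type": "MarketingOptedIn", "customer_id": "c2"},
    {"type": "ProductAddedToBasket", "customer_id": "c1"},
    {"type": "MarketingOptedOut", "customer_id": "c1"},
]

view = replay(EVENT_LOG)
```

The consumer never asks the producing services for their current state; it derives its own state from the history alone.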
19. @samuelroze
A. What might happen if we don’t care about that?
• Your local “basket” table might have the new product but no worker receives
the “ProductAddedToBasket” event.
• Your local “basket” table might NOT have the new product but workers have
received the “ProductAddedToBasket” event (most likely if you use
database transactions for the entire request)
• Imagine the event being about “payment successful” or even potentially
life-changing like “fall in the home has been detected”… 😬
21. @samuelroze
A. Publishing messages to bus consistently
• In a nutshell, write your message & side effects to your database as part of
one transaction and then get something else to pull the message from the
database and send it to your queue.
• With Symfony, the simplest option is actually to use the Doctrine transport for
Symfony Messenger, with the Doctrine transaction middleware.
• Alternatively, you can use a dedicated library for this.
• EventSaucePHP/DoctrineOutboxMessageDispatcher
• italolelis/outboxer
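The outbox idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Symfony Messenger or library implementation: the `basket` and `outbox` tables, their columns and the `publish` callback are all hypothetical, and SQLite stands in for the application database.

```python
import json
import sqlite3


def create_schema(db: sqlite3.Connection) -> None:
    # Hypothetical tables: the business data and the outbox live side by side.
    db.execute("CREATE TABLE basket (customer_id TEXT, product_id TEXT)")
    db.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "type TEXT, payload TEXT, sent INTEGER DEFAULT 0)"
    )


def add_product_to_basket(db, customer_id: str, product_id: str) -> None:
    # One transaction: the side effect and the message are committed
    # together, or neither of them is.
    with db:
        db.execute("INSERT INTO basket VALUES (?, ?)", (customer_id, product_id))
        payload = json.dumps({"customer_id": customer_id, "product_id": product_id})
        db.execute(
            "INSERT INTO outbox (type, payload) VALUES (?, ?)",
            ("ProductAddedToBasket", payload),
        )


def relay_outbox(db, publish) -> int:
    """The 'something else': read unsent rows and push them to the queue."""
    rows = db.execute(
        "SELECT id, type, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, message_type, payload in rows:
        publish(message_type, json.loads(payload))
        # If the process dies between publish and this update, the message
        # is published again on the next run (at-least-once delivery).
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because the relay can publish the same row twice after a crash, this pattern gives at-least-once delivery, so consumers still have to cope with duplicates.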
25. @samuelroze
B. What might happen if we don’t care about that?
• You consume “ProductAddedToBasket” twice: the product is added twice
instead of just once (as per the user request).
• Depending on your business logic, this might matter a lot. For example,
what if it is about “Money added to bank account” or “Medication dose
taken”?
26. @samuelroze
B. You need some idempotence.
• You will receive the same message multiple times; it’s just a matter of time.
• There isn’t much a framework can do: you own the business logic, so you
need to handle it yourself.
• Use an idempotency key. A key that represents a single message and
allows you to know whether or not it’s been processed already.
• (By the way, this also applies to HTTP requests. Stripe’s API is a good
example.)
27. @samuelroze
B. Using the idempotency key in the handler
• One option is to have your idempotency key be part of your message. Your
team needs to know why it is useful and how to use it.
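One way a handler could use an idempotency key carried in the message is sketched here. The `processed_message` and `basket_item` tables and the message fields are assumptions for illustration. The key is recorded in the same transaction as the side effect, so a redelivered message is rejected by the primary-key constraint.

```python
import sqlite3


def create_schema(db: sqlite3.Connection) -> None:
    # Hypothetical tables for the sketch.
    db.execute("CREATE TABLE basket_item (basket_id TEXT, product_id TEXT)")
    db.execute(
        "CREATE TABLE processed_message (idempotency_key TEXT PRIMARY KEY)"
    )


def handle_product_added(db: sqlite3.Connection, message: dict) -> bool:
    """Apply the message once. Returns False when it is a duplicate."""
    try:
        with db:  # commits on success, rolls back if an exception is raised
            # The PRIMARY KEY refuses a second insert of the same key.
            db.execute(
                "INSERT INTO processed_message VALUES (?)",
                (message["idempotency_key"],),
            )
            db.execute(
                "INSERT INTO basket_item VALUES (?, ?)",
                (message["basket_id"], message["product_id"]),
            )
    except sqlite3.IntegrityError:
        # Already processed: skip the side effect instead of repeating it.
        return False
    return True
```

Delivering the same message twice then leaves a single row in `basket_item`, because the second call returns `False` without touching the basket.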
32. @samuelroze
C. What might happen if we don’t care about that?
• You will lose some state in whatever you updated based on the events, at
some point.
• The easiest solution: don’t process things concurrently. But that is not really
practical when things start to scale.
33. @samuelroze
C. Locking! Optimistic vs Pessimistic
The optimist…
• Assumes that everything will go
right most of the time.
• It validates that everything has
happened as expected when
writing its state to a consistent
storage.
• a.k.a. HTTP’s If-Match, …
The pessimist…
• Believes that in most cases, this
won’t work.
• Before doing any work, it ensures
nobody else is doing it.
• a.k.a. “mutex”, “advisory locks”,
etc…
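The optimistic approach can be sketched with a version column: the write only succeeds if the row still has the version that was read. The `account` table, its columns and the `StaleWrite` exception are hypothetical names chosen for this example.

```python
import sqlite3


class StaleWrite(Exception):
    """Raised when someone else changed the row since it was read."""


def create_schema(db: sqlite3.Connection) -> None:
    # Hypothetical table; `version` plays the role of an ETag.
    db.execute(
        "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER, version INTEGER)"
    )


def read_account(db: sqlite3.Connection, account_id: str):
    """Return (balance, version) as currently stored."""
    return db.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()


def write_balance(db, account_id: str, new_balance: int, expected_version: int) -> None:
    # Validation happens at write time: the UPDATE matches no row if the
    # version has moved on since this worker read it.
    with db:
        cursor = db.execute(
            "UPDATE account SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_balance, account_id, expected_version),
        )
    if cursor.rowcount == 0:
        raise StaleWrite(f"account {account_id} changed since version {expected_version}")
```

A worker that hits `StaleWrite` would typically re-read the row and retry. A pessimistic variant would instead acquire a lock (a mutex or a database advisory lock) before reading, so that no other worker can start the same work.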
39. @samuelroze
D. What might happen if we don’t care about that?
• Hopefully your business logic doesn’t rely too much on the events being
ordered… make sure this is true.
• For example, we rely on “access_granted” and “access_revoked” events to
configure some permission rules. If they are consumed in the wrong
order… the meaning is different 💥
43. @samuelroze
D. For the infrastructure to guarantee ordering…
• You need a message bus that supports it (Kafka, SQS FIFO, Kinesis, etc…).
• You need to carefully design your partitions (or “shards”) so that you know all
messages of a specific aggregate will always go to the same partition (a.k.a.
routing keys).
• You need to carefully manage all the errors. You can’t afford a bad
message blocking an entire topic. But you can’t really postpone just a single message…
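A minimal sketch of the routing-key idea: derive the partition from the aggregate identifier so that every message about the same aggregate lands on the same partition. The function name and the choice of SHA-256 are assumptions for illustration; real buses ship their own partitioners, and this is not how any particular one is implemented.

```python
import hashlib


def partition_for(aggregate_id: str, partition_count: int) -> int:
    """Map an aggregate to a partition, identically on every producer."""
    # A cryptographic digest is used instead of Python's built-in hash(),
    # which is randomised per process for strings and would therefore send
    # the same aggregate to different partitions from different producers.
    digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# Both events concern the same (hypothetical) user, so they share a
# partition and keep their relative order.
granted = partition_for("user-42", 12)
revoked = partition_for("user-42", 12)
```

Note that changing `partition_count` remaps aggregates to different partitions, which is one reason the partition design needs care up front.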
45. @samuelroze
We’ve seen a few ways it can go wrong.
• When publishing a message to a bus.
Outbox pattern FTW.
• When receiving the same message multiple times.
Idempotence FTW.
• When concurrently consuming messages.
You need to use optimistic or pessimistic locking.
• You can request ordering from your infrastructure.
But it needs careful partition design & error management.