Building
      Scalable,
Highly Concurrent &
   Fault-Tolerant
      Systems:
  Lessons Learned

     Jonas Bonér
       CTO Typesafe
      Twitter: @jboner
I will never use distributed transactions again
Lessons Learned through...
Agony and Pain
lots of Pain
Agenda
• It’s All Trade-offs
• Go Concurrent
• Go Reactive
• Go Fault-Tolerant
• Go Distributed
• Go Big
It’s all
Trade-offs
Performance
     vs
 Scalability
Latency
     vs
Throughput
Availability
    vs
Consistency
Go Concurrent
Shared mutable state
Together with threads...
...leads to
...code that is totally INDETERMINISTIC
...and the root of all EVIL
Please, avoid it at all costs
Use IMMUTABLE state!!!
The problem with locks
• Locks do not compose
• Locks break encapsulation
• Taking too few locks
• Taking too many locks
• Taking the wrong locks
• Taking locks in the wrong order
• Error recovery is hard
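To make "taking locks in the wrong order" concrete, here is a minimal sketch in Python. The Account class and transfer function are hypothetical names invented for this illustration. If two threads transfer in opposite directions and each takes the source lock first, they can deadlock; one common workaround, assumed here, is to always acquire locks in a single global order.

```python
import itertools
import threading

_ids = itertools.count()  # hypothetical global id source used to order locks


class Account:
    """Hypothetical account guarded by its own lock."""

    def __init__(self, balance):
        self.id = next(_ids)
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    # Acquire both locks in a fixed global order (lowest id first).
    # Locking src then dst instead can deadlock when another thread
    # runs transfer(dst, src, ...) at the same time.
    # Assumes src is not dst; the same non-reentrant lock twice would hang.
    first, second = sorted((src, dst), key=lambda account: account.id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def run_demo():
    a, b = Account(100), Account(100)

    def worker(src, dst):
        for _ in range(1000):
            transfer(src, dst, 1)

    threads = [
        threading.Thread(target=worker, args=(a, b)),
        threading.Thread(target=worker, args=(b, a)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return a.balance, b.balance


if __name__ == "__main__":
    print(run_demo())
```

Note that the ordering rule lives outside Account: every caller has to know about it, which is one way locks break encapsulation and fail to compose.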
You deserve better tools

• Dataflow Concurrency
• Actors
• Software Transactional Memory (STM)
• Agents
Dataflow Concurrency
• Deterministic
• Declarative
• Data-driven
  • Threads are suspended until data is available
  • Lazy & On-demand
• No difference between:
 • Concurrent code
 • Sequential code
• Examples: Akka & GPars
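As a rough illustration of a dataflow variable, the sketch below uses Python's concurrent.futures.Future as a single-assignment cell. This is an assumption made for the example, not how Akka or GPars implement dataflow, and creating Future objects directly is a shortcut the standard library intends mainly for testing.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def run_dataflow():
    # x and y act as single-assignment dataflow variables.
    x, y = Future(), Future()
    with ThreadPoolExecutor(max_workers=1) as pool:
        # z is declared in terms of x and y before either is bound.
        # The worker thread is suspended inside result() until the
        # data is available.
        z = pool.submit(lambda: x.result() + y.result())
        # Binding order does not matter; the outcome is deterministic.
        y.set_result(2)
        x.set_result(40)
        return z.result()


if __name__ == "__main__":
    print(run_dataflow())
```

The expression for z reads like sequential code even though it runs concurrently with the code that binds x and y.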
Actors
• Share NOTHING
• Isolated lightweight event-based processes
• Each actor has a mailbox (message queue)
• Communicates through asynchronous and
  non-blocking message passing
• Location transparent (distributable)
• Examples: Akka & Erlang
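The mailbox idea can be sketched in a few lines of Python. CounterActor and its message shapes are hypothetical, and this toy uses one OS thread per actor, which is not lightweight the way Akka or Erlang processes are; it only shows that state is private and all communication goes through the mailbox.

```python
import queue
import threading


class CounterActor:
    """Toy actor: private state, a mailbox, one message at a time."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # touched only by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        # Fire-and-forget: enqueue the message and return immediately.
        self._mailbox.put(message)

    def join(self):
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message == "stop":
                return
            kind, payload = message
            if kind == "add":
                self._count += payload
            elif kind == "get":
                payload.put(self._count)  # reply through the sender's queue


def run_actor_demo():
    actor = CounterActor()
    for n in range(1, 11):
        actor.tell(("add", n))
    reply = queue.Queue()
    actor.tell(("get", reply))
    total = reply.get(timeout=5)
    actor.tell("stop")
    actor.join()
    return total


if __name__ == "__main__":
    print(run_actor_demo())
```

Because the mailbox is processed in order and nothing else can reach the counter, no locks appear in the actor's own logic.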
STM
• See the memory as a transactional dataset
• Similar to a DB: begin, commit, rollback (ACI)
• Transactions are retried upon collision
• Rolls back the memory on abort
• Transactions can nest and compose
• Use STM instead of abusing your database
  with temporary storage of “scratch” data
• Examples: Haskell, Clojure & Scala
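The retry-on-collision behaviour can be sketched with optimistic versioning. Everything here (Ref, Transaction, atomically) is a hypothetical toy written for this illustration, not the API of any real STM; it uses one internal commit lock, does not support nesting, and a transaction body may briefly observe an inconsistent snapshot before it is retried.

```python
import threading

_commit_lock = threading.Lock()  # internal to the toy STM, hidden from users


class Ref:
    """Transactional memory cell with a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    def __init__(self):
        self.reads = {}   # Ref -> version first seen
        self.writes = {}  # Ref -> pending value, invisible until commit

    def read(self, ref):
        if ref in self.writes:
            return self.writes[ref]
        with _commit_lock:
            value, version = ref.value, ref.version
        self.reads.setdefault(ref, version)
        return value

    def write(self, ref, value):
        self.writes[ref] = value


def atomically(body):
    """Run body(tx); commit if nothing it read has changed, else retry."""
    while True:
        tx = Transaction()
        result = body(tx)
        with _commit_lock:
            if all(ref.version == seen for ref, seen in tx.reads.items()):
                for ref, value in tx.writes.items():
                    ref.value = value
                    ref.version += 1
                return result
        # Collision: pending writes are discarded and the body runs again.


def run_stm_demo():
    a, b = Ref(200), Ref(0)

    def move_one(tx):
        tx.write(a, tx.read(a) - 1)
        tx.write(b, tx.read(b) + 1)

    def worker():
        for _ in range(50):
            atomically(move_one)

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return a.value, b.value


if __name__ == "__main__":
    print(run_stm_demo())
```

The body must be free of side effects other than reads and writes through tx, since it may run several times before it commits.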
Agents
• Reactive memory cells (STM Ref)
• Send an update function to the Agent, which
  1. adds it to an (ordered) queue, to be
  2. applied to the Agent asynchronously
• Reads are “free”, just dereference the Ref
• Cooperates with STM
• Examples: Clojure & Akka
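A minimal sketch of the Agent idea in Python follows. The Agent class and its method names are hypothetical, it is backed by a plain attribute rather than an STM Ref, and error handling for failing update functions is left out.

```python
import queue
import threading


class Agent:
    """Toy agent: a value updated by queued functions, one at a time."""

    def __init__(self, value):
        self._value = value
        self._updates = queue.Queue()  # ordered queue of update functions
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, update):
        # Returns immediately; the update is applied asynchronously.
        self._updates.put(update)

    def get(self):
        # Reads are "free": just return the current value.
        return self._value

    def await_updates(self):
        # Block until every update sent so far has been applied.
        self._updates.join()

    def _run(self):
        while True:
            update = self._updates.get()
            self._value = update(self._value)
            self._updates.task_done()


def run_agent_demo():
    agent = Agent(0)
    agent.send(lambda value: value + 1)
    agent.send(lambda value: value * 10)
    agent.await_updates()
    return agent.get()


if __name__ == "__main__":
    print(run_agent_demo())
```

Updates are applied in the order they were sent, so the result is (0 + 1) * 10 rather than 0 * 10 + 1.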
If we could start all over...
1. Start with a Deterministic, Declarative & Immutable core
  • Logic & Functional Programming
  • Dataflow
2. Add Indeterminism selectively - only where needed
  • Actor/Agent-based Programming
3. Add Mutability selectively - only where needed
  • Protected by Transactions (STM)
4. Finally - only if really needed
  • Add Monitors (Locks) and explicit Threads
Go Reactive
Never block
• ...unless you really have to
• Blocking kills scalability (and performance)
• Never sit on resources you don’t use
• Use non-blocking IO
• Be reactive
• How?
Go Async
  Design for reactive event-driven systems
1. Use asynchronous message passing
2. Use Iteratee-based IO
3. Use push not pull (or poll)
• Examples:
   • Akka or Erlang actors
   • Play’s reactive Iteratee IO
   • Node.js or JavaScript Promises
   • Server-Sent Events or WebSockets
   • Scala’s Futures library
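As a small illustration of not blocking, the sketch below uses Python's asyncio; asyncio.sleep stands in for non-blocking IO, and fetch and fetch_all are hypothetical names. Both operations are in flight at the same time on a single thread, and neither holds a thread while it waits.

```python
import asyncio


async def fetch(name, delay):
    # Stand-in for a non-blocking IO call: the coroutine yields control
    # to the event loop instead of sitting on a thread while it waits.
    await asyncio.sleep(delay)
    return f"{name} done"


async def fetch_all():
    # Start both operations concurrently; results come back in the
    # order the coroutines were passed in, not the order they finished.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


if __name__ == "__main__":
    print(asyncio.run(fetch_all()))
```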
Go Fault-Tolerant
Failure Recovery in Java/C/C# etc.
• You are given a SINGLE thread of control
• If this thread blows up you are screwed
• So you need to do all explicit error handling
  WITHIN this single thread
• To make things worse - errors do not
  propagate between threads so there is NO
  WAY OF EVEN FINDING OUT that
  something has failed
• This leads to DEFENSIVE programming with:
  • Error handling TANGLED with business logic
  • SCATTERED all over the code base
We can do better!!!
Just Let It Crash
The right way
1. Isolated lightweight processes
2. Supervised processes
 • Each running process has a supervising process
 • Errors are sent to the supervisor (asynchronously)
 • Supervisor manages the failure
• Same semantics local as remote
• For example the Actor Model solves it nicely
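A supervised process can be sketched with plain Python threads and queues. The names divide_all, spawn and supervise are hypothetical, and the restart policy (drop the offending message, restart up to a limit) is an assumption chosen for the example rather than what any particular actor library does.

```python
import queue
import threading


def divide_all(inbox, results):
    """Business logic only: no error handling, a bad message crashes it."""
    while True:
        item = inbox.get()
        if item is None:  # end-of-input marker
            return
        results.append(100 // item)


def spawn(body, inbox, results, events):
    """Run body in its own thread and report how it ended to the supervisor."""
    def run():
        try:
            body(inbox, results)
            events.put(("stopped", None))
        except Exception as error:
            events.put(("crashed", error))  # sent asynchronously
    threading.Thread(target=run, daemon=True).start()


def supervise(messages, max_restarts=3):
    inbox, events, results = queue.Queue(), queue.Queue(), []
    for message in messages:
        inbox.put(message)
    inbox.put(None)
    restarts = 0
    spawn(divide_all, inbox, results, events)
    while True:
        kind, error = events.get()
        if kind == "stopped":
            return results, restarts
        if restarts == max_restarts:
            raise RuntimeError("too many restarts") from error
        restarts += 1
        # Let it crash: start a fresh worker; the message that caused
        # the failure has already been consumed and is dropped.
        spawn(divide_all, inbox, results, events)


if __name__ == "__main__":
    print(supervise([5, 0, 20, 0, 4]))
```

The error handling sits entirely in spawn and supervise, so the worker stays free of defensive code.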
Go Distributed
Performance
     vs
 Scalability
How do I know if I have a performance problem?
  If your system is slow for a single user
How do I know if I have a scalability problem?
  If your system is fast for a single user
  but slow under heavy load
(Three) Misconceptions about
     Reliable Distributed Computing
                     - Werner Vogels

1. Transparency is the ultimate goal
2. Automatic object replication is desirable
3. All replicas are equal and deterministic

  Classic paper: A Note On Distributed Computing - Waldo et al.
Fallacy 1
Transparent Distributed Computing
• Emulating Consistency and Shared
  Memory in a distributed environment
• Distributed Objects
 • “Sucks like an inverted hurricane” - Martin Fowler
• Distributed Transactions
 • ...don’t get me started...
Fallacy 2
                          RPC
• Emulating synchronous blocking method
  dispatch - across the network
• Ignores:
 • Latency
 • Partial failures
 • General scalability concerns, caching etc.
• “Convenience over Correctness” - Steve Vinoski
Instead
   Embrace the Network
Use Asynchronous Message Passing
and be done with it
Guaranteed Delivery
  Delivery Semantics
   • No guarantees
   • At most once
   • At least once
   • Once and only once
It’s all lies.
The network is inherently unreliable
and there is no such thing as 100%
guaranteed delivery
Guaranteed Delivery
   The question is what to guarantee
1. The message is - sent out on the network?
2. The message is - received by the receiver host’s NIC?
3. The message is - put on the receiver’s queue?
4. The message is - applied to the receiver?
5. The message is - starting to be processed by the receiver?
6. The message is - completely processed by the receiver?
Ok, then what to do?
1. Start with 0 guarantees (0 additional cost)
2. Add the guarantees you need - one by one

  Different USE-CASES
            Different GUARANTEES
                Different COSTS
For each additional guarantee you add you will either:
• decrease performance, throughput or scalability
• increase latency
Just
Use ACKing
and be done with it
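What ACKing buys you can be sketched with a simulated lossy network. LossyChannel, Receiver and send_with_ack are hypothetical names, the loss pattern is scripted so the run is repeatable, and real code would wait on a timeout instead of calling the receiver directly. The sender retries until it sees an ACK (at-least-once delivery), so the receiver has to deduplicate by message id.

```python
class LossyChannel:
    """Simulated network: drops the transmissions whose index is in drops."""

    def __init__(self, drops):
        self.drops = set(drops)
        self.count = 0

    def lost(self):
        index = self.count
        self.count += 1
        return index in self.drops


class Receiver:
    def __init__(self):
        self.seen = set()
        self.processed = []

    def receive(self, msg_id, body):
        # Retries create duplicates, so apply each message id only once.
        if msg_id not in self.seen:
            self.seen.add(msg_id)
            self.processed.append(body)
        return msg_id  # the ACK


def send_with_ack(channel, receiver, msg_id, body, max_attempts=5):
    """Resend until an ACK comes back; return the number of attempts."""
    for attempt in range(1, max_attempts + 1):
        if channel.lost():   # the message itself was lost
            continue
        ack = receiver.receive(msg_id, body)
        if channel.lost():   # the ACK was lost; the sender cannot tell
            continue         # this apart from a lost message
        if ack == msg_id:
            return attempt
    raise TimeoutError(f"no ACK for message {msg_id}")


if __name__ == "__main__":
    receiver = Receiver()
    channel = LossyChannel(drops={1})  # lose the first ACK
    print(send_with_ack(channel, receiver, 1, "debit 10"))
    print(receiver.processed)
```

In this run the message arrives twice because the first ACK is lost, yet it is applied once. The extra round trip and the seen-set are the cost of the guarantee.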
Latency
     vs
Throughput
You should strive for
maximal throughput
          with
acceptable latency
Go Big
Go Big Data
Big Data
  Imperative OO programming doesn't cut it
• Object-Mathematics Impedance Mismatch
• We need functional processing, transformations etc.
• Examples: Spark, Crunch/Scrunch, Cascading, Cascalog,
  Scalding, Scala Parallel Collections
• Hadoop has been called the:
  • “Assembly language of MapReduce programming”
  • “EJB of our time”
Big Data
     Batch processing doesn't cut it
• À la Hadoop
• We need real-time data processing
• Examples: Spark, Storm, S4 etc.
• Watch “Why Big Data Needs To Be Functional”
  by Dean Wampler
Go Big DB
When is an RDBMS not good enough?
Scaling reads to an RDBMS is hard
Scaling writes to an RDBMS is impossible
Do we really need an RDBMS?
Sometimes...
But many times we don’t
Atomic
Consistent
Isolated
Durable
Availability
    vs
Consistency
Brewer’s CAP theorem
You can only pick   2
     Consistency
     Availability
     Partition tolerance
At a given point in time
Centralized system
• In a centralized system (RDBMS etc.)
  we don’t have network partitions,
  i.e. P in CAP
• So you get both:

        Consistency
        Availability
Distributed system
• In a distributed (scalable) system
  we will have network partitions,
  i.e. P in CAP
• So you get to only pick one:

       Consistency
       Availability
Basically Available
Soft state
Eventually consistent
Think about your data
             Then think again
• When do you need ACID?
• When is Eventual Consistency a better fit?
• Different kinds of data have different needs
• You need full consistency less than you think
How fast is fast enough?
• Never guess: Measure, measure and measure
• Start by defining a baseline
 • Where are we now?
• Define what is “good enough” - i.e. SLAs
 • Where do we want to go?
 • When are we done?
• Beware of micro-benchmarks
...or, when can we go for a beer?
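A baseline and a "good enough" check can be as small as the sketch below. The function names and the choice of median, 99th percentile and max as summary statistics are assumptions made for this example; the percentile uses the simple nearest-rank method.

```python
import statistics
import time


def measure(operation, runs=200):
    """Time operation() repeatedly; return latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms):
    """Baseline: where are we now?"""
    ordered = sorted(samples_ms)
    # Nearest-rank 99th percentile, computed with integer arithmetic.
    p99_index = (99 * len(ordered) + 99) // 100 - 1
    return {
        "median": statistics.median(ordered),
        "p99": ordered[p99_index],
        "max": ordered[-1],
    }


def meets_sla(summary, p99_limit_ms):
    """Good enough: are we done?"""
    return summary["p99"] <= p99_limit_ms


if __name__ == "__main__":
    baseline = summarize(measure(lambda: sum(range(10_000))))
    print(baseline, meets_sla(baseline, p99_limit_ms=50.0))
```

A tight loop like this is itself a micro-benchmark, so treat its numbers as a rough baseline and measure the real system under realistic load before drawing conclusions.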
To sum things up...
1. Maximizing a specific metric impacts others
  • Every strategic decision involves a trade-off
  • There's no "silver bullet"
2. Applying yesterday's best practices to the
   problems faced today will lead to:
  • Waste of resources
  • Performance and scalability bottlenecks
  • Unreliable systems
SO
GO
...now home and build yourself
            Scalable,
      Highly Concurrent &
         Fault-Tolerant
            Systems
Thank You
Email: jonas@typesafe.com
Web: typesafe.com
Twitter: @jboner

More Related Content

What's hot

What's hot (20)

Quartz Scheduler
Quartz SchedulerQuartz Scheduler
Quartz Scheduler
 
Scaling containers with keda
Scaling containers  with kedaScaling containers  with keda
Scaling containers with keda
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWS
 
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingApache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data Processing
 
Confluent Cloud Networking | Rajan Sundaram, Confluent
Confluent Cloud Networking | Rajan Sundaram, ConfluentConfluent Cloud Networking | Rajan Sundaram, Confluent
Confluent Cloud Networking | Rajan Sundaram, Confluent
 
Java Memory Management Tricks
Java Memory Management Tricks Java Memory Management Tricks
Java Memory Management Tricks
 
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...
 
elasticsearch_적용 및 활용_정리
elasticsearch_적용 및 활용_정리elasticsearch_적용 및 활용_정리
elasticsearch_적용 및 활용_정리
 
Taking advantage of Prometheus relabeling
Taking advantage of Prometheus relabelingTaking advantage of Prometheus relabeling
Taking advantage of Prometheus relabeling
 
Amazon EFS (Elastic File System) 이해하고사용하기
Amazon EFS (Elastic File System) 이해하고사용하기Amazon EFS (Elastic File System) 이해하고사용하기
Amazon EFS (Elastic File System) 이해하고사용하기
 
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoHigh Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
 
Splunk: Druid on Kubernetes with Druid-operator
Splunk: Druid on Kubernetes with Druid-operatorSplunk: Druid on Kubernetes with Druid-operator
Splunk: Druid on Kubernetes with Druid-operator
 
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...
 
Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...
Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...
Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...
 
Deploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and KubernetesDeploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and Kubernetes
 
Kafka Tutorial: Kafka Security
Kafka Tutorial: Kafka SecurityKafka Tutorial: Kafka Security
Kafka Tutorial: Kafka Security
 
MySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & GrafanaMySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & Grafana
 
Storing 16 Bytes at Scale
Storing 16 Bytes at ScaleStoring 16 Bytes at Scale
Storing 16 Bytes at Scale
 

Building Scalable, Highly Concurrent & Fault Tolerant Systems - Lessons Learned

  • 1. Building Scalable, Highly Concurrent & Fault-Tolerant Systems: Lessons Learned Jonas Bonér CTO Typesafe Twitter: @jboner
  • 2. I will never use distributed transactions again
  • 5. Lessons Learned through... Agony and Pain, lots of Pain
  • 6. Agenda • It’s All Trade-offs • Go Concurrent • Go Reactive • Go Fault-Tolerant • Go Distributed • Go Big
  • 9. Performance vs Scalability
  • 10. Latency vs Throughput
  • 11. Availability vs Consistency
  • 19. Shared mutable state Together with threads... ...leads to ...code that is totally INDETERMINISTIC ...and the root of all EVIL Please, avoid it at all cost Use IMMUTABLE state!!!
  • 20. The problem with locks • Locks do not compose • Locks break encapsulation • Taking too few locks • Taking too many locks • Taking the wrong locks • Taking locks in the wrong order • Error recovery is hard
  • 21. You deserve better tools • Dataflow Concurrency • Actors • Software Transactional Memory (STM) • Agents
  • 22. Dataflow Concurrency • Deterministic • Declarative • Data-driven • Threads are suspended until data is available • Lazy & On-demand • No difference between: • Concurrent code • Sequential code • Examples: Akka & GPars
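A minimal sketch of the dataflow idea in Python (the talk itself names Akka and GPars; this is only an illustration). It assumes that a `concurrent.futures.Future` can stand in for a single-assignment dataflow variable, and the function name `dataflow_sum` is hypothetical.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def dataflow_sum() -> int:
    # x and y are single-assignment dataflow variables, modelled here as Futures
    # (an assumption of this sketch, not how Akka or GPars implement them).
    x: Future = Future()
    y: Future = Future()
    with ThreadPoolExecutor(max_workers=3) as pool:
        # The consumer is declared first; it is suspended inside result()
        # until both x and y have been bound.
        z = pool.submit(lambda: x.result() + y.result())
        # The producers bind the variables, in whatever order the scheduler picks.
        pool.submit(x.set_result, 40)
        pool.submit(y.set_result, 2)
        return z.result()


print(dataflow_sum())
```

The result does not depend on which task runs first, which is the sense in which the concurrent version behaves like sequential code.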
  • 23. Actors • Share NOTHING • Isolated lightweight event-based processes • Each actor has a mailbox (message queue) • Communicates through asynchronous and non-blocking message passing • Location transparent (distributable) • Examples: Akka & Erlang
  • 24. STM • See the memory as a transactional dataset • Similar to a DB: begin, commit, rollback (ACI) • Transactions are retried upon collision • Rolls back the memory on abort • Transactions can nest and compose • Use STM instead of abusing your database with temporary storage of “scratch” data • Examples: Haskell, Clojure & Scala
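To make begin, commit, rollback and retry concrete, here is a deliberately small optimistic STM sketch. It is an assumption-laden toy (one global commit lock, version numbers per `Ref`), not the design used by Haskell, Clojure or Scala STM; `Ref`, `Transaction`, `atomically` and `run_transfers` are hypothetical names.

```python
import threading

_commit_lock = threading.Lock()  # serialises snapshots and commits in this toy


class Ref:
    """A transactional memory cell with a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    def __init__(self) -> None:
        self.reads = {}   # Ref -> version observed on first read
        self.writes = {}  # Ref -> pending value, invisible until commit

    def read(self, ref: Ref):
        if ref in self.writes:
            return self.writes[ref]
        with _commit_lock:
            value, version = ref.value, ref.version
        self.reads.setdefault(ref, version)
        return value

    def write(self, ref: Ref, value) -> None:
        self.writes[ref] = value

    def commit(self) -> bool:
        with _commit_lock:
            # Collision: something we read was changed by another transaction.
            if any(ref.version != seen for ref, seen in self.reads.items()):
                return False  # pending writes are simply dropped (rollback)
            for ref, value in self.writes.items():
                ref.value = value
                ref.version += 1
            return True


def atomically(block):
    """Run block(tx) and retry it until it commits without a collision."""
    while True:
        tx = Transaction()
        result = block(tx)
        if tx.commit():
            return result


def run_transfers():
    a, b = Ref(1000), Ref(0)

    def transfer(tx: Transaction) -> None:
        tx.write(a, tx.read(a) - 1)
        tx.write(b, tx.read(b) + 1)

    def client() -> None:
        for _ in range(250):
            atomically(transfer)

    threads = [threading.Thread(target=client) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.value, b.value
```

Because `transfer` is an ordinary function taking a transaction, it can be called from inside a larger `atomically` block, which is the composition property the slide contrasts with locks.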
  • 25. Agents • Reactive memory cells (STM Ref) • Send an update function to the Agent, which 1. adds it to an (ordered) queue, to be 2. applied to the Agent asynchronously • Reads are “free”, just dereference the Ref • Cooperates with STM • Examples: Clojure & Akka
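A rough sketch of the two-step Agent protocol described above, assuming a single background thread applies the queued update functions in order. The class `Agent` and its methods `send`, `get` and `wait_idle` are hypothetical names for this example and do not mirror the Clojure or Akka APIs; the STM cooperation is left out.

```python
import queue
import threading


class Agent:
    """A memory cell that is only changed by queued update functions."""

    def __init__(self, initial) -> None:
        self._value = initial
        self._updates: queue.Queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, update_fn) -> None:
        """Step 1: add the update function to the ordered queue and return."""
        self._updates.put(update_fn)

    def get(self):
        """Reads are a plain dereference; they never wait for pending updates."""
        return self._value

    def wait_idle(self) -> None:
        """Block until every update sent so far has been applied."""
        self._updates.join()

    def _run(self) -> None:
        while True:
            update_fn = self._updates.get()
            # Step 2: apply the function to the current value, asynchronously
            # with respect to the sender.
            self._value = update_fn(self._value)
            self._updates.task_done()


agent = Agent(())
for i in range(5):
    agent.send(lambda current, i=i: current + (i,))
agent.wait_idle()
print(agent.get())
```

The updates are applied in the order they were sent, so the final tuple preserves the send order even though the sender never blocked.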
  • 35. If we could start all over... 1. Start with a Deterministic, Declarative & Immutable core • Logic & Functional Programming • Dataflow 2. Add Indeterminism selectively - only where needed • Actor/Agent-based Programming 3. Add Mutability selectively - only where needed • Protected by Transactions (STM) 4. Finally - only if really needed • Add Monitors (Locks) and explicit Threads
  • 37. Never block • ...unless you really have to • Blocking kills scalability (and performance) • Never sit on resources you don’t use • Use non-blocking IO • Be reactive • How?
  • 38. Go Async Design for reactive event-driven systems 1. Use asynchronous message passing 2. Use Iteratee-based IO 3. Use push not pull (or poll) • Examples: • Akka or Erlang actors • Play’s reactive Iteratee IO • Node.js or JavaScript Promises • Server-Sent Events or WebSockets • Scala’s Futures library
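As an illustration of push rather than pull, the sketch below uses Python's `asyncio` (not one of the examples named on the slide, so treat the choice as an assumption). The coroutine names `producer`, `consumer` and `main` are hypothetical, and `asyncio.sleep(0)` stands in for real non-blocking IO.

```python
import asyncio


async def producer(out: asyncio.Queue) -> None:
    for n in range(3):
        await asyncio.sleep(0)  # stand-in for a non-blocking IO call
        await out.put(n * n)    # push the event downstream as soon as it exists
    await out.put(None)         # end-of-stream marker (a convention of this sketch)


async def consumer(inbox: asyncio.Queue) -> list:
    seen = []
    # The consumer never polls: while the queue is empty it is suspended and
    # holds no thread, and it is resumed when an event is pushed to it.
    while (event := await inbox.get()) is not None:
        seen.append(event)
    return seen


async def main() -> list:
    events: asyncio.Queue = asyncio.Queue()
    _, seen = await asyncio.gather(producer(events), consumer(events))
    return seen


result = asyncio.run(main())
print(result)
```

Both coroutines share one thread, so waiting for data costs nothing but a suspended coroutine.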
  • 48. Failure Recovery in Java/C/C# etc. • You are given a SINGLE thread of control • If this thread blows up you are screwed • So you need to do all explicit error handling WITHIN this single thread • To make things worse - errors do not propagate between threads so there is NO WAY OF EVEN FINDING OUT that something has failed • This leads to DEFENSIVE programming with: • Error handling TANGLED with business logic • SCATTERED all over the code base We can do better!!!
  • 51. The right way 1. Isolated lightweight processes 2. Supervised processes • Each running process has a supervising process • Errors are sent to the supervisor (asynchronously) • Supervisor manages the failure • Same semantics local as remote • For example the Actor Model solves it nicely
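The supervision scheme above can be sketched with plain threads and queues. This is an illustrative toy rather than Akka's or Erlang's supervisor: the functions `worker` and `supervise`, the `("done" | "failed", name, error)` message shape and the restart limit are all assumptions made for the example.

```python
import queue
import threading


def worker(name, jobs, results, supervisor_mailbox) -> None:
    """Business logic with no defensive code: any failure goes to the supervisor."""
    try:
        while True:
            job = jobs.get()
            if job is None:  # end-of-work marker
                supervisor_mailbox.put(("done", name, None))
                return
            results.append(10 // job)  # blows up when job == 0
    except Exception as error:
        # The error is sent to the supervisor as an asynchronous message.
        supervisor_mailbox.put(("failed", name, error))


def supervise(job_list, max_restarts=3):
    jobs: queue.Queue = queue.Queue()
    for job in job_list:
        jobs.put(job)
    jobs.put(None)
    results: list = []
    mailbox: queue.Queue = queue.Queue()
    restarts = 0

    def start(generation: int) -> None:
        threading.Thread(
            target=worker,
            args=(f"worker-{generation}", jobs, results, mailbox),
            daemon=True,
        ).start()

    start(restarts)
    while True:
        status, name, error = mailbox.get()
        if status == "done":
            return results, restarts
        if restarts == max_restarts:
            raise RuntimeError(f"{name} keeps failing: {error!r}")
        restarts += 1
        start(restarts)  # the supervisor's failure policy here is restart


print(supervise([5, 0, 2, 0, 1]))
```

The job that caused a crash is dropped and a fresh worker continues with the rest of the queue, so error handling lives in one place instead of being tangled into the worker.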
  • 53. Performance vs Scalability
  • 55. How do I know if I have a performance problem? If your system is slow for a single user
  • 57. How do I know if I have a scalability problem? If your system is fast for a single user but slow under heavy load
  • 58. (Three) Misconceptions about Reliable Distributed Computing - Werner Vogels 1. Transparency is the ultimate goal 2. Automatic object replication is desirable 3. All replicas are equal and deterministic Classic paper: A Note On Distributed Computing - Waldo et al.
  • 59. Fallacy 1 Transparent Distributed Computing • Emulating Consistency and Shared Memory in a distributed environment • Distributed Objects • “Sucks like an inverted hurricane” - Martin Fowler • Distributed Transactions • ...don’t get me started...
  • 60. Fallacy 2 RPC • Emulating synchronous blocking method dispatch - across the network • Ignores: • Latency • Partial failures • General scalability concerns, caching etc. • “Convenience over Correctness” - Steve Vinoski
  • 62. Instead Embrace the Network Use Asynchronous Message Passing and be done with it
  • 63. Guaranteed Delivery Delivery Semantics • No guarantees • At most once • At least once • Once and only once
  • 66. The network is inherently unreliable and there is no such thing as 100% guaranteed delivery. It’s all lies.
  • 74. Guaranteed Delivery The question is what to guarantee 1. The message is - sent out on the network? 2. The message is - received by the receiver host’s NIC? 3. The message is - put on the receiver’s queue? 4. The message is - applied to the receiver? 5. The message is - starting to be processed by the receiver? 6. The message is - completely processed by the receiver?
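The six questions above differ in the point at which an acknowledgement could be issued. The sketch below models them as stages and shows that an ack sent at an early stage says nothing about later ones. The `Stage` enum, the `Receiver` class and the `crash_at` parameter are hypothetical names introduced only for this illustration.

```python
from enum import IntEnum


class Stage(IntEnum):
    SENT = 1                  # sent out on the network
    AT_NIC = 2                # received by the receiver host's NIC
    QUEUED = 3                # put on the receiver's queue
    APPLIED = 4               # applied to the receiver
    PROCESSING_STARTED = 5    # receiver has started processing
    PROCESSING_COMPLETED = 6  # receiver has completed processing


class Receiver:
    """Hypothetical receiver that acks when a message reaches `ack_at`."""

    def __init__(self, ack_at, crash_at=None):
        self.ack_at = ack_at
        self.crash_at = crash_at  # stage the receiver dies before reaching
        self.completed = []

    def receive(self, message):
        """Walk the message through the stages; return True if an ack was sent."""
        acked = False
        for stage in Stage:
            if stage == self.crash_at:
                return acked
            if stage == Stage.PROCESSING_COMPLETED:
                self.completed.append(message)
            if stage == self.ack_at:
                acked = True
        return acked


# Ack on enqueue, crash before processing: the sender sees an ack for lost work.
early = Receiver(ack_at=Stage.QUEUED, crash_at=Stage.PROCESSING_STARTED)
early_acked = early.receive("m1")

# Ack on completion, same crash: no ack, so the sender knows it must resend.
late = Receiver(ack_at=Stage.PROCESSING_COMPLETED, crash_at=Stage.PROCESSING_STARTED)
late_acked = late.receive("m1")

print(early_acked, early.completed, late_acked, late.completed)
```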
  • 77. Ok, then what to do? 1. Start with 0 guarantees (0 additional cost) 2. Add the guarantees you need - one by one Different USE-CASES Different GUARANTEES Different COSTS For each additional guarantee you add you will either: • decrease performance, throughput or scalability • increase latency
  • 80. Just Use ACKing and be done with it
  • 81. Latency vs Throughput
  • 82. You should strive for maximal throughput with acceptable latency
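One way to make "maximal throughput with acceptable latency" concrete is a simple batching model. This is a sketch under stated assumptions: each send pays a fixed overhead plus a per-message cost, latency is taken as the time to finish one batch, and the time spent waiting for a batch to fill is ignored. The function names and the numbers are illustrative.

```python
def batch_stats(batch_size, overhead_ms, per_message_ms):
    """Return (throughput in messages/second, latency in ms) for one batch."""
    batch_time_ms = overhead_ms + batch_size * per_message_ms
    throughput = batch_size / batch_time_ms * 1000.0
    return throughput, batch_time_ms


def best_batch_size(max_latency_ms, overhead_ms, per_message_ms, candidates):
    """Pick the candidate with the highest throughput whose latency fits the budget."""
    acceptable = [
        b for b in candidates
        if batch_stats(b, overhead_ms, per_message_ms)[1] <= max_latency_ms
    ]
    if not acceptable:
        raise ValueError("no candidate batch size meets the latency budget")
    return max(acceptable, key=lambda b: batch_stats(b, overhead_ms, per_message_ms)[0])


# Assumed costs: 10 ms fixed overhead per send, 1 ms per message.
for size in [1, 10, 40, 100, 1000]:
    throughput, latency = batch_stats(size, 10.0, 1.0)
    print(f"batch={size:5d} throughput={throughput:7.1f}/s latency={latency:7.1f} ms")

best = best_batch_size(50.0, 10.0, 1.0, [1, 10, 40, 100, 1000])
print("largest acceptable batch:", best)
```

Larger batches keep raising throughput in this model, but only up to the latency budget: with a 50 ms budget the choice stops at a batch of 40.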
  • 85. Big Data Imperative OO programming doesn't cut it • Object-Mathematics Impedance Mismatch • We need functional processing, transformations etc. • Examples: Spark, Crunch/Scrunch, Cascading, Cascalog, Scalding, Scala Parallel Collections • Hadoop has been called the: • “Assembly language of MapReduce programming” • “EJB of our time”
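As a small illustration of what "functional processing, transformations" can look like, here is a word count written as a map step followed by a reduce step, with no mutable state shared between steps. It uses only the Python standard library and is not tied to any of the frameworks listed above. The helper names are hypothetical.

```python
from collections import Counter
from functools import reduce


def tokenize(line):
    """Split a line into lower-cased words."""
    return line.lower().split()


def count_words(lines):
    # Map: turn each line into its own word count.
    per_line = map(lambda line: Counter(tokenize(line)), lines)
    # Reduce: merge the per-line counts into one total.
    return reduce(lambda total, counts: total + counts, per_line, Counter())


totals = count_words(["to be or not to be", "To scale or not to scale"])
print(totals.most_common(3))
```

Because each line is counted independently and the merge is associative, the map step could in principle be spread across machines, which is the property the frameworks above build on.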
  • 86. Big Data Batch processing doesn't cut it • À la Hadoop • We need real-time data processing • Examples: Spark, Storm, S4 etc. • Watch “Why Big Data Needs To Be Functional” by Dean Wampler
  • 88. When is a RDBMS not good enough?
  • 89. Scaling reads to a RDBMS is hard
  • 90. Scaling writes to a RDBMS is impossible
  • 92. Do we really need a RDBMS? Sometimes...
  • 94. Do we really need a RDBMS? But many times we don’t
  • 96. Availability vs Consistency
  • 98. You can only pick 2 Consistency Availability Partition tolerance At a given point in time
  • 99. Centralized system • In a centralized system (RDBMS etc.) we don’t have network partitions, i.e. the P in CAP • So you get both: Consistency Availability
  • 100. Distributed system • In a distributed (scalable) system we will have network partitions, i.e. the P in CAP • So you get to only pick one: Consistency Availability
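The choice between consistency and availability during a partition can be sketched with a toy two-replica store. This is an illustration under simplifying assumptions (synchronous replication while connected, no conflict resolution after the partition heals), and the `Cluster` class with its "CP" and "AP" modes is hypothetical.

```python
class Cluster:
    """Hypothetical two-replica key-value store with a switchable partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"a": {}, "b": {}}
        self.partitioned = False

    def write(self, replica, key, value):
        peer = "b" if replica == "a" else "a"
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the write: availability is lost.
                raise RuntimeError("unavailable: peer unreachable during partition")
            # Stay available by accepting locally: the replicas now diverge.
            self.replicas[replica][key] = value
            return
        self.replicas[replica][key] = value
        self.replicas[peer][key] = value  # synchronous replication while connected


cp = Cluster("CP")
cp.write("a", "x", 1)
cp.partitioned = True
try:
    cp.write("a", "x", 2)
    cp_write_accepted = True
except RuntimeError:
    cp_write_accepted = False

ap = Cluster("AP")
ap.write("a", "x", 1)
ap.partitioned = True
ap.write("a", "x", 2)

print(cp_write_accepted, cp.replicas, ap.replicas)
```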
  • 102. Think about your data Then think again • When do you need ACID? • When is Eventual Consistency a better fit? • Different kinds of data have different needs • You need full consistency less than you think
  • 103. How fast is fast enough? • Never guess: Measure, measure and measure • Start by defining a baseline • Where are we now? • Define what is “good enough” - i.e. SLAs • Where do we want to go? • When are we done? • Beware of micro-benchmarks
  • 104. ...or, when can we go for a beer? • Never guess: Measure, measure and measure • Start by defining a baseline • Where are we now? • Define what is “good enough” - i.e. SLAs • Where do we want to go? • When are we done? • Beware of micro-benchmarks
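"Measure, define a baseline, define good enough" can be sketched as: collect latency samples, summarize them as percentiles, and compare against an agreed budget. The function names and the choice of a p99 budget are assumptions made for illustration. `statistics.quantiles` requires Python 3.8 or later.

```python
import statistics
import time


def measure_latencies_ms(operation, runs):
    """Time `operation` repeatedly and return the samples in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms):
    """Baseline summary: median, tail percentiles and worst case."""
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "p50": statistics.median(samples_ms),
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(samples_ms),
    }


def meets_sla(summary, p99_budget_ms):
    """'Good enough' here means the 99th percentile fits the agreed budget."""
    return summary["p99"] <= p99_budget_ms


# Hypothetical workload standing in for a real request.
baseline = summarize(measure_latencies_ms(lambda: sum(range(10_000)), runs=200))
print(baseline, meets_sla(baseline, p99_budget_ms=50.0))
```

Timing a tiny lambda like this is exactly the kind of micro-benchmark the slide warns about. In practice the measured operation should be a realistic request against the real system under realistic load.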
  • 105. To sum things up... 1. Maximizing a specific metric impacts others • Every strategic decision involves a trade-off • There's no "silver bullet" 2. Applying yesterday's best practices to the problems faced today will lead to: • Waste of resources • Performance and scalability bottlenecks • Unreliable systems
  • 106. SO
  • 107. GO
  • 108. ...now home and build yourself Scalable, Highly Concurrent & Fault-Tolerant Systems
  • 109. Thank You Email: jonas@typesafe.com Web: typesafe.com Twitter: @jboner