It is widely understood that our software needs to become reactive; we need to consider responsiveness, maintainability, elasticity and scalability from the outset. Not all systems need to implement all of these to the same degree, as specific project requirements will determine where effort is most wisely spent. But in the vast majority of cases, the need to go reactive will demand that we design our applications differently.
In this presentation Dr. Roland Kuhn will explore several architecture elements that are commonly found in reactive systems, like the circuit breaker, various replication techniques, and flow control protocols. These patterns are language agnostic and also independent of the abundant choice of reactive programming frameworks and libraries. They are well-specified starting points for exploring the design space of a concrete problem: thinking is strictly required!
This webinar is based on Dr. Kuhn’s session, Reactive Design Sessions, presented at WJAX and Code Mesh.
9. Implementation: Message-Driven
• focus on communication between components
• model message flows and protocols
• common transports: async HTTP, *MQ, Actors
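As a minimal sketch of what "model message flows and protocols" can look like in Scala, the block below writes a request/reply protocol down as two closed sets of message types. All names (`JobProtocol`, `SubmitJob`, `QueryStatus`, `JobAccepted`, `JobRejected`, `Status`, `handle`) are assumptions made for illustration and do not come from the talk. A real system would send these messages over one of the transports listed above; here a pure function stands in for the receiving component.

```scala
object JobProtocol {
  // Requests a client may send (hypothetical names)
  sealed trait Request
  final case class SubmitJob(id: Long, payload: String) extends Request
  final case class QueryStatus(id: Long) extends Request

  // Replies the component may send back
  sealed trait Reply
  final case class JobAccepted(id: Long) extends Reply
  final case class JobRejected(id: Long, reason: String) extends Reply
  final case class Status(id: Long, known: Boolean) extends Reply

  // The protocol made explicit: (current state, request) => (new state, reply).
  // The state is simply the set of job ids accepted so far.
  def handle(known: Set[Long], request: Request): (Set[Long], Reply) =
    request match {
      case SubmitJob(id, _) if known(id) =>
        (known, JobRejected(id, "duplicate id"))
      case SubmitJob(id, payload) if payload.isEmpty =>
        (known, JobRejected(id, "empty payload"))
      case SubmitJob(id, _) =>
        (known + id, JobAccepted(id))
      case QueryStatus(id) =>
        (known, Status(id, known(id)))
    }
}
```

Because both message hierarchies are sealed, the compiler can check that every request is answered, which is one way to keep the protocol and its implementation in step.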
13. Simple Component Pattern
• Single Responsibility Principle formulated by DeMarco in «Structured analysis and system specification» (Yourdon, New York, 1979)
• “maximize cohesion and minimize coupling”
• “a class should have only one reason to change” (Uncle Bob Martin’s formulation for OOD)
14. Example: the Batch Job Service
• users submit jobs
• planning and validation rules
• execution on elastic compute cluster
• users query job status and results
19. Let-It-Crash Pattern
• Candea & Fox: “Crash-Only Software”
(USENIX HotOS IX, 2003)
• transient and rare failures are hard to detect and fix
• write component such that full restart is always o.k.
• simplified failure model leads to more reliability
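The idea that a full restart is always acceptable can be sketched in a few lines of plain Scala. The names `Counter` and `supervise` are hypothetical and chosen only for this illustration. The supervisor makes no attempt to diagnose or repair a failed component: it discards the instance and continues with a fresh one. In this sketch the in-memory total is lost on restart on purpose; state that must survive a crash would have to live in a separate persistent store.

```scala
import scala.util.{Failure, Success, Try}

object LetItCrash {
  // A deliberately fragile component: it fails on negative input.
  final class Counter {
    private var total = 0
    def add(n: Int): Int = {
      if (n < 0) throw new IllegalArgumentException(s"negative input: $n")
      total += n
      total
    }
  }

  // Feeds the inputs to the component. On any failure the instance is thrown
  // away and replaced by a fresh one. Returns (last total, number of restarts).
  def supervise(inputs: List[Int]): (Int, Int) = {
    var component = new Counter
    var restarts = 0
    var last = 0
    inputs.foreach { n =>
      Try(component.add(n)) match {
        case Success(total) =>
          last = total
        case Failure(_) =>
          component = new Counter // full restart, no attempt to repair state
          restarts += 1
          last = 0
      }
    }
    (last, restarts)
  }
}
```

The supervisor has a single recovery path, which is the simplified failure model the slide refers to.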
20. Let-It-Crash Pattern
• Erlang philosophy from day one
• popularized by Netflix Chaos Monkey
• make sure that system is resilient by arbitrarily performing recovery restarts
• exercise failure recovery code paths for real
• failure will happen, fault-avoidance is doomed
23. Circuit Breaker Pattern
• well-known, inspired by electrical engineering
• first published by M. Nygard in «Release It!»
• protects both ways:
    • allows client to avoid long failure timeouts
    • gives service some breathing room to recover
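To make the mechanism concrete, here is a minimal synchronous sketch of the three breaker states in plain Scala. It is not the implementation of any library: `BreakerSketch`, `Breaker`, `OpenException` and the injectable clock `now` are assumptions made for illustration, and the class is not thread-safe.

```scala
object BreakerSketch {
  sealed trait State
  case object Closed extends State   // calls pass through
  case object Open extends State     // calls fail fast
  case object HalfOpen extends State // one trial call is allowed

  final class OpenException extends RuntimeException("circuit breaker is open")

  // maxFailures: consecutive failures before the breaker trips
  // resetMillis: how long to stay open before allowing a trial call
  // now:         clock in milliseconds, injectable so behaviour is testable
  final class Breaker(maxFailures: Int, resetMillis: Long, now: () => Long) {
    private var failures = 0
    private var openedAt = 0L
    private var current: State = Closed

    def state: State = current

    def call[A](body: => A): A = {
      if (current == Open) {
        if (now() - openedAt >= resetMillis) current = HalfOpen
        else throw new OpenException // fail fast, the service is not touched
      }
      try {
        val result = body
        failures = 0
        current = Closed // a success closes the breaker again
        result
      } catch {
        case e: Exception =>
          failures += 1
          if (current == HalfOpen || failures >= maxFailures) {
            current = Open
            openedAt = now()
          }
          throw e
      }
    }
  }
}
```

While the breaker is open the client gets an immediate error instead of waiting for a timeout, and the service receives no traffic until the reset interval has elapsed, which covers both directions of protection named above.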
24. Circuit Breaker Example
import java.util.concurrent.TimeoutException
import scala.concurrent.Future
import scala.concurrent.duration._
import akka.pattern.{CircuitBreaker, CircuitBreakerOpenException}

// assumes an ActorSystem `system` and an implicit ExecutionContext are in scope

private object StorageFailed extends RuntimeException

private def sendToStorage(job: Job): Future[StorageStatus] = {
  // make an asynchronous request to the storage subsystem
  val f: Future[StorageStatus] = ???
  // map storage failures to Future failures to alert the breaker
  f.map {
    case StorageStatus.Failed => throw StorageFailed
    case other                => other
  }
}

private val breaker = CircuitBreaker(
  system.scheduler, // used for scheduling timeouts
  5,                // number of failures in a row when it trips
  300.millis,       // timeout for each service call
  30.seconds)       // time before trying to close after tripping

def persist(job: Job): Future[StorageStatus] =
  breaker
    .withCircuitBreaker(sendToStorage(job))
    .recover {
      case StorageFailed                  => StorageStatus.Failed
      case _: TimeoutException            => StorageStatus.Unknown
      case _: CircuitBreakerOpenException => StorageStatus.Failed
    }
26. Saga Pattern: Background
• Microservice Architecture means distribution of knowledge, no more central database instance
• Pat Helland:
    • “Life Beyond Distributed Transactions”, CIDR 2007
    • “Memories, Guesses, and Apologies”, MSDN blog 2007
• What about transactions that affect multiple microservices?
27. Saga Pattern
• Garcia-Molina & Salem: “SAGAS”, ACM, 1987
• Bank transfer avoiding lock of both accounts:
    • T₁: transfer money from X to local working account
    • T₂: transfer money from local working account to Y
    • C₁: compensate failure by transferring money back to X
• Compensating transactions are executed during Saga rollback
• concurrent Sagas can see intermediate state
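The bank transfer above can be sketched with in-memory balances. This is an illustration under stated assumptions and not the code from the talk: `TransferSketch`, `Balances`, `move`, `transfer` and the account name `"working"` are hypothetical, and each `move` stands in for one local transaction.

```scala
object TransferSketch {
  // Balances keyed by account name; "working" plays the local working account.
  type Balances = Map[String, BigDecimal]

  // One local transaction: either it fully applies or it reports a reason.
  private def move(b: Balances, from: String, to: String,
                   amount: BigDecimal): Either[String, Balances] =
    if (!b.contains(from) || !b.contains(to)) Left(s"unknown account in $from -> $to")
    else if (b(from) < amount) Left(s"insufficient funds in $from")
    else Right(b.updated(from, b(from) - amount).updated(to, b(to) + amount))

  def transfer(b: Balances, x: String, y: String, amount: BigDecimal): Balances =
    move(b, x, "working", amount) match {            // T1
      case Left(_) => b                              // nothing done, nothing to compensate
      case Right(afterT1) =>
        move(afterT1, "working", y, amount) match {  // T2
          case Right(afterT2) => afterT2
          case Left(_) =>
            // C1: semantically undo T1 by moving the money back to X
            move(afterT1, "working", x, amount).getOrElse(afterT1)
        }
    }
}
```

Between T₁ and T₂ the money sits in the working account, and that intermediate state is exactly what a concurrent Saga could observe.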
28. Saga Pattern
• backward recovery:
    T₁ T₂ T₃ C₃ C₂ C₁
• forward recovery with save-points:
    T₁ (sp) T₂ (sp) T₃ (sp) T₄
• in practice Sagas need to be persistent to recover after hardware failures, meaning backward recovery will also use save-points
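Both recovery directions can be sketched with a small generic runner. The names `SagaSketch`, `Step` and `execute` are hypothetical, and the sketch makes two simplifying assumptions: a failed step is treated as having had no effect, so its own compensation is skipped, and the save-point is modelled as the count of completed steps that a persistent Saga would have written to its journal.

```scala
object SagaSketch {
  // A step pairs a transaction T with its compensation C.
  final case class Step(name: String, run: () => Boolean, compensate: () => Unit)

  // Executes steps in order, starting after the `completed` steps recorded by
  // the last save-point. Returns a trace of what was executed in this run.
  def execute(steps: Vector[Step], completed: Int = 0): Vector[String] = {
    val trace = Vector.newBuilder[String]
    var done = completed
    var failed = false
    while (done < steps.length && !failed) {
      val step = steps(done)
      if (step.run()) {
        trace += s"T${step.name}"
        done += 1 // a persistent Saga would write a save-point here
      } else failed = true
    }
    if (failed) {
      // backward recovery: compensate completed steps, newest first
      steps.take(done).reverse.foreach { step =>
        step.compensate()
        trace += s"C${step.name}"
      }
    }
    trace.result()
  }
}
```

Calling `execute` with a non-zero `completed` models forward recovery after a restart: work resumes at the step following the last save-point instead of starting over.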
29. Example: Bank Transfer
trait Account {
  def withdraw(amount: BigDecimal, id: Long): Future[Unit]
  def deposit(amount: BigDecimal, id: Long): Future[Unit]
}

case class Transfer(amount: BigDecimal, x: Account, y: Account)

sealed trait Event
case class TransferStarted(amount: BigDecimal, x: Account, y: Account) extends Event
case object MoneyWithdrawn extends Event
case object MoneyDeposited extends Event
case object RolledBack extends Event
30. Example: Bank Transfer
import akka.pattern.pipe
import akka.persistence.PersistentActor

class TransferSaga(id: Long) extends PersistentActor {
  import context.dispatcher

  override val persistenceId: String = s"transaction-$id"

  override def receiveCommand: PartialFunction[Any, Unit] = {
    case Transfer(amount, x, y) =>
      // journal the start of the Saga before acting on it
      persist(TransferStarted(amount, x, y))(withdrawMoney)
  }

  def withdrawMoney(t: TransferStarted): Unit = {
    // send the outcome of the withdrawal back to this actor as a message
    t.x.withdraw(t.amount, id).map(_ => MoneyWithdrawn).pipeTo(self)
    context.become(awaitMoneyWithdrawn(t.amount, t.x, t.y))
  }

  def awaitMoneyWithdrawn(amount: BigDecimal, x: Account, y: Account): Receive = {
    case m @ MoneyWithdrawn => persist(m)(_ => depositMoney(amount, x, y))
  }
  ...
}
32. Example: Bank Transfer
// requires akka.persistence.RecoveryCompleted to be imported
override def receiveRecover: PartialFunction[Any, Unit] = {
  // state accumulated while the journal is replayed
  var start: TransferStarted = null
  var last: Event = null

  {
    case t: TransferStarted => { start = t; last = t }
    case e: Event           => last = e
    case RecoveryCompleted =>
      // resume the Saga from the last journaled event
      last match {
        case null               => // wait for initialization
        case t: TransferStarted => withdrawMoney(t)
        case MoneyWithdrawn     => depositMoney(start.amount, start.x, start.y)
        case MoneyDeposited     => context.stop(self)
        case RolledBack         => context.stop(self)
      }
  }
}
33. Saga Pattern: Reactive Full Circle
• Garcia-Molina & Salem note:
    • “search for natural divisions of the work being performed”
    • “it is the database itself that is naturally partitioned into relatively independent components”
    • “the database and the saga should be designed so that data passed from one sub-transaction to the next via local storage is minimized”
• fully aligned with Simple Components and isolation
35. Conclusion
• reactive systems are distributed
• this requires new (old) architecture patterns
• … helped by new (old) code patterns & abstractions
• none of this is dead easy: thinking is required!