7. system level requirements
• fast response time = high availability
• slow response time = low availability
• immediate (strict) consistency = ACID (Atomicity, Consistency, Isolation and Durability)
• time-window consistency = BASE (Basically Available, Soft state, Eventual consistency)
8. common mistakes
• The network is fail-safe
• The latency is zero
• The data throughput is infinite
• The network is secure
• The network topology will not change
• There is always only one network administrator
• The cost of data transport is zero
• The network is homogeneous
10. CAP
You can only pick 2 of:
• Consistency
• Availability
• Partition tolerance
At a given point in time!
11. "…a distributed system cannot satisfy all three of these guarantees at the same time…"
– Brewer
12. Centralized Systems
In a centralized system (RDBMS etc.) there are no network partitions, i.e. no P in CAP.
You GET BOTH:
• Consistency
• Availability
13. Distributed Systems
In a distributed system we have network partitions, i.e. the P in CAP.
You can ONLY PICK ONE:
• Consistency
• Availability
14. CAP in practice
• … there are only 2 types of systems: CP! or AP!
• … there is only one choice: in case of a network partition, what do you prefer?
1. Consistency
2. Availability
19. Data Model
Different needs:
• Reporting / Search?
• State changes?
• Business behaviors and rules?
• Algorithms / Calculations?
ORM + rich domain model is an anti-pattern!
Reading a single object very often ends up loading large parts of the entire database.
20. Storage
Different needs:
• When do you need ACID?
• When is eventual consistency a better fit?
• Should reporting / search be strictly consistent?
Scaling out READS on a RDBMS is hard (sharding/replication).
Scaling out WRITES on a RDBMS is impossible!
21. So… one fits all?
"A single model cannot be appropriate for reporting, searching and transactional behavior."
–Greg Young, 2008
39. scaling benefits
Host the two services (read / write) differently:
e.g. the read service on 25 servers, the write service on one server.
40. read side benefits
• top-down driven and optimized view data model
• denormalized schema - no joins
• easy queries (select * from DB)
• no mapping
• no ORM mismatches
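As a minimal sketch of what a denormalized read model looks like (the view name and fields are illustrative, not from the deck): each view record is precomputed, so a query is a single keyed lookup with no joins and no ORM mapping.

```python
# Hypothetical denormalized read model: one precomputed record per view entry.
# Querying is a plain keyed lookup - no joins, no mapping layer.

order_summary_view = {
    "order-1": {"customer": "Alice", "total": 120.0, "item_count": 3},
    "order-2": {"customer": "Bob", "total": 40.0, "item_count": 1},
}

def get_order_summary(order_id):
    # The "easy query": equivalent of `select * from view where id = ?`
    return order_summary_view[order_id]
```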
41. write side benefits
• behavior driven
• exposes only functionality
• no read access
• (event-driven) state machine
• forms a consistency boundary
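A sketch of such a write-side aggregate (the `BankAccount` example and its invariant are illustrative assumptions, not from the deck): it exposes only behavior, offers no read access to its state, and enforces its invariants inside one consistency boundary.

```python
class BankAccount:
    """Hypothetical write-side aggregate: exposes behavior only, no queries."""

    def __init__(self):
        self._balance = 0          # internal state, never exposed to readers
        self.pending_events = []   # events produced by the behavior

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount
        self.pending_events.append({"type": "Deposited", "amount": amount})

    def withdraw(self, amount):
        # The aggregate is the consistency boundary: the invariant
        # "balance never goes negative" is enforced here, atomically.
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount
        self.pending_events.append({"type": "Withdrawn", "amount": amount})
```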
44. Events
"It's really become clear to me in the last couple of years that we need a new building block and that is the Domain Events."
–Eric Evans, 2009
"State transitions are an important part of our problem space and should be modeled within our domain."
–Greg Young, 2008
49. events & commands
• uniquely identifiable
• self-contained
• pure data structure - no behavior
• observable
• time relevant
• non-blocking - async execution
• conceptually immutable, e.g. new, deep copy, clone
• events represent serialized function calls
• conceptual duality between function calls and messages
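These properties can be sketched in a few lines (the `CustomerRelocated` event and its fields are a hypothetical example): a pure data structure with no behavior, a unique identifier, a timestamp, and enforced immutability.

```python
import time
import uuid
from dataclasses import dataclass, field

@dataclass(frozen=True)  # frozen=True makes the event conceptually immutable
class CustomerRelocated:
    """Hypothetical domain event: pure data structure, no behavior."""
    customer_id: str
    new_address: str
    # uniquely identifiable + time relevant:
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: float = field(default_factory=time.time)
```

Any attempt to mutate a field raises `FrozenInstanceError`; to "change" an event you create a new one, matching the "new, deep copy, clone" bullet above.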
51. flow in a nutshell
• The write side receives Commands and publishes Events
• All state changes are represented by Business Events
• The read side is updated as a result of the published Events
• All Queries go directly to the Reporting side; the business logic is not involved
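The four bullets above can be sketched end to end (all names here are illustrative assumptions): a command enters the write side, a business event is published, the read model is updated from that event, and queries hit only the read model.

```python
# Hypothetical end-to-end CQRS flow: command in, event out, read model updated.
event_log = []
read_model = {}          # reporting store, updated only from events

def handle_command(command):
    # Write side: receives Commands and publishes Events for state changes.
    if command["type"] == "RegisterUser":
        event = {"type": "UserRegistered", "name": command["name"]}
        event_log.append(event)
        publish(event)

def publish(event):
    # Read side: updated only as a result of the published Events.
    if event["type"] == "UserRegistered":
        read_model[event["name"]] = {"name": event["name"]}

def query(name):
    # Queries go directly to the reporting store; no business logic involved.
    return read_model.get(name)
```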
53. Common benefits
• Focus on the business
• Allows managing business and technology risks separately
• Technology agnostic, platform neutral, and easy integration of external systems
• Fully encapsulated domain that only exposes behavior
• Queries do not use the business model
• No object-relational (ORM) impedance mismatch
• Scalable, flexible and testable
• Reduced complexity and a simpler system architecture
54. benefits Event-Driven
• Explicit error/failure modeling
• Easy integration with external systems
• Location Transparency
• Reactive Behavior
• Bullet-proof auditing and historical tracing
58. trade-offs
• initially feels complex
• Consistency Boundaries (CQRS isn't a top-level architecture)
• Monitoring / Heartbeats
• Stale Data
• Collaborative Domains
• Data Integration
• Modeling with time
• Managing idempotency and event de-duplication
64. Event Sourcing
• Every state change is materialized in an Event
• All Events are sent to an EventProcessor
• The EventProcessor stores all Events in an Event Log
• The system can be reset and the Event Log replayed
• Many different EventListeners can be added to the EventProcessor (or listen directly on the Event Log)
• No need for an ORM - just persist the Events
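A minimal sketch of such an EventProcessor (the class shape is an assumption, not the deck's implementation): it appends every event to a log, forwards it to the registered listeners, and can replay the whole log so listeners can be reset and rebuilt.

```python
class EventProcessor:
    """Hypothetical event processor: persists events, not state."""

    def __init__(self):
        self.event_log = []   # the append-only Event Log
        self.listeners = []   # EventListeners added to the processor

    def record(self, event):
        self.event_log.append(event)       # persist the event itself
        for listener in self.listeners:
            listener(event)

    def replay(self):
        # Reset/rebuild: feed the entire log to the listeners again.
        for event in self.event_log:
            for listener in self.listeners:
                listener(event)
```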
66. recover state
f(state, event) → state
1. The EventProcessor records & replays all events within the context of a consistency boundary
2. Projections - loop over a series of events and aggregate them into a state
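The signature f(state, event) → state is exactly a left fold over the event stream; a sketch with a hypothetical account-balance domain:

```python
from functools import reduce

def apply_event(state, event):
    # f(state, event) -> state : apply one delta at a time
    if event["type"] == "Deposited":
        return state + event["amount"]
    if event["type"] == "Withdrawn":
        return state - event["amount"]
    return state

events = [
    {"type": "Deposited", "amount": 100},
    {"type": "Withdrawn", "amount": 30},
    {"type": "Deposited", "amount": 5},
]

# Recovering state is a left fold of apply_event over the series of events:
balance = reduce(apply_event, events, 0)  # 0 + 100 - 30 + 5 = 75
```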
67. Event Processor
Restoring state from deltas:
• The Event Processor tracks events describing what has changed
• Current state is constructed by replaying the events
• Data is not persisted in a structure but as a series of transactions
• No ORM is needed
68. Projections
A query language for Event Streams: Filter, Map, Reduce over Event Streams.
A projection derives the current state from the stream of events.
Given the stream of events, we can transform it into any structural representation.
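As a sketch of "any structural representation" (the per-account balance view is an illustrative assumption): the same event stream, filtered, mapped to deltas, and reduced into a dictionary keyed by account.

```python
# Hypothetical projection: transform one event stream into a different
# structural representation (here, a per-account balance table).
events = [
    {"account": "a", "type": "Deposited", "amount": 100},
    {"account": "b", "type": "Deposited", "amount": 50},
    {"account": "a", "type": "Withdrawn", "amount": 20},
]

def project_balances(stream):
    balances = {}
    for e in stream:
        # filter: only money-moving events; map: event -> signed delta
        if e["type"] not in ("Deposited", "Withdrawn"):
            continue
        delta = e["amount"] if e["type"] == "Deposited" else -e["amount"]
        # reduce: accumulate the deltas per account
        balances[e["account"]] = balances.get(e["account"], 0) + delta
    return balances
```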
72. Event Sourcing benefits
• recording all state changes yields a business transaction history
• event history + time machine
• replaying the full record of events yields the latest state
• models can be regenerated at any time
• projections as a useful event-stream query language
• time-related analysis and statistics (forecasting)
73. Event Sourcing benefits
• No object-relational impedance mismatch
• time-ordered series of business transactions
• supports future ways of looking at data
• performance and scalability
• testability
• reconstruct production scenarios
• long-lived processes, tracking and auditability
82. Idempotency
• naturally idempotent operations are mostly easy
  • work nicely for state machines that alter state without side effects
  • turn the light ON - duplicate calls produce the same result: the light is on
• not naturally idempotent operations are hard
  • only process the messages that are seen as new
  • Balance += Amount - increases the balance again and again
83. Idempotency
• de-duplicate messages by querying a log store
  • check an incoming message's identifier against a log store and reject it if the identifier matches
  • check against a correlation identifier (e.g. ETags, TransactionID)
  • check against a computed hash of the complete message (like git commits)
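The hash-based variant can be sketched as follows (the in-memory set stands in for the log store; `handle_once` and its return convention are assumptions for illustration): hash the complete message, reject it if the hash was already recorded.

```python
import hashlib
import json

processed = set()  # stands in for the de-duplication log store

def message_hash(message):
    # Hash the complete message, similar in spirit to a git commit id.
    payload = json.dumps(message, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def handle_once(message, handler):
    # Reject the message if its hash is already in the log store.
    key = message_hash(message)
    if key in processed:
        return False  # duplicate - ignored
    processed.add(key)
    handler(message)
    return True
```

Note the trade-off: hashing the whole message treats a legitimate repeat of identical content as a duplicate, which is why a dedicated correlation identifier is often preferred for business messages.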