03/19/14 1
Synchronization
Synchronization and Coordination
 Distributed Algorithms
 Time and Clocks
 Global State
 Concurrency Control
 Distributed Transactions
Synchronization
 How do processes cooperate and synchronize
with each other?
 Simultaneously access a shared resource,
such as a printer
 Agree on the ordering of events, such as
whether message m1 from process P was
sent before or after message m2 from process Q.
DISTRIBUTED ALGORITHMS
Algorithms that are intended to work in a distributed
environment.
 Used to accomplish tasks such as:
 Communication
 Accessing resources
 Allocating resources
 etc.
 Synchronization and coordination are closely linked to
distributed algorithms
 Both are achieved using distributed algorithms
SYNCHRONOUS vs ASYNCHRONOUS
DISTRIBUTED SYSTEMS
Timing model of a distributed system
 Affected by:
 Execution speed/time of processes
 Communication delay
 Clocks & clock drift
 (Partial) failure
Synchronous Distributed System:
Time variance is bounded
 Execution: bounded execution speed and time
 Communication: bounded transmission delay
 Clocks: bounded clock drift (and differences in clocks)
 Effect:
 Easier to design distributed algorithms
 Very restrictive requirements
• Limit concurrent use of network
• Require precise clocks and synchronization
Asynchronous Distributed System
Time variance is not bounded
 Execution: different steps can have varying duration
 Communication: transmission delays vary widely
 Clocks: arbitrary clock drift
 Effect:
 Allows no assumption about time intervals
• Most asynchronous DS problems are hard to solve
• A solution for an asynchronous DS is also a solution for a synchronous DS
 Most real distributed systems are a hybrid of synchronous and asynchronous.
EVALUATING DISTRIBUTED ALGORITHMS
 General Properties:
 Performance
• number of messages exchanged
• response/wait time
• delay
• throughput
• complexity
 Efficiency
• resource usage: memory, CPU, etc.
 Scalability
 Reliability
• number of points of failure (low is good)
SYNCHRONIZATION AND COORDINATION
 Making sure that processes do the right thing
at the right time
 allowing processes to synchronize and
coordinate their actions
 Two fundamental issues:
• Coordination (the right thing)
• Synchronization (the right time)
COORDINATION
 Coordination refers to coordinating the actions of separate processes
relative to each other and allowing them to agree on global state (such as
values of a shared variable).
“Coordinate actions and agree on values.”
 Coordinate Actions:
 What actions will occur
 Who will perform actions
 Agree on Values:
 Agree on global value
 Agree on environment
 Agree on state
Cont …
 Examples of coordination include ensuring
that processes agree on:
 what actions will be performed
 who will perform them
 the state of the system
SYNCHRONIZATION
 Synchronization is coordination with respect
to time, and refers to the ordering of events
and execution of instructions in time.
“Ordering of all actions”
 Total ordering of events
(whether message m1 from process P was sent before or after message m2
from process Q)
 Ordering of access to resources
Cont …
 Examples of synchronization include,
 Ordering distributed events and ensuring that
a process performs an action at a particular
time.
MAIN ISSUES
 Time and Clocks: synchronizing clocks and using time
in distributed algorithms
 Global State: how to acquire knowledge of the system’s
global state
 Concurrency Control: coordinating concurrent access
to resources
 Coordination: when do processes need to coordinate and
how do they do it
TIME AND CLOCKS
 It is often important to know when events
occurred and in what order they occurred.
 In a non-distributed system dealing with time
is trivial as there is a single shared clock,
where all processes see the same time.
Cont …
 In a distributed system, on the other hand,
each computer has its own clock.
 Because no clock is perfect, each of these
clocks has its own skew, which causes clocks
on different computers to drift and eventually
become out of sync.
CLOCKS
 Computer Clocks:
 oscillates at known frequency
 Clock Skew:
 different computers run at slightly different rates
 Clocks get out of sync
 Maximum drift rate
 Timestamps:
 Used to denote at which time an event occurred
PHYSICAL CLOCKS
 Based on actual time:
• Physical clocks keep track of physical time.
• In distributed systems that are based on actual time it is necessary
to keep individual computer clocks synchronized.
• The clocks can be synchronized to global time (external
synchronization), or to each other (internal synchronization).
PHYSICAL CLOCKS
 Examples:
 Performing events at an exact time (turn lights on/off,
lock/unlock gates)
 sorting of events (for security, for profiling)
 Tracking (tracking a moving object with separate cameras)
SYNCHRONIZING PHYSICAL CLOCKS
 External Synchronization:
 Clocks synchronize to an external time source
 Synchronize with UTC every δ seconds
 Internal Synchronization:
 Clocks synchronize locally
 Only synchronized with each other
 Time Server:
 Server that has the correct time
 Server that calculates the correct time
What can go wrong?
 Updating a replicated database: Update 1 adds 100 rupees to
an account, Update 2 calculates and adds 1% interest to the
same account.
 Clock reading might be different for different computers!
 Inconsistent state
 Can use Lamport's algorithm to totally order the updates
Computer Clock
 We need to measure time accurately:
 to know the time an event occurred at a computer
 Algorithms for clock synchronization are useful for
 concurrency control based on timestamp ordering
 authenticity of requests, e.g. in Kerberos
 Each computer in a DS has its own internal
clock
Clock Synchronization
 There exists a time server receiving signals
from a UTC source
 Cristian’s algorithm
 There is no UTC source available
 Berkeley's algorithm
 Exact time does not matter!
 Lamport’s algorithm
Clock Sync. Algorithm: Cristian's
 Cristian’s algorithm requires clients to periodically
synchronize with a central time server (typically a server with
a UTC receiver)
 Every computer periodically asks the “time server” for the
current time
 The server responds ASAP with the current time CUTC
 The client sets its clock to CUTC
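The client's round can be sketched in Python. This is a minimal illustration, not from the slides: `ask_server` is a hypothetical stand-in for the network request, and the client improves its estimate by adding half the measured round-trip time to the server's reply.

```python
def cristian_sync(local_clock, ask_server):
    """One round of Cristian's algorithm (sketch).

    local_clock() returns the client's current time; ask_server() is a
    hypothetical call returning the server's time CUTC at the moment it
    replies. The round-trip time estimates the one-way delay."""
    t0 = local_clock()
    server_time = ask_server()
    t1 = local_clock()
    rtt = t1 - t0
    # Best estimate: the reply is about rtt/2 old when it arrives.
    return server_time + rtt / 2

# Toy demo with deterministic clocks (no real network).
fake_now = [100.0]
def local_clock():
    fake_now[0] += 1.0      # each call "takes" one second
    return fake_now[0]
def ask_server():
    return 500.0            # server's CUTC at reply time

print(cristian_sync(local_clock, ask_server))  # → 500.5
```

In practice the client repeats the exchange and keeps the reply with the smallest RTT, since that bounds the estimation error most tightly.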
Berkeley Algorithm
 An algorithm for internal synchronization of a group
of computers
 A master polls to collect clock values from the
others (slaves)
 The master uses round trip times to estimate the
slaves’ clock values
 It takes an average
 It sends the required adjustment to the slaves
 If master fails, can elect a new master to take over
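One polling round can be sketched as follows. This is a simplified illustration (round-trip delay estimation and master failure are omitted); the function name and the minutes-based encoding are choices made here, not part of the slides.

```python
def berkeley_round(master_time, slave_times):
    """One round of the Berkeley algorithm (simplified sketch).

    The master averages all clock values (including its own) and
    returns the adjustment each machine should apply; the first
    entry is the master's own adjustment."""
    clocks = [master_time] + slave_times
    avg = sum(clocks) / len(clocks)
    return [avg - c for c in clocks]

# Classic example: clocks at 3:00, 2:50, 3:25 average to 3:05.
# Times are encoded as minutes relative to 3:00.
print(berkeley_round(0, [-10, 25]))   # → [5.0, 15.0, -20.0]
```

Note that the master sends each machine the *adjustment* rather than the new time, so machines can apply it gradually and clocks never jump backwards.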
The Berkeley Algorithm
a) The time daemon asks all the other machines for their clock
values
b) The machines answer
c) The time daemon takes an average and tells everyone how to adjust their clock
Berkeley Algorithm
 Average time: 3:05
03/19/14 30
LOGICAL CLOCKS
 The relative ordering of Events is more important than actual
physical time in many applications
 In a single process the ordering of events (e.g., state changes) is
trivial
 In a distributed system, however, besides local ordering of events,
all processes must also agree on ordering of causally related events
(e.g., sending and receiving of a single message)
 Local ordering:
 System consists of N processes pi
 If pi observes e before e′, we have e → e′
 Global ordering:
 Lamport's happened-before relation →
 Smallest relation such that:
1. If a process observes e before e′, then e → e′
2. For every message m, send(m) → receive(m)
3. Transitivity: e → e′ and e′ → e′′ implies e → e′′
ORDERING EVENTS
 Event ordering linked with concept of
causality:
 Saying that event a happened before event b is the
same as saying that event a causally affects
event b
 If events a and b happen on processes that do not
exchange any data, their exact ordering is not
important
Relation “has happened before”
 Smallest relation satisfying the three
conditions:
 If a and b are events in the same process and a
comes before b, then a → b
 If a is the sending of a message by a process and b
its receipt by another process then
a → b
 If a → b and b → c then a → c.
Relation “has happened before”
 We cannot always order events: the relation "has
happened before" is only a partial order
 If a did not happen before b, it cannot
causally affect b.
Lamport Algorithm
 Each process increments its local clock between
any two successive events
 Each message contains a timestamp
 Upon receiving a message, if the received
timestamp is ahead, the receiver fast-forwards its
clock to one more than the sending time
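The rules above can be sketched as a small Python class. This is an illustration, not from the slides; the class and method names are chosen here.

```python
class LamportClock:
    """Lamport logical clock (sketch) following the rules above."""
    def __init__(self):
        self.time = 0

    def tick(self):                 # local event
        self.time += 1
        return self.time

    def send(self):                 # timestamp attached to an outgoing message
        return self.tick()

    def receive(self, msg_ts):      # fast-forward past the sender if needed
        self.time = max(self.time, msg_ts) + 1
        return self.time

p, q = LamportClock(), LamportClock()
ts = p.send()          # p's clock becomes 1
q.tick(); q.tick()     # q's clock becomes 2
print(q.receive(ts))   # → 3, i.e. max(2, 1) + 1
```

This guarantees that if e → e′ then the clock of e is smaller than the clock of e′; the converse does not hold, which is what motivates vector clocks below.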
VECTOR CLOCKS
Tutorial: Lamport and Vector Clocks
 For the figure below, write down the Lamport clock and
vector clock value at each event.
GLOBAL STATE
03/19/14 42
Determining Global States
 Global State
“The global state of a distributed computation is the
set of local states of all individual processes involved
in the computation plus the state of the
communication channels.”
03/19/14 43
Determining Global States
 Knowing the global state of a distributed system may be useful
for many reasons.
For example:
 How do we know whether the system is deadlocked, or
whether a distributed computation has terminated correctly?
GLOBAL STATE
Determining global properties: often difficult, but essential for
some applications.
 Distributed garbage collection:
Do any references exist to a given object? Collect remote server
objects that are no longer referenced by any client in the network.
 Distributed deadlock detection:
Do processes wait in a cycle for each other?
 Distributed algorithm termination detection:
Did a set of processes finish all activity? (Consider messages
in transit)
All these properties are stable: once they occur, they do
not disappear without outside intervention.
Analysis of GLOBAL STATE (Distributed
Snapshot)
 A distributed snapshot reflects a state in which the
distributed system might have been.
 A snapshot reflects a consistent global state
 Example: if we have recorded that a process Q has
received a message from another process P,
 then we should also have recorded that process P
actually sent that message
 Otherwise, the snapshot would contain a recording of
messages that have been received but never sent,
 which indicates an inconsistent global state
 The reverse condition (P has sent a message that Q
has not yet received) is allowed
More on States
 process state
 memory state + register state + open files + kernel buffers
+ …
Or
 application-specific info such as transactions completed,
functions executed, etc.
 channel state
 “Messages in transit” i.e. those messages that have been
sent but not yet received
Why global state determination is difficult in
Distributed Systems?
 Distributed State :
Have to collect information that is spread across
several machines!!
 Only Local knowledge :
A process in the computation does not know the
state of other processes.
Difficulties due to Non Determinism
 Deterministic Computation
 At any point in computation there is at most one event
that can happen next.
 Non-Deterministic Computation
 At any point in computation there can be more than one
event that can happen next.
Deterministic Computation Example
A Variant of producer-consumer example
 Producer code:
while (1)
{
produce m;
send m;
wait for ack;
}
 Consumer code:
while (1)
{
recv m;
consume m;
send ack;
}
Example: Initial State
(figure sequence: message m travels from producer to consumer, then ack a travels back; each step yields a new global state)
Deterministic state diagram
(figure: the states and transitions of the deterministic computation)
Non-deterministic computation
(figure: three processes p, q, r exchanging messages m1, m2, m3; at several points more than one event can happen next)
Election Algorithms
 In distributed computing, leader election is the
process of designating a single process as the
organizer of some task distributed among several
computers (nodes).
 Before the task is begun, all network nodes are
unaware which node will serve as the "leader," or
coordinator, of the task.
 After a leader election algorithm has been run,
however, each node throughout the network
recognizes a particular, unique node as the task
leader.
03/19/14 59
Election Algorithms
 Many distributed algorithms require one process to
act as coordinator, initiator, or otherwise perform some
special role.
 If all processes in the distributed system have equal
characteristics, there is no way to select one of them
to be special.
 Assume each process has an "ID" that is a number, and that every
process knows the process number of every other process.
 "Elect" the process with the highest number as leader.
03/19/14 60
Election Algorithms
 Election
 Goal: ensure that the election reaches
agreement among all the processes
 the process with the highest ID is elected
03/19/14 61
The Bully Algorithm
 When a process notices that the coordinator is no
longer responding to requests, it initiates an
election.
 A process, P, holds an election as follows:
1. P sends an ELECTION message to all processes with higher
numbers.
2. If no one responds, P wins the election and becomes
coordinator.
3. If one of the higher-ups answers, it takes over; P's job is
done.
4. Eventually only one process (the new coordinator) remains, and it
informs the others by sending a message.
5. If a process is restarted, its first action is to trigger an
election.
03/19/14 62
The Bully Algorithm (1)
 Process 4 notices 7 down
 Process 4 holds an election
 Process 5 and 6 respond, telling 4 to stop
 Now 5 and 6 each hold an election
03/19/14 63
The Bully Algorithm (2)
a) Process 6 tells process 5 to stop
b) Process 6 wins and tells everyone
 Eventually the "biggest" process (the bully) wins
 If process 7 comes back up, it starts an election again
03/19/14 64
A Ring Algorithm
 Processes are ordered in such a way that each
process knows who its successor is.
 When a process suspects a coordinator fault:
 It sends an ELECTION message containing its process ID to
its successor; if the successor is down, the sender skips
over it and sends the message to the next process in the ring
 Each member of the ring receives and
propagates the message to the next, adding its own ID
03/19/14 65
A Ring Algorithm
 When a message returns to the process that sent it (verified
by inspecting the list), the message is turned into a
COORDINATOR message and circulated, to report:
 New coordinator: the process in the list with the highest ID
 Multiple messages can circulate over the network
03/19/14 66
A Ring Algorithm
 Once around the ring, the message changes to COORDINATOR, naming the biggest ID
• Even if two ELECTIONs started at once, everyone
will pick the same leader
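One circulation of the ELECTION message can be sketched as follows (a toy model, names chosen here): crashed processes are represented as `None` entries that the message skips over, and the returned value is what the COORDINATOR message would announce.

```python
def ring_election(ring, initiator_index):
    """Simulate one circulation of the ELECTION message (sketch).

    `ring` lists process IDs in ring order; None marks a crashed
    process that gets skipped. The message collects live IDs and,
    back at the initiator, turns into COORDINATOR(max id)."""
    n = len(ring)
    collected = []
    i = initiator_index
    while True:
        if ring[i] is not None:
            collected.append(ring[i])   # live process adds its ID
        i = (i + 1) % n
        if i == initiator_index:
            break                       # message is back at the sender
    return max(collected)               # announced as new coordinator

# Ring of processes 2, 5, 7 (crashed), 1, 4; process 4 starts.
print(ring_election([2, 5, None, 1, 4], 4))   # → 5
```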
03/19/14 67
CONCURRENCY
Concurrency in a Non-Distributed System:
 Concurrency problems in distributed systems are familiar from
the study of operating systems and multithreaded programming
 Prevent race conditions (which occur when concurrent
processes access shared resources)
 Critical sections
 In a non-distributed system these problems are solved
by implementing mutual exclusion using local
primitives such as
• Locks
• Semaphores
03/19/14 68
 Concurrency in a Distributed System:
A distributed system introduces additional concurrency
challenges due to:
 No directly shared resources (e.g., memory)
 No global state
 No global clock
 No centralized algorithms
 presence of communication delays
03/19/14 69
DISTRIBUTED MUTUAL EXCLUSION
 Concurrent access to distributed resources
 Must prevent race conditions during critical
regions
Requirements:
1. Safety: At most one process may execute the
critical section at a time
2. Liveness: Requests to enter and exit the critical
section ultimately succeed
3. Ordering: Requests are processed in happened-
before ordering
03/19/14 70
METHOD 1: CENTRAL SERVER
Simplest approach: use a central server that
controls the entering and exiting of critical
sections.
 Requests to enter and exit a critical section are
sent to a lock server (or coordinator)
 Permission to enter is granted by receiving a
token
 When critical section left, token is returned to
the server
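The central server above can be sketched as a single-token lock server with a FIFO queue (an illustration with names chosen here; real servers would also handle client crashes and message loss):

```python
from collections import deque

class LockServer:
    """Central lock server sketch: one token, FIFO queue of waiters."""
    def __init__(self):
        self.holder = None
        self.queue = deque()

    def request(self, pid):
        """Returns True if the token is granted immediately;
        otherwise the caller is queued and must wait."""
        if self.holder is None:
            self.holder = pid
            return True
        self.queue.append(pid)
        return False

    def release(self, pid):
        """Return the token; pass it to the next waiter, if any."""
        assert self.holder == pid
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder

s = LockServer()
print(s.request(1))   # → True   (process 1 enters)
print(s.request(2))   # → False  (process 2 must wait)
print(s.release(1))   # → 2      (token passes to process 2)
```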
03/19/14 71
A Centralized Algorithm
a) Process 1 asks the coordinator for permission to enter a
critical region. Permission is granted
b) Process 2 then asks permission to enter the same critical
region. The coordinator does not reply. (Or, can say
“denied”)
c) When process 1 exits the critical region, it tells the
coordinator, which then replies to 2.
Cont...
 This scheme is easy to implement, but it does
not scale well due to the central authority.
 Moreover, it is vulnerable to failure of the
central server.
A Distributed Algorithm
a) Processes 0 and 2 want to enter the same critical
region at the same moment.
b) Process 1 doesn't want to enter the critical region, so it
says "OK". Process 0 has the lowest timestamp, so it
wins, and queues up the "OK" for 2.
c) When process 0 is done, it sends an OK to 2, which can
now enter the critical region.
METHOD 2: TOKEN RING
Implementation:
 All processes are organised in a logical ring structure
 A token message is forwarded along the ring
 Before entering the critical section, a process has to wait until the
token comes
 Must retain the token until the critical section is left
Properties:
 Ring limits scalability
 Token messages consume bandwidth
 Failing nodes or channels can break the ring (token
might be lost)
METHOD 3: USING MULTICASTS AND LOGICAL
CLOCKS
 Algorithm by Ricart & Agrawala: a distributed
mutual exclusion algorithm that makes use of logical
clocks.
 Processes pi maintain a Lamport clock and all process must
be able to communicate pairwise
 At any moment, each process is in one of three states:
1. Released: Outside of critical section
2. Wanted: Waiting to enter critical section
3. Held: Inside critical section
 Process behaviour:
1. If a process wants to enter a critical section, it
•multicasts a message (Li, pi) and
•waits until it has received a reply from every other process
2. If a process is in Released , it immediately replies to any
request to enter the critical section
3. If a process is in Held , it delays replying until it is finished
with the critical section
4. If a process is in Wanted , it replies to a request immediately
only if the requesting timestamp is smaller than the
one in its own request
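Rules 2-4 above condense into a single reply decision, sketched here in Python (function name chosen for illustration; a full implementation would break Lamport-timestamp ties using process IDs):

```python
def ricart_agrawala_reply(state, own_ts, req_ts):
    """Should a process reply immediately to a request with
    timestamp req_ts? Encodes rules 2-4 above (sketch)."""
    if state == "RELEASED":
        return True                 # outside the critical section: reply now
    if state == "HELD":
        return False                # defer until the critical section is left
    # state == "WANTED": yield only to an older (smaller-timestamp) request
    return req_ts < own_ts

print(ricart_agrawala_reply("RELEASED", None, 7))  # → True
print(ricart_agrawala_reply("HELD", None, 7))      # → False
print(ricart_agrawala_reply("WANTED", 5, 7))       # → False (own request is older)
print(ricart_agrawala_reply("WANTED", 9, 7))       # → True
```

Because every process applies the same rule, all requests are granted in timestamp order, which gives the happened-before ordering required of distributed mutual exclusion.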
Properties:
 Better scalability, but multicast leads to
increasing overhead.
 Unfortunately, failure of any peer process
can deny all other process entry to the
critical section.
MUTUAL EXCLUSION: A COMPARISON
Messages Exchanged: messages per entry/exit of critical section
o Centralised: 3 (two to enter, one to leave)
o Ring: 1 → ∞
o Multicast: 2(n−1)
Delay: delay before entering critical section
o Centralised: 2
o Ring: 0 → n−1
o Multicast: 2(n−1)
Reliability:
o Centralised: coordinator crashes
o Ring: lost token, process crashes
o Multicast: any process crashes
DISTRIBUTED TRANSACTIONS
TRANSACTIONS
 Transaction:
 Comes from the database world: the concept of a transaction originates in the
database community as a mechanism for maintaining the consistency of
databases.
 Defines a sequence of operations
 Atomic in presence of multiple clients and failures
 Mutual Exclusion :
 Protect shared data against simultaneous access
 Allow multiple data items to be modified in single atomic action
 Transaction Model:
 Operations:
• Begin Transaction
• End Transaction
• Read
• Write
 End of Transaction:
• Commit
• Abort
ACID PROPERTIES
 atomic: must follow an "all or nothing" rule: if
one part of the transaction fails, the entire transaction
fails and the database state is left unchanged.
 consistent: concurrent transactions will not produce
inconsistent results
 isolated: transactions do not interfere with each
other, i.e. no intermediate state of a transaction is
visible outside it; other operations cannot access data
that has been modified by a transaction that has not
yet completed.
 durable: after a commit, results are permanent (even
if server or hardware fails)
CLASSIFICATION OF
TRANSACTIONS
 Flat: sequence of operations that satisfies ACID
 Nested: hierarchy of transactions
 Distributed: (flat) transaction that is executed on distributed
data
 Flat Transactions:
 Simple
 On failure, all changes are undone
BeginTransaction
accountA -= 100;
accountB += 50;
accountC += 25;
accountD += 25;
EndTransaction
NESTED TRANSACTION
 Subdivide a complex transaction into many smaller sub-
transactions
 As a result, if a single sub-transaction fails, the work that
went into the others is not necessarily wasted
 Example:
 Booking a flight
• Sydney → Manila
• Manila → Amsterdam
• Amsterdam → Toronto
 What to do?
 Abort whole transaction
 Partially commit transaction and try alternative for aborted part
 Commit nonaborted parts of transaction
Cont…
Subtransactions and parent transactions:
A parent transaction may commit even if some
subtransactions aborted
If the parent transaction aborts, all subtransactions
abort
Cont…
 Subtransactions:
 Subtransaction can abort any time
 Subtransaction cannot commit until parent ready
to commit
 Subtransaction either aborts or commits provisionally
 Provisionally committed subtransaction reports
provisional commit list, containing all its provisionally
committed subtransactions, to parent
 On abort, all subtransactions in that list are aborted.
TRANSACTION IMPLEMENTATION
 Private Workspace:
 Perform all tentative operations on a shadow copy of the server state
 Atomically swap with main copy on Commit
 Discard shadow on Abort.
Cont…
 Writeahead Log:
 In-place update with writeahead logging
 Log is reverted when a transaction Aborts
CONCURRENCY CONTROL
 It is often necessary to allow transactions to occur
simultaneously (for example, to allow multiple travel
agents to simultaneously reserve seats on the same
flight).
 Due to the consistency and isolation properties of
transactions concurrent transaction must not be
allowed to interfere with each other.
 Concurrency control algorithms for transactions
guarantee that multiple transactions can be executed
simultaneously while providing a result that is the
same as if they were executed one after another.
CONCURRENCY CONTROL
 Simultaneous Transactions:
 Clients accessing bank accounts
 Travel agents booking flights
 Problems:
 Simultaneous transactions may interfere
• Lost update
• Inconsistent retrieval
 Consistency and Isolation require that there is no interference
 Concurrency Control Algorithms:
 Guarantee that multiple transactions can be executed simultaneously
while still being isolated.
 As though transactions executed one after another
CONFLICTS AND SERIALISABILITY
 conflict: operations (from the same, or different,
transactions) that operate on the same data
 read-write conflict: one of the operations is a write
 write-write conflict: more than one of the operations is a
write
 Define a Schedule of operations:
 Total ordering (interleaving) of operations
 Legal schedules provide results as though the transactions
were serialised (i.e., performed one after another) (serial
equivalence)
MANAGING CONCURRENCY
 Transaction Managers:
Dealing with Concurrency:
 Locking
 Timestamp Ordering
 Optimistic Control
LOCKING
 Pessimistic approach: prevent illegal
schedules
 The locking algorithms require that each
transaction obtains a lock from a scheduler
process before performing a read or a write
operation.
 The scheduler is responsible for granting and
releasing locks in such a way that legal schedules
are produced.
 Ensures that only valid schedules result
TWO PHASE LOCKING (2PL)
The most widely used locking approach is two-phase locking (2PL).
In this approach a lock for a data item is granted to a process if no
conflicting locks are held by other processes (otherwise the process
requesting the lock blocks until the lock is available again).
A lock is not released until the operation has been executed by the data manager.
No more locks are granted after a release has taken place.
This results in a growing phase of the transaction, where locks are
acquired, and a shrinking phase, where locks are released.
All schedules formed using 2PL are serialisable.
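The growing/shrinking discipline can be sketched as per-transaction bookkeeping (an illustration with names chosen here; lock conflicts between transactions are not modelled):

```python
class TwoPhaseLocking:
    """Per-transaction 2PL bookkeeping sketch: once any lock has
    been released (shrinking phase), acquiring another is illegal."""
    def __init__(self):
        self.locks = set()
        self.shrinking = False

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: acquire after release")
        self.locks.add(item)

    def release(self, item):
        self.shrinking = True       # shrinking phase has begun
        self.locks.discard(item)

t = TwoPhaseLocking()
t.acquire("x"); t.acquire("y")      # growing phase
t.release("x")                      # shrinking phase begins
try:
    t.acquire("z")
except RuntimeError as e:
    print(e)                        # → 2PL violation: acquire after release
```

Strict 2PL goes one step further and holds all locks until commit or abort, which also prevents the cascaded aborts described above.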
PROBLEMS WITH LOCKING
 Deadlock
 Cascaded Aborts: If a transaction (T1) reads
the results of a write of another transaction
(T2) that is subsequently aborted, then the
first transaction (T1) will also have to be
aborted.
TIMESTAMP ORDERING
 Each transaction has unique timestamp (ts(Ti))
 Each operation (TS(W), TS(R)) receives its transaction’s timestamp
 Each data item has two timestamps:
• read timestamp: tsRD(x) - the timestamp of the last read of x
• write timestamp: tsWR(x) - the timestamp of the last committed write of x
 Timestamp ordering rule:
• a write request is valid only if TS(W) > tsWR(x) and TS(W) ≥ tsRD(x)
• a read request is valid only if TS(R) > tsWR(x)
 Conflict resolution:
• Operation with lower timestamp executed first
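The two validity rules translate directly into predicates (a sketch; function names are chosen here, and in a real scheduler an invalid operation would abort and restart the transaction):

```python
def ts_write_ok(ts_w, ts_rd_x, ts_wr_x):
    """Timestamp-ordering rule for a write on item x."""
    return ts_w > ts_wr_x and ts_w >= ts_rd_x

def ts_read_ok(ts_r, ts_wr_x):
    """Timestamp-ordering rule for a read of item x."""
    return ts_r > ts_wr_x

# Item x: last read at ts 5, last committed write at ts 3.
print(ts_write_ok(6, 5, 3))   # → True
print(ts_write_ok(4, 5, 3))   # → False (a later read has already seen x)
print(ts_read_ok(2, 3))       # → False (would read a stale value)
```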
OPTIMISTIC CONTROL
 Assume that no conflicts will occur.
 Detect & resolve conflicts at commit time
 A transaction is split into three phases:
• Working (using shadow copies)
• Validation
• Update
 In the working phase operations are carried out on
shadow copies with no attempt to detect or order
conflicting operations.
Cont….
 In the validation phase the scheduler attempts
to detect conflicts with other transactions that
were in progress during the working phase.
 If conflicts are detected, then one of the
conflicting transactions is aborted.
Cont…
 In the update phase, assuming that the
transaction was not aborted, all the updates
made on the shadow copy are made
permanent.
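The validation phase can be sketched as a simple backward check (an illustrative simplification, with names chosen here): a transaction passes if nothing it read was written by a transaction that committed during its working phase.

```python
def validate(read_set, committed_write_set):
    """Backward-validation sketch for optimistic control: pass if
    the transaction's reads do not overlap the writes of
    transactions that committed during its working phase."""
    return read_set.isdisjoint(committed_write_set)

# T read {x, y}; meanwhile a committed transaction wrote {y}.
print(validate({"x", "y"}, {"y"}))   # → False (conflict: abort and retry)
print(validate({"x"}, {"y"}))        # → True  (commit: apply shadow updates)
```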
DISTRIBUTED TRANSACTIONS
 In distributed system, a single transaction
will, in general, involve several servers:
 transaction may require several services,
 transaction involves files stored on different
servers
 All servers must agree to Commit or Abort,
and do this atomically.
Distributed Flat Transaction:
Distributed Nested Transaction:
DISTRIBUTED CONCURRENCY
CONTROL
DISTRIBUTED LOCKING
 Distributed 2PL:
 Data can be replicated
 Scheduler on each machine responsible for locking own
data
 Read lock: contact any replica
 Write lock: contact all replicas
ATOMICITY AND DISTRIBUTED
TRANSACTIONS
 Distributed Transaction Organisation:
 Each distributed transaction has a coordinator, the
server handling the initial BeginTransaction call
 Coordinator maintains a list of workers, i.e. other
servers involved in the transaction
 Each worker needs to know the coordinator
 Coordinator is responsible for ensuring that whole
transaction is atomically committed or aborted
• Require a distributed commit protocol.
DISTRIBUTED ATOMIC COMMIT
 Transaction may only be able to commit when all
workers are ready to commit (e.g. validation in
optimistic concurrency)
 Distributed commit requires at least two phases:
1. Voting phase: all workers vote on commit, coordinator
then decides whether to commit or abort.
2. Completion phase: all workers commit or abort
according to decision.
Basic protocol is called two-phase commit (2PC)
Two-phase commit: Coordinator
1. Sends CanCommit, receives yes or abort
2. Sends DoCommit or DoAbort
If yes messages are received from all workers, the
coordinator sends a DoCommit message to all workers,
which then carry out the Commit. However, if any of the
workers replies with abort instead of yes, the coordinator
sends a DoAbort message to all workers to trigger a
collective Abort.
Two-phase commit: Worker
1. Receives CanCommit, sends yes or abort
2. Receives DoCommit or DoAbort
A worker receiving a CanCommit message answers with
either a yes or an abort message, depending on the worker's
internal state. Afterwards, it commits or aborts depending on
the instructions from the coordinator.
The only extra case is where the worker is in an abort cycle
and immediately answers a CanCommit with an abort.
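The coordinator's decision logic can be sketched as follows. This is a toy model (workers are callables returning their vote; timeouts, crashes, and logging are not modelled):

```python
def two_phase_commit(workers):
    """Coordinator side of basic 2PC (sketch).

    Phase 1 (voting): collect a CanCommit vote from every worker.
    Phase 2 (completion): commit only if every vote was 'yes'."""
    votes = [w() for w in workers]
    decision = "DoCommit" if all(v == "yes" for v in votes) else "DoAbort"
    # The decision message would now be sent to every worker.
    return decision

ready = lambda: "yes"
failed = lambda: "abort"
print(two_phase_commit([ready, ready]))          # → DoCommit
print(two_phase_commit([ready, failed, ready]))  # → DoAbort
```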
 Failures can be due to:
 server/host failures:
• If a host fails in the 2PC protocol then, after being
restarted, it aborts all transactions (a restarting worker
aborts all its transactions)
 Failure of communication channels/network:
• coordinator aborts after timeout.
Two-phase commit with timeouts: Worker
 On timeout sends GetDecision
 Whenever a worker that has answered yes to a
CanCommit message times out while waiting for the decision
from the coordinator, it sends a special GetDecision message
that triggers the coordinator to resend the decision on whether
to Commit or Abort.
 These messages are handled by the coordinator once the
decision between committing and aborting has been made.
Moreover, the coordinator resends CanCommit messages if a
worker does not issue a timely reply.
Two-phase commit with timeouts:
Coordinator
 On timeout re-sends CanCommit, On GetDecision repeats
decision
Limitations(2PC)
 Once a node has voted "yes", it cannot change its
mind, even if it crashes.
 An atomic state update ensures the "yes" vote is stable.
 If coordinator crashes, all workers may be
blocked.
 Can use different protocols (e.g. three-phase
commit),
 in some circumstances workers can obtain result
from other workers.
SUMMARY
 Distributed Algorithms:
 Timing models
 Time:
 Clock synchronisation
 Logical clocks
 Vector clocks
 Concurrency Control:
 Distributed mutual exclusion
 Distributed transactions
 2PC
03/19/14 9
SYNCHRONIZATION AND COORDINATION
 Making sure that processes do the right thing at
the right time
 Allowing processes to synchronize and coordinate
their actions
 Two fundamental issues:
• Coordination (the right thing)
• Synchronization (the right time)
03/19/14 10
COORDINATION
 Coordination refers to coordinating the actions of
separate processes relative to each other and
allowing them to agree on global state (such as the
values of a shared variable).
“Coordinate actions and agree on values.”
 Coordinate actions:
 What actions will occur
 Who will perform actions
 Agree on values:
 Agree on global value
 Agree on environment
 Agree on state
03/19/14 11
Cont …
 Examples of coordination include ensuring that
processes agree on:
 what actions will be performed
 who will be performing actions
 the state of the system
03/19/14 12
SYNCHRONIZATION
 Synchronization is coordination with respect to
time, and refers to the ordering of events and
execution of instructions in time.
“Ordering of all actions”
 Total ordering of events (whether message m1
from process P was sent before or after message
m2 from process Q)
 Ordering of access to resources
03/19/14 13
Cont …
 Examples of synchronization include ordering
distributed events and ensuring that a process
performs an action at a particular time.
03/19/14 14
MAIN ISSUES
 Time and Clocks: synchronizing clocks and using
time in distributed algorithms
 Global State: how to acquire knowledge of the
system’s global state
 Concurrency Control: coordinating concurrent
access to resources
 Coordination: when do processes need to
coordinate, and how do they do it
03/19/14 16
TIME AND CLOCKS
 It is often important to know when events occurred
and in what order they occurred.
 In a non-distributed system dealing with time is
trivial: there is a single shared clock, so all
processes see the same time.
03/19/14 17
Cont …
 In a distributed system, on the other hand, each
computer has its own clock.
 Because no clock is perfect, each of these clocks
has its own skew, which causes clocks on different
computers to drift and eventually become out of
sync.
03/19/14 18
CLOCKS
 Computer clocks:
 Oscillate at a known frequency
 Clock skew:
 Different computers run at slightly different rates
 Clocks get out of sync
 Maximum drift rate
 Timestamps:
 Used to denote at which time an event occurred
03/19/14 19
PHYSICAL CLOCKS
 Based on actual time:
• Physical clocks keep track of physical time.
• In distributed systems based on actual time it is
necessary to keep individual computer clocks
synchronized.
• The clocks can be synchronized to global time
(external synchronization), or to each other
(internal synchronization).
03/19/14 20
PHYSICAL CLOCKS
 Examples:
 Performing events at an exact time (turn lights
on/off, lock/unlock gates)
 Sorting of events (for security, for profiling)
 Tracking (tracking a moving object with separate
cameras)
03/19/14 21
SYNCHRONIZING PHYSICAL CLOCKS
 External synchronization:
 Clocks synchronize to an external time source
 Synchronize with UTC every δ seconds
 Internal synchronization:
 Clocks synchronize locally
 Only synchronized with each other
 Time server:
 Server that has the correct time
 Server that calculates the correct time
03/19/14 22
What can go wrong?
 Updating a replicated database: Update 1 adds
100 rupees to an account; Update 2 calculates and
adds 1% interest to the same account.
 Clock readings might differ between computers!
 Inconsistent state
 Can use Lamport’s algorithm to totally order the
updates
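The inconsistency on this slide can be shown with a few lines of arithmetic. In this sketch the starting balance of 1000 rupees is an assumed value for illustration; the point is only that applying the same two updates in different orders leaves the replicas with different balances:

```python
# Hypothetical replicas of the same account, starting from the same balance.
# Because their clock readings disagreed, they apply Update 1 (+100 rupees)
# and Update 2 (+1% interest) in opposite orders.
balance = 1000.00

replica_a = round((balance + 100) * 1.01, 2)   # deposit first, then interest
replica_b = round(balance * 1.01 + 100, 2)     # interest first, then deposit

print(replica_a)   # 1111.0
print(replica_b)   # 1110.0
# The replicas diverge by 1 rupee: a total order on the updates
# (e.g. via Lamport timestamps) is needed to keep them consistent.
```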
03/19/14 23
Computer Clock
 We need to measure time accurately:
 to know the time an event occurred at a computer
 Algorithms for clock synchronization are useful for:
 concurrency control based on timestamp ordering
 authenticity of requests, e.g., in Kerberos, etc.
 Each computer in a DS has its own internal clock
03/19/14 24
Clock Synchronization
 There exists a time server receiving signals from a
UTC source
 Cristian’s algorithm
 There is no UTC source available
 Berkeley’s algorithm
 Exact time does not matter!
 Lamport’s algorithm
03/19/14 25
Clock Sync. Algorithm: Cristian's
 Cristian’s algorithm requires clients to periodically
synchronize with a central time server (typically a
server with a UTC receiver)
 Every computer periodically asks the “time server”
for the current time
 The server responds ASAP with the current time CUTC
 The client sets its clock to CUTC
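A refinement often taught with Cristian's algorithm is that the client should not use CUTC raw: by the time the reply arrives it is already about half a round trip old. A minimal sketch of the client-side estimate, with the server time and round-trip time as simulated inputs rather than real network measurements:

```python
def cristian_estimate(server_time, rtt):
    """Client-side step of Cristian's algorithm: the server's reply
    C_UTC is assumed to be about rtt/2 seconds stale on arrival,
    so the client sets its clock to C_UTC + rtt/2."""
    return server_time + rtt / 2.0

# Simulated exchange: the server replied "100.0 s" and the client
# measured a 0.02 s round trip, so the clock is set to ~100.01 s.
print(cristian_estimate(100.0, 0.02))
```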
03/19/14 26
Berkeley Algorithm
 An algorithm for internal synchronization of a group
of computers
 A master polls to collect clock values from the
others (slaves)
 The master uses round-trip times to estimate the
slaves’ clock values
 It takes an average
 It sends the required adjustment to the slaves
 If the master fails, a new master can be elected to
take over
03/19/14 27
The Berkeley Algorithm
a) The time daemon asks all the other machines for
their clock values
b) The machines answer
c) The time daemon takes an average and tells
everyone how to adjust their clock
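Steps (a)–(c) can be condensed into one function. This is a simplified sketch: the round-trip-time estimation mentioned on the previous slide is omitted, and the clock readings are made-up values for illustration:

```python
def berkeley_adjustments(master_clock, slave_clocks):
    """One round of the Berkeley algorithm (simplified: round-trip
    delay compensation omitted). Returns the adjustment each node,
    master first, should apply to its own clock."""
    clocks = [master_clock] + slave_clocks
    average = sum(clocks) / len(clocks)          # step (c): take an average
    return [average - c for c in clocks]         # per-node corrections

# Master reads 180 s, slaves read 185 s and 175 s; the average is 180 s,
# so the adjustments are 0, -5 and +5 seconds respectively.
print(berkeley_adjustments(180.0, [185.0, 175.0]))   # [0.0, -5.0, 5.0]
```

Note that the master sends each slave a relative *adjustment*, not the absolute time, so the correction is unaffected by how long the reply takes to arrive.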
03/19/14 30
LOGICAL CLOCKS
 The relative ordering of events is more important
than actual physical time in many applications
 In a single process the ordering of events (e.g.,
state changes) is trivial
 In a distributed system, however, besides local
ordering of events, all processes must also agree
on the ordering of causally related events (e.g., the
sending and receiving of a single message)
 Local ordering:
 The system consists of N processes pi
 If pi observes e before e′, we have e → e′
 Global ordering:
 Lamport’s happened-before relation →
 The smallest relation such that:
1. Local ordering implies e → e′
2. For every message m, send(m) → receive(m)
3. Transitivity: e → e′ and e′ → e′′ implies e → e′′
03/19/14 31
ORDERING EVENTS
 Event ordering is linked with the concept of causality:
 Saying that event a happened before event b is the
same as saying that event a causally affects event b
 If events a and b happen on processes that do not
exchange any data, their exact ordering is not
important
03/19/14 32
Relation “has happened before”
 The smallest relation satisfying the three conditions:
 If a and b are events in the same process and a
comes before b, then a → b
 If a is the sending of a message by a process and b
its receipt by another process, then a → b
 If a → b and b → c, then a → c
03/19/14 33
Relation “has happened before”
 We cannot always order events: the relation “has
happened before” is only a partial order, not a total
order
 If a did not happen before b, it cannot causally
affect b
03/19/14 35
Lamport Algorithm
 Each process increments its local clock between
any two successive events
 Each message contains a timestamp
 Upon receiving a message, if the received
timestamp is ahead, the receiver fast-forwards its
clock to be one more than the sending time
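The three rules above fit in a small class. A minimal sketch (the class and method names are our own, not from the slides):

```python
class LamportClock:
    """Minimal Lamport logical clock for one process."""
    def __init__(self):
        self.time = 0

    def local_event(self):
        # Rule 1: tick between successive events.
        self.time += 1
        return self.time

    def send(self):
        # Tick, then attach the timestamp to the outgoing message.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Fast-forward past the sender's timestamp if it is ahead.
        self.time = max(self.time, msg_time) + 1
        return self.time

p = LamportClock()
q = LamportClock()
t = p.send()          # p's clock: 1; message carries timestamp 1
q.local_event()       # q's clock: 1
print(q.receive(t))   # max(1, 1) + 1 = 2
```

This guarantees that if event a happened before event b, then a's timestamp is smaller than b's (though not the converse).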
03/19/14 39
Tutorial: Lamport and Vector Clocks
 For the figure below, write down the Lamport clock
and vector clock value at each event.
03/19/14 40
Tutorial: Lamport and Vector Clocks
 For the figure below, write down the Lamport clock
and vector clock value at each event.
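For checking the vector-clock part of these exercises, the standard update rules can be sketched as follows (a minimal implementation, assuming the usual rules: tick your own entry on local/send events, and take the element-wise maximum before ticking on receive):

```python
class VectorClock:
    """Minimal vector clock for n processes; i is this process's index."""
    def __init__(self, n, i):
        self.v = [0] * n
        self.i = i

    def tick(self):
        # Local or send event: increment own component only.
        self.v[self.i] += 1
        return list(self.v)

    def receive(self, other):
        # Merge with the message's vector (element-wise max), then tick.
        self.v = [max(a, b) for a, b in zip(self.v, other)]
        self.v[self.i] += 1
        return list(self.v)

p0 = VectorClock(2, 0)
p1 = VectorClock(2, 1)
m = p0.tick()            # p0 sends; message carries [1, 0]
print(p1.receive(m))     # [1, 1]
```

Unlike Lamport clocks, comparing two vectors component-wise recovers the full happened-before relation: a causally precedes b exactly when a's vector is less than or equal to b's in every component and strictly smaller in at least one.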
03/19/14 42
Determining Global States
 Global State:
“The global state of a distributed computation is
the set of local states of all individual processes
involved in the computation plus the state of the
communication channels.”
03/19/14 43
Determining Global States
 Knowing the global state of a distributed system
may be useful for many reasons. For example:
 How do we know whether the system is deadlocked,
or whether a distributed computation has correctly
terminated?
03/19/14 44
GLOBAL STATE
Determining global properties: often difficult, but
essential for some applications.
 Distributed garbage collection: Do any references
exist to a given object? Collects remote server
objects that are no longer referenced by any client
in the network.
 Distributed deadlock detection: Do processes wait
in a cycle for each other?
 Distributed algorithm termination detection: Did a
set of processes finish all activity? (Consider
messages in transit)
All these properties are stable: once they occur,
they do not disappear without outside intervention.
03/19/14 45
Analysis of GLOBAL STATE (Distributed Snapshot)
 A distributed snapshot reflects a state in which the
distributed system might have been.
 A snapshot reflects a consistent global state
 Example: if we have recorded that a process Q has
received a message from another process P,
 then we should also have recorded that process P
actually sent that message
 Otherwise, the snapshot will contain recordings of
messages that have been received but never sent,
 which indicates an inconsistent global state
 The reverse condition (P has sent a message that Q
has not yet received) is allowed
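The consistency rule on this slide reduces to a simple set inclusion. A sketch, using hypothetical message identifiers recorded in a snapshot:

```python
def consistent_snapshot(sent, received):
    """A snapshot is consistent when every message recorded as received
    was also recorded as sent. Messages sent but not yet received are
    allowed: they are simply 'in transit' on the channel."""
    return set(received) <= set(sent)

# Q received m1, and P's recorded state shows m1 and m2 were sent:
# m2 is merely in transit, so this snapshot is consistent.
print(consistent_snapshot(sent={"m1", "m2"}, received={"m1"}))        # True

# Q received m2, but no state records m2 being sent: inconsistent.
print(consistent_snapshot(sent={"m1"}, received={"m1", "m2"}))        # False
```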
03/19/14 46
More on States
 Process state:
 memory state + register state + open files + kernel
buffers + …, or
 application-specific info like transactions completed,
functions executed, etc.
 Channel state:
 “Messages in transit”, i.e., those messages that have
been sent but not yet received
03/19/14 47
Why is global state determination difficult in
Distributed Systems?
 Distributed state: we have to collect information
that is spread across several machines!
 Only local knowledge: a process in the computation
does not know the state of other processes.
03/19/14 48
Difficulties due to Non-Determinism
 Deterministic computation:
 At any point in the computation there is at most one
event that can happen next.
 Non-deterministic computation:
 At any point in the computation there can be more
than one event that can happen next.
03/19/14 49
Deterministic Computation Example
A variant of the producer-consumer example
 Producer code:
while (1) {
produce m;
send m;
wait for ack;
}
 Consumer code:
while (1) {
recv m;
consume m;
send ack;
}
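The pseudocode above can be run as an actual two-thread program. In this sketch the infinite loops are capped at three messages so the run terminates; because each send waits for an ack, at every point exactly one event can happen next, which is what makes the computation deterministic:

```python
import queue
import threading

data = queue.Queue()   # channel: producer -> consumer
acks = queue.Queue()   # channel: consumer -> producer
N = 3                  # bounded here so the example terminates

def producer():
    for m in range(N):
        data.put(m)      # produce m; send m
        acks.get()       # wait for ack (blocks until consumer replies)

consumed = []

def consumer():
    for _ in range(N):
        consumed.append(data.get())   # recv m; consume m
        acks.put("ack")               # send ack

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(consumed)   # [0, 1, 2], regardless of thread scheduling
```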
03/19/14 58
Election Algorithms
 In distributed computing, leader election is the
process of designating a single process as the
organizer of some task distributed among several
computers (nodes).
 Before the task is begun, all network nodes are
unaware which node will serve as the “leader,” or
coordinator, of the task.
 After a leader election algorithm has been run,
however, each node throughout the network
recognizes a particular, unique node as the task
leader.
03/19/14 59
Election Algorithms
 Many distributed algorithms require one process to
act as coordinator, initiator, or to otherwise perform
some special role.
 If all processes in the distributed system have equal
characteristics, there is no inherent way to select
one of them to be special.
 Assume each process has some “ID” that is a
number, and that every process knows the process
number of every other process.
 “Elect” the process with the highest number as
leader.
03/19/14 60
Election Algorithms
 Election
 Goal: ensure that the election achieves an
agreement among all the processes
 the P with the highest id is elected
03/19/14 61
The Bully Algorithm
 When a process notices that the coordinator is no
longer responding to requests, it initiates an
election.
 A process, P, holds an election as follows:
1. P sends an ELECTION message to all processes
with higher numbers.
2. If no one responds, P wins the election and
becomes coordinator.
3. If one of the higher-ups answers, it takes over. P’s
job is done.
4. Finally only one P (the new coordinator) will
remain, and it will inform the others by sending a
msg.
5. If a process is restarted, its first action is to trigger
an election.
03/19/14 62
The Bully Algorithm (1)
 Process 4 notices 7 is down
 Process 4 holds an election
 Processes 5 and 6 respond, telling 4 to stop
 Now 5 and 6 each hold an election
03/19/14 63
The Bully Algorithm (2)
a) Process 6 tells process 5 to stop
b) Process 6 wins and tells everyone
 Eventually the “biggest” process (the bully) wins
 If process 7 comes back up, it starts an election
again
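The outcome of the bully election on these slides can be sketched as a recursion over process ids. This models only who ends up coordinator, not the actual message exchange; `alive` is our own name for the set of live process ids:

```python
def bully_election(initiator, alive):
    """Bully election sketch: the initiator contacts all higher-numbered
    live processes; if any answers, it takes over the election, so
    ultimately the highest live id wins and becomes coordinator."""
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator                       # no one responds: initiator wins
    return bully_election(max(higher), alive)  # a higher-up takes over

# Coordinator 7 has crashed; process 4 notices and starts an election.
# Processes 5 and 6 respond; process 6 wins, as on the slides.
print(bully_election(4, alive={1, 2, 3, 4, 5, 6}))   # 6
```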
03/19/14 64
A Ring Algorithm
 Processes are ordered in such a way that each
process knows who its successor is.
 When a P suspects a coordinator fault:
 It sends an ELECTION msg containing its Pid to the
next process (its successor); if the successor is
down, the sender skips over it and the msg is sent
to the next process on the ring
 Each member of the network receives the msg and
propagates it to the next, adding its own id
03/19/14 65
A Ring Algorithm
 When a msg returns to the P who sent it (verified by
inspecting the list), the msg is turned into a
COORDINATOR msg and circulated, to report:
 New coordinator: the P on the list with the highest id
 Multiple messages can circulate over the network
03/19/14 66
A Ring Algorithm
 Once around, the msg changes to COORDINATOR
(highest id)
• Even if two ELECTIONs started at once, everyone
will pick the same leader
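One full circulation of the ELECTION message can be simulated in a few lines. A sketch under the same assumptions as the slides (ids collected into the message, dead successors skipped; the ring order and live set are made-up example values):

```python
def ring_election(start, ring, alive):
    """Ring election sketch. `ring` lists process ids in ring order.
    The ELECTION message circulates from `start`, skipping dead
    processes and collecting live ids, until it returns to the
    initiator; the highest collected id becomes coordinator."""
    n = len(ring)
    i = ring.index(start)
    collected = [start]
    j = (i + 1) % n
    while ring[j] != start:
        if ring[j] in alive:
            collected.append(ring[j])   # each live member appends its id
        j = (j + 1) % n                 # dead processes are skipped over
    return max(collected)               # reported in the COORDINATOR msg

# Ring 2-3-4-5-6-7 with coordinator 7 down; process 5 starts the election.
print(ring_election(5, ring=[2, 3, 4, 5, 6, 7], alive={2, 3, 4, 5, 6}))  # 6
```

Because every initiator collects the same set of live ids, two simultaneous elections still name the same coordinator.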
03/19/14 67
CONCURRENCY
Concurrency in a Non-Distributed System:
 Concurrency problems are familiar from the study
of OS and multithreaded programming:
 Preventing race conditions (race conditions occur
when concurrent processes access shared
resources)
 Critical sections
 In a non-distributed system the above problems are
solved by implementing mutual exclusion using
local primitives such as
• Locks
• Semaphores
03/19/14 68
 Concurrency in a Distributed System:
Distributed systems introduce more concurrency
challenges due to:
 No directly shared resources (e.g., memory)
 No global state
 No global clock
 No centralized algorithms
 The presence of communication delays
  • 69. 03/19/14 69 DISTRIBUTED MUTUAL EXCLUSION  Concurrent access to distributed resources  Must prevent race conditions during critical regions Requirements: 1. Safety: At most one process may execute the critical section at a time 2. Liveness: Requests to enter and exit the critical section ultimately succeed 3. Ordering: Requests are processed in happened-before ordering
  • 70. 03/19/14 70 METHOD 1: CENTRAL SERVER Simplest approach: use a central server that controls the entering and exiting of critical sections.  Requests to enter and exit a critical section are sent to a lock server (or coordinator)  Permission to enter is granted by receiving a token  When the critical section is left, the token is returned to the server
  • 71. 03/19/14 71 A Centralized Algorithm a) Process 1 asks the coordinator for permission to enter a critical region. Permission is granted b) Process 2 then asks permission to enter the same critical region. The coordinator does not reply. (Or, it can say “denied”) c) When process 1 exits the critical region, it tells the coordinator, which then replies to 2.
  • 72. Cont...  This scheme is easy to implement, but it does not scale well due to the central authority.  Moreover, it is vulnerable to failure of the central server. 03/19/14 72
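The central-server scheme can be sketched as a coordinator holding one token and a FIFO queue of waiting processes. An illustrative Python sketch (class and method names are assumptions, not from the slides):

```python
from collections import deque

class Coordinator:
    """Central lock server: grants one token, queues further requests."""

    def __init__(self):
        self.holder = None        # process currently holding the token
        self.waiting = deque()    # FIFO queue of blocked requesters

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "granted"
        self.waiting.append(pid)  # no reply yet: the requester blocks
        return "queued"

    def release(self, pid):
        assert pid == self.holder
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder        # next process to be granted, if any

c = Coordinator()
print(c.request(1))  # -> granted
print(c.request(2))  # -> queued (coordinator stays silent)
print(c.release(1))  # -> 2 (coordinator now replies to process 2)
```

The single coordinator makes the safety argument trivial, which is exactly why its failure or overload brings the whole scheme down.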
  • 73. 03/19/14 73 A Distributed Algorithm a) Processes 0 and 2 want to enter the same critical region at the same moment. b) Process 1 doesn’t want to enter the critical region, so it says “OK”. Process 0 has the lowest timestamp, so it wins, and queues up an “OK” for 2. c) When process 0 is done, it sends an OK to 2, which can now enter the critical region.
  • 74. 03/19/14 74 METHOD 2: TOKEN RING Implementation:  All processes are organised in a logical ring structure  A token message is forwarded along the ring  Before entering the critical section, a process has to wait until the token arrives  It must retain the token until the critical section is left
  • 75. 03/19/14 75 Properties:  Ring limits scalability  Token messages consume bandwidth  Failing nodes or channels can break the ring (token might be lost)
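As an illustration, the order in which requests are served as the token circulates can be simulated. The function below is a hypothetical Python sketch, not slide material:

```python
def token_ring_order(ring, requests, start_holder=0):
    """Order in which critical-section requests are served.

    ring: process ids in ring order; requests: ids wanting the critical
    section; start_holder: id currently holding the token.
    """
    served = []
    pending = set(requests)
    i = ring.index(start_holder)
    while pending:
        p = ring[i % len(ring)]
        if p in pending:
            served.append(p)     # hold the token, enter CS, then pass it on
            pending.discard(p)
        i += 1                   # token moves to the successor
    return served

print(token_ring_order([0, 1, 2, 3], {3, 1}))  # -> [1, 3]
```

Note the serving order is ring position, not request time, and the token keeps circulating even when nobody wants the critical section, which is the bandwidth cost mentioned above.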
  • 76. 03/19/14 76 METHOD 3: USING MULTICASTS AND LOGICAL CLOCKS  Algorithm by Ricart & Agrawala: an algorithm for distributed mutual exclusion that makes use of logical clocks.  Each process pi maintains a Lamport clock, and all processes must be able to communicate pairwise  At any moment, each process is in one of three states: 1. Released: Outside of critical section 2. Wanted: Waiting to enter critical section 3. Held: Inside critical section
  • 77. 03/19/14 77  Process behaviour: 1. If a process wants to enter a critical section, it •multicasts a request message (Li, pi) and •waits until it has received a reply from every process 2. If a process is in Released, it immediately replies to any request to enter the critical section 3. If a process is in Held, it delays replying until it is finished with the critical section 4. If a process is in Wanted, it replies to a request immediately only if the requesting timestamp is smaller than the one in its own request
  • 78. 03/19/14 78 Properties:  Better scalability, but multicast leads to increasing overhead.  Unfortunately, failure of any peer process can deny all other processes entry to the critical section.
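The reply rule above can be written as a single function. An illustrative Python sketch, using (Lamport timestamp, pid) pairs so that timestamp ties are broken by process id:

```python
def should_reply_now(state, own_request, incoming):
    """Ricart & Agrawala reply rule for one process.

    state: 'RELEASED', 'HELD' or 'WANTED'.
    own_request / incoming: (lamport_timestamp, pid) pairs; tuple
    comparison breaks timestamp ties by pid, giving a total order.
    """
    if state == "RELEASED":
        return True                  # not interested: reply immediately
    if state == "HELD":
        return False                 # defer until we leave the critical section
    # WANTED: reply now only if the incoming request is older (smaller)
    return incoming < own_request

print(should_reply_now("WANTED", own_request=(8, 0), incoming=(12, 2)))  # -> False
```

A deferred reply is sent once the process leaves the critical section, which is what enforces the happened-before ordering of entries.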
  • 79. 03/19/14 79 MUTUAL EXCLUSION: A COMPARISON Messages Exchanged: messages per entry/exit of critical section o Centralised: 3 (two to enter and one to leave) o Ring: 1 → ∞ o Multicast: 2(n−1) Delay: delay before entering critical section o Centralised: 2 o Ring: 0 → n−1 o Multicast: 2(n−1) Reliability: o Centralised: coordinator crashes o Ring: lost token, process crashes o Multicast: any process crashes
  • 81. 03/19/14 81 TRANSACTIONS  Transaction:  Comes from the database world: the concept of a transaction originates from the database community as a mechanism to maintain the consistency of databases.  Defines a sequence of operations  Atomic in the presence of multiple clients and failures  Versus Mutual Exclusion:  Mutual exclusion protects shared data against simultaneous access  Transactions allow multiple data items to be modified in a single atomic action  Transaction Model:  Operations: • Begin Transaction • End Transaction • Read • Write  End of Transaction: • Commit • Abort
  • 82. 03/19/14 82 ACID PROPERTIES  Atomic: must follow an "all or nothing" rule. A transaction is atomic if, when one part of it fails, the entire transaction fails and the database state is left unchanged.  Consistent: concurrent transactions will not produce inconsistent results  Isolated: transactions do not interfere with each other, i.e. no intermediate state of a transaction is visible outside it; other operations cannot access data that has been modified by a transaction that has not yet completed  Durable: after a commit, results are permanent (even if the server or hardware fails)
  • 83. 03/19/14 83 CLASSIFICATION OF TRANSACTIONS  Flat: sequence of operations that satisfies ACID  Nested: hierarchy of transactions  Distributed: (flat) transaction that is executed on distributed data  Flat Transactions:  Simple  On failure, all changes are undone BeginTransaction accountA -= 100; accountB += 50; accountC += 25; accountD += 25; EndTransaction
  • 84. 03/19/14 84 NESTED TRANSACTION  Subdivide a complex transaction into many smaller sub- transactions  As a result, if a single sub-transaction fails, the work that went into the others is not necessarily wasted  Example:  Booking a flight • Sydney → Manila • Manila → Amsterdam • Amsterdam → Toronto  What to do?  Abort the whole transaction  Partially commit the transaction and try an alternative for the aborted part  Commit the nonaborted parts of the transaction
  • 85. 03/19/14 85 Cont… Subtransactions and parent transactions  A parent transaction may commit even if some subtransactions aborted  If the parent transaction aborts, all subtransactions abort
  • 86. 03/19/14 86 Cont…  Subtransactions:  Subtransaction can abort any time  Subtransaction cannot commit until parent ready to commit  Subtransaction either aborts or commits provisionally  Provisionally committed subtransaction reports provisional commit list, containing all its provisionally committed subtransactions, to parent  On abort, all subtransactions in that list are aborted.
  • 87. 03/19/14 87 TRANSACTION IMPLEMENTATION  Private Workspace:  Perform all tentative operations on a shadow copy of the server state  Atomically swap with main copy on Commit  Discard shadow on Abort.
  • 88. 03/19/14 88 Cont…  Writeahead Log:  In-place update with writeahead logging  The log is used to undo the changes when a transaction Aborts
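A write-ahead (undo) log can be sketched in a few lines: each in-place update first records the old value, and on Abort the log is replayed in reverse. An illustrative Python sketch (the helper names are assumptions); for simplicity a recorded value of None means "key was absent":

```python
def begin():
    return []                        # fresh, empty undo log

def write(db, log, key, value):
    log.append((key, db.get(key)))   # log the old value BEFORE updating in place
    db[key] = value

def abort(db, log):
    for key, old in reversed(log):   # undo in reverse order
        if old is None:              # simplification: None means "was absent"
            db.pop(key, None)
        else:
            db[key] = old
    log.clear()

db = {"x": 1}
log = begin()
write(db, log, "x", 99)
write(db, log, "y", 7)
abort(db, log)
print(db)  # -> {'x': 1}
```

On Commit the log can simply be discarded, since all updates were already applied in place; a real write-ahead log would also persist entries before the update for crash recovery.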
  • 89. 03/19/14 89 CONCURRENCY CONTROL  It is often necessary to allow transactions to occur simultaneously (for example, to allow multiple travel agents to simultaneously reserve seats on the same flight).  Due to the consistency and isolation properties of transactions, concurrent transactions must not be allowed to interfere with each other.  Concurrency control algorithms for transactions guarantee that multiple transactions can be executed simultaneously while providing a result that is the same as if they were executed one after another.
  • 90. 03/19/14 90 CONCURRENCY CONTROL  Simultaneous Transactions:  Clients accessing bank accounts  Travel agents booking flights  Problems:  Simultaneous transactions may interfere • Lost update • Inconsistent retrieval  Consistency and Isolation require that there is no interference  Concurrency Control Algorithms:  Guarantee that multiple transactions can be executed simultaneously while still being isolated.  As though transactions executed one after another
  • 91. 03/19/14 91 CONFLICTS AND SERIALISABILITY  Conflict: operations (from the same, or different, transactions) that operate on the same data  Read-write conflict: one of the operations is a write  Write-write conflict: more than one operation is a write  Define a Schedule of operations:  Total ordering (interleaving) of operations  Legal schedules provide results as though the transactions were serialised (i.e., performed one after another) (serial equivalence)
  • 92. 03/19/14 92 MANAGING CONCURRENCY  Transaction Managers:
  • 93. 03/19/14 93 Dealing with Concurrency:  Locking  Timestamp Ordering  Optimistic Control
  • 94. 03/19/14 94 LOCKING  Pessimistic approach: prevent illegal schedules  The locking algorithms require that each transaction obtains a lock from a scheduler process before performing a read or a write operation.  The scheduler is responsible for granting and releasing locks in such a way that legal schedules are produced.  Ensures that only valid schedules result
  • 95. 03/19/14 95 TWO PHASE LOCKING (2PL)  The most widely used locking approach is two-phase locking (2PL).  A lock for a data item is granted to a process if no conflicting locks are held by other processes (otherwise the requesting process blocks until the lock is available again)  A lock is not released until the operation has been executed by the data manager  No more locks are granted after a release has taken place  This results in a growing phase of the transaction, where locks are acquired, and a shrinking phase, where locks are released  All schedules formed using 2PL are serialisable
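The growing/shrinking discipline can be sketched as a lock table that refuses new acquisitions once a transaction has released any lock. An illustrative Python sketch with a single exclusive lock per item and no blocking or deadlock handling:

```python
class TwoPhaseLocks:
    """Single exclusive lock per item; once a transaction releases any
    lock, it enters its shrinking phase and may not acquire new ones."""

    def __init__(self):
        self.locks = {}         # item -> transaction holding its lock
        self.shrinking = set()  # transactions that already released a lock

    def acquire(self, tx, item):
        if tx in self.shrinking:
            raise RuntimeError("2PL violation: acquire after release")
        if self.locks.get(item, tx) != tx:
            return False        # conflicting lock held: caller must wait
        self.locks[item] = tx
        return True

    def release(self, tx, item):
        assert self.locks.get(item) == tx
        del self.locks[item]
        self.shrinking.add(tx)  # shrinking phase begins for tx

s = TwoPhaseLocks()
print(s.acquire("T1", "x"))  # -> True
print(s.acquire("T2", "x"))  # -> False (T1 holds the lock)
s.release("T1", "x")         # T1 may no longer acquire any lock
```

A real scheduler would distinguish read and write locks and block rather than return False; the sketch only enforces the two-phase rule itself.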
  • 96. 03/19/14 96 PROBLEMS WITH LOCKING  Deadlock  Cascaded Aborts: If a transaction (T1) reads the results of a write of another transaction (T2) that is subsequently aborted, then the first transaction (T1) will also have to be aborted.
  • 97. TIMESTAMP ORDERING  Each transaction has a unique timestamp (ts(Ti))  Each operation (TS(W), TS(R)) receives its transaction’s timestamp  Each data item has two timestamps: • read timestamp: tsRD(x) - the timestamp of the last read of x • write timestamp: tsWR(x) - the timestamp of the last committed write of x  Timestamp ordering rule: • a write request is only valid if TS(W) > tsWR and TS(W) ≥ tsRD • a read request is only valid if TS(R) > tsWR  Conflict resolution: • The operation with the lower timestamp is executed first
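The timestamp ordering rules can be written directly as checks against tsRD(x) and tsWR(x). An illustrative Python sketch (the dictionary layout for a data item is an assumption):

```python
def try_write(ts_tx, item):
    """Write valid only if TS(W) > tsWR(x) and TS(W) >= tsRD(x)."""
    if ts_tx > item["tsWR"] and ts_tx >= item["tsRD"]:
        item["tsWR"] = ts_tx
        return True
    return False                     # too late: transaction must abort/restart

def try_read(ts_tx, item):
    """Read valid only if TS(R) > tsWR(x)."""
    if ts_tx > item["tsWR"]:
        item["tsRD"] = max(item["tsRD"], ts_tx)
        return True
    return False

x = {"tsRD": 0, "tsWR": 0}
print(try_read(5, x))   # -> True, tsRD(x) is now 5
print(try_write(4, x))  # -> False: a later transaction already read x
```

A rejected operation means the transaction arrived "too late" relative to the serial order fixed by the timestamps, so it aborts and restarts with a new timestamp.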
  • 98. 03/19/14 98 OPTIMISTIC CONTROL  Assume that no conflicts will occur.  Detect & resolve conflicts at commit time  A transaction is split into three phases: • Working (using shadow copies) • Validation • Update  In the working phase operations are carried out on shadow copies with no attempt to detect or order conflicting operations.
  • 99. 03/19/14 99 Cont….  In the validation phase the scheduler attempts to detect conflicts with other transactions that were in progress during the working phase.  If conflicts are detected, then one of the conflicting transactions is aborted.
  • 100. 03/19/14 100 Cont…  In the update phase, assuming that the transaction was not aborted, all the updates made on the shadow copy are made permanent.
  • 101. 03/19/14 101 DISTRIBUTED TRANSACTIONS  In a distributed system, a single transaction will, in general, involve several servers:  the transaction may require several services,  the transaction may involve files stored on different servers  All servers must agree to Commit or Abort, and do this atomically.
  • 105. 03/19/14 105 DISTRIBUTED LOCKING  Distributed 2PL:  Data can be replicated  Scheduler on each machine responsible for locking own data  Read lock: contact any replica  Write lock: contact all replicas
  • 106. 03/19/14 106 ATOMICITY AND DISTRIBUTED TRANSACTIONS  Distributed Transaction Organisation:  Each distributed transaction has a coordinator, the server handling the initial BeginTransaction call  Coordinator maintains a list of workers, i.e. other servers involved in the transaction  Each worker needs to know coordinator  Coordinator is responsible for ensuring that whole transaction is atomically committed or aborted • Require a distributed commit protocol.
  • 107. 03/19/14 107 DISTRIBUTED ATOMIC COMMIT  Transaction may only be able to commit when all workers are ready to commit (e.g. validation in optimistic concurrency)  Distributed commit requires at least two phases: 1. Voting phase: all workers vote on commit, coordinator then decides whether to commit or abort. 2. Completion phase: all workers commit or abort according to decision. Basic protocol is called two-phase commit (2PC)
  • 108. 03/19/14 108 Two-phase commit: Coordinator 1. Sends CanCommit, receives yes, abort; 2. Sends DoCommit, DoAbort If yes messages are received from all workers, the coordinator sends a DoCommit message to all workers, which then implement the Commit. However, if any of the workers replies with abort instead of yes, the coordinator sends a DoAbort message to all workers to trigger a collective Abort.
  • 109. 03/19/14 109 Two-phase commit: Worker 1. Receives CanCommit, sends yes, abort; 2. Receives DoCommit, DoAbort A worker on receiving a CanCommit message answers with either a yes or abort message, depending on the workers internal state. Afterward, it commits or aborts depending on the instructions from the coordinator. The only extra case is where the worker is in an abort cycle and immediately answers a CanCommit with an abort.
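The two phases can be sketched in-process by modelling each worker as a vote function. A minimal, failure-free Python illustration (not an implementation of any real system):

```python
def two_phase_commit(workers):
    """One round of 2PC; each worker is a function returning its vote."""
    votes = [vote() for vote in workers]               # phase 1: CanCommit
    decision = "commit" if all(v == "yes" for v in votes) else "abort"
    return decision                                    # phase 2: DoCommit / DoAbort

print(two_phase_commit([lambda: "yes", lambda: "yes"]))    # -> commit
print(two_phase_commit([lambda: "yes", lambda: "abort"]))  # -> abort
```

All the real difficulty of 2PC lies in what this sketch omits: persisting votes and the decision, and the timeout handling on the following slides.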
  • 110. 03/19/14 110  Failures can be due to:  Server/host failures: • If a host fails in the 2PC protocol, then, after being restarted, it aborts all transactions (a restarting worker aborts all its transactions)  Failure of communication channels/network: • The coordinator aborts after a timeout.
  • 111. 03/19/14 111 Two-phase commit with timeouts: Worker  On timeout, sends GetDecision  Whenever a worker that is ready to commit (having answered yes to CanCommit) times out while waiting for the decision from the coordinator, it sends a special GetDecision message that triggers the coordinator to resend the decision on whether to Commit or Abort.  GetDecision messages are handled by the coordinator once the decision between committing and aborting has been made.
  • 112. 03/19/14 112 Two-phase commit with timeouts: Coordinator  On timeout re-sends CanCommit, On GetDecision repeats decision
  • 113. 03/19/14 113 Limitations (2PC)  Once a node has voted “yes”, it cannot change its mind, even if it crashes.  An atomic state update is needed to ensure the “yes” vote is stable.  If the coordinator crashes, all workers may be blocked.  Different protocols can be used (e.g. three-phase commit);  in some circumstances workers can obtain the result from other workers.
  • 114. 03/19/14 114 SUMMARY  Distributed Algorithms:  Timing models  Time:  Clock synchronisation  Logical clocks  Vector clocks  Concurrency Control:  Distributed mutual exclusion  Distributed transactions  2PC