The CAP Theorem : Brewer’s Conjecture and
   the Feasibility of Consistent, Available,
      Partition-Tolerant Web Services

           Aleksandar Bradic, Vast.com


             nosqlsummer | Belgrade
                 28 August 2010
CAP Theorem




  Conjecture by Eric Brewer at PODC 2000 :
     It is impossible for a web service to provide the following three
      guarantees :
            Consistency
            Availability
            Partition-tolerance
CAP Theorem




      Consistency - all nodes should see the same data at the same
       time
      Availability - node failures do not prevent survivors from
       continuing to operate
      Partition-tolerance - the system continues to operate despite
       arbitrary message loss
CAP Theorem



  CAP Theorem :
    It is impossible for a web service to provide the following three
     guarantees :
             Consistency
             Availability
             Partition-tolerance
       A distributed system can satisfy any two of these
        guarantees at the same time but not all three
CAP Theorem




  CAP Theorem
      Conjecture since 2000
       Established as theorem in 2002: Gilbert, Seth, and Nancy
        Lynch. Brewer’s conjecture and the feasibility of consistent,
        available, partition-tolerant web services. ACM SIGACT News,
        v. 33 issue 2, 2002, p. 51-59.
CAP Theorem




  A distributed system can satisfy any two of CAP guarantees at the
  same time but not all three:
       Consistency + Availability
       Consistency + Partition Tolerance
       Availability + Partition Tolerance
Consistency + Availability



   Examples:
        Single-site databases
        Cluster databases
        LDAP
        xFS file system
   Traits:
        2-phase commit
        cache validation protocols
Consistency + Partition Tolerance



   Examples:
        Distributed databases
        Distributed Locking
        Majority protocols
   Traits:
        Pessimistic locking
        Make minority partitions unavailable
Availability + Partition Tolerance



   Examples:
        Coda
        Web caching
        DNS
   Traits:
        expiration/leases (see the sketch below)
        conflict resolution
        optimistic
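
One way to read the expiration/lease trait: a node keeps serving a cached value until its lease runs out, staying available during a partition at the cost of possibly stale reads. A minimal sketch (illustrative only; the class and names are hypothetical, not from the slides):

```python
import time

class LeasedEntry:
    """A cached value that stays readable until its lease expires.

    Serving the cached copy during a partition keeps the node
    available, at the cost of possibly returning stale data (AP).
    """
    def __init__(self, value, lease_seconds):
        self.value = value
        self.expires_at = time.monotonic() + lease_seconds

    def read(self):
        if time.monotonic() < self.expires_at:
            return self.value  # possibly stale, but available
        raise KeyError("lease expired; revalidate with the origin")

entry = LeasedEntry("v0", lease_seconds=30)
print(entry.read())  # serves "v0" even if the origin is unreachable
```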
Enterprise System CAP Classification



       RDBMS : CA (Master/Slave replication, Sharding)
       Amazon Dynamo : AP (Read-repair, application hooks)
       Terracotta : CA (Quorum vote, majority partition survival)
       Apache Cassandra : AP (Partitioning, Read-repair)
       Apache ZooKeeper : CP (Consensus protocol)
       Google BigTable : CA
       Apache CouchDB : AP
Some Techniques:




       Consistent Hashing
       Vector Clocks (sketched below)
       Sloppy Quorum
       Merkle trees
       Gossip-based protocols
       ...
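
As one example from this list, a minimal vector-clock comparison sketch (illustrative; the function and node names are hypothetical). Vector clocks let replicas in an AP system decide whether one update supersedes another or whether the two are concurrent and need conflict resolution:

```python
def vc_compare(a, b):
    """Compare two vector clocks (dicts of node -> counter).

    Returns '<', '>', '=' or '||' (concurrent). Concurrent updates
    are the ones a replica must reconcile via conflict resolution.
    """
    keys = set(a) | set(b)
    less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    more = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    if less and more:
        return "||"  # neither clock dominates: concurrent writes
    return "<" if less else ">" if more else "="

print(vc_compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 3}))  # '||'
print(vc_compare({"n1": 1}, {"n1": 2}))                    # '<'
```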
Proof of the CAP Theorem




   Gilbert, Seth, and Nancy Lynch. Brewer’s conjecture and the
   feasibility of consistent, available, partition-tolerant web services.
   ACM SIGACT News, v. 33 issue 2, 2002, p. 51-59.
Formal Model




  Formalization of the notion of Consistency, Availability and
  Partition Tolerance :
       Atomic Data Object
       Available Data Object
       Partition Tolerance
Atomic Data Objects




   Atomic/Linearizable Consistency:
         There must exist a total order on all operations such that each
          operation looks as if it were completed at a single instant
         This is equivalent to requiring requests on the distributed
          shared memory to act as if they were executing on a single
          node, responding to operations one at a time (as in the
          sketch below)
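
A minimal sketch of this single-node picture (hypothetical names): a lock serializes all operations on a register, so each read or write appears to take effect at a single instant, yielding the required total order.

```python
import threading

class AtomicRegister:
    """Linearizable semantics in miniature: the lock serializes
    operations, so each one takes effect at a single instant."""
    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def write(self, value):
        with self._lock:
            self._value = value

    def read(self):
        with self._lock:
            return self._value

reg = AtomicRegister("v0")
reg.write("v1")
print(reg.read())  # 'v1': the read follows the completed write
```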
Available Data Objects


       For a distributed system to be continuously available, every
        request received by a non-failing node in the system must
        result in a response
       That is, any algorithm used by the service must eventually
        terminate
       (In some ways, this is a weak definition of availability : it puts
        no bounds on how long the algorithm may run before
        terminating, and therefore allows unbounded computation)
       (On the other hand, when qualified by the need for partition
        tolerance, this can be seen as a strong definition of availability
        : even when severe network failures occur, every request must
        terminate)
Partition Tolerance

        In order to model partition tolerance, the network is allowed to
         lose arbitrarily many messages sent from one node to another
        When a network is partitioned, all messages sent from nodes
         in one component of the partition to another component are
         lost.
        The atomicity requirement implies that every response will be
         atomic, even though arbitrary messages sent as part of the
         algorithm might not be delivered
        The availability requirement therefore implies that every node
         receiving a request from a client must respond, even though
         arbitrary messages that are sent may be lost
        Partition Tolerance : No set of failures less than total network
         failure is allowed to cause the system to respond incorrectly
Formal Framework




       Asynchronous Network Model
       Partially Synchronous Network Model
Asynchronous Networks




       There is no clock
       Nodes must make decisions based only on messages received
        and local computation
Asynchronous Networks : Impossibility Result




   Theorem 1 : It is impossible in the asynchronous network model to
   implement a read/write data object that guarantees the following
   properties:
        Availability
        Atomic consistency
   in all fair executions (including those in which messages are lost)
Asynchronous Networks : Impossibility Result


   Proof (by contradiction) :
        Assume an algorithm A exists that meets the three criteria :
         atomicity, availability and partition tolerance
        We construct an execution of A in which there exists a
         request that returns an inconsistent response
        Assume that the network consists of at least two nodes. Thus
         it can be divided into two disjoint, non-empty sets G1 , G2
        Assume all messages between G1 and G2 are lost.
        If a write occurs in G1 and a read occurs in G2 , then the read
         operation cannot return the result of the earlier write operation.
Asynchronous Networks : Impossibility Result


   Formal proof:
        Let v0 be the initial value of the atomic object
        Let α1 be the prefix of an execution of A in which a single
         write of a value not equal to v0 occurs in G1 , ending with the
         termination of the write operation.
        Assume that no other client requests occur in either G1 or G2 .
        Assume that no messages from G1 are received in G2 and no
         messages from G2 are received in G1
        We know that the write operation will complete (by the
         availability requirement)
Asynchronous Networks : Impossibility Result



       Let α2 be the prefix of an execution in which a single read
        occurs in G2 and no other client requests occur, ending with
        the termination of the read operation
       During α2 no messages from G2 are received in G1 and no
        messages from G1 are received in G2
       We know that the read must return a value (by the
        availability requirement)
       The value returned by this execution must be v0 as no write
        operation has occurred in α2
Asynchronous Networks : Impossibility Result


       Let α be an execution beginning with α1 and continuing with
        α2 . To the nodes in G2 , α is indistinguishable from α2 , as all
        the messages from G1 to G2 are lost (in both α1 and α2 that
        together make up α), and α1 does not include any client
        requests to nodes in G2 .
       Therefore, in the α execution - the read request (from α2 )
        must still return v0 .
       However, the read request does not begin until after the write
        request (from α1 ) has completed
       This therefore contradicts the atomicity property,
        proving that no such algorithm exists (a small simulation of
        this construction follows)
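
A small simulation of this construction (hypothetical names; cross-partition delivery is simply dropped) makes the contradiction concrete: the read in G2 must respond to stay available, and the only value it can respond with is the stale initial value v0.

```python
class Group:
    """One side of the partitioned network, holding a replica."""
    def __init__(self, value):
        self.value = value

def send(dst, value, partitioned):
    """Deliver a replication message unless the partition drops it."""
    if partitioned:
        return False  # all messages between G1 and G2 are lost
    dst.value = value
    return True

g1, g2 = Group("v0"), Group("v0")

# alpha_1: a write of "v1" completes in G1; replication to G2 is lost.
g1.value = "v1"
send(g2, g1.value, partitioned=True)

# alpha_2: a read in G2 must respond (availability) and can only
# return its local value -- the initial value v0.
print("read returned", g2.value, "after a completed write of", g1.value)
# The read began after the write completed yet returned the old
# value, contradicting atomic consistency.
```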
Asynchronous Networks : Impossibility Result




   Corollary 1.1:
   It is impossible in the asynchronous network model to implement a
   read/write data object that guarantees the following properties:
        Availability - in all fair executions
        Atomic consistency - in fair executions in which no messages
         are lost
Asynchronous Networks : Impossibility Result


   Proof:
        The main idea is that in the asynchronous model, an
         algorithm has no way of determining whether a message has
         been lost or has been arbitrarily delayed in the transmission
         channel
        Therefore, if there existed an algorithm that guaranteed
         atomic consistency in executions in which no messages were
         lost, there would exist an algorithm that guaranteed atomic
         consistency in all executions.
        This would violate Theorem 1
Asynchronous Networks : Impossibility Result




       Assume that there exists an algorithm A that always
        terminates, and guarantees atomic consistency in fair
        executions in which all messages are delivered
       Theorem 1 implies that A does not guarantee atomic
        consistency in all fair executions, so there exists some fair
        execution α of A in which some response is not atomic
Asynchronous Networks : Impossibility Result



       At some finite point in execution α, the algorithm A returns
        a response that is not atomic.
       Let α′ be the prefix of α ending with the invalid response.
       Next, extend α′ to a fair execution α′′ in which all
        messages are delivered
       The execution α′′ is now a fair execution in which all
        messages are delivered
       However, this execution is not atomic.
       Therefore no such algorithm A exists
Solutions in the Asynchronous Model




   While it is impossible to provide all three properties : atomicity,
   availability and partition tolerance, any two of these properties can
   be achieved:
        Atomic, Partition Tolerant
        Atomic, Available
        Available, Partition Tolerant
Atomic, Partition Tolerant
        If availability is not required, it is easy to achieve atomic data
         and partition tolerance
        The trivial system that ignores all requests meets these
         requirements
        Stronger liveness criterion : if all the messages in an execution
         are delivered, the system is available and all operations terminate
        A simple centralized algorithm meets these requirements : a
         single designated node maintains the value of an object (see
         the sketch after this list)
        A node receiving a request forwards it to the designated
         node, which sends a response. When the acknowledgement is
         received, the node sends a response to the client
        Many distributed databases provide this guarantee, especially
         algorithms based on distributed locking or quorums : if certain
         failure patterns occur, the liveness condition is weakened and
         the service no longer returns responses. If there are no failures,
         then liveness is guaranteed
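
A minimal sketch of this centralized algorithm (hypothetical names): if the forwarded request or the acknowledgement is lost, the client simply never receives a response, so atomicity and partition tolerance are preserved while availability is given up.

```python
class DesignatedNode:
    """The single designated node holding the authoritative value."""
    def __init__(self, initial):
        self.value = initial

    def handle(self, op, value=None):
        if op == "write":
            self.value = value
        return self.value  # response (read) or acknowledgement (write)

def forward(designated, op, value=None, delivered=True):
    """A non-designated node forwarding a client request."""
    if not delivered:
        return None  # message lost: no response, the operation blocks
    return designated.handle(op, value)

c = DesignatedNode("v0")
print(forward(c, "write", "v1"))            # 'v1': acknowledged
print(forward(c, "read", delivered=False))  # None: client gets no answer
```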
Atomic, Available




       If there are no partitions - it is possible to provide atomic,
        available data
       A centralized algorithm with a single designated node for
        maintaining the value of an object meets these requirements
Available, Partition Tolerant



        It is possible to provide high availability and partition
         tolerance if atomic consistency is not required
        If there are no consistency requirements, the service can
         trivially return v0 , the initial value, in response to every request
        It is possible to provide weakened consistency in an available,
         partition-tolerant setting
        Web caches are one example of a weakly consistent network
Partially Synchronous Model
       In the real world, most networks are not purely asynchronous
       If we allow each node in the network to have a clock, it is
        possible to build a more powerful service
       In the partially synchronous model, every node has a clock, and
        all clocks increase at the same rate
       However, the clocks themselves are not synchronized, in that
        they might display different values at the same real time
       In effect, the clocks act as timers : local state variables that
        the process can observe to measure how much time has passed
       A local timer can be used to schedule an action to occur a
        certain interval of time after some other event
       Furthermore, assume that every message is either delivered
        within a given, known time tmsg , or it is lost
       Also, every node processes a received message within a given,
        known time tlocal
Partially Synchronous Networks : Impossibility Result




   Theorem 2: It is impossible in the partially synchronous network
   model to implement a read/write data object that guarantees the
   following properties:
        Availability
        Atomic consistency
   in all executions (even those in which messages are lost)
Partially Synchronous Networks : Impossibility Result



   Proof:
        The same methodology as in the case of Theorem 1 is used
        We divide the network into two components {G1 , G2 } and
         construct an admissible execution in which a write happens in
         one component, followed by a read operation in the other
         component.
        This read operation can be shown to return inconsistent data
Partially Synchronous Networks : Impossibility Result
        We construct execution α1 : a single write request and
         acknowledgement occur in G1 , and all messages between the
         two components {G1 , G2 } are lost
        Let α2′ be an execution that begins with a long interval of
         time during which no client requests occur
        This interval must be at least as long as the entire duration of
         α1
        Then append to α2′ the events of α2 : a single read request
         and response in G2 , assuming all messages between the two
         components are lost
        Finally, we construct α by superimposing the two executions
         α1 and α2′
        The long interval of time in α2′ ensures that the write request
         completes before the read request begins
        However, the read request returns the initial value, rather
         than the new value written by the write request, violating
         atomic consistency
Solutions in the Partially Synchronous Model
   Corollary 1.1(Asynchronous network model):
   It is impossible in the asynchronous network model to implement a
   read/write data object that guarantees the following properties:
        Availability - in all fair executions
        Atomic consistency - in fair executions in which no messages
         are lost
   In the partially synchronous model, the analogue of Corollary 1.1
   does not hold
        The proof of this corollary depends on nodes being unaware of
         when a message is lost
        There are partially synchronous algorithms that will return
         atomic data when all messages in an execution are delivered
         (i.e. there are no partitions) - and will only return inconsistent
         data when messages are lost
Solutions in the Partially Synchronous Model

        An example of such an algorithm is the centralized protocol
         with a single object state store, modified to time out lost
         messages :
        On a read (or write) request, a message is sent to the central
         node
        If a response from the central node is received, then the node
         delivers the requested data (or an acknowledgement)
        If no response is received within 2 ∗ tmsg + tlocal , then the
         node concludes that the message was lost
        The client is then sent a response : either the best known
         value of the local node (for a read operation) or an
         acknowledgement (for a write operation). In this case, atomic
         consistency may be violated. (The timeout rule is sketched
         below)
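
The bound 2 ∗ tmsg + tlocal is simply the worst-case round trip: one message to the central node (tmsg), its processing (tlocal), and one reply (tmsg). A minimal sketch of the timeout rule (hypothetical names and constants):

```python
T_MSG, T_LOCAL = 0.05, 0.01    # known bounds on delivery and processing
TIMEOUT = 2 * T_MSG + T_LOCAL  # request (tmsg) + processing (tlocal) + reply (tmsg)

def request(reply, local_best):
    """Timeout rule of the modified centralized protocol (sketch).

    `reply` is the central node's answer, or None if nothing arrived
    within TIMEOUT; in that case the node concludes the message was
    lost and answers from local state -- staying available, but
    possibly violating atomic consistency."""
    return reply if reply is not None else local_best

print(request("v1", "v0"))   # 'v1': consistent, reply arrived in time
print(request(None, "v0"))   # 'v0': stale but available after timeout
```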
Weaker Consistency Conditions
       While it is useful to guarantee that atomic data will be
        returned in executions in which all messages are delivered, it is
        equally important to specify what happens in executions in
        which some of the messages are lost
       We discuss a possible weaker consistency condition that allows
        stale data to be returned when there are partitions, yet places
        formal requirements on the quality of the stale data returned
       This consistency guarantee will require availability and atomic
        consistency in executions in which no messages are lost, and is
        therefore impossible to guarantee in the asynchronous model
        as a result of Corollary 1.1
       In the partially synchronous model it often makes sense to
        base guarantees on how long an algorithm has had to rectify a
        situation
       This consistency model ensures that if messages are delivered,
        then eventually some notion of atomicity is restored
Weaker Consistency Conditions



       In an atomic execution, we define a partial order of the read
        and write operations and then require that if one operation
        begins after another one ends, the former does not precede
        the latter in the partial order.
       We define a weaker guarantee, t-Connected Consistency,
        which defines a partial order in a similar manner, but only
        requires that one operation not precede another if there is an
        interval between the operations in which all messages are
        delivered
Weaker Consistency Conditions
   A timed execution α of a read-write object is t-Connected
   Consistent if two criteria hold. First, in executions in which no
   messages are lost, the execution is atomic. Second, in executions
   in which messages are lost, there exists a partial order P on the
   operations in α such that :
       1. P orders all write operations, and orders all read operations
        with respect to the write operations
       2. The value returned by every read operation is exactly the
        one written by the previous write operation in P , or the initial
        value if there is no such previous write in P (a mini-check of
        this criterion is sketched after this list)
       3. The order in P is consistent with the order of read and
        write requests submitted at each node
       4. Assume that there exists an interval of time longer than t
        in which no messages are lost. Further, assume an operation
        θ completes before the interval begins, and another operation
        φ begins after the interval ends. Then φ does not precede θ
        in the partial order P
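
For illustration, a mini-check of criterion 2 (hypothetical; P is simplified here to a plain total order of operations): every read must return the value of the write that precedes it in P, or the initial value.

```python
def satisfies_criterion_2(ordered_ops, initial="v0"):
    """Check criterion 2 over a (simplified, totally ordered) P:
    each read returns the previous write's value, or the initial
    value if no write precedes it."""
    last_written = initial
    for kind, value in ordered_ops:
        if kind == "write":
            last_written = value
        elif value != last_written:  # a read returning the wrong value
            return False
    return True

print(satisfies_criterion_2([("write", "v1"), ("read", "v1")]))  # True
print(satisfies_criterion_2([("write", "v1"), ("read", "v0")]))  # False
```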
Weaker Consistency Conditions



   t-Connected Consistency
        This guarantee allows for some stale data when messages are
         lost, but provides a time limit on how long it takes for
         consistency to return, once the partition heals
        This definition can be generalized to provide consistency
         guarantees when only some of the nodes are connected and
         when connections are available only some of the time
Weaker Consistency Conditions


   A variant of the "centralized algorithm" is t-Connected Consistent.
   Assume node C is the centralized node. The algorithm behaves as
   follows:
        read at node A : A sends a request to C for the most
         recent value. If A receives a response from C within time
         2 ∗ tmsg + tlocal , it saves the value and returns it to the client.
         Otherwise, A concludes that a message was lost and it returns
         the value with the highest sequence number that has ever
         been received from C , or the initial value if no value has yet
         been received from C . (When a client read request occurs at
         C it acts like any other node, sending messages to itself)
Weaker Consistency Conditions

       write at A : A sends a message to C with the new value. A
        waits 2 ∗ tmsg + tlocal , or until it receives an acknowledgement
        from C , and then sends an acknowledgement to the client. At
        this point, either C has learned of the new value, or a
        message was lost, or both events occurred. If A concludes that
        a message was lost, it periodically retransmits the value to C
        (along with all values lost during earlier write operations) until
        it receives an acknowledgement from C . (As in the case of
        read operations, when a client write request occurs at C it
        acts like any other node, sending messages to itself)
       New value is received at C : C serializes the write requests
        that it hears about by assigning them consecutive integer
        tags. Periodically C broadcasts the latest value and sequence
        number to all other nodes. (A sketch of this behaviour follows)
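
A compressed sketch of the non-central node's behaviour (hypothetical names): timed-out reads fall back to the highest-sequence-number value ever received from C, and unacknowledged writes are queued for periodic retransmission.

```python
class TConnectedNode:
    """Sketch of a non-central node in the t-Connected algorithm."""
    def __init__(self):
        self.best_seq, self.best_value = -1, "v0"  # initial value v0
        self.unacked = []                          # writes to retransmit to C

    def on_broadcast(self, seq, value):
        """Periodic broadcast from C: keep the newest value seen."""
        if seq > self.best_seq:
            self.best_seq, self.best_value = seq, value

    def read(self, reply=None):
        """Return C's reply if it arrived within 2 * tmsg + tlocal;
        otherwise the highest-sequence value ever received from C."""
        if reply is not None:
            self.on_broadcast(*reply)
            return reply[1]
        return self.best_value

    def write(self, value, acked):
        """Acknowledge the client either way; if C's ack was lost,
        queue the value for periodic retransmission."""
        if not acked:
            self.unacked.append(value)
        return "ack"

n = TConnectedNode()
n.on_broadcast(3, "v3")
print(n.read())                                # 'v3' after a timeout
print(n.write("v4", acked=False), n.unacked)   # ack ['v4']
```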
Weaker Consistency Conditions
   Theorem 4 : The modified centralized algorithm is t-Connected
   Consistent
        Proof: In executions in which no messages are lost, the
         operations are atomic. An execution is atomic if every
         operation acts as if it is executed at a single instant. In this
         case, that single instant occurs when C processes the
         operation. C serializes the operations, ensuring atomic
         consistency in executions in which all messages are delivered.
        We then examine executions in which messages are lost. The
         partial order P is constructed as follows. Write operations are
         ordered by the sequence number assigned by the central
         node. Each read operation is sequenced after the write
         operation whose value it returns. It is clear by construction
         that the partial order P satisfies criteria 1 and 2 of the
         definition of t-Connected Consistency. As the algorithm
         handles requests in the order received, criterion 3 is also
         clearly true
Weaker Consistency Conditions




       In showing that the partial order respects criterion 4, there are
        four cases : write followed by read, write followed by write,
        read followed by read, and read followed by write. Let time t
        be long enough for a write operation to complete (and for C
        to assign a sequence number to the new value), and for one
        of the periodic broadcasts from C to occur.
Weaker Consistency Conditions


   1. write followed by read :
        Assume a write occurs at Aw , after which an interval of time
         longer than t passes in which all messages are delivered. After
         this, a read is requested at some node. By the end of the
         interval, two things have happened. First, Aw has notified the
         central node of the new value, and the write operation has
         been assigned a sequence number. Second, the central node
         has rebroadcast that value (or later value in the partial order)
         to all other nodes during one of the periodic broadcasts. As a
         result, the read operation does not return an earlier value,
         and therefore it must come after the write in the partial order
         P
Weaker Consistency Conditions


   2. write followed by write :
        Assume a write occurs at Aw after which an interval of time
         longer than t passes in which all messages are delivered. After
         this a write is requested at some node. As in the previous
         case, by the end of the interval in which messages are
         delivered , the central node has assigned a sequence number
         to the write operation at Aw . As a result, the later write
         operation is sequenced by the central node after the first write
         operation. Therefore the second write comes after the first
         write in the partial order P.
Weaker Consistency Conditions


   3. read followed by read :
        Assume a read operation occurs at Br , after which an interval
         of time longer than t passes in which all messages are
         delivered. After this a read is requested at some node. Let φ
         be the write operation whose value the first read
         operation at Br returns. By the end of the interval in which
         messages are delivered, the central node has assigned a
         sequence number to φ and has broadcast the value of φ (or a
         later value in the partial order) to all other nodes. As a result,
         the second read operation does not return a value earlier in
         the partial order than φ. Therefore the second read operation
         does not precede the first in the partial order P.
Weaker Consistency Conditions


   4. read followed by write :
        Assume a read operation occurs at Br , after which an interval
         of time longer than t passes in which all messages are
         delivered. After this a write is requested at some node. Let φ
         be the write operation whose value the first read operation at
         Br returns. By the end of the interval in which messages are
         delivered, the central node has assigned a sequence number to
         φ and as a result all write operations beginning after the
         interval are serialized after φ. Therefore the write operation
         does not precede the read operation in the partial order P.
Weaker Consistency Conditions




       Therefore P satisfies criterion 4 of the definition, and this
        algorithm is t-Connected Consistent.
Conclusion


       We have shown that it is impossible to reliably provide atomic,
        consistent data when there are partitions in the network
       It is feasible, however, to achieve any two of the three
        properties : consistency, availability and partition tolerance
       In an asynchronous model, when no clocks are available, the
        impossibility result is fairly strong : it is impossible to provide
        consistent data, even allowing stale data to be returned when
        messages are lost
       However, in partially synchronous models it is possible to
        achieve a practical compromise between consistency and
        availability

More Related Content

What's hot

Inter-Process Communication in distributed systems
Inter-Process Communication in distributed systemsInter-Process Communication in distributed systems
Inter-Process Communication in distributed systemsAya Mahmoud
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)Prashant Gupta
 
Integrating Big Data Technologies
Integrating Big Data TechnologiesIntegrating Big Data Technologies
Integrating Big Data TechnologiesDATAVERSITY
 
Distributed Query Processing
Distributed Query ProcessingDistributed Query Processing
Distributed Query ProcessingMythili Kannan
 
Infrastructure as a Service ( IaaS)
Infrastructure as a Service ( IaaS)Infrastructure as a Service ( IaaS)
Infrastructure as a Service ( IaaS)Ravindra Dastikop
 
15. Transactions in DBMS
15. Transactions in DBMS15. Transactions in DBMS
15. Transactions in DBMSkoolkampus
 
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...Gyanmanjari Institute Of Technology
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesYoav Francis
 
Distributed system architecture
Distributed system architectureDistributed system architecture
Distributed system architectureYisal Khan
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingankur bhalla
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component rebeccatho
 
What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremRahul Jain
 

What's hot (20)

Temporal databases
Temporal databasesTemporal databases
Temporal databases
 
Inter-Process Communication in distributed systems
Inter-Process Communication in distributed systemsInter-Process Communication in distributed systems
Inter-Process Communication in distributed systems
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
 
Integrating Big Data Technologies
Integrating Big Data TechnologiesIntegrating Big Data Technologies
Integrating Big Data Technologies
 
Distributed Query Processing
Distributed Query ProcessingDistributed Query Processing
Distributed Query Processing
 
Infrastructure as a Service ( IaaS)
Infrastructure as a Service ( IaaS)Infrastructure as a Service ( IaaS)
Infrastructure as a Service ( IaaS)
 
Database System Architectures
Database System ArchitecturesDatabase System Architectures
Database System Architectures
 
15. Transactions in DBMS
15. Transactions in DBMS15. Transactions in DBMS
15. Transactions in DBMS
 
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and Practices
 
CAP Theorem
CAP TheoremCAP Theorem
CAP Theorem
 
Distributed database
Distributed databaseDistributed database
Distributed database
 
Distributed system architecture
Distributed system architectureDistributed system architecture
Distributed system architecture
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Hadoop Architecture
Hadoop ArchitectureHadoop Architecture
Hadoop Architecture
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Distributed Coordination-Based Systems
Distributed Coordination-Based SystemsDistributed Coordination-Based Systems
Distributed Coordination-Based Systems
 
Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
 
What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP Theorem
 

Similar to The CAP Theorem

Impromptu ideas in respect of v2 v and other
Impromptu ideas in respect of v2 v and otherImpromptu ideas in respect of v2 v and other
Impromptu ideas in respect of v2 v and otherHarshit Srivastava
 
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...Continuent
 
Ballpark Figure Algorithms for Data Broadcast in Wireless Networks
Ballpark Figure Algorithms for Data Broadcast in Wireless NetworksBallpark Figure Algorithms for Data Broadcast in Wireless Networks
Ballpark Figure Algorithms for Data Broadcast in Wireless NetworksEditor IJCATR
 
OSI reference model
OSI reference modelOSI reference model
OSI reference modelshanthishyam
 
MC0085 – Advanced Operating Systems - Master of Computer Science - MCA - SMU DE
MC0085 – Advanced Operating Systems - Master of Computer Science - MCA - SMU DEMC0085 – Advanced Operating Systems - Master of Computer Science - MCA - SMU DE
MC0085 – Advanced Operating Systems - Master of Computer Science - MCA - SMU DEAravind NC
 
Chapter V-Connecting LANs, Backbone Networks, and Virtual LANs.pptx
Chapter V-Connecting LANs, Backbone Networks, and Virtual LANs.pptxChapter V-Connecting LANs, Backbone Networks, and Virtual LANs.pptx
Chapter V-Connecting LANs, Backbone Networks, and Virtual LANs.pptxWillianApaza
 
Chapter 2 LAN redundancy
Chapter 2   LAN  redundancyChapter 2   LAN  redundancy
Chapter 2 LAN redundancyJosue Wuezo
 
Importance of sliding window protocol
Importance of sliding window protocolImportance of sliding window protocol
Importance of sliding window protocoleSAT Journals
 
Importance of sliding window protocol
Importance of sliding window protocolImportance of sliding window protocol
Importance of sliding window protocoleSAT Publishing House
 
Mncs 16-09-1주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...
Mncs 16-09-1주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...Mncs 16-09-1주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...
Mncs 16-09-1주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...Seung-gyu Byeon
 
Please answer all 3 question below.  Thank you1.  Give a short ans.docx
Please answer all 3 question below.  Thank you1.  Give a short ans.docxPlease answer all 3 question below.  Thank you1.  Give a short ans.docx
Please answer all 3 question below.  Thank you1.  Give a short ans.docxajoy21
 

Similar to The CAP Theorem (20)

Network Layer
Network LayerNetwork Layer
Network Layer
 
Impromptu ideas in respect of v2 v and other
Impromptu ideas in respect of v2 v and otherImpromptu ideas in respect of v2 v and other
Impromptu ideas in respect of v2 v and other
 
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
 
Opnet lab 3 solutions
Opnet lab 3 solutionsOpnet lab 3 solutions
Opnet lab 3 solutions
 
Ballpark Figure Algorithms for Data Broadcast in Wireless Networks
Ballpark Figure Algorithms for Data Broadcast in Wireless NetworksBallpark Figure Algorithms for Data Broadcast in Wireless Networks
Ballpark Figure Algorithms for Data Broadcast in Wireless Networks
 
Computer network
Computer networkComputer network
Computer network
 
Mit6 02 f12_chap18
Mit6 02 f12_chap18Mit6 02 f12_chap18
Mit6 02 f12_chap18
 
Transport laye
Transport laye Transport laye
Transport laye
 
Replication in Distributed Systems
Replication in Distributed SystemsReplication in Distributed Systems
Replication in Distributed Systems
 
OSI reference model
OSI reference modelOSI reference model
OSI reference model
 
Network Layer
Network LayerNetwork Layer
Network Layer
 
Network Layer
Network LayerNetwork Layer
Network Layer
 
Unit 2
Unit 2Unit 2
Unit 2
 
MC0085 – Advanced Operating Systems - Master of Computer Science - MCA - SMU DE
MC0085 – Advanced Operating Systems - Master of Computer Science - MCA - SMU DEMC0085 – Advanced Operating Systems - Master of Computer Science - MCA - SMU DE
MC0085 – Advanced Operating Systems - Master of Computer Science - MCA - SMU DE
 
Chapter V-Connecting LANs, Backbone Networks, and Virtual LANs.pptx
Chapter V-Connecting LANs, Backbone Networks, and Virtual LANs.pptxChapter V-Connecting LANs, Backbone Networks, and Virtual LANs.pptx
Chapter V-Connecting LANs, Backbone Networks, and Virtual LANs.pptx
 
Chapter 2 LAN redundancy
Chapter 2   LAN  redundancyChapter 2   LAN  redundancy
Chapter 2 LAN redundancy
 
Importance of sliding window protocol
Importance of sliding window protocolImportance of sliding window protocol
Importance of sliding window protocol
 
Importance of sliding window protocol
Importance of sliding window protocolImportance of sliding window protocol
Importance of sliding window protocol
 
Mncs 16-09-1주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...
Mncs 16-09-1주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...Mncs 16-09-1주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...
Mncs 16-09-1주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...
 
Please answer all 3 question below.  Thank you1.  Give a short ans.docx
Please answer all 3 question below.  Thank you1.  Give a short ans.docxPlease answer all 3 question below.  Thank you1.  Give a short ans.docx
Please answer all 3 question below.  Thank you1.  Give a short ans.docx
 

More from Aleksandar Bradic

Creativity, Language, Computation - Sketching in Hardware 2022
Creativity, Language, Computation - Sketching in Hardware 2022Creativity, Language, Computation - Sketching in Hardware 2022
Creativity, Language, Computation - Sketching in Hardware 2022Aleksandar Bradic
 
Technology and Power: Agency, Discourse and Community Formation
Technology and Power: Agency, Discourse and Community FormationTechnology and Power: Agency, Discourse and Community Formation
Technology and Power: Agency, Discourse and Community FormationAleksandar Bradic
 
Theses on AI User Experience Design - Sketching in Hardware 2020
Theses on AI User Experience Design - Sketching in Hardware 2020Theses on AI User Experience Design - Sketching in Hardware 2020
Theses on AI User Experience Design - Sketching in Hardware 2020Aleksandar Bradic
 
Go-to-Market & Scaling: An Engineering Perspective
Go-to-Market & Scaling: An Engineering PerspectiveGo-to-Market & Scaling: An Engineering Perspective
Go-to-Market & Scaling: An Engineering PerspectiveAleksandar Bradic
 
Human-Data Interfaces: A case for hardware-driven innovation
Human-Data Interfaces: A case for hardware-driven innovationHuman-Data Interfaces: A case for hardware-driven innovation
Human-Data Interfaces: A case for hardware-driven innovationAleksandar Bradic
 
Computational Complexity for Poets
Computational Complexity for PoetsComputational Complexity for Poets
Computational Complexity for PoetsAleksandar Bradic
 
Kolmogorov Complexity, Art, and all that
Kolmogorov Complexity, Art, and all thatKolmogorov Complexity, Art, and all that
Kolmogorov Complexity, Art, and all thatAleksandar Bradic
 
De Bruijn Sequences for Fun and Profit
De Bruijn Sequences for Fun and ProfitDe Bruijn Sequences for Fun and Profit
De Bruijn Sequences for Fun and ProfitAleksandar Bradic
 
Can Data Help Us Build Better Hardware Products ?
Can Data Help Us Build Better Hardware Products ? Can Data Help Us Build Better Hardware Products ?
Can Data Help Us Build Better Hardware Products ? Aleksandar Bradic
 
S4: Distributed Stream Computing Platform
S4: Distributed Stream Computing PlatformS4: Distributed Stream Computing Platform
S4: Distributed Stream Computing PlatformAleksandar Bradic
 

More from Aleksandar Bradic (11)

Creativity, Language, Computation - Sketching in Hardware 2022
Creativity, Language, Computation - Sketching in Hardware 2022Creativity, Language, Computation - Sketching in Hardware 2022
Creativity, Language, Computation - Sketching in Hardware 2022
 
Technology and Power: Agency, Discourse and Community Formation
Technology and Power: Agency, Discourse and Community FormationTechnology and Power: Agency, Discourse and Community Formation
Technology and Power: Agency, Discourse and Community Formation
 
Theses on AI User Experience Design - Sketching in Hardware 2020
Theses on AI User Experience Design - Sketching in Hardware 2020Theses on AI User Experience Design - Sketching in Hardware 2020
Theses on AI User Experience Design - Sketching in Hardware 2020
 
Go-to-Market & Scaling: An Engineering Perspective
Go-to-Market & Scaling: An Engineering PerspectiveGo-to-Market & Scaling: An Engineering Perspective
Go-to-Market & Scaling: An Engineering Perspective
 
Human-Data Interfaces: A case for hardware-driven innovation
Human-Data Interfaces: A case for hardware-driven innovationHuman-Data Interfaces: A case for hardware-driven innovation
Human-Data Interfaces: A case for hardware-driven innovation
 
Computational Complexity for Poets
Computational Complexity for PoetsComputational Complexity for Poets
Computational Complexity for Poets
 
Kolmogorov Complexity, Art, and all that
Kolmogorov Complexity, Art, and all thatKolmogorov Complexity, Art, and all that
Kolmogorov Complexity, Art, and all that
 
De Bruijn Sequences for Fun and Profit
De Bruijn Sequences for Fun and ProfitDe Bruijn Sequences for Fun and Profit
De Bruijn Sequences for Fun and Profit
 
Can Data Help Us Build Better Hardware Products ?
Can Data Help Us Build Better Hardware Products ? Can Data Help Us Build Better Hardware Products ?
Can Data Help Us Build Better Hardware Products ?
 
Supplyframe Engineering
Supplyframe EngineeringSupplyframe Engineering
Supplyframe Engineering
 
S4: Distributed Stream Computing Platform
S4: Distributed Stream Computing PlatformS4: Distributed Stream Computing Platform
S4: Distributed Stream Computing Platform
 

Recently uploaded

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 

Recently uploaded (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 

The CAP Theorem

  • 1. The CAP Theorem : Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services Aleksandar Bradic, Vast.com nosqlsummer | Belgrade 28 August 2010
  • 2. CAP Theorem Conjecture by Eric Brewer at PODC 2000 : It is impossible for a web service to provide following three guarantees : Consistency Availability Partition-tolerance
  • 3. CAP Theorem Consistency - all nodes should see the same data at the same time Availability - node failures do not prevent survivors from continuing to operate Partition-tolerance - the system continues to operate despite arbitrary message loss
  • 4. CAP Theorem CAP Theorem : It is impossible for a web service to provide following three guarantees : Consistency Availability Partition-tolerance A distributed system can satisfy any two of these guarantees at the same time but not all three
  • 5. CAP Theorem CAP Theorem Conjecture since 2000 Established as theorem in 2002: Lynch, Nancy, and Seth Gilbert. Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News, v. 33 issue 2, 2002, p. 51-59.
  • 6. CAP Theorem A distributed system can satisfy any two of CAP guarantees at the same time but not all three: Consistency + Availability Consistency + Partition Tolerance Availability + Partition Tolerance
  • 7. Consistency + Availability Examples: Single-site databases Cluster databases LDAP xFS file system Traits: 2-phase commit cache validation protocols
  • 8. Consistency + Partition Tolerance Examples: Distributed databases Distributed Locking Majority protocols Traits: Pessimistic locking Make minority partitions unavailable
  • 9. Availability + Partition Tolerance Examples: Coda Web caching DNS Traits: expiration/leases conflict resolution optimistic
  • 10. Enterprise System CAP Classification RDBMS : CA (Master/Slave replication, Sharding) Amazon Dynamo : AP (Read-repair, application hooks) Terracota : CA (Quorum vote, majority partition survival) Apache Cassandra : AP (Partitioning, Read-repair) Apache Zookeeper: AP (Consensus protocol) Google BigTable : CA Apache CouchDB : AP
  • 11. Enterprise System CAP Classification
  • 12. Some Techniques: Consistent Hashing Vector Clocks Sloppy Quorum Merkle trees Gossip-based protocols ...
  • 13. Proof of the CAP Theorem Lynch, Nancy, and Seth Gilbert. Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News, v. 33 issue 2, 2002, p. 51-59.
  • 14. Formal Model Formalization of the notion of Consistency, Availability and Partition Tolerance : Atomic Data Object Available Data Object Partition Tolerance
  • 15. Atomic Data Objects Atomic/Linearizable Consistency: There must exist a total order on all operation such that each operation looks as if it were completed at a single instant This is equivalent to requiring requests on the distributed shared memory to act as if they are executing on single node, responding to operations one at the time
  • 16. Available Data Objects For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response That is, any algorithm used by service must eventually terminate (In some ways, this is weak definition of availability : it puts no bounds on how long the algorithm may run before terminating, and therefore allows unbounded computation) (On the other hand, when qualified by the need for partition tolerance, this can be seen as a strong definition of availability : even when severe network failures occur, every request must terminate)
  • 17. Partition Tolerance In order to model partition tolerance, the network is allowed to lose arbitrary many messages sent from one node to another When a network is partitioned, all messages sent from nodes in one component of the partition to another component are lost. The atomicity requirement implies that every response will be atomic, even though arbitrary messages sent as part of the algorithm might not be delivered The availability requirement therefore implies that every node receiving request from a client must respond, even through arbitrary messages that are sent may be lost Partition Tolerance : No set of failures less than total network failure is allowed to cause the system to respond incorrectly
  • 18. Formal Framework Asynchronous Network Model Partially Synchronous Network Model
  • 19. Asynchronous Networks There is no clock Nodes must make decisions based only on messages received and local computation
  • 20. Asynchronous Networks : Impossibility Result Theorem 1 : It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: Availability Atomic consistency in all fair executions (including those in which messages are lost)
  • 21. Asynchronous Networks : Impossibility Result Proof (by contradiction) : Assume an algorithm A exists that meets the three criteria : atomicity, availability and partition tolerance We construct an execution of A in which there exists a request that returns and inconsistent response Assume that the network consists of at least two nodes. Thus it can be divided into two disjoint, non-empty sets G1 , G2 Assume all messages between G1 and G2 are lost. If a write occurs in G 1 and read oddurs in G2 , then the read operation cannot return the results of earlier write operation.
  • 22. Asynchronous Networks : Impossibility Result Formal proof: Let v0 be the initial value of the atomic object Let α1 be the prefix of an execution of A in which a single write of a value not equal to v0 occurs in G1 , ending with the termination of the write operation. assume that no other client requests occur in either G1 or G2 . assume that no messages from G1 are received in G2 and no messages from G2 are received in G1 we know that write operation will complete (by the availability requirement)
  • 23. Asynchronous Networks : Impossibility Result Let α2 be the prefix of an execution in which a single read occurs in G2 and no other client requests occur, ending with the termination of the read operation During α2 no messages from G2 are received in G1 and no messages from G1 are received in G 2 We know that the read must return a value (by the availability requirement) The value returned by this execution must be v0 as no write operation has occurred in α2
• 24. Asynchronous Networks : Impossibility Result Let α be an execution beginning with α1 and continuing with α2 . To the nodes in G2 , α is indistinguishable from α2 , as all the messages from G1 to G2 are lost (in both α1 and α2 , which together make up α), and α1 does not include any client requests to nodes in G2 . Therefore, in the execution α, the read request (from α2 ) must still return v0 . However, the read request does not begin until after the write request (from α1 ) has completed This contradicts the atomicity property, proving that no such algorithm exists
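A tiny replay of this construction, under assumed names (not the paper's formal model): a write completes in G1, the cross-partition replication message is lost, and the subsequent read in G2 is forced by availability to answer with the stale v0:

```python
V0 = "v0"  # initial value of the atomic object

class Node:
    def __init__(self, component):
        self.component = component
        self.value = V0

def replicate(src, dst, value, partitioned=True):
    # During the partition, cross-component messages are lost.
    if not partitioned or src.component == dst.component:
        dst.value = value

n1 = Node(component=1)   # a node in G1
n2 = Node(component=2)   # a node in G2

# alpha1: a write of "v1" completes in G1; replication to G2 is lost.
n1.value = "v1"
replicate(n1, n2, "v1", partitioned=True)

# alpha2: availability forces the read in G2 to respond anyway...
assert n2.value == V0    # stale read: atomic consistency is violated
```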
  • 25. Asynchronous Networks : Impossibility Result
  • 26. Asynchronous Networks : Impossibility Result Corollary 1.1: It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: Availability - in all fair executions Atomic consistency - in fair executions in which no messages are lost
• 27. Asynchronous Networks : Impossibility Result Proof: The main idea is that in the asynchronous model, an algorithm has no way of determining whether a message has been lost or has been arbitrarily delayed in the transmission channel Therefore, if there existed an algorithm that guaranteed atomic consistency in executions in which no messages were lost, there would exist an algorithm that guaranteed atomic consistency in all executions. This would violate Theorem 1
• 28. Asynchronous Networks : Impossibility Result Assume that there exists an algorithm A that always terminates, and guarantees atomic consistency in fair executions in which all messages are delivered Theorem 1 implies that A does not guarantee atomic consistency in all fair executions, so there exists some fair execution α of A in which some response is not atomic
• 29. Asynchronous Networks : Impossibility Result At some finite point in execution α, the algorithm A returns a response that is not atomic. Let α′ be the prefix of α ending with the invalid response. Next, extend α′ to a fair execution α′′ in which all messages are delivered The execution α′′ is then a fair execution in which all messages are delivered However, this execution is not atomic. Therefore no such algorithm A exists
• 30. Solutions in the Asynchronous Model While it is impossible to provide all three properties : atomicity, availability and partition tolerance, any two of these properties can be achieved: Atomic, Partition Tolerant Atomic, Available Available, Partition Tolerant
• 31. Atomic, Partition Tolerant If availability is not required, it is easy to achieve atomic data and partition tolerance The trivial system that ignores all requests meets these requirements Stronger liveness criterion : if all the messages in an execution are delivered, the system is available and all operations terminate A simple centralized algorithm meets these requirements : a single designated node maintains the value of an object A node receiving a request forwards the request to the designated node, which sends a response. When the acknowledgement is received, the node sends a response to the client Many distributed databases provide this guarantee, especially algorithms based on distributed locking or quorums : if certain failure patterns occur, the liveness condition is weakened and the service no longer returns responses. If there are no failures, then liveness is guaranteed
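A hedged sketch of that centralized (CP) protocol, with illustrative names and an explicit link_up flag standing in for a partition: when the designated node is unreachable, the request simply never completes, sacrificing availability while preserving atomicity:

```python
class DesignatedNode:
    """The single node that authoritatively holds the object's value."""

    def __init__(self, initial):
        self.value = initial

    def handle(self, op, value=None):
        if op == "write":
            self.value = value
            return "ack"
        return self.value  # read

class ForwardingNode:
    """Any other node: forwards every request to the designated node."""

    def __init__(self, designated, link_up=True):
        self.designated = designated
        self.link_up = link_up  # False models a partition from C

    def request(self, op, value=None):
        if not self.link_up:
            return None  # no response: the operation never terminates
        return self.designated.handle(op, value)
```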
• 32. Atomic, Available If there are no partitions, it is possible to provide atomic, available data The centralized algorithm with a single designated node maintaining the value of an object meets these requirements
• 33. Available, Partition Tolerant It is possible to provide high availability and partition tolerance if atomic consistency is not required If there are no consistency requirements, the service can trivially return v0 , the initial value, in response to every request It is possible to provide weakened consistency in an available, partition-tolerant setting Web caches are one example of a weakly consistent network
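The degenerate service mentioned on this slide fits in a few lines (a sketch; the names are illustrative): it is trivially available and partition-tolerant precisely because it never coordinates, and useless as a consistent store for the same reason:

```python
V0 = "v0"  # the fixed initial value

def handle_request(op, value=None):
    """Always respond, never coordinate: reads return v0 forever and
    writes are acknowledged but silently discarded."""
    return "ack" if op == "write" else V0
```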
• 34. Partially Synchronous Model In the real world, most networks are not purely asynchronous If we allow each node in the network to have a clock, it is possible to build a more powerful service In the partially synchronous model, every node has a clock and all clocks increase at the same rate However, the clocks themselves are not synchronized, in that they might display different values at the same real time In effect, the clocks act as timers : local state variables that the process can observe to measure how much time has passed A local timer can be used to schedule an action to occur a certain interval of time after some other event Furthermore, assume that every message is either delivered within a given, known time tmsg , or it is lost Also, every node processes a received message within a given, known time tlocal , and local processing takes zero time
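These two bounds are what make timeouts meaningful; in particular, the timeout used by the algorithms on the following slides follows directly from them (a one-line derivation, using the slide's symbols):

```latex
\[
  t_{\text{timeout}} \;=\; \underbrace{t_{msg}}_{\text{request}}
  \;+\; \underbrace{t_{local}}_{\text{processing at } C}
  \;+\; \underbrace{t_{msg}}_{\text{reply}}
  \;=\; 2\,t_{msg} + t_{local}
\]
```

If no reply arrives within 2 ∗ tmsg + tlocal, at least one message must have been lost.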
  • 35. Partially Synchronous Networks : Impossibility Result Theorem 2: It is impossible in the partially synchronous network model to implement a read/write data object that guarantees the following properties: Availability Atomic consistency in all executions (even those in which messages are lost)
• 36. Partially Synchronous Networks : Impossibility Result Proof: The same methodology as in the proof of Theorem 1 is used : we divide the network into two components {G1 , G2 } and construct an admissible execution in which a write happens in one component, followed by a read operation in the other component. This read operation can be shown to return inconsistent data
• 37. Partially Synchronous Networks : Impossibility Result We construct execution α1 : a single write request and acknowledgement occur in G1 , and all messages between the two components {G1 , G2 } are lost Let α2 be an execution that begins with a long interval of time during which no client requests occur This interval must be at least as long as the entire duration of α1 Then append to α2 the following events : a single read request and response in G2 , assuming all messages between the two components are lost Finally, we construct α by superimposing the two executions α1 and α2 The long interval of time in α2 ensures that the write request completes before the read request begins However, the read request returns the initial value, rather than the new value written by the write request, violating atomic consistency
• 38. Partially Synchronous Networks : Impossibility Result
• 39. Solutions in the Partially Synchronous Model Corollary 1.1 (asynchronous network model): It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: Availability - in all fair executions Atomic consistency - in fair executions in which no messages are lost In the partially synchronous model, the analogue of Corollary 1.1 does not hold The proof of this corollary depends on nodes being unaware of when a message is lost There are partially synchronous algorithms that will return atomic data when all messages in an execution are delivered (i.e. there are no partitions) - and will only return inconsistent data when messages are lost
• 40. Solutions in the Partially Synchronous Model An example of such an algorithm is the centralized protocol, with a single node storing the object state, modified to time out lost messages : On a read (or write) request, a message is sent to the central node If a response from the central node is received, then the node delivers the requested data (or an acknowledgement) If no response is received within 2 ∗ tmsg + tlocal , then the node concludes that the message was lost The client is then sent a response : either the best known value of the local node (for a read operation) or an acknowledgement (for a write operation). In this case, atomic consistency may be violated.
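A hedged sketch of the read path under this modification (the class, the queue-based link, and the stubbed transport are illustrative assumptions, not the paper's pseudocode):

```python
import queue

T_MSG, T_LOCAL = 0.05, 0.01        # the known model bounds, in seconds
TIMEOUT = 2 * T_MSG + T_LOCAL      # max round trip when nothing is lost

class TimeoutReader:
    def __init__(self, link):
        self.link = link           # queue.Queue carrying C's replies
        self.best_known = "v0"     # latest value ever heard from C

    def send_to_central(self, op):
        pass                       # placeholder transport; may lose msgs

    def read(self):
        self.send_to_central("read")
        try:
            value = self.link.get(timeout=TIMEOUT)
            self.best_known = value
            return value           # fresh value: the atomic path
        except queue.Empty:
            # A message was lost: answer anyway (availability),
            # accepting that the value may be stale.
            return self.best_known
```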
  • 41. Solutions in the Partially Synchronous Model
• 42. Weaker Consistency Conditions While it is useful to guarantee that atomic data will be returned in executions in which all messages are delivered, it is equally important to specify what happens in executions in which some of the messages are lost We discuss a possible weaker consistency condition that allows stale data to be returned when there are partitions, yet places formal requirements on the quality of the stale data returned This consistency guarantee will require availability and atomic consistency in executions in which no messages are lost, and is therefore impossible to guarantee in the asynchronous model as a result of Corollary 1.1 In the partially synchronous model it often makes sense to base guarantees on how long an algorithm has had to rectify a situation This consistency model ensures that if messages are delivered, then eventually some notion of atomicity is restored
• 43. Weaker Consistency Conditions In an atomic execution, we define a partial order on the read and write operations and then require that if one operation begins after another one ends, the former does not precede the latter in the partial order. We define a weaker guarantee, t-Connected Consistency, which defines a partial order in a similar manner, but only requires that one operation not precede another if there is an interval between the operations in which all messages are delivered
• 44. Weaker Consistency Conditions A timed execution α of a read-write object is t-Connected Consistent if two criteria hold. First, in executions in which no messages are lost, the execution is atomic. Second, in executions in which messages are lost, there exists a partial order P on the operations in α such that : 1. P orders all write operations, and orders all read operations with respect to the write operations 2. The value returned by every read operation is exactly the one written by the previous write operation in P, or the initial value if there is no such previous write in P 3. The order in P is consistent with the order of read and write requests submitted at each node 4. Assume that there exists an interval of time longer than t in which no messages are lost. Further, assume an operation θ completes before the interval begins, and another operation φ begins after the interval ends. Then φ does not precede θ in the partial order P
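Criterion 4 is the heart of the definition; restated compactly (same symbols as the slide):

```latex
\[
  \exists\, I : |I| > t,\ \text{no messages lost in } I,\
  \theta \text{ ends before } I,\ \varphi \text{ begins after } I
  \;\Longrightarrow\; \varphi \not<_P \theta
\]
```

In words: t bounds how long the system may keep serving data older than θ once connectivity has been restored.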
  • 45. Weaker Consistency Conditions t-Connected Consistency This guarantee allows for some stale data when messages are lost, but provides a time limit on how long it takes for consistency to return, once the partition heals This definition can be generalized to provide consistency guarantees when only some of the nodes are connected and when connections are available only some of the time
• 46. Weaker Consistency Conditions A variant of the ”centralized algorithm” is t-Connected Consistent. Assume node C is the centralized node. The algorithm behaves as follows: read at node A : A sends a request to C for the most recent value. If A receives a response from C within time 2 ∗ tmsg + tlocal , it saves the value and returns it to the client. Otherwise, A concludes that a message was lost and it returns the value with the highest sequence number that has ever been received from C , or the initial value if no value has yet been received from C . (When a client read request occurs at C it acts like any other node, sending messages to itself)
• 47. Weaker Consistency Conditions write at A : A sends a message to C with the new value. A waits 2 ∗ tmsg + tlocal , or until it receives an acknowledgement from C, and then sends an acknowledgement to the client. At this point, either C has learned of the new value, or a message was lost, or both events occurred. If A concludes that a message was lost, it periodically retransmits the value to C (along with all values lost during earlier write operations) until it receives an acknowledgement from C . (As in the case of read operations, when a client write request occurs at C it acts like any other node, sending messages to itself) New value received at C : C serializes the write requests that it hears about by assigning them consecutive integer tags. Periodically C broadcasts the latest value and sequence number to all other nodes.
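Putting slides 46-47 together, here is a hedged end-to-end sketch; the in-process method calls stand in for real (lossy, timeout-guarded) message sends, and all names are illustrative assumptions:

```python
class CentralC:
    """The central node: serializes writes with consecutive integer
    tags and periodically rebroadcasts its latest value."""

    def __init__(self, initial="v0"):
        self.seq, self.value = 0, initial
        self.peers = []

    def on_write(self, value):
        self.seq += 1              # consecutive integer tag
        self.value = value
        return self.seq            # acknowledgement (may be lost)

    def broadcast(self):           # invoked periodically
        for peer in self.peers:
            peer.on_broadcast(self.seq, self.value)  # sends may be lost

class NodeA:
    """Any other node: reads and writes go through C, with timeouts."""

    def __init__(self, central):
        self.central = central
        self.seq, self.value = 0, "v0"  # highest-numbered value from C
        self.pending = []               # writes awaiting C's ack

    def on_broadcast(self, seq, value):
        if seq > self.seq:              # keep only newer values
            self.seq, self.value = seq, value

    def read(self):
        reply = self.try_central_read()     # None models a timeout
        if reply is not None:
            self.seq, self.value = reply
        return self.value                   # best known value either way

    def write(self, value):
        if self.try_central_write(value) is None:   # timed out
            self.pending.append(value)  # retransmit periodically later
        return "ack"                    # the client always gets an answer

    # Placeholder transports: a real implementation would send messages
    # that can be lost and wait up to 2*t_msg + t_local for a reply.
    def try_central_read(self):
        return (self.central.seq, self.central.value)

    def try_central_write(self, value):
        return self.central.on_write(value)
```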
• 49. Weaker Consistency Conditions Theorem 4 : The modified centralized algorithm is t-Connected Consistent Proof: In executions in which no messages are lost, the operations are atomic. An execution is atomic if every operation acts as if it is executed at a single instant. In this case, that single instant occurs when C processes the operation. C serializes the operations, ensuring atomic consistency in executions in which all messages are delivered. We then examine executions in which messages are lost. The partial order P is constructed as follows. Write operations are ordered by the sequence number assigned by the central node. Each read operation is sequenced after the write operation whose value it returns. It is clear by construction that the partial order P satisfies criteria 1 and 2 of the definition of t-Connected Consistency. As the algorithm handles requests in the order received, criterion 3 is also clearly true
• 50. Weaker Consistency Conditions In showing that the partial order respects criterion 4, there are four cases : write followed by read, write followed by write, read followed by read and read followed by write. Let time t be long enough for a write operation to complete (and for C to assign a sequence number to the new value), and for one of the periodic broadcasts from C to occur.
• 51. Weaker Consistency Conditions 1. write followed by read : Assume a write occurs at Aw , after which an interval of time longer than t passes in which all messages are delivered. After this, a read is requested at some node. By the end of the interval, two things have happened. First, Aw has notified the central node of the new value, and the write operation has been assigned a sequence number. Second, the central node has rebroadcast that value (or a later value in the partial order) to all other nodes during one of the periodic broadcasts. As a result, the read operation does not return an earlier value, and therefore it must come after the write in the partial order P
• 53. Weaker Consistency Conditions 2. write followed by write : Assume a write occurs at Aw , after which an interval of time longer than t passes in which all messages are delivered. After this, a write is requested at some node. As in the previous case, by the end of the interval in which messages are delivered, the central node has assigned a sequence number to the write operation at Aw . As a result, the later write operation is sequenced by the central node after the first write operation. Therefore the second write comes after the first write in the partial order P.
• 55. Weaker Consistency Conditions 3. read followed by read : Assume a read operation occurs at Br , after which an interval of time longer than t passes in which all messages are delivered. After this, a read is requested at some node. Let φ be the write operation whose value the first read operation at Br returns. By the end of the interval in which messages are delivered, the central node has assigned a sequence number to φ and has broadcast the value of φ (or a later value in the partial order) to all other nodes. As a result, the second read operation does not return a value earlier in the partial order than φ. Therefore the second read operation does not precede the first in the partial order P.
• 57. Weaker Consistency Conditions 4. read followed by write : Assume a read operation occurs at Br , after which an interval of time longer than t passes in which all messages are delivered. After this, a write is requested at some node. Let φ be the write operation whose value the first read operation at Br returns. By the end of the interval in which messages are delivered, the central node has assigned a sequence number to φ , and as a result all write operations beginning after the interval are serialized after φ. Therefore the write operation does not precede the read operation in the partial order P.
• 59. Weaker Consistency Conditions Therefore P satisfies criterion 4 of the definition, and this algorithm is t-Connected Consistent.
• 60. Conclusion We have shown that it is impossible to reliably provide atomic, consistent data when there are partitions in the network It is feasible, however, to achieve any two of the three properties : consistency, availability and partition tolerance In the asynchronous model, when no clocks are available, the impossibility result is fairly strong : it is impossible to provide consistent data, even allowing stale data to be returned when messages are lost However, in partially synchronous models it is possible to achieve a practical compromise between consistency and availability