MEGASTORE: Providing
Scalable, Highly Available
Storage for Interactive
Services
Guided By- Prof. Kong Li
Presented By- (TEAM 1)
Anumeha Shah(009423973)
Ankita Kapratwar (009413469)
Swapna Kulkarni(009264905)
What is Megastore
● Megastore combines the scalability and availability of a NoSQL datastore
with the ACID semantics of an RDBMS so that it can meet the requirements
of interactive online services. It provides both strong consistency and
high availability, which neither NoSQL nor an RDBMS provides alone.
● Megastore uses Paxos, a replication and consensus algorithm, for high
availability with low latency.
● It partitions the data at a fine granularity and provides ACID semantics
within each partition, replicating each partition across a wide area
network with low latency.
Why Megastore
Online interactive services require high availability as well as strong
consistency.
● Online services are growing rapidly as the number of potential users grows.
● More and more desktop services are moving to the cloud.
● Storage must meet opposing requirements, which makes it challenging.
Reasons for the opposing requirements:
● Applications should be scalable.
● Services should be responsive.
● Users should have a consistent view of the data.
● Services should be highly available, that is, up 24/7.
Approach to Provide High Availability
and Consistency
Two approaches have been taken:
1. For availability, use a synchronous, fault-tolerant log replicator.
2. For scalability, partition the data into many small databases and
provide each database with its own log replicator.
Replication for High Availability
Need for replication:
● Replication is needed for high availability.
● Replication within a datacenter overcomes host-specific failures.
● To overcome datacenter-specific failures and regional disasters, the data
should be replicated across geographically distributed datacenters.
Common Replication Strategies and
Issues
Asynchronous master/slave:
● The master node replicates write-ahead log entries to at least one slave.
● The master acknowledges log appends in parallel with transmitting them
to the slaves.
● However, if the master fails there is downtime until a slave becomes
master, and data can be lost.
Synchronous master/slave:
● Changes on the master and slaves are made synchronously: the master
acknowledges a change only once it has been mirrored to the slaves.
● This approach prevents data loss when failing over from master to slave.
● However, failures must be detected promptly by an external system;
otherwise they can cause high latency and user-visible outages.
Common Replication Strategies and
Issues Cont..
Optimistic Replication:
● There is no master.
● Any member can accept changes, and the changes propagate through the
group asynchronously. This approach provides high availability and
excellent latency.
● However, transactions are not possible because the global mutation
ordering is not known at commit time.
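The ordering problem above can be illustrated with a toy sketch. The two operations and the starting value are hypothetical; the point is only that two replicas receiving the same mutations in different orders can end up with different values.

```python
def apply_all(ops, value):
    """Apply mutations in the order this replica happened to receive them."""
    for op in ops:
        value = op(value)
    return value


def add_ten(v):
    return v + 10


def double(v):
    return v * 2


# Same two mutations, same starting value, different arrival order.
replica_a = apply_all([add_ten, double], 5)
replica_b = apply_all([double, add_ten], 5)
diverged = replica_a != replica_b
```

Without an agreed global order there is no single result that a transaction could have committed against.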
Use of Paxos for Replication
● Paxos is a fault-tolerant consensus algorithm.
● There is no master, only a group of similar peers.
● A write-ahead log can be replicated over all the peers.
● Any peer can initiate reads and writes.
● A change is added to the log only if a majority of the peers acknowledge
it.
● Peers that did not acknowledge the change eventually acknowledge it as
they catch up.
● There is no distinguished failed state.
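The majority rule above can be sketched as follows. This illustrates quorum acknowledgement only and is not full Paxos (there are no proposal numbers and no prepare phase); `Replica` and `replicated_append` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical peer holding one copy of the write-ahead log."""
    name: str
    up: bool = True
    log: dict = field(default_factory=dict)  # log position -> entry

    def accept(self, position, entry):
        # A peer that is down does not acknowledge.
        if not self.up:
            return False
        self.log[position] = entry
        return True


def replicated_append(replicas, position, entry):
    """Return True if a majority of the peers acknowledged the entry."""
    acks = sum(1 for r in replicas if r.accept(position, entry))
    return acks > len(replicas) // 2


peers = [Replica("A"), Replica("B"), Replica("C", up=False)]
committed = replicated_append(peers, 0, "put x=1")  # 2 of 3 acknowledge
peers[1].up = False
rejected = replicated_append(peers, 1, "put x=2")   # only 1 of 3 acknowledges
```

In this sketch, peer A still records the second entry locally even though it did not reach a majority; a real protocol needs further rounds to resolve such entries.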
Use of Paxos for Replication Cont..
Issues with the Paxos replication strategy
● A single replicated log over a wide area can suffer high latencies, which
limits throughput.
● What if none of the replicas is up to date?
● What if a majority of the replicas do not acknowledge a write?
Solution
● Partition the data.
● Use multiple replicated logs.
● Each partition of the data has its own replicated log.
● Replicate logs synchronously among the datacenters.
Partitioning For Scalable Replication
Cross Entity Groups Operations.
Partitioning For Scalability and
Consistency
● Partition the data into entity groups.
● Each partition is replicated synchronously and independently across
different datacenters.
● Within each datacenter the data is stored in a NoSQL datastore.
● Within an entity group, changes are made with single-phase ACID
transactions.
● Across entity groups, operations use either two-phase commit or
asynchronous messaging.
● Asynchronous messaging is between logically distant entity groups, not
physically distant replicas, so cross-group operations stay within a
datacenter.
● Traffic between datacenters is only for synchronous replication.
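A minimal in-memory sketch of the partitioning idea above, assuming one log per entity group and a simple queue for cross-group messages. `EntityGroup` and its methods are hypothetical names, and replication is left out entirely.

```python
from collections import deque


class EntityGroup:
    """Hypothetical in-memory stand-in for one entity group."""

    def __init__(self, key):
        self.key = key
        self.log = []          # this group's own log (not replicated here)
        self.data = {}
        self.inbox = deque()   # asynchronous messages from other groups

    def commit(self, mutations):
        # All mutations of one transaction land in a single log entry,
        # so they become visible together.
        self.log.append(dict(mutations))
        self.data.update(mutations)

    def send(self, other, message):
        # Cross-group work is queued, not applied in the sender's transaction.
        other.inbox.append(message)

    def drain_inbox(self):
        # The receiver applies each queued message in its own transaction.
        while self.inbox:
            self.commit(self.inbox.popleft())


alice = EntityGroup("user:alice")
bob = EntityGroup("user:bob")
alice.commit({"sent/1": "hi bob"})
alice.send(bob, {"received/1": "hi bob"})
pending = dict(bob.data)   # the message has not been applied yet
bob.drain_inbox()
```

Each group commits independently, so a write in one group never waits on another group's log.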
Physical Layout
How to select entity group boundaries:
● Groups should not be too fine-grained, as that requires excessive
cross-group operations. A group should also not contain a large number of
unrelated entities, as that serializes unrelated writes.
Physical Layout
● Google's Bigtable is used as the storage system; it is fault tolerant and
scalable.
● Applications keep data near the users or region that access it most, and
place replicas near each other to avoid high latency during failures. Data
that is accessed together is kept either close together or within the same
row.
● Caching is used for low latency.
Data Model Overview
● The data model lies between the abstract tuples of an RDBMS and the
concrete row-column storage of NoSQL.
● A schema contains a set of tables; each table contains entities; each
entity contains properties.
● An entity group consists of a root entity along with all entities in child
tables that reference it.
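The hierarchy above can be sketched with plain Python objects. The `User` and `Photo` tables, their properties, and the convention that a child's key starts with the root's key are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    table: str
    key: tuple   # primary key; a child's key starts with the root's key
    properties: dict = field(default_factory=dict)


def entity_group(root, entities):
    """Root entity plus every child entity whose key references the root."""
    n = len(root.key)
    children = [e for e in entities if e is not root and e.key[:n] == root.key]
    return [root] + children


user = Entity("User", (101,), {"name": "John"})
photos = [
    Entity("Photo", (101, 500), {"tag": "vacation"}),
    Entity("Photo", (101, 502), {"tag": "dinner"}),
    Entity("Photo", (102, 600), {"tag": "office"}),
]
group = entity_group(user, photos)
```

Here the photo with key `(102, 600)` belongs to a different root and is left out of the group.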
Data Model Cont..
Indexes
● An index can be declared on any property.
● Local index: used to find data within an entity group.
● Global index: used to find entities without knowing in advance the entity
groups that contain them.
● STORING clause: applications store additional properties from the primary
table in the index for faster access at read time.
● Repeated indexes: indexes on repeated properties.
● Inline indexes: extract slices of information from child entities and
store them in the parent for fast access. They can implement many-to-many
links.
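The difference between a local and a global index can be sketched with dictionaries. The photo data and the choice of `user_id` as the entity group key are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: (user_id, photo_id) -> tag
photos = {
    (101, 500): "vacation",
    (101, 502): "dinner",
    (102, 600): "vacation",
}

# Local index: one index per entity group (here, per user_id).
local_index = defaultdict(lambda: defaultdict(list))
# Global index: a single index spanning all entity groups.
global_index = defaultdict(list)

for (user_id, photo_id), tag in sorted(photos.items()):
    local_index[user_id][tag].append((user_id, photo_id))
    global_index[tag].append((user_id, photo_id))

in_group = local_index[101]["vacation"]   # search inside one known group
anywhere = global_index["vacation"]       # search without knowing the group
```

The local lookup needs the entity group key up front; the global lookup does not.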
Mapping to Bigtable
● Bigtable column name = Megastore table name + property name.
● The Bigtable row for the root entity stores the transaction metadata and
log for the entity group.
● Keeping the metadata in the same row allows it to be updated atomically
through a single Bigtable transaction.
● Index entry: represented as a single Bigtable row. Row key = indexed
property values + primary key of the indexed entity.
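The two naming rules above can be written as small helpers. This is a sketch only: the `.` separator, the tuple encoding of row keys, and the example tables are assumptions, not the actual Bigtable encoding.

```python
def column_name(table, prop):
    # The "." separator is an assumption; the slide only says the table name
    # and the property name are combined.
    return f"{table}.{prop}"


def index_row_key(indexed_values, primary_key):
    # Indexed property values first, then the primary key of the indexed entity.
    return tuple(indexed_values) + tuple(primary_key)


# One row can hold columns from several Megastore tables without collisions.
row = {
    column_name("User", "name"): "John",
    column_name("Photo", "tag"): "vacation",
}
key = index_row_key(["vacation"], (101, 500))
```

Putting the indexed values first means index rows for the same value sort next to each other.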
Transactions and Concurrency control
● Each entity group functions as a mini-database.
● A transaction first writes its mutations to the group's write-ahead log;
the mutations are then applied to the data.
● Multiple values can be stored in the same row/column pair with different
timestamps.
● This supports multiversion concurrency control (MVCC).
● Readers and writers don't block each other.
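A minimal sketch of the multiversion idea above: a cell keeps every timestamped value, and a reader picks the newest value at or before its read timestamp. `VersionedCell` is a hypothetical name, not a Bigtable API.

```python
import bisect


class VersionedCell:
    """Hypothetical row/column cell that keeps every timestamped value."""

    def __init__(self):
        self.timestamps = []  # kept sorted ascending
        self.values = []

    def write(self, timestamp, value):
        i = bisect.bisect_right(self.timestamps, timestamp)
        self.timestamps.insert(i, timestamp)
        self.values.insert(i, value)

    def read(self, timestamp):
        """Latest value written at or before `timestamp`, else None."""
        i = bisect.bisect_right(self.timestamps, timestamp)
        return self.values[i - 1] if i else None


cell = VersionedCell()
cell.write(10, "draft")
cell.write(20, "final")
old_view = cell.read(15)   # a reader at timestamp 15 is unaffected by the later write
new_view = cell.read(20)
```

Because a write adds a new version instead of overwriting, a reader at an older timestamp keeps seeing a stable value.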
Cont..
Reads:
a. Current: ensure that all committed writes are applied first, then read at
the timestamp of the latest committed transaction.
b. Snapshot: read at the timestamp of the last fully applied transaction,
even if some committed writes have not yet been applied.
c. Inconsistent: ignore the state of the log and read the latest values
directly.
Writes:
A write begins with a current read to determine the next available log
position. The commit operation gathers the mutations into a log entry,
assigns it a timestamp higher than any previous one, and appends it to the
log using Paxos.
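The three read types can be compared in a single-node sketch. The `Group` class is hypothetical; it assumes committed log entries carry increasing timestamps and that applying an entry means copying its mutations into a versioned store.

```python
class Group:
    """Hypothetical entity group with a log and a versioned store."""

    def __init__(self):
        self.log = []      # committed entries: (timestamp, mutations)
        self.applied = 0   # number of log entries already applied
        self.store = {}    # key -> list of (timestamp, value), oldest first

    def commit(self, timestamp, mutations):
        self.log.append((timestamp, mutations))

    def apply_pending(self):
        for timestamp, mutations in self.log[self.applied:]:
            for key, value in mutations.items():
                self.store.setdefault(key, []).append((timestamp, value))
        self.applied = len(self.log)

    def _read_at(self, key, timestamp):
        visible = [v for t, v in self.store.get(key, []) if t <= timestamp]
        return visible[-1] if visible else None

    def current_read(self, key):
        # Apply every committed write first, then read at the latest timestamp.
        self.apply_pending()
        return self._read_at(key, self.log[-1][0]) if self.log else None

    def snapshot_read(self, key):
        # Read at the last fully applied transaction; do not catch up.
        if self.applied == 0:
            return None
        return self._read_at(key, self.log[self.applied - 1][0])

    def inconsistent_read(self, key):
        # Ignore the log and return whatever value is newest in the store.
        versions = self.store.get(key, [])
        return versions[-1][1] if versions else None


g = Group()
g.commit(1, {"x": "a"})
g.apply_pending()
g.commit(2, {"x": "b"})        # committed but not yet applied
snap = g.snapshot_read("x")    # sees the last applied state
cur = g.current_read("x")      # catches up first
```

The snapshot read is cheaper because it skips the catch-up step, at the cost of possibly missing recent commits.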
Transaction Lifecycle
1. Read: obtain the timestamp and log position of the last committed
transaction.
2. Application logic: read from Bigtable and gather writes into a log entry.
3. Commit: use Paxos to append that entry to the log.
4. Apply: write the mutations to the entities and indexes.
5. Clean up: delete data that is no longer required.
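The lifecycle can be sketched as optimistic concurrency on a log position. In this sketch a local position check stands in for the Paxos append, and `GroupLog`, `LogConflict`, and `run_transaction` are hypothetical names.

```python
class LogConflict(Exception):
    """Raised when another transaction already took the log position."""


class GroupLog:
    def __init__(self):
        self.entries = []

    def next_position(self):
        return len(self.entries)

    def append_at(self, position, entry):
        # Stand-in for the Paxos append: only one writer wins each position.
        if position != len(self.entries):
            raise LogConflict(f"position {position} already taken")
        self.entries.append(entry)


def run_transaction(log, data, logic):
    position = log.next_position()       # 1. Read
    mutations = logic(dict(data))        # 2. Application logic
    log.append_at(position, mutations)   # 3. Commit
    data.update(mutations)               # 4. Apply
    # 5. Clean up is omitted in this sketch.


log, data = GroupLog(), {"count": 0}
run_transaction(log, data, lambda d: {"count": d["count"] + 1})

# Two writers read the same position; the first commit wins.
pos_a = log.next_position()
pos_b = log.next_position()
log.append_at(pos_a, {"count": 2})
try:
    log.append_at(pos_b, {"count": 2})
    conflict = False
except LogConflict:
    conflict = True
```

The losing writer would restart from the read step and retry against the new log position.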
Replication
● Reads and writes can be initiated from any replica.
● Replication is done per entity group by synchronously replicating the
group's transaction log to a quorum of replicas.
● Read guarantees:
o A read always observes the last-acknowledged write.
o After a write has been observed, all future reads observe that write.
Megastore Architecture
Data Structures and Algorithms
Replicated Logs
Reads
Algorithm for a Current Read
● Query Local
● Find Position
● Catch-Up
● Validate
● Query Data
Reads
Timeline for reads for local replica A
Writes
Algorithm for writes
● Accept Leader
● Prepare
● Accept
● Invalidate
● Apply
Writes
Timeline for writes
Coordinator Availability
Failure Detection
● Google's Chubby lock service is used
● Writers are insulated from coordinator failure by testing whether a
coordinator has lost its locks
Validation Races
● Races between validates for earlier writes and invalidates for later
writes are protected against in the coordinator by always sending the log
position associated with the action.
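One way to read the log-position rule above is sketched here: the coordinator remembers the highest log position it has seen for each entity group and ignores messages that carry an older one. The `Coordinator` class and its method names are hypothetical.

```python
class Coordinator:
    """Hypothetical coordinator tracking which groups are up to date locally."""

    def __init__(self):
        self.state = {}  # group -> (log position, is_valid)

    def _update(self, group, position, valid):
        last_position, _ = self.state.get(group, (-1, False))
        if position < last_position:
            return False   # stale message for an earlier write: ignore it
        self.state[group] = (position, valid)
        return True

    def validate(self, group, position):
        return self._update(group, position, True)

    def invalidate(self, group, position):
        return self._update(group, position, False)

    def is_up_to_date(self, group):
        return self.state.get(group, (-1, False))[1]


c = Coordinator()
c.invalidate("user:alice", 8)                  # a later write invalidates first
late_applied = c.validate("user:alice", 7)     # delayed validate for an earlier write
after_race = c.is_up_to_date("user:alice")
c.validate("user:alice", 8)                    # the replica has caught up to position 8
after_catch_up = c.is_up_to_date("user:alice")
```

Without the position, the delayed validate would wrongly mark the group as up to date.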
Operational Issues
Distribution of Availability
Production Metrics
Distribution of Average Latencies
Conclusion
● Megastore
● Paxos for Synchronization
● Bigtable Datastore
QUESTIONS??????
THANK YOU !

Megastore by Google

  • 1. MEGASTORE: Providing Scalable, Highly Available Storage for Interactive Services Guided By- Prof. Kong Li Presented By- (TEAM 1) Anumeha Shah(009423973) Ankita Kapratwar (009413469) Swapna Kulkarni(009264905)
  • 2. What is Megastore ● Megastore combines the scalability and availability of NoSQL datastore with ACID semantics of RDBMS in an innovative way so that it can meet the requirement of interactive online services. Megastore provides both the high consistency as well as high availability which can not be provided by NoSQL or RDBMS alone. ● Megastore uses Paxos replication and consensus algorithm for high availability and with low latency. ● Partitions the data to a fine granularity and ACID semantics within the partition across wide area network with low latency.
  • 3. Why Megastore
    Online interactive services require both high availability and strong consistency.
    ● Online services are growing rapidly as the number of potential users grows.
    ● More and more desktop services are moving to the cloud.
    ● Opposing storage requirements arise, which makes storage challenging.
    Reasons for the opposing requirements:
    ● Applications should be scalable.
    ● Services should be responsive.
    ● Users should have a consistent view of the data.
    ● Services should be highly available, up 24/7.
  • 4. Approach to Provide High Availability and Consistency
    Two approaches have been taken:
    1. A synchronous, fault-tolerant log replicator provides availability.
    2. For scalability, the data is partitioned into many small databases, each with its own log replicator.
    Replication for High Availability
    Need for replication:
    ● Replication is needed for high availability.
    ● Replication within a datacenter overcomes host-specific failures.
    ● To overcome datacenter-specific failures and regional disasters, the data should be replicated across geographically distributed datacenters.
  • 5. Common Replication Strategies and Issues
    Asynchronous master/slave:
    ● The master node replicates write-ahead log entries to at least one slave.
    ● The master acknowledges log appends in parallel with transmitting them to the slaves.
    ● However, if the master fails, there is downtime until a slave becomes master, and data loss can occur.
    Synchronous master/slave:
    ● Changes on the master and slaves are made synchronously: the master acknowledges a change only once it has been mirrored to the slaves.
    ● This approach prevents data loss when failing over from master to slave.
    ● However, failures need timely detection by an external system, because they can cause high latency and user-visible outages.
  • 6. Common Replication Strategies and Issues (cont.)
    Optimistic replication:
    ● There is no master.
    ● Any member can accept changes, which propagate through the group asynchronously. This approach provides high availability and excellent latency.
    ● However, transactions are not possible because the global mutation ordering is not known at commit time.
  • 7. Use of Paxos for Replication
    ● Paxos is a fault-tolerant consensus algorithm.
    ● There is no master, only a group of similar peers.
    ● A write-ahead log can be replicated across all the peers.
    ● Any peer can initiate reads or writes.
    ● A change is added to the log only if a majority of the peers acknowledge it.
    ● Peers that did not acknowledge the change eventually do so.
    ● There is no distinguished failed state.
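
The majority rule on this slide can be illustrated with a minimal sketch. This is not Paxos itself (there are no proposal numbers and no prepare/accept rounds); it only shows the quorum test applied before a log append. The names `majority` and `append_if_majority` are hypothetical and do not come from the slides.

```python
def majority(n_replicas: int) -> int:
    """Smallest number of acknowledgements that forms a majority."""
    return n_replicas // 2 + 1


def append_if_majority(log: list, entry, acks, n_replicas: int) -> bool:
    """Append `entry` to `log` only if a majority of peers acknowledged it.

    `acks` is an iterable of peer identifiers; duplicates count once.
    """
    if len(set(acks)) >= majority(n_replicas):
        log.append(entry)
        return True
    return False


log = []
# Three of five peers acknowledged: the change is added to the log.
accepted = append_if_majority(log, "set x=1", ["A", "B", "C"], n_replicas=5)
# Only two of five acknowledged: the change is rejected.
rejected = append_if_majority(log, "set x=2", ["A", "B"], n_replicas=5)
```

With five replicas, three acknowledgements are enough, so two replicas can be slow or unreachable without blocking the write.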
  • 8. Use of Paxos for Replication (cont.)
    Issues with the Paxos replication strategy:
    ● A single replicated log over a wide area may suffer high latencies, which limits throughput.
    ● What if none of the replicas is up to date?
    ● What if a majority of the replicas do not acknowledge the writes?
    Solution:
    ● Partition the data.
    ● Use multiple replicated logs.
    ● Give each partition of the data its own replicated log.
    ● Replicate the logs synchronously among the datacenters.
  • 10. Cross Entity Groups Operations.
  • 11. Partitioning for Scalability and Consistency
    ● Partition the data into entity groups.
    ● Each partition is replicated synchronously and independently across different datacenters.
    ● Within each datacenter the data is stored in a NoSQL datastore.
    ● Within an entity group, changes are made with single-phase ACID transactions.
    ● Across entity groups, operations use two-phase commit or asynchronous messaging.
    ● The entity groups are logically distant, not physically distant, so operations across different entity groups are local to a datacenter.
    ● Traffic between datacenters is only for synchronous replication.
  • 12. Physical Layout
    How to select entity group boundaries:
    ● Groups should not be too fine-grained, as that may require excessive cross-group operations.
    ● Groups should also not contain a large number of unrelated entities, as that serializes unrelated writes and degrades throughput.
    Physical layout:
    ● Google's Bigtable is used as the storage system; it is fault-tolerant and scalable.
    ● Applications keep the data near the user, or in the region where it is accessed the most, and keep replicas near each other to avoid high latency during failures.
    ● Data that is accessed together is kept either close together or within the same row.
    ● A cache is implemented for low latency.
  • 13. Data Model Overview
    ● Lies between the abstract tuples of an RDBMS and the concrete row-column storage of NoSQL.
    ● Schema => set of tables => each contains entities => each contains properties.
    ● An entity group consists of a root entity along with all entities in child tables that reference it.
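
As a rough illustration of the last bullet, the sketch below groups child entities under the root entity they reference. The `User`/`Photo` style data, the `id` field, and the function name `build_entity_groups` are assumptions made for the example, not part of the slides.

```python
def build_entity_groups(roots, children, parent_field):
    """Group each root entity with the child entities that reference it.

    `roots` and `children` are lists of dicts; every root has an "id" key,
    and every child stores its root's id under `parent_field`.
    """
    groups = {root["id"]: {"root": root, "children": []} for root in roots}
    for child in children:
        groups[child[parent_field]]["children"].append(child)
    return groups


# Hypothetical data: users are root entities, photos are child entities.
users = [{"id": 101, "name": "John"}, {"id": 102, "name": "Mary"}]
photos = [
    {"id": 500, "user_id": 101},
    {"id": 502, "user_id": 101},
    {"id": 600, "user_id": 102},
]
groups = build_entity_groups(users, photos, "user_id")
```

Each resulting group is the unit that would get its own replicated log and ACID transactions.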
  • 15. Indexes
    ● An index can be applied to any property.
    ● Local index: used to find data within an entity group.
    ● Global index: used to find entities without knowing in advance which entity groups contain them.
    ● STORING clause: applications store additional properties from the primary table in the index for faster access at read time.
    ● Repeated indexes: index repeated properties.
    ● Inline indexes: extract slices of information from child entities and store them in the parent for fast access. They can implement many-to-many links.
  • 16. Mapping to Bigtable
    ● Column name = Megastore table name + property name.
    ● The Bigtable row for the root entity stores the transaction and replication metadata for the group, including the transaction log.
    ● Keeping the metadata in the same row allows it to be updated atomically through a single Bigtable transaction.
    ● Index entry: represented as a single Bigtable row. Row key = indexed property values + primary key of the indexed entity.
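
The two naming rules above can be sketched as follows. The "." separator and the use of Python tuples for row keys are illustrative assumptions; the slides do not specify the actual encoding, and the function names are hypothetical.

```python
def column_name(table: str, prop: str) -> str:
    """Bigtable column name = Megastore table name + property name."""
    return f"{table}.{prop}"


def index_row_key(indexed_values: tuple, primary_key: tuple) -> tuple:
    """Index row key = indexed property values + primary key of the entity."""
    return tuple(indexed_values) + tuple(primary_key)


# Hypothetical index on a photo's time; the primary key is (user_id, photo_id).
keys = [
    index_row_key(("12:30:01",), (101, 500)),
    index_row_key(("12:15:22",), (101, 502)),
    index_row_key(("12:15:22",), (102, 600)),
]
# Sorting the keys orders entries by indexed value first, then primary key,
# which is what makes a range scan over the index possible.
ordered = sorted(keys)
```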
  • 17. Transactions and Concurrency Control
    ● Each entity group functions as a mini-database.
    ● A transaction first writes its mutations to the write-ahead log; the mutations are then applied to the data.
    ● Multiple values can be stored in the same row/column pair with different timestamps.
    ● This enables multiversion concurrency control (MVCC).
    ● Readers and writers don't block each other.
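
A minimal sketch of the multiversion idea: one cell keeps several timestamped values, and a read at a given timestamp sees the newest value written at or before it. The class name `VersionedCell` is hypothetical, and an in-memory list stands in for a Bigtable row/column pair.

```python
import bisect


class VersionedCell:
    """One row/column pair holding several timestamped values."""

    def __init__(self):
        self._timestamps = []  # kept sorted in ascending order
        self._values = []      # _values[i] was written at _timestamps[i]

    def write(self, timestamp: int, value) -> None:
        i = bisect.bisect_left(self._timestamps, timestamp)
        if i < len(self._timestamps) and self._timestamps[i] == timestamp:
            self._values[i] = value  # overwrite the same version
        else:
            self._timestamps.insert(i, timestamp)
            self._values.insert(i, value)

    def read(self, timestamp: int):
        """Return the newest value written at or before `timestamp`."""
        i = bisect.bisect_right(self._timestamps, timestamp)
        return self._values[i - 1] if i else None


cell = VersionedCell()
cell.write(10, "a")
cell.write(20, "b")
```

A reader pinned to timestamp 15 keeps seeing `"a"` even after the write at timestamp 20, which is why readers and writers do not need to block each other.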
  • 18. Cont.
    Reads:
    a. Current: ensures that all committed writes are applied first, then reads at the timestamp of the latest committed transaction.
    b. Snapshot: reads at the timestamp of the last fully applied transaction, even if some committed transactions have not yet been applied.
    c. Inconsistent: ignores the state of the log and reads the latest values directly.
    Writes:
    ● A write begins with a current read to determine the next available log position.
    ● The commit operation gathers the mutations into a log entry, assigns it a timestamp higher than any previous one, and appends it to the log using Paxos.
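
The three read types can be contrasted in a small single-process sketch. It assumes that a snapshot read uses the last fully applied transaction, while a current read first applies every committed log entry. The class `EntityGroup` and its methods are hypothetical, a plain list stands in for the Paxos-replicated log, and partially applied entries are not modelled, so here the inconsistent read returns the same value as the snapshot read.

```python
class EntityGroup:
    def __init__(self):
        self.log = []      # committed entries: (timestamp, {key: value})
        self.applied = 0   # number of log entries already applied to data
        self.data = {}     # key -> list of (timestamp, value), oldest first

    def commit(self, mutations: dict) -> int:
        """Append a log entry with a timestamp higher than any previous one."""
        ts = self.log[-1][0] + 1 if self.log else 1
        self.log.append((ts, dict(mutations)))
        return ts

    def apply_next(self) -> None:
        """Apply the oldest committed-but-unapplied entry to the data."""
        ts, mutations = self.log[self.applied]
        for key, value in mutations.items():
            self.data.setdefault(key, []).append((ts, value))
        self.applied += 1

    def _read_at(self, key, ts):
        versions = [v for t, v in self.data.get(key, []) if t <= ts]
        return versions[-1] if versions else None

    def current_read(self, key):
        while self.applied < len(self.log):  # apply all committed writes
            self.apply_next()
        ts = self.log[-1][0] if self.log else 0
        return self._read_at(key, ts)

    def snapshot_read(self, key):
        ts = self.log[self.applied - 1][0] if self.applied else 0
        return self._read_at(key, ts)

    def inconsistent_read(self, key):
        versions = self.data.get(key, [])    # ignores the log entirely
        return versions[-1][1] if versions else None
```

For example, after committing and applying `{"x": 1}` and then committing (but not yet applying) `{"x": 2}`, a snapshot read of `x` still sees the first value, while a current read applies the second entry and sees it.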
  • 19. Transaction Lifecycle
    1. Read: obtain the timestamp and log position of the last committed transaction.
    2. Application logic: read from Bigtable and gather writes into a log entry.
    3. Commit: use Paxos to append that entry to the log.
    4. Apply: write the mutations to the entities and indexes.
    5. Clean up: delete data that is no longer required.
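
The lifecycle steps can be sketched as one function. This is a single-process illustration under stated assumptions: a plain list append stands in for the Paxos round, a dict stands in for Bigtable, and a writer that finds its log position already taken aborts with a hypothetical `ConflictError`. None of these names come from the slides.

```python
class ConflictError(Exception):
    """Raised when another writer already took the target log position."""


class Group:
    def __init__(self):
        self.log = []    # committed log entries (dicts of mutations)
        self.data = {}   # applied entity state


def run_transaction(group: Group, logic) -> int:
    # 1. Read: find the log position after the last committed transaction.
    position = len(group.log)
    # 2. Application logic: read state and gather writes into one log entry.
    entry = logic(dict(group.data))
    # 3. Commit: succeed only if the position is still free
    #    (a list append stands in for the Paxos append).
    if position != len(group.log):
        raise ConflictError(position)
    group.log.append(entry)
    # 4. Apply: write the mutations to the stored entities.
    group.data.update(entry)
    # 5. Clean up: nothing to delete in this in-memory sketch.
    return position


g = Group()
first_position = run_transaction(g, lambda state: {"n": state.get("n", 0) + 1})
```

A caller that receives `ConflictError` would typically retry the whole transaction from the read step.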
  • 20. Replication
    ● Reads and writes can be initiated from any replica.
    ● Replication is done per entity group by synchronously replicating the group's transaction log to a quorum of replicas.
    ● Read guarantees:
      o A read always observes the last-acknowledged write.
      o After a write has been observed, all future reads observe that write.
  • 22. Data Structures and Algorithms: Replicated Logs
  • 23. Reads
    Algorithm for a current read:
    1. Query local
    2. Find position
    3. Catch up
    4. Validate
    5. Query data
  • 24. Reads: timeline for reads at local replica A
  • 25. Writes
    Algorithm for writes:
    1. Accept leader
    2. Prepare
    3. Accept
    4. Invalidate
    5. Apply
  • 27. Coordinator Availability
    Failure detection:
    ● Google's Chubby lock service is used.
    ● Writers are insulated from coordinator failure by testing whether a coordinator has lost its locks.
    Validation races:
    ● Races between validates for earlier writes and invalidates for later writes are resolved in the coordinator by always sending the log position associated with the action.
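
One plausible way to use the log position to resolve such races is sketched below. The exact rule is an assumption, since the slide does not spell it out: a validate is ignored if a later position has already been invalidated, and an invalidate is ignored if the replica has already validated at that position or beyond. The class `Coordinator` and its methods are hypothetical.

```python
class Coordinator:
    """Tracks which entity groups are up to date on the local replica."""

    def __init__(self):
        self._up_to_date = set()
        self._invalidated_at = {}  # group -> highest invalidated log position
        self._validated_at = {}    # group -> highest validated log position

    def invalidate(self, group, position: int) -> None:
        # A stale invalidate (replica already caught up past it) is ignored.
        if position <= self._validated_at.get(group, -1):
            return
        last = self._invalidated_at.get(group, -1)
        self._invalidated_at[group] = max(last, position)
        self._up_to_date.discard(group)

    def validate(self, group, position: int) -> bool:
        # A validate for an earlier write must not undo a later invalidate.
        if position < self._invalidated_at.get(group, -1):
            return False
        last = self._validated_at.get(group, -1)
        self._validated_at[group] = max(last, position)
        self._up_to_date.add(group)
        return True

    def is_up_to_date(self, group) -> bool:
        return group in self._up_to_date
```

Without the position, a delayed validate for log position 2 arriving after an invalidate for position 3 would wrongly mark the group as up to date.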
  • 30. Conclusion
    ● Megastore
    ● Paxos for synchronization
    ● Bigtable datastore