© Copyright 2016 Pivotal. All rights reserved.© Copyright 2013 Pivotal. All rights reserved.
WAN Gateway
Multi-site and Active-active Design Patterns
• The multi-site capability connects geographically separated
distributed systems.
• It is important to understand that although you are presented with a
façade that appears to be a single system, each of the distributed
systems actually behaves autonomously.
• A multi-site installation based on the WAN Gateway consists of two
or more distributed systems that are loosely coupled.
• Each site manages its own distributed system, and region data is
distributed to remote sites using one or more connections.
Using WAN Gateways
The connections consist of a gateway sender in the sending site and a
corresponding gateway receiver in the receiving site.
Multi-site Topologies
DISASTER RECOVERY
Disaster Recovery or Business Continuity is still the most popular design pattern for multi-site replication using the WAN Gateway.
One important distinction: the Geode WAN Gateway is designed to be bi-directional, which makes fail-back much easier.
Multi-site Topologies
Active/Active
If your use-case is read-only, or updates are limited to specific entries belonging to a given user, then it is extremely easy to load balance across multiple sites hosting Geode clusters in an Active/Active configuration.
[Figure: Site 1 and Site 2, each with an IIS Farm, Apache Geode, and a DB, joined by a WAN Gateway; Clients shown.]
Multi-site Topologies
Active/Passive
Alternatively, if you have heavy updates on random data entries, you might want to use an Active/Passive configuration.
[Figure: Site 1 and Site 2, each with an IIS Farm, Apache Geode, and a DB, joined by a WAN Gateway; Clients shown.]
Multi-site Topologies
Business Unit
Active/Passive
Or if you have different groups of users who update different datasets, you can make each cluster active for one Business Unit, and backup for the other.
[Figure: Site 1 and Site 2, each with an IIS Farm, Apache Geode, and a DB, joined by a WAN Gateway; user groups shown: Equity Users, Debt Users.]
Multi-site Topologies
Geographically Separated
Often a multi-site topology is used to achieve locality of reference for performance purposes.
Multi-site Active-active Design Patterns
1. Exchange Pattern
[Figure: exchange sites NYSE, LSE, and TSE, with the labels "NYSE, TSE Read-only", "LSE, TSE Read-only", and "NYSE, LSE Read-only".]
A client connects to every exchange it needs to write to, and uses the local copy for read-only access.
Multi-site Active-active Design Patterns
2. The "Realm Manager" Pattern:
Use the “Command” pattern to request that an action be performed on your behalf.
Request gets forwarded to all distributed systems but only the one with the right permission actually takes the action.
[Figure: sites labeled "Read Only For This Customer", "Read Only For This Customer", and "Write Permission For This Customer".]
Multi-site Active-active Design Patterns
3. Follow the Sun Pattern:
This is the "Global book" pattern common in Financial Services.
[Figure label: "The token is here"]
Multi-site Active-active Design Patterns
4. Inventory Allocation Pattern:
This pattern is commonly used when there are multiple trading venues and selling short is not allowed.
[Figure: sites each labeled "Partial Inventory".]
Multi-site Active-active Design Patterns
5. Apology-based computing:
This is the pattern that Max Feingold refers to when he says:
“At global scale, getting the truth is really really expensive.”
Configuring WAN Gateway
Simple Configuration
The minimum configuration you need is this:
<cache>
  <gateway-sender id="NY" parallel="true"
      remote-distributed-system-id="1"/> ...
</cache>
Enabling Persistence for Queues
Overflowing Gateway Queues to disk
To persist the gateway queues to disk and cap the memory they use (entries beyond the limit overflow to disk), do this:
<cache>
  <gateway-sender id="NY" parallel="false"
      remote-distributed-system-id="1"
      enable-persistence="true"
      disk-store-name="gateway-disk-store"
      maximum-queue-memory="200"/> ...
</cache>
Multiple dispatcher threads
Multiple dispatcher threads for Parallel WAN and Async Event Queues
• Geode now defaults to 5 dispatcher threads for a parallel
WAN gateway or async event queue.
• If you detect that your system is using too much CPU, set
dispatcher-threads="1" in the gateway-sender attributes.
• By default, ordering is by key (order-policy="key").
Multiple dispatcher threads
Configuring Dispatcher Threads and Ordering Policy for a Serial Gateway
To increase the number of dispatcher threads and set the ordering policy for
a serial gateway sender, configure the sender in cache.xml, for example:
<cache>
  <gateway-sender id="NY" parallel="false"
      remote-distributed-system-id="1"
      enable-persistence="true"
      disk-store-name="gateway-disk-store"
      maximum-queue-memory="200"
      dispatcher-threads="10" order-policy="key"/> ...
</cache>
Additional Benefits
The WAN Gateway gives you additional benefits:
• Site-wide rolling release: roll each site independently
• Geode major release upgrades
• Mix of Cloud and On-prem
Technical Details
Using WAN Gateways
Consistency for WAN Updates
• With a distributed WAN configuration, one or more gateway senders
asynchronously queue and send region updates to another Geode
cluster.
• It is possible for multiple sites to send updates for the same region
entry at the same time.
• It is also possible that, due to a slow WAN connection, a cluster
might receive region updates after a considerable delay, and after it
has applied more recent updates to the region.
Using WAN Gateways
Consistency for WAN Updates
• To ensure that WAN-propagated regions eventually reach a
consistent state, Geode first ensures that each cluster performs
consistency checking on regions before queuing updates to a
gateway sender for WAN distribution.
• In other words, region conflicts are first detected and resolved in the
local cluster, using local timestamp and conflict detection algorithms.
Using WAN Gateways
Partitioned Region Consistency
• For a partitioned region, Geode maintains consistency by routing all
updates on a given key to the Geode member that holds the primary
copy of that key.
• That member holds a lock on the key while distributing updates to
other members that host a copy of the key.
• Because all updates to a partitioned region are initially processed on
the primary Geode member, all members apply the updates in the
same order and consistency is maintained at all times.
Using WAN Gateways
Replicated Region Consistency
• For a replicated region, any member that hosts the region can
update an entry and distribute that update to other members without
locking the entry.
• It is possible that two members can update the same entry at the
same time (a concurrent update).
• It is also possible that, due to network latency, an update in one
member is received by other members at a later time, after those
members have already applied more recent updates to the entry (an
out-of-order update).
Using WAN Gateways
Replicated Region Consistency
• If two members update the same entry at the same time, conflict
checking ensures that all members eventually arrive at the same
value, which is the value of one of the two concurrent updates.
• If a member receives an out-of-order update (an update that is
received after one or more recent updates were applied), conflict
checking ensures that the out-of-order update is discarded and not
applied to the cache.
Using WAN Gateways
Consistency for WAN Updates
• In a default configuration, the cluster that receives the event
examines the timestamp to determine whether or not the event
should be applied.
• If the timestamp of the update is earlier than the local timestamp, the
cluster discards the event.
• If the timestamp is the same as the local timestamp, then the entry
having the highest distributed system ID is applied (or kept).
Using WAN Gateways
Discovery for Multi-Site Systems
• Each Geode cluster in a WAN configuration uses locators to
discover remote Geode clusters.
• In the configuration for each locator in a WAN configuration you
must define a unique distributed-system-id property that
identifies the local cluster.
• A locator uses the remote-locators property to define the
addresses of one or more locators in remote Geode clusters to use
for WAN distribution.
Using WAN Gateways
Discovery for Multi-Site Systems
• When a locator starts up, it contacts each locator that is configured
in the remote-locators property to exchange information about
the available locators and gateway receivers in the cluster.
• The locator also shares information about locators and gateway
receivers in any other Geode clusters that have connected to the
cluster.
• Connected clusters can then use the shared gateway receiver
information to distribute region events according to their configured
gateway senders.
• Each time a new locator starts up or an existing locator shuts down,
the changed information is broadcast to other connected Geode
clusters.
Using WAN Gateways
Gateway Senders
• A Geode cluster uses a gateway sender to distribute region events
to another, remote Geode cluster.
• You can create multiple gateway sender configurations to distribute
region events to multiple remote clusters.
• A gateway sender always communicates with a gateway receiver in
a remote cluster.
• Gateway senders do not communicate directly with other cache
server instances.
• Geode provides two types of gateway sender configurations: serial
gateway senders and parallel gateway senders.
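As a rough illustration of distributing one region to more than one remote cluster, a cache.xml along these lines might be used. The sender ids (to-site-2, to-site-3), the region name (orders), and the remote distributed system IDs are hypothetical; schema declarations on the cache element are omitted, as elsewhere in these slides.

```xml
<cache>
  <!-- Hypothetical serial sender for the remote cluster with distributed-system-id 2 -->
  <gateway-sender id="to-site-2" parallel="false"
      remote-distributed-system-id="2"/>
  <!-- Hypothetical serial sender for the remote cluster with distributed-system-id 3 -->
  <gateway-sender id="to-site-3" parallel="false"
      remote-distributed-system-id="3"/>

  <!-- The region lists the senders that should distribute its events -->
  <region name="orders">
    <region-attributes refid="REPLICATE"
        gateway-sender-ids="to-site-2,to-site-3"/>
  </region>
</cache>
```

Each remote cluster would need its own gateway receiver for these senders to connect to.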
Using WAN Gateways
Serial Gateway Senders
• A serial gateway sender distributes region events from a single
Geode server in the local cluster to a remote Geode cluster.
• Although multiple regions can use the same serial gateway for
distribution, a serial gateway uses a single logical event queue to
dispatch events for all regions that use the gateway sender.
• Because a serial gateway sender distributes all of a region's events
through a single Geode member, it provides the most control over
ordering region events as they are propagated across the WAN.
• However, a serial gateway sender does not provide horizontal
scaling of throughput for propagating events.
Using WAN Gateways
Parallel Gateway Senders
• While parallel gateway senders provide the best throughput for WAN
propagation, they provide less control for ordering events.
• With a parallel gateway sender, you cannot preserve event ordering
for the region as a whole because multiple Geode servers distribute
the region events at the same time.
• However, the ordering of events for a given partition can be
preserved.
Note: Replicated regions can only be configured to use serial
gateway senders.
Using WAN Gateways
Gateway Sender Queues
• The queue that a gateway sender uses to distribute events to a
remote site can be overflowed to disk as needed, in order to prevent
the Geode member from running out of memory.
• You should configure the maximum amount of memory that each
queue uses, as well as the batch size and frequency for processing
batches.
• You should also configure these queues to persist to disk, so that a
gateway sender can pick up where it left off when its member shuts
down and is later restarted.
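A minimal sketch of these queue settings in cache.xml might look like the following. The values and the disk directory name (gateway-queue-data) are illustrative assumptions, not recommendations; maximum-queue-memory is in megabytes and batch-time-interval is in milliseconds.

```xml
<cache>
  <!-- batch-size: maximum number of events sent in one batch -->
  <!-- batch-time-interval: maximum wait before a partial batch is sent -->
  <!-- maximum-queue-memory: memory the queue may use before overflowing to disk -->
  <gateway-sender id="NY" parallel="false"
      remote-distributed-system-id="1"
      batch-size="100"
      batch-time-interval="1000"
      maximum-queue-memory="200"
      enable-persistence="true"
      disk-store-name="gateway-disk-store"/>

  <!-- Disk store referenced by the sender; the directory name is hypothetical -->
  <disk-store name="gateway-disk-store">
    <disk-dirs>
      <disk-dir>gateway-queue-data</disk-dir>
    </disk-dirs>
  </disk-store>
</cache>
```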
Using WAN Gateways
Multi-threaded dispatcher
• By default gateway sender queues (even in serial gateway senders)
use 5 threads to dispatch queued events.
• If ordering is required on a serial gateway sender you should set the
number of dispatcher threads to 1.
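For a serial sender where ordering matters, the setting described above could be sketched like this, reusing the sender id and remote distributed system ID from the earlier slides:

```xml
<cache>
  <!-- A single dispatcher thread, as suggested above when ordering is required -->
  <gateway-sender id="NY" parallel="false"
      remote-distributed-system-id="1"
      dispatcher-threads="1"/>
</cache>
```

The trade-off is throughput: a single thread dispatches every queued event for the sender.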
Using WAN Gateways
High Availability for Gateway Senders
• When a serial gateway sender configuration is deployed to multiple
Geode members, only one "primary" sender is active at a given time.
All other serial gateway sender instances are inactive "secondaries"
that are available as backups if the primary sender shuts down.
• Geode designates the first gateway sender to start up as the primary
sender, and all other senders become secondaries.
• As gateway senders start up and shut down in the distributed
system, Geode ensures that the oldest running gateway sender
operates as the primary.
Using WAN Gateways
High Availability for Gateway Senders
• A parallel gateway sender is deployed to multiple Geode members
by default, and each Geode member that hosts primary buckets for
a partitioned region actively distributes data to the remote Geode
site.
• When you use parallel gateway senders, high availability for WAN
distribution is provided if you configure the partitioned region for
redundancy.
• With a redundant partitioned region, if a member that hosts primary
buckets fails or is shut down, then a Geode member that hosts a
redundant copy of those buckets takes over WAN distribution for
those buckets.
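A sketch of this combination, assuming a hypothetical partitioned region named trades with one redundant copy of each bucket, attached to a parallel sender:

```xml
<cache>
  <gateway-sender id="NY" parallel="true"
      remote-distributed-system-id="1"/>

  <!-- Hypothetical region; redundant-copies="1" keeps one backup of every bucket -->
  <region name="trades">
    <region-attributes data-policy="partition" gateway-sender-ids="NY">
      <partition-attributes redundant-copies="1"/>
    </region-attributes>
  </region>
</cache>
```

With no redundant copies configured, there would be no member holding a backup of a failed member's buckets to take over distribution for them.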
Using WAN Gateways
Gateway Receivers
• A gateway receiver configures a physical connection for receiving region
events from gateway senders in one or more remote Geode clusters.
• A gateway receiver applies each region event to the same region or
partition that is hosted in the local Geode member. (An exception is
thrown if the receiver receives an event for a region that it does not
define.)
• Gateway senders use any available gateway receiver in the target
cluster to send region events.
• You can deploy gateway receiver configurations to multiple Geode
members as needed for high availability and load balancing.
Note: there are issues with the balancing of senders and receivers.
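A receiver can be declared in cache.xml roughly as follows. The port range is an illustrative assumption; the receiver listens on an unused port within it.

```xml
<cache>
  <!-- Hypothetical port range for incoming gateway sender connections -->
  <gateway-receiver start-port="1530" end-port="1551"/>
</cache>
```

Deploying the same declaration to several members gives remote senders more than one receiver to connect to.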
Multi-site Topologies
Parallel Multi-site Topology
• This is the most frequently recommended topology.
• In this topology, all sites know about each other.
• This is the most robust configuration, where any one of the sites can go
down without disrupting communication between the other sites.
• A parallel topology also guarantees that no site receives multiple copies
of the same message.
Parallel Multi-site Topology is the recommended topology for most
use-cases. Think of it as a “mesh” configuration.
Thank you

 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Apache Geode Clubhouse - WAN-based Replication

  • 1. © Copyright 2016 Pivotal. All rights reserved. © Copyright 2013 Pivotal. All rights reserved. WAN Gateway: Multi-site and Active-active Design Patterns
  • 2. Using WAN Gateways • The multi-site capability connects geographically separated distributed systems. • Although you are presented with a façade that appears to be one system, each distributed system actually behaves autonomously. • A multi-site installation based on the WAN Gateway consists of two or more loosely coupled distributed systems. • Each site manages its own distributed system, and region data is distributed to remote sites over one or more connections.
  • 3. Using WAN Gateways • Each connection consists of a gateway sender in the sending site and a corresponding gateway receiver in the receiving site.
  • 4. Multi-site Topologies: Disaster Recovery • Disaster recovery (business continuity) is still the most popular design pattern for multi-site replication using the WAN Gateway. • One important distinction: the Geode WAN Gateway is designed to be bi-directional, which makes fail-back much easier.
  • 5. Multi-site Topologies: Active/Active • If your use case is read-only, or updates are limited to specific entries belonging to a given user, then it is straightforward to load balance across multiple sites hosting Geode clusters in an Active/Active configuration. [Diagram: Clients; Site 1 and Site 2, each with an IIS Farm, Apache Geode, and a DB; WAN Gateway between the sites]
  • 6. Multi-site Topologies: Active/Passive • Alternatively, if you have heavy updates on random data entries, you might want to use an Active/Passive configuration. [Diagram: Clients; Site 1 and Site 2, each with an IIS Farm, Apache Geode, and a DB; WAN Gateway between the sites]
  • 7. Multi-site Topologies: Business Unit Active/Passive • If different groups of users update different datasets, you can make each cluster active for one business unit and the backup for the other. [Diagram: Equity Users and Debt Users; Site 1 and Site 2, each with an IIS Farm, Apache Geode, and a DB; WAN Gateway between the sites]
  • 8. Multi-site Topologies: Geographically Separated • A multi-site topology is often used to achieve locality of reference for performance.
  • 9. Multi-site Active-active Design Patterns: 1. Exchange Pattern • The client connects to every exchange it needs to write to, and uses the local copy for read-only access. [Diagram labels: NYSE, LSE, TSE; read-only copies: "NYSE, TSE", "LSE, TSE", "NYSE, LSE"]
  • 10. Multi-site Active-active Design Patterns: 2. The "Realm Manager" Pattern • Use the "Command" pattern to request that an action be performed on your behalf. The request is forwarded to all distributed systems, but only the one with the right permission actually takes the action. [Diagram labels: "Read Only For This Customer" (twice), "Write Permission For This Customer" (once)]
  • 11. Multi-site Active-active Design Patterns: 3. Follow the Sun Pattern • This is the "Global book" pattern common in financial services. [Diagram label: "The token is here"]
  • 12. Multi-site Active-active Design Patterns: 4. Inventory Allocation Pattern • This pattern is commonly used when there are multiple trading venues and selling short is not allowed. [Diagram: four "Partial Inventory" labels]
  • 13. Multi-site Active-active Design Patterns: 5. Apology-based Computing • This is the pattern that Max Feingold refers to when he says: "At global scale, getting the truth is really really expensive."
  • 14. Configuring WAN Gateway: Simple Configuration • The minimum configuration you need is: <cache> <gateway-sender id="NY" parallel="true" remote-distributed-system-id="1" /> ... </cache>
  • 15. Enabling Persistence for Queues: Overflowing Gateway Queues to Disk • To overflow the gateway queues to disk and conserve memory, use: <cache> <gateway-sender id="NY" parallel="false" remote-distributed-system-id="1" enable-persistence="true" disk-store-name="gateway-disk-store" maximum-queue-memory="200" /> ... </cache>
  • 16. Multiple Dispatcher Threads for Parallel WAN and Async Event Queues • Geode now defaults to 5 dispatcher threads for a parallel WAN gateway or async event queue. • If you detect that your system is using too much CPU, set dispatcher-threads="1" in the gateway-sender attributes. • The default ordering policy is by key.
  • 17. Multiple Dispatcher Threads: Configuring Dispatcher Threads and Ordering Policy for a Serial Gateway • To increase the number of dispatcher threads and set the ordering policy for a serial gateway sender, use one of the following mechanisms. <cache> <gateway-sender id="NY" parallel="false" remote-distributed-system-id="1" enable-persistence="true" disk-store-name="gateway-disk-store" maximum-queue-memory="200" dispatcher-threads="10" order-policy="key"/> ... </cache>
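The sender definitions on slides 14 to 17 only create the queue. For events to flow, a region also has to name the sender in its gateway-sender-ids attribute. A minimal sketch, assuming a hypothetical partitioned region called "orders" (the region name is not from the slides, and the namespace declaration on <cache> is omitted, as it is in the slides):

```xml
<cache>
  <!-- Sender from slide 14: queues events for the cluster whose distributed-system-id is 1 -->
  <gateway-sender id="NY" parallel="true" remote-distributed-system-id="1"/>

  <!-- "orders" is a hypothetical region name used only for illustration -->
  <region name="orders">
    <!-- gateway-sender-ids attaches the region to one or more senders (comma-separated ids) -->
    <region-attributes refid="PARTITION" gateway-sender-ids="NY"/>
  </region>
</cache>
```

A parallel sender is paired with a partitioned region here because, as slide 31 notes, replicated regions can only use serial gateway senders.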
  • 18. Additional Benefits • The WAN Gateway gives you additional benefits: site-wide rolling releases (roll each site independently); Geode major release upgrades; a mix of cloud and on-prem.
  • 19. Technical Details
  • 20. Using WAN Gateways: Consistency for WAN Updates • With a distributed WAN configuration, one or more gateway senders asynchronously queue and send region updates to another Geode cluster. • It is possible for multiple sites to send updates for the same region entry at the same time. • It is also possible that, due to a slow WAN connection, a cluster might receive region updates after a considerable delay, and after it has applied more recent updates to the region.
  • 21. Using WAN Gateways: Consistency for WAN Updates • To ensure that WAN-propagated regions eventually reach a consistent state, Geode first ensures that each cluster performs consistency checking on regions before queuing updates to a gateway sender for WAN distribution. • In other words, region conflicts are first detected and resolved in the local cluster, using local timestamps and conflict detection algorithms.
  • 22. Using WAN Gateways: Partitioned Region Consistency • For a partitioned region, Geode maintains consistency by routing all updates on a given key to the Geode member that holds the primary copy of that key. • That member holds a lock on the key while distributing updates to other members that host a copy of the key. • Because all updates to a partitioned region are initially processed on the primary Geode member, all members apply the updates in the same order and consistency is maintained at all times.
  • 23. Using WAN Gateways: Replicated Region Consistency • For a replicated region, any member that hosts the region can update an entry and distribute that update to other members without locking the entry. • It is possible that two members update the same entry at the same time (a concurrent update). • It is also possible that, due to network latency, an update in one member is received by other members at a later time, after those members have already applied more recent updates to the entry (an out-of-order update).
  • 25. Using WAN Gateways: Replicated Region Consistency • If two members update the same entry at the same time, conflict checking ensures that all members eventually arrive at the same value, which is the value of one of the two concurrent updates. • If a member receives an out-of-order update (an update that is received after one or more recent updates were applied), conflict checking ensures that the out-of-order update is discarded and not applied to the cache.
  • 26. Using WAN Gateways: Consistency for WAN Updates • In a default configuration, the cluster that receives the event examines the timestamp to determine whether or not the event should be applied. • If the timestamp of the update is earlier than the local timestamp, the cluster discards the event. • If the timestamp is the same as the local timestamp, then the entry having the highest distributed system ID is applied (or kept).
  • 27. Using WAN Gateways: Discovery for Multi-Site Systems • Each Geode cluster in a WAN configuration uses locators to discover remote Geode clusters. • In the configuration for each locator in a WAN configuration, you must define a unique distributed-system-id property that identifies the local cluster. • A locator uses the remote-locators property to define the addresses of one or more locators in remote Geode clusters to use for WAN distribution.
  • 28. Using WAN Gateways: Discovery for Multi-Site Systems • When a locator starts up, it contacts each locator that is configured in the remote-locators property to exchange information about the available locators and gateway receivers in the cluster. • The locator also shares information about locators and gateway receivers in any other Geode clusters that have connected to the cluster. • Connected clusters can then use the shared gateway receiver information to distribute region events according to their configured gateway senders. • Each time a new locator starts up or an existing locator shuts down, the changed information is broadcast to other connected Geode clusters.
  • 29. Using WAN Gateways: Gateway Senders • A Geode cluster uses a gateway sender to distribute region events to another, remote Geode cluster. • You can create multiple gateway sender configurations to distribute region events to multiple remote clusters. • A gateway sender always communicates with a gateway receiver in a remote cluster. • Gateway senders do not communicate directly with other cache server instances. • Geode provides two types of gateway sender configurations: serial gateway senders and parallel gateway senders.
  • 30. Using WAN Gateways: Serial Gateway Senders • A serial gateway sender distributes region events from a single Geode server in the local cluster to a remote Geode cluster. • Although multiple regions can use the same serial gateway for distribution, a serial gateway uses a single logical event queue to dispatch events for all regions that use the gateway sender. • Because a serial gateway sender distributes all of a region's events through a single Geode member, it provides the most control over ordering region events as they are propagated across the WAN. • However, a serial gateway sender does not provide horizontal scaling of throughput for propagating events.
  • 31. Using WAN Gateways: Parallel Gateway Senders • While parallel gateway senders provide the best throughput for WAN propagation, they provide less control for ordering events. • With a parallel gateway sender, you cannot preserve event ordering for the region as a whole because multiple Geode servers distribute the region events at the same time. • However, the ordering of events for a given partition can be preserved. • Note: Replicated regions can only be configured to use serial gateway senders.
  • 32. Using WAN Gateways: Gateway Sender Queues • The queue that a gateway sender uses to distribute events to a remote site can be overflowed to disk as needed, in order to prevent the Geode member from running out of memory. • You should configure the maximum amount of memory that each queue uses, as well as the batch size and frequency for processing batches. • You should also configure these queues to persist to disk, so that a gateway sender can pick up where it left off when its member shuts down and is later restarted.
  • 33. Using WAN Gateways: Multi-threaded Dispatcher • By default, gateway sender queues (even in serial gateway senders) use 5 threads to dispatch queued events. • If ordering is required on a serial gateway sender, you should set the number of dispatcher threads to 1.
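The queue advice on slides 32 and 33 can be sketched as a single serial sender definition. The attribute names are standard gateway-sender attributes in cache.xml, but every value below, along with the disk directory path, is an illustrative assumption rather than a recommendation from the deck:

```xml
<cache>
  <gateway-sender id="NY" parallel="false" remote-distributed-system-id="1"
                  maximum-queue-memory="200"
                  batch-size="100"
                  batch-time-interval="1000"
                  enable-persistence="true"
                  disk-store-name="gateway-disk-store"
                  dispatcher-threads="1"/>
  <!-- maximum-queue-memory: megabytes held in memory before the queue overflows to disk -->
  <!-- batch-size / batch-time-interval: events per batch and milliseconds between batch sends -->
  <!-- dispatcher-threads="1": single dispatcher, for when strict ordering is required -->

  <!-- Disk store that backs the persistent queue; the directory is a hypothetical path -->
  <disk-store name="gateway-disk-store">
    <disk-dirs>
      <disk-dir>/var/geode/gateway-queue</disk-dir>
    </disk-dirs>
  </disk-store>
</cache>
```

With a single dispatcher thread, no order-policy is needed, since the policy only matters when several threads drain the queue concurrently.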
  • 34. Using WAN Gateways: High Availability for Gateway Senders • When a serial gateway sender configuration is deployed to multiple Geode members, only one "primary" sender is active at a given time. All other serial gateway sender instances are inactive "secondaries" that are available as backups if the primary sender shuts down. • Geode designates the first gateway sender to start up as the primary sender, and all other senders become secondaries. • As gateway senders start up and shut down in the distributed system, Geode ensures that the oldest running gateway sender operates as the primary.
  • 35. Using WAN Gateways: High Availability for Gateway Senders • A parallel gateway sender is deployed to multiple Geode members by default, and each Geode member that hosts primary buckets for a partitioned region actively distributes data to the remote Geode site. • When you use parallel gateway senders, high availability for WAN distribution is provided if you configure the partitioned region for redundancy. • With a redundant partitioned region, if a member that hosts primary buckets fails or is shut down, then a Geode member that hosts a redundant copy of those buckets takes over WAN distribution for those buckets.
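For the parallel case on slide 35, high availability comes from the region's redundancy setting rather than from the sender itself. A hedged sketch, again using the hypothetical region name "orders" and an assumed redundancy of one extra copy:

```xml
<cache>
  <gateway-sender id="NY" parallel="true" remote-distributed-system-id="1"/>

  <!-- "orders" is a hypothetical region name -->
  <region name="orders">
    <region-attributes refid="PARTITION" gateway-sender-ids="NY">
      <!-- One redundant copy of each bucket, so another member can take over
           WAN distribution for a bucket if its primary host goes down -->
      <partition-attributes redundant-copies="1"/>
    </region-attributes>
  </region>
</cache>
```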
  • 36. Using WAN Gateways: Gateway Receivers • A gateway receiver configures a physical connection for receiving region events from gateway senders in one or more remote Geode clusters. • A gateway receiver applies each region event to the same region or partition that is hosted in the local Geode member. (An exception is thrown if the receiver receives an event for a region that it does not define.) • Gateway senders use any available gateway receiver in the target cluster to send region events. • You can deploy gateway receiver configurations to multiple Geode members as needed for high availability and load balancing. Note that there are issues with balancing of senders and receivers.
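The receiving side of slide 36 might look like the sketch below. The port range is an arbitrary assumption, and "orders" is the same hypothetical region name used for illustration on the sending side. The region is defined locally because, as the slide notes, an event for an undefined region raises an exception:

```xml
<cache>
  <!-- The receiver listens on a free port within this range; the range is an assumed example -->
  <gateway-receiver start-port="1530" end-port="1551"/>

  <!-- The receiving member must host the region whose events it receives -->
  <region name="orders">
    <region-attributes refid="PARTITION"/>
  </region>
</cache>
```

Deploying the same receiver definition to several members gives senders more than one endpoint to choose from, which is the high availability and load balancing arrangement the slide describes.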
  • 37. Multi-site Topologies: Parallel Multi-site Topology • This is the topology most often recommended, and it suits most use cases. • All sites know about each other. Think of it as a "mesh" configuration. • It is the most robust configuration: any one of the sites can go down without disrupting communication between the other sites. • A parallel topology also guarantees that no site receives multiple copies of the same message.
  • 38. Thank you