Data Center Interconnects: An Overview
Report ID: S6970513
A DCI lets companies link two or more data centers together for disaster recovery or business continuity, but it's not easy. This report provides an overview of the major DCI technologies and describes their pros and cons.
By Greg Ferro
reports.informationweek.com | May 2013 | $99
TABLE OF CONTENTS

Author's Bio
Executive Summary
The DCI Problem
Figure 1: Ingress Routing Problems
Loop Prevention
Three Options
Software-Defined Networking
Vendor and Standards-Based Technologies
Figure 2: Leaf and Spine
Figure 3: Partial Mesh
Custom Options
Figure 4: MLAG
DCI: Weighing the Choices
Related Reports
ABOUT US
InformationWeek Reports' analysts arm business technology decision-makers with real-world perspective based on qualitative and quantitative research, business and technology assessment and planning tools, and adoption best practices gleaned from experience.

OUR STAFF
Lorna Garey, content director; lorna.garey@ubm.com
Heather Vallis, managing editor, research; heather.vallis@ubm.com
Elizabeth Chodak, copy chief; elizabeth.chodak@ubm.com
Tara DeFilippo, associate art director; tara.defilippo@ubm.com

Find all of our reports at reports.informationweek.com.
EXECUTIVE SUMMARY

It's common business policy for organizations of a certain size to have two data centers as part of a disaster recovery or business continuity plan. However, most enterprise applications are not designed for or intended to use systems in two different locations. For example, a MySQL database is designed to exist on a single server with a single storage location. Building a resilient MySQL server requires an advanced infrastructure or complex software.

Enter the notion of a data center interconnect, which extends an Ethernet network between two physically separate data centers. While the idea is simple, Ethernet wasn't designed to run across a wide area network. Thus, a DCI implementation requires a variety of technological fixes to work around Ethernet's limitations.

This report outlines the issues that complicate DCIs, such as loops that can bring down networks and traffic trombones that eat up bandwidth. It also examines the variety of options companies have to connect two or more data centers, including dark fiber, MPLS services and MLAG, as well as vendor-specific options such as Cisco OTV and HP EVI. The report looks at the pros and cons of each option.
The DCI Problem

The most reliable method to connect two data centers for high availability and disaster recovery is to route IP traffic between them. However, it's become more common to extend the Ethernet network over the WAN, an approach called a data center interconnect, or DCI. This allows for the use of features such as virtual machine migration. For instance, by connecting two data centers via Ethernet, administrators can move a SQL Server instance via VM migration without changing the IP address of the operating system. This is attractive to server teams because the IP address is a key part of the directory service or configuration database. Maintaining the same IP address means that application settings remain the same and reduces the chance of errors when migration occurs. Service continuity is simpler if the IP address is unchanged.
A VMware ESXi server can perform vMotion for up to eight virtual machines at once (provided that you have a 10 Gbps network adapter; the limit is four at 1 Gbps). Given that it's common to have 20 to 40 VMs per physical server, you can see that evacuating a server will take some time. vMotion also requires very low latency, typically less than 50 milliseconds, to achieve control transfer (although options exist). Larger DCI bandwidth will result in faster vMotion and reduce the risk of a traffic trombone.

Figure 1: Ingress Routing Problems. Without careful planning after a server migration, traffic may unnecessarily traverse one data center and the data center interconnect to connect to servers in another data center. (Source: Greg Ferro)

Note that you can create a cascading failure as more servers move: interserver application traffic needs increasing amounts of bandwidth, which makes the remaining vMotion tasks progressively slower. Bandwidth consumption will eventually peak and can prevent vMotion from completing at the height of the transition.
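As a rough illustration of why evacuation takes time, here is a back-of-the-envelope sketch in Python. Every figure in it (VM count, average VM memory, link efficiency) is an illustrative assumption of mine, not a number from this report:

```python
# Back-of-the-envelope estimate of host evacuation time over a DCI.
# Every number here is an illustrative assumption, not a measurement.

def evacuation_time_seconds(num_vms=30,       # typical 20-40 VMs per host
                            vm_memory_gb=8,   # assumed average VM memory
                            link_gbps=10,     # vMotion network speed
                            efficiency=0.7):  # protocol and contention overhead
    """Estimate seconds to migrate every VM off one host.

    With eight concurrent vMotions, a 10 Gbps link is already the
    bottleneck, so total bits / usable bandwidth approximates the
    wall-clock time regardless of concurrency.
    """
    total_gbits = num_vms * vm_memory_gb * 8          # GB -> gigabits
    return total_gbits / (link_gbps * efficiency)

seconds = evacuation_time_seconds()
print(f"~{seconds:.0f} s (~{seconds / 60:.1f} min) to evacuate one host")
```

Even under these generous assumptions, a single host takes several minutes to drain, and that is before the interserver traffic described above starts competing for the same link.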
One problem with server migration is that storage must be synchronized between the data centers using storage replication technology. Replication is usually performed by the storage array, but it's an expensive option and will consume additional bandwidth between the sites.

Provided that the storage is replicated between the sites, extending the Ethernet network between data centers results in the simplest possible server migration, though it incurs significant technical debt. It is networking best practice to use Layer 3 routing between geographically diverse locations and to limit Layer 2 connectivity wherever possible, thus improving network stability and limiting risk domains to a single data center. We'll look at some of the technological challenges of DCI and discuss the pros and cons of various DCI options.
Loop Prevention

Ethernet introduces several technical hurdles in building a DCI. Ethernet was created some 30 years ago as a local area network protocol, with no practical concept of scaling past a few machines. By design, Ethernet is a multiaccess technology that allows all Ethernet broadcast frames to be received by all endpoints on the network. Thus, an Ethernet broadcast frame must be forwarded across all connected Ethernet networks, including the DCI. If a broadcast frame is looped back into an Ethernet network, it will be forwarded by all switches even though it was already broadcast. This creates a broadcast storm that rapidly consumes all network bandwidth and usually results in catastrophic network failure as the volume of broadcasts expands to consume all resources.
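To see why a loop is so destructive, consider this toy simulation, a minimal Python sketch rather than a model of any real switch: every switch refloods each broadcast copy it receives out of every other port, so in a looped topology the copy count grows geometrically.

```python
# Toy model of a broadcast storm: in a looped Layer 2 topology each
# switch refloods every broadcast copy it receives out of all other
# ports, and the Ethernet header has no TTL field to ever age a copy out.

def storm_growth(ports_per_switch=3, hops=10):
    """Print how many copies of one broadcast frame are circulating
    after each forwarding hop around a loop of simple switches."""
    copies = 1
    for hop in range(1, hops + 1):
        copies *= ports_per_switch - 1   # flooded out of every port but the ingress
        print(f"hop {hop}: {copies} copies in flight")

storm_growth()
# After 10 hops through 3-port switches, one frame has become 1,024
# copies; real storms saturate links in milliseconds.
```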
The Spanning Tree Protocol was designed to address Ethernet's loop problem and has generally served its purpose on the LAN. However, it's not suitable for controlling the flooding of packets between data centers, because Spanning Tree does not scale easily and its risk domains grow as the network diameter grows. STP has no domain isolation, so a problem in a single data center can propagate between data centers. In addition, first-hop resolution and inbound routing selection can cause excessive inter-data center traffic over the DCI.
When a server is migrated between data centers, traffic to and from the server must be intentionally designed. Outbound flows from the server will default to a router that may or may not be in the same data center. In this instance, traffic from a server in DC B may traverse the DCI link to reach the router in DC A and then return back over the DCI link to another resource in DC B. This is not optimal, because the network link between data centers is bearing all traffic from external users as well as traffic from the relocated server to other servers (see Figure 1).
The resulting traffic pattern is sometimes called a traffic trombone. Consider a Web server with a Java runtime and a MySQL database in DC A. After migration of the Web server to DC B, the traffic flows over the DCI are:

>> External flows from the WAN and/or Internet
>> Flows from the Web server to the database
>> Administrative traffic flows such as backups, monitoring and patching
Consider performing a backup of the migrated server in DC B to a backup system in DC A. How much bandwidth do you need so that the backup will complete within the backup window? And will the backup impact critical traffic like database queries or customer Web traffic?
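A quick sanity check makes the question concrete. The sketch below computes the sustained rate a backup needs and compares it with a DCI link's capacity; the backup size, window and link speed are hypothetical values of my choosing:

```python
# How much DCI bandwidth does a backup window actually demand?
# Sizes, windows and link speed are illustrative assumptions.

def required_gbps(backup_gb, window_hours):
    """Sustained throughput needed to move backup_gb in window_hours."""
    return (backup_gb * 8) / (window_hours * 3600)

backup_gb, window_h, link_gbps = 2000, 4, 10   # hypothetical values
need = required_gbps(backup_gb, window_h)
print(f"{backup_gb} GB in {window_h} h needs {need:.2f} Gbps sustained")
print(f"that is {need / link_gbps:.0%} of a {link_gbps} Gbps DCI link,")
print("before database queries and customer Web traffic are counted")
```

Arithmetic like this is worth running for every flow that crosses the DCI, because the link carries all of them simultaneously.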
Three Options

Today there are three common methods for modifying traffic flow between data centers: first-hop bypass, LISP and load balancing. We'll look at each in turn.

First-hop bypass covers the many options for establishing a local default gateway or router hop for the server. The server will require the same default gateway address in each data center, but sending the traffic from DC B to DC A leads to failure. Therefore, methods based on MAC address filtering for HSRP IP gateways are common. There are several ways to handle this, specific to each router vendor's software implementation.
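Conceptually, first-hop bypass means both data centers answer for the same virtual gateway address while a filter stops the gateway's traffic from crossing the DCI. The Python sketch below is purely illustrative; the addresses and the filter logic are my own assumptions, not any vendor's implementation:

```python
# Conceptual model of first-hop bypass: the same virtual gateway IP/MAC
# is active in both data centers, and a filter on the DCI stops the
# gateway MAC from leaking across, so each server uses its local router.
# Addresses are hypothetical.

GATEWAY_IP = "10.1.1.1"                    # same default gateway everywhere
GATEWAY_MAC = "00:00:0c:07:ac:01"          # HSRP-style virtual MAC

def dci_filter(frame_src_mac: str) -> bool:
    """Return True if a frame may cross the DCI link.
    Frames sourced from the virtual gateway MAC are dropped, which
    forces hosts in each data center onto their local gateway."""
    return frame_src_mac != GATEWAY_MAC

assert dci_filter("00:25:b4:aa:bb:cc")     # ordinary server traffic passes
assert not dci_filter(GATEWAY_MAC)         # gateway traffic stays local
```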
Locator/ID Separation Protocol (LISP) is an IETF protocol proposed by Cisco that modifies the concept of routing location. Instead of routing to a subnet in the network, traffic is forwarded to a specific router using a tunnel. That router then forwards the traffic to an identifier, which is the IP address of the server. You can find out more at the IETF LISP working group. LISP works for inbound and outbound traffic.
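The core LISP idea is a lookup from an endpoint identifier (EID, the server's address) to a routing locator (RLOC, the tunnel router at whichever data center currently hosts it). A minimal sketch of that mapping, with made-up addresses:

```python
# Minimal sketch of the LISP idea: endpoint identifiers (EIDs) stay
# fixed while the mapping system tracks which routing locator (RLOC),
# i.e. which data center's tunnel router, currently hosts them.
# All addresses are illustrative.

eid_to_rloc = {
    "10.1.1.50": "192.0.2.1",    # server EID -> DC A tunnel router
    "10.1.1.51": "198.51.100.1", # server EID -> DC B tunnel router
}

def forward(dest_eid: str) -> str:
    rloc = eid_to_rloc[dest_eid]
    return f"encapsulate packet for {dest_eid} in tunnel to {rloc}"

# After a VM migration only the mapping changes; the server keeps its IP.
eid_to_rloc["10.1.1.50"] = "198.51.100.1"   # server moved to DC B
print(forward("10.1.1.50"))
```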
Load balancing involves using the source NAT feature on a load-balancing VIP so that traffic is sourced from a device within the data center. However, this works only for inbound flows and must be combined with other traffic controls, such as first-hop bypass, for a complete solution.
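Source NAT keeps inbound flows local because the server only ever sees the load balancer's address as the client. A hedged sketch of that rewrite, with hypothetical addresses:

```python
# Why source NAT pins inbound flows to one data center: after the
# rewrite, the server sees the local load balancer as the client, so
# its replies never need to cross the DCI. Addresses are hypothetical.

LB_SNAT_ADDR = "10.2.0.10"   # assumed load balancer SNAT address in DC B
REAL_SERVER = "10.2.0.80"    # hypothetical server behind the VIP

def toward_server(client_ip: str) -> tuple[str, str]:
    """Source/destination the server sees after the load balancer
    applies source NAT; client_ip is hidden behind LB_SNAT_ADDR."""
    return (LB_SNAT_ADDR, REAL_SERVER)

src, dst = toward_server("203.0.113.7")
print(f"server at {dst} replies to {src}: a local hop, not a DCI round trip")
```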
A fourth option is route injection, which involves triggering dynamic route injection into the network routing tables based on certain trigger conditions. This method has proved less reliable in wider use because routing protocols have limited trigger capabilities. It works for inbound flows and partially for outbound flows.

These technologies address the traffic trombone problem using legacy or existing network tools, but you may also wish to consider vendor-specific technologies such as Cisco's OTV or HP's EVI, which we'll discuss later.
Software-Defined Networking

In the future, SDN and controller-based networking will likely provide new capabilities that do not rely on the configuration of individual devices or require manual overrides of routing configuration. If you're planning a DCI deployment, you should consider evaluating the new SDN technologies, including Juniper Contrail, VMware NSX and Alcatel-Lucent's Nuage Networks.

Research: The Next-Gen WAN
Respondents to our Next Generation WAN Survey are a highly connected bunch: 44% have 16 or more branch or remote offices linked to their primary data centers. And Ethernet-based services like MPLS outstripped ISDN among current users, 73% to 56%. What's next? Demand for dark fiber and private clouds, among other things.
Vendor and Standards-Based Technologies

There has been significant demand for DCI products, and this has led to a number of technological developments by vendors and by standards bodies. I'll look at five options: dark fiber, MPLS pseudowires, MLAG, TRILL/ECMP and custom vendor protocols.

Dark fiber: Dark fiber is a broad term used to describe dedicated fiber-optic cables or services that closely emulate dedicated cables. In some geographies it's possible to lay your own fiber between data centers and own the right of way, or to purchase a dedicated fiber from a provider. Physical cables are usually capped at around 50 to 75 kilometers, the reach of a long-haul, single-mode laser transmitter.

More commonly, your local carrier provides a dense wavelength division multiplexing (DWDM) service that presents fiber at each site and appears as a dedicated fiber cable to the sites. The customer can run any signal on that fiber, because the laser is physically multiplexed and the DWDM drop has limited awareness of your protocols. The DWDM service provides additional circuit redundancy through the use of ring technologies, and the carrier can provision multiple services over a single fiber pair.
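The 50-to-75-kilometer cap comes from simple optical-loss arithmetic: the transceiver's power budget has to cover cumulative fiber and connector loss. A minimal sketch, using round-number assumptions rather than any specific product's datasheet:

```python
# Why unamplified dark fiber runs top out around 50-75 km: the transmit
# power budget must cover cumulative fiber and connector loss. All
# values are round-number assumptions, not a transceiver datasheet.

POWER_BUDGET_DB = 23.0       # assumed long-haul single-mode transceiver
FIBER_LOSS_DB_PER_KM = 0.25  # typical attenuation at 1550 nm
SPLICE_CONNECTOR_DB = 3.0    # assumed allowance for splices and patches
SAFETY_MARGIN_DB = 3.0       # engineering margin for aging and repairs

usable_db = POWER_BUDGET_DB - SPLICE_CONNECTOR_DB - SAFETY_MARGIN_DB
max_km = usable_db / FIBER_LOSS_DB_PER_KM
print(f"max unamplified span: ~{max_km:.0f} km")   # ~68 km
```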
MPLS pseudowires: When it comes to MPLS, most organizations will choose to purchase a Layer 2 virtual private network service from a service provider. Service providers use the MPLS protocol internally in their networks to provide a wide range of IP routed services, such as WAN and Internet circuits.
Figure 2: Leaf and Spine. A leaf-and-spine architecture can be used in a DCI; the diagram shows 40 Gbps links. (Source: Greg Ferro)
Typically, your service provider will deliver an Ethernet port to your premises, and this will connect to the provider's MPLS backbone. The MPLS standards have organically grown into a messy collection of protocols that can provide Layer 2 Ethernet emulation over an MPLS network. Technologies such as VPLS, EoMPLS, GREoMPLS and L2TPv3 all provide ways of emulating Ethernet networks. Your provider's MPLS network should be configured to support one or more of these technologies. They are incorrectly but widely referred to as "pseudowires" because their original purpose was to emulate ATM and frame relay circuits in the early 2000s, before being modified for Ethernet.

Large enterprises may build their own MPLS backbones to gain greater control over the services and security of the WAN, but for most companies this won't be a viable option. MPLS is a relatively complex group of protocols that requires a significant amount of time to learn and comprehend. Building mission-critical business services with MPLS is hard and should generally be avoided.
MLAG: Multichassis link aggregation describes the logical bonding of two physical switches into a single unit, as shown in Figure 4. The logical switch control plane is a single software entity, which prevents loop conditions from occurring and reduces operational risk. MLAG is simple to use, configure and maintain compared with other approaches, and it is less expensive. Your service provider can supply the Layer 2 services (probably using dark fiber or MPLS pseudowires, as discussed previously).
Note that MLAG is not a standard. Each vendor has its own name for the technology, such as Cisco's vPC, HP's IRF, Juniper's MC-LAG and Brocade's Multi-Chassis Trunking.
To use MLAG for DCI, connect each port on the MLAG switches to the Layer 2 service to prevent loops. It's recommended not to use MLAG features on core switches in each data center; instead, use fixed switches in a modular design for control and better support. MLAG can handle up to eight point-to-point circuits. A service provider failure will reduce the available bandwidth, which requires careful design if you're using quality of service to protect key applications.

Figure 4: MLAG. MLAG logically bonds two physical switches to make them appear as a single unit; the diagram contrasts the physical and logical views. (Source: Greg Ferro)
ECMP/TRILL: Equal-cost multipath, or ECMP, is a more recent addition to the options for DCI. The IETF TRILL standard provides a multipath protocol that "routes" Ethernet frames across up to 16 paths that have the same bandwidth or cost.
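ECMP keeps a flow's packets in order by hashing each flow onto one of the equal-cost paths rather than spraying packets round-robin. A minimal sketch of the idea; the hash inputs and function here are illustrative, not TRILL's actual algorithm:

```python
# Sketch of equal-cost multipath (ECMP) path selection: hash the flow
# identifiers so every packet of one flow takes the same path, while
# different flows spread across all equal-cost paths. The tuple fields
# and hash choice are illustrative, not TRILL-specific.

import zlib

NUM_PATHS = 16   # up to 16 equal-cost paths, per the TRILL discussion above

def pick_path(src_mac: str, dst_mac: str, src_ip: str, dst_ip: str) -> int:
    flow = f"{src_mac}|{dst_mac}|{src_ip}|{dst_ip}".encode()
    return zlib.crc32(flow) % NUM_PATHS   # deterministic per flow

path = pick_path("00:25:b4:00:00:01", "00:25:b4:00:00:02",
                 "10.1.1.50", "10.2.1.80")
print(f"all packets of this flow use path {path}")
```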
Although intended for data center backbones that implement a Clos tree switch fabric (sometimes known as leaf/spine), TRILL can be used as a DCI technology. It provides high availability, because dual switches are used at all sites, and it also provides native STP isolation.

A unique feature of TRILL as a DCI technology is that it supports a partial-mesh topology for multiple data centers, because the Layer 2 traffic is routed over the TRILL core (see Figure 3).

Figure 3: Partial Mesh. TRILL can be used to create a Layer 2 partial-mesh DCI topology. (Source: Greg Ferro)
Although the core features are complete, the TRILL protocol continues to be developed. Many mainstream vendors have not released a fully standards-compliant implementation, so while you can build a TRILL fabric from a single vendor's gear, you may run into interoperability problems in a heterogeneous environment. Some vendors are also extending the standard to add proprietary features. Brocade VCS and Cisco FabricPath are two of the options available today.
Custom Options

As you can see, there are complex technical challenges to extending Ethernet networks between data centers. The effort often brings more risk than customers are willing to accept. However, vendors are developing proprietary protocols to address these risks. Cases in point are Cisco's Overlay Transport Virtualization (OTV) and HP's Ethernet Virtual Interconnect (EVI).
These protocols encapsulate Ethernet in IP for transport over WAN services. Software agents for these protocols in the edge network devices provide features such as Spanning Tree isolation in each data center, reduced configuration effort and multisite setups. Compared with MPLS, OTV and EVI are very simple to configure and maintain, though you will incur a substantial licensing fee on specific hardware platforms. The simplicity of these approaches makes them attractive options for most enterprises. More detailed comparisons of OTV and EVI are available online.
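Conceptually, both protocols do MAC-in-IP encapsulation: the original Ethernet frame becomes the payload of an IP packet routed between edge devices. The Python sketch below illustrates only the framing idea; the header layout is a deliberate simplification of mine, not the actual OTV or EVI wire format:

```python
# Conceptual MAC-in-IP encapsulation as used by DCI overlays such as
# OTV and EVI. This header layout is a deliberate simplification, not
# either protocol's real wire format; addresses are illustrative.

import socket
import struct

def encapsulate(frame: bytes, src_dc_ip: str, dst_dc_ip: str) -> bytes:
    """Wrap an Ethernet frame in a toy IP-like transport header."""
    header = struct.pack(
        "!4s4sH",                         # src addr, dst addr, payload length
        socket.inet_aton(src_dc_ip),
        socket.inet_aton(dst_dc_ip),
        len(frame),
    )
    return header + frame                 # the frame rides the routed WAN

ethernet_frame = bytes(64)                # placeholder 64-byte frame
packet = encapsulate(ethernet_frame, "192.0.2.1", "198.51.100.1")
print(f"{len(packet)} bytes on the wire for a {len(ethernet_frame)}-byte frame")
```

Because the payload crosses the WAN as ordinary routed IP, the Layer 2 loop and flooding problems described earlier stay contained within each data center.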
DCI: Weighing the Choices

Before embarking on a DCI project, consider your disaster recovery plan carefully. Can you meet your disaster recovery requirements by cold-starting from storage array replication, or even by restoring from a backup? If so, you may not need to invest in a DCI. On the other hand, if you are looking for disaster avoidance, where server instances can be evacuated between data centers when a specific event, such as a major storm or political intervention, occurs, then a DCI may be the way to go.

Perhaps the best advice is to consider your actual business requirements carefully. Migrating virtual workloads between data centers creates unique technical problems due to the complexity of traffic flows. The following technical concerns are just a few of the less obvious problems created by DCI:

>> Tracing application problems can be difficult when servers might be in either of two locations
>> Applications incur latency over the DCI for just one or two servers, resulting in unpredictable performance
>> A loop topology failure leads to outages in both data centers
>> Bandwidth exhaustion results in service loss and cannot be easily controlled

Layer 2 DCI is a last-resort technology that allows legacy applications to behave as if they were in the same Ethernet domain. The correct solution is to deploy applications that are designed to run active/active in two or more data centers, and to avoid deploying a DCI. If you choose to implement DCI, you should strictly limit its use to critical applications.
Want More Like This?
InformationWeek creates more than 150 reports like this each year, and they're all free to registered users. We'll help you sort through vendor claims, justify IT projects and implement new systems by providing analysis and advice from IT professionals. Right now on our site you'll find:

Strategy: OpenFlow vs. Traditional Networks: OpenFlow and SDN have the potential to simplify network operations and management while driving down hardware costs. But they would also require IT to rethink traditional network architectures. At the same time, other protocols are available or emerging that can provide many of the same benefits without requiring major changes. We'll look at the pros and cons of OpenFlow and SDN and how they stack up against existing options to simplify networking.

SDN Buyer's Guide: SDN products are finally hitting the enterprise market. Do you have a strategy? This report, the companion to our online comparison, explains key factors to consider in four areas: software-defined networking controllers, applications, physical or virtual switches, and other compatible hardware.

Research: IT Pro Ranking: Data Center Networking: Cisco has an iron grip on the data center network. One reason is its reputation for quality: The company scores a 4.3 out of 5 for reliability, a rating no other vendor matched. That said, technology and market changes are loosening Cisco's hold. Will the shift to virtual infrastructure, next-gen Ethernet and commodity switching components change the vendor pecking order? More than 500 IT pros weighed in to evaluate seven vendors.

PLUS: Find signature reports, such as the InformationWeek Salary Survey, InformationWeek 500 and the annual State of Security report; full issues; and much more.