Why VM Replication Is Your Lifeline When Disaster Strikes
1. VM Replication Is Your Lifeline When Disaster Strikes
Nathan Schmidt
Services Account Manager, Strategic Solutions
February 12, 2013
2. Agenda
• Our Top 5 Myths about Disaster Recovery (DR)
• Disaster Recovery Overview
• Most Common Problems with Replication
• How VM Replication Solves Those Problems
• Q&A Session
RACKSPACE® HOSTING | WWW.RACKSPACE.COM
7. Top 5 Myths About DR
I back up my data; that should suffice.

Example Schedule for Data Backups
Sun.: Full Backup
Mon.–Fri.: Incremental
(A differential backup would instead copy all changes since the last full backup.)
8. Top 5 Myths About DR
I back up my data; that should suffice.

Example Schedule for Data Backups
Sun.: Full Backup
Mon.–Fri.: Incremental
Wed.: DC Outage
9. Top 5 Myths About DR
I back up my data; that should suffice.

“Less than half of SMBs back up their data weekly or more frequently, and only 23% back up daily. 41% of the SMBs surveyed said that putting together a Disaster Recovery plan never occurred to them.”
Source: Symantec 2011 SMB Disaster Preparedness Survey
11. Top 5 Myths About DR
I don’t need a DR plan. My company can weather the storm or survive whatever disruption may lie ahead.
13. What We’ve Heard from Our Customers
“I know I need DR, but I’m not really sure what that should be.”
• Many firms think that a backup of their data is enough
• The goal of a backup is to enable data restoration
• A DR plan helps quickly restore operations
15. Building the DR Framework

Basic Concept    Concern
DC Level         Primary DC Outage
App Level        App Configuration
Info Level       Critical Data
16. How to Make DR a Goal for 2013
“I’ve tried to socialize the need for a DR plan, but it isn’t considered a priority by senior leadership.”
“How do I convince my boss that the additional cost of resiliency tools is justified?”
22. Resiliency Solutions Across the Portfolio
[Diagram: resiliency tools spanning two Rackspace data centers, the customer’s data center, and the Rackspace Cloud (legend: RAX solutions vs. partner solutions) – DNS Failover (Neustar) and Global Load Balancer at the network layer; VM Replication between virtual servers; Database Replication between managed DB servers; Host-Based Replication via the MBU (Managed Backup) infrastructure; Array-Based Storage Replication between SANs; Remote Data Replication; and Cloud Backup to Cloud Files.]
23. VM Replication – Quick Overview
• Helps protect and recover business-critical VMs when disaster strikes
• Offers geographic redundancy by replicating VMs between Rackspace DCs
24. Solving Real-World Problems
The VM Replication solution was designed by working closely with our beta customer, Virtual, Inc.

“We had the opportunity to provide feedback on the early-stage replication product and talk through the options that best met our customer’s needs. I felt that Rackspace listened carefully to our feedback and even anticipated how we intended to use and implement the solution.”
– Russell Kuhl, V.P. of Technology at Virtual
25. Common Hurdles with Replication Tools
• Cost
• Complexity
• Failover Testing
26. Host-Based vs. Guest-Based Replication

Host-Based Replication:
• Occurs at the hypervisor layer
• Replication process controlled by the virtual appliance (VA)
• Replicated VMs are inactive

Guest-Based Replication:
• Occurs at the VM layer
• Replication process controlled by the VMs
• Replicated VMs in the target DC are active
28. VM Replication Is Host-Based
It’s cost-effective.
• Replicated VMs in the target site remain off
• Only pay for the replicated VMs when they are powered on after failover
• Replicated VMs are powered down once you fail back to the source site
29. Infrastructure Costs
Redundant infrastructure represents a significant cost.
• Repurpose the redundant hypervisor in the secondary DC (e.g., as a test/dev environment)
• Downsize the server footprint in the target site to accommodate just the critical VMs
• Consider using less powerful servers, if degraded performance is acceptable
31. VM Replication Allows Different Hardware
It reduces the hardware costs related to redundancy.
• No need to replicate the entire source environment – simply select specific VMs
• Heterogeneous storage options are available for the target site (e.g., dedicated SAN to local storage)
• While replicated VMs are powered down, repurpose the redundant server
32. Lack of Expertise & Assistance
Companies may not have the expertise, the manpower, or both.
• What do I need?
• Who’s going to design it?
• Who’s going to manage and monitor it?
• Who’s going to assist with the failover?
• Who’s going to “push the button” for failover?
• Who owns the overall DR strategy?
33. Levels of Responsibility

Customer:
• DR Plan
• Failover Runbook
• “Pushing the Failover Button”

Rackspace:
• Failover Process
• Replication App. (VA)
• Virtual Machine Layer
• Guest OS Layer
• Hypervisor Layer
• Server Hardware
• DC & Network
34. Rackspace Fanatical Support Is the Key
• Replication team is available to design, monitor & manage
• Virtualization team has VCP-certified architects available

[Diagram: Fanatical Support teams – Technical Support, Account Management, Architecture Support, Business Development; Professional Services, Network Security, Backup, Storage, Virtualization, Database Administrators, Corporate Security; Data Center Operations]
35. Failover Testing
Companies don’t test their failover plan enough.
• Some replication services charge per test – expensive
• The failover/failback process can be risky in production
• The risk requires extensive planning around every test
36. VM Replication – Testing via Snapshot
No disruption to replication during the test.

[Diagram: VM 1 and VM 2 run on Hypervisor 1 / Host 1 in Rackspace DC 1. Replication of VM 2 continues to Hypervisor 2 / Host 2 in Rackspace DC 2, where the replicated VM 2 remains powered off; a snapshot of the replicated VM 2 is used for testing.]
37. VM Replication Simplifies Failover Testing
Failover testing that’s fast, free, and frequent.
• No charge for failover testing
• It’s quick to set up and doesn’t require planning
• Testing is done in a sandbox environment
38. Rackspace VM Replication
For more information on VM Replication,
please call us at 1-877-934-0409.
Nathan Schmidt, Services Account Manager, Strategic Solutions. Certified in Business Continuity from DRI International.
Here’s what we’ll cover in today’s webinar. Through talking with our customers, we have compiled a list of our top 5 myths related to DR. We’ll review what DR really means, and a good way to approach the subject. Also, we’ll identify the most common challenges that are related to replication and how our new product helps address those problems. Lastly, we’ll open the floor to questions and discussion around resiliency.
Here’s what made it on our list of Top 5 Myths About DR. We start with number 1. Just because your DC isn’t located in a disaster-prone area doesn’t mean that Mother Nature can’t surprise you with a perfect storm. Recent memory reminds us that Florida isn’t the only state that could get hit by a hurricane.
Although natural disasters do happen, they’re not the leading cause of data center outages. Statistically speaking, most self-managed DC outages are attributed to human error. The ’80s band The Human League said it best in their song titled “Human”: we’re “born to make mistakes.”
There will always be inherent risk. You can’t escape it. It’s kind of like juggling a chainsaw... after a while, things are going to get messy. So it’s not a matter of if disaster strikes, but when disaster strikes.
Let’s consider this sentence: if I back up my data, that should be enough. In terms of DR, would you consider this statement true or false?
…It’s a resounding FALSE. Even if you performed full backups weekly and incremental backups daily, this isn’t enough. Let’s take a look at the example schedule. You perform a full backup of your data every Sunday. Do you know the difference between differential and incremental? A differential backup copies the changes made since the last full backup. An incremental backup only copies the changes since the last incremental.
So what if there’s a data center outage on Wednesday? First things first, you need access to the affected servers in order to reconfigure the machines and bring them online so that you can start restoring your data. Let me stop here. If the entire DC is down, for whatever reason, then you don’t currently have access to the onsite servers. For the sake of continuing this example, let’s assume that you have invested in geographic redundancy, have your data stored offsite, and also have additional server capacity in a different DC. The next step would be to restore the most recent full backup from Sunday to the offsite servers. Then you would restore all of the subsequent incremental backups. The restoration process could potentially take several hours or even days depending on the amount of data. While your data may be protected, the restoration process is slow and doesn’t address downtime well.
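The restore-chain logic just described can be sketched in a few lines. This is a hypothetical illustration only (the function and schedule names are ours, not any vendor’s tooling): after an outage, you restore the most recent full backup plus every incremental taken after it.

```python
def restore_chain(backups, outage_day):
    """backups: list of (day, kind) tuples in chronological order,
    where kind is "full" or "incremental". Returns the backups that
    must be restored after the outage, oldest first."""
    before = [b for b in backups if b[0] < outage_day]
    # Find the most recent full backup taken before the outage...
    last_full = max(i for i, (_, kind) in enumerate(before) if kind == "full")
    # ...and restore it plus every incremental after it.
    return before[last_full:]

# Sunday = day 0 (full backup), incrementals Monday (1) through Friday (5).
schedule = [(0, "full"), (1, "incremental"), (2, "incremental"),
            (3, "incremental"), (4, "incremental"), (5, "incremental")]

# Outage on Wednesday (day 3): restore Sunday's full, then Monday's
# and Tuesday's incrementals.
print(restore_chain(schedule, 3))
# → [(0, 'full'), (1, 'incremental'), (2, 'incremental')]
```

Note how the chain grows through the week: the later the outage, the more incrementals must be replayed, which is exactly why restore times can stretch to hours or days.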
Here’s a surprising tidbit. According to the Symantec 2011 SMB Disaster Preparedness Survey, “less than half of SMBs back up their data weekly or more frequently, and only 23% back up daily. 41% of the SMBs surveyed said that putting together a DR plan never occurred to them.”
Not testing the failover enough reduces your chances of a successful failover when you really need it. If you don’t practice your failover plan and test it, then the results may leave you with additional downtime and a bewildered look on your face when the process doesn’t work like it should.
Companies may not know they need a formal DR strategy, or may not fully realize how downtime impacts their business.
I think we can all agree that few things are as scary as a zombie apocalypse. However, this frightful fact may keep you up at night. “93% of companies that lost their data for 10 days or more filed for bankruptcy within one year of the disaster, and 50% filed for bankruptcy immediately.” Scary stuff.
A common theme among Rackspace customers is that they know they need a disaster recovery plan, it’s just that they don’t know what their DR might look like. Many of the customers think that backing up their data would suffice. As we saw in Myth number 3, it’s important to protect your data, but backups do not help you quickly recover your business and keep it running after a major disruption. The goal of a backup is to enable data restoration. It doesn’t cover the operational requirements of a business. You would need a more complex disaster recovery strategy in place with the right mix of resiliency tools. A disaster recovery plan helps you restore operations quickly.
Disaster recovery is a holistic strategy that includes process, policies, people and technology. It focuses on restoring the IT systems that are critical to supporting business functions. In other words, it helps keep the business running after a major disruption occurs. A disruption could be Mother Nature’s wrath, or a guy named Bob, who installed a patch that immediately broke a critical application. If you have a DR plan in place, it’ll help you “keep the lights on,” so to speak, and the company open for business.
There are three basic concepts to consider when thinking about a DR plan. The first one is DR at the data center level. The IT manager is concerned with unplanned downtime or an outage in his primary DC. Second, is the application level. A ton of time is spent getting an app to run just right. If a disruption occurs, then the IT Manager needs to be able to simply restart the app and it should run as if nothing happened. The final concept is DR at the information level. The IT Manager is focused on having all of the critical data that he needs in order to rebuild the business quickly.
The most common challenge that we’ve heard from our customers is that DR isn’t a high enough priority. The best way to convince your boss that DR is a necessity is by showing him the impact of downtime on the business.
The impact of downtime can manifest itself in many forms, such as loss of employee productivity, new sales, existing customers, brand reputation, etc. Cuba Gooding Jr. said it in the movie Jerry Maguire: “Show me the money.” C-level execs need to understand the potential loss in terms of a monetary amount: if our company goes down, we stand to lose “x” number of dollars per hour. Gartner estimates that the average cost of downtime for a small to mid-sized company is $42,000 per hour. For larger companies or companies with an ecommerce business model, that number could easily go north of six figures per hour. Remember that factoid from myth number 5? “93% of companies that lost their data for 10 days or more filed for bankruptcy within one year of the disaster, and 50% filed for bankruptcy immediately.” Your company’s survival depends on quantifying the impact of downtime.
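Putting that “show me the money” argument into numbers is straightforward. A minimal sketch using the Gartner figure cited above ($42,000/hour for a small to mid-sized company); the outage lengths are made-up inputs for illustration:

```python
def downtime_cost(hours_down, cost_per_hour=42_000):
    """Estimated loss for an outage, using the Gartner average
    for a small to mid-sized company as the default rate."""
    return hours_down * cost_per_hour

print(downtime_cost(8))   # an 8-hour outage → 336000
print(downtime_cost(24))  # a full day down → 1008000
```

Even a single working day of downtime at the average rate lands over $300,000, which is usually enough to get a C-level exec’s attention.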
Two important metrics to include in the DR conversation are RPO and RTO. RPO stands for Recovery Point Objective. This represents how old your data is once it’s been restored. I think it’s easier to explain the metrics with a story, so I’ll spin you a yarn about a company called Necktorious, Incorporated. Let’s say Necktorious is a retail store that sells rakish scarves year-round, even during the summer. In this example that I made up, the shop uses a managed hosting provider to fully manage the servers, storage, network, and operating system for their ecomm site. That way they don’t have to worry about the infrastructure and can focus on designing next summer’s nautical-themed scarf. The owner, Fred Jones, has a very low RPO of 15 minutes for his data. Therefore, he utilizes a database replication service to copy his online store’s customer transaction data between two data centers. This resiliency tool provides about a 5-minute RPO, which is within Fred’s data loss tolerance of 15 minutes. Ideally Fred would like a zero RPO, but the DB replication fits within his budget and minimizes data loss after a disaster.
Now let’s examine RTO, or Recovery Time Objective. RTO represents how long it takes until your users are able to continue normal operations. Fred has calculated that if his ecomm site becomes unavailable, he loses $10,000 per hour in sales, or much more if People magazine recently published a photo of a B-list celeb spotted in Venice Beach wearing one of his cashmere scarves. Fred has his ecomm platform running on virtual machines that are replicated between data centers using a replication solution. The business-critical VMs are replicated every 4 hours. Since his shopping cart app and product catalog don’t change often, this interval works well from a data restoration perspective. Replicating the critical ecomm apps helps minimize downtime because there’s no data restoration involved.
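The RPO reasoning in Fred’s story can be reduced to one comparison. With interval-based replication, the worst-case data loss equals the replication interval (data written just after a cycle completes is lost). A sketch, with a function name of our own and the numbers taken from the example:

```python
def meets_rpo(replication_interval_min, rpo_objective_min):
    """Worst case, everything written since the last replication cycle
    is lost, so the achievable RPO equals the replication interval."""
    return replication_interval_min <= rpo_objective_min

# 5-minute database replication vs. Fred's 15-minute RPO: met.
print(meets_rpo(5, 15))    # → True
# 4-hour VM replication alone would not meet a 15-minute RPO,
# which is why Fred pairs it with database replication.
print(meets_rpo(240, 15))  # → False
```

This is also why Fred mixes tools: frequent database replication for the fast-changing transaction data, and less frequent VM replication for the slow-changing app servers.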
Here is a graphical representation of what our ecomm story might look like. You can see the VMware VMs being replicated to a second DC every four hours. The MySQL database is replicated more often – every 5 minutes – because of its higher data change rate. The database stores all of the critical information from online transactions.
Here’s an example of an ecomm reference architecture that incorporates VM Replication to protect two critical VMs.
Rackspace has several resiliency tools to choose from depending on your needs: DNS Failover, Database Replication, Host-Based Replication, Storage-Based Replication, Remote Data Replication, Managed Backup, and Cloud Backup. In this webinar, we’ll focus on achieving geographic redundancy through the VM Replication tool.
VM Replication provides geographic redundancy to help protect business-critical VMs in the event of a data center outage or unplanned downtime. This solution is part of the Rackspace Managed Virtualization portfolio. Managed Virtualization is a dedicated, single-tenant environment that’s based on VMware virtualization technology.
Our VM Replication offering was developed with real customer replication needs in mind. We worked with Virtual, Inc., a leading association management specialist. Virtual was seeking a simple, affordable, host-based replication solution for their client’s fully virtualized, business-critical environment, and decided that VM Replication was a good fit. Virtual engaged with Rackspace during the product development process and also served as the first beta tester. Russell Kuhl, the VP of Technology at Virtual, said, “This beta experience was beneficial for both Rackspace and Virtual. We provided feedback to Rackspace that helped develop a product that solves real-world replication scenarios. In turn, we are very satisfied to have VM Replication as part of our customer’s broader disaster recovery plan, and the peace of mind that we can count on Rackspace to provide the reliable support we’ve come to expect, delivered with a personalized touch.”
By talking with our customers, we uncovered three common hurdles associated with adopting a replication solution: cost, complexity, and making sure that the failover is tested often. I’ll also present how you can overcome each hurdle with VM Replication.
There are two common ways of replicating VMs. #1, replication at the hypervisor layer, also known as host-based replication, and #2, replication at the VM layer, or guest-based replication. Host-based replication is controlled by a virtual appliance that sits on the source and target hypervisors. The VA replicates active VMs on the source site and they’re transferred and stored on the target site in the inactive state. In other words, the replicated VMs on the target site remain powered down, and do not become active unless a failover is initiated. Guest-based replication utilizes the VMs to control the replication process. A VM on the source site would replicate the data changes to an active counterpart on the target site. The replicated VMs on the target site must be powered on because they receive the data and ensure data consistency. The replication method is an important cost consideration because with a host-based solution, you do not have to pay for the replicated VMs on the target site. Those VMs are powered down so you would only pay for them after a failover has occurred. Conversely, guest-based replication has active VMs on the target site. Even though you don’t use them for production, you still have to pay for them.
With guest-based replication, it would be like owning two homes: a main house that you live in 50 weeks out of the year, and a vacation home that you visit for two weeks in the summer. In the guest-based scenario, the lights in your summer home would be turned on all the time. Whether you’re vacationing there or not, you would have to pay for the electricity cost of the summer home, on top of the electric bill for your main home.
VM Replication is a host-based solution. You do not have to pay for the maintenance and licensing costs related to the replicated VMs in the target data center. Like I mentioned before, the replicated VMs remain powered off until a failover has been initiated. Only after a failover would you start paying for the replicated VMs in the secondary DC. Once the primary data center becomes available, then you would failback to the primary DC and power the target VMs back down.
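The billing difference between the two replication methods can be captured in a rough cost model. This is purely illustrative (the rates, hours, and function name are hypothetical, not Rackspace pricing): host-based target VMs bill only during a failover, while guest-based target VMs bill continuously.

```python
def target_vm_cost(hourly_rate, period_hours, failover_hours, host_based):
    """Cost of one replicated VM at the target site over a period.
    Host-based: the VM is powered off except during failover.
    Guest-based: the VM runs (and bills) for the whole period."""
    billable_hours = failover_hours if host_based else period_hours
    return hourly_rate * billable_hours

# One month (720 h) at a hypothetical $0.50/h per replicated VM,
# with a single 12-hour failover during that month:
print(target_vm_cost(0.50, 720, 12, host_based=True))   # → 6.0
print(target_vm_cost(0.50, 720, 12, host_based=False))  # → 360.0
```

Under these made-up numbers the guest-based target VM costs 60× more over the month, which mirrors the two-homes analogy: you pay the summer home’s electric bill year-round whether you visit or not.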
In addition to the replication method, you should also consider the costs related to the additional infrastructure needed for geographic redundancy. As I mentioned before, with host-based replication, the critical VMs that reside in the target DC are powered off. This is important because it enables you to take advantage of the idle resources in the secondary DC by using it for other purposes, like a test or dev environment. Using the redundant hypervisor for non-production workloads helps justify the additional hardware costs; as part of your failover runbook, you could make sure to shut down all apps on the redundant hypervisor before you power on the replicated VMs and fail over. If you only replicate the most critical VMs and not the entire environment, then you probably don’t need to reproduce the exact hardware setup that you use for production. The infrastructure in the target DC could have a smaller footprint than the source DC – fewer servers, less cost. Another option you could consider is using less powerful servers and storage in the secondary data center. The target infrastructure could be designed to use less expensive equipment, so that the backup VMs would run with degraded performance for a short time until you fail back to the primary DC. Depending on your needs, a temporary performance degradation may be acceptable.
Keeping with the dual-home analogy from before, let’s say that your main house is a 4-bed, 3-bath with a 3-car garage. It’s the perfect size for a family with three kids and a dog. When you and the spouse go on vacation to the summer home, you leave the kiddos with grandma at the main house. Let’s be honest, all parents deserve a kid-free vacation once a year. Now let’s imagine that your summer home is a beach bungalow. It has only one bedroom and a thatched roof. There’s no garage, just an open grassy backyard with a mango tree. It’s a perfect fit for just the two of you – no kids. Heterogeneous server infrastructure is like having a 4-bed house for a primary data center, and a bungalow for the secondary DC. Since it’s only you two vacationing at the summer home, you don’t need the extra space and storage for a full family. It’s the same concept for replicating just a handful of critical VMs to the target site.
VM Replication allows for heterogeneous infrastructure between the source data center and the target DC. VM Replication lets you select only the business-critical VMs that you want replicated. And since there is no need to replicate the entire environment, you can have a smaller server footprint in the target DC. For example, you may have a cluster of three servers in the source data center and a single server in the target DC. Unlike array-based replication, you’re not required to have the same expensive storage systems on both ends. For example, you could have a dedicated SAN storage system in the primary DC and simply use the server’s local storage in the secondary data center.
You know that it’s best practice to have a DR plan, but who’s going to develop it? Even if a plan already exists, who’s going to implement, test, and manage it? These represent just some of the questions that you may have around the complexity of disaster recovery. While Rackspace does not sell an end-to-end DR solution, we do offer a wide range of resiliency tools and the deep knowledge and experience to help you design, architect, and configure a solution that will fit into your DR strategy.
Here are the areas where Rackspace can lend a helping hand, and the areas that the customer must own. The top level is the holistic DR strategy. This is owned by the customer. Remember when we defined disaster recovery? It encompasses more than just the technology: also the policies, people, and process. The customer is responsible for creating the DR plan, training the appropriate employees, creating the failover runbook, testing the failover often, making the “go-time” decision to fail over after a disruption occurs, and then deciding to fail back once the primary DC comes online. Rackspace is responsible for failing over to the target DC once authorization has been given by the customer. Rackspace also monitors the VM Replication virtual appliance and alerts the customer when a replication fails to complete. As part of the Managed Virtualization service, Rackspace also manages the VM, guest OS, and hypervisor layers. In addition to the software layers, the dedicated hardware, network, and DC are also covered. Failover is the customer’s responsibility, but we assist and are on call during the process.
Rackspace Fanatical Support is your answer to a lack of resiliency expertise and an overburdened IT department. Our Virtualization team manages one of the largest VMware deployments in the world. We have architects available who are VMware Certified Professionals. They are here to help you design and deploy the right mix of infrastructure and the virtualization layer. The Replication team focuses solely on our replication tools. They have deep experience with building replication solutions for all types of workloads, use cases, performance requirements, and budgets.
Failover testing can be expensive, risky, and planning-intensive. These are the challenges that prevent companies from testing their failover plan enough. How often should the failover be tested? That probably depends on how often the replicated VMs are updated or changed. Introducing a new patch on the guest OS, or changing an app’s configuration, would warrant a failover test to make sure that everything starts up and runs as expected. Unfortunately, this can be an expensive endeavor. Some services charge a fee for every failover test, which can really add up and eat into the DR budget. Testing the failover and failback process often increases your chances of incurring actual downtime because your production environment is involved. What happens if your primary DC does not fail back properly? In order to mitigate the risk of not being able to recover from a failover test, you would need extensive planning and coordination around every test.
Let’s take a look at how Rackspace VM Replication handles failover testing. The risk of not being able to fail back is removed by conducting the test within the target Rackspace data center. A snapshot is created of the replicated VM 2. The snapshot allows you to test the replicated VM without putting your production environment at risk. When you’re done testing, simply trash the snapshot and continue the replication.
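The snapshot-test workflow just described can be modeled in a few lines. This is a toy illustration of the sequence only – the class and method names are ours, not a real Rackspace or VMware API: replication keeps running while a snapshot copy is exercised in the sandbox and then discarded.

```python
class ReplicatedVM:
    """Toy model of a VM replicated to a target DC."""

    def __init__(self, name):
        self.name = name
        self.replicating = True   # replication runs continuously
        self.snapshots = []       # sandbox copies used for testing

    def test_failover(self):
        snap = f"{self.name}-test-snapshot"
        self.snapshots.append(snap)   # create the sandbox copy
        # ...power on the snapshot, verify apps start, run checks...
        self.snapshots.remove(snap)   # trash the snapshot when done
        return self.replicating       # replication was never paused

vm = ReplicatedVM("VM2")
print(vm.test_failover())  # → True: production replication undisturbed
print(vm.snapshots)        # → []: no leftover test artifacts
```

The point of the model: the test never touches the production VM or interrupts the replication stream, which is what makes frequent, low-risk testing possible.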
VM Replication makes it easy to test frequently. There’s no charge to test, it’s as simple as creating a snapshot, and you won’t put your production environment at risk.
Thank you for your time. Please stick around if you have any questions.