This presentation demonstrates the long-awaited Backup & Recovery Framework feature. It includes a real-world demonstration and explains the design philosophy, including how the feature has been built to be agnostic of the backend backup and recovery software in use.
Paul Angus - CloudStack Backup and Recovery Framework
1. @CLOUDYANGUS @CLOUDSTACK #CCC2019US @APACHECON
The ‘so close I can almost smell it’ Intro to:
CloudStack Backup and Recovery Framework
Paul Angus, CTO • @CloudyAngus
paul.angus@shapeblue.com
Paul Angus – CTO
• VP/Chair Apache CloudStack project
• 20+ years C-Level experience.
• Global authority on CloudStack & cloud infrastructure design
• Specialising in the deployment of CloudStack and surrounding infrastructure, especially the user story
• USP, Georgian Ministry of Justice, Orange, TomTom, PaddyPower, Ascenty, BSkyB, SAP, BT, Ticketmaster
About me
• Background to the Backup and Recovery Framework
• Why is this a ‘framework’? (and why is it in quotes?)
• What will the framework work with?
• An example would be nice...
CloudStack Backup and Recovery Framework
VM Snapshot
• A hypervisor-driven point-in-time image of a virtual machine’s disks. The exact mechanism depends on the hypervisor.
Volume Snapshot
• A point-in-time image of a specific volume. The process usually involves taking a VM snapshot, copying the required volume to secondary storage and then deleting the VM snapshot.
What is a snapshot
Snapshots are used by many (everyone) as a form of
backup
What’s snapshots got to do with this?
What’s wrong with snapshots?
• Not consistent across volumes
• A VM with separate disks for, say, OS, logs and data will have each disk processed at different times
• Only crash-consistent within a volume, and not application-aware
• A database will not be quiesced before the snapshot is taken
• Transaction logs won’t be processed
• VMware Tools or XenTools may help with some basics
What’s wrong with snapshots?
• They can be very slow.
• Data has to be transferred from primary storage to secondary storage
• In the case of vSphere, this data must go via the SSVM to be compressed into an OVA while being moved to secondary storage.
• CloudStack locks out other actions while a snapshot is taking place. Workarounds have their own issues.
What’s wrong with snapshots?
• Restoration of a VM requires a number of steps (and did I mention it’s slow?)
• User must make a template from snapshot (VM image copied via SSVM from one part of secondary storage to another: 2x network transfers)
• User then creates VM from template (VM image copied from secondary to primary to create template on primary storage, then primary-storage-based template copied to make actual VM: 2x network transfers, assuming network-based primary storage, and 1x on-disk copy)
• To restore a VM ‘exactly as it was’ requires considerable operator intervention
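To put rough numbers on the restore path described on this slide, here is a minimal sketch that tallies the copies it lists. The 100 GB image size and the 1 Gbit/s link speed are illustrative assumptions, not figures from the talk, and the estimate ignores compression, protocol overhead and the time taken by the on-disk copy.

```python
# Tally of the copies needed to restore a VM from a snapshot, as listed on the slide.
# IMAGE_GB and LINK_GBIT_PER_S are assumed, illustrative values.
IMAGE_GB = 100
LINK_GBIT_PER_S = 1

RESTORE_STEPS = [
    # (step, network transfers, on-disk copies)
    ("make template from snapshot", 2, 0),
    ("create VM from template", 2, 1),
]


def restore_cost(steps, image_gb, link_gbit_per_s):
    """Return (network transfers, on-disk copies, GB moved, seconds on the wire)."""
    network_transfers = sum(n for _, n, _ in steps)
    disk_copies = sum(d for _, _, d in steps)
    gb_over_network = network_transfers * image_gb
    seconds = gb_over_network * 8 / link_gbit_per_s  # GB -> Gbit, then divide by link speed
    return network_transfers, disk_copies, gb_over_network, seconds


transfers, copies, gb_moved, seconds = restore_cost(
    RESTORE_STEPS, IMAGE_GB, LINK_GBIT_PER_S
)
print(f"{transfers} network transfers, {copies} on-disk copy, "
      f"{gb_moved} GB moved, roughly {seconds / 60:.0f} minutes on the wire")
```

Under these assumptions the image crosses the network four times before the restored VM exists, which is the cost the framework aims to avoid.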
Must Have
• Users able to back up whole VM or individual volumes
• Users able to schedule their backups and keep n versions
• Users able to restore VM in-place (overwrite existing VM)
• Ability for Cloud Operators to leverage hardware capabilities of their respective storage solutions
• Seamless operation of SAN-assisted vs non-assisted backups
Highly Desirable
• Users able to restore to alternate location
• Users able to restore to original location when original VM has been deleted
• Support for commercial backup software: Rubrik, Veeam, CommVault
• Support for 3rd Party Backup Solutions (e.g. Amanda)
• Support for in-guest (client based) backup solutions
Backup and Recovery Framework History
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258527
June 2017
The CloudStack Backup & Recovery Framework
• The framework abstracts typical backup & recovery operations
• Plugins are required to drive physical actions by the backup solution
• Each backup vendor can have a plugin for their solution
• (A dummy plugin has been created for testing the B&R framework API)
Why is this a ‘framework’?
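The split between framework and plugins on this slide can be sketched as an abstract interface plus one implementation per backup solution. This is only an illustration of the idea: the class and method names below are hypothetical and are not the framework's actual plugin contract.

```python
from abc import ABC, abstractmethod


class BackupProvider(ABC):
    """Hypothetical plugin contract; the method names are illustrative only."""

    name: str

    @abstractmethod
    def take_backup(self, vm_id):
        """Create a restore point for the VM and return its identifier."""

    @abstractmethod
    def list_restore_points(self, vm_id):
        """Return the restore point identifiers known for the VM."""

    @abstractmethod
    def restore_vm(self, vm_id, restore_point_id):
        """Restore the VM from a restore point; return True on success."""


class DummyProvider(BackupProvider):
    """In-memory stand-in, in the spirit of the dummy plugin used for API testing."""

    name = "dummy"

    def __init__(self):
        self._restore_points = {}

    def take_backup(self, vm_id):
        points = self._restore_points.setdefault(vm_id, [])
        point_id = f"{vm_id}-rp{len(points) + 1}"
        points.append(point_id)
        return point_id

    def list_restore_points(self, vm_id):
        return list(self._restore_points.get(vm_id, []))

    def restore_vm(self, vm_id, restore_point_id):
        return restore_point_id in self._restore_points.get(vm_id, [])


# The framework only ever talks to the abstract interface.
PROVIDERS = {provider.name: provider for provider in (DummyProvider(),)}


def framework_backup(provider_name, vm_id):
    """Dispatch an abstract 'take a backup' operation to the configured plugin."""
    return PROVIDERS[provider_name].take_backup(vm_id)
```

A vendor plugin would subclass the same interface and translate each call into that vendor's own terminology and API.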
Backup Challenge:
• Terminology tends to be vendor dependent
• Different vendors have very different models
• Some vendors’ products are not especially cloud-friendly
Abstracted Operations
Backup Types
• Scheduled
• Ad-hoc
• SLA based
• SLA based backups are mutually exclusive with scheduled and ad-hoc
backups for a given VM.
Concepts / Terminology
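The mutual-exclusivity rule on this slide can be expressed as a small validation check. The enum and function below are hypothetical names used for illustration; how the framework actually enforces the rule is not shown in the slides.

```python
from enum import Enum


class BackupType(Enum):
    SCHEDULED = "scheduled"
    AD_HOC = "ad-hoc"
    SLA = "sla"


def can_add(existing, requested):
    """Return True if `requested` may be enabled on a VM already using `existing` types.

    SLA-based backups exclude scheduled and ad-hoc backups for the same VM,
    while scheduled and ad-hoc backups may be combined with each other.
    """
    if not existing:
        return True
    if requested is BackupType.SLA:
        return existing == {BackupType.SLA}
    return BackupType.SLA not in existing
```

For example, a VM that already has a schedule can also take ad-hoc backups, but cannot be assigned an SLA-based offering until the schedule is removed.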
‘SLA’ based backups
• No fixed time for backup, i.e. no explicit schedule
• Instead use a ‘policy’, ‘SLA’ or ‘offering’, i.e.:
• How often to take a backup (Recovery Point Objective)
• How long to keep a backup
• Optionally a backup window
Concepts / Terminology
*ACS API refers to a snapshot policy where it should be called a schedule (as it is in the UI)
SLA/Offering based backups
• Advantages:
• Software/operator can optimise exact backup times instead of having 1000s of jobs all start at 23:00
• Simplification of offerings to end users: Gold, Silver, Bronze etc.
Concepts / Terminology
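As an illustration of the first advantage, the sketch below spreads backup start times evenly across a backup window instead of starting every job at the same moment. Even spacing is just one simple strategy chosen for the example; the function name, VM identifiers and window are assumptions, and real backup software may optimise quite differently.

```python
from datetime import datetime


def spread_start_times(vm_ids, window_start, window_end):
    """Assign each VM one start time, evenly spaced across [window_start, window_end)."""
    if window_end <= window_start:
        raise ValueError("window_end must be after window_start")
    vm_ids = list(vm_ids)
    if not vm_ids:
        return {}
    step = (window_end - window_start) / len(vm_ids)  # timedelta between starts
    return {vm: window_start + i * step for i, vm in enumerate(vm_ids)}


# Four VMs sharing a four-hour window that opens at 23:00 (illustrative values).
plan = spread_start_times(
    ["vm-1", "vm-2", "vm-3", "vm-4"],
    datetime(2019, 9, 9, 23, 0),
    datetime(2019, 9, 10, 3, 0),
)
for vm, start in plan.items():
    print(vm, start.strftime("%H:%M"))
```

With these inputs the four jobs start an hour apart, rather than all four starting at 23:00.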
Concepts / Terminology – What is a ‘backup’
[Diagram: VM timeline showing a Full Backup followed by Incrementals 1 to 4, giving Restore Points 1 to 5, all contained in one Backup Chain (Veeam ‘job’ == Backup Chain)]
• ‘Taking a Backup’ creates a Restore Point
• A Backup Chain can be thought of as a container for restore points
• The first ‘backup’ must therefore create a backup chain for the first and subsequent ‘backups’ to live in.
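The relationship between backups, restore points and backup chains can be sketched as a small data structure. The class and function names are hypothetical, and treating the first backup as full and later ones as incremental simply mirrors the diagram on this slide.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class RestorePoint:
    index: int
    kind: str  # "full" or "incremental"
    taken_at: datetime


@dataclass
class BackupChain:
    """Container for all restore points of one VM."""

    vm_id: str
    restore_points: List[RestorePoint] = field(default_factory=list)

    def take_backup(self, taken_at):
        # The first restore point in a chain is a full backup, the rest are incremental.
        kind = "full" if not self.restore_points else "incremental"
        point = RestorePoint(len(self.restore_points) + 1, kind, taken_at)
        self.restore_points.append(point)
        return point


chains = {}


def take_backup(vm_id, taken_at):
    """'Taking a backup' creates a restore point, creating the chain first if needed."""
    chain = chains.setdefault(vm_id, BackupChain(vm_id))
    return chain.take_backup(taken_at)
```

Calling `take_backup` twice for the same VM yields one chain holding a full restore point followed by an incremental one.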
The CloudStack Backup and Recovery Framework
Creates a vendor agnostic API and UI in CloudStack for end users to leverage 3rd
party backup and recovery solutions.
The B&R Framework abstracts the specifics of B&R solutions, such that through the
use of a plugin, a 3rd party B&R solution can deliver ….
The Practicalities
The Practicalities
Use importBackupProviderOffering to link CloudStack backup offering with a Veeam Template job
Operator can name/describe job on ACS side independently
The Practicalities
Offering/template job contains the frequency of backup and allowed times
The Practicalities
Offering/template job repeated for each zone as (in Veeam) a job also contains the backup destination, which could be different in each CloudStack zone
*A default job must be created for each zone for ad-hoc and scheduled jobs
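One way to picture the per-zone arrangement is as a lookup keyed by offering and zone, with a per-zone default job for ad-hoc and scheduled backups. All offering, zone and job names here are made up for illustration and do not come from the plugin.

```python
# (offering name, zone) -> template job. Every name here is hypothetical.
TEMPLATE_JOBS = {
    ("gold", "zone-1"): "gold-template-zone-1",
    ("gold", "zone-2"): "gold-template-zone-2",
}

# One default job per zone, used for ad-hoc and scheduled backups.
DEFAULT_JOBS = {
    "zone-1": "default-zone-1",
    "zone-2": "default-zone-2",
}


def job_for(zone, offering=None):
    """Pick the template job for an offering-based backup, or the zone's default job."""
    if offering is None:
        return DEFAULT_JOBS[zone]
    return TEMPLATE_JOBS[(offering, zone)]
```

The same offering name resolves to a different job in each zone, which is how each zone can have its own backup destination.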
The Practicalities
Additional features can be included/excluded in the backup offering
Veeam B&R Plugin
‘Ad-hoc’ & Scheduled Backups
• CloudStack will send a command to carry out an ad-hoc run of the policy-based job, or create one as required.
• Ad-hoc and scheduled jobs will be driven by CloudStack
• One VM to one Veeam job mapping allows for simpler accounting and unified incremental backups
Veeam B&R Plugin
Listing Backups
listVMBackups
• Veeam thinks in terms of restore points as being the image of a VM at any point in time. This API lists the restore points available for a given VM backup job (VM)
VM and Volume Restoration
• Veeam ‘in-place’ restoration is more in-place-of than in-the-original-place
• The Veeam plugin will take care of the fact that the ‘restored’ VM is in an alternate location to the ‘original’, using the upcoming VM Ingestion feature.
• Current implementation automatically attaches a restored volume to a VM (option to restore to secondary storage will be added)
Veeam B&R Plugin
APIs*
• Backend Configuration
• listBackupProviders
• listBackupProviderOfferings
• importBackupProviderOfferings
• listBackupOfferings
• deleteBackupOffering
Abstracted Operations
* Subject to change
APIs*
• User Backup Creation
• createVMBackupSchedule
• updateVMBackupSchedule
• deleteVMBackupSchedule
• listVMBackupSchedules
• createVMBackup
• listVMBackups
• deleteVMBackup
Abstracted Operations
• User Offering Based backups
• assignVMToBackupOffering
• removeVMFromBackupOffering
• User Backup Restoration
• restoreVMBackup
• restoreVolumeFromBackup
* Subject to change
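As a sketch of how one of these commands might be called, the example below builds a signed request URL, assuming CloudStack's standard request signing (HMAC-SHA1 over the sorted, lower-cased query string, then Base64). The command name comes from the list above, but the `virtualmachineid` parameter, the endpoint and the keys are placeholders, and the API itself is marked as subject to change.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_url(endpoint, api_key, secret_key, command, **params):
    """Build a signed CloudStack-style API URL for the given command."""
    query = dict(params, command=command, apikey=api_key, response="json")
    # URL-encode the values and sort the pairs by lower-cased field name.
    pairs = sorted(
        ((k, quote(str(v), safe="")) for k, v in query.items()),
        key=lambda kv: kv[0].lower(),
    )
    request = "&".join(f"{k}={v}" for k, v in pairs)
    # Sign the lower-cased request string with HMAC-SHA1, then Base64-encode it.
    digest = hmac.new(
        secret_key.encode(), request.lower().encode(), hashlib.sha1
    ).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return f"{endpoint}?{request}&signature={signature}"


# Placeholder endpoint, keys and parameter name, for illustration only.
url = signed_url(
    "http://localhost:8080/client/api",
    "my-api-key",
    "my-secret-key",
    "listVMBackups",
    virtualmachineid="vm-1",
)
print(url)
```

Because the framework's API is vendor agnostic, the same request shape would apply whichever backup plugin sits behind it.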
Dummy Plugin
• Built for compile-time testing of the APIs.
Veeam Backup & Replication 9.5 (for VMware)
• Plugin written against Veeam Backup & Replication 9.5 and Apache CloudStack 4.12
• Tested against VMware 6.5
• Requires Veeam Backup & Replication Console, Veeam Backup Enterprise Manager, OpenSSH and PowerShell on the Veeam Server
What will the framework work with?
https://github.com/apache/cloudstack/pull/3553
• The PR is kind of like an alpha; we learned lessons about the UX to be implemented.
• API backend largely done
• Work left:
• Separate the UI and API parts
• Refactor API ‘parts’ taking into account lessons learned.
• Scheduling of backups
• Allow changing of underlying policy
• Completion of the UI
• Targeting end of Q4 for full API to be merged
Where we’re at