Cloud Foundry Summit 2015: Using Service Brokers to Manage Data Lifecycle

Speaker: Josh Kruck. To learn more about Pivotal Cloud Foundry, visit pivotal.io/platform-as-a-service/pivotal-cloud-foundry

Using Service Brokers to Manage Data
Lifecycle
Josh Kruck | @krujos
jkruck@pivotal.io
github.com/krujos

2
What are some operational problems with data?

3
Business Critical Data Lifecycle
RTO 00:05 RPO 01:00
First 12 hours
[Diagram labels: Primary, Primary, DR, Backup, Snapshots, Replica, Backup]

4
Business Critical Data Lifecycle
RTO 00:05 RPO 01:00
First 24 hours
[Diagram labels: Primary, Backup, Backup, Primary, Snapshots, Replica, Backup, DR]

8
(capex is easy, just buy more stuff)
copies aren’t really the problem!

10
managed by 3 systems
[“storage”, “backup”, “rdbms”]

11
and 5 teams.
[“storage”, “backup”, “offsite provider”, “app owner”, “dba”]

12
(you shouldn't buy more people)
opex is the problem

13
what’s the read/write load on the copy?

14
0
5475 copies doing nothing for your business
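
The numbers in this part of the talk can be checked with a few lines of arithmetic. This is only a sanity check, not the speaker's derivation: the deck does not show how 5476 is reached. The reading that exactly one copy (the primary) carries load, and the observation that the idle count divides evenly into a per-day figure, are assumptions.

```python
# 525,600 is the number of minutes in a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60

# Total copies quoted in the talk.
TOTAL_COPIES = 5476

# Assumption: only the primary serves reads and writes, so every other copy is idle.
idle_copies = TOTAL_COPIES - 1

# Observation (not stated in the deck): the idle count is a whole number of copies per day.
copies_per_day = idle_copies / 365

print(MINUTES_PER_YEAR, idle_copies, copies_per_day)
```

Under that reading, the 5475 on this slide is simply everything except the primary.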

15
Why all this talk about backups and stuff?

16
A play in 3 acts
Good code needs good tests.
Good tests need good data.
Good data needs… a copy.
so let’s get one!

17
“I don’t think we have any copies of that”

18
“I’m not allowed to have prod logs, much less the db”

19
we can do it, this one time: file a ticket.

20
Solved!
But did we create another problem?

21
Once you find a copy, it needs a curator
Sizing (don’t use all 10 TB of prod to test)
But your sample must represent the entirety of the dataset.
Representative curation is futile with most datasets (unknown unknowns).
Sizing means you restrict your tests to what you left in.
Sizing hides performance issues (missing index)
So maybe it’s not worth it….
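
To put a rough number on why a sized-down sample struggles to be representative, here is a minimal sketch that computes the chance a uniform random sample contains none of a handful of rare rows (say, rows written under an old schema). The function name and the example figures are illustrative assumptions, not data from the talk.

```python
def prob_sample_misses(total_rows: int, rare_rows: int, sample_size: int) -> float:
    """Probability that a uniform sample (without replacement) of `sample_size`
    rows contains none of the `rare_rows` unusual rows in the table."""
    if sample_size + rare_rows > total_rows:
        # The sample is too large to avoid every rare row.
        return 0.0
    p = 1.0
    # Hypergeometric P(0 hits) = prod_{i<k} (N - n - i) / (N - i)
    for i in range(rare_rows):
        p *= (total_rows - sample_size - i) / (total_rows - i)
    return p


# Hypothetical table: 1,000,000 rows, 10 odd ones, 1% sample.
print(prob_sample_misses(1_000_000, 10, 10_000))
```

With these made-up figures the sample misses every one of the odd rows roughly nine times out of ten, which is the "you restrict your tests to what you left in" problem.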

22
Once you find a copy, it needs a curator
Sanitize it!
Can’t have SSNs and credit card numbers in test
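
A minimal sketch of what a sanitizing pass could look like, assuming the sensitive values appear as plain text in a field. The two patterns and the replacement strings are illustrative assumptions; real datasets will hold formats these do not catch, and a production scrubber would need more than regexes.

```python
import re

# Illustrative patterns only: US-style SSNs and 13-16 digit card numbers,
# optionally separated by spaces or hyphens.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def sanitize(text: str) -> str:
    """Mask anything that looks like an SSN or a card number."""
    text = SSN_RE.sub("XXX-XX-XXXX", text)
    return CARD_RE.sub("[REDACTED-CC]", text)


print(sanitize("ssn 123-45-6789, card 4111 1111 1111 1111"))
```

The point of automating this step is that it runs every time a copy is made, rather than whenever someone remembers.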

23
Once you find a copy, it needs a curator
Delete!
old data smells funny.

24
Once you find a copy, it needs a curator
Refresh!
GOTO 10

25
Curation is expensive
hard|complex
manual
infrequent
error prone
handoffs
deletion
ownership

26
A manual process that starts with a ticket is the wrong solution

27
The sum of the mess is worth more than its parts
There are 5475 secondary copies with no load. Can we leverage them for testing?
Fix: Let CF manage your data.

29
most copies do nothing, but when the sky is falling you need them
first do no harm

30
Pattern:
cf create-service
Copy Data
Sanitize Data
cf push <app>
Test
cf delete <app> -r -f
cf delete-service
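
The pattern can be sketched as a small driver that a CI job might run. This is a sketch under assumptions, not the speaker's tooling: the offering name `data-lifecycle`, the plan name `copy`, and the `run_tests` callable are hypothetical, and the copy and sanitize steps are assumed to happen inside the broker while `cf create-service` runs. The app's manifest is assumed to bind the service instance.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def cf(args: Sequence[str]) -> None:
    """Run one cf CLI command and raise if it fails."""
    subprocess.run(["cf", *args], check=True)


def exercise_copy(app: str, instance: str, run_tests: Callable[[], None],
                  run: Runner = cf,
                  offering: str = "data-lifecycle",  # hypothetical offering name
                  plan: str = "copy") -> None:       # hypothetical plan name
    # Provisioning is where the broker is assumed to copy and sanitize the data.
    run(["create-service", offering, plan, instance])
    try:
        # The manifest is assumed to list `instance` under `services:`.
        run(["push", app])
        run_tests()
    finally:
        # Always tear down so the copy does not outlive the test run.
        run(["delete", app, "-r", "-f"])
        run(["delete-service", instance, "-f"])
```

Passing a different `run` makes it possible to dry-run the sequence without a Cloud Foundry target, and the `finally` block is the part that replaces the "Delete!" chore from the curation slides.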

31
How do you fill in that hand-wavy part in the middle?

32
Putting the E in Enterprise
Buy a CDM (copy data management) product
Actifio, Delphix, ViPR
Great if they support your workloads!
And you can consume the form factors they deliver

33
BYO
Based on technology to allow layered writes
Layered FS (Docker, Docker, Docker)?
Clones, Linked Clones, VM Snaps
Writeable Snapshots (FlexClone, XtremIO, LVM Snaps)
Building is harder than buying
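
As a toy analogy for "layered writes", the sketch below stacks an empty writable layer on top of a read-only base, so writes land in the top layer and the base is never touched. It uses Python's `collections.ChainMap` purely as an illustration; it is not how any of the storage products named above are implemented, and the sample data is made up.

```python
from collections import ChainMap
from typing import Dict


def make_clone(base: Dict[str, str]) -> ChainMap:
    """Return a writable view: reads fall through to `base`, writes stay on top."""
    return ChainMap({}, base)


prod = {"alice": "balance=100", "bob": "balance=250"}  # stands in for the prod volume
clone = make_clone(prod)

clone["alice"] = "balance=0"   # the write lands in the top layer only
print(prod["alice"])           # the base still holds the original value
print(clone["bob"])            # unmodified keys are read through from the base
```

One difference from a real snapshot: this view follows later changes to the base, whereas a snapshot is frozen at the moment it is taken.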

34
cf create-service
Snap Prod VM
Spin up VM
Allocate IP
Sanitize Data in PG
cf push demo
Test
Dispose
AMI and Postgres Demo
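
The demo flow can be sketched as the three calls a service broker answers. This is a minimal in-memory sketch, not the code in the linked repository: the `Iaas` interface, its method names, and the credentials shape are hypothetical stand-ins for whatever cloud API and database a real broker would use. In the Open Service Broker API, `provision` corresponds to `PUT /v2/service_instances/:instance_id`, `bind` to `PUT /v2/service_instances/:instance_id/service_bindings/:binding_id`, and `deprovision` to `DELETE /v2/service_instances/:instance_id`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Protocol


class Iaas(Protocol):
    """Hypothetical IaaS operations; a real broker would call its cloud API here."""
    def snapshot(self, source_vm: str) -> str: ...
    def launch(self, image_id: str) -> str: ...
    def allocate_ip(self, vm_id: str) -> str: ...
    def terminate(self, vm_id: str) -> None: ...
    def delete_image(self, image_id: str) -> None: ...


@dataclass
class Copy:
    image_id: str
    vm_id: str
    ip: str


@dataclass
class LifecycleBroker:
    iaas: Iaas
    prod_vm: str
    sanitize: Callable[[str], None]  # hypothetical hook: scrub the database at this IP
    copies: Dict[str, Copy] = field(default_factory=dict)

    def provision(self, instance_id: str) -> None:
        # Snap prod, boot the snapshot, give it an address, then scrub it.
        image = self.iaas.snapshot(self.prod_vm)
        vm = self.iaas.launch(image)
        ip = self.iaas.allocate_ip(vm)
        self.sanitize(ip)  # runs against the copy only, never against prod
        self.copies[instance_id] = Copy(image, vm, ip)

    def bind(self, instance_id: str) -> dict:
        # Hand the app where to find its copy (credentials shape is an assumption).
        copy = self.copies[instance_id]
        return {"credentials": {"host": copy.ip, "port": 5432}}

    def deprovision(self, instance_id: str) -> None:
        # Dispose: nothing from the test copy is left behind.
        copy = self.copies.pop(instance_id)
        self.iaas.terminate(copy.vm_id)
        self.iaas.delete_image(copy.image_id)
```

Sanitizing inside `provision`, before any credentials are handed out, means an app can never bind to an unscrubbed copy.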

35
https://github.com/krujos/data-lifecycle-service-broker
please help!

Cloud Foundry Summit 2015: Using Service Brokers to Manage Data Lifecycle

5. 5 525,600 minutes

6. 6 5476 copies

9. 9 The real problem is 5476 copies are…

28. 28 How?

Editor's Notes

Act I: how do I get the copies?
Much sleuthing and failed attempts to generate legit test data later…
Act II
Act III: I have a customer who hasn’t refreshed test data in three years.
“Represent the entirety of the dataset” means covering things like previous schemas: rows with missing additive fields, FKs, etc. Is selecting those records going to cause issues? What about formats assumed in the data itself (but surely no one stores encoded information in their database)? Does everyone know the data well enough to know what “representative” is? (No.)
