Migrating EBI into the cloud
Lessons learned… so far
Tony Wildish, Cloud bioinformatics lead architect
wildish@ebi.ac.uk
EBI and the cloud
• A few EBI teams have made forays into the cloud
• End-of-grant money, PoC prototypes, small services
• Nothing big, sustained…
• Want to make bigger steps into the cloud, but how…?
• This talk: some of the lessons learned so far…
Why migrate EBI into the cloud?
[Figure: data volume over time; axis ticks at 1 GB, 1 TB, 1 PB, from 2004 to 2019]
• Data growing exponentially
• Doubling every ~2 years
• No sign of slowing down
• Exotic hardware
• High-memory machines, GPUs
• Hard to utilize well in HPC environment
• Expensive to buy
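As a rough illustration of the "doubling every ~2 years" point above, here is a minimal sketch that assumes idealised exponential growth. The function name and the 2.0-year default are illustrative assumptions, not figures taken from EBI's accounting.

```python
def growth_factor(years, doubling_time=2.0):
    """Multiplier on stored volume after `years`, assuming the volume
    doubles once every `doubling_time` years (idealised model)."""
    return 2 ** (years / doubling_time)


if __name__ == "__main__":
    for years in (2, 4, 10):
        print(years, "years ->", growth_factor(years), "x")
```

Under that assumption, ten years of growth multiplies the stored volume by 32.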
EBI workloads
• Web services
• Upload file, wait for processing, browse results
• Not particularly demanding => not particularly interesting (to me!)
• Batch processing
• N files in, crunch, M files out
• Often periodic, e.g. triggered by upstream release of new version of source data
• Can be very CPU/data intensive (e.g. metagenome assembly: weeks-long runtime)
• Dozens of different pipelines, in different groups, in different languages
• Varying historical legacy, code quality…
Workflows, HPC vs. cloud-native
[Diagram: three separate pipelines, each running Step 1, Step 2, … Step N]
Typical HPC workflow:
• Workflow-oriented
• Inefficient use of resources
• Exotic hardware (GPUs…) oversubscribed, underutilized
• Hard to scale up
[Diagram: Step 1 to Step N linked by queues, with several parallel instances of Step 2; data in shared FS, object store, database…]
Cloud-native dataflow:
• Dataflow-oriented
• Efficiency per-step
• Easy to scale up/out
• Portable to multi/hybrid-cloud
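The cloud-native pattern above (steps linked by queues, each step scaled on its own) can be sketched in a few lines. This is a toy, in-process illustration only: the names `start_step` and `run_dataflow` are hypothetical, threads stand in for independently scaled services, and a real deployment would more likely use a managed message queue and an object store.

```python
import queue
import threading

_DONE = object()  # sentinel telling a worker to stop


def start_step(func, inbox, outbox, workers):
    """Start `workers` threads that apply `func` to items from `inbox`
    and put the results on `outbox`."""
    def work():
        while True:
            item = inbox.get()
            if item is _DONE:
                return
            outbox.put(func(item))

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    return threads


def run_dataflow(items, steps):
    """Push `items` through `steps`, a list of (func, workers) pairs.
    Results come back in arbitrary order."""
    queues = [queue.Queue() for _ in range(len(steps) + 1)]
    running = [
        start_step(func, queues[i], queues[i + 1], workers)
        for i, (func, workers) in enumerate(steps)
    ]
    for item in items:
        queues[0].put(item)
    # Shut down step by step: once a step's input is complete, send one
    # sentinel per worker and wait for that step to drain.
    for i, threads in enumerate(running):
        for _ in threads:
            queues[i].put(_DONE)
        for t in threads:
            t.join()
    results = []
    while not queues[-1].empty():
        results.append(queues[-1].get())
    return results


if __name__ == "__main__":
    # Step 2 gets four workers; steps 1 and 3 get one each.
    steps = [(lambda x: x + 1, 1), (lambda x: x * 2, 4), (lambda x: x - 1, 1)]
    print(sorted(run_dataflow(range(5), steps)))
```

The point of the design is that the worker count is set per step, so a slow or hardware-hungry step can be scaled without touching the others.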
Cost optimization: Data vs. Compute
• Compute is ‘easy’
• Many options for cost optimization
• Reserved instances, spot markets, sustained use discounts
• VM -> containers -> serverless functions
• Many monitoring/advisory tools
• Data is harder
• Can’t ‘turn it off’ to save money
• Tiered storage (‘cold archive’) doesn’t help against exponentially growing data
• Always true that half our data is < 2 years old, so still active
• Harder to estimate costs of data movement (ingress, egress)
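The "half our data is < 2 years old" point follows directly from the doubling rate. A minimal sketch, assuming idealised exponential growth with a 2-year doubling time and no deletions (the function name is illustrative):

```python
def fraction_younger_than(age_years, doubling_time=2.0):
    """Fraction of the data stored today that was added within the last
    `age_years`. If volume is V(t) = V0 * 2**(t / T), then the data
    older than `a` years is V(t - a) = V(t) * 2**(-a / T)."""
    return 1 - 2 ** (-age_years / doubling_time)


if __name__ == "__main__":
    print(fraction_younger_than(2))  # share younger than 2 years
    print(fraction_younger_than(4))  # share younger than 4 years
```

With these assumptions, half the data is always under 2 years old and three quarters is under 4 years old, so an age-based cold tier only ever covers a shrinking minority of the total.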
Culture
• ‘Everyone’ agrees it’s a good idea to use cloud
• Few people have the time, knowledge or experience to do it well
• People concerned about:
• Spending ‘real money’ – keep getting asked for ‘cloud credits’
• Re-writing legacy pipelines – no, it’s not going to get easier if you wait
• Maintaining two pipeline versions, one for in-house, one for cloud
• ‘Cloud-bursting’ from on-prem is not a trivial way to use cloud
Knowledge, support, expertise
• In-house training program developed last year
• Users need new skills to be able to use cloud well
• Need systematic approach to spreading those skills through EBI
• 1-day program of the basics: Docker/GitLab/K8s with exercises
• Given to ~200 people now - https://bit.ly/resops-2019
• Support/expertise
• ‘Cloud Consultants’ team
• Consultation, PoC, embed in teams for larger projects
• Management of organization-level infrastructure: billing, IAM, security policies
Porting pipelines
• Lift-and-shift:
• A cluster in the cloud that looks like the on-premises system
• ✅ Relatively easy to do, lowers the barrier to entry for users
• ✅ It gets people used to the idea of cloud, lets them start exploring
• ✅ Stepping stone to better ways of doing things
• ❌ Hard to be cost-effective, doesn’t exploit cloud capabilities well
• ❌ Especially true for pipelines that assume large POSIX filesystems
• ❌ Hard to learn anything useful
• ❌ Hard to maintain momentum towards (cost-)efficient solutions
Scaling up
• Small deployments don’t teach you much
• Need scale, longevity to start seeing cost benefits
• Reserved instances, spot markets, tiered storage etc
• Understanding/controlling costs is a long-term process
• Iterate frequently with owners of deployments
• A cost-tuning process is more important than getting it right first time
• Too many variables to predict
• Establish a culture of review, oversight
Access, Accounting, Authorisation
• Who created/uses/used what resource?
• And which group/team they’re in, to aggregate by organization hierarchy
• Who needs what rights?
• E.g. group-level priorities within the organization; different groups have different needs
• Account/resource management when people start/leave work
• Not trivial in an academic environment, cf. a startup/devops shop
• => need an API for the structure of your organization
• Automate generating corresponding structures in cloud
• Update/verify changes in cloud with source-of-truth
• Hard to do hybrid/multi-cloud smoothly without it
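One way to picture the "update/verify changes in cloud with source-of-truth" step is as a diff between two group-membership maps. The sketch below is hypothetical throughout: `plan_sync`, the action names and the data shapes are assumptions for illustration, not any cloud provider's IAM API or EBI's actual tooling.

```python
def plan_sync(org, cloud):
    """Compare the organization's structure with what exists in cloud.

    `org` and `cloud` both map group name -> set of user ids, with `org`
    as the source of truth. Returns an ordered list of actions that would
    bring `cloud` in line with `org`.
    """
    actions = []
    for group in sorted(set(org) | set(cloud)):
        wanted = org.get(group, set())
        actual = cloud.get(group, set())
        if group not in cloud:
            actions.append(("create_group", group))
        for user in sorted(wanted - actual):
            actions.append(("add_member", group, user))
        for user in sorted(actual - wanted):
            actions.append(("remove_member", group, user))
        if group not in org:
            actions.append(("delete_group", group))
    return actions


if __name__ == "__main__":
    org = {"teamA": {"alice", "bob"}, "teamB": {"carol"}}
    cloud = {"teamA": {"alice", "dave"}, "old": {"erin"}}
    for action in plan_sync(org, cloud):
        print(action)
```

Producing a plan rather than applying changes directly keeps the verification step separate, so the same diff can be reviewed, logged, or replayed against more than one cloud.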
Summary
• Migrating to cloud: an opportunity for cultural change?
• Training is important, as is a centre of expertise to help keep momentum
• Interest and willingness don’t always translate into sustained effort
• Pick your use cases carefully
• Not every use case will teach you something useful
• Simply reproducing on-prem systems in the cloud is not efficient/cost-effective
• Cloud IAM/policies/accounting need to be seamlessly linked to your organization
• Ad-hoc legacy infrastructure in cloud every bit as bad as on-premises
• Build dataflows, not workflows
• Data management is the key to optimizing workflows, reducing costs, scaling up
“When you invent the ship, you also invent the
shipwreck; when you invent the plane you also invent
the plane crash; and when you invent electricity, you
invent electrocution...”
Paul Virilio
Thank you
Tony Wildish
Cloud bioinformatics lead architect
customerservices@jisc.ac.uk
jisc.ac.uk

Boost PC performance: How more available memory can improve productivity
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

Migrating EBI into the cloud - lessons learned, so far

  • 1. Migrating EBI into the cloud Lessons learned… so far Tony Wildish, Cloud bioinformatics lead architect
  • 2. Migrating EBI into the cloud Lessons learned… so far Tony Wildish Cloud Bioinformatics Lead Architect wildish@ebi.ac.uk
  • 3. EBI and the cloud • A few EBI teams have made forays into the cloud • End-of-grant money, PoC prototypes, small services • Nothing big, sustained… • Want to make bigger steps into the cloud, but how…? • This talk: some of the lessons learned so far…
  • 4. Why migrate EBI into the cloud? [Chart: data volume, axis ticks 1 GB, 1 TB, 1 PB; 2004 to 2019] • Data growing exponentially • Doubling every ~2 years • No sign of slowing down • Exotic hardware • High-memory machines, GPUs • Hard to utilize well in HPC environment • Expensive to buy
  • 5. EBI workloads • Web services • Upload file, wait for processing, browse results • Not particularly demanding => not particularly interesting (to me!) • Batch processing • N files in, crunch, M files out • Often periodic, e.g. triggered by upstream release of new version of source data • Can be very CPU/data intensive (e.g. metagenome assembly: weeks-long runtime) • Dozens of different pipelines, in different groups, in different languages • Varying historical legacy, code quality…
  • 6. Workflows, HPC vs. cloud-native • Typical HPC workflow: [Diagram labels: Step 1, Step 2, Step N, repeated three times] • Workflow-oriented • Inefficient use of resources • Exotic hardware (GPUs…) oversubscribed, underutilized • Hard to scale up • Cloud-native workflow dataflow: [Diagram labels: Step 1, Step 2 (four times), Step N, three queues, "Shared FS, Object store, database…"] • Dataflow-oriented • Efficiency per-step • Easy to scale up/out • Portable to multi/hybrid-cloud
  • 7. Cost optimization: Data vs. Compute • Compute is ‘easy’ • Many options for cost optimization • Reserved instances, spot markets, sustained use discounts • VM -> containers -> serverless functions • Many monitoring/advisory tools • Data is harder • Can’t ‘turn it off’ to save money • Tiered storage (‘cold archive’) doesn’t help against exponentially growing data • Always true that half our data is < 2 years old, so still active • Harder to estimate costs of data movement (ingress, egress)
  • 8. Culture • ‘Everyone’ agrees it’s a good idea to use cloud • Few people have the time, knowledge or experience to do it well • People concerned about: • Spending ‘real money’ – keep getting asked for ‘cloud credits’ • Re-writing legacy pipelines – no, it’s not going to get easier if you wait • Maintaining two pipeline versions, one for in-house, one for cloud • ‘cloud-bursting’ from on-prem not a trivial way to use cloud
  • 9. Knowledge, support, expertise • In-house training program developed last year • Users need new skills to be able to use cloud well • Need systematic approach to spreading those skills through EBI • 1-day program of the basics: Docker/Gitlab/K8s with exercises • Given to ~200 people now - https://bit.ly/resops-2019 • Support/expertise • ‘Cloud Consultants’ team • Consultation, PoC, embed in teams for larger projects • Management of organization-level infrastructure: billing, IAM, security policies
  • 10. Porting pipelines • Lift-and-shift: • A cluster in the cloud that looks like the on-premises system • ✅ Relatively easy to do, lowers entrypoint for users • ✅ It gets people used to the idea of cloud, lets them start exploring • ✅ Stepping stone to better ways of doing things • ❌ Hard to be cost-effective, doesn’t exploit cloud capabilities well • ❌ Especially true for pipelines that assume large POSIX filesystems • ❌ Hard to learn anything useful • ❌ Hard to maintain momentum towards (cost-)efficient solutions
  • 11. Scaling up • Small deployments don’t teach you much • Need scale, longevity to start seeing cost benefits • Reserved instances, spot markets, tiered storage etc • Understanding/controlling costs is a long-term process • Iterate frequently with owners of deployments • A cost-tuning process is more important than getting it right first time • Too many variables to predict • Establish a culture of review, oversight
  • 12. Access, Accounting, Authorisation • Who created/uses/used what resource? • And which group/team they’re in, to aggregate by organization hierarchy • Who needs what rights? • E.g. group-level priorities within organization, different groups have different needs • Account/resource management when people start/leave work • Not trivial in an academic environment c.f. startup/devops-shop • => need an API for the structure of your organization • Automate generating corresponding structures in cloud • Update/verify changes in cloud with source-of-truth • Hard to do hybrid/multi-cloud smoothly without it
  • 13. Summary • Migrating to cloud: an opportunity for cultural change? • Training is important, as is a centre of expertise to help keep momentum • Interest and willingness don’t always translate into sustained effort • Pick your use cases carefully • Not every use case will teach you something useful • Simply reproducing on-prem systems in the cloud is not efficient/cost-effective • Cloud IAM/policies/accounting need to be seamlessly linked to your organization • Ad-hoc legacy infrastructure in cloud every bit as bad as on-premises • Build dataflows, not workflows • Data management is the key to optimizing workflows, reducing costs, scaling up
  • 14. “When you invent the ship, you also invent the shipwreck; when you invent the plane you also invent the plane crash; and when you invent electricity, you invent electrocution...” Paul Virilio
  • 15. Thank you Tony Wildish Cloud bioinformatics lead architect customerservices@jisc.ac.uk jisc.ac.uk
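
The data-growth claims on slides 4 and 7 can be checked with a little arithmetic. The sketch below assumes an idealised model that is not in the slides: total volume doubles exactly every two years and nothing is ever deleted. The function names and the cold-tier price ratio of 0.25 are hypothetical, chosen only for illustration.

```python
def fraction_younger_than(age_years, doubling_years=2.0):
    """Fraction of the current archive that arrived within the last `age_years`.

    Assumes pure exponential growth (volume doubles every `doubling_years`)
    and no deletions, so the volume `age_years` ago was
    2 ** (-age_years / doubling_years) of today's volume.
    """
    return 1.0 - 2.0 ** (-age_years / doubling_years)


def cold_tier_saving(age_years, cold_price_ratio, doubling_years=2.0):
    """Fractional cut in the storage bill from moving all data older than
    `age_years` to a tier priced at `cold_price_ratio` of the hot tier.

    `cold_price_ratio` is a hypothetical input, not a real provider price.
    """
    old_fraction = 1.0 - fraction_younger_than(age_years, doubling_years)
    return old_fraction * (1.0 - cold_price_ratio)


if __name__ == "__main__":
    # Share of data younger than one doubling period (2 years).
    print(fraction_younger_than(2.0))
    # Bill reduction if everything older than 2 years costs a quarter as much.
    print(cold_tier_saving(2.0, 0.25))
```

Under these assumptions the share of data younger than one doubling period is one half in every year, which matches the "half our data is < 2 years old" bullet. The saving from a cold tier is a fixed fraction of the bill, while the bill itself keeps doubling, so tiering delays the cost curve rather than flattening it.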