
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

While cloud computing offers virtually unlimited capacity, harnessing that capacity in an efficient, cost-effective fashion can be cumbersome and difficult at the workload level. At the organizational level, it can quickly become chaos.

You must make choices around cloud deployment, and those choices can have a long-lasting impact on your organization. It is important to understand your options and avoid incomplete, complicated, or locked-in scenarios. Data management and placement challenges make the ability to automate workflows and processes across multiple clouds a requirement.

In this webinar, you will:

• Learn how to leverage cloud services as part of an overall computation approach
• Understand data management in a cloud-based world
• Hear what options you have to orchestrate HPC in the cloud
• Learn how cloud orchestration works to automate and align computing with specific goals and objectives
• See an example of an orchestrated HPC workload using on-premises data

From computational research to financial back-testing, and from research simulations to IoT processing frameworks, the decisions you make now will impact not only future manageability but also your sanity.


  1. Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind (webinar, April 13, 2016, 11:00 AM ET)
  2. Housekeeping • Audio help • Attachments • Questions • Rating
  3. Today’s Speakers: Rick Friedman, Vice President, Solution Development, Cycle Computing; Scott Jeschonek, Director of Product Management, Cloud, Avere Systems
  4. Agenda • Discuss the current state of HPC • Clouds and their impact on your HPC world • Reasons why you aren’t 100% cloud-based already • The Hybrid Cloud and HPC • Possible implementations • Delivering file systems using Avere Systems • Orchestration using Cycle Computing
  5. HPC Today (and Yesterday, and Tomorrow)
  6. What Drives Today’s Needs • Data – who, what, when, how much, where? • Datacenter limitations – can I defy physics? • User expectations – can we even do that? • Technology shifts – what is the “best practice”?
  7. Big Compute Workloads: How Are They Handled? (Chart: compute demand vs. cluster size; demand above the fixed cluster line is missed opportunity, capacity above actual demand is wasted resources.) • Internal infrastructure has huge value and some limitations • Access, not capacity, is the barrier to continued growth • Perception limits the scale of problem solving • Public cloud = cost-effective, readily available resources for users with problems and deadlines • Financial services, manufacturing, and life sciences are leading the way
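The mismatch the chart describes can be made concrete with a small sketch: given an hourly demand trace and a fixed cluster size, count the core-hours of missed opportunity (demand above the ceiling) and wasted resources (idle capacity below it). The demand numbers below are hypothetical illustration values, not from the webinar.

```python
# Sketch: quantify the cost of a fixed-size cluster against variable demand.
# The demand trace and cluster size are hypothetical illustration values.

def demand_gaps(demand, cluster_size):
    """Return (missed_opportunity, wasted_resources) in core-hours.

    missed_opportunity: demand above the cluster ceiling (jobs that queue).
    wasted_resources: idle capacity when demand is below the ceiling.
    """
    missed = sum(max(d - cluster_size, 0) for d in demand)
    wasted = sum(max(cluster_size - d, 0) for d in demand)
    return missed, wasted

hourly_demand = [200, 800, 1500, 3000, 900, 100]  # cores needed each hour
missed, wasted = demand_gaps(hourly_demand, cluster_size=1000)
print(missed, wasted)  # -> 2500 2000
```

However the fixed size is chosen, one of the two gaps is almost always nonzero, which is the slide's argument for elastic capacity.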
  8. Basic HPC Environment Requirements (Diagram: a resource manager / scheduler dispatches jobs from the workload to a large pool of compute resources, the “grid,” backed by NAS storage.)
  9. Advantages of Clouds • Significantly reduce infrastructure management costs, in both money and time • Maintain operational flexibility during scale-out jobs; let the provider deal with scale challenges
  10. Why the Cloud for Big Compute? • Scientist / engineer perspective – Zero queue times, capacity in minutes – Scale compute to problem size, not vice versa – Try and support new computational approaches and software quickly • System architect perspective – Dynamically adjust workloads to the “lowest cost/impact” provider – Focus on computational excellence, not hardware management – Support a wide range of user types efficiently • Organizational perspective – Match spending to actual consumption – Increase responsiveness to business dynamics – Grow the user base without hardware limitations
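The organizational point about matching spending to consumption can be illustrated with back-of-the-envelope arithmetic: at low utilization, owned hardware's effective cost per used core-hour can exceed an on-demand rate. Every figure below (annual cost, core count, utilization, cloud rate) is a made-up assumption for illustration.

```python
# Sketch: effective cost per *used* core-hour of owned hardware vs. an
# assumed on-demand cloud rate. All numbers are hypothetical assumptions.

HOURS_PER_YEAR = 8760

def effective_cost_per_core_hour(annual_cost, cores, utilization):
    """Cost per core-hour actually consumed, at a given utilization (0-1)."""
    used_core_hours = cores * HOURS_PER_YEAR * utilization
    return annual_cost / used_core_hours

owned = effective_cost_per_core_hour(annual_cost=1_000_000, cores=2000,
                                     utilization=0.30)
cloud_rate = 0.10  # assumed on-demand $/core-hour
print(f"owned: ${owned:.3f}/core-hour, cloud: ${cloud_rate:.2f}/core-hour")
```

With these assumed numbers the owned cluster comes out around $0.19 per used core-hour, above the assumed cloud rate; at high utilization the comparison flips, which is why the deck argues for a hybrid rather than all-cloud.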
  11. Clouds Have Awesome New Capabilities • Big Data – Analytics tools – Massively scalable NoSQL – Data warehousing • Machine Learning – Voice/vision/speech – Early days
  12. So… why isn’t everything in the cloud? • Current infrastructure investment (capex) • Cloud costs not yet completely in line • Software infrastructure in place – costs to refactor, dependencies to consider • Data environment in one or more data centers • Orchestration and management of cloud clusters is hard • Network bandwidth / latency concerns • Business continuity
  13. Other Reasons You’re Not 100% in the Cloud • Corporate budgets • Corporate policies • Corporate politics • Education / awareness • Government regulations • Interest groups • Vendor relationships
  14. Near Future: Hybrid Cloud (Diagram: analyst teams in Tokyo, London, NYC, and Hong Kong offices sharing NAS in primary and secondary data centers, plus two cloud providers.) • Adoption of one or more cloud providers • More than one provider as a hedge on price and SLA • Mix of on-prem and cloud resources • Regulatory, proprietary, and/or security characteristics will likely keep data in the DC
  15. HPC in the Cloud (Diagram: analysts submit jobs to an on-premises scheduler backed by NAS storage; a second scheduler in the cloud compute environment is driven through the cloud compute API, with data moving between the data center and the cloud.)
  16. HPC in the Cloud, “Grids on Demand” (Diagram: the same on-premises data center now drives multiple independent cloud schedulers, Scheduler1 through Scheduler4, each fronting its own on-demand grid.)
  17. Challenges with HPC in the Cloud • How do you get the data close to your compute nodes? • How do you orchestrate on-demand clusters/grids of compute nodes? • How does this all come together?
  18. Data Access Layer (Diagram: a data access layer sits between the on-premises NAS and the cloud schedulers, Scheduler1 through Scheduler4.) • File system • Caching layer • Only load necessary blocks of files • Opaque to compute nodes
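The caching behavior described on this slide, loading only the necessary blocks of a file while staying opaque to the compute nodes, is essentially a read-through block cache. A minimal sketch follows, with an assumed block size and a toy backing-store interface standing in for the on-prem NAS; this is an illustration, not Avere's implementation.

```python
# Sketch of a read-through block cache: fetch fixed-size blocks from a slow
# backing store (standing in for the on-prem NAS) only on first access, then
# serve repeat reads from cache. Block size and the backing-store interface
# are hypothetical.

BLOCK_SIZE = 4096

class BlockCache:
    def __init__(self, backing_store):
        self.backing = backing_store  # callable: (path, block_index) -> bytes
        self.cache = {}               # (path, block_index) -> cached block
        self.backend_reads = 0        # trips made to the backing store

    def read(self, path, offset, length):
        """Read a byte range, pulling only the blocks the request touches."""
        data = bytearray()
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for idx in range(first, last + 1):
            key = (path, idx)
            if key not in self.cache:
                self.cache[key] = self.backing(path, idx)
                self.backend_reads += 1
            data += self.cache[key]
        start = offset - first * BLOCK_SIZE
        return bytes(data[start:start + length])
```

A second read of the same range touches no blocks on the backing store, which is the "most of the data will be in RAM, close to the nodes" effect the next slide claims.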
  19. Advantages of a Data Access / Cache Layer • Keep your data on-prem! – Data is in the cloud only while the compute nodes work the jobs – Reduce security objections and simplify the move to cloud • Increase cloud compute performance – With file system caching, most of the data will be in RAM, close to the nodes – Avoids ingest latencies and slashes transit latency after the first read • Scale out – Use a solution that supports tens of thousands of concurrent file system connections from compute cores
  20. Typical File Access in a Hadoop Cluster (Chart: where a typical file is accessed by multiple clients.) Caching files will work for certain types of jobs
  21. Hybrid Cloud Using Avere FXT and vFXT Edge Filers (Diagram: physical FXT edge filers accelerate on-prem NAS and private object storage; virtual FXT filers front cloud NAS and cloud object buckets for a virtual compute farm.) The “edge” = locating your data close to your compute without truly moving it from your NAS environment
  22. Avere Building Blocks “Avere is uniquely positioned to offer scale across tens of thousands of cloud compute cores while leaving the data where it originates, on premises, with its global file system and caching capabilities.” – Unnamed CTO (Diagram: physical FXT on-premises and virtual FXT in the cloud provide file acceleration between NAS/object storage and cloud compute.)
  23. Orchestration and Management Layer (Diagram: an orchestration and management layer sits between the on-premises data center and the cloud compute API, managing Scheduler1 through Scheduler4.)
  24. Complete Multi-Cloud Workflow Control • Optimization – Benchmark instances – Workflow UI – Human workflow • Provisioning – Workload placement – Optimal scale – Cost optimization – Data scheduling • Cluster Configuration – Multi-cloud, without changes – Pre-set or user-defined “types” – Abstraction for all cluster data and attributes (roles, OS, etc.) • Monitoring – Auto-scaling – Usage tracking – Error handling – Reporting (Diagram: an internal declarative cluster definition file covers packages, installers, containers, and data; the admin scope spans Configure, Run on Cloud, and Optimize for the user.)
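The declarative cluster definition idea, describing the cluster once as data and letting the orchestrator realize it on any provider, can be sketched as a plain dictionary plus a validator. The schema and field names below are invented for illustration; they are not Cycle Computing's actual template format.

```python
# Sketch: a declarative cluster definition as plain data, with a validator
# run before handing it to a provisioner. The schema and every field name
# here are invented for illustration, not Cycle's actual format.

REQUIRED_FIELDS = {"name", "provider", "node_type", "min_nodes", "max_nodes"}

def validate_cluster(defn):
    """Check a cluster definition for the fields a provisioner would need."""
    missing = REQUIRED_FIELDS - defn.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if defn["min_nodes"] > defn["max_nodes"]:
        raise ValueError("min_nodes must not exceed max_nodes")
    return True

cluster = {
    "name": "montecarlo-grid",
    "provider": "cloud-provider-1",  # swap providers without changing jobs
    "node_type": "compute-16core",
    "min_nodes": 0,                  # scale to zero when idle
    "max_nodes": 500,
    "packages": ["scheduler-agent", "app-runtime"],
}
assert validate_cluster(cluster)
```

Because the definition is data rather than provider-specific scripts, the same file can drive the "multi-cloud, without changes" bullet above: only the `provider` value changes.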
  25. How Cycle Makes Cloud Productive (Diagram: users reach an internal cluster and multiple clouds, without changes, through a web UI, API, or command line; job and data workflow, automated job placement, cost optimization, auto-scaling, benchmarking, compliance, and reporting tools sit in between.) • Scientist / engineer productivity – Simple workflows – Zero queue time – Auto-scaling • SysAdmin productivity – Instant access to additional resources – Workflows linking internal and multiple clouds – Simple, reliable tools to enable apps with special requirements • Organizational productivity – Secure, consistent cloud access – Usage tracking – Ability to leverage multiple providers
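The auto-scaling mentioned on both sides of this slide reduces to a periodic decision: compare queued work to running capacity and grow or shrink within the cluster's bounds. A minimal sketch follows; the function name, the jobs-per-node model, and all numbers are assumptions, not Cycle's algorithm.

```python
# Sketch: one auto-scaling decision step. Estimate nodes needed from queue
# depth, clamp to the cluster's min/max, and return the delta to apply.
# The jobs-per-node model and all numbers are illustrative assumptions.

def scale_decision(queued_jobs, running_nodes, jobs_per_node,
                   min_nodes, max_nodes):
    """Return the node-count delta: positive = grow, negative = shrink."""
    needed = -(-queued_jobs // jobs_per_node)  # ceiling division
    target = max(min_nodes, min(needed, max_nodes))
    return target - running_nodes

# 900 queued jobs at 8 jobs/node wants 113 nodes; the cap holds it at 100.
print(scale_decision(queued_jobs=900, running_nodes=10, jobs_per_node=8,
                     min_nodes=0, max_nodes=100))  # -> 90
```

An orchestrator would run a step like this on a timer, which also yields the scale-to-zero behavior ("zero queue time" for users, no idle spend for the organization) when the queue drains.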
  26. Big Data Without Disrupting Production • Challenge – Estimate the carbon stored in Saharan biomass – Rapidly establish a baseline for later research using large amounts of high-resolution remote sensing data – Existing internal compute resources fully committed – Limited window to complete processing • Cycle solution – Full workflow, including data management between internal data capture and cloud processing – Leverage spot pricing to minimize cost while maximizing computation • Results – Linearly scalable and predictable, enabling a plan for next steps – Science being done that could not be done otherwise – One month from start to initial runs
  27. Overall Architecture – Data In-House (Diagram: an on-premises scheduler and NAS storage, fronted by an Avere FXT edge filer, feed a cloud scheduler and workload through the cloud API, with another Avere FXT caching against cloud storage.)
  28. What We Covered… • The current state of HPC • Clouds and their impact on your HPC world • Reasons why you aren’t 100% cloud-based already • The Hybrid Cloud and HPC • Possible implementations • Delivering file systems using Avere Systems • Orchestration using Cycle Computing
  29. Thank You! Cycle Computing contact info; more about Avere Systems: 1-888.88.AVERE (888.292.5320)