Usman Shakeel from Amazon Web Services explains how to use AWS Spot Instances to implement low-cost video rendering applications and workflows.
This presentation was delivered during the AWS Toronto Media and Entertainment Symposium.
Cost Effective Rendering in the Cloud with Spot Instances
1. Cost Effective Rendering at Scale
with EC2 Spot
Usman Shakeel | Principal Solutions Architect M&E
Amazon Web Services
2. Agenda
Cost Effective Rendering at Scale
with EC2 Spot
VFX/Animation Rendering
Computationally intensive Batch Process
Non-deterministic Compute usage patterns
Customer Sizes/Types
Hybrid/All-in Cloud Workflows Architectures
AWS’s Spare Capacity at Scale
Spot Features that make it super easy
Terminations – What is it worth?
Real world examples
Under 2 pennies per core hour
What is the definition of “large” in scale?
Is it really cost effective?
4. Challenges in the VFX/Animation Industry
• Increasingly Shrinking Budgets
• Cap-ex / Op-ex conundrum and flexible hardware needs
• Increasingly Global Workflows
• Increased Demand for Computation
– High Resolutions (4K, 8K and beyond)
– 3D Stitching
– VR, AR Stitching
• Project based Infrastructure requirements
– Budget, Quality, Render Time
• A broad and complex Software toolset per project
• Security of Crown Jewels
7. The challenge of making a film
On-premise capacity
Rendering in the Cloud
Cloud provides you the capability to
scale fast and get the outputs faster
Initial project on-boarding
artwork
10. Rendering Workflow Components (move to the cloud)
Storage
Render Farm
Pipeline and License Manager
Graphics Artist Workstations
• Content has gravity
• Network Bandwidth
• Hybrid/All-in Cloud
• IO Performance
• Ability to burst at very short notice
• Cost?
• Performance
• Security
• License mobility/Elasticity
• Dependency Management (hybrid scenario)
• Interactivity
• High-Performance Storage
• Hardware Support
11. Rendering in the Cloud - Hydrating the Cloud Renderfarm
• S3 as the content repository for your content/data
• On AWS Marketplace/SaaS (Aspera, Signiant, FileCatalyst, Expedat)
• S3 Multi-part Upload
• AWS Import/Export Snowball
• S3 Transfer Acceleration NEW!
• Direct to Shared File Systems
• EFS throughput scales linearly with the amount of storage
• Lustre can hydrate from an S3 bucket
• Avere can be fronted to S3 or an on-premise NAS
• AWS Snowball NEW !
• AWS Direct Connect
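As an illustration of the S3 Multi-part Upload bullet, the sketch below picks a part size for a large asset. It is not from the talk: it assumes the S3 limits of at most 10,000 parts per upload and a 5 MiB minimum part size, and the function names and the 200 GiB example file are hypothetical.

```shell
#!/bin/bash
# Sketch: choose a part size for an S3 multipart upload.
# Assumed S3 limits: at most 10,000 parts, each part at least 5 MiB (except the last).
MIN_PART_BYTES=$((5 * 1024 * 1024))
MAX_PARTS=10000
MIB=$((1024 * 1024))

# part_size_for FILE_BYTES -> smallest whole-MiB part size that stays within MAX_PARTS
part_size_for() {
  # Ceiling division: bytes per part needed to fit the file into MAX_PARTS parts.
  needed=$(( ($1 + MAX_PARTS - 1) / MAX_PARTS ))
  # Round up to a whole MiB.
  size=$(( (needed + MIB - 1) / MIB * MIB ))
  if [ "$size" -lt "$MIN_PART_BYTES" ]; then
    size=$MIN_PART_BYTES
  fi
  echo "$size"
}

# part_count_for FILE_BYTES PART_BYTES -> number of parts (ceiling division)
part_count_for() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Example: a hypothetical 200 GiB texture archive.
file_bytes=$((200 * 1024 * 1024 * 1024))
part_bytes=$(part_size_for "$file_bytes")
echo "part size: ${part_bytes} bytes, parts: $(part_count_for "$file_bytes" "$part_bytes")"
```

Larger parts mean fewer requests per file, while smaller parts retry faster after a network error, so the minimum that fits is only a starting point.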
12. Rendering in the Cloud - Shared FS Everywhere (some ideas)
Shared Storage On-prem Storage
AWS Direct Connect
Storage Cache
Amazon S3
Lustre on EC2
Avere on EC2
EFS
AWS Direct Connect
Hydrate workers
EC2 Spot
Shared Storage
FXT on-prem
13. Rendering in the Cloud - Shared FS (Content/Data Share) Everywhere
Elastic File System (Amazon EFS)
• Designed to support petabyte-scale file systems
• Throughput scales linearly with the amount of storage
• Same latency spec across each AZ
• Thousands of concurrent NFS connections
• Works great for large I/O sizes
• Pay only for what you use, not what you provision
• Managed with multi-copy durability
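A minimal sketch of how a render node might attach such a file system over NFS. It only prints the mount command rather than running it. The file system ID, region and mount point are hypothetical, and the mount options (including the 1 MiB read and write sizes, which suit the large I/O sizes mentioned above) are commonly recommended values for EFS, not settings taken from the talk.

```shell
#!/bin/bash
# efs_mount_command FS_ID REGION MOUNT_POINT -> prints an NFSv4.1 mount command for EFS
efs_mount_command() {
  fs_id=$1
  region=$2
  mount_point=$3
  # 1 MiB rsize/wsize favours large sequential reads and writes.
  opts="nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2"
  echo "sudo mount -t nfs4 -o ${opts} ${fs_id}.efs.${region}.amazonaws.com:/ ${mount_point}"
}

# Hypothetical file system ID, region and mount point.
efs_mount_command fs-12345678 us-east-1 /mnt/render
```

Every worker in the farm can run the same command at boot, which is what makes a shared file system convenient for disposable Spot nodes.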
14. Rendering in the Cloud - Licensing at Cloud Scale
• BYOL
• SaaS
• AWS Marketplace
• Elastic Licensing models
Thinkbox Deadline 8 Usage Based Licensing
• Render nodes pull metered licenses from a Cloud-based license server
• Usage is tracked per minute
• Bulk minutes will be available via Thinkbox’s online store
• Hosts 3rd party licensing (Nuke, V-Ray, etc.)
16. Rendering in the Cloud - Move the Graphic Artist to the Cloud …
Rendering is going Global
• NVIDIA GPU based EC2 instances
• NICE DCV
• Teradici PCoIP
• Windows and Linux (VNC+VirtualGL)
3D Modeler
Modeling Dumb Client
Remote Application
running on a G2 instance
G2
17. Rendering in the Cloud - Managing your “disposable” infrastructure
Launch a CloudFormation stack with all the infrastructure resources for a specific project
Autoscale the stack as appropriate
AMI
CloudFormation Template
CloudFormation
Terminate
Template
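The launch-and-terminate lifecycle on this slide could be scripted roughly as follows. This is a sketch that only prints AWS CLI commands. The `render-` stack prefix, the template file name and the `WorkerCount` parameter are hypothetical and would have to match a real template.

```shell
#!/bin/bash
# stack_name_for PROJECT -> one stack per project (hypothetical naming scheme)
stack_name_for() {
  echo "render-$1"
}

# create_stack_command PROJECT TEMPLATE_FILE WORKERS -> prints the create-stack call
create_stack_command() {
  echo "aws cloudformation create-stack --stack-name $(stack_name_for "$1") --template-body file://$2 --parameters ParameterKey=WorkerCount,ParameterValue=$3"
}

# delete_stack_command PROJECT -> prints the call that tears the whole stack down
delete_stack_command() {
  echo "aws cloudformation delete-stack --stack-name $(stack_name_for "$1")"
}

# Hypothetical project: launch for the job, then dispose of everything afterwards.
create_stack_command teaser-shot-042 renderfarm.yaml 200
delete_stack_command teaser-shot-042
```

Keeping one stack per project means a single delete call removes every resource the project created.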
18. Rendering in the Cloud – Securing the Crown Jewels
• AWS alignment with the latest MPAA cloud based application guidelines for content security – August 2015
• VPC private endpoint for S3 – enables a true private workflow capability
• Encryption & key management capabilities
• Glacier Vault for high-value media/originals
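One way to enforce the private workflow described in the VPC endpoint bullet is a bucket policy that denies any request not arriving through a given endpoint. The sketch below prints such a policy. The bucket name and endpoint ID are hypothetical, and the policy shape is an assumption based on the `aws:SourceVpce` condition key, not something shown in the talk.

```shell
#!/bin/bash
# vpce_only_policy BUCKET VPCE_ID -> prints a bucket policy (JSON) that denies
# all S3 actions unless the request comes through the given VPC endpoint.
vpce_only_policy() {
  bucket=$1
  vpce_id=$2
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnlessFromVpcEndpoint",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::${bucket}",
        "arn:aws:s3:::${bucket}/*"
      ],
      "Condition": {
        "StringNotEquals": { "aws:SourceVpce": "${vpce_id}" }
      }
    }
  ]
}
EOF
}

# Hypothetical bucket and endpoint ID.
vpce_only_policy studio-render-assets vpce-1a2b3c4d
```

A policy like this also blocks console access from outside the VPC, so it is worth testing on a non-production bucket first.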
19. Rendering in the Cloud - A Sample Architecture
(All in Cloud Pipeline)
Shared Storage
Renderfarm
On-Prem Storage
Pipeline and License Manager
3D Modeler
Remote
App Visualization
AWS Direct Connect
Modeling Dumb Client
Storage Cache
Amazon S3
Avere on EC2
Scalable Renderfarm on EC2
AppStream or Teradici running on a G2 instance
Pipeline Manager running on EC2
G2
EC2 SPOT
EFS
Hydrate workers
EC2 Spot
20. Rendering in the Cloud - A Sample Architecture (A Hybrid Pipeline)
Render Farm
Shared Storage
Renderfarm
On-Prem Storage
AWS Direct Connect
Storage Cache
Amazon S3
Avere on EC2
Scalable Renderfarm on EC2
EFS
Hydrate workers
EC2 Spot
On-premise Renderfarm
Cloud renderfarm as an extension of on-prem renderfarm
FXT on-prem
Pipeline and License Manager (also manages cloud renderfarm)
22. AWS EC2 Consumption Models
On-Demand
Pay for compute capacity by the hour with no long-term commitments
For spiky workloads, or to define needs
Reserved
Make a low, one-time payment and receive a significant discount on the hourly charge
For committed utilization
Spot
Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand
For time-insensitive or transient workloads
23. Spare capacity at scale
• AWS has more than a million active customers in 190 countries.
• Amazon EC2 instance usage has increased 93% YoY, comparing Q4 2014 and Q4 2013, not including Amazon use.
24. With Spot the rules are simple
Markets where the price of compute changes based on supply and demand
You’ll never pay more than your bid. When the market exceeds your bid you get 2 minutes to wrap up your work
26. Show me the markets!
C3     1b      1c      1a      On Demand
8XL    $0.27   $0.29   $0.50   $1.76
4XL    $0.30   $0.16   $0.21   $0.88
2XL    $0.07   $0.08   $0.08   $0.44
XL     $0.05   $0.04   $0.04   $0.22
L      $0.01   $0.04   $0.01   $0.11
Each instance family
Each instance size
Each Availability Zone
In every region
Is a separate Spot Market
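To put the C3 figures on this slide in perspective, the sketch below computes the saving against On-Demand and the price per vCPU. Pairing $0.27 with the $1.76 On-Demand price for the 8XL size, and $0.01 with $0.11 for the L size, is an assumption about how the slide's table reads. The vCPU counts (32 for c3.8xlarge, 2 for c3.large) are not on the slide either.

```shell
#!/bin/bash
# savings_percent SPOT_PRICE ON_DEMAND_PRICE -> whole-number percent saved with Spot
savings_percent() {
  awk -v spot="$1" -v od="$2" 'BEGIN { printf "%.0f\n", 100 * (1 - spot / od) }'
}

# per_vcpu_price HOURLY_PRICE VCPUS -> hourly price per vCPU, four decimals
per_vcpu_price() {
  awk -v price="$1" -v vcpus="$2" 'BEGIN { printf "%.4f\n", price / vcpus }'
}

# Figures read from the slide (pairing assumed); vCPU counts are not on the slide.
echo "c3.8xlarge saving vs On-Demand: $(savings_percent 0.27 1.76)%"
echo "c3.8xlarge price per vCPU-hour: \$$(per_vcpu_price 0.27 32)"
echo "c3.large saving vs On-Demand:   $(savings_percent 0.01 0.11)%"
```

Comparing markets on a per-vCPU basis rather than per instance is what makes prices across sizes and Availability Zones directly comparable.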
29. Amazon EC2 Spot – in the wild
1) We make this easy using the Spot bid advisor
2) With deliberate pool selection and bidding, you will keep your Spot instance as long as you need to.
3) And with new features like Spot fleet's diversified allocation, we do the heavy lifting for you...
31. Spot fleet helps you
Launch Thousands of Spot Instances
with one RequestSpotFleet call.
Get Best Price
Find the lowest priced horsepower that works for you.
or
Get Diversified Resources
Diversify your fleet. Grow your availability.
And
Apply Custom Weighting
Create your own capacity unit based on your application needs
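A sketch of what a diversified, custom-weighted fleet request might look like, using one vCPU as the capacity unit. The role ARN, AMI ID, subnet IDs, the choice of C3 sizes and the 0.02 price per unit are all hypothetical. The field names follow the Spot Fleet request configuration as I understand it, so check them against the current API reference before use.

```shell
#!/bin/bash
# instances_needed TARGET_UNITS WEIGHT -> instances required to reach the target (ceiling)
instances_needed() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# spot_fleet_config TARGET_VCPUS -> prints a Spot Fleet request config (all IDs hypothetical).
# WeightedCapacity is set to each type's vCPU count, so capacity is measured in vCPUs
# and SpotPrice is the maximum price per vCPU per hour.
spot_fleet_config() {
  cat <<EOF
{
  "IamFleetRole": "arn:aws:iam::123456789012:role/render-fleet-role",
  "AllocationStrategy": "diversified",
  "TargetCapacity": $1,
  "SpotPrice": "0.02",
  "LaunchSpecifications": [
    { "ImageId": "ami-12345678", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SubnetId": "subnet-aaaa1111" },
    { "ImageId": "ami-12345678", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SubnetId": "subnet-aaaa1111" },
    { "ImageId": "ami-12345678", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SubnetId": "subnet-bbbb2222" },
    { "ImageId": "ami-12345678", "InstanceType": "c3.2xlarge", "WeightedCapacity": 8, "SubnetId": "subnet-bbbb2222" }
  ]
}
EOF
}

spot_fleet_config 40000
echo "c3.8xlarge instances for 40000 vCPUs: $(instances_needed 40000 32)"
# The saved config would then be submitted with:
#   aws ec2 request-spot-fleet --spot-fleet-request-config file://fleet.json
```

Spreading launch specifications over several instance types and subnets gives the fleet more Spot markets to draw from, which is the point of the diversified strategy.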
33. EC2 Spot Console
An easy to use interface that lets you launch spare EC2 instances in seconds
Helps you select and bid on the EC2 instances that meet your application’s requirements
Simple to use dashboard lets you modify and manage your application’s compute capacity
35. EC2 Spot block
Using a single additional parameter
Run continuously for up to 6 hours
Save up to 50% off On-Demand pricing
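The single additional parameter is assumed here to be the block duration on a Spot request, given in minutes in whole-hour steps. The sketch below validates the 1 to 6 hour range stated on the slide and prints the request command. The launch specification file name is hypothetical.

```shell
#!/bin/bash
# spot_block_command HOURS COUNT -> prints a Spot block request, or fails if HOURS is out of range
spot_block_command() {
  hours=$1
  count=$2
  if [ "$hours" -lt 1 ] || [ "$hours" -gt 6 ]; then
    echo "block duration must be between 1 and 6 hours" >&2
    return 1
  fi
  # The duration is passed in minutes, as a multiple of 60.
  echo "aws ec2 request-spot-instances --block-duration-minutes $((hours * 60)) --instance-count ${count} --launch-specification file://render-node.json"
}

# Hypothetical request: ten render nodes for a six-hour block.
spot_block_command 6 10
```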
36. Capitalizing on two minute warning
• When the Spot price exceeds your bid price, the instance will receive a two-minute warning
• Check for the two-minute Spot instance termination notification every 5 seconds, using a script invoked at instance launch
37. Sample script – two minutes left!
1) Check for 2 minute warning
2) If YES, run shutdown scripts
3) OTHERWISE, do nothing
4) Then sleep for 5 seconds
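#!/bin/bash
# Poll the instance metadata service for a Spot termination notice.
URL=http://169.254.169.254/latest/meta-data/spot/termination-time
while true
do
  # Once the instance is marked for termination, the endpoint returns a
  # timestamp containing "T" and "Z"; before that, the pattern does not match.
  if curl -s "$URL" | grep -q '.*T.*Z'; then
    /env/bin/runterminationscripts.sh
    # Stop polling so the termination scripts run only once.
    break
  fi
  # Spot instance not yet marked for termination.
  sleep 5
done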
39. A Customer Example – Large Scale, Cheap, High Performant
A large scale example for animation rendering on AWS:
• Hybrid Environment using Avere
• All in Cloud Rendering using EFS
• Automated environment leveraging Spot Fleet
• Launched 40K cores in 20 min at < $0.02/core/hr for the particular rendering workload
Findings:
• EFS performance for rendering
• Hybrid Rendering Scenarios
http://www.slideshare.net/AmazonWebServices/cmp404-cloud-rendering-at-walt-disney-animation-studios
42. EFS Performance in a real rendering scenario - Average Read Latency
[Chart: average read latency, Time (µs) from 0 to 700, against the number of render processes (100 to 4000), for Mid-Tier A, Mid-Tier B, Mid-Tier C, Archive and EFS]
43. Customer Example: Rendering in the Cloud vs. On-Premise
[Chart: render time (s) from 0 to 30,000 by frame number (1 to 90), for EC2/EFS and On Prem; lower is better]
44. Another customer example - Large Scale, Cheap, High Performant
The $9 Billion Experiment
50,000 physical cores to meet the demand of 1,500 scientific researchers
Over 5 days, less than 1% of instances were terminated, leaving them with a significant margin of safety.
Instead of building a 50,000-core data center, they successfully used AWS Spot for 5 days and paid just $45,000
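As a rough check on these figures, the sketch below derives the effective price per core-hour. It assumes all 50,000 cores ran continuously for the full 5 days (120 hours), which the slide does not state, so the result is a lower bound on the real rate.

```shell
#!/bin/bash
# effective_core_hour_price TOTAL_COST CORES HOURS -> cost per core-hour, four decimals
effective_core_hour_price() {
  awk -v total="$1" -v cores="$2" -v hours="$3" \
    'BEGIN { printf "%.4f\n", total / (cores * hours) }'
}

# $45,000 over 50,000 cores for 5 days; continuous use is an assumption.
hours=$((5 * 24))
echo "effective price per core-hour: \$$(effective_core_hour_price 45000 50000 "$hours")"
```

Under that assumption the rate comes out below one cent per core-hour, consistent with the "under 2 pennies per core hour" claim in the agenda.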
45. Parting thoughts
VFX/Animation rendering workloads can be streamlined on the cloud
• Avoid Data/Content movement
• Distribute a single job across multiple nodes
• Manage state often
• Segregate sub-workflows (within a single pipeline) between in-cloud and on-premises based on dependencies
Rendering in the Cloud is possible and can be more performant than a traditional hardware setup
• All-in Cloud vs. Hybrid
• Technical feature set has come a long way from even a year ago
AWS EC2 has VERY large capacity at a CHEAP price
• EC2 Spot (Fleet, Block) and Reserved Instance models