Building a Just-in-Time Application Stack for Analysts

Today’s Speaker
Scott Jeschonek
Director of Cloud Products
Avere Systems

Housekeeping
• Recording
• Attachments
• Questions
• Rating

Agenda
• Highlight challenges faced by today’s IT organizations, especially
with analytics teams, when dealing with public clouds
• Focus for today largely on compute and data
• Discuss how to meet these challenges
• How to create a scalable compute environment in under 10 minutes
• How to leverage data both in and outside the public cloud

Clouds Can Be Easy to Use
AWS EC2 Compute
Google GCP Compute
Microsoft Azure Compute

Each Cloud Offers
Clouds Can Do Many Things
• Virtual Machines/Compute
• Containers
• Storage
• Databases (various)
• Networking
• Tiered Applications
• Big Data Processing
• And more…too many to mention
in a slide

Overall Benefits of Cloud, Tools and Integrators
• Cloud platform reduces fulfillment time for new resources
• Cloud platform removes permanence from resource allocation
• Cloud platform removes cost from resource allocation (CAPEX)
• Cloud platform increases capacity and flexibility
• Cloud services and tools decrease complexity and cost of
ownership

In Fact It’s So Easy ….
…End users can set things up themselves.

Common End User Comments
• “I can’t wait for IT to give me
resources.”
• “I don’t have to wait for IT to
give me resources.”
• “There are too many
requirements to use IT
resources…I’ll just go to (enter
public cloud name here).”

Liberating, but Still Liable
• Corporate or Institutional data
• Spending on behalf of corporation or institution equates to direct liability
• Security concerns remain, even if the environment is self-contained
• Costs can spiral out of control; budgets may not account for these
spending events

Cloud - Extension of IT Resources
• Budget chargeback
• Networking(!)
• Security (of users, of data)
• Resource fulfillment
• Capacity planning (for budgets)

With the Right Tools, IT Can Make Cloud Magic
• On-demand services with automated chargeback
• Extension of existing automation capabilities
• Rapid allocation of new compute without CAPEX costs
• Significantly reduced fulfillment time
– From order, ship, unbox, rack & stack to “run automation”

Myriad of Third Party Tools and Services

Cloud Compute Use Case Examples
• Analytical processing (either single or multi machine use cases)
– Life Sciences Analytics / Quality Check (QC) / SNP analysis
applications
– Financial Risk Modeling
– Rendering and Transcoding activities
• Build/Test environments
• Big Data applications such as Hadoop
• Application servers/services
• Or simply workstations on demand for temporary use
– Example: Amazon Workspaces

Cloud Compute Usage Examples
[Figure: four cloud compute usage patterns]
• 100% cloud compute with local/SSD storage
• 100% cloud compute with local/SSD storage and cloud storage
• 100% cloud compute with local/SSD storage, reading on-premises NAS data over the WAN
• Extended compute (burst) into the cloud: on-premises compute plus cloud compute with local/SSD storage, reading on-premises NAS data over the WAN

Data Considerations
• Considerations:
– Is there a lot of data?
– Are there multiple nodes acting on the data?
– Will the workload write a lot of data (versus mostly reading it)?
– Is the data sensitive?
– Is there a scratch space requirement?
– Will the data need to persist in the cloud?

Choices for Your Data
• Copy to local SSD or Persistent SSD/EBS on each node
• Locate / migrate data to object store bucket in cloud provider
• Run a file system in the compute environment and serve data as a
NAS
• Use a caching layer in the compute environment and serve only
requested data, leaving the data wherever it originated
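The last option, a caching layer that serves only requested data, can be sketched as a small read-through cache. This is a minimal illustration of the general idea, not how any particular product implements it; the class name, the `fetch_from_origin` callback, and the LRU eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Tiny LRU read-through cache keyed by file path (illustrative only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes], capacity: int = 128):
        # fetch_from_origin is a hypothetical callback that reads from
        # wherever the data originated (for example, an on-premises NAS).
        self._fetch = fetch_from_origin
        self._capacity = capacity
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self.origin_reads = 0  # how many reads had to go back to the origin

    def read(self, path: str) -> bytes:
        if path in self._entries:
            # Cache hit: mark as most recently used and serve locally.
            self._entries.move_to_end(path)
            return self._entries[path]
        # Cache miss: fetch only the requested item from the origin.
        data = self._fetch(path)
        self.origin_reads += 1
        self._entries[path] = data
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return data


if __name__ == "__main__":
    origin = {"/data/sample1.bam": b"reads-1", "/data/sample2.bam": b"reads-2"}
    cache = ReadThroughCache(origin.__getitem__, capacity=1)
    cache.read("/data/sample1.bam")
    cache.read("/data/sample1.bam")  # second read is served from the cache
    print(cache.origin_reads)
```

Data that no client asks for never crosses the WAN, which is the property the bullet describes.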

Avere vFXT – Caching File System in the Cloud
• Avere vFXT:
– Highest performance
– Scale-out NAS
– Ideal for high core-count applications and large numbers of servers
– Global namespace: one mount for various sources, including cloud and
on-premises data
– Scale up and down as demand requires
– Only obtains data that has been requested by clients
– Ideal for cloud bursting on-premises data to cloud compute
– Scale = 10s of 1000s of cores

Avere CloudFusion: NAS-in-the-Cloud
• Avere CloudFusion
– Single-node, low cost caching NAS
– Uses low-cost S3 object storage as the backing store
• Store significant data
– Presents NFS or SMB
– Supports multiple clients
• For example, use it as your AWS Workspaces storage
– Use as scratch space
• Simple to configure

Advantages of a caching layer in compute
• No persistent data in compute = lower cost
• Achieve high performance at low latencies
• Maintain data security by leaving it on-premises
• Abstract data sources between on-premises and cloud for a single
file system experience
• Reduce complexity of compute environment by avoiding re-write of
any applications

Deployment of Application Stack
• Among the many ways, we’ll start with those provided by the cloud
providers themselves
• For compute, choose:
– A pre-configured image (AMI, VM) with all necessary software
– Multiple pre-configured images with all necessary software
– Pre-configured images using Puppet or other CM tool for updates
– A container, or a set of containers in a cluster
• For networking, choose:
– A configured VPN (for internet-based connectivity)
– Cloud Provider peering connections
– Direct connectivity through companies like Equinix
– Security Group / Firewall / route configurations

Deployment of Application Stack (continued)
• For security, choose:
– IAM in the public cloud
– Service accounts / roles to restrict what the compute nodes can access
• For data, choose:
– A caching / file system application
– Program to copy / move data to the local nodes, triggered as part of the
stack creation

The 10-Minute Stack
• AWS: CloudFormation Template (JSON / REST)
• Google Launcher / Deployment Manager Templates (YAML, Python)
• Microsoft Azure Resource Manager (JSON / REST)
Each provider offers extensive examples on its site.
For AWS, wrappers such as Terraform and Troposphere
reduce the complexity.
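To give a sense of what a CloudFormation template contains, here is a minimal sketch that generates one as JSON from plain Python. The resource names (`AppNode0`, `AppNode1`), the default instance type, and the choice to pass the image and subnet as parameters are assumptions for illustration; a real stack would add security groups, roles, and volumes.

```python
import json


def build_template(instance_type: str = "t2.micro", node_count: int = 2) -> dict:
    """Build a small CloudFormation template with identical compute nodes."""
    resources = {}
    for i in range(node_count):
        # Logical IDs must be alphanumeric, hence AppNode0, AppNode1, ...
        resources[f"AppNode{i}"] = {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "ImageId"},      # pre-configured AMI
                "InstanceType": instance_type,
                "SubnetId": {"Ref": "SubnetId"},    # existing VPC subnet
            },
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative just-in-time compute stack",
        "Parameters": {
            "ImageId": {"Type": "AWS::EC2::Image::Id"},
            "SubnetId": {"Type": "AWS::EC2::Subnet::Id"},
        },
        "Resources": resources,
    }


if __name__ == "__main__":
    print(json.dumps(build_template(node_count=4), indent=2))
```

The printed JSON can be saved to a file and passed to `aws cloudformation create-stack` with `--template-body`, supplying the two parameters for an existing image and subnet.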

What You’ll Need
• Command-line tools (AWS CLI, gcloud, PowerShell)
• Text editor / code editor
• A Project / VPC / Network in the respective cloud
– Assume that you will create multiple stacks but within an existing
infrastructure framework
– Use the command-line tools and scripts (Python, etc.) to validate the network and security
environments
• Image (AMI/Virtual Machine) or configuration management (e.g.,
Chef) for application image creation
• File System capability…we’ll use Avere
– You’ll need to write some Python for this piece

Google Deployment Manager
resources:
- name: vm-instance
  type: compute.v1.instance
  properties:
    disks:
    - deviceName: boot
      type: PERSISTENT
      boot: true
      autoDelete: true
      initializeParams:
        sourceImage: https://www.googleapis.com/compute/v1/projects/debian-cloud/global/images/debian-7-wheezy-v20150526
    machineType: https://www.googleapis.com/compute/v1/projects/myproject/zones/us-central1-f/machineTypes/f1-micro
    networkInterfaces:
    - network: $(ref.a-new-network.selfLink)
      accessConfigs:
      - name: External NAT
        type: ONE_TO_ONE_NAT
    zone: us-central1-f
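Deployment Manager also accepts Python templates, which expose a `GenerateConfig(context)` function that returns the same structure as the YAML. The sketch below is a rough Python equivalent of the configuration above; the `zone` and `machineType` property names are assumptions chosen for this example, and the `a-new-network` reference still has to be defined by another resource in the deployment.

```python
COMPUTE_URL_BASE = "https://www.googleapis.com/compute/v1/"


def GenerateConfig(context):
    """Deployment Manager entry point: return the resources to create."""
    project = context.env["project"]
    # "zone" and "machineType" are properties this example expects the
    # caller to supply; they are not fixed by Deployment Manager itself.
    zone = context.properties["zone"]
    machine_type = context.properties.get("machineType", "f1-micro")

    instance = {
        "name": "vm-instance",
        "type": "compute.v1.instance",
        "properties": {
            "zone": zone,
            "machineType": (
                f"{COMPUTE_URL_BASE}projects/{project}/zones/{zone}"
                f"/machineTypes/{machine_type}"
            ),
            "disks": [{
                "deviceName": "boot",
                "type": "PERSISTENT",
                "boot": True,
                "autoDelete": True,
                "initializeParams": {
                    "sourceImage": (
                        f"{COMPUTE_URL_BASE}projects/debian-cloud/global"
                        "/images/debian-7-wheezy-v20150526"
                    ),
                },
            }],
            "networkInterfaces": [{
                # Refers to a network resource defined elsewhere.
                "network": "$(ref.a-new-network.selfLink)",
                "accessConfigs": [{
                    "name": "External NAT",
                    "type": "ONE_TO_ONE_NAT",
                }],
            }],
        },
    }
    return {"resources": [instance]}
```

Using a Python template makes it straightforward to loop over a node count or vary the machine type per deployment instead of copying YAML blocks.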

What Will You Create with the Templates?
• All necessary security configuration (if it does not already exist)
– For example, if you require that your instances access object storage,
then permission will need to be granted to the instance either directly (in
Google’s case) or via IAM role (for AWS)
• Disks (volumes) for the machines (if using persistent)
• Network routes for new addresses or network/subnets
• Compute instances
• UserData can then be included in the templates to call extra
configuration on the instances
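The UserData step can be sketched as follows: a helper assembles a bootstrap shell script, then wraps it either for a CloudFormation template (`Fn::Base64`) or as a base64 string for direct EC2 API calls. The function names and the example commands are hypothetical; what actually runs on first boot depends on your image and configuration management tool.

```python
import base64
from typing import List


def build_user_data(commands: List[str]) -> str:
    """Assemble a bash bootstrap script from a list of shell commands."""
    lines = ["#!/bin/bash", "set -euo pipefail"] + list(commands)
    return "\n".join(lines) + "\n"


def cloudformation_user_data(script: str) -> dict:
    """Wrap the script for the UserData property of AWS::EC2::Instance."""
    return {"Fn::Base64": script}


def encode_for_ec2_api(script: str) -> str:
    """Base64-encode the script for callers that pass UserData directly."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Placeholder commands; substitute whatever your image needs at boot.
    script = build_user_data([
        "mkdir -p /mnt/data",
        "echo 'bootstrap complete' >> /var/log/stack-bootstrap.log",
    ])
    print(cloudformation_user_data(script))
```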

Deploying Avere with the Stack
• Leverage CloudFormation / Deployment Manager / Resource
Manager to set up the initial nodes
• Add checks to ensure networking is configured properly
– Cloud provider endpoint access is critical
• GCS/S3 API endpoint for storage, EC2 or GCE endpoint for controlling IP
address failover for vFXT
• Call XML-RPC library to complete configuration of
– “Core filer” mappings
– Client IP address configuration
– Integration with AD or NIS
– Configuration to on-premises NFS server
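The networking checks mentioned above could start with something as simple as a TCP reachability test against the cloud provider endpoints. This is a minimal sketch using only the standard library; the endpoint hostnames in the demo are examples and depend on the provider and region in use, and a successful TCP connection does not prove that credentials or routes for every API call are correct.

```python
import socket
from typing import Dict, Iterable, Tuple


def can_connect(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_endpoints(endpoints: Iterable[Tuple[str, int]]) -> Dict[str, bool]:
    """Check each (host, port) pair and report reachability by name."""
    return {f"{host}:{port}": can_connect(host, port) for host, port in endpoints}


if __name__ == "__main__":
    # Example endpoints only; adjust for your provider and region.
    results = check_endpoints([
        ("s3.amazonaws.com", 443),
        ("ec2.us-east-1.amazonaws.com", 443),
        ("storage.googleapis.com", 443),
    ])
    for name, ok in results.items():
        print(f"{name}: {'reachable' if ok else 'UNREACHABLE'}")
```

Running a check like this before the file system configuration step lets the stack creation fail early with a clear message instead of partway through setup.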

End State
[Figure: application nodes in the cloud mount an Avere vFXT running in compute, which reaches the on-premises NAS over the WAN]
• Validated network
– AWS: VPC
– GCP: Project Network
– Azure: Virtual Network
• NAT / proxy / VPN / router
• vFXT configured with
– IP addresses
– DNS, NTP
– Mapping to on-premises NAS
– Export for global namespace
• Application nodes have a mount point configured based on the Avere vFXT export addresses
• IAM roles applied
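One way to derive each application node's mount point from the vFXT export addresses is to hand out the addresses round-robin, so clients are spread across the cluster. The sketch below only generates the mount commands; the node names, IP addresses, export path, and the round-robin policy itself are assumptions for illustration.

```python
from itertools import cycle
from typing import Dict, List


def assign_mounts(
    nodes: List[str],
    vfxt_addresses: List[str],
    export: str,
    mount_point: str,
) -> Dict[str, str]:
    """Map each node name to an NFS mount command, cycling through addresses."""
    if not vfxt_addresses:
        raise ValueError("at least one vFXT client-facing address is required")
    addresses = cycle(vfxt_addresses)
    return {
        node: f"mount -t nfs {next(addresses)}:{export} {mount_point}"
        for node in nodes
    }


if __name__ == "__main__":
    # Hypothetical node names and addresses.
    commands = assign_mounts(
        nodes=["app-node-1", "app-node-2", "app-node-3", "app-node-4"],
        vfxt_addresses=["10.0.0.10", "10.0.0.11", "10.0.0.12"],
        export="/data",
        mount_point="/mnt/data",
    )
    for node, command in commands.items():
        print(f"{node}: {command}")
```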

Summary
• Cloud Tools abound for creating on-demand application stacks in
your favorite cloud
• IT organizations can leverage these clouds and tools to maximize
their customers’ capabilities and thus their satisfaction
• Leverage caching file systems running in the cloud to provide
performance-based access to only relevant data, limiting the need
to move large amounts of data into the cloud temporarily

Avere Systems
Scott Jeschonek
scottj@averesystems.com
Averesystems.com
Thank you!
