As a global company, Amazon.com employs over 100,000 individuals worldwide, with a range of full-time employees, contractors, partners, subsidiaries, and vendors, all of whom need some level of secure and potentially restricted access to the Amazon.com corporate network. By starting the move to Amazon WorkSpaces in 2015, Amazon.com centralized access to its corporate network without relying on VPN remote access or complex office networking topologies. In this session, we will dive deep to show exactly how Amazon.com approached its WorkSpaces design, implementation, and rollout. The talk provides detailed working knowledge on multiple Amazon VPC design considerations, including challenging IPv4 address space issues, AWS Direct Connect integrations with transitive routing, global Active Directory and GPO deployment, content filtering, patch management, virtualized applications with Amazon WorkSpaces Application Manager (Amazon WAM), offline access with Amazon WorkDocs, device management, and disaster recovery considerations. This session will also look at the long-pole migration to WorkSpaces, including the dependencies on office networking bandwidth and access points, and the move to a more decentralized networking infrastructure. Representatives from both AWS and Amazon.com will be presenting.
2. Quick Survey
Todd Beckett
Technical Program Manager, Amazon.com
Project Lead, Amazon.com Corporate WorkSpaces
Does any of this sound familiar?
• Fleets of Terminal Servers
• “Why can’t I use my (fill-in-the-blank nutty
machine with 123,233,223 video drivers)” aka
“it works great at my house!”
• “Why are the laptops stuck in customs?”
Let’s begin the WorkSpaces journey
3. The WorkSpaces Journey
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Pilot
• 200 Users
• 1 Region
Production
• 16K Capacity
• Subs, Partners, Restricted Users
First Subsidiary On-boarded
Pilot Full
Legal Asks for a WorkSpace
International Partner Moves Into Scope
International Partner Deployed
Expanded Availability
• Sydney and Singapore
• Data Centers, Training
• China
• 33% of Corp on Zero Clients Q3
N. Virginia Live
Structured Learning
• Don’t ask your C-level Officers to be pilot users (DUH!).
• Your documentation and issue reporting need to be publicly accessible.
Ireland & Oregon Live
• Get your prod rollout ready ASAP, and your AWS Direct Connect sooner.
• Solve for your most complex customer first.
• Patching WorkSpaces is different.
4. Customer Obsession
• We built an agent that manages 200K machines for Amazon
• We patch 5 platforms at 99.9% penetration
• We own and manage the MDM for Amazon.com
But, the new hotness?
• I work in Corporate IT at Amazon.com, and WorkSpaces is the only
service people are beating down the door to get
5. The Nitty Gritty
Steve Mueller
WorkSpaces Specialist SA, AWS
Technical Lead, Amazon.com Corporate WorkSpaces
Networking
Imaging
Automation
Migration and
Roadmap
6. Before We Begin
6 Regions
• Oregon
• Northern Virginia
• Ireland
• Tokyo
• Singapore
• Sydney
http://aws.amazon.com/about-aws/global-infrastructure/
(as of October 2015)
Amazon WorkSpaces
General Availability
7. A Brief Refresher
• Directory: a Directory Service instance
• 1 directory spans exactly two subnets
• 1 directory = 2 Amazon EC2 instances (1 per
subnet)
• You can have multiple directories in 1 Amazon VPC
• Each directory has its own registration code
• Zero clients: each regcode needs its own URL
Subnet A (AZ 1) Subnet B (AZ 2)
regcode
(example: WSpdx+A1B2C3)
Hard and fast rules to remember
• A WorkSpace is tied to exactly one directory
• A WorkSpace will live in 1 of the 2 directory subnets
The key takeaway here is …
zero client url
(example: https://url1.company.com)
Visualization of a Directory Instance
laptops, desktops, tablets
zero clients
8. A Wise Person Once Said …
A discussion about WorkSpaces will start at the desktop ...
… but end with the network.
9. The Deployment Model
• Regional proximity to users
• Tie into the global
corporate network via
Direct Connect
• Use existing IP space
• Restrict corporate network
access when necessary
• Enable future expansion
Amazon.com Global Corporate Network
(10.0.0.0/8)
10.44.192.0/20
10.44.208.0/20
10.44.224.0/20
10.44.240.0/20
TBD
TBD
This is Amazon EC2 at
scale.
200K+ users and growing
10. Accessing Corporate WorkSpaces
[Diagram labels: Users, Internet, MFA, WorkSpaces Service Broker, Authentication Gateway, Session Gateway, Streaming Gateway, Active Directory, corp servers, WorkSpaces, VGW, Direct Connect, Amazon Corp Net]
A) AWS-managed (public)
B) customer-managed (public and/or private)
secure protocols, analogous to VPN
(SSL and PCoIP w/ IPSec AES-256)
How Client Traffic Flows
1) Client authenticates (AD and MFA) via Authentication Gateway (SSL)
2) Client brokers desktop session with Session Gateway (SSL)
3) Client accesses desktop through Streaming Gateway (PCoIP w/ IPSec AES-256)
access from Corp
(wired, wireless, VPN)
Amazon-provided
hardware
From the Amazon Corporate Network
Zero Client
Gateway
B
Amazon.com VPC
A
Sophos
source filtering
by IP
Transit
InfoSec Logging
all corporate network access
untrusted prior to filtering
US East
Amazonians
us-east-1
• regional proximity
• tie into corp via DX
redundant
private VIFs
• use existing IP space
10.44.208.0/20
10.x.x.x/8
• restrict corp network access
KEY POINT
Kerberos/TGT
ticket
Streaming
Gateway IP
11. Accessing Corporate WorkSpaces
(Same diagram and three-step client traffic flow as slide 10.)
access from ANY network
BUT Amazon corporate
Amazon-provided hardware
From ANY Network BUT Amazon Corporate
Zero Client
Gateway
B
Amazon.com VPC
A
Sophos
source filtering
by IP
Transit
InfoSec Logging
all corporate network access
untrusted prior to filtering
Standalone
Network
• BYOD: use ANY device, not just
Amazon hardware
• BYON: more than just BYOD …
bring your own network
-or-
BYOD
• NEXT-GEN: the new corporate
network
12. VPC: Our 3 Golden Rules
• #1 customer question: “What’s the best VPC design?”
• Amazon.com: Historical problems with IP exhaustion in the 10/8
• multi-year reclamation effort, not done
• Rule #2: eliminate IP waste – be frugal with what we use
• Unknown end state: How many users will come?
• Every subnet costs us 5 IP addresses
• Expect new blocks, multiple VPCs per region, new regions
• Long-term vision: 1 VPC not big enough for all users
• AWS Fact: largest VPC size: /16 (65K addresses)
• 4 regions at launch = 4 VPCs minimum = 4 IP address blocks
• Rule #1: avoid paralysis – take what we can now and just go
• Rule #3: be flexible to accommodate what you don’t know
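The arithmetic behind Rule #2 can be sketched roughly as follows, using the five-address-per-subnet cost and the /16 VPC ceiling stated above. The function name `usable_addresses` and the sample CIDRs are illustrative assumptions, not the actual Amazon.com layout.

```python
import ipaddress

# Per the slide: every VPC subnet costs 5 IP addresses.
RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Addresses left for hosts in one subnet after the per-subnet cost."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - RESERVED_PER_SUBNET, 0)


# Illustrative subnets only (hypothetical placement inside 10.44.192.0).
for prefix in (20, 22, 23, 24, 26):
    cidr = f"10.44.192.0/{prefix}"
    print(f"{cidr:18} usable: {usable_addresses(cidr)}")

# The largest VPC is a /16.
print(ipaddress.ip_network("10.0.0.0/16").num_addresses)
```

The smaller the subnet, the larger the share of it lost to the fixed cost, which is why many small subnets waste more space than a few large ones.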
/23
/24
/22
/20
/26
“Embrace the ambiguity.”
Jim McDonald, Cloud Architect, Hess
AWS re:Invent 2014
https://www.youtube.com/watch?v=Qdk-bUQnCls
13. The VPC Distribution Model
Given a /18 address range for initial launch (16K) * …
• How do we break that into VPCs across 4 regions?
Thought Process
• 1 VPC per region, or should we use more?
Solution – keep it simple, don’t over-engineer
Weighted distribution of initial user demand by WorkSpaces region
* Thank you, Corporate Networking
Oregon: 10.44.192.0/20
N. Virginia: 10.44.208.0/20
Ireland: 10.44.224.0/20
Tokyo: 10.44.240.0/20
10.44.192.0/18
/18 = 2 /19
= 4 /20
REMINDER!
• Use even-sized VPCs: weighted delays the
inevitable, regional demand will outpace IPs
• 1 per region: avoid operational concerns
• Cut evenly, or weighted based on geo?
Even: /18 = /20 per region (4K each)
Weighted: Oregon bigger than Tokyo
Day 1 VPC Rollout: 10.44.192.0/20, 10.44.208.0/20, 10.44.224.0/20, 10.44.240.0/20
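The even split described on this slide can be reproduced with a few lines of Python. This is a minimal sketch: the /18 launch block and the four regions come from the slide, while the variable names are only for illustration.

```python
import ipaddress

launch_block = ipaddress.ip_network("10.44.192.0/18")
regions = ["Oregon", "N. Virginia", "Ireland", "Tokyo"]

# Even split: one /20 per region, assigned in address order.
vpc_by_region = dict(zip(regions, launch_block.subnets(new_prefix=20)))

for region, vpc in vpc_by_region.items():
    print(f"{region:12} {vpc}  ({vpc.num_addresses} addresses)")
```

A weighted split would hand out unequal prefixes instead; the slide's reminder is that this only delays the point where regional demand outgrows the block.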
14. Carving the VPC
/18 = 2 /19
= 4 /20
/20
[Diagram: the /20 carved as a binary tree of candidate subnet sizes: 2 × /21, 4 × /22, 8 × /23, 16 × /24]
let’s go to the
whiteboard!
Remember, we want:
• 2 subnets min. for WorkSpaces
• 2 subnets min. for Sophos
• to avoid IP burn
• to reduce operational burden
• to be flexible to change
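To make the trade-off between these goals concrete, here is a small sketch that compares uniform ways of carving one /20. It assumes the 5-addresses-per-subnet cost mentioned earlier in the deck; the `carve` helper is hypothetical, and a real layout would mix subnet sizes (for example, small Sophos subnets next to larger WorkSpaces subnets) rather than use a single prefix.

```python
import ipaddress

RESERVED_PER_SUBNET = 5  # per-subnet cost, as stated earlier in the deck

vpc = ipaddress.ip_network("10.44.192.0/20")


def carve(network, prefix):
    """Summarize a uniform carve of `network` into subnets of size /prefix."""
    subnets = list(network.subnets(new_prefix=prefix))
    return {
        "prefix": prefix,
        "subnets": len(subnets),
        "burned": RESERVED_PER_SUBNET * len(subnets),
        "usable_per_subnet": subnets[0].num_addresses - RESERVED_PER_SUBNET,
    }


options = [carve(vpc, p) for p in (21, 22, 23, 24)]
for o in options:
    print(f"/{o['prefix']}: {o['subnets']:2} subnets, "
          f"{o['burned']:2} IPs burned, {o['usable_per_subnet']} usable each")
```

Fewer, larger subnets burn fewer addresses and mean less to operate; more, smaller subnets leave more room to rearrange later.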
Direct Connect
(to on-prem)
VPC RULE: you can’t route
traffic to another instance inside
the same subnet. Must route to
Sophos instance in another
subnet.
10 IPs
10 IPs
10 IPs
10 IPs
10 IPs
10 IPs
10 IPs
10 IPs
1 zc url
1 zc url
1 zc url
1 zc url
1 zc url
1 zc url
1 zc url
1 zc url++
80 IPs 8 zc urls
X 4 regions
360 IPs 32 zc urls
too large for
sophos,
lopsided
remainder
The magic
number!
16. The Desktop Vending Machine
“… believe me, I’ve hugged servers enough in my life.
They DO NOT hug you back.”
Dr. Werner Vogels (re:Invent 2012)
“Don’t hug your desktops. They don’t hug you back.”
Amazon.com (re:Invent 2015)
re:Invent 2015 - Hands-On Labs on WorkSpaces
THEN …
… AND NOW
17. Image Growing Pains
• The Early Project Days : a constant imaging construction zone
• start from stock
• install by hand: malware protection, patch and asset
management, software distribution
• image and deploy
• we’re done, right?
Patch Tuesday
A never-ending list of app updates
Crack open. Update. Rinse and Repeat.
We need to automate.
This. Is. Laborious.
And error prone!
2 images per region.
4 regions total.
8 images.
And that’s just Day 1.
Ugh.
18. The Image Factory
• Package
• Download app installer
• Decorate with installation script – pre, main, post exec hooks
• Create zip → Amazon S3 → Amazon CloudFront
• Catalog
• Create manifest file for each unique desktop image
• csv: desc,url,file,reboot
• Deploy
• Image desktop: download bootstrap package, unpack
• Execute: top-level installation script
• Read manifest, download each package, unzip
• Execute: local package installation script
• Image
injection
imaging
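The catalog and deploy steps might look roughly like the sketch below. Only the manifest columns (desc, url, file, reboot) come from the slide. The sample rows, the example.com URLs, the accepted reboot values, and the function names are assumptions for illustration, and the sketch only builds an ordered plan rather than downloading or installing anything.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical manifest contents; columns follow the slide: desc,url,file,reboot
SAMPLE_MANIFEST = (
    "malware-protection,https://cdn.example.com/pkgs/av.zip,av.zip,no\n"
    "patch-agent,https://cdn.example.com/pkgs/patch.zip,patch.zip,yes\n"
)


@dataclass
class Package:
    desc: str
    url: str
    file: str
    reboot: bool


def read_manifest(text):
    """Parse manifest rows into Package records."""
    reader = csv.DictReader(
        io.StringIO(text), fieldnames=["desc", "url", "file", "reboot"]
    )
    return [
        Package(
            desc=row["desc"].strip(),
            url=row["url"].strip(),
            file=row["file"].strip(),
            reboot=row["reboot"].strip().lower() in ("1", "true", "yes"),
        )
        for row in reader
    ]


def install_plan(packages):
    """Ordered steps the top-level installation script would carry out."""
    steps = []
    for pkg in packages:
        steps.append(f"download {pkg.url}")
        steps.append(f"unzip {pkg.file}")
        steps.append(f"run local installation script for {pkg.desc}")
        if pkg.reboot:
            steps.append("reboot")
    return steps


packages = read_manifest(SAMPLE_MANIFEST)
for step in install_plan(packages):
    print(step)
```

Keeping one manifest per desktop image means a new image variant is a new CSV file, not a new installation script.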
19. What’s in an Image?
• Problem: How much should we install in an image?
• zero : core OS only with software distribution agents
• thin : zero + light footprint (protection and management agents)
• baked : thin + all other software
• Find the balance between “get going” and automation – MVP and iterate
• Immediate: Baked. Get desktops out. Touch, feel, collect data.
• Remote desktops need champions. Champions need to touch remote desktops.
• Parallel: Reduce and simplify – work towards thin, then zero
• Pulling apps out requires automation – not always easy
Don’t be afraid to experiment : work from a base image, or regenerate every time?
zero
thin
baked
TOP CUSTOMER QUESTION!
20. Images and Bundles
Instance Type
Image (AMI)
Bundle
WorkSpaces
The Image : Bundle : WorkSpace Relationship
• 1 bundle maps to exactly 1 image
Image v1.1.1.1
Bundle A
“Bundles. 2 years later.”
• Updated images created over time
Image v1.1.1.2
Image v1.1.2.1
Do we even care about image retention?
Image v1.1.1.1
Bundle A
Image v1.1.1.2
Bundle B
Image v1.1.2.1
Bundle C
• Different bundles?
• Same bundle?
• Who still has their Windows 95 CD?
• Patch management keeps older desktops updated
• Will always provision from latest image
• 1 WorkSpace maps to exactly 1 bundle
• Can’t remove a bundle with active WorkSpaces
One bundle to rule them all.*
Burdensome. But retention
and versioning.
Efficient. But zero retention, and
no versioning.
* Instance types notwithstanding.
• We DO care about versioning, however
21. The Image Catalog
Image v1.1.1.1
Bundle A
We like this.
Image v1.1.1.2
Image v1.1.2.1
Image v1.1.1.1
Image v1.1.1.2
Image v1.1.2.1
But we don’t like this.
And InfoSec needs this.
v1.1.2.1
v1.1.1.1
v1.1.1.2
So how do we retain version information?
• Registry or text file is most common
But this is EC2 – let’s grab ami-id metadata!
http://169.254.169.254/latest/meta-data/ami-id
ami-id
version
hostname
built-by
…
THE IMAGE CATALOG
Image v???
Image v???
The Image Factory
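A minimal sketch of the ami-id lookup, assuming the metadata URL shown on the slide answers a plain GET from inside the desktop (newer instance configurations may require a session token first). The catalog here is a hypothetical in-memory dict keyed by ami-id; the field names follow the slide, and the sample values are made up.

```python
import socket
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/ami-id"


def fetch_ami_id(url=METADATA_URL, timeout=2.0):
    """Read the ami-id from instance metadata (only reachable on the instance)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode().strip()


def catalog_record(catalog, fetch=fetch_ami_id):
    """Join the running desktop's ami-id with the image catalog entry."""
    ami_id = fetch()
    entry = catalog.get(ami_id, {})
    return {
        "ami-id": ami_id,
        "version": entry.get("version", "unknown"),
        "hostname": socket.gethostname(),
        "built-by": entry.get("built-by", "unknown"),
    }


# Hypothetical catalog and a stub fetcher so the example runs anywhere.
example_catalog = {"ami-12345678": {"version": "v1.1.2.1", "built-by": "image-factory"}}
print(catalog_record(example_catalog, fetch=lambda: "ami-12345678"))
```

Because the version lives in the catalog rather than in the image, nothing has to be written into the registry or a text file at build time.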
22. The Evolution of Automation
CLI Tools on A-Linux
#!/usr/bin/ruby
#!/usr/bin/perl
#!/bin/bash
• fast and easy start – “just go”
• many operations need data (dir-id, wsb, region): CSV files over API calls
• as data increases, fast and easy not so fast and easy anymore
• oh, right … no AWS SDK support for Perl
• object notation, AWS SDK support
Web-Based UI
Self-Service Portal for End-Users
Admin Portal for Helpdesk
(Python)
(Ruby)
API Gateway Lambda DynamoDB
create-workspaces
describe-workspaces
reboot-workspaces
terminate-workspaces
Public APIs
{ "key1": "val1", "key2": "val2" }
json transport
Common API Development
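As one way to picture the JSON transport behind a common API layer, the sketch below builds request bodies for create-workspaces. The DirectoryId, UserName, and BundleId fields are part of the public WorkSpaces API; the helper name, the placeholder IDs, and the default batch size of 25 are assumptions that should be checked against current API limits. Nothing here calls AWS.

```python
import json


def build_create_requests(users, directory_id, bundle_id, batch_size=25):
    """Return create-workspaces request bodies, split into batches."""
    specs = [
        {"DirectoryId": directory_id, "UserName": user, "BundleId": bundle_id}
        for user in users
    ]
    return [
        {"Workspaces": specs[i:i + batch_size]}
        for i in range(0, len(specs), batch_size)
    ]


# Placeholder IDs for illustration only.
requests = build_create_requests(
    users=[f"user{n}" for n in range(30)],
    directory_id="d-EXAMPLE",
    bundle_id="wsb-EXAMPLE",
)
print(len(requests), "requests")
print(json.dumps(requests[1]["Workspaces"][0]))
```

A self-service portal and a helpdesk portal can share a builder like this, so both front ends send identical payloads to the same backend.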
23. Event Handling
create-workspace
terminate-workspace
• delete object from Active Directory
• bind WorkSpace to Sophos
• email users
• post-install hooks for other activities
poll API with cron
CloudTrail
CloudWatch Logs
Kinesis
Lambda
API events
create-workspace ENI
terminate-workspace
25-30 minutes
IP ready only at end
We want workflow-driven behavior.
Code
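A sketch of what workflow-driven handling could look like in place of polling with cron: a dispatcher that maps an API event to follow-up actions. The grouping of actions under each event is an assumption inferred from the slide, the record is reduced to the `eventName` field found in CloudTrail records, and the handler only returns action names rather than performing them.

```python
# Assumed mapping of API events to the follow-up work listed on the slide.
FOLLOW_UP_ACTIONS = {
    "CreateWorkspaces": [
        "bind WorkSpace to Sophos",
        "email users",
        "run post-install hooks",
    ],
    "TerminateWorkspaces": [
        "delete object from Active Directory",
    ],
}


def actions_for(record):
    """Return the follow-up actions for one API event record."""
    return FOLLOW_UP_ACTIONS.get(record.get("eventName"), [])


def handle_batch(records):
    """Process a batch of records, as a Lambda-style function might."""
    return {
        index: actions_for(record)
        for index, record in enumerate(records)
        if actions_for(record)
    }


sample = [
    {"eventName": "CreateWorkspaces"},
    {"eventName": "DescribeWorkspaces"},
    {"eventName": "TerminateWorkspaces"},
]
print(handle_batch(sample))
```

Since a new WorkSpace's IP is ready only at the end of the 25-30 minute build, any create-side action that needs the address would still have to wait for that point.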
24. User Migration Efforts
WorkDocs
DFS File Share
cloud-based Sync Storage
• install WorkDocs sync agent on
existing desktops and WorkSpace
• data stored securely in S3,
synced across all devices
Zero Clients, Tablets,
Chromebooks
• rolled out 50 zero clients globally
• Chromebooks solve a lot of mobile
problems
• profile actual tablet usage – hype or real?
• different makes, models used
25. So What’s Next?
Governance by Usage
• by usage, not by business or use-case
• you want it, you get it
• no logins after 30 days? Warning.
• 45? Final warning. 60? Remove.
CloudWatch
Redshift
Scheduled Actions
Event-Driven Monitoring
• desktop marked unhealthy? failed login attempts on VIP desktops? Proactively capture, track, open trouble tickets
CloudWatch
CloudWatch Logs
Kinesis / Lambda
(auto-cut trouble ticket)
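The usage thresholds on this slide (30, 45, and 60 days) can be expressed as a small decision function. This is a sketch: treating each threshold as inclusive is an assumption, and the function name and return labels are hypothetical.

```python
from datetime import date


def governance_action(last_login: date, today: date) -> str:
    """Map days since the last login to the governance step from the slide."""
    idle_days = (today - last_login).days
    if idle_days >= 60:
        return "remove"
    if idle_days >= 45:
        return "final warning"
    if idle_days >= 30:
        return "warning"
    return "ok"


today = date(2015, 10, 8)
for last_login in (date(2015, 10, 1), date(2015, 9, 1), date(2015, 8, 20), date(2015, 7, 1)):
    print(last_login, governance_action(last_login, today))
```

A scheduled job could run this over last-login data once a day and act only on desktops whose result changed.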
26. Longer-Term Roadmap
Helpdesk Portal at Scale
• “How many WorkSpaces does a user have
globally?” “Last time logged in?” “Who’s
active now?”
• API operations expensive at scale, offset
with an indexing database in DynamoDB
Virtualized Software Distribution
• WorkSpaces Application Manager
• Provision and remove based on
employment status
Employment Verification, Geo-Alignment
• Auto-provision users to the AWS region
closest to their home office
Configuration Drift
• Alerts triggered when key infrastructure
changes
Transitive Routing
27. And Finally …
Just some quick stats
• 3K+ WorkSpaces provisioned
• Pilot to Production in 6 months
• includes 4 Direct Connects
YOU CAN DO THIS!
A small team of people helped change how
we do desktops at Amazon.com
The actual Amazon.com corporate WorkSpaces team in Las
Vegas for re:Invent 2015
Come see us at the Hands-On Labs!
• All of our best practices and automation
frameworks built the HOL WorkSpaces
environment
29. Hi, I am Jeff – Chief AWS Evangelist
What I do:
• Write the AWS Blog
• Record & edit podcasts
• Social media
Applications:
• Email
• Browser
• Amazon WorkDocs
• Audacity
• Amazon Music
35. Old World
• Multiple working environments
• Disjointed
• Transient
• Fragile
• Breakable hardware
• Drop connections
What Has Changed
New World
• Single working environment
• Unified
• Continuous
• Robust
• Amazon IT runs it
• Persistent sessions
36. My Laptop
• Crashed and re-imaged 3 months ago – no big deal
• Has become a legacy
• Unique stuff:
• Stickers
• ID of WorkSpace
• No:
• Proprietary data
• Apps or app patching
• Data & app transfer
37. Office
Zero Client
Dual monitors
• WorkSpaces
• No OS
My New Working Environment(s)
Home
Hand-built PC
Dual monitors
• WorkSpaces
• Microsoft Windows 7
• Oracle VM
VirtualBox
• Ubuntu
Mobile
Laptop
• WorkSpaces
• Windows 7