Machine Learning with Amazon SageMaker
- 1. 1
© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved
Machine learning for every data scientist and developer
Vladimír Šimek
Sr. Solutions Architect, AWS
Machine Learning with Amazon SageMaker
AWS Czech-Slovak Webinar, 02/2021
- 2. 2
Agenda
• Gentle introduction
• AI/ML on AWS – the full stack
• Introduction to Amazon SageMaker
• Use Cases
• Demos
• Resources
• Q&A in chat window
- 3. 3
The AWS ML Stack
Broadest and most complete set of machine learning capabilities
AI SERVICES
• Vision: Amazon Rekognition
• Speech: Amazon Polly, Amazon Transcribe (+Medical)
• Text: Amazon Comprehend (+Medical), Amazon Translate, Amazon Textract
• Search: Amazon Kendra
• Chatbots: Amazon Lex
• Personalization: Amazon Personalize
• Forecasting: Amazon Forecast
• Fraud: Amazon Fraud Detector
• Contact centers: Contact Lens, Voice ID for Amazon Connect
• Code and DevOps: Amazon CodeGuru, Amazon DevOps Guru (NEW)
• Industrial AI: Amazon Monitron (NEW), AWS Panorama + Appliance (NEW), Amazon Lookout for Vision (NEW), Amazon Lookout for Equipment (NEW)
• Anomaly detection: Amazon Lookout for Metrics (NEW)
• Health AI: Amazon HealthLake (NEW), Amazon Transcribe for Medical, Amazon Comprehend for Medical
ML SERVICES
Amazon SageMaker (SageMaker Studio IDE)
• Label data
• Aggregate & prepare data (NEW)
• Store & share features (NEW)
• Auto ML
• Spark/R
• Detect bias (NEW)
• Visualize in notebooks
• Pick algorithm
• Train models
• Tune parameters
• Debug & profile (NEW)
• Deploy in production
• Manage & monitor
• CI/CD (NEW)
• Human review
• NEW: Model management for edge devices
• NEW: SageMaker JumpStart
FRAMEWORKS & INFRASTRUCTURE
• Deep Learning AMIs & Containers
• GPUs & CPUs
• Elastic Inference
• Trainium
• Inferentia
• FPGA
• DeepGraphLibrary
- 4. 4
Machine learning development is complex and costly
• Label data
• Collect and prepare data
• Store features
• Check for bias
• Visualize in notebooks
• Pick algorithm
• Train models
• Tune parameters
• Deploy in production
• Manage and monitor
• CI/CD
- 5. 5
Amazon SageMaker
Most complete, end-to-end ML service
https://aws.amazon.com/sagemaker
• Integrated Workbench: Capabilities designed specifically for ML, data preparation, experiment management, and workflows
• Managed Infrastructure: Designed for ultra-low latency and high throughput, automatic scaling, and distributed training
• Managed Tooling: Purpose-built from the ground up to work together, including AutoML, collaboration, debugger, profiler, bias analyzer, and explainability
- 6. 6
Amazon SageMaker overview
PREPARE
• SageMaker Ground Truth: Label training data for machine learning
• SageMaker Data Wrangler (NEW): Aggregate and prepare data for machine learning
• SageMaker Processing: Built-in Python, bring-your-own R/Spark
• SageMaker Feature Store (NEW): Store, update, retrieve, and share features
• SageMaker Clarify (NEW): Detect bias and understand model predictions
BUILD
• SageMaker Studio Notebooks: Jupyter notebooks with elastic compute and sharing
• Built-in and bring-your-own algorithms: Dozens of optimized algorithms, or bring your own
• Local Mode: Test and prototype on your local machine
• SageMaker Autopilot: Automatically create machine learning models with full visibility
• SageMaker JumpStart (NEW): Pre-built solutions for common use cases
TRAIN & TUNE
• Managed Training: Distributed infrastructure management
• SageMaker Experiments: Capture, organize, and compare every step
• Automatic Model Tuning: Hyperparameter optimization
• Distributed Training Libraries (NEW): Training for large datasets and models
• SageMaker Debugger (NEW): Debug and profile training runs
• Managed Spot Training: Reduce training cost by up to 90%
DEPLOY & MANAGE
• Managed Deployment: Fully managed, ultra-low latency, high throughput
• Kubernetes & Kubeflow Integration: Simplify Kubernetes-based machine learning
• Multi-Model Endpoints: Reduce cost by hosting multiple models per instance
• SageMaker Model Monitor: Maintain accuracy of deployed models
• SageMaker Edge Manager (NEW): Manage and monitor models on edge devices
• SageMaker Pipelines (NEW): Workflow orchestration and automation
Amazon SageMaker Studio: Integrated development environment (IDE) for ML
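The Managed Spot Training cost reduction can be checked per job. The following is a minimal sketch, assuming the two durations are the `TrainingTimeInSeconds` and `BillableTimeInSeconds` values that SageMaker reports for a completed training job. The function name and the sample numbers are illustrative and do not come from the webinar.

```python
def spot_savings_percent(training_seconds: int, billable_seconds: int) -> float:
    """Percentage saved by Managed Spot Training for a single job.

    Assumes the inputs are the TrainingTimeInSeconds and
    BillableTimeInSeconds values reported for a completed training job.
    """
    if training_seconds <= 0:
        raise ValueError("training_seconds must be positive")
    # Savings is the share of training time that was not billed.
    return (1 - billable_seconds / training_seconds) * 100


# Hypothetical job: trained for 1 hour, billed for 12 minutes.
savings = spot_savings_percent(training_seconds=3600, billable_seconds=720)
print(f"{savings:.1f}% saved")
```

Actual savings depend on Spot capacity and interruptions, so the headline figure on the slide is best read as an upper bound.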
- 7. 7
Tens of thousands of customers use Amazon SageMaker
- 8. 8
SageMaker common use cases
• Demand forecasting: Retail, Consumer Goods, Manufacturing
• Extract and analyze data from documents: Healthcare, Legal, Media/Entertainment, Education
• Computer vision: Healthcare, Pharma, Manufacturing
• Autonomous driving: Automotive, Transportation
• Personalized recommendations: Media & Entertainment, Retail, Education
• Churn prediction: Retail, Education, Software & Internet
• Predictive maintenance: Manufacturing, Automotive, IoT
• Fraud detection: Financial Services, Online Retail
• Credit risk prediction: Financial Services, Retail
- 9. 9
Amazon SageMaker features
Amazon SageMaker Studio
Amazon SageMaker Autopilot
Amazon SageMaker JumpStart
Amazon SageMaker Pipelines
Amazon SageMaker Clarify
PREPARE DATA
Amazon SageMaker Ground Truth
Amazon SageMaker Processing
Amazon SageMaker Data Wrangler
Amazon SageMaker Feature Store
TRAIN
Managed Training
Amazon SageMaker Experiments
Amazon SageMaker distributed training libraries
Amazon SageMaker Debugger
DEPLOY
Managed Deployment
Amazon SageMaker Edge Manager
OTHER
Kubernetes integration
Security
Human reviews
- 10. 10
Amazon SageMaker Studio
- 11. 11
Amazon SageMaker Studio
Fully Integrated Development Environment (IDE) for machine learning
• Collaboration at scale: Share notebooks without tracking code dependencies
• Easy experiment management: Organize, track, and compare thousands of experiments
• Automatic model generation: Get accurate models with full visibility and control without writing code
• Higher quality ML models: Automatically debug errors, monitor models, and maintain high quality
• Increased productivity: Code, build, train, deploy, and monitor in a unified visual interface
- 12. 12
Use Amazon SageMaker Studio to update models and see the impact on model quality immediately
- 13. 13
Amazon SageMaker Notebooks
Fast-start sharable notebooks
• Easy access with Single Sign-On (SSO): Access your notebooks in seconds
• Fully managed and secure: Administrators manage access and permissions
• Fast setup: Start your notebooks without spinning up compute resources
• Easy collaboration: Share notebooks with a single click
• Flexible: Dial up or down compute resources (coming soon)
- 14. 14
Use Amazon SageMaker Notebooks to easily share your work with colleagues
- 15. 15
Code dependencies are automatically captured to enable collaboration with colleagues
- 16. 16
Amazon SageMaker Autopilot
Automatic model creation with full visibility and control
• Quick to start: Provide your data in tabular form and specify the target column to predict
• Automatic model creation: Get ML models with feature engineering and model tuning done automatically
• Visibility and control: Get notebooks for your models with source code
• Recommendations and optimization: Get a leaderboard and continue to improve your model
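As a rough illustration of "provide tabular data and specify the target", the sketch below assembles the request parameters for the SageMaker `CreateAutoMLJob` API as a plain dictionary. The helper name, bucket paths, column name, and role ARN are hypothetical placeholders, and the block only builds and prints the request instead of calling AWS.

```python
import json


def build_autopilot_request(job_name, train_s3_uri, target_column,
                            output_s3_uri, role_arn):
    """Assemble keyword arguments for the SageMaker CreateAutoMLJob API."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                # Tabular training data (for example CSV) under an S3 prefix.
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                    }
                },
                # The column Autopilot should learn to predict.
                "TargetAttributeName": target_column,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "RoleArn": role_arn,
    }


# All values below are hypothetical placeholders.
request = build_autopilot_request(
    job_name="churn-autopilot-demo",
    train_s3_uri="s3://example-bucket/churn/train/",
    target_column="churn",
    output_s3_uri="s3://example-bucket/churn/output/",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)

# With boto3 installed and AWS credentials configured, the job could be
# started with:
#   import boto3
#   boto3.client("sagemaker").create_auto_ml_job(**request)
print(json.dumps(request, indent=2))
```

Keeping the request as data makes it easy to review or version before a job is actually launched.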
- 17. 17
Use Amazon SageMaker Autopilot to automatically train and tune the best machine learning models
- 18. 18
Demo #1
- 19. 19
Amazon SageMaker JumpStart
- 20. 20
Getting started with machine learning can be challenging
• Requires knowledge of cloud infrastructure
• Time consuming
• Multiple steps involved with building, training, and deploying ML models
- 21. 21
SageMaker JumpStart
Easily and quickly bring machine learning applications to market
• 15+ pre-built solutions for common ML use cases: Solutions can be used out-of-the-box or can be customized for a specific business problem
• Accelerate time to deploy over 150 open source models: Use one-click deployable ML models and algorithms from popular model zoos
• Get started with just a few clicks: Easily bring ML applications to market using pre-built solutions, ML models and algorithms from popular model zoos, and getting started content
- 22. 22
SageMaker JumpStart use cases
• Autonomous driving
• Churn prediction
• Computer vision
• Credit risk prediction
• Demand forecasting
• Extract data from documents
• Fraud detection
• Personalized recommendations
• Predictive maintenance
- 23. 23
Amazon SageMaker Data Wrangler
- 24. 24
80% of time spent on data prep
What data scientists spend the most time doing
• Cleaning and organizing data: 60%
• Collecting data sets: 19%
• Mining data for patterns: 9%
• Other: 5%
• Refining algorithms: 4%
• Building training sets: 3%
Source: Forbes survey of 80 data scientists, March 2016
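The "80%" headline presumably combines the two data preparation activities in the chart. A quick check, assuming each percentage pairs with the label in the same listing position (the pairing is inferred from the order on the slide, not stated explicitly):

```python
# Shares of time as listed on the slide; the label-to-percentage pairing
# is an assumption based on listing order.
time_share = {
    "Cleaning and organizing data": 60,
    "Collecting data sets": 19,
    "Mining data for patterns": 9,
    "Other": 5,
    "Refining algorithms": 4,
    "Building training sets": 3,
}

# Data preparation is taken here to mean collecting plus cleaning.
data_prep = (
    time_share["Cleaning and organizing data"]
    + time_share["Collecting data sets"]
)
total = sum(time_share.values())

print(f"Data prep: {data_prep}% of {total}%")
```

Under that reading, data preparation accounts for 79% of the time, which the slide rounds to 80%.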
- 25. 25
Amazon SageMaker Data Wrangler
The fastest and easiest way to prepare data for machine learning
- 26. 26
SageMaker Data Wrangler use cases
• Cleanse & Explore Data: Use built-in data transforms to accelerate data cleansing and exploration
• Visualize & Understand Data: Quickly detect outliers or extreme values within a data set without the need to write code
• Enrich Data: Use pre-configured data transformation tools to transform data into formats that can be used to build accurate ML models
- 27. 27
Quickly select and query data
• Select data from Amazon Athena, Amazon Redshift, AWS Lake Formation, Amazon S3, and features from SageMaker Feature Store
• Write queries for data sources before importing data into SageMaker Data Wrangler
• Import data in various file formats, such as CSV files, Parquet files, and database tables, directly into Amazon SageMaker
- 28. 28
Easily transform data
• Transform your data without writing a single line of code using pre-configured data transforms
• Pre-configured data transforms include convert column type, rename column, and delete column
• Author custom transforms in PySpark, SQL, and Pandas
• Detect bias and identify dataset imbalance with SageMaker Clarify
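To make "dataset imbalance" concrete, here is a minimal sketch of the class imbalance (CI) measure, one of the pre-training bias metrics described in the SageMaker Clarify documentation: CI = (n_a - n_d) / (n_a + n_d), where n_a and n_d are the record counts of the two facet groups. The facet values and counts below are made up for illustration, and this is a hand calculation, not a call into Clarify itself.

```python
from collections import Counter


def class_imbalance(n_a: int, n_d: int) -> float:
    """Class imbalance between two facet groups, in the range [-1, 1].

    0 means both groups are equally represented; values near +1 or -1
    mean one group dominates the dataset.
    """
    total = n_a + n_d
    if total == 0:
        raise ValueError("at least one record is required")
    return (n_a - n_d) / total


# Hypothetical facet column with two groups, "a" and "d".
facet_values = ["a"] * 75 + ["d"] * 25
counts = Counter(facet_values)

ci = class_imbalance(counts["a"], counts["d"])
print(f"CI = {ci}")
```

A value this far from zero suggests the smaller group may be under-represented in training, which is the kind of issue the bias report is meant to surface early.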
- 29. 29
Understand your data visually
• Intuitively understand your data with a set of pre-configured visualization templates
• Pre-configured visualization templates include histograms, scatter plots, box and whisker plots, line plots, and bar charts
• Interactively create and edit your own visualizations so you can quickly detect outliers or extreme values
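A box and whisker plot flags outliers with a simple rule that is easy to reproduce by hand. The sketch below uses the common 1.5 × IQR (Tukey) convention with Python's standard library. It illustrates the idea only; it is an assumption that this matches what Data Wrangler's templates compute, and the sample values are invented.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], in input order."""
    # quantiles() with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements with one extreme value.
sample = [12, 13, 13, 14, 15, 15, 16, 17, 95]
print(iqr_outliers(sample))
```

The multiplier `k` controls how far a point must sit beyond the quartiles before it is flagged; 1.5 is the conventional whisker length.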
- 30. 30
Quickly estimate model accuracy
• Identify inconsistencies in data preparation workflows and diagnose issues before ML models are deployed into production
• Select subsets of data to identify errors
• Identify which features are contributing to model performance relative to others
• Determine if additional feature engineering is needed to improve model performance
- 31. 31
Deploy data preparation workflows into production
• Export data preparation workflows to a notebook or Python code
• Integrate your workflow with SageMaker Pipelines to automate model deployment and management
• Publish created features to SageMaker Feature Store for reuse and syndication across teams and projects
- 32. 32
Demo #2
- 33. 33
Amazon SageMaker is DevOps ready
• Security: Security features to help you meet strict security requirements of ML workloads
• Compliance: PCI, HIPAA, SOC 1/2/3, FedRAMP, and ISO 9001/27001/27017/27018
• ML workflows: Create automated workflows in minutes to support thousands of models
• Scalability: Train complex models with massive datasets
• Orchestration: Automatic scheduling and execution of jobs with managed infrastructure
- 34. 34
Amazon SageMaker integrates with Kubernetes
1. Amazon SageMaker Components for Kubeflow Pipelines
2. Amazon SageMaker Operators for Kubernetes
- 35. 35
Amazon SageMaker
Built-in features help you go from idea to production faster, without compromising security
• Infrastructure and network isolation: Control data traffic across SageMaker components over a private network, and ensure appropriate ingress/egress with single-tenancy
• Authentication and authorization: Define, enforce, and audit who can be authenticated and authorized to use Amazon SageMaker resources
• Data protection: Ensure automatic data encryption at rest and in transit with flexibility to bring your own keys
• Auditability and monitoring: Track, trace, and audit all API calls, events, data access, or interactions down to the user and IP level to ensure quick remediation
• Compliance certifications: Inherit the most comprehensive compliance controls, and easily abide by your industry’s legislation
- 36. 36
Resources
https://aws.amazon.com/sagemaker/
https://aws.amazon.com/sagemaker/getting-started/
https://www.aws.training/
https://github.com/aws/amazon-sagemaker-examples
https://www.getstartedonsagemaker.com/
https://aws.amazon.com/free/
- 37. 37
Thank you
vladsim@amazon.com