© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Effective Distributed Training and Model Optimization for Deep Learning Models
김무현, Data Scientist
AWS ML Solutions Lab
Amazon ML Solutions Lab
Brainstorming | Modeling | Teaching
Leverage Amazon experts with decades of ML experience with technologies like Amazon Echo, Amazon Alexa, Prime Air, and Amazon Go.
The Amazon ML Solutions Lab provides ML expertise.
Put machine learning in the hands of every developer.
Now let's make it as fast, efficient, and inexpensive as possible.
The Amazon ML Stack: Broadest & Deepest Set of Capabilities

AI SERVICES
• Vision: Rekognition Image, Rekognition Video
• Speech: Polly, Transcribe
• Language: Translate, Comprehend, Comprehend Medical, Textract
• Chatbots: Lex
• Forecasting: Forecast
• Recommendations: Personalize

ML SERVICES
Amazon SageMaker
• BUILD: pre-built algorithms & notebooks, data labeling (Ground Truth), algorithms & models (AWS Marketplace)
• TRAIN: one-click model training & tuning, models without training data (reinforcement learning, RL Coach)
• DEPLOY: optimization (Neo), one-click deployment & hosting

ML FRAMEWORKS & INFRASTRUCTURE
• Frameworks, interfaces, and infrastructure: EC2 P3 & P3dn, EC2 C5, FPGAs, Greengrass, Elastic Inference
Agenda
• Optimizing Infrastructure and Frameworks
• Distributed training for TensorFlow, MXNet, Keras, PyTorch
• Let’s tune models using Amazon SageMaker HPO
• Optimizing the trained model for deployment
Where to train and deploy deep learning models
• Amazon SageMaker
• Amazon Elastic Container Service for Kubernetes
• Amazon Elastic Container Service
• Amazon EC2
• AWS Deep Learning Containers
• AWS Deep Learning AMIs
Making TensorFlow faster
Training a ResNet-50 benchmark with the synthetic ImageNet dataset
using our optimized build of TensorFlow 1.11 on a c5.18xlarge instance
type is 11x faster than training on the stock binaries.
https://aws.amazon.com/about-aws/whats-new/2018/10/chainer4-4_theano_1-0-2_launch_deep_learning_ami/
October 2018
Available with Amazon SageMaker,
AWS Deep Learning AMIs, and AWS Deep Learning Containers
Amazon EC2 P3dn
https://aws.amazon.com/blogs/aws/new-ec2-p3dn-gpu-instances-with-100-gbps-networking-local-nvme-storage-for-faster-machine-learning-p3-price-reduction/
Reduce machine learning training time; better GPU utilization; support larger, more complex models.

KEY FEATURES
• 100 Gbps of networking bandwidth
• 8 NVIDIA Tesla V100 GPUs
• 32 GB of memory per GPU (2x more than P3)
• 96 Intel Skylake vCPUs (50% more than P3) with AVX-512
The Amazon EC2 P3 instance type has the most powerful GPU, the NVIDIA V100.
But
Are you fully utilizing your GPUs?
Tensor Core and mixed-precision training
https://arxiv.org/abs/1710.03740
How to port training scripts to mixed precision
https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
Port the model to use the FP16 data type where appropriate:
1. Use the float16 data type for models containing convolutions or matrix multiplications
2. Check that trainable variables are float32 before converting them to float16
3. Use float32 for the softmax calculation
Add loss scaling to preserve small gradient values:
1. Multiply the loss by a scale factor before computing gradients
2. Divide the calculated gradients by the same scale factor
Code snippet for mixed-precision training in TensorFlow

MLP normal implementation:

x = tf.placeholder(tf.float32, [None, 784])
W1 = tf.Variable(tf.truncated_normal([784, FLAGS.num_hunits]))
b1 = tf.Variable(tf.zeros([FLAGS.num_hunits]))
z = tf.nn.relu(tf.matmul(x, W1) + b1)
W2 = tf.Variable(tf.truncated_normal([FLAGS.num_hunits, 10]))
b2 = tf.Variable(tf.zeros([10]))
y = tf.matmul(z, W2) + b2
y_ = tf.placeholder(tf.int64, [None])
cross_entropy = tf.losses.sparse_softmax_cross_entropy(labels=y_, logits=y)
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

MLP mixed-precision implementation:

data = tf.placeholder(tf.float16, shape=(None, 784))
W1 = tf.get_variable('w1', (784, FLAGS.num_hunits), tf.float16)
b1 = tf.get_variable('b1', (FLAGS.num_hunits), tf.float16,
                     initializer=tf.zeros_initializer())
z = tf.nn.relu(tf.matmul(data, W1) + b1)
W2 = tf.get_variable('w2', (FLAGS.num_hunits, 10), tf.float16)
b2 = tf.get_variable('b2', (10), tf.float16,
                     initializer=tf.zeros_initializer())
y = tf.matmul(z, W2) + b2
y_ = tf.placeholder(tf.int64, shape=(None))
# Keep the softmax/loss computation in float32
loss = tf.losses.sparse_softmax_cross_entropy(y_,
                                              tf.cast(y, tf.float32))

* Source code from https://github.com/khcs/fp16-demo-tf
Code snippet for mixed-precision training in TensorFlow

MLP normal implementation:

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
# Train
for _ in range(3000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

MLP mixed-precision implementation:

def gradients_with_loss_scaling(loss, variables, loss_scale):
    # Scale the loss up before computing gradients, then scale the gradients back down
    return [grad / loss_scale
            for grad in tf.gradients(loss * loss_scale, variables)]

with tf.device('/gpu:0'), \
     tf.variable_scope('fp32_storage',
                       custom_getter=float32_variable_storage_getter):
    data, target, logits, loss = create_model(nbatch, nin, nout, dtype)
    variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
    grads = gradients_with_loss_scaling(loss, variables, loss_scale)
    optimizer = tf.train.MomentumOptimizer(learning_rate, momentum)
    training_step_op = optimizer.apply_gradients(zip(grads, variables))

init_op = tf.global_variables_initializer()
sess.run(init_op)
for step in range(6000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    np_loss, _ = sess.run([loss, training_step_op],
                          feed_dict={data: batch_xs, target: batch_ys})

* Source code from https://github.com/khcs/fp16-demo-tf
For other deep learning frameworks, such as Apache MXNet and PyTorch,
please refer to:
AWS Deep Learning AMI Developer Guide
https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-gpu-opt-training.html
NVIDIA Deep Learning SDK
https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
Scaling TensorFlow near-linearly to 256 GPUs
https://aws.amazon.com/about-aws/whats-new/2018/11/tensorflow-scalability-to-256-gpus/
Stock TensorFlow: 65% scaling efficiency with 256 GPUs, 30m training time
AWS-Optimized TensorFlow: 90% scaling efficiency with 256 GPUs, 14m training time
Available with Amazon SageMaker and the AWS Deep Learning AMIs
I also have a huge amount of data or large models to train.
How do I scale deep learning training tasks?
Infra for distributed training - scale up
[Diagram: a single Amazon EC2 instance with multiple GPUs, backed by Amazon Elastic Block Store (EBS)]
Infra for distributed training - scale out
[Diagram: multiple Amazon EC2 instances, each backed by Amazon Elastic Block Store (EBS)]
Multi-GPUs and multi-nodes options
Using the DL framework's built-in features
• TensorFlow
- Multi-tower approach for multi-GPU training
- Parameter server for multi-node training
• Apache MXNet
- Multi-GPU training by defining the context with a list of GPUs (see the sketch below)
- Parameter server for multi-node training
Using Horovod
• https://eng.uber.com/horovod/
• Open source distributed training framework based on the Message Passing Interface (MPI)
• Builds on Baidu's draft implementation of the TensorFlow ring-allreduce algorithm
• Supports popular deep learning frameworks such as TensorFlow, MXNet, Keras, and PyTorch
[Chart: performance scalability using Horovod]
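As a minimal illustration of the MXNet multi-GPU option above: a Gluon training loop can pass a list of GPU contexts and shard each batch across them. This sketch is not from the original deck; it assumes a 4-GPU instance, uses random dummy data, and picks an arbitrary model and hyperparameters.

import mxnet as mx
from mxnet import gluon, autograd

ctx = [mx.gpu(i) for i in range(4)]  # one context per GPU
net = gluon.model_zoo.vision.resnet50_v2()
net.initialize(mx.init.Xavier(), ctx=ctx)  # parameters are replicated on every GPU
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

# Dummy dataset standing in for a real DataLoader
train_data = gluon.data.DataLoader(
    gluon.data.ArrayDataset(mx.nd.random.uniform(shape=(256, 3, 224, 224)),
                            mx.nd.zeros((256,))),
    batch_size=64)

for data, label in train_data:
    # Split the batch across the GPUs; the Trainer aggregates the gradients
    data_parts = gluon.utils.split_and_load(data, ctx)
    label_parts = gluon.utils.split_and_load(label, ctx)
    with autograd.record():
        losses = [loss_fn(net(X), y) for X, y in zip(data_parts, label_parts)]
    for l in losses:
        l.backward()
    trainer.step(data.shape[0])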
Using Horovod
Install Horovod and related packages
→ The AWS Deep Learning AMIs and AWS Deep Learning Containers already include them all
Modify your training code to use Horovod
Run multi-GPU or distributed training using Horovod's mpirun command
Using Horovod with TensorFlow
import tensorflow as tf
import horovod.tensorflow as hvd

# Initialize Horovod
hvd.init()

# Pin GPU to be used to process local rank (one GPU per process)
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())

# Build model...
loss = ...
opt = tf.train.AdagradOptimizer(0.01 * hvd.size())

# Add Horovod Distributed Optimizer
opt = hvd.DistributedOptimizer(opt)

# Add hook to broadcast variables from rank 0 to all other processes
# during initialization.
hooks = [hvd.BroadcastGlobalVariablesHook(0)]

# Make training operation
train_op = opt.minimize(loss)

# Save checkpoints only on worker 0 to prevent other workers from corrupting them.
checkpoint_dir = '/tmp/train_logs' if hvd.rank() == 0 else None

# The MonitoredTrainingSession takes care of session initialization,
# restoring from a checkpoint, saving to a checkpoint, and closing
# when done or an error occurs.
with tf.train.MonitoredTrainingSession(checkpoint_dir=checkpoint_dir,
                                       config=config,
                                       hooks=hooks) as mon_sess:
    while not mon_sess.should_stop():
        # Perform synchronous training.
        mon_sess.run(train_op)
( source code from https://github.com/horovod/horovod )
Using Horovod with Apache MXNet
import mxnet as mx
import horovod.mxnet as hvd
from mxnet import autograd

# Initialize Horovod
hvd.init()

# Pin GPU to be used to process local rank
context = mx.gpu(hvd.local_rank())
num_workers = hvd.size()

# Build model
model = ...
model.hybridize()

# Create optimizer
optimizer_params = ...
opt = mx.optimizer.create('sgd', **optimizer_params)

# Initialize parameters
model.initialize(initializer, ctx=context)

# Fetch and broadcast parameters
params = model.collect_params()
if params is not None:
    hvd.broadcast_parameters(params, root_rank=0)

# Create DistributedTrainer, a subclass of gluon.Trainer
trainer = hvd.DistributedTrainer(params, opt)

# Create loss function
loss_fn = ...

# Train model
for epoch in range(num_epoch):
    train_data.reset()
    for nbatch, batch in enumerate(train_data, start=1):
        data = batch.data[0].as_in_context(context)
        label = batch.label[0].as_in_context(context)
        with autograd.record():
            output = model(data.astype(dtype, copy=False))
            loss = loss_fn(output, label)
        loss.backward()
        trainer.step(batch_size)
( source code from https://github.com/horovod/horovod )
Using Horovod with Keras
import math
import keras
import tensorflow as tf
import horovod.keras as hvd

# Horovod: initialize Horovod.
hvd.init()

# Horovod: pin GPU to be used to process local rank (one GPU per process)
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.visible_device_list = str(hvd.local_rank())

# Horovod: adjust number of epochs based on number of GPUs.
epochs = int(math.ceil(12.0 / hvd.size()))

model = ...

# Horovod: adjust learning rate based on number of GPUs.
opt = keras.optimizers.Adadelta(1.0 * hvd.size())

# Horovod: add Horovod Distributed Optimizer.
opt = hvd.DistributedOptimizer(opt)

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=opt, metrics=['accuracy'])

callbacks = [
    # Horovod: broadcast initial variable states from rank 0 to all other processes.
    # This is necessary to ensure consistent initialization of all workers when
    # training is started with random weights or restored from a checkpoint.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

# Horovod: save checkpoints only on worker 0 to prevent other workers
# from corrupting them.
if hvd.rank() == 0:
    callbacks.append(keras.callbacks.ModelCheckpoint('./checkpoint-{epoch}.h5'))

model.fit(x_train, y_train,
          batch_size=batch_size,
          callbacks=callbacks,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
( source code from https://github.com/horovod/horovod )
Using Horovod in Amazon EC2
https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-horovod-tensorflow.html
STEP 1. Configure Horovod Hosts file
172.100.1.200 slots=8
172.200.8.99 slots=8
172.48.3.124 slots=8
localhost slots=8
STEP 2. Configure the nodes to disable SSH StrictHostKeyChecking
STEP 3. Execute training script using mpirun command
~/anaconda3/envs/tensorflow_p36/bin/mpirun -np $gpus -hostfile ~/hosts -mca plm_rsh_no_tree_spawn 1 \
    -bind-to socket -map-by slot \
    -x HOROVOD_HIERARCHICAL_ALLREDUCE=1 -x HOROVOD_FUSION_THRESHOLD=16777216 \
    -x NCCL_MIN_NRINGS=4 -x LD_LIBRARY_PATH -x PATH -mca pml ob1 -mca btl ^openib \
    -x NCCL_SOCKET_IFNAME=$INTERFACE -mca btl_tcp_if_exclude lo,docker0 \
    -x TF_CPP_MIN_LOG_LEVEL=0 \
    python -W ignore ~/examples/horovod/tensorflow/train_imagenet_resnet_hvd.py \
    --data_dir ~/data/tf-imagenet/ --num_epochs 90 --increased_aug -b $BATCH_SIZE \
    --mom 0.977 --wdecay 0.0005 --loss_scale 256. --use_larc \
    --lr_decay_mode linear_cosine --warmup_epochs 5 --clear_log
Using Horovod in Amazon EKS
https://docs.aws.amazon.com/dlami/latest/devguide/deep-learning-containers-eks-tutorials-distributed-gpu-training.html
STEP 1. Install Kubeflow to setup a cluster for distributed training
STEP 2. Set the app name and initialize it.
STEP 3. Install mpi-operator from kubeflow
STEP 4. Create an MPI Job template; define the number of nodes (replicas) and the number of GPUs each node has (gpusPerReplica)
STEP 5. Apply the manifest to the default environment. The MPI Job will create a launch pod.
Using Horovod in Amazon SageMaker
from sagemaker.tensorflow import TensorFlow
distributions = {'mpi': {'enabled': True, "processes_per_host": 2}}
# METHOD 1 - Using Amazon SageMaker provided VPC
estimator = TensorFlow(entry_point=train_script,
role=sagemaker_iam_role,
train_instance_count=2,
train_instance_type='ml.p3.8xlarge',
script_mode=True,
framework_version='1.12',
distributions=distributions)
# METHOD 2 - Using your own VPC for training performance improvement
estimator = TensorFlow(entry_point=train_script,
role=sagemaker_iam_role,
train_instance_count=2,
train_instance_type='ml.p3.8xlarge',
script_mode=True,
framework_version='1.12',
distributions=distributions,
security_group_ids=['sg-0919a36a89a15222f'],
subnets=['subnet-0c07198f3eb022ede', 'subnet-055b2819caae2fd1f'])
estimator.fit({"train":s3_train_path, "test":s3_test_path})
Examples of hyperparameters
Neural Networks: number of layers, hidden layer width, learning rate, embedding dimensions, dropout, ...
Decision Trees: tree depth, max leaf nodes, gamma, eta, lambda, alpha, ...
Automatic Model Tuning
Finding the optimal set of hyperparameters
1. Manual Search ("I know what I'm doing")
2. Grid Search (“X marks the spot”)
• Typically training hundreds of models
• Slow and expensive
3. Random Search (“Spray and pray”)
• Works better and faster than Grid Search
• But… but… but… it’s random!
4. HPO: use Machine Learning
• Training fewer models
• Gaussian Process Regression and Bayesian Optimization
• You can now resume from a previous tuning job
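To make the HPO workflow concrete, here is a minimal sketch using the SageMaker Python SDK of that era. It assumes an `estimator` defined as in the TensorFlow example earlier in this deck; the hyperparameter ranges, metric name, and regex are illustrative assumptions, not part of the original presentation.

from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

# `estimator` as constructed in the TensorFlow example earlier in this deck
hyperparameter_ranges = {
    'learning_rate': ContinuousParameter(1e-4, 1e-1),
    'batch_size': IntegerParameter(32, 256),
}

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name='validation:accuracy',
    objective_type='Maximize',
    hyperparameter_ranges=hyperparameter_ranges,
    metric_definitions=[{'Name': 'validation:accuracy',
                         'Regex': 'val_acc: ([0-9\\.]+)'}],
    max_jobs=20,          # total training jobs the tuner may launch
    max_parallel_jobs=2)  # jobs trained concurrently

tuner.fit({'train': s3_train_path, 'test': s3_test_path})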
How to use Amazon SageMaker HPO
[Diagram: an Estimator plus a tuning configuration launches multiple training jobs, producing the resulting models]
Hardware optimization is extremely complex
Neo is a compiler and runtime for machine learning
Neo
• Compiler: processor vendors can integrate hardware-specific optimizations
• Runtime: device makers can embed the runtime into edge devices and IoT
github.com/neo-ai (Apache Software License)
How to compile a model
https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation-cli.html
Configure the compilation job
{
    "RoleArn": $ROLE_ARN,
    "InputConfig": {
        "S3Uri": "s3://jsimon-neo/model.tar.gz",
        "DataInputConfig": "{\"data\": [1, 3, 224, 224]}",
        "Framework": "MXNET"
    },
    "OutputConfig": {
        "S3OutputLocation": "s3://jsimon-neo/",
        "TargetDevice": "rasp3b"
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 300
    }
}
Compile the model
$ aws sagemaker create-compilation-job \
    --cli-input-json file://config.json \
    --compilation-job-name resnet50-mxnet-pi
$ aws s3 cp s3://jsimon-neo/model-rasp3b.tar.gz .
$ gtar tfz model-rasp3b.tar.gz
compiled.params
compiled_model.json
compiled.so
Predict with the compiled model
from dlr import DLRModel

# input_shape, output_shape, and device match the compiled model's configuration
model = DLRModel('resnet50', input_shape,
                 output_shape, device)
out = model.run(input_data)
Model compilation using AWS console
Performance improvement result
Image file name    MXNet model (seconds)    Neo-compiled model (seconds)    Improvement (MXNet / Neo)
input_001          0.0299                   0.0128                          233.59%
input_002          0.0223                   0.0129                          172.86%
input_003          0.0275                   0.0125                          220.00%
Do I really need
neural networks that complex and deep
to meet the required accuracy?
Compressing deep learning models
• Compression is the process of reducing the size of a trained network, either by removing certain layers or by shrinking layers, while maintaining accuracy.
• A smaller model will predict faster and require less memory.
• The number of possible combinations makes it difficult to perform this task manually, or even programmatically.
• Reinforcement learning to the rescue!
Defining the problem
• Objective: find the smallest possible network architecture from a pre-trained network architecture, while producing the best accuracy.
• Environment: a custom-developed environment that accepts a Boolean array of layers to remove from the RL agent and produces an observation describing the layers.
• State: the layers.
• Action: a Boolean array, one entry per layer.
• Reward: a combination of compression ratio and accuracy.
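To make that formulation concrete, here is a toy, self-contained sketch of such an environment with a Gym-style reset/step interface. All names are illustrative and the accuracy evaluation is a stub; the real implementation is in the SageMaker RL sample notebook linked after the next slide.

import numpy as np

class CompressionEnv:
    """Toy network-compression environment (illustrative only).
    Observation: one feature row per layer of the pre-trained network.
    Action: Boolean array, one entry per layer (True = remove that layer).
    Reward: weighted mix of compression ratio and post-pruning accuracy."""

    def __init__(self, layer_features, alpha=0.5):
        self.layer_features = np.asarray(layer_features, dtype=np.float32)
        self.alpha = alpha  # size-vs-accuracy trade-off

    def reset(self):
        self.removed = np.zeros(len(self.layer_features), dtype=bool)
        return self._observe()

    def step(self, action):
        self.removed = np.asarray(action, dtype=bool)
        compression = self.removed.mean()        # fraction of layers removed
        accuracy = self._evaluate_pruned_net()   # stub below
        reward = self.alpha * compression + (1.0 - self.alpha) * accuracy
        return self._observe(), reward, True, {}  # one-shot episode

    def _observe(self):
        obs = self.layer_features.copy()
        obs[self.removed] = 0.0  # zero out the features of removed layers
        return obs

    def _evaluate_pruned_net(self):
        # Stub: a real environment would rebuild the pruned network,
        # fine-tune it briefly, and return validation accuracy in [0, 1].
        return 0.0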
Amazon SageMaker RL
Reinforcement learning for every developer and data scientist
KEY FEATURES
• Broad support for frameworks
• Broad support for simulation environments: 2D & 3D physics environments and OpenAI Gym support
• Supports Amazon Sumerian, AWS RoboMaker, and the open source Robot Operating System (ROS) project
• Fully managed
• Example notebooks and tutorials
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_network_compression_ray_custom
Predictions drive complexity and cost in production
Training: 10% / Inference: 90%
Are you making the most of your infrastructure?
Low utilization and high costs. One size does not fit all.
Amazon Elastic Inference
https://aws.amazon.com/blogs/aws/amazon-elastic-inference-gpu-powered-deep-learning-inference-acceleration/
Match capacity to demand

KEY FEATURES
• Available from 1 to 32 TFLOPS
• Integrated with Amazon EC2, Amazon SageMaker, and the AWS DL AMIs
• Support for TensorFlow, Apache MXNet, and ONNX, with PyTorch coming soon
• Single and mixed-precision operations
• Lower inference costs by up to 75%
Using Elastic Inference with TensorFlow
OPTION 1 - Using Elastic Inference TensorFlow Serving
$ amazonei_tensorflow_model_server --model_name=ssdresnet \
    --model_base_path=/tmp/ssd_resnet50_v1_coco --port=9000
OPTION 2 - Using Elastic Inference TensorFlow Predictor
from tensorflow.contrib.ei.python.predictor.ei_predictor import EIPredictor
import matplotlib.image as mpimg
import numpy as np

img = mpimg.imread(FLAGS.image)    # FLAGS.image: path to the input image
img = np.expand_dims(img, axis=0)  # add a batch dimension
ssd_resnet_input = {'inputs': img}
eia_predictor = EIPredictor(model_dir='/tmp/ssd_resnet50_v1_coco/1/')
pred = eia_predictor(ssd_resnet_input)
Using Elastic Inference with Apache MXNet
OPTION 1 - Use EI with the MXNet Symbol API
import mxnet as mx

data = mx.sym.var('data', shape=(1,))
sym = mx.sym.exp(data)

# Pass mx.eia() as context during simple bind operation
executor = sym.simple_bind(ctx=mx.eia(), grad_req='null')

# Forward call is performed on remote accelerator
executor.forward(data=mx.nd.ones((1,)))
print('Output = %s' % executor.outputs[0])
OPTION 2 - Use EI with the Module API
ctx = mx.eia()
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-152', 0)
mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
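OPTION 2 above stops after constructing the Module. A hedged sketch of the remaining steps (bind, set parameters, run a forward pass) might look like this; the 1x3x224x224 input shape is an illustrative assumption for resnet-152, not from the original deck.

# Continuing OPTION 2; shapes are illustrative assumptions
mod.bind(for_training=False,
         data_shapes=[('data', (1, 3, 224, 224))],
         label_shapes=None)
mod.set_params(arg_params, aux_params, allow_missing=True)

# The forward pass runs on the attached Elastic Inference accelerator (mx.eia() context)
batch = mx.io.DataBatch([mx.nd.ones((1, 3, 224, 224))])
mod.forward(batch, is_train=False)
print(mod.get_outputs()[0].asnumpy().shape)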
Other tips
Use SageMaker Pipe Mode with the TensorFlow PipeModeDataset extension
https://github.com/aws/sagemaker-tensorflow-extensions
Apache MXNet can read training data directly from Amazon S3
https://mxnet.incubator.apache.org/versions/master/faq/s3_integration.html
* Benchmark dataset: a 3.9 GB CSV file containing 2 million records, each record having 100 comma-separated, single-precision floating-point values.
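As a minimal sketch of the first tip: inside the training script, the sagemaker-tensorflow extension exposes a Pipe Mode channel as a tf.data-style dataset. The channel name, feature spec, and batch size below are illustrative assumptions (the 100-float records mirror the benchmark dataset in the footnote above).

from sagemaker_tensorflow import PipeModeDataset
import tensorflow as tf

# Read TFRecords streamed over the 'train' Pipe Mode channel
ds = PipeModeDataset(channel='train', record_format='TFRecord')

def parse(record):
    # Illustrative feature spec; adapt to your records
    features = {'data': tf.FixedLenFeature([100], tf.float32)}
    return tf.parse_single_example(record, features)

ds = ds.map(parse).repeat().prefetch(10).batch(64)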
Summary
Training
• Make sure to utilize Tensor Cores by using mixed-precision training
• Learn to use Horovod for efficient multi-GPU or multi-node distributed training
• Find optimal hyperparameters using SageMaker HPO
Deployment
• Compile your model using Amazon SageMaker Neo
• Use Amazon Elastic Inference to reduce inference costs where applicable
Dive into Deep Learning
An interactive deep learning book with code, math, and discussions
http://d2l.ai/
http://ko.d2l.ai/
Used for the STAT 157 course at UC Berkeley, Spring 2019
The Korean version of the first 4 chapters is available now.
• GitHub pull requests for any corrections are welcome
• Raise issues at https://github.com/d2l-ai/d2l-ko/issues
Getting started
https://ml.aws
https://aws.amazon.com/blogs/machine-learning
https://aws.amazon.com/sagemaker
https://github.com/awslabs/amazon-sagemaker-examples
https://medium.com/@julsimon
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 

  • 10. The Amazon EC2 P3 instance type carries one of the most powerful GPUs available, the NVIDIA V100. But are you fully utilizing your GPUs?
  • 11. Tensor Cores and mixed-precision training (https://arxiv.org/abs/1710.03740)
  • 12. How to port training scripts for mixed precision (https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html)

Port the model to use the FP16 data type where appropriate:
1. Use the float16 data type in models containing convolutions or matrix multiplications
2. Check that a trainable variable is float32 before converting it to float16
3. Keep float32 for the softmax calculation

Add loss scaling to preserve small gradient values:
1. Multiply the loss by a scale factor before computing gradients
2. Divide the calculated gradients by the same scale factor
  • 13. Code snippet for mixed-precision training in TensorFlow (source code from https://github.com/khcs/fp16-demo-tf)

MLP, standard (FP32) implementation:

    x = tf.placeholder(tf.float32, [None, 784])
    W1 = tf.Variable(tf.truncated_normal([784, FLAGS.num_hunits]))
    b1 = tf.Variable(tf.zeros([FLAGS.num_hunits]))
    z = tf.nn.relu(tf.matmul(x, W1) + b1)
    W2 = tf.Variable(tf.truncated_normal([FLAGS.num_hunits, 10]))
    b2 = tf.Variable(tf.zeros([10]))
    y = tf.matmul(z, W2) + b2
    y_ = tf.placeholder(tf.int64, [None])
    cross_entropy = tf.losses.sparse_softmax_cross_entropy(labels=y_, logits=y)
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

MLP, mixed-precision implementation (float16 variables, float32 softmax):

    data = tf.placeholder(tf.float16, shape=(None, 784))
    W1 = tf.get_variable('w1', (784, FLAGS.num_hunits), tf.float16)
    b1 = tf.get_variable('b1', (FLAGS.num_hunits), tf.float16,
                         initializer=tf.zeros_initializer())
    z = tf.nn.relu(tf.matmul(data, W1) + b1)
    W2 = tf.get_variable('w2', (FLAGS.num_hunits, 10), tf.float16)
    b2 = tf.get_variable('b2', (10), tf.float16,
                         initializer=tf.zeros_initializer())
    y = tf.matmul(z, W2) + b2
    y_ = tf.placeholder(tf.int64, shape=(None))
    loss = tf.losses.sparse_softmax_cross_entropy(y_, tf.cast(y, tf.float32))
  • 14. Code snippet for mixed-precision training in TensorFlow, continued (source code from https://github.com/khcs/fp16-demo-tf)

Standard training loop:

    sess = tf.InteractiveSession()
    tf.global_variables_initializer().run()
    # Train
    for _ in range(3000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

Mixed-precision training loop with loss scaling:

    def gradients_with_loss_scaling(loss, variables, loss_scale):
        # Scale the loss up before computing gradients, then scale
        # the gradients back down by the same factor
        return [grad / loss_scale
                for grad in tf.gradients(loss * loss_scale, variables)]

    # float32_variable_storage_getter and create_model are defined in the source repo
    with tf.device('/gpu:0'), tf.variable_scope(
            'fp32_storage', custom_getter=float32_variable_storage_getter):
        data, target, logits, loss = create_model(nbatch, nin, nout, dtype)
        variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
        grads = gradients_with_loss_scaling(loss, variables, loss_scale)
        optimizer = tf.train.MomentumOptimizer(learning_rate, momentum)
        training_step_op = optimizer.apply_gradients(zip(grads, variables))
        init_op = tf.global_variables_initializer()

    sess.run(init_op)
    for step in range(6000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        np_loss, _ = sess.run([loss, training_step_op],
                              feed_dict={data: batch_xs, target: batch_ys})
  • 15. For other deep learning frameworks such as Apache MXNet and PyTorch, please refer to:
AWS Deep Learning AMI Developer Guide: https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-gpu-opt-training.html
NVIDIA Deep Learning SDK: https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
(A minimal MXNet Gluon sketch follows below.)
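As a concrete illustration for MXNet, here is a minimal sketch of mixed-precision training in Gluon. It is not taken from the slides or the linked guides; the network, batch shape, and hyperparameters are placeholders:

    import mxnet as mx
    from mxnet import gluon, autograd

    ctx = mx.gpu(0)
    net = gluon.model_zoo.vision.resnet50_v1(classes=10)
    net.cast('float16')                        # run forward and backward in FP16
    net.initialize(mx.init.Xavier(), ctx=ctx)

    # multi_precision keeps an FP32 master copy of the weights for the update step
    trainer = gluon.Trainer(net.collect_params(), 'sgd',
                            {'learning_rate': 0.1, 'momentum': 0.9,
                             'multi_precision': True})
    loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

    # Dummy batch; real code would cast each loaded batch to float16 the same way
    data = mx.nd.random.uniform(shape=(32, 3, 224, 224), ctx=ctx).astype('float16')
    label = mx.nd.zeros((32,), ctx=ctx).astype('float16')

    with autograd.record():
        output = net(data)
        loss = loss_fn(output, label)
    loss.backward()
    trainer.step(32)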
  • 16. (Section divider slide; no text content.)
  • 17. Scaling TensorFlow near-linearly to 256 GPUs (https://aws.amazon.com/about-aws/whats-new/2018/11/tensorflow-scalability-to-256-gpus/)
Stock TensorFlow: 65% scaling efficiency with 256 GPUs, 30m training time.
AWS-optimized TensorFlow: 90% scaling efficiency with 256 GPUs, 14m training time.
Available with Amazon SageMaker and the AWS Deep Learning AMIs.
  • 18. I also have a huge amount of data or large models to train. How do I scale deep learning training tasks?
  • 19. Infrastructure for distributed training: scaling up. (Diagram: a single Amazon EC2 instance with up to 8 GPUs, backed by Amazon Elastic Block Store (EBS).)
  • 20. Infrastructure for distributed training: scaling out. (Diagram: multiple Amazon EC2 instances, each backed by Amazon Elastic Block Store (EBS).)
  • 21. Multi-GPU and multi-node options

Using the DL framework's built-in features:
• TensorFlow: the multi-tower approach for multi-GPU training; parameter servers for multi-node training
• Apache MXNet: multi-GPU by defining the context as a list of GPUs (see the sketch after this list); parameter servers for multi-node training

Using Horovod (https://eng.uber.com/horovod/):
• An open-source distributed training framework based on the Message Passing Interface (MPI)
• Grew out of Baidu's draft implementation of the TensorFlow ring-allreduce algorithm
• Supports popular deep learning frameworks such as TensorFlow, MXNet, Keras, and PyTorch
(Chart: performance scalability using Horovod.)
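Here is a minimal sketch of MXNet's context-list approach in Gluon, assuming a machine with two GPUs; the model and batch shapes are placeholders, not from the slides:

    import mxnet as mx
    from mxnet import gluon, autograd

    ctxs = [mx.gpu(0), mx.gpu(1)]              # one context per GPU
    net = gluon.model_zoo.vision.resnet18_v1(classes=10)
    net.initialize(mx.init.Xavier(), ctx=ctxs)
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
    loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

    batch_size = 64
    data = mx.nd.random.uniform(shape=(batch_size, 3, 224, 224))
    label = mx.nd.zeros((batch_size,))

    # split_and_load slices the batch evenly across the GPU contexts;
    # the trainer aggregates the gradients computed on each device
    data_parts = gluon.utils.split_and_load(data, ctxs)
    label_parts = gluon.utils.split_and_load(label, ctxs)
    with autograd.record():
        losses = [loss_fn(net(X), y) for X, y in zip(data_parts, label_parts)]
    for l in losses:
        l.backward()
    trainer.step(batch_size)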
  • 22. Using Horovod
1. Install Horovod and related packages (the AWS Deep Learning AMI and Deep Learning Containers already include everything)
2. Modify your training code so it trains through Horovod
3. Run multi-GPU or distributed training using Horovod's mpirun command
  • 23. Using Horovod with TensorFlow (source code from https://github.com/horovod/horovod)

    import tensorflow as tf
    import horovod.tensorflow as hvd

    # Initialize Horovod
    hvd.init()

    # Pin GPU to be used to process local rank (one GPU per process)
    config = tf.ConfigProto()
    config.gpu_options.visible_device_list = str(hvd.local_rank())

    # Build model...
    loss = ...
    opt = tf.train.AdagradOptimizer(0.01 * hvd.size())

    # Add Horovod Distributed Optimizer
    opt = hvd.DistributedOptimizer(opt)

    # Add hook to broadcast variables from rank 0 to all other processes during
    # initialization.
    hooks = [hvd.BroadcastGlobalVariablesHook(0)]

    # Make training operation
    train_op = opt.minimize(loss)

    # Save checkpoints only on worker 0 to prevent other workers from corrupting them.
    checkpoint_dir = '/tmp/train_logs' if hvd.rank() == 0 else None

    # The MonitoredTrainingSession takes care of session initialization,
    # restoring from a checkpoint, saving to a checkpoint, and closing when done
    # or an error occurs.
    with tf.train.MonitoredTrainingSession(checkpoint_dir=checkpoint_dir,
                                           config=config, hooks=hooks) as mon_sess:
        while not mon_sess.should_stop():
            # Perform synchronous training.
            mon_sess.run(train_op)
  • 24. Using Horovod with Apache MXNet (source code from https://github.com/horovod/horovod)

    import mxnet as mx
    import horovod.mxnet as hvd
    from mxnet import autograd

    # Initialize Horovod
    hvd.init()

    # Pin GPU to be used to process local rank
    context = mx.gpu(hvd.local_rank())
    num_workers = hvd.size()

    # Build model
    model = ...
    model.hybridize()

    # Create optimizer
    optimizer_params = ...
    opt = mx.optimizer.create('sgd', **optimizer_params)

    # Initialize parameters
    model.initialize(initializer, ctx=context)

    # Fetch and broadcast parameters
    params = model.collect_params()
    if params is not None:
        hvd.broadcast_parameters(params, root_rank=0)

    # Create DistributedTrainer, a subclass of gluon.Trainer
    trainer = hvd.DistributedTrainer(params, opt)

    # Create loss function
    loss_fn = ...

    # Train model
    for epoch in range(num_epoch):
        train_data.reset()
        for nbatch, batch in enumerate(train_data, start=1):
            data = batch.data[0].as_in_context(context)
            label = batch.label[0].as_in_context(context)
            with autograd.record():
                output = model(data.astype(dtype, copy=False))
                loss = loss_fn(output, label)
            loss.backward()
            trainer.step(batch_size)
  • 25. Using Horovod with Keras (source code from https://github.com/horovod/horovod)

    import math
    import keras
    import tensorflow as tf
    import horovod.keras as hvd

    # Horovod: initialize Horovod.
    hvd.init()

    # Horovod: pin GPU to be used to process local rank (one GPU per process)
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    config.gpu_options.visible_device_list = str(hvd.local_rank())

    # Horovod: adjust number of epochs based on number of GPUs.
    epochs = int(math.ceil(12.0 / hvd.size()))

    model = ...

    # Horovod: adjust learning rate based on number of GPUs.
    opt = keras.optimizers.Adadelta(1.0 * hvd.size())

    # Horovod: add Horovod Distributed Optimizer.
    opt = hvd.DistributedOptimizer(opt)

    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer=opt, metrics=['accuracy'])

    callbacks = [
        # Horovod: broadcast initial variable states from rank 0 to all other processes.
        # This is necessary to ensure consistent initialization of all workers when
        # training is started with random weights or restored from a checkpoint.
        hvd.callbacks.BroadcastGlobalVariablesCallback(0),
    ]

    # Horovod: save checkpoints only on worker 0 to prevent other workers
    # from corrupting them.
    if hvd.rank() == 0:
        callbacks.append(keras.callbacks.ModelCheckpoint('./checkpoint-{epoch}.h5'))

    model.fit(x_train, y_train, batch_size=batch_size, callbacks=callbacks,
              epochs=epochs, verbose=1, validation_data=(x_test, y_test))
  • 26. Using Horovod on Amazon EC2 (https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-horovod-tensorflow.html)

STEP 1. Configure the Horovod hosts file:

    172.100.1.200 slots=8
    172.200.8.99 slots=8
    172.48.3.124 slots=8
    localhost slots=8

STEP 2. Configure the nodes to skip StrictHostKeyChecking.

STEP 3. Execute the training script using the mpirun command:

    ~/anaconda3/envs/tensorflow_p36/bin/mpirun -np $gpus -hostfile ~/hosts \
        -mca plm_rsh_no_tree_spawn 1 -bind-to socket -map-by slot \
        -x HOROVOD_HIERARCHICAL_ALLREDUCE=1 -x HOROVOD_FUSION_THRESHOLD=16777216 \
        -x NCCL_MIN_NRINGS=4 -x LD_LIBRARY_PATH -x PATH \
        -mca pml ob1 -mca btl ^openib \
        -x NCCL_SOCKET_IFNAME=$INTERFACE -mca btl_tcp_if_exclude lo,docker0 \
        -x TF_CPP_MIN_LOG_LEVEL=0 \
        python -W ignore ~/examples/horovod/tensorflow/train_imagenet_resnet_hvd.py \
        --data_dir ~/data/tf-imagenet/ --num_epochs 90 --increased_aug -b $BATCH_SIZE \
        --mom 0.977 --wdecay 0.0005 --loss_scale 256. --use_larc \
        --lr_decay_mode linear_cosine --warmup_epochs 5 --clear_log
  • 27. Using Horovod on Amazon EKS (https://docs.aws.amazon.com/dlami/latest/devguide/deep-learning-containers-eks-tutorials-distributed-gpu-training.html)
STEP 1. Install Kubeflow to set up a cluster for distributed training.
STEP 2. Set the app name and initialize it.
STEP 3. Install mpi-operator from Kubeflow.
STEP 4. Create an MPI Job template; define the number of nodes (replicas) and the number of GPUs each node has (gpusPerReplica).
STEP 5. Apply the manifest to the default environment. The MPI Job will create a launch pod.
  • 28. Using Horovod in Amazon SageMaker

    from sagemaker.tensorflow import TensorFlow

    distributions = {'mpi': {'enabled': True, 'processes_per_host': 2}}

    # METHOD 1 - Using the Amazon SageMaker provided VPC
    estimator = TensorFlow(entry_point=train_script,
                           role=sagemaker_iam_role,
                           train_instance_count=2,
                           train_instance_type='ml.p3.8xlarge',
                           script_mode=True,
                           framework_version='1.12',
                           distributions=distributions)

    # METHOD 2 - Using your own VPC for training performance improvement
    estimator = TensorFlow(entry_point=train_script,
                           role=sagemaker_iam_role,
                           train_instance_count=2,
                           train_instance_type='ml.p3.8xlarge',
                           script_mode=True,
                           framework_version='1.12',
                           distributions=distributions,
                           security_group_ids=['sg-0919a36a89a15222f'],
                           subnets=['subnet-0c07198f3eb022ede',
                                    'subnet-055b2819caae2fd1f'])

    estimator.fit({'train': s3_train_path, 'test': s3_test_path})
  • 29-30. (Section divider slides; no text content.)
  • 31. Examples of hyperparameters
Neural networks: number of layers, hidden layer width, learning rate, embedding dimensions, dropout, ...
Decision trees: tree depth, max leaf nodes, gamma, eta, lambda, alpha, ...
  • 32. Automatic Model Tuning: finding the optimal set of hyperparameters
1. Manual search ("I know what I'm doing")
2. Grid search ("X marks the spot"): typically trains hundreds of models; slow and expensive
3. Random search ("Spray and pray"): works better and faster than grid search, but... it's random!
4. HPO: use machine learning. Trains fewer models, using Gaussian Process Regression and Bayesian Optimization. You can now resume from a previous tuning job.
  • 33. How to use Amazon SageMaker HPO. (Diagram: an Estimator and a tuning Configuration launch Training Jobs, which produce the Resulting Models. A sketch of the API follows below.)
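To make the diagram concrete, here is a minimal sketch using the SageMaker Python SDK's HyperparameterTuner. It is not from the slides; the entry point, metric regex, and parameter ranges are placeholders you would replace with your own:

    from sagemaker.tensorflow import TensorFlow
    from sagemaker.tuner import (HyperparameterTuner, ContinuousParameter,
                                 IntegerParameter)

    estimator = TensorFlow(entry_point='train.py',          # your training script
                           role=sagemaker_iam_role,
                           train_instance_count=1,
                           train_instance_type='ml.p3.2xlarge',
                           script_mode=True,
                           framework_version='1.12')

    hyperparameter_ranges = {
        'learning_rate': ContinuousParameter(1e-4, 1e-1),
        'batch_size': IntegerParameter(32, 256),
    }

    tuner = HyperparameterTuner(
        estimator,
        objective_metric_name='validation:accuracy',
        # the regex must match whatever your training script prints to stdout
        metric_definitions=[{'Name': 'validation:accuracy',
                             'Regex': 'val_acc = ([0-9\\.]+)'}],
        hyperparameter_ranges=hyperparameter_ranges,
        max_jobs=20,             # total training jobs launched by the tuner
        max_parallel_jobs=2)     # jobs run concurrently

    tuner.fit({'train': s3_train_path, 'test': s3_test_path})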
  • 34-35. (Section divider slides; no text content.)
  • 36. Hardware optimization is extremely complex.
  • 37. Neo is a compiler and runtime for machine learning (github.com/neo-ai, Apache Software License)
Compiler: processor vendors can integrate hardware-specific optimizations.
Runtime: device makers can embed the runtime into edge devices and IoT.
  • 38. How to compile a model (https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation-cli.html)

Configure the compilation job (config.json):

    {
        "RoleArn": $ROLE_ARN,
        "InputConfig": {
            "S3Uri": "s3://jsimon-neo/model.tar.gz",
            "DataInputConfig": "{\"data\": [1, 3, 224, 224]}",
            "Framework": "MXNET"
        },
        "OutputConfig": {
            "S3OutputLocation": "s3://jsimon-neo/",
            "TargetDevice": "rasp3b"
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": 300
        }
    }

Compile the model and fetch the artifacts:

    $ aws sagemaker create-compilation-job --cli-input-json file://config.json --compilation-job-name resnet50-mxnet-pi
    $ aws s3 cp s3://jsimon-neo/model-rasp3b.tar.gz .
    $ gtar tfz model-rasp3b.tar.gz
    compiled.params
    compiled_model.json
    compiled.so

Predict with the compiled model:

    from dlr import DLRModel
    model = DLRModel('resnet50', input_shape, output_shape, device)
    out = model.run(input_data)
  • 39. Model compilation using the AWS console.
  • 40. Performance improvement result

    Image file | MXNet model (seconds) | Neo-compiled model (seconds) | Improvement (MXNet time / Neo-compiled time)
    input_001  | 0.0299                | 0.0128                       | 233.59%
    input_002  | 0.0223                | 0.0129                       | 172.86%
    input_003  | 0.0275                | 0.0125                       | 220.00%
  • 41. Do I really need such a complex and deep neural network to meet the required accuracy?
  • 42. Compressing deep learning models
• Compression is the process of reducing the size of a trained network, either by removing certain layers or by shrinking layers, while maintaining accuracy.
• A smaller model predicts faster and requires less memory.
• The number of possible combinations makes it difficult to perform this task manually, or even programmatically.
• Reinforcement learning to the rescue!
  • 43. Defining the problem
• Objective: find the smallest possible network architecture derived from a pre-trained network architecture, while producing the best accuracy.
• Environment: a custom-developed environment that accepts a Boolean array of layers to remove from the RL agent and produces an observation describing the layers.
• State: the layers.
• Action: a Boolean array, one entry per layer.
• Reward: a combination of compression ratio and accuracy (a sketch of one possible shaping follows below).
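As an illustration only, a reward of that shape could look like the following. The exact weighting used in the example notebook is not shown on the slide, so this function is a hypothetical stand-in:

    def reward(compression_ratio, accuracy, baseline_accuracy):
        """Combine compression and accuracy into a single scalar reward.

        compression_ratio: original_size / compressed_size (> 1 means smaller)
        accuracy / baseline_accuracy: quality of the compressed model relative
        to the pre-trained network it was derived from
        """
        # Squaring the relative accuracy penalizes accuracy loss more strongly
        # than it rewards extra compression.
        return compression_ratio * (accuracy / baseline_accuracy) ** 2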
  • 44. Amazon SageMaker RL: reinforcement learning for every developer and data scientist
KEY FEATURES:
• Broad support for frameworks
• Broad support for simulation environments: 2D & 3D physics environments and OpenAI Gym support
• Supports Amazon Sumerian, AWS RoboMaker, and the open-source Robot Operating System (ROS) project
• Fully managed
• Example notebooks and tutorials
  • 45. Example notebook for RL-based network compression: https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_network_compression_ray_custom
  • 46. Predictions drive complexity and cost in production. (Chart: Training 10%, Inference 90%.)
  • 47. Are you making the most of your infrastructure? Low utilization and high costs; one size does not fit all.
  • 48. Amazon Elastic Inference (https://aws.amazon.com/blogs/aws/amazon-elastic-inference-gpu-powered-deep-learning-inference-acceleration/)
KEY FEATURES:
• Match capacity to demand
• Available from 1 to 32 TFLOPS
• Integrated with Amazon EC2, Amazon SageMaker, and the AWS Deep Learning AMIs
• Support for TensorFlow, Apache MXNet, and ONNX, with PyTorch coming soon
• Single and mixed-precision operations
• Lowers inference costs by up to 75%
  • 49. Using Elastic Inference with TensorFlow

OPTION 1 - Using Elastic Inference TensorFlow Serving:

    $ amazonei_tensorflow_model_server --model_name=ssdresnet --model_base_path=/tmp/ssd_resnet50_v1_coco --port=9000

OPTION 2 - Using the Elastic Inference TensorFlow Predictor:

    import matplotlib.image as mpimg
    import numpy as np
    from tensorflow.contrib.ei.python.predictor.ei_predictor import EIPredictor

    img = mpimg.imread(FLAGS.image)
    img = np.expand_dims(img, axis=0)
    ssd_resnet_input = {'inputs': img}

    eia_predictor = EIPredictor(model_dir='/tmp/ssd_resnet50_v1_coco/1/')
    pred = eia_predictor(ssd_resnet_input)
  • 50. Using Elastic Inference with Apache MXNet

OPTION 1 - Use EI with the MXNet Symbol API:

    import mxnet as mx

    data = mx.sym.var('data', shape=(1,))
    sym = mx.sym.exp(data)

    # Pass mx.eia() as context during simple bind operation
    executor = sym.simple_bind(ctx=mx.eia(), grad_req='null')

    # Forward call is performed on remote accelerator
    executor.forward(data=mx.nd.ones((1,)))
    print('Inference %d, output = %s' % (i, executor.outputs[0]))

OPTION 2 - Use EI with the Module API:

    ctx = mx.eia()
    sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-152', 0)
    mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
    # The slide ends here; binding the module and setting its parameters
    # would follow, e.g. mod.bind(for_training=False, data_shapes=[...])
    # and mod.set_params(arg_params, aux_params).
  • 51. Other tips
• SageMaker Pipe mode using the TensorFlow PipeModeDataset extension: https://github.com/aws/sagemaker-tensorflow-extensions (a minimal sketch follows below)
• Apache MXNet can read training data directly from Amazon S3: https://mxnet.incubator.apache.org/versions/master/faq/s3_integration.html
(Benchmark dataset: a 3.9 GB CSV file containing 2 million records, each record having 100 comma-separated, single-precision floating-point values.)
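For reference, here is a minimal sketch of the PipeModeDataset extension inside a training script's input function. It is not from the slides, and parse_record is a hypothetical parser for your own TFRecord schema:

    import tensorflow as tf
    from sagemaker_tensorflow import PipeModeDataset

    def input_fn():
        # Stream TFRecords from S3 through the SageMaker 'train' channel
        # instead of downloading the full dataset to local disk first.
        ds = PipeModeDataset(channel='train', record_format='TFRecord')
        ds = ds.repeat()
        ds = ds.map(parse_record, num_parallel_calls=10)  # parse_record: your parser
        ds = ds.batch(64)
        ds = ds.prefetch(10)
        return ds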
  • 52. Summary
Training:
• Make sure you utilize Tensor Cores by using mixed-precision training
• Learn to use Horovod for efficient multi-GPU or multi-node distributed training
• Find the optimal hyperparameters using SageMaker HPO
Deployment:
• Compile your model using Amazon SageMaker Neo
• Use Amazon Elastic Inference to reduce inference cost where applicable
  • 53. Dive into Deep Learning: an interactive deep learning book with code, math, and discussions
http://d2l.ai/ and http://ko.d2l.ai/
Used for the STAT 157 course at UC Berkeley, Spring 2019.
The Korean version of the first 4 chapters is available now.
• GitHub pull requests for any corrections are welcome
• Raise issues at https://github.com/d2l-ai/d2l-ko/issues
  • 54. Getting started
https://ml.aws
https://aws.amazon.com/blogs/machine-learning
https://aws.amazon.com/sagemaker
https://github.com/awslabs/amazon-sagemaker-examples
https://medium.com/@julsimon