SlideShare a Scribd company logo
1 of 32
Download to read offline
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
대용량 한글 자연어처리 모델의
클라우드 분산 학습 및 배포 사례
전희원
연구원
SK Telecom
김무현
시니어 데이터 사이언티스트
AWS Korea
강지양
시니어 딥러닝 아키텍트
AWS Korea
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
GPT(Generative Pre-Training)2 - 1
• Language Model based Transformer
• Language Model
•P(아버지가 방에 들어가신다. ) > P(아버지 가방에 들어가신다.)
• Unsupervised pre-training
• Transformer decoder
Masked Multi
Self Attention
Layer Norm
Feed Forward
Layer Norm
+
+
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
GPT(Generative Pre-Training)2 - 2
> A robot must obey the orders given it by human beings except where such orders would
conflict with the First Law.
http://jalammar.github.io/illustrated-gpt2/
KoGPT2 Training - Data
Corpus
• 20GB raw text (5 times larger than KoBERT)
Tokenizer
• Trained with 25M sentences(wiki + news)
• BPE(Byte Pair Encoding)
• 50,000 vocab
Data # of Sentences # of Words
Korean Wiki 5M 54M
Korean News 120M 1.6B
Other corpus 9.4M, 18M 88M, 82M
KoGPT2 Training - Distributed training
• 'Single Reducer' to 'Ring Reducer'
• Linear scale training performance with Horovod
• Instant corpus pre-processing
• No need to load all data on memory.
• Syncing training chunks on every epoch.
• No need to stop training to add more data.
• Fused GELU (with GluonNLP team)
• About 15% performance improved
KoGPT2 Training - Performances
Model Test Accuracy
BERT base
multilingual cased
0.875
KoBERT 0.901
KoGPT2 0.899
Sentiment Analysis (Naver movie review data)
Model Test Accuracy
KoBERT 0.912
KoGPT2 0.943
Paraphrase Detection
KoGPT2 Demo
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Infra for distributed training - scale OUT
Multiple Computing Resources
Shared Storage
High bandwidth inter-node communication
Tools for multi-GPU and distributed training
Deep learning framework
• TensorFlow - tf.distributed.Strategy
• PyTorch - DistributedDataParallel
• Apache MXNet - Parameter server
Toolkit for distributed training on top of deep learning frameworks
• All-reduce based distributed training framework, Horovod
• A deep learning optimization library, DeepSpeed
AWS ML services for distributed training
Amazon EC2
• Amazon EC2 P3dn.24xlarge for compute (NVIDIA V100 TensorCore GPU)
• Elastic Fabric Adapter for high speed network (100Gbps)
• Amazon FSx for Lustre for shared storage
• Deep Learning AMI
• EC2 Launch Templates or AWS Parallel Cluster for automation
Amazon SageMaker
• ml.p3dn.24xlarge ML instance for compute
• Elastic Fabric Adapter for high speed network
• Amazon S3 with SageMaker Pipe mode or Amazon FSx for Lustre
AWS architecture for KoGPT-2 training
Deep Learning
Scientist
MLOps
Engineer
Amazon FSx
for Lustre
Amazon Simple Storage
Service
VPC
AWS Cloud
Availability Zone 1
Amazon CloudWatch
AWS Lambda
Placement Group
Elastic Fabric
Adapter
Data management
instance
What we, as a human team, did
CPU and GPU utilization monitoring
• GPU monitoring script
• Amazon CloudWatch
Bottleneck analysis
• Apache MXNet profiling
• Horovod timeline profiling
Code Review for better choices
Training running state check
Identifying bottleneck
Tuning trials
• Activation function, GELU (+++)
• NVIDIA implementation of Adam, BERTAdam (+)
• Horovod option tuning
• Mixed precision training (+)
• hvd.DistributedTrainer instead of hvd.DistributedOptimizer
• Data transposing using GPU instead of using CPU (+)
Tuning result
Final training decision
84.9
146.8
253.9
411.7
84.9
169.8
339.6
679.3
0.0
100.0
200.0
300.0
400.0
500.0
600.0
700.0
800.0
8 16 32 64
throughput(records/sec)
number of GPUs
Distributed training scalability test result (KoGPT2)
throughput (record/sec) ideal throughput
Why not using Amazon SageMaker for training?
Yes!
Since all the training configuration and tuning have completed,
both training new pretrained models on new dataset
and
fine-tuning for downstream NLP tasks
on Amazon SageMaker is a desired option.
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon SageMaker:
Build, Train, and Deploy ML Models at Scale
Inference vs Training
Inference Training
Usually run on a single input in real
time
Requires high parallelism with large
batch processing for higher throughput
Less compute/memory intensive Compute/memory intensive
Integrated into the application stack
workflows
Standalone, not integrated into an
application stack
Runs on different devices at the edge
and in the cloud
Run in the cloud
Runs all the time
Typically runs less frequently (train
once, repeat infrequently)
The majority of the cost and
complexity of Machine
Learning (ML) in production is
due to Inference
Inference
90%
Training
10%
Amazon SageMaker – Instance Types for Inference
https://aws.amazon.com/sagemaker/pricing/instance-types/
Instance
Family
t family m family r family c family p family g family Inf1
Elastic
Inference
Workload
Type
Short jobs/
Notebooks
Standard
CPU/
Memory ratio
Memory
optimized
Compute
optimized
Accelerated
Computing-
Training and
Inference
Accelerated
inference,
smaller
training jobs
Accelerated
Computing -
Inference
Cost-
effective
Inference
Accelerators
t3.2xlarge
8 vCPU
32 Mem
m5.2xlarge
8 vCPU
32 Mem
r5.2xlarge
8 vCPU
64 Mem
c5.2xlarge
8 vCPU
16 Mem
p3.2xlarge
8 vCPU
61 Mem
1xV100 GPU
g4dn.2xlarge
8 vCPU
32 Mem
1xT4 GPU
GPUs chipsCPUs
SageMaker Endpoint - Scalable and Highly Available
• Deploy more than one instance
behind an inference endpoint
for high availability
• Enable automatic scaling with a
predefined metric or a custom
metric
• You can manually change the
instance number and type
without incurring downtime
• Set Amazon CloudWatch alarms
on availability and error rates
Load Balancer
Availability Zone 2Availability Zone 1
VPC
Endpoint
Deploying in SageMaker at Any Scale
sagemaker_session = sagemaker.Session()
role = get_execution_role()
model_data = 's3://<your bucket name>/gpt2-model/model.tar.gz'
entry_point = './gpt2-inference.py'
mxnet_model = MXNetModel(model_data=model_data,
role=role,
entry_point=entry_point,
py_version='py3’,
framework_version='1.6.0’,
image='<AWS account id>.dkr.ecr.<AWS region>.amazonaws.com/kogpt2:latest’
)
# HTTPS endpoint backed by 16 instances, multi-AZ and load-balanced
predictor = mxnet_model.deploy(instance_type='ml.c5.xlarge', initial_instance_count=16)
Auto Scaling Your Endpoint
- Latency
- Errors
- Invocations
- Invocations per Instance
Scaling policy : Target value of 25000 (per min) for
SageMakerVariantInvocationsPerInstance
Auto Scaling OFF (1 instance)
Auto Scaling OFF (1 instance)
Auto Scaling OFF (1 instance)
Auto Scaling ON
Auto Scaling ON
Auto Scaling ON
References
KoGPT-2 : https://github.com/SKT-AI/KoGPT2
AWS Samples GitHub : https://github.com/aws-samples
AWS Korea Blog : https://aws.amazon.com/ko/blogs/korea/
Machine Learning on AWS : https://ml.aws
Amazon SageMaker : https://aws.amazon.com/sagemaker/
GluonNLP : https://gluon-nlp.mxnet.io/
Amazon ML Solutions Lab : https://aws.amazon.com/ml-solutions-lab/
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
감사합니다
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.

More Related Content

More from Amazon Web Services Korea

Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...Amazon Web Services Korea
 
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...Amazon Web Services Korea
 
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...Amazon Web Services Korea
 
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...Amazon Web Services Korea
 
From Insights to Action, How to build and maintain a Data Driven Organization...
From Insights to Action, How to build and maintain a Data Driven Organization...From Insights to Action, How to build and maintain a Data Driven Organization...
From Insights to Action, How to build and maintain a Data Driven Organization...Amazon Web Services Korea
 
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...Amazon Web Services Korea
 
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...Amazon Web Services Korea
 
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...Amazon Web Services Korea
 
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...Amazon Web Services Korea
 
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...Amazon Web Services Korea
 
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...Amazon Web Services Korea
 
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...Amazon Web Services Korea
 
[Keynote] Data Driven Organizations with AWS Data - 발표자: Agnes Panosian, Head...
[Keynote] Data Driven Organizations with AWS Data - 발표자: Agnes Panosian, Head...[Keynote] Data Driven Organizations with AWS Data - 발표자: Agnes Panosian, Head...
[Keynote] Data Driven Organizations with AWS Data - 발표자: Agnes Panosian, Head...Amazon Web Services Korea
 
AWS Summit Seoul 2023 | Amazon Neptune 및 Elastic을 이용한 추천 서비스 및 검색 플랫폼 구축하기
AWS Summit Seoul 2023 | Amazon Neptune 및 Elastic을 이용한 추천 서비스 및 검색 플랫폼 구축하기AWS Summit Seoul 2023 | Amazon Neptune 및 Elastic을 이용한 추천 서비스 및 검색 플랫폼 구축하기
AWS Summit Seoul 2023 | Amazon Neptune 및 Elastic을 이용한 추천 서비스 및 검색 플랫폼 구축하기Amazon Web Services Korea
 
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기Amazon Web Services Korea
 
AWS Summit Seoul 2023 | 스타트업의 서버리스 기반 SaaS 데이터 처리 및 데이터웨어하우스 구축 사례
AWS Summit Seoul 2023 | 스타트업의 서버리스 기반 SaaS 데이터 처리 및 데이터웨어하우스 구축 사례AWS Summit Seoul 2023 | 스타트업의 서버리스 기반 SaaS 데이터 처리 및 데이터웨어하우스 구축 사례
AWS Summit Seoul 2023 | 스타트업의 서버리스 기반 SaaS 데이터 처리 및 데이터웨어하우스 구축 사례Amazon Web Services Korea
 
AWS Summit Seoul 2023 | Amazon EKS 데이터 전송 비용 절감 및 카오스 엔지니어링 적용 사례
AWS Summit Seoul 2023 | Amazon EKS 데이터 전송 비용 절감 및 카오스 엔지니어링 적용 사례AWS Summit Seoul 2023 | Amazon EKS 데이터 전송 비용 절감 및 카오스 엔지니어링 적용 사례
AWS Summit Seoul 2023 | Amazon EKS 데이터 전송 비용 절감 및 카오스 엔지니어링 적용 사례Amazon Web Services Korea
 
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기Amazon Web Services Korea
 
AWS Summit Seoul 2023 | 12가지 디자인 패턴으로 알아보는 클라우드 네이티브 마이크로서비스 아키텍처
AWS Summit Seoul 2023 | 12가지 디자인 패턴으로 알아보는 클라우드 네이티브 마이크로서비스 아키텍처AWS Summit Seoul 2023 | 12가지 디자인 패턴으로 알아보는 클라우드 네이티브 마이크로서비스 아키텍처
AWS Summit Seoul 2023 | 12가지 디자인 패턴으로 알아보는 클라우드 네이티브 마이크로서비스 아키텍처Amazon Web Services Korea
 
AWS Summit Seoul 2023 | AWS에서 OpenTelemetry 기반의 애플리케이션 Observability 구축/활용하기
AWS Summit Seoul 2023 | AWS에서 OpenTelemetry 기반의 애플리케이션 Observability 구축/활용하기AWS Summit Seoul 2023 | AWS에서 OpenTelemetry 기반의 애플리케이션 Observability 구축/활용하기
AWS Summit Seoul 2023 | AWS에서 OpenTelemetry 기반의 애플리케이션 Observability 구축/활용하기Amazon Web Services Korea
 

More from Amazon Web Services Korea (20)

Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
 
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
 
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
 
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
 
From Insights to Action, How to build and maintain a Data Driven Organization...
From Insights to Action, How to build and maintain a Data Driven Organization...From Insights to Action, How to build and maintain a Data Driven Organization...
From Insights to Action, How to build and maintain a Data Driven Organization...
 
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
 
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
 
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
 
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
 
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
 
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
 
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
 
[Keynote] Data Driven Organizations with AWS Data - 발표자: Agnes Panosian, Head...
[Keynote] Data Driven Organizations with AWS Data - 발표자: Agnes Panosian, Head...[Keynote] Data Driven Organizations with AWS Data - 발표자: Agnes Panosian, Head...
[Keynote] Data Driven Organizations with AWS Data - 발표자: Agnes Panosian, Head...
 
AWS Summit Seoul 2023 | Amazon Neptune 및 Elastic을 이용한 추천 서비스 및 검색 플랫폼 구축하기
AWS Summit Seoul 2023 | Amazon Neptune 및 Elastic을 이용한 추천 서비스 및 검색 플랫폼 구축하기AWS Summit Seoul 2023 | Amazon Neptune 및 Elastic을 이용한 추천 서비스 및 검색 플랫폼 구축하기
AWS Summit Seoul 2023 | Amazon Neptune 및 Elastic을 이용한 추천 서비스 및 검색 플랫폼 구축하기
 
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
 
AWS Summit Seoul 2023 | 스타트업의 서버리스 기반 SaaS 데이터 처리 및 데이터웨어하우스 구축 사례
AWS Summit Seoul 2023 | 스타트업의 서버리스 기반 SaaS 데이터 처리 및 데이터웨어하우스 구축 사례AWS Summit Seoul 2023 | 스타트업의 서버리스 기반 SaaS 데이터 처리 및 데이터웨어하우스 구축 사례
AWS Summit Seoul 2023 | 스타트업의 서버리스 기반 SaaS 데이터 처리 및 데이터웨어하우스 구축 사례
 
AWS Summit Seoul 2023 | Amazon EKS 데이터 전송 비용 절감 및 카오스 엔지니어링 적용 사례
AWS Summit Seoul 2023 | Amazon EKS 데이터 전송 비용 절감 및 카오스 엔지니어링 적용 사례AWS Summit Seoul 2023 | Amazon EKS 데이터 전송 비용 절감 및 카오스 엔지니어링 적용 사례
AWS Summit Seoul 2023 | Amazon EKS 데이터 전송 비용 절감 및 카오스 엔지니어링 적용 사례
 
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
 
AWS Summit Seoul 2023 | 12가지 디자인 패턴으로 알아보는 클라우드 네이티브 마이크로서비스 아키텍처
AWS Summit Seoul 2023 | 12가지 디자인 패턴으로 알아보는 클라우드 네이티브 마이크로서비스 아키텍처AWS Summit Seoul 2023 | 12가지 디자인 패턴으로 알아보는 클라우드 네이티브 마이크로서비스 아키텍처
AWS Summit Seoul 2023 | 12가지 디자인 패턴으로 알아보는 클라우드 네이티브 마이크로서비스 아키텍처
 
AWS Summit Seoul 2023 | AWS에서 OpenTelemetry 기반의 애플리케이션 Observability 구축/활용하기
AWS Summit Seoul 2023 | AWS에서 OpenTelemetry 기반의 애플리케이션 Observability 구축/활용하기AWS Summit Seoul 2023 | AWS에서 OpenTelemetry 기반의 애플리케이션 Observability 구축/활용하기
AWS Summit Seoul 2023 | AWS에서 OpenTelemetry 기반의 애플리케이션 Observability 구축/활용하기
 

Recently uploaded

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 

Recently uploaded (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 

대용량 한글 자연어처리 모델의 클라우드 분산 학습 및 배포 사례 – 김무현, AWS시니어 데이터 사이언티스트 / 강지양, AWS 시니어 딥러닝 아키텍트/전희원, SK Telecom연구원:: AWS Summit Online Korea 2020

  • 1. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved. 대용량 한글 자연어처리 모델의 클라우드 분산 학습 및 배포 사례 전희원 연구원 SK Telecom 김무현 시니어 데이터 사이언티스트 AWS Korea 강지양 시니어 딥러닝 아키텍트 AWS Korea
  • 2. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 3. GPT(Generative Pre-Training)2 - 1 • Language Model based Transformer • Language Model •P(아버지가 방에 들어가신다. ) > P(아버지 가방에 들어가신다.) • Unsupervised pre-training • Transformer decoder Masked Multi Self Attention Layer Norm Feed Forward Layer Norm + + https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
  • 4. GPT(Generative Pre-Training)2 - 2 > A robot must obey the orders given it by human beings except where such orders would conflict with the First Law. http://jalammar.github.io/illustrated-gpt2/
  • 5. KoGPT2 Training - Data Corpus • 20GB raw text (5 times larger than KoBERT) Tokenizer • Trained with 25M sentences(wiki + news) • BPE(Byte Pair Encoding) • 50,000 vocab Data # of Sentences # of Words Korean Wiki 5M 54M Korean News 120M 1.6B Other corpus 9.4M, 18M 88M, 82M
  • 6. KoGPT2 Training - Distributed training • 'Single Reducer' to 'Ring Reducer' • Linear scale training performance with Horovod • Instant corpus pre-processing • No need to load all data on memory. • Syncing training chunks on every epoch. • No need to stop training to add more data. • Fused GELU (with GluonNLP team) • About 15% performance improved
  • 7. KoGPT2 Training - Performances Model Test Accuracy BERT base multilingual cased 0.875 KoBERT 0.901 KoGPT2 0.899 Sentiment Analysis (Naver movie review data) Model Test Accuracy KoBERT 0.912 KoGPT2 0.943 Paraphrase Detection
  • 9.
  • 10. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 11. Infra for distributed training - scale OUT Multiple Computing Resources Shared Storage High bandwidth inter-node communication
  • 12. Tools for multi-GPU and distributed training Deep learning framework • TensorFlow - tf.distributed.Strategy • PyTorch - DistributedDataParallel • Apache MXNet - Parameter server Toolkit for distributed training on top of deep learning frameworks • All-reduce based distributed training framework, Horovod • A deep learning optimization library, DeepSpeed
  • 13. AWS ML services for distributed training Amazon EC2 • Amazon EC2 P3dn.24xlarge for compute (NVIDIA V100 TensorCore GPU) • Elastic Fabric Adapter for high speed network (100Gbps) • Amazon FSx for Lustre for shared storage • Deep Learning AMI • EC2 Launch Templates or AWS Parallel Cluster for automation Amazon SageMaker • ml.p3dn.24xlarge ML instance for compute • Elastic Fabric Adapter for high speed network • Amazon S3 with SageMaker Pipe mode or Amazon FSx for Lustre
  • 14. AWS architecture for KoGPT-2 training Deep Learning Scientist MLOps Engineer Amazon FSx for Lustre Amazon Simple Storage Service VPC AWS Cloud Availability Zone 1 Amazon CloudWatch AWS Lambda Placement Group Elastic Fabric Adapter Data management instance
  • 15. What we, as a human team, did CPU and GPU utilization monitoring • GPU monitoring script • Amazon CloudWatch Bottleneck analysis • Apache MXNet profiling • Horovod timeline profiling Code Review for better choices Training running state check
  • 17. Tuning trials • Activation function, GELU (+++) • NVIDIA implementation of Adam, BERTAdam (+) • Horovod option tuning • Mixed precision training (+) • hvd.DistributedTrainer instead of hvd.DistributedOptimizer • Data transposing using GPU instead of using CPU (+)
  • 19. Final training decision 84.9 146.8 253.9 411.7 84.9 169.8 339.6 679.3 0.0 100.0 200.0 300.0 400.0 500.0 600.0 700.0 800.0 8 16 32 64 throughput(records/sec) number of GPUs Distributed training scalability test result (KoGPT2) throughput (record/sec) ideal throughput
  • 20. Why not using Amazon SageMaker for training? Yes! Since all the training configuration and tuning have completed, both training new pretrained models on new dataset and fine-tuning for downstream NLP tasks on Amazon SageMaker is a desired option.
  • 21. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 22. Amazon SageMaker: Build, Train, and Deploy ML Models at Scale
  • 23. Inference vs Training Inference Training Usually run on a single input in real time Requires high parallelism with large batch processing for higher throughput Less compute/memory intensive Compute/memory intensive Integrated into the application stack workflows Standalone, not integrated into an application stack Runs on different devices at the edge and in the cloud Run in the cloud Runs all the time Typically runs less frequently (train once, repeat infrequently) The majority of the cost and complexity of Machine Learning (ML) in production is due to Inference Inference 90% Training 10%
  • 24. Amazon SageMaker – Instance Types for Inference https://aws.amazon.com/sagemaker/pricing/instance-types/ Instance Family t family m family r family c family p family g family Inf1 Elastic Inference Workload Type Short jobs/ Notebooks Standard CPU/ Memory ratio Memory optimized Compute optimized Accelerated Computing- Training and Inference Accelerated inference, smaller training jobs Accelerated Computing - Inference Cost- effective Inference Accelerators t3.2xlarge 8 vCPU 32 Mem m5.2xlarge 8 vCPU 32 Mem r5.2xlarge 8 vCPU 64 Mem c5.2xlarge 8 vCPU 16 Mem p3.2xlarge 8 vCPU 61 Mem 1xV100 GPU g4dn.2xlarge 8 vCPU 32 Mem 1xT4 GPU GPUs chipsCPUs
  • 25. SageMaker Endpoint - Scalable and Highly Available • Deploy more than one instance behind an inference endpoint for high availability • Enable automatic scaling with a predefined metric or a custom metric • You can manually change the instance number and type without incurring downtime • Set Amazon CloudWatch alarms on availability and error rates Load Balancer Availability Zone 2Availability Zone 1 VPC Endpoint
  • 26. Deploying in SageMaker at Any Scale sagemaker_session = sagemaker.Session() role = get_execution_role() model_data = 's3://<your bucket name>/gpt2-model/model.tar.gz' entry_point = './gpt2-inference.py' mxnet_model = MXNetModel(model_data=model_data, role=role, entry_point=entry_point, py_version='py3’, framework_version='1.6.0’, image='<AWS account id>.dkr.ecr.<AWS region>.amazonaws.com/kogpt2:latest’ ) # HTTPS endpoint backed by 16 instances, multi-AZ and load-balanced predictor = mxnet_model.deploy(instance_type='ml.c5.xlarge', initial_instance_count=16)
  • 27.
  • 28.
  • 29. Auto Scaling Your Endpoint - Latency - Errors - Invocations - Invocations per Instance Scaling policy : Target value of 25000 (per min) for SageMakerVariantInvocationsPerInstance Auto Scaling OFF (1 instance) Auto Scaling OFF (1 instance) Auto Scaling OFF (1 instance) Auto Scaling ON Auto Scaling ON Auto Scaling ON
  • 30. References KoGPT-2 : https://github.com/SKT-AI/KoGPT2 AWS Samples GitHub : https://github.com/aws-samples AWS Korea Blog : https://aws.amazon.com/ko/blogs/korea/ Machine Learning on AWS : https://ml.aws Amazon SageMaker : https://aws.amazon.com/sagemaker/ GluonNLP : https://gluon-nlp.mxnet.io/ Amazon ML Solutions Lab : https://aws.amazon.com/ml-solutions-lab/
  • 31. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 32. 감사합니다 © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.