SlideShare a Scribd company logo
1 of 87
Download to read offline
OpenAI Gym을 이용한
강화학습 에이전트 만들기
파이콘 코리아 2017 튜토리얼
이의령, 이영무, 이웅원, 양혁렬, 김건우
RLCode
https://github.com/zzing0907/pycon-tutorial
•
•
https://github.com/rlcode/reinforcement-learning-
kr/tree/master/wiki
http://web.stanford.edu/class/cs234/slides/lecture1_introduction.pdf
à à
http://www.popsci.com/googles-alphago-ai-defeats-lee-se-dol-at-game-go
à à
•
à
•
à
https://www.youtube.com/watch?v=0CqoMwcqIbQ
•
à
à
•
•
•
à
à
𝑠", 𝑎", 𝑟&, 𝑠&, 𝑎&, 𝑟', ⋯ , 𝑠)
à
•
•
•
•
•
•
•
• 𝑃,,-
.
𝑠/
• 𝛾
• 𝜋 𝑎	 	𝑠)
à
•
•
•
= 𝑅56& + 𝑅56' +	⋯	+ 𝑅)
• à
0.1 + 0.1 +	⋯ = 	∞
1 + 1 +	⋯ = 	∞
= 𝑅56& + 𝛾𝑅56' +	⋯	+ 𝛾)<5<&
𝑅)	
0 ≤ 𝛾 ≤ 1
•
	𝑣(𝑠) = 𝑬 𝑅56& + 𝛾𝑅56' +	⋯	|𝑆5 = 𝑠
•
	𝑞(𝑠, 𝑎) = 𝑬 𝑅56& + 𝛾𝑅56' +	⋯	|𝑆5 = 𝑠, 𝐴5 = 𝑎
• à
•
	𝑞E(𝑠, 𝑎) = 𝑬 𝝅 𝑅56& + 𝛾𝑅56' +	⋯	|𝑆5 = 𝑠, 𝐴5 = 𝑎
	𝑣E(𝑠) = 𝑬 𝝅 𝑅56& + 𝛾𝑅56' +	⋯	|𝑆5 = 𝑠 	
	𝜋(𝑎|𝑠) = 𝑷 𝐴5 = 𝑎|𝑆5 = 𝑠
•
	𝑣E(𝑠) = 𝑬 𝝅 𝑅56& + 𝛾𝑅56' +	⋯	|𝑆5 = 𝑠
𝑣E(𝑠) = H 𝜋 𝑎	 	𝑠)
.∈K
(𝑅56& + 𝛾𝑣E(𝑆56&))
	𝑣E(𝑠) = 𝑬 𝝅 𝑅56& + 𝛾𝑣E(𝑆56&)|𝑆5 = 𝑠
𝑣E(𝑠) = H 𝜋 𝑎	 	𝑠)
.∈K
(𝑅56& + 𝛾𝑃,,/
.
𝑣E(𝑆56&))
•
•
𝛾
𝑣∗ 𝑠 = 𝑚𝑎𝑥. 𝐸 𝑅56& + 𝛾𝑣∗ 𝑆56& 𝑆5 = 𝑠, 𝐴5 = 𝑎]
𝑞∗ 𝑠, 𝑎 = 𝐸 𝑅56& + 𝛾𝑚𝑎𝑥. 𝑞∗ 𝑆56&, 𝑎′ 𝑆5 = 𝑠, 𝐴5 = 𝑎]
• à
•
	𝑞E(𝑠, 𝑎) = 𝑬 𝝅 𝑅56& + 𝛾𝑅56' +	⋯	|𝑆5 = 𝑠, 𝐴5 = 𝑎
𝑞E(𝑠, 𝑎) = 𝑬 𝝅 𝑅56& + 𝛾(𝑅56' +	⋯ )|𝑆5 = 𝑠, 𝐴5 = 𝑎
𝑞E(𝑠, 𝑎) = 𝑬 𝝅 𝑅56& + 𝛾𝑞E(𝑆56&, 𝐴56&)|𝑆5 = 𝑠, 𝐴5 = 𝑎
•
•
•
𝑣R
2. 	𝑉R(𝑠/
) 𝛾
𝑅,
.
	
𝑅,
.
𝛾 	𝑉R(𝑠/
)
																												𝜋(𝑎|𝑠)(𝑅,
.
𝛾 	𝑉R(𝑠/
))
∑ 𝜋(𝑎|𝑠)(𝑅,
.
𝛾 	𝑉R(𝑠/
))
∈ 𝑆에
•
𝑣E(𝑠) = H 𝜋 𝑎	 	𝑠)
.∈K
(𝑅56& + 𝛾𝑣E(𝑆56&))
•
	𝜋′(𝑠) = 𝑎𝑟𝑔𝑚𝑎𝑥.	𝑞E(𝑠, 𝑎)
•
	𝜋′(𝑠) = 𝑎𝑟𝑔𝑚𝑎𝑥.	𝑞E(𝑠, 𝑎)
https://github.com/zzing0907/pycon-tutorial/tree/master/policy_iteration
•
•
•
•
𝑣R6&(𝑠) = 𝑚𝑎𝑥.(𝑅,
. + 𝛾𝑣R(𝑆56&)
•
𝑣R6&(𝑠) = 𝑚𝑎𝑥.(𝑅,
. + 𝛾𝑣R(𝑆56&)
https://github.com/zzing0907/pycon-tutorial/tree/master/value_iteration
•
•
•
•
•
• à
ß
𝑞E(𝑠, 𝑎) = 𝑬 𝝅 𝑅56& + 𝛾𝑞E(𝑆56&, 𝐴56&)|𝑆5 = 𝑠, 𝐴5 = 𝑎
𝑞 𝑠, 𝑎 ← 𝑟 + 𝛾𝑞E(𝑠′, 𝑎′)
𝑞 𝑠, 𝑎 = 𝑞 𝑠, 𝑎 + 𝛼(𝑟 + 𝛾𝑞 𝑠′, 𝑎′ − 𝑞 𝑠, 𝑎 )
•
à
𝑞 𝑠, 𝑎 = 𝑞 𝑠, 𝑎 + 𝛼(𝑟 + 𝛾𝑞 𝑠′, 𝑎′ − 𝑞 𝑠, 𝑎 )
𝜀 − 	𝜋(𝑠) = [
𝑎∗
= 𝑎𝑟𝑔𝑚𝑎𝑥.	𝑞(𝑠, 𝑎), 1 − 𝜀
	𝑎 ≠ 𝑎∗
, 𝜀
𝜀 −
• à
𝜀 − 	𝜋(𝑠) = [
𝑎∗ = 𝑎𝑟𝑔𝑚𝑎𝑥.	𝑞(𝑠, 𝑎), 1 − 𝜀
	𝑎 ≠ 𝑎∗, 𝜀
	𝜋′(𝑠) = 𝑎𝑟𝑔𝑚𝑎𝑥 𝑎	𝑞 𝜋(𝑠, 𝑎)
https://github.com/zzing0907/pycon-tutorial/tree/master/SARSA
•
à
𝑞 𝑠, 𝑎 = 𝑞 𝑠, 𝑎 + 𝛼(𝑟 + 𝛾 max
./
𝑞(𝑠/
, 𝑎′) − 𝑞 𝑠, 𝑎 )
𝜖
𝜖	
https://github.com/rlcode/reinforcement-learning-kr/tree/master/1-grid-world/5-q-learning
(𝑆/) 𝑎/
(𝑆/)
https://github.com/zzing0907/pycon-tutorial/tree/master/Q-Learning
•
•
•
•
•
•
•
ℎ𝑖𝑠𝑡𝑜𝑟𝑦	 → 		𝑞 − 𝑛𝑒𝑡𝑤𝑜𝑟𝑘	 → 𝑞	𝑓𝑢𝑛𝑐𝑡𝑖𝑜𝑛	𝑜𝑓	𝑒𝑎𝑐ℎ	𝑎𝑐𝑡𝑖𝑜𝑛
•
•
𝑞 𝑠, 𝑎 = 𝑞 𝑠, 𝑎 + 𝛼(𝑟 + 𝛾 max
./
𝑞(𝑠/
, 𝑎′) − 𝑞 𝑠, 𝑎 )
인공신경망 업데
이트
•
à
•
• 𝜃
•
𝑞 𝑠, 𝑎 = 𝑞 𝑠, 𝑎 + 𝛼(𝑟 + 𝛾 max
./
𝑞(𝑠/, 𝑎′) − 𝑞 𝑠, 𝑎 )
𝑞o 𝑠, 𝑎 = 𝑞o 𝑠, 𝑎 + 𝛼(𝑟 + 𝛾 max
.-
𝑞o 𝑠/, 𝑎/ − 𝑞o 𝑠, 𝑎 )
𝑀𝑆𝐸 =	 𝑟 + 𝛾 max
.-
𝑞o 𝑠/, 𝑎/ − 𝑞o 𝑠, 𝑎
'
•
https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf
•
어떤 필터인지에 따라 이미지를 다양하게 변형 가능
https://en.wikipedia.org/wiki/Kernel_(image_processing)
•
DQN의 CNN 학습된 필터의 예시
• à
•
𝜃<
•
𝑀𝑆𝐸 = 𝑟 + 𝛾 max
.-
𝑞oq 𝑠/
, 𝑎/
− 𝑞o 𝑠, 𝑎
'
•
•
•
•
•
•
•
•
𝑥
𝑥̇
𝜃
𝜃̇
•
•
https://github.com/zzing0907/pycon-tutorial/tree/master/cartpole_dqn
•
•
•
from gym import wrappers
env = gym.make(‘CartPole-v1')
env = wrappers.Monitor(env, ‘/tmp/cartpole_upload’, force=True)
•
env.close()
gym.scoreboard.api_key = ‘your api key’
gym.upload(‘/tmp/cartpole_upload’)
http://wikibook.co.kr/reinforcement-learning/
•
•
•
•
•
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기

More Related Content

What's hot

강화학습 알고리즘의 흐름도 Part 2
강화학습 알고리즘의 흐름도 Part 2강화학습 알고리즘의 흐름도 Part 2
강화학습 알고리즘의 흐름도 Part 2Dongmin Lee
 
안.전.제.일. 강화학습!
안.전.제.일. 강화학습!안.전.제.일. 강화학습!
안.전.제.일. 강화학습!Dongmin Lee
 
Introduction to SAC(Soft Actor-Critic)
Introduction to SAC(Soft Actor-Critic)Introduction to SAC(Soft Actor-Critic)
Introduction to SAC(Soft Actor-Critic)Suhyun Cho
 
Proximal Policy Optimization (Reinforcement Learning)
Proximal Policy Optimization (Reinforcement Learning)Proximal Policy Optimization (Reinforcement Learning)
Proximal Policy Optimization (Reinforcement Learning)Thom Lane
 
ペアリングベースの効率的なレベル2準同型暗号(SCIS2018)
ペアリングベースの効率的なレベル2準同型暗号(SCIS2018)ペアリングベースの効率的なレベル2準同型暗号(SCIS2018)
ペアリングベースの効率的なレベル2準同型暗号(SCIS2018)MITSUNARI Shigeo
 
강화 학습 기초 Reinforcement Learning an introduction
강화 학습 기초 Reinforcement Learning an introduction강화 학습 기초 Reinforcement Learning an introduction
강화 학습 기초 Reinforcement Learning an introductionTaehoon Kim
 
오토인코더의 모든 것
오토인코더의 모든 것오토인코더의 모든 것
오토인코더의 모든 것NAVER Engineering
 
Soft Actor-Critic Algorithms and Applications 한국어 리뷰
Soft Actor-Critic Algorithms and Applications 한국어 리뷰Soft Actor-Critic Algorithms and Applications 한국어 리뷰
Soft Actor-Critic Algorithms and Applications 한국어 리뷰태영 정
 
Continuous control with deep reinforcement learning (DDPG)
Continuous control with deep reinforcement learning (DDPG)Continuous control with deep reinforcement learning (DDPG)
Continuous control with deep reinforcement learning (DDPG)Taehoon Kim
 
pycon2018 "RL Adventure : DQN 부터 Rainbow DQN까지"
pycon2018 "RL Adventure : DQN 부터 Rainbow DQN까지"pycon2018 "RL Adventure : DQN 부터 Rainbow DQN까지"
pycon2018 "RL Adventure : DQN 부터 Rainbow DQN까지"YeChan(Paul) Kim
 
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기NAVER D2
 
[DL輪読会]Deep Learning 第7章 深層学習のための正則化
[DL輪読会]Deep Learning 第7章 深層学習のための正則化[DL輪読会]Deep Learning 第7章 深層学習のための正則化
[DL輪読会]Deep Learning 第7章 深層学習のための正則化Deep Learning JP
 
분산 강화학습 논문(DeepMind IMPALA) 구현
분산 강화학습 논문(DeepMind IMPALA) 구현분산 강화학습 논문(DeepMind IMPALA) 구현
분산 강화학습 논문(DeepMind IMPALA) 구현정주 김
 
파이썬으로 나만의 강화학습 환경 만들기
파이썬으로 나만의 강화학습 환경 만들기파이썬으로 나만의 강화학습 환경 만들기
파이썬으로 나만의 강화학습 환경 만들기정주 김
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기NAVER Engineering
 
Control as Inference.pptx
Control as Inference.pptxControl as Inference.pptx
Control as Inference.pptxssuserbd1647
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningNAVER Engineering
 
Doing Deep Reinforcement learning with PPO
Doing Deep Reinforcement learning with PPODoing Deep Reinforcement learning with PPO
Doing Deep Reinforcement learning with PPO이 의령
 
Lifted-ElGamal暗号を用いた任意関数演算の二者間秘密計算プロトコルのmaliciousモデルにおける効率化
Lifted-ElGamal暗号を用いた任意関数演算の二者間秘密計算プロトコルのmaliciousモデルにおける効率化Lifted-ElGamal暗号を用いた任意関数演算の二者間秘密計算プロトコルのmaliciousモデルにおける効率化
Lifted-ElGamal暗号を用いた任意関数演算の二者間秘密計算プロトコルのmaliciousモデルにおける効率化MITSUNARI Shigeo
 

What's hot (20)

강화학습 알고리즘의 흐름도 Part 2
강화학습 알고리즘의 흐름도 Part 2강화학습 알고리즘의 흐름도 Part 2
강화학습 알고리즘의 흐름도 Part 2
 
안.전.제.일. 강화학습!
안.전.제.일. 강화학습!안.전.제.일. 강화학습!
안.전.제.일. 강화학습!
 
Introduction to SAC(Soft Actor-Critic)
Introduction to SAC(Soft Actor-Critic)Introduction to SAC(Soft Actor-Critic)
Introduction to SAC(Soft Actor-Critic)
 
Proximal Policy Optimization (Reinforcement Learning)
Proximal Policy Optimization (Reinforcement Learning)Proximal Policy Optimization (Reinforcement Learning)
Proximal Policy Optimization (Reinforcement Learning)
 
ペアリングベースの効率的なレベル2準同型暗号(SCIS2018)
ペアリングベースの効率的なレベル2準同型暗号(SCIS2018)ペアリングベースの効率的なレベル2準同型暗号(SCIS2018)
ペアリングベースの効率的なレベル2準同型暗号(SCIS2018)
 
강화 학습 기초 Reinforcement Learning an introduction
강화 학습 기초 Reinforcement Learning an introduction강화 학습 기초 Reinforcement Learning an introduction
강화 학습 기초 Reinforcement Learning an introduction
 
오토인코더의 모든 것
오토인코더의 모든 것오토인코더의 모든 것
오토인코더의 모든 것
 
Soft Actor-Critic Algorithms and Applications 한국어 리뷰
Soft Actor-Critic Algorithms and Applications 한국어 리뷰Soft Actor-Critic Algorithms and Applications 한국어 리뷰
Soft Actor-Critic Algorithms and Applications 한국어 리뷰
 
Continuous control with deep reinforcement learning (DDPG)
Continuous control with deep reinforcement learning (DDPG)Continuous control with deep reinforcement learning (DDPG)
Continuous control with deep reinforcement learning (DDPG)
 
pycon2018 "RL Adventure : DQN 부터 Rainbow DQN까지"
pycon2018 "RL Adventure : DQN 부터 Rainbow DQN까지"pycon2018 "RL Adventure : DQN 부터 Rainbow DQN까지"
pycon2018 "RL Adventure : DQN 부터 Rainbow DQN까지"
 
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기
 
[DL輪読会]Deep Learning 第7章 深層学習のための正則化
[DL輪読会]Deep Learning 第7章 深層学習のための正則化[DL輪読会]Deep Learning 第7章 深層学習のための正則化
[DL輪読会]Deep Learning 第7章 深層学習のための正則化
 
분산 강화학습 논문(DeepMind IMPALA) 구현
분산 강화학습 논문(DeepMind IMPALA) 구현분산 강화학습 논문(DeepMind IMPALA) 구현
분산 강화학습 논문(DeepMind IMPALA) 구현
 
파이썬으로 나만의 강화학습 환경 만들기
파이썬으로 나만의 강화학습 환경 만들기파이썬으로 나만의 강화학습 환경 만들기
파이썬으로 나만의 강화학습 환경 만들기
 
ddpg seminar
ddpg seminarddpg seminar
ddpg seminar
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
 
Control as Inference.pptx
Control as Inference.pptxControl as Inference.pptx
Control as Inference.pptx
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
 
Doing Deep Reinforcement learning with PPO
Doing Deep Reinforcement learning with PPODoing Deep Reinforcement learning with PPO
Doing Deep Reinforcement learning with PPO
 
Lifted-ElGamal暗号を用いた任意関数演算の二者間秘密計算プロトコルのmaliciousモデルにおける効率化
Lifted-ElGamal暗号を用いた任意関数演算の二者間秘密計算プロトコルのmaliciousモデルにおける効率化Lifted-ElGamal暗号を用いた任意関数演算の二者間秘密計算プロトコルのmaliciousモデルにおける効率化
Lifted-ElGamal暗号を用いた任意関数演算の二者間秘密計算プロトコルのmaliciousモデルにおける効率化
 

Viewers also liked

Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement LearningDongHyun Kwak
 
알파고 해부하기 1부
알파고 해부하기 1부알파고 해부하기 1부
알파고 해부하기 1부Donghun Lee
 
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단NAVER Engineering
 
Step-by-step approach to question answering
Step-by-step approach to question answeringStep-by-step approach to question answering
Step-by-step approach to question answeringNAVER Engineering
 
Online video object segmentation via convolutional trident network
Online video object segmentation via convolutional trident networkOnline video object segmentation via convolutional trident network
Online video object segmentation via convolutional trident networkNAVER Engineering
 
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망NAVER Engineering
 
Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?NAVER Engineering
 
바둑인을 위한 알파고
바둑인을 위한 알파고바둑인을 위한 알파고
바둑인을 위한 알파고Donghun Lee
 
Multimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QAMultimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QANAVER Engineering
 
Video Object Segmentation in Videos
Video Object Segmentation in VideosVideo Object Segmentation in Videos
Video Object Segmentation in VideosNAVER Engineering
 
알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review상은 박
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGANNAVER Engineering
 
알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리Shane (Seungwhan) Moon
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Jeongkyu Shin
 
what is_tabs_share
what is_tabs_sharewhat is_tabs_share
what is_tabs_shareNAVER D2
 
[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래NAVER D2
 
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기NAVER D2
 
밑바닥부터시작하는360뷰어
밑바닥부터시작하는360뷰어밑바닥부터시작하는360뷰어
밑바닥부터시작하는360뷰어NAVER D2
 
[124]자율주행과 기계학습
[124]자율주행과 기계학습[124]자율주행과 기계학습
[124]자율주행과 기계학습NAVER D2
 
[141] 오픈소스를 쓰려는 자, 리베이스의 무게를 견뎌라
[141] 오픈소스를 쓰려는 자, 리베이스의 무게를 견뎌라[141] 오픈소스를 쓰려는 자, 리베이스의 무게를 견뎌라
[141] 오픈소스를 쓰려는 자, 리베이스의 무게를 견뎌라NAVER D2
 

Viewers also liked (20)

Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learning
 
알파고 해부하기 1부
알파고 해부하기 1부알파고 해부하기 1부
알파고 해부하기 1부
 
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
 
Step-by-step approach to question answering
Step-by-step approach to question answeringStep-by-step approach to question answering
Step-by-step approach to question answering
 
Online video object segmentation via convolutional trident network
Online video object segmentation via convolutional trident networkOnline video object segmentation via convolutional trident network
Online video object segmentation via convolutional trident network
 
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
 
Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?
 
바둑인을 위한 알파고
바둑인을 위한 알파고바둑인을 위한 알파고
바둑인을 위한 알파고
 
Multimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QAMultimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QA
 
Video Object Segmentation in Videos
Video Object Segmentation in VideosVideo Object Segmentation in Videos
Video Object Segmentation in Videos
 
알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGAN
 
알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
 
what is_tabs_share
what is_tabs_sharewhat is_tabs_share
what is_tabs_share
 
[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래
 
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
 
밑바닥부터시작하는360뷰어
밑바닥부터시작하는360뷰어밑바닥부터시작하는360뷰어
밑바닥부터시작하는360뷰어
 
[124]자율주행과 기계학습
[124]자율주행과 기계학습[124]자율주행과 기계학습
[124]자율주행과 기계학습
 
[141] 오픈소스를 쓰려는 자, 리베이스의 무게를 견뎌라
[141] 오픈소스를 쓰려는 자, 리베이스의 무게를 견뎌라[141] 오픈소스를 쓰려는 자, 리베이스의 무게를 견뎌라
[141] 오픈소스를 쓰려는 자, 리베이스의 무게를 견뎌라
 

Similar to [2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기

第5回NIPS読み会・関西発表資料
第5回NIPS読み会・関西発表資料第5回NIPS読み会・関西発表資料
第5回NIPS読み会・関西発表資料Kyoichiro Kobayashi
 
A3 sec -_regular_expressions
A3 sec -_regular_expressionsA3 sec -_regular_expressions
A3 sec -_regular_expressionsa3sec
 
Joomla! Day Chicago 2011 - Templating the right way - Jonathan Shroyer
Joomla! Day Chicago 2011 - Templating the right way - Jonathan ShroyerJoomla! Day Chicago 2011 - Templating the right way - Jonathan Shroyer
Joomla! Day Chicago 2011 - Templating the right way - Jonathan ShroyerSteven Pignataro
 
Python于Web 2.0网站的应用 - QCon Beijing 2010
Python于Web 2.0网站的应用 - QCon Beijing 2010Python于Web 2.0网站的应用 - QCon Beijing 2010
Python于Web 2.0网站的应用 - QCon Beijing 2010Qiangning Hong
 
Programming Contest Hacks
Programming Contest HacksProgramming Contest Hacks
Programming Contest HacksKosei Moriyama
 
[AWS Dev Day] 인공지능 / 기계 학습 | 개발자를 위한 수백만 사용자 대상 기계 학습 서비스 확장 하기 - 윤석찬 AWS 수석테...
[AWS Dev Day] 인공지능 / 기계 학습 | 개발자를 위한 수백만 사용자 대상 기계 학습 서비스 확장 하기 - 윤석찬 AWS 수석테...[AWS Dev Day] 인공지능 / 기계 학습 | 개발자를 위한 수백만 사용자 대상 기계 학습 서비스 확장 하기 - 윤석찬 AWS 수석테...
[AWS Dev Day] 인공지능 / 기계 학습 | 개발자를 위한 수백만 사용자 대상 기계 학습 서비스 확장 하기 - 윤석찬 AWS 수석테...Amazon Web Services Korea
 
개발자가 Serverless로 운동하는 방법
개발자가 Serverless로 운동하는 방법개발자가 Serverless로 운동하는 방법
개발자가 Serverless로 운동하는 방법Jiyeon Seo
 
หัดเขียน A.I. แบบ AlphaGo กันชิวๆ
หัดเขียน A.I. แบบ AlphaGo กันชิวๆหัดเขียน A.I. แบบ AlphaGo กันชิวๆ
หัดเขียน A.I. แบบ AlphaGo กันชิวๆKan Ouivirach, Ph.D.
 
Key Value Storage Systems ... and Beyond ... with Python
Key Value Storage Systems ... and Beyond ... with PythonKey Value Storage Systems ... and Beyond ... with Python
Key Value Storage Systems ... and Beyond ... with PythonIan Lewis
 
How to not blow up spaceships
How to not blow up spaceshipsHow to not blow up spaceships
How to not blow up spaceshipsSabin Marcu
 
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processingNAVER Engineering
 
Py "Baseball" Data入門〜サービス(と野球)を支えるデータ分析基盤 #monotarotech
Py "Baseball" Data入門〜サービス(と野球)を支えるデータ分析基盤 #monotarotechPy "Baseball" Data入門〜サービス(と野球)を支えるデータ分析基盤 #monotarotech
Py "Baseball" Data入門〜サービス(と野球)を支えるデータ分析基盤 #monotarotechShinichi Nakagawa
 
TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기
TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기
TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기Heejong Ahn
 
FizzBuzzではじめるテスト
FizzBuzzではじめるテストFizzBuzzではじめるテスト
FizzBuzzではじめるテストMasashi Shinbara
 
Faster! Faster! Accelerate your business with blazing prototypes
Faster! Faster! Accelerate your business with blazing prototypesFaster! Faster! Accelerate your business with blazing prototypes
Faster! Faster! Accelerate your business with blazing prototypesOSCON Byrum
 
TISMatsuriLT MackerelとZabbix
TISMatsuriLT MackerelとZabbixTISMatsuriLT MackerelとZabbix
TISMatsuriLT MackerelとZabbixDaisuke Ikeda
 
Application Modeling with Graph Databases
Application Modeling with Graph DatabasesApplication Modeling with Graph Databases
Application Modeling with Graph DatabasesJosh Adell
 

Similar to [2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기 (20)

第5回NIPS読み会・関西発表資料
第5回NIPS読み会・関西発表資料第5回NIPS読み会・関西発表資料
第5回NIPS読み会・関西発表資料
 
A3 sec -_regular_expressions
A3 sec -_regular_expressionsA3 sec -_regular_expressions
A3 sec -_regular_expressions
 
Joomla! Day Chicago 2011 - Templating the right way - Jonathan Shroyer
Joomla! Day Chicago 2011 - Templating the right way - Jonathan ShroyerJoomla! Day Chicago 2011 - Templating the right way - Jonathan Shroyer
Joomla! Day Chicago 2011 - Templating the right way - Jonathan Shroyer
 
Python于Web 2.0网站的应用 - QCon Beijing 2010
Python于Web 2.0网站的应用 - QCon Beijing 2010Python于Web 2.0网站的应用 - QCon Beijing 2010
Python于Web 2.0网站的应用 - QCon Beijing 2010
 
Programming Contest Hacks
Programming Contest HacksProgramming Contest Hacks
Programming Contest Hacks
 
[AWS Dev Day] 인공지능 / 기계 학습 | 개발자를 위한 수백만 사용자 대상 기계 학습 서비스 확장 하기 - 윤석찬 AWS 수석테...
[AWS Dev Day] 인공지능 / 기계 학습 | 개발자를 위한 수백만 사용자 대상 기계 학습 서비스 확장 하기 - 윤석찬 AWS 수석테...[AWS Dev Day] 인공지능 / 기계 학습 | 개발자를 위한 수백만 사용자 대상 기계 학습 서비스 확장 하기 - 윤석찬 AWS 수석테...
[AWS Dev Day] 인공지능 / 기계 학습 | 개발자를 위한 수백만 사용자 대상 기계 학습 서비스 확장 하기 - 윤석찬 AWS 수석테...
 
개발자가 Serverless로 운동하는 방법
개발자가 Serverless로 운동하는 방법개발자가 Serverless로 운동하는 방법
개발자가 Serverless로 운동하는 방법
 
Soft Actor Critic 解説
Soft Actor Critic 解説Soft Actor Critic 解説
Soft Actor Critic 解説
 
หัดเขียน A.I. แบบ AlphaGo กันชิวๆ
หัดเขียน A.I. แบบ AlphaGo กันชิวๆหัดเขียน A.I. แบบ AlphaGo กันชิวๆ
หัดเขียน A.I. แบบ AlphaGo กันชิวๆ
 
Key Value Storage Systems ... and Beyond ... with Python
Key Value Storage Systems ... and Beyond ... with PythonKey Value Storage Systems ... and Beyond ... with Python
Key Value Storage Systems ... and Beyond ... with Python
 
How to not blow up spaceships
How to not blow up spaceshipsHow to not blow up spaceships
How to not blow up spaceships
 
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
 
Py "Baseball" Data入門〜サービス(と野球)を支えるデータ分析基盤 #monotarotech
Py "Baseball" Data入門〜サービス(と野球)を支えるデータ分析基盤 #monotarotechPy "Baseball" Data入門〜サービス(と野球)を支えるデータ分析基盤 #monotarotech
Py "Baseball" Data入門〜サービス(と野球)を支えるデータ分析基盤 #monotarotech
 
SWC-2015
SWC-2015SWC-2015
SWC-2015
 
TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기
TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기
TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기
 
FizzBuzzではじめるテスト
FizzBuzzではじめるテストFizzBuzzではじめるテスト
FizzBuzzではじめるテスト
 
Machine Learning at Geeky Base 2
Machine Learning at Geeky Base 2Machine Learning at Geeky Base 2
Machine Learning at Geeky Base 2
 
Faster! Faster! Accelerate your business with blazing prototypes
Faster! Faster! Accelerate your business with blazing prototypesFaster! Faster! Accelerate your business with blazing prototypes
Faster! Faster! Accelerate your business with blazing prototypes
 
TISMatsuriLT MackerelとZabbix
TISMatsuriLT MackerelとZabbixTISMatsuriLT MackerelとZabbix
TISMatsuriLT MackerelとZabbix
 
Application Modeling with Graph Databases
Application Modeling with Graph DatabasesApplication Modeling with Graph Databases
Application Modeling with Graph Databases
 

Recently uploaded

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 

[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기