SlideShare a Scribd company logo
1 of 17
Download to read offline
Value	Iteration	Networks
A.	Tamar,	Y.	Wu,	G.	Thomas,	S.	Levine,	and	P.	Abbeel
Dept.	of	Electrical	Engineering	and	Computer	Sciences,	UC	Berkeley
Presenter:	Keisuke	Fujimoto
(Twitter	@peisuke)
Value	Iteration	Networks
Purpose: Machine	learning	based	robot	path	planning.	This	planner	is	available	in	
new	environment not	included	in	train	data	set.
Strategy: Prediction	of	optimal	action.	The	method	can	learn	rewards	of	each	place	
and	action	to	get	good	rewards.
Result: Planning	in	28	x	28	grid	map,	Applicable	to	continuous	control	robot	
Map
Pose
Velocity
Goal
Action
A.	Tamar,	Y.	Wu,	G.	Thomas,	S.	Levine,	and	P.	Abbeel
Dept.	of	Electrical	Engineering	and	Computer	Sciences,	UC	Berkeley
Presenter:	
Keisuke	Fujimoto
(ABEJA)
Background
Target	:	Autonomous	Robot
• Manipulation	robot,	Navigation	robot,	Transfer	robot
Problem	:	
• Reinforcement	learning	can	not	work	outside	of	training	
environments.	
Goal
Target	object
Manipulation	robot Navigation	robot
Contribution
• Value	Iteration	Networks	(VIN)
• Model	free	training
• It	does	not	require	robot	dynamics	models.
• Generalized	action	prediction	in	new	environments
• It	can	not	work	outside	of	training	environments.
• Key	approach
• Represents	value-iteration	planning	by	CNN
• Prediction	of	reward	map	and	computation	of	sum	
of	future	rewards.
Overview	of	VIN
Input	:	State	of	the	robot	(pose,	velocity),	goal,	map	(left	fig.)
Output	:	Action	(direction,	mortar's	torque)
Strategy	:	Determination	of	optimal	action	using	predicted		
rewards	(right	fig.).
State Rewards
Reward	propagation
• Action	can	be	determined	by	sum	
of	future	reward	generated	using	
reward	propagation
-10 -10 -10
-10 -10 1
-10 -10
Map Reward	from	map
Left	move	action
-10 -10 -9 -10
-10 -10 -9 1 0.9
-10 -10 -9
-10 -10 -10
-10 -10 1 -9
-9 -10 -10 0.9
-9 -9
Up	move	from	map
One-step	propagation	example:
Determination	of	action
• Optimal	action	at	reward	propagated	
place	is	max	reward	action	(middle	fig.)
• Determination	of	optimal	action	using	
propagated	reward	(right	fig.)
Left	move	action
-10 -10 -9 -10
-10 -10 -9 1 0.9
-10 -10 -9
-10 -10 -10
-10 -10 1 -9
-9 -10 -10 0.9
-9 -9
Up	move	from	map -10 -10 -9 -10
-10 -10 -9 1 0.9
-9 -10 -10 0.9
-9 -9
Max
After	Reward	propagation	
-10 -10 -9 -8 -10
-10 -10 -9 1 0.9
-9 -10 -10 0.9 0.8
-8 -9 -9 0.8 0.7
-7 -8 -8 0.7 0.6
Current	robot	pose
Value	Iteration	Module
• Reward	propagation	with	Convolutional	Neural	Network
• Input	is	reward	map	and	output	is	sum	of	feature	reward	map
• Q	is	hidden	reward	map,	V	is	sum	of	feature	reward	map
Output
Convolution
Max
Value	Iteration	Networks
• Deep	Architecture	of	Value	Iteration	Networks
• Input	is	map	and	state,	fR predicts	reward	map
• Attention	modules	crops	the	value	map	around	robot	position
• 𝜓 outputs	optimal	action
Attention	function
• Attention	module	crops	a	subset	of	the	values	around	
current	robot	pose.
• Optimal	pose	have	relative	to	only	current	robot	pose.
• Due	to	this	attention	module,	prediction	of	optimal	
action	becomes	easy.
-10 -10 -9 -8 -10
-10 -10 -9 1 0.9
-9 -10 -10 0.9 0.8
-8 -9 -9 0.8 0.7
-7 -8 -8 0.7 0.6
If	robot	is	here.
-10 0.9 0.8
-9 0.8 0.7
-8 0.7 0.6
Selected	area
Grid-World	Domain
Environment	:
Occupancy	grid	map,	test	size	is	8x8	to	28x28
The	number	of	recurrence	is	20	for	the	28x28	maps
Training	dataset	is	5000	maps,	7	trajectories.
Networks	Arch.	:	
Competitive	method	:
CNN	based	Deep	Q-Network,	Direct	action	prediction	using	FCN	
Map,	Goal
CNN Reward	map VI	module Attention FC	layer
Action
Current	Position
3	layer	net
150	hidden	node 10	channels	in	Q-layer 80	parameters
Results	of	Grid-World	Domain
Predicted	path Reward Sum	of	feature	reward
Mars	Rover	Navigation
Environment	:
• Navigating	the	surface	of	Mars	by	a	rover.
• It	predicts	path	from	only	surface	image	without	obstacle	
information.
• Success	rate	is	90.3%.
Red	point	shows	elevation	sharper,	in	prediction	time,	vin	
does	not	uses	the	elevation	shape	information
Continuous	Control
Environment	:
• Apply	to	continuous	control	space.
• Grid	size	is	28x28
• input	is	position	and	velocity	
which	is	float	data.	
• Output	is	2d	continuous	control	
parameters.
Comparison	about	final	distance	to	the	goal
This	result	is	from	author's	presentation
WebNav	Challenge
Environment	:
• Navigate	website	links	to	find	a	query
• Features:	average	word	embeddings
• Using	an	approximate	graph	for	planning
Evaluation:	
• Success	rate	of	within	top-4	predictions	
• Test	set	1:	start	from	index	page	
• Test	set	2:	start	from	random	page	
Result:
Conclusion
Purpose	:
• Machine	learning	based	robot	path	planning.	
Method	:	
• Learning	rewards	of	each	place	and	predict	action	
using	propagated	reward.
Result	:	
• VIN	policies	learn	an	approximate	planning	
computation	relevant	for	solving	the	task.
• Grid-worlds,	to	continuous	control,	and	even	to	
navigation	of	Wikipedia	links.
Code:
https://github.com/peisuke/vin
This	code	is	implemented	in	chainer!
Twitter:		
@peisuke
We	are	hiring	!!	
https://www.wantedly.com/companies/abeja

More Related Content

What's hot

007 20151214 Deep Unsupervised Learning using Nonequlibrium Thermodynamics
007 20151214 Deep Unsupervised Learning using Nonequlibrium Thermodynamics007 20151214 Deep Unsupervised Learning using Nonequlibrium Thermodynamics
007 20151214 Deep Unsupervised Learning using Nonequlibrium ThermodynamicsHa Phuong
 
「続・わかりやすいパターン認識」 第12章 ディリクレ過程混合モデルによるクラスタリング(前半 : 12.1 )
「続・わかりやすいパターン認識」 第12章 ディリクレ過程混合モデルによるクラスタリング(前半 : 12.1 )「続・わかりやすいパターン認識」 第12章 ディリクレ過程混合モデルによるクラスタリング(前半 : 12.1 )
「続・わかりやすいパターン認識」 第12章 ディリクレ過程混合モデルによるクラスタリング(前半 : 12.1 )aich_08_
 
[DL輪読会]Unsupervised Learning by Predicting Noise
[DL輪読会]Unsupervised Learning by Predicting Noise[DL輪読会]Unsupervised Learning by Predicting Noise
[DL輪読会]Unsupervised Learning by Predicting NoiseDeep Learning JP
 
全脳アーキテクチャ若手の会 機械学習勉強会 ベイジアンネットワーク
全脳アーキテクチャ若手の会 機械学習勉強会 ベイジアンネットワーク全脳アーキテクチャ若手の会 機械学習勉強会 ベイジアンネットワーク
全脳アーキテクチャ若手の会 機械学習勉強会 ベイジアンネットワークErika_Fujita
 
CNNの構造最適化手法について
CNNの構造最適化手法についてCNNの構造最適化手法について
CNNの構造最適化手法についてMasanoriSuganuma
 
深層強化学習の self-playで、複雑な行動を機械に学習させたい!
深層強化学習の self-playで、複雑な行動を機械に学習させたい!深層強化学習の self-playで、複雑な行動を機械に学習させたい!
深層強化学習の self-playで、複雑な行動を機械に学習させたい!Junichiro Katsuta
 
第10回 配信講義 計算科学技術特論A(2021)
第10回 配信講義 計算科学技術特論A(2021)第10回 配信講義 計算科学技術特論A(2021)
第10回 配信講義 計算科学技術特論A(2021)RCCSRENKEI
 
Recommendation algorithm using reinforcement learning
Recommendation algorithm using reinforcement learningRecommendation algorithm using reinforcement learning
Recommendation algorithm using reinforcement learningArithmer Inc.
 
PRML輪読#13
PRML輪読#13PRML輪読#13
PRML輪読#13matsuolab
 
Dynamic Programming and Reinforcement Learning applied to Tetris Game
Dynamic Programming and Reinforcement Learning applied to Tetris GameDynamic Programming and Reinforcement Learning applied to Tetris Game
Dynamic Programming and Reinforcement Learning applied to Tetris GameSuelen Carvalho
 
2013 Matrix Computations 4th.pdf
2013 Matrix Computations 4th.pdf2013 Matrix Computations 4th.pdf
2013 Matrix Computations 4th.pdfssuserfd0175
 
変分推論法(変分ベイズ法)(PRML第10章)
変分推論法(変分ベイズ法)(PRML第10章)変分推論法(変分ベイズ法)(PRML第10章)
変分推論法(変分ベイズ法)(PRML第10章)Takao Yamanaka
 
170120107066 power series.ppt
170120107066 power series.ppt170120107066 power series.ppt
170120107066 power series.pptharsh kothari
 
深層学習 - 画像認識のための深層学習 ②
深層学習 - 画像認識のための深層学習 ②深層学習 - 画像認識のための深層学習 ②
深層学習 - 画像認識のための深層学習 ②Shohei Miyashita
 
DeepLearning 14章 自己符号化器
DeepLearning 14章 自己符号化器DeepLearning 14章 自己符号化器
DeepLearning 14章 自己符号化器hirono kawashima
 
Actor critic algorithm
Actor critic algorithmActor critic algorithm
Actor critic algorithmJie-Han Chen
 
[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...
[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...
[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...Deep Learning JP
 
統計的因果推論 勉強用 isseing333
統計的因果推論 勉強用 isseing333統計的因果推論 勉強用 isseing333
統計的因果推論 勉強用 isseing333Issei Kurahashi
 

What's hot (20)

Interpolation
InterpolationInterpolation
Interpolation
 
007 20151214 Deep Unsupervised Learning using Nonequlibrium Thermodynamics
007 20151214 Deep Unsupervised Learning using Nonequlibrium Thermodynamics007 20151214 Deep Unsupervised Learning using Nonequlibrium Thermodynamics
007 20151214 Deep Unsupervised Learning using Nonequlibrium Thermodynamics
 
「続・わかりやすいパターン認識」 第12章 ディリクレ過程混合モデルによるクラスタリング(前半 : 12.1 )
「続・わかりやすいパターン認識」 第12章 ディリクレ過程混合モデルによるクラスタリング(前半 : 12.1 )「続・わかりやすいパターン認識」 第12章 ディリクレ過程混合モデルによるクラスタリング(前半 : 12.1 )
「続・わかりやすいパターン認識」 第12章 ディリクレ過程混合モデルによるクラスタリング(前半 : 12.1 )
 
[DL輪読会]Unsupervised Learning by Predicting Noise
[DL輪読会]Unsupervised Learning by Predicting Noise[DL輪読会]Unsupervised Learning by Predicting Noise
[DL輪読会]Unsupervised Learning by Predicting Noise
 
全脳アーキテクチャ若手の会 機械学習勉強会 ベイジアンネットワーク
全脳アーキテクチャ若手の会 機械学習勉強会 ベイジアンネットワーク全脳アーキテクチャ若手の会 機械学習勉強会 ベイジアンネットワーク
全脳アーキテクチャ若手の会 機械学習勉強会 ベイジアンネットワーク
 
CNNの構造最適化手法について
CNNの構造最適化手法についてCNNの構造最適化手法について
CNNの構造最適化手法について
 
深層強化学習の self-playで、複雑な行動を機械に学習させたい!
深層強化学習の self-playで、複雑な行動を機械に学習させたい!深層強化学習の self-playで、複雑な行動を機械に学習させたい!
深層強化学習の self-playで、複雑な行動を機械に学習させたい!
 
그림 그리는 AI
그림 그리는 AI그림 그리는 AI
그림 그리는 AI
 
第10回 配信講義 計算科学技術特論A(2021)
第10回 配信講義 計算科学技術特論A(2021)第10回 配信講義 計算科学技術特論A(2021)
第10回 配信講義 計算科学技術特論A(2021)
 
Recommendation algorithm using reinforcement learning
Recommendation algorithm using reinforcement learningRecommendation algorithm using reinforcement learning
Recommendation algorithm using reinforcement learning
 
PRML輪読#13
PRML輪読#13PRML輪読#13
PRML輪読#13
 
Dynamic Programming and Reinforcement Learning applied to Tetris Game
Dynamic Programming and Reinforcement Learning applied to Tetris GameDynamic Programming and Reinforcement Learning applied to Tetris Game
Dynamic Programming and Reinforcement Learning applied to Tetris Game
 
2013 Matrix Computations 4th.pdf
2013 Matrix Computations 4th.pdf2013 Matrix Computations 4th.pdf
2013 Matrix Computations 4th.pdf
 
変分推論法(変分ベイズ法)(PRML第10章)
変分推論法(変分ベイズ法)(PRML第10章)変分推論法(変分ベイズ法)(PRML第10章)
変分推論法(変分ベイズ法)(PRML第10章)
 
170120107066 power series.ppt
170120107066 power series.ppt170120107066 power series.ppt
170120107066 power series.ppt
 
深層学習 - 画像認識のための深層学習 ②
深層学習 - 画像認識のための深層学習 ②深層学習 - 画像認識のための深層学習 ②
深層学習 - 画像認識のための深層学習 ②
 
DeepLearning 14章 自己符号化器
DeepLearning 14章 自己符号化器DeepLearning 14章 自己符号化器
DeepLearning 14章 自己符号化器
 
Actor critic algorithm
Actor critic algorithmActor critic algorithm
Actor critic algorithm
 
[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...
[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...
[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...
 
統計的因果推論 勉強用 isseing333
統計的因果推論 勉強用 isseing333統計的因果推論 勉強用 isseing333
統計的因果推論 勉強用 isseing333
 

Viewers also liked

Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsKen Kuroki
 
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoderssuga93
 
Fast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-MeansFast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-MeansKimikazu Kato
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Toru Fujino
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...Shuhei Yoshida
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learningmooopan
 
Introduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmIntroduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmKatsuki Ohto
 
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Kazuto Fukuchi
 
Learning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentLearning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentHiroyuki Fukuda
 
時系列データ3
時系列データ3時系列データ3
時系列データ3graySpace999
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowTatsuya Shirakawa
 
[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence LearningDeep Learning JP
 
NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics  NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics Koichi Hamada
 
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...Kusano Hitoshi
 
Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Kentaro Minami
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learningKazuki Fujikawa
 
ICML2016読み会 概要紹介
ICML2016読み会 概要紹介ICML2016読み会 概要紹介
ICML2016読み会 概要紹介Kohei Hayashi
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural NetworksSeiya Tokui
 

Viewers also liked (18)

Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and Physics
 
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoders
 
Fast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-MeansFast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-Means
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learning
 
Introduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmIntroduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithm
 
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
 
Learning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentLearning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descent
 
時系列データ3
時系列データ3時系列データ3
時系列データ3
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive Flow
 
[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning
 
NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics  NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics
 
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
 
Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learning
 
ICML2016読み会 概要紹介
ICML2016読み会 概要紹介ICML2016読み会 概要紹介
ICML2016読み会 概要紹介
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks
 

Similar to Value iteration networks

自然方策勾配法の基礎と応用
自然方策勾配法の基礎と応用自然方策勾配法の基礎と応用
自然方策勾配法の基礎と応用Ryo Iwaki
 
Rapid motor adaptation for legged robots
Rapid motor adaptation for legged robotsRapid motor adaptation for legged robots
Rapid motor adaptation for legged robotsRohit Choudhury
 
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...Neo4j
 
Artificial Neural Network based Mobile Robot Navigation
Artificial Neural Network based Mobile Robot NavigationArtificial Neural Network based Mobile Robot Navigation
Artificial Neural Network based Mobile Robot NavigationMithun Chowdhury
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesisSeoung-Ho Choi
 
[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networks[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networksJehong Lee
 
Combinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learningCombinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learning민재 정
 
crowd-robot interaction: crowd-aware robot navigation with attention-based DRL
crowd-robot interaction: crowd-aware robot navigation with attention-based DRLcrowd-robot interaction: crowd-aware robot navigation with attention-based DRL
crowd-robot interaction: crowd-aware robot navigation with attention-based DRL민재 정
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigationguest90654fd
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigationguest90654fd
 
State Representation Learning for control: an overview
State Representation Learning for control: an overviewState Representation Learning for control: an overview
State Representation Learning for control: an overviewNatalia Díaz Rodríguez
 
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahulKirtoniya
 
Analysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approachAnalysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approachLorenzo Cesaretti
 
Road Network Extraction using Satellite Imagery.
Road Network Extraction using Satellite Imagery.Road Network Extraction using Satellite Imagery.
Road Network Extraction using Satellite Imagery.SUMITRAJ312049
 
Farkhatdinov Robotics education for children 2017 Accepted.pdf
Farkhatdinov Robotics education for children 2017 Accepted.pdfFarkhatdinov Robotics education for children 2017 Accepted.pdf
Farkhatdinov Robotics education for children 2017 Accepted.pdfMonesseKHAMISSIA1
 
Procedural modeling using autoencoder networks
Procedural modeling using autoencoder networksProcedural modeling using autoencoder networks
Procedural modeling using autoencoder networksShuhei Iitsuka
 
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...AutonomyIncubator
 

Similar to Value iteration networks (20)

自然方策勾配法の基礎と応用
自然方策勾配法の基礎と応用自然方策勾配法の基礎と応用
自然方策勾配法の基礎と応用
 
Rapid motor adaptation for legged robots
Rapid motor adaptation for legged robotsRapid motor adaptation for legged robots
Rapid motor adaptation for legged robots
 
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
 
Artificial Neural Network based Mobile Robot Navigation
Artificial Neural Network based Mobile Robot NavigationArtificial Neural Network based Mobile Robot Navigation
Artificial Neural Network based Mobile Robot Navigation
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesis
 
[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networks[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networks
 
Introduction to Deep Reinforcement Learning
Introduction to Deep Reinforcement LearningIntroduction to Deep Reinforcement Learning
Introduction to Deep Reinforcement Learning
 
Combinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learningCombinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learning
 
crowd-robot interaction: crowd-aware robot navigation with attention-based DRL
crowd-robot interaction: crowd-aware robot navigation with attention-based DRLcrowd-robot interaction: crowd-aware robot navigation with attention-based DRL
crowd-robot interaction: crowd-aware robot navigation with attention-based DRL
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigation
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigation
 
State Representation Learning for control: an overview
State Representation Learning for control: an overviewState Representation Learning for control: an overview
State Representation Learning for control: an overview
 
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
 
Analysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approachAnalysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approach
 
Road Network Extraction using Satellite Imagery.
Road Network Extraction using Satellite Imagery.Road Network Extraction using Satellite Imagery.
Road Network Extraction using Satellite Imagery.
 
Farkhatdinov Robotics education for children 2017 Accepted.pdf
Farkhatdinov Robotics education for children 2017 Accepted.pdfFarkhatdinov Robotics education for children 2017 Accepted.pdf
Farkhatdinov Robotics education for children 2017 Accepted.pdf
 
Procedural modeling using autoencoder networks
Procedural modeling using autoencoder networksProcedural modeling using autoencoder networks
Procedural modeling using autoencoder networks
 
Conv xg
Conv xgConv xg
Conv xg
 
Resume_2016
Resume_2016Resume_2016
Resume_2016
 
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
 

More from Fujimoto Keisuke

A quantum computational approach to correspondence problems on point sets
A quantum computational approach to correspondence problems on point setsA quantum computational approach to correspondence problems on point sets
A quantum computational approach to correspondence problems on point setsFujimoto Keisuke
 
F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...
F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...
F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...Fujimoto Keisuke
 
YOLACT real-time instance segmentation
YOLACT real-time instance segmentationYOLACT real-time instance segmentation
YOLACT real-time instance segmentationFujimoto Keisuke
 
Product Managerの役割、周辺ロールとの差異
Product Managerの役割、周辺ロールとの差異Product Managerの役割、周辺ロールとの差異
Product Managerの役割、周辺ロールとの差異Fujimoto Keisuke
 
ChainerRLで株売買を結構頑張ってみた(後編)
ChainerRLで株売買を結構頑張ってみた(後編)ChainerRLで株売買を結構頑張ってみた(後編)
ChainerRLで株売買を結構頑張ってみた(後編)Fujimoto Keisuke
 
Temporal Cycle Consistency Learning
Temporal Cycle Consistency LearningTemporal Cycle Consistency Learning
Temporal Cycle Consistency LearningFujimoto Keisuke
 
20190414 Point Cloud Reconstruction Survey
20190414 Point Cloud Reconstruction Survey20190414 Point Cloud Reconstruction Survey
20190414 Point Cloud Reconstruction SurveyFujimoto Keisuke
 
20180925 CV勉強会 SfM解説
20180925 CV勉強会 SfM解説20180925 CV勉強会 SfM解説
20180925 CV勉強会 SfM解説Fujimoto Keisuke
 
Sliced Wasserstein Distance for Learning Gaussian Mixture Models
Sliced Wasserstein Distance for Learning Gaussian Mixture ModelsSliced Wasserstein Distance for Learning Gaussian Mixture Models
Sliced Wasserstein Distance for Learning Gaussian Mixture ModelsFujimoto Keisuke
 
LiDAR-SLAM チュートリアル資料
LiDAR-SLAM チュートリアル資料LiDAR-SLAM チュートリアル資料
LiDAR-SLAM チュートリアル資料Fujimoto Keisuke
 
Stock trading using ChainerRL
Stock trading using ChainerRLStock trading using ChainerRL
Stock trading using ChainerRLFujimoto Keisuke
 
Cold-Start Reinforcement Learning with Softmax Policy Gradient
Cold-Start Reinforcement Learning with Softmax Policy GradientCold-Start Reinforcement Learning with Softmax Policy Gradient
Cold-Start Reinforcement Learning with Softmax Policy GradientFujimoto Keisuke
 
Representation learning by learning to count
Representation learning by learning to countRepresentation learning by learning to count
Representation learning by learning to countFujimoto Keisuke
 
Dynamic Routing Between Capsules
Dynamic Routing Between CapsulesDynamic Routing Between Capsules
Dynamic Routing Between CapsulesFujimoto Keisuke
 
Deep Learning Framework Comparison on CPU
Deep Learning Framework Comparison on CPUDeep Learning Framework Comparison on CPU
Deep Learning Framework Comparison on CPUFujimoto Keisuke
 
Global optimality in neural network training
Global optimality in neural network trainingGlobal optimality in neural network training
Global optimality in neural network trainingFujimoto Keisuke
 

More from Fujimoto Keisuke (20)

A quantum computational approach to correspondence problems on point sets
A quantum computational approach to correspondence problems on point setsA quantum computational approach to correspondence problems on point sets
A quantum computational approach to correspondence problems on point sets
 
F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...
F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...
F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...
 
YOLACT real-time instance segmentation
YOLACT real-time instance segmentationYOLACT real-time instance segmentation
YOLACT real-time instance segmentation
 
Product Managerの役割、周辺ロールとの差異
Product Managerの役割、周辺ロールとの差異Product Managerの役割、周辺ロールとの差異
Product Managerの役割、周辺ロールとの差異
 
ChainerRLで株売買を結構頑張ってみた(後編)
ChainerRLで株売買を結構頑張ってみた(後編)ChainerRLで株売買を結構頑張ってみた(後編)
ChainerRLで株売買を結構頑張ってみた(後編)
 
Temporal Cycle Consistency Learning
Temporal Cycle Consistency LearningTemporal Cycle Consistency Learning
Temporal Cycle Consistency Learning
 
ML@Loft
ML@LoftML@Loft
ML@Loft
 
20190414 Point Cloud Reconstruction Survey
20190414 Point Cloud Reconstruction Survey20190414 Point Cloud Reconstruction Survey
20190414 Point Cloud Reconstruction Survey
 
Chainer meetup 9
Chainer meetup 9Chainer meetup 9
Chainer meetup 9
 
20180925 CV勉強会 SfM解説
20180925 CV勉強会 SfM解説20180925 CV勉強会 SfM解説
20180925 CV勉強会 SfM解説
 
Sliced Wasserstein Distance for Learning Gaussian Mixture Models
Sliced Wasserstein Distance for Learning Gaussian Mixture ModelsSliced Wasserstein Distance for Learning Gaussian Mixture Models
Sliced Wasserstein Distance for Learning Gaussian Mixture Models
 
LiDAR-SLAM チュートリアル資料
LiDAR-SLAM チュートリアル資料LiDAR-SLAM チュートリアル資料
LiDAR-SLAM チュートリアル資料
 
Stock trading using ChainerRL
Stock trading using ChainerRLStock trading using ChainerRL
Stock trading using ChainerRL
 
Cold-Start Reinforcement Learning with Softmax Policy Gradient
Cold-Start Reinforcement Learning with Softmax Policy GradientCold-Start Reinforcement Learning with Softmax Policy Gradient
Cold-Start Reinforcement Learning with Softmax Policy Gradient
 
Representation learning by learning to count
Representation learning by learning to countRepresentation learning by learning to count
Representation learning by learning to count
 
Dynamic Routing Between Capsules
Dynamic Routing Between CapsulesDynamic Routing Between Capsules
Dynamic Routing Between Capsules
 
Deep Learning Framework Comparison on CPU
Deep Learning Framework Comparison on CPUDeep Learning Framework Comparison on CPU
Deep Learning Framework Comparison on CPU
 
ICCV2017一人読み会
ICCV2017一人読み会ICCV2017一人読み会
ICCV2017一人読み会
 
Global optimality in neural network training
Global optimality in neural network trainingGlobal optimality in neural network training
Global optimality in neural network training
 
CVPR2017 oral survey
CVPR2017 oral surveyCVPR2017 oral survey
CVPR2017 oral survey
 

Recently uploaded

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Recently uploaded (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

Value iteration networks