Dynamic Boltzmann Machines and Pommerman
Takayuki Osogami
Slides presented at the Whole Brain Architecture study meeting (全脳アーキテクチャ勉強会), July 19, 2019
1.
Dynamic Boltzmann Machines and Pommerman. Takayuki Osogami, IBM Research - Tokyo. © 2019 IBM Corporation
2.
Takayuki Osogami (@TOsogami)
1998: Joined IBM Japan; assigned to the Tokyo Research Laboratory
2005: Ph.D. (Computer Science Department, Carnegie Mellon University)
2013-19: Principal co-investigator on a JST CREST project
2015: Member of the IBM Academy of Technology
2019: IBM Senior Technical Staff Member
Currently: steering committee member of the Advanced Innovation powered by Mathematics Platform (AIMaP); member of the joint usage/research committee of the joint research center for advanced and fundamental industrial mathematics (産業数学の先進的・基礎的共同研究拠点); active in artificial intelligence and machine learning societies
Interests: probabilistic models, sequential decision making, reinforcement learning
3.
From basic research to business innovation: the work of the mathematical sciences department at IBM Research - Tokyo
Basic research awards:
• JSAI Annual Conference Award (2004, 2006, 2015, 2017)
• IBIS Workshop Best Presentation Award (2015)
• Queueing research group paper award (2015)
• Academic books
Business innovation:
• ORSJ Implementation Award (2003)
• ICDM Data Mining Contest winner (2007)
• PDOS: optimization of manufacturing processes
• ORSJ Literature Award, Encouragement Prize (2010)
• ANACONDA: anomaly detection from sensor data
• Finance trend predictor: forecasting financial markets
• NeurIPS Pommerman Competition winner (2018)
(Image courtesy of worradmu at FreeDigitalPhotos.net)
4.
Dynamic Boltzmann machine (DyBM): from scientific contributions to business innovations. Publication in a Nature journal (2015); business innovation (2018).
5.
How can we make effective use of spike-timing dependent plasticity (STDP) in artificial neural networks?
• Hebb's rule ('49): cells that fire together, wire together.
• STDP ('90s): the amount of change depends on the timing of spikes (Bi & Poo 1998; Dan & Poo 2006).
• How does this relate to today's artificial neural networks? [Nessler et al. 2013, Bengio et al. 2016, Scellier & Bengio 2016]
6.
DyBM provides theoretical underpinnings for STDP, similar to what the Boltzmann machine provides for Hebb's rule.
• Boltzmann machine + maximum-likelihood estimation (MLE) derives Hebb's rule ("cells that fire together, wire together").
• Refining the Boltzmann machine into the dynamic Boltzmann machine, MLE likewise derives spike-timing dependent plasticity (Bi & Poo 1998; Dan & Poo 2006).
7.
Learning rule of the Boltzmann machine, maximizing log-likelihood [Hinton et al. '83]
Neurons i and j are connected by a synapse with weight w_{i,j}. The log-likelihood of the training data D is the sum of log P_θ(x) over x ∈ D. Its stochastic gradient gives the update

  Δw_{i,j} ∝ x_i x_j − E_θ[X_i X_j],

the observed co-activation of the two neurons minus its expected value under the model (cf. Hebb's rule).
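To make the update concrete, here is a minimal NumPy sketch, not taken from the slides, of one gradient step for a small fully visible Boltzmann machine: each weight grows by the co-activation observed in the data minus the co-activation expected under the model, which is the Hebbian flavor of the rule. Model expectations are computed by brute-force enumeration, so the sketch is only meant for a handful of neurons.

import itertools
import numpy as np

def bm_gradient_step(W, b, data, lr=0.01):
    # W: symmetric float weight matrix with zero diagonal; b: float bias vector
    # data: (num_samples, n) array of 0/1 training patterns
    n = len(b)
    states = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)

    # Unnormalized log-probability of every state, then the model distribution
    logits = states @ b + 0.5 * np.einsum('si,ij,sj->s', states, W, states)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Data statistics: how often pairs of neurons fire together in the data
    data_corr = data.T @ data / len(data)
    data_mean = data.mean(axis=0)
    # Model statistics: expected co-activations under the current model
    model_corr = np.einsum('s,si,sj->ij', probs, states, states)
    model_mean = probs @ states

    # Hebbian-looking update: observed co-activation minus expected co-activation
    W += lr * (data_corr - model_corr)
    np.fill_diagonal(W, 0.0)
    b += lr * (data_mean - model_mean)
    return W, b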
8.
(Illustration of a pre-synaptic neuron, a synapse, and a post-synaptic neuron. Image courtesy of dream designs at FreeDigitalPhotos.net.)
9.
Spike-timing dependent plasticity (STDP): the amount of change depends on the timing of spikes (Bi & Poo 1998; Dan & Poo 2006).
• If the pre-synaptic neuron fires shortly before the post-synaptic neuron, the synapse is strengthened (long-term potentiation, LTP).
• If it fires shortly after, the synapse is weakened (long-term depression, LTD).
10.
Dynamic Boltzmann machine as the limit of a sequence of Boltzmann machines
Unroll a Boltzmann machine over time: the next value is predicted from historical values, with a weight from each neuron at an earlier time step to each neuron at the current time step. This amounts to learning a Boltzmann machine for a T-th order Markov model, and the dynamic Boltzmann machine is obtained in the limit where the order grows without bound.
11.
Inference with the dynamic Boltzmann machine (LTP terms only)
Each connection from a pre-synaptic neuron i to a post-synaptic neuron j has a conduction delay d_{i,j}. The synaptic eligibility trace α_{i,j}[t] sums the spikes of neuron i that have already traversed the delay, discounted geometrically, and can be maintained recursively:

  α_{i,j}[t] = λ α_{i,j}[t−1] + x_i[t − d_{i,j}]

The probability that neuron j fires at time t is a logistic function of its bias plus the eligibility traces weighted by the LTP weights:

  P(x_j[t] = 1 | history) = sigmoid( b_j + Σ_i u_{i,j} α_{i,j}[t−1] )

(A single decay rate λ is shown here; the full model uses several eligibility traces with different decay rates.)
12.
Learning with DyBM, maximizing log-likelihood
With the same conduction delays and synaptic eligibility traces as above, the stochastic gradient of the log-likelihood gives the update for the LTP weight

  Δu_{i,j} ∝ ( x_j[t] − P(x_j[t] = 1) ) α_{i,j}[t−1],

the prediction error of the post-synaptic neuron multiplied by the eligibility trace of the pre-synaptic neuron, which records how recently and how often spikes from neuron i have reached the synapse. The update is therefore spike-timing dependent (cf. the Boltzmann machine, whose update depends only on co-activation).
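The two slides above can be condensed into a short NumPy sketch. The parameter names, the single decay rate, and the shared delay are simplifications made for illustration and do not reproduce the exact published DyBM formulation: each pre-synaptic eligibility trace decays geometrically and accumulates spikes arriving through the conduction delay, the firing probability is a logistic function of the weighted traces, and each LTP weight moves by the prediction error times the pre-synaptic trace, so the per-step cost does not depend on the length of the series.

import numpy as np
from collections import deque

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_dybm_ltp(spikes, delay=2, decay=0.5, lr=0.1):
    # spikes: (T, N) array of 0/1 spike patterns; returns learned weights and biases
    T, N = spikes.shape
    W = np.zeros((N, N))      # LTP weight from pre-synaptic i to post-synaptic j
    b = np.zeros(N)           # biases
    trace = np.zeros(N)       # synaptic eligibility trace per pre-synaptic neuron
    fifo = deque([np.zeros(N)] * delay, maxlen=delay)   # conduction-delay line

    for t in range(T):
        p = sigmoid(b + trace @ W)        # firing probabilities from local quantities only
        x = spikes[t].astype(float)
        W += lr * np.outer(trace, x - p)  # prediction error times pre-synaptic trace
        b += lr * (x - p)
        arrived = fifo.popleft()          # spikes that have just traversed the delay
        trace = decay * trace + arrived   # geometric decay of the eligibility trace
        fifo.append(x)
    return W, b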
13.
No backpropagation through time in DyBM's learning
Every quantity in the update (the eligibility traces and the firing probabilities, with the summation taken over the pre-synaptic neurons connected to each unit) is available locally at each step, so the per-step learning time is independent of the length of the time series: the learning rule is local in time and space. By contrast, recurrent neural networks (including LSTM) need backpropagation through time.
14.
Online learning can also improve predictive accuracy for non-stationary data

Predictive accuracy*    Training   Test
Batch                   0.932      0.863
Online                  0.980      0.958

Batch: train the DyBM optimally, then test with the parameters fixed.
Online: train the DyBM optimally, continue learning online, and keep learning while testing.
*Predictive accuracy is the coefficient of correlation between predicted and realized values of sensor data from a power generator; the plot on the slide instead shows IBM stock prices from Yahoo! Finance.
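The two protocols compared in the table can be sketched as follows; the model object and its predict_next/update methods are hypothetical placeholders, not part of the DyBM library, and the sketch only illustrates that the online protocol keeps adjusting the parameters while testing.

def run_test(model, test_series, online=False):
    # Batch protocol: predict each value with fixed parameters (learn=False).
    # Online protocol: after each prediction, also learn from the realized value.
    predictions = []
    for x in test_series:
        predictions.append(model.predict_next())  # one-step-ahead prediction
        model.update(x, learn=online)             # feed the realized value; optionally learn
    return predictions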
15.
DyBM provides theoretical underpinnings for STDP
Timeline: Hebb's rule ('49) motivated early artificial neural networks such as the perceptron ('58), which initially failed; the Hopfield network ('82) and the Boltzmann machine ('83) later gave them theoretical underpinnings, followed by the success of deep learning. Analogously, STDP ('90s) receives theoretical underpinnings from the dynamic Boltzmann machine, pointing toward successful applications.
16.
Extensions of DyBM
• To structured time-series: T. Osogami, R. Raymond, A. Goel, T. Shirai, and T. Maehara, "Dynamic determinantal point processes," AAAI-18.
• To real-valued time-series: S. Dasgupta and T. Osogami, "Nonlinear dynamic Boltzmann machines for time series prediction," AAAI-17.
• To models with hidden units: T. Osogami, H. Kajino, and T. Sekiyama, "Bidirectional learning for time-series models with hidden units," ICML 2017.
• To continuous space: H. Kajino, "A functional dynamic Boltzmann machine," IJCAI-17.
17.
References
• 恐神貴行, ボルツマンマシン, コロナ社, 2019 (T. Osogami, Boltzmann Machines, Corona Publishing, 2019, in Japanese)
• T. Osogami and M. Otsuka, "Seven neurons memorizing sequences of alphabetical images via spike-timing dependent plasticity," Scientific Reports 5, 14149 (2015). www.nature.com/articles/srep14149
• T. Osogami and S. Dasgupta, "Energy-based machine learning," IJCAI-17 tutorial. researcher.watson.ibm.com/researcher/view_group.php?id=7834
• github.com/ibm-research-tokyo/dybm
18.
We won the NeurIPS 2018 Pommerman Competition.
19.
Pommerman is beyond the reach of today's AI technology
What makes Pommerman hard:
• Real-time decision making
• Coordination among multiple agents
• Partial observability
• Long-horizon planning
AI conferences turn hard problems like this into competitions in order to drive the technology forward.
(Video: IBM agent (red) vs. default agents (blue).)
20.
Sequential decision making: actions are chosen one after another so that the final goal is eventually achieved.
What should the agent do in order to win? Destroy walls, collect items, corner the enemy, and finally win.
21.
Approaches to sequential decision-making problems
When the environment is known, planning; when the environment is unknown, reinforcement learning. In Pommerman the environment can be simulated, but the other agents' moves are unknown and part of the state is unobservable.
22.
When the environment can be simulated, tree search is effective.
(The search tree branches over joint actions such as (bomb, right, bomb, up), (left, right, right, up), ...)
23.
Pommerman requires real-time decisions over an enormous search tree: the branching factor per step is large, at least 10 moves ahead (the lifetime of a bomb) must be considered, and each decision must be made within 0.1 seconds.
24.
New technique: real-time tree search with pessimistic scenarios
T. Osogami and T. Takahashi, "Real-time tree search with pessimistic scenarios," arXiv:1902.10870.
Tree search over stochastic scenarios is combined with evaluation under deterministic, pessimistic scenarios.
25.
In Pommerman, pessimistic scenarios can be created by letting the opponents take several actions simultaneously.
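As a rough illustration of how such a search might look, here is a hypothetical depth-limited sketch in which leaf values are computed under a deterministic, pessimistic opponent model that lets each opponent apply several of its candidate actions at once. The simulate, evaluate, and actions helpers and the pessimism parameter are stand-ins for the game simulator, not the implementation described in arXiv:1902.10870.

def pessimistic_search(state, depth, pessimism, actions, simulate, evaluate):
    # state     -- current simulated game state (hypothetical object)
    # depth     -- remaining search depth
    # pessimism -- how many candidate actions each opponent may apply per step
    # actions, simulate, evaluate -- stand-ins for the game simulator and heuristic
    if depth == 0 or state.is_terminal():
        return evaluate(state), None

    best_value, best_action = float('-inf'), None
    for my_action in actions(state, agent='me'):
        # Pessimistic scenario: every opponent applies `pessimism` of its candidate
        # actions at once, instead of a single sampled action.
        opponent_moves = {opp: actions(state, agent=opp)[:pessimism]
                          for opp in state.opponents()}
        child = simulate(state, my_action, opponent_moves)
        value, _ = pessimistic_search(child, depth - 1, pessimism,
                                      actions, simulate, evaluate)
        if value > best_value:
            best_value, best_action = value, my_action
    return best_value, best_action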
26.
The optimal degree of pessimism was learned through self-play (comparing pessimism degrees 0, 1, 2, and 3).
27.
The number of cells an agent can move to represents the strength of its "survivability".
A good action:
- raises our own and our teammate's survivability,
- lowers the enemies' survivability,
- keeps survivability above a threshold while collecting items.
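The slide does not spell out how this feature is computed; a natural, purely hypothetical reading is a breadth-first count of the cells reachable from the agent's position within the planning horizon:

from collections import deque

def survivability(board, start, passable, horizon=10):
    # board    -- 2D grid of cells
    # start    -- (row, col) position of the agent
    # passable -- predicate telling whether a cell can be entered
    # A hypothetical stand-in for the "survivability" feature on the slide.
    rows, cols = len(board), len(board[0])
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        (r, c), dist = frontier.popleft()
        if dist == horizon:
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and nxt not in seen and passable(board, nxt)):
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return len(seen)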
28.
Potential applications of tree search with pessimistic scenarios
• Games: debugging, in-game characters
• Video and simulation
• Autonomous flight and driving
29.
To try running Pommerman:
$ git clone https://github.com/MultiAgentLearning/playground.git
$ cd playground
$ pip install -r requirements.txt
$ python examples/simple_ffa_run.py
For details, see https://github.com/MultiAgentLearning/playground/tree/master/docs
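The example script launches a free-for-all match between scripted agents; a minimal driver along the same lines, based on the playground repository's documented usage around 2019 (environment and class names may have changed since), looks roughly like this:

import pommerman
from pommerman import agents

# Four scripted agents in a free-for-all match
agent_list = [agents.SimpleAgent(), agents.RandomAgent(),
              agents.SimpleAgent(), agents.RandomAgent()]
env = pommerman.make('PommeFFACompetition-v0', agent_list)

state = env.reset()
done = False
while not done:
    env.render()                                   # draw the board
    actions = env.act(state)                       # each agent chooses its action
    state, reward, done, info = env.step(actions)  # advance the game one step
env.close()
print(info)                                        # winner information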
30.
A Pommerman competition will also be held at NeurIPS 2019: one track with the same rules as last year and one with a new rule that allows communication between agents. For details, see https://www.pommerman.com/competitions
31.
A winning agent emerged from competing while cooperating: team members shared information (ideas, methods, and what worked well) while each built their own agent to win.
32.
Pommerman summary
Tree search with pessimistic scenarios is effective for real-time sequential decision making in situations that demand a high degree of safety, and it has further potential applications.
33.
Dynamic Boltzmann Machines and Pommerman. Takayuki Osogami, IBM Research - Tokyo. Thank you.