IaGo: an Othello AI
inspired by AlphaGo
Shion HONDA
@DSP
Overview
• I implemented an Othello AI (IaGo) inspired
by the AlphaGo algorithm
• AlphaGo is composed of 3 parts:
• SL policy network: predict next action
• Value network: evaluate board state
• MCTS: choose action using 2 networks
Background
Game      Search space   AI                 Year
Othello   10^60          NEC Logistello     1997
Go        10^360         DeepMind AlphaGo   2016
• Go has an extremely large search space: 10^360
• cf. estimated number of all atoms in the
universe: 10^80
• Before AlphaGo, it was thought that Go AIs would
need 10 more years to beat human
professionals because of this huge search space
• Since I don’t have enough machine resources
to replicate AlphaGo, I made an Othello
version
Dataset
(Figure: a board state and the place of the next stone; 6 million -> 48 million sets)
• Data were from online Othello game records
• 6 million sets of board state & the place of
next stone
• Augmented them 8-fold using rotation &
transposition symmetry
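The 8-fold augmentation can be sketched as follows. This is a minimal illustration, not the code used for IaGo: it assumes a board is a square list of lists and a move is a (row, col) pair, and the helper names (`rotate90`, `transpose`, `symmetries`, `augment`) are hypothetical. The same symmetry is applied to the board and to the position of the next stone so that each pair stays consistent.

```python
def rotate90(board):
    """Rotate a square board 90 degrees clockwise."""
    return [list(row) for row in zip(*board[::-1])]


def transpose(board):
    """Swap rows and columns."""
    return [list(row) for row in zip(*board)]


def symmetries(board):
    """Return 8 boards: the 4 rotations, each with and without transposition."""
    out = []
    current = [list(row) for row in board]
    for _ in range(4):
        out.append(current)
        out.append(transpose(current))
        current = rotate90(current)
    return out


def augment(board, move):
    """Apply the same 8 symmetries to a board and its next-move position."""
    n = len(board)
    # Mark the move on an empty grid so it can be transformed like the board.
    target = [[0] * n for _ in range(n)]
    target[move[0]][move[1]] = 1
    pairs = []
    for b, t in zip(symmetries(board), symmetries(target)):
        r = next(i for i, row in enumerate(t) if 1 in row)
        pairs.append((b, (r, t[r].index(1))))
    return pairs
```

Applied to every record, this turns 6 million (board, move) sets into 48 million.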
SL policy network (classification)
• Input: 2-ch matrices of board state
• Output: Probability distribution of next choice
• Network: 9 layers of convolution with
softmax output layer
• 57% accuracy of prediction
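The input and output formats can be illustrated with a small sketch. The slides do not say what each of the 2 channels holds, so the split below (the player's stones, the opponent's stones) and the cell encoding are assumptions, and the convolution layers themselves are omitted. `softmax` shows how raw per-square scores become a probability distribution over the next choice.

```python
import math

EMPTY, BLACK, WHITE = 0, 1, -1  # hypothetical cell encoding


def encode(board, player):
    """Return 2 binary planes: the player's stones and the opponent's stones."""
    own = [[1 if cell == player else 0 for cell in row] for row in board]
    opp = [[1 if cell == -player else 0 for cell in row] for row in board]
    return [own, opp]


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```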
RL policy network
• Refined the SL policy with policy gradients
-> Reinforcement Learning (RL) policy network
• After training, generated teacher data for
the value network
• Played games between RL policy networks
-> 1.25 million sets of board state and result
• Augmented 8-fold -> 10 million
(Figure: SL policy network vs. SL policy network (opponent); WIN -> encourage its plays, LOSE -> discourage its plays; repeated 32*400 = 12,800 times)
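The "encourage on win, discourage on loss" rule corresponds to a REINFORCE-style policy gradient step. The sketch below is a toy under stated assumptions: the policy is a plain list of logits for a single state rather than a convolutional network, and the learning rate `lr` and the function name `reinforce_update` are hypothetical. The slides do not give the exact update, so this only shows the general technique.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, action, reward, lr=0.1):
    """One REINFORCE step for a softmax policy over `logits`.

    d log pi(action) / d logit_k = 1[k == action] - pi(k), so a win
    (reward +1) raises the probability of the played action and a loss
    (reward -1) lowers it.
    """
    probs = softmax(logits)
    return [
        x + lr * reward * ((1.0 if k == action else 0.0) - p)
        for k, (x, p) in enumerate(zip(logits, probs))
    ]
```

In a full game, the same final reward would be applied to every move the learning side played in that game.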
Value network (regression)
• Input: 2-ch matrices of board state
• Output: Value of the board state
(Win: +1, Lose: -1, Draw: 0)
• Network: 9 layers of convolution (similar to
the SL policy network)
(Figure: prediction examples)
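The regression targets can be sketched as below. The value is taken from the white player's point of view, as described in the speaker notes. Whether every state of a game or only a sample was kept is not stated in the slides, so labeling every state is an assumption, as is the use of mean squared error as the loss. All function names are hypothetical.

```python
def game_result(white_discs, black_discs):
    """Return +1 if white wins, -1 if white loses, 0 for a draw."""
    if white_discs > black_discs:
        return 1
    if white_discs < black_discs:
        return -1
    return 0


def label_states(states, white_discs, black_discs):
    """Pair every recorded board state of one game with that game's result."""
    z = game_result(white_discs, black_discs)
    return [(state, z) for state in states]


def mse(predictions, targets):
    """Mean squared error, a common regression loss (an assumption here)."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
```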
Monte Carlo tree search
• Rollout policy: simplified SL policy network
that works faster
• MCTS: search deeper for a good path
1. Make child nodes with the SL policy network
2. Evaluate the current node with the value network and the result of rollout-policy self-play
3. Update the ancestor nodes’ values
4. Choose the most visited node
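The four steps can be sketched on a toy game. Everything here is a stand-in chosen to keep the example runnable: the game is a stone-taking game (take 1 to 3 stones, whoever takes the last stone wins) instead of Othello, `prior` is a uniform distribution instead of the SL policy network, `value_net` always returns 0, and `rollout` plays random moves instead of using a rollout policy. The selection rule with constant `c` and the mixing weight `mix` follow the general style of AlphaGo but are assumptions; the slides do not give them. `search` expects at least one stone.

```python
import math
import random


def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]


def prior(stones):
    """Stand-in for the SL policy network: uniform over legal moves."""
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}


def value_net(stones):
    """Stand-in for the value network: always neutral."""
    return 0.0


def rollout(stones, rng):
    """Random self-play to the end; +1 if the player to move wins, else -1."""
    sign = 1
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return sign
        sign = -sign


class Node:
    def __init__(self, stones, p=1.0):
        self.stones = stones  # stones left in this position
        self.p = p            # prior probability of the move leading here
        self.n = 0            # visit count
        self.w = 0.0          # total value, seen by the player who moved here
        self.children = {}


def search(root_stones, simulations=2000, c=1.5, mix=0.5, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(simulations):
        node, path = root, [root]
        # Descend: mean value plus a prior-weighted exploration bonus.
        while node.children:
            parent = node
            node = max(
                parent.children.values(),
                key=lambda ch: (ch.w / ch.n if ch.n else 0.0)
                + c * ch.p * math.sqrt(parent.n + 1) / (1 + ch.n),
            )
            path.append(node)
        if node.stones > 0:
            # Step 1: make child nodes using the policy prior.
            for move, p in prior(node.stones).items():
                node.children[move] = Node(node.stones - move, p)
            # Step 2: evaluate with the value network and a rollout result.
            v = (1 - mix) * value_net(node.stones) + mix * rollout(node.stones, rng)
        else:
            v = -1.0  # no stones left: the player to move has already lost
        # Step 3: update ancestors, flipping the sign at each level.
        for visited in reversed(path):
            visited.n += 1
            visited.w += -v
            v = -v
    # Step 4: choose the most visited move.
    return max(root.children, key=lambda m: root.children[m].n)
```

For example, `search(3)` should settle on taking all 3 stones, since that move ends the game with a win.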
Results
• IaGo (complete) beat the plain SL policy network in
approx. 90% of games!
• Still, there is room for improvement…
• Computation takes too long
• IaGo seems to have a weak point
• Teacher data were from games
between amateurs
• Objective/quantitative evaluation is
needed
• Graphical User Interface
-> Upload to web!
Summary
• IaGo is composed of 3 parts:
• SL policy network: predict next action
• Value network: evaluate board state
• MCTS: choose action using 2 networks
• IaGo became a good player through training


Editor's Notes

  1. Thank you, Mr. Bayne. Good afternoon! Recently I learned about AlphaGo, an AI that plays the game of Go, and implemented its algorithm in an Othello version. So, let me tell you how I made it and how it works.
  2. AlphaGo is composed of these 3 parts: first, the policy network, which predicts the next action; second, the value network, which evaluates the board state; and third, Monte Carlo tree search, which chooses an action using the two networks. I’ll now explain them in a little more detail.
  3. First of all, let me mention that Go has an extremely large search space of 10 to the 360th power. I guess that's hard to imagine, so I'll give you one example: the estimated number of all atoms in the universe is 10 to the 80th power. Again, the search space of Go is 10 to the 360th power, so it's far, far bigger than the number of all atoms in the universe. Because of this huge search space, before AlphaGo it was thought that Go AIs would need 10 more years to beat human professionals. Imagine what a big achievement AlphaGo made! But since I don't have enough machine resources to replicate AlphaGo, I made an Othello version. The search space of Othello is just 10 to the 60th power.
  4. That was the background. I’ll move on to the dataset I used for training IaGo. The data were from online Othello game records that you can get for free on the internet. They include 6 million sets of board state and the place of the next stone. Then I augmented them 8-fold using rotation and transposition symmetry, so in the end I had 48 million sets of board state and the place of the next stone.
  5. The first part of IaGo is the supervised learning (SL) policy network. It takes a 2-channel matrix of the board state as input and outputs a probability distribution over the next action. The network is 9 layers of convolution with a softmax output layer. After training, it predicted human plays with an accuracy of 57%.
  6. Next, I refined the SL policy network with a policy gradient algorithm. The refined network is called the reinforcement learning policy network, or RL policy network for short. In the process of reinforcement learning, 2 SL policy networks played games against each other. The network parameters were updated so that good actions were encouraged and bad actions were discouraged, according to the result of the game. I repeated this more than 12,000 times. After training, the RL policy network generated teacher data for the value network: 2 RL policy networks played games against each other, which gave me 1.25 million sets of board state and result. Again I augmented them 8-fold, so in the end I had 10 million sets of board state and result.
  7. Next I'll talk about the value network. Its structure is very similar to that of the SL policy network. What’s the difference? While the SL policy network classifies the next action, the value network performs regression on the game result. The value network takes a 2-channel matrix of the board state and outputs the value of the board state. I defined the value of the board state as +1 for a win, -1 for a loss, and 0 for a draw, so the value indicates how likely the white player is to win. Look at the example pictures. In the left one, the white player is almost winning, so the value is 0.67, fairly close to 1. In the center one, the white player is almost losing, so the value is nearly -1. And in the right one, the result is still unclear, so the value is around 0.
  8. Let's move on to the final part of the algorithm, Monte Carlo tree search. First I made a rollout policy, which is a simplified SL policy network. Its prediction accuracy was lower than that of the SL policy network, but it worked much faster. In MCTS I have to run many, many simulations, so I need a predictor that works fast. MCTS, in short, is an algorithm that searches deeper for a good path in the game tree using self-play simulation. It’s composed of four steps. Step 1, make a child node with the SL policy network. Step 2, evaluate the current node with the value network and the result of rollout-policy self-play. Step 3, update the ancestor nodes’ values according to the rollout-policy self-play. Step 4, choose the most visited node.
  9. That was IaGo's algorithm, so I’ll now talk about its performance. IaGo played some games against the plain SL policy network and won approximately 90% of them. Still, there is room for improvement. First, computation takes too long. If I can make it shorter, IaGo can run more simulations and will become stronger. Second, IaGo seems to have a weak point. The picture on the right side was taken when I beat the complete version of IaGo: I took all of its stones, so the game ended partway through. I'm not sure about its cause, but I guess one reason is that the teacher data were from games between amateur players, not professionals. Third, I couldn't really evaluate IaGo’s performance in an objective or quantitative way, so a more appropriate evaluation is needed. And finally, I’d like to develop a sophisticated graphical user interface and upload it to the web so that everyone can play easily just by clicking.
  10. Let me summarize my presentation. I’ve explained IaGo’s algorithm and its performance. IaGo is composed of three parts: the SL policy network, which predicts the next action; the value network, which evaluates the board state; and Monte Carlo tree search, which chooses an action using these two networks. And IaGo became a good player through training on a huge dataset. That's it for my presentation. Do you have any questions?