Presented at the DeepMind & Blizzard StarCraft II AI Workshop
[Description]
Developing an advanced StarCraft II RL agent is an extremely challenging project for me, so I designed a roadmap to hack StarCraft II RL. My first naive idea is to develop an Actor-Critic agent and an optimal scripted agent, and to teach the Actor-Critic agent through the trajectories of the optimal scripted agent. This idea will be tested on the mini-game map "CollectMineralShards".
- 2017.11.04 Anaheim Hilton Hotel
StarCraft II AI Workshop
Reinforcement Learning
TensorFlow "Newbie" Contributor
Microsoft AI MVP
TensorFlow-KR Admin (Korea's No. 1 ML community)
Ex-game developer (using Unity3D)
sjhshy@gmail.com
Posting StarCraft II Reinforcement Learning Tutorials on
http://chris-chris.ai
Kakao Corp. Data Engineer
- Data Pipeline management, Real-time log processing
- Business Intelligence, Marketing Intelligence
- API development & DevOps
Cool Chris presents
1. Problem Definition
2. Lessons learned from pysc2
3. Actor-Critic Imitation Learning Agent
Almost 100 million possible actions at each step
Multi-agent : Agents should cooperate toward one common goal.
Complexity : The action / observation space is too large.
Each agent must solve problems such as:
- Strategy
- Economy
- Production
- Tactics
- Recon
Lesson 2
Make the model simple
14 policy network model: The agent can cover all possible actions in the StarCraft II RL environment.
7 policy network model: The agent can select a unit, handle control groups, and move them.
3 policy network model (now): The agent can recall a control group and move the unit.
Lesson 4
32x32 map size
The default map size is 64x64, but you don't need 4 pixels to represent one marine.
Reduce the map size as much as you can.
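The saving from a smaller map can be estimated with simple arithmetic. The sketch below assumes the policy emits one output per pixel for every point argument of an action; the function name `spatial_action_count` is made up for this illustration and is not part of pysc2.

```python
def spatial_action_count(width, height, num_points=1):
    """Count the distinct ways to choose `num_points` pixel coordinates.

    Assumption: every point argument can target any pixel on the map.
    """
    return (width * height) ** num_points


for size in (64, 32):
    single = spatial_action_count(size, size)
    pair = spatial_action_count(size, size, num_points=2)
    print(f"{size}x{size}: {single} one-point targets, {pair} two-point targets")
```

Halving each side cuts one-point targets by 4x (4096 to 1024) and two-point targets, such as a rectangle selection, by 16x.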
Question
I know how to develop optimal scripted agents,
but how can I make the Actor-Critic agent learn from them?
Question
Actor-Critic Agent
(Learning Agent)
Optimal Scripted Agent
(Optimal Agent)
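One common way to let a learning agent learn from a scripted one is behaviour cloning: treat the scripted agent's action as a label and minimise the cross-entropy of the policy against it. The sketch below is only an illustration of that general technique under simplifying assumptions (a single state, three discrete actions, raw logits as the only parameters); it is not necessarily the method used in this talk, and all names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def imitation_loss(logits, expert_action):
    """Cross-entropy between the policy and the scripted agent's action."""
    return -math.log(softmax(logits)[expert_action])


def imitation_grad(logits, expert_action):
    """Gradient of the loss w.r.t. the logits: softmax(logits) - one_hot(expert)."""
    probs = softmax(logits)
    return [p - (1.0 if i == expert_action else 0.0) for i, p in enumerate(probs)]


# One gradient-descent step toward the scripted agent's choice (action 2).
logits = [0.0, 0.0, 0.0]
expert_action = 2
before = imitation_loss(logits, expert_action)
grad = imitation_grad(logits, expert_action)
logits = [x - 0.5 * g for x, g in zip(logits, grad)]
after = imitation_loss(logits, expert_action)
print(f"loss before: {before:.4f}, loss after: {after:.4f}")
```

In an Actor-Critic setting, a term like this could be added to the usual policy-gradient loss on steps where a scripted trajectory is available.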
A detailed description of this idea will be covered in a paper or blog post.
The source code is on my GitHub:
https://github.com/chris-chris/pysc2-examples
python train_mineral_shards.py --algorithm=a2c --num_agents=2 --num_scripts=2
Wrap up
1. Problem Definition
Goal : Make the RL agent learn from expert gameplay.
2. Lessons learned from pysc2
Simple model / one-hot encoding / tf.clip_by_norm() / Actor-Critic Architecture
3. Hybrid Actor-Critic Imitation Learning
The hybrid agent learns faster and better, with robustness.
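Two of the lessons listed above, one-hot encoding and clipping by norm, can be sketched in plain Python. The clipping rule follows the documented behaviour of TensorFlow's `tf.clip_by_norm` (rescale only when the L2 norm exceeds the limit). The five-category `player_relative` example and the helper names are assumptions made for illustration, not code from the talk.

```python
import math


def one_hot(value, num_classes):
    """Turn a categorical value into a 0/1 vector of length `num_classes`."""
    return [1.0 if i == value else 0.0 for i in range(num_classes)]


def clip_by_norm(values, clip_norm):
    """Rescale `values` so that its L2 norm is at most `clip_norm`."""
    l2 = math.sqrt(sum(v * v for v in values))
    if l2 <= clip_norm:
        return list(values)  # already within the limit, leave unchanged
    scale = clip_norm / l2
    return [v * scale for v in values]


# Assumed example: a pixel whose categorical feature value is 3 out of
# 5 possible categories (e.g. a "neutral" unit in player_relative).
print(one_hot(3, 5))

# A gradient with L2 norm 5 is scaled down to norm 1, keeping its direction.
print(clip_by_norm([3.0, 4.0], 1.0))
```

Feeding categorical feature layers as one-hot planes avoids implying an ordering between categories, and norm clipping keeps a single large gradient from destabilising training.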
- Special thanks to
Seungil You (Google), who helped me understand the papers and the algorithms,
and supported me in finding bugs and improving my TensorFlow RL model.
- DeepMind and Blizzard Team
Thank you for the StarCraft II Learning Environment.
- Thanks for the precious advice
Sungjoon Choi (Disney Research), Nako Sung (Naver),
Woongwon Lee (RLCode), Doyun Lee (NC Soft).