7. Recommendation Example
[Diagram: user ACTIONs (▶ Play, ✔ Read, ☆ Like) on SOCIAL content (entertainment news, sports news) are mapped to INTERESTs A and B.]
■ Learning what close friends have recently been interested in lets users broaden their own interests
[Diagram: SANTOS (Social-Activity NeTwork Optimization System) — a feedback loop over the social platform (PF), friends, games, communication, impressions, and clicks, with friendship edges weighted by Familiarities:
⁃ Provide opportunities to meet and grow close to users with matching interests (friend recommendation, communication recommendation, etc.) → the user's circle of close friends grows.
⁃ Provide opportunities to discover the games close friends are enjoying (game discovery along the close-friend axis) → users encounter appealing games and can enjoy multiple games together with close companions.]
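To make the loop concrete, here is a toy scoring sketch in Python. This is not the actual SANTOS algorithm, whose details are not given here; the data and the familiarity weights are hypothetical. It simply illustrates the "close-friend axis": rank candidate games by the familiarity-weighted activity of close friends.

```python
# Toy illustration of close-friend-based game recommendation.
# All data and the scoring rule are hypothetical, not the SANTOS algorithm.
from collections import defaultdict

# familiarity[u][v]: how close user u is to user v (e.g., from communication logs)
familiarity = {"alice": {"bob": 0.9, "carol": 0.4}}
# plays[v]: set of games that user v currently plays
plays = {"bob": {"game_A", "game_B"}, "carol": {"game_B", "game_C"}}

def recommend(user, k=2):
    """Rank games by the total familiarity of close friends who play them."""
    scores = defaultdict(float)
    for friend, weight in familiarity[user].items():
        for game in plays.get(friend, set()):
            scores[game] += weight
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("alice"))  # ['game_B', 'game_A'] — game_B is boosted by two friends
```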
15. The Success of Deep Learning in Image Recognition
■ Deep learning's landslide win at ILSVRC 2012 (an image-recognition competition)
⁃ The team using a deep neural network overwhelmed all other entrants
⁃ This result touched off the recent deep-learning boom
[Chart: number of submissions vs. top-5 error (5 predictions/image) at ILSVRC 2010, 2011, and 2012; best errors of roughly 0.28, 0.26, and 0.16, respectively — the 2012 deep-learning entry cut the error sharply.]
[Figure from Krizhevsky et al.: eight ILSVRC-2010 test images with the model's five most probable labels; the correct label and the probability assigned to it are shown under each image.]
[Figure from Krizhevsky et al.: the CNN architecture, showing the delineation of responsibilities between two GPUs that communicate only at certain layers; the input is 150,528-dimensional, and the remaining layers have 253,440–186,624–64,896–64,896–43,264–… neurons.]
http://www.image-net.org/challenges/LSVRC/
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
Deng, J., et al. "Large scale visual recognition challenge 2012." (2012).
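As a rough illustration of the kind of model behind the 2012 result, here is a small convolutional classifier in PyTorch. It is only a sketch in the spirit of the architecture above, far smaller than the paper's two-GPU network; the layer sizes and input resolution are illustrative.

```python
# Minimal CNN image classifier (illustrative; not the actual AlexNet).
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2),  # RGB input
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 14 * 14, num_classes)

    def forward(self, x):
        x = self.features(x)                   # (N, 64, 14, 14) for 112x112 input
        return self.classifier(x.flatten(1))   # class logits

logits = SmallCNN()(torch.randn(1, 3, 112, 112))
print(logits.shape)  # torch.Size([1, 1000])
```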
17. Deep Learning × Image Generation
■ Variational Autoencoder (Kingma+, 2014)
⁃ Trains two neural networks simultaneously:
• a network that infers the latent variable z from a real image x
• a network that generates an image x′ from the latent variable z and a label y
Figure 1: (a) Visualisation of handwriting styles learned by the model with 2D z-space, obtained by fixing the class label and varying the 2D latent variable z. (b, c) Analogical reasoning with generative semi-supervised models (MNIST and SVHN analogies) using a high-dimensional z-space: the leftmost columns show images from the test set; the other columns show analogical fantasies of x by the generative model, where the latent variable z of each row is set to the value inferred from the test-set image on the left by the inference network, and each column corresponds to a class label y.
[Tables 2 and 3 (data omitted): semi-supervised classification with 1000 labels on SVHN and NORB, comparing KNN, TSVM, M1+KNN, M1+TSVM, and M1+M2.]
Digits generated by fixing the label and continuously varying the latent variable.
Digits generated by fixing the latent variable and varying the label from 0 through 9.
Kingma, Diederik P., et al. "Semi-supervised learning with deep generative models." Advances in Neural Information Processing Systems. 2014.
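A minimal sketch of the two networks, assuming PyTorch. It shows the conditional setup described on the slide (an inference network for q(z|x) and a generator taking z and a label y), not the paper's full M1+M2 semi-supervised objective; all sizes are illustrative.

```python
# Sketch of the two networks in a conditional VAE (illustrative only).
import torch
import torch.nn as nn

class Encoder(nn.Module):          # infers latent z from a real image x
    def __init__(self, x_dim=784, z_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, z_dim)       # mean of q(z|x)
        self.logvar = nn.Linear(256, z_dim)   # log-variance of q(z|x)

    def forward(self, x):
        h = self.net(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return z, mu, logvar

class Decoder(nn.Module):          # generates image x' from z and label y
    def __init__(self, z_dim=2, y_dim=10, x_dim=784):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim + y_dim, 256), nn.ReLU(),
                                 nn.Linear(256, x_dim), nn.Sigmoid())

    def forward(self, z, y_onehot):
        return self.net(torch.cat([z, y_onehot], dim=1))

# "Analogy" generation as on the slide: fix z, sweep the label y from 0 to 9.
enc, dec = Encoder(), Decoder()
z, _, _ = enc(torch.rand(1, 784))             # z inferred from one test image
for digit in range(10):
    y = torch.eye(10)[digit].unsqueeze(0)
    x_gen = dec(z, y)                         # same style, different digit
```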
18. Deep Learning × Image Generation
■ DCGAN (Radford+, 2015)
⁃ Trains two neural networks simultaneously:
• a network that generates plausible images based on the dataset
• a network that discriminates whether an input image is a real image or a fake produced by the generator
Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
Examples of generated images; arithmetic on images also becomes possible.
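The adversarial training described above can be sketched as the following loop (PyTorch; plain MLPs stand in for DCGAN's convolutional generator and discriminator, and the "real" batch is a random stand-in for actual image data).

```python
# Sketch of the GAN training loop (MLPs stand in for DCGAN's conv nets).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):                       # toy loop; real data omitted
    real = torch.rand(32, 784) * 2 - 1        # stand-in for a real image batch
    fake = G(torch.randn(32, 64))             # generator proposes fake images

    # Discriminator: tell real images from generated ones.
    loss_d = bce(D(real), torch.ones(32, 1)) + \
             bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: fool the discriminator into calling fakes real.
    loss_g = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```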
19. Deep Learning × Natural Language Processing
■ Sequence to Sequence Learning (Sutskever+, 2014)
⁃ Given an input sentence, generates its paired output sentence word by word
■ Neural Conversational Model (Vinyals+, 2015)
⁃ Applies seq2seq, used for translation and similar tasks, to dialogue
(Shown below: excerpted conversations, including an exchange about morality in Conversation 5.)
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems. 2014.
Vinyals, Oriol, and Quoc Le. "A neural conversational model." arXiv:1506.05869 (2015). In ICML 2015.
Conversation 1: Basic (excerpt)
Human: are you a man or a woman ?
Machine: i ’m a woman .
Human: why are we here ?
Machine: i ’m not sure .
Human: okay, bye !
Machine: bye .
Human: see you later .
Machine: bye .
Conversation 2: Simple Q&A (contexts and multiple choice)
Human: my name is david . what is my name ?
Machine: david .
Human: my name is john . what is my name ?
Machine: john .
Human: are you a leader or a follower ?
Machine: i ’m a leader .
Human: are you a follower or a leader ?
Machine: i ’m a leader .
Conversation 3: General knowledge Q&A
Human: who is skywalker ?
Machine: he is a hero .
Human: who is bill clinton ?
Machine: he ’s a billionaire .
Human: is sky blue or black ?
Machine: blue .
Human: does a cat have a tail ?
Machine: yes .
Human: does a cat have a wing ?
Machine: no
Human: can a cat fly ?
Machine: no .
Human: how many legs does a cat have ?
Machine: four , i think .
Human: how many legs does a spider have ?
Machine: three , i think .
Human: how many legs does a centipede have ?
Machine: eight .
Human: what is the color of the sky ?
Machine: blue .
Human: what is the color of water ?
Machine: water .
Human: what is the color of blood ?
Machine: it is the same as a black eye .
Human: what is the usual color of a leaf ?
Machine: it is a green one .
Human: what is the color of a yellow car ?
Machine: yellow .
Human: how much is two plus two ?
Machine: four .
Human: how much is ten minus two ?
Machine: seventy - two .
Conversation 4: Philosophical Q&A
Human: what is the purpose of life ?
Machine: to serve the greater good .
Human: what is the purpose of living ?
Machine: to live forever .
Human: what is the purpose of existence ?
Machine: to find out what happens when we get to the planet earth .
Human: where are you now ?
Machine: i ’m in the middle of nowhere .
Human: what is the purpose of dying ?
Machine: to have a life .
Human: what is the purpose of being intelligent ?
Machine: to find out what it is .
Human: what is the purpose of emotions ?
Machine: i don ’t know .
Conversation 5: Morality
Human: what is moral ?
Machine: what empowered humanity , what intellectual the essence is .
Human: what is immoral ?
Machine: the fact that you have a child .
Human: what is morality ?
Machine: what is altruism ?
Human: what is the definition of altruism ?
Machine: if you don ’t believe in god , then you don ’t know .
Human: ok ... so what is the definition of morality ?
Machine: well , the truth is , you ’re not a believer in god almighty .
Human: tell me the definition of morality , i am quite upset now !
Machine: i ’m not ashamed of being a philosopher !
Human: tell me the definition of morality .
Machine: i don ’t have ethics .
Human: look , i need help , i need to know more about morality ...
Machine: i don ’t know what ethics is .
Human: seriously , what is morality ?
Machine: what is the definition of living ?
Human: why living has anything to do with morality ?
Machine: you ’re not a cop .
Human: okay , i need to know how should i behave morally ...
Machine: i don ’t know how to tell you .
Human: is morality and ethics the same ?
Machine: yes , sir .
Human: what are the things that i do to be immoral ?
Machine: i don ’t know .
Human: give me some examples of moral actions ...
Machine: i ’m not a moralist .
Human: alright , morality ?
Machine: integrity .
Human: what is integrity ?
Machine: i ’m sorry , i don ’t know what else to say .
Human: be moral !
Machine: be a man !
Human: i really like our discussion on morality and ethics ...
Machine: and how i ’m not in the mood for a philosophical debate .
Human: what do you like to talk about ?
Machine: nothing .
Conversation 6: Opinions (notice the model outputs “he” , “she” or “it” depending on the subject being asked)
Human: what do you think about tesla ?
Machine: he ’s a good conductor .
Human: what do you think about bill gates ?
Machine: he ’s a good man .
Human: What do you think about messi ?
Machine: he ’s a great player .
Human: what do you think about cleopatra ?
(Excerpt from Sutskever et al., 2014:) Question answering can also be seen as mapping a sequence of words representing the question to a
sequence of words representing the answer. It is therefore clear that a domain-independent method
that learns to map sequences to sequences would be useful.
Sequences pose a challenge for DNNs because they require that the dimensionality of the inputs and
outputs is known and fixed. In this paper, we show that a straightforward application of the Long
Short-Term Memory (LSTM) architecture [16] can solve general sequence to sequence problems.
The idea is to use one LSTM to read the input sequence, one timestep at a time, to obtain a large fixed-
dimensional vector representation, and then to use another LSTM to extract the output sequence
from that vector (fig. 1). The second LSTM is essentially a recurrent neural network language model
[28, 23, 30] except that it is conditioned on the input sequence. The LSTM’s ability to successfully
learn on data with long range temporal dependencies makes it a natural choice for this application
due to the considerable time lag between the inputs and their corresponding outputs (fig. 1).
There have been a number of related attempts to address the general sequence to sequence learning
problem with neural networks. Our approach is closely related to Kalchbrenner and Blunsom [18]
who were the first to map the entire input sentence to a vector, and is very similar to Cho et al. [5].
Graves [10] introduced a novel differentiable attention mechanism that allows neural networks to
focus on different parts of their input, and an elegant variant of this idea was successfully applied
to machine translation by Bahdanau et al. [2]. The Connectionist Sequence Classification is another
popular technique for mapping sequences to sequences with neural networks, although it assumes a
monotonic alignment between the inputs and the outputs [11].
Figure 1: Our model reads an input sentence “ABC” and produces “WXYZ” as the output sentence. The
model stops making predictions after outputting the end-of-sentence token. Note that the LSTM reads the
input sentence in reverse, because doing so introduces many short term dependencies in the data that make the
optimization problem much easier.
The main result of this work is the following. On the WMT’14 English to French translation task,
we obtained a BLEU score of 34.81 by directly extracting translations from an ensemble of 5 deep
LSTMs (with 380M parameters each) using a simple left-to-right beam-search decoder. This is
by far the best result achieved by direct translation with large neural networks. For comparison,
the BLEU score of a SMT baseline on this dataset is 33.30 [29]. The 34.81 BLEU score was
achieved by an LSTM with a vocabulary of 80k words, so the score was penalized whenever the
reference translation contained a word not covered by these 80k. This result shows that a relatively
unoptimized neural network architecture which has much room for improvement outperforms a
mature phrase-based SMT system.
Finally, we used the LSTM to rescore the publicly available 1000-best lists of the SMT baseline on
the same task [29]. By doing so, we obtained a BLEU score of 36.5, which improves the baseline
by 3.2 BLEU points and is close to the previous state-of-the-art (which is 37.0 [9]).
Surprisingly, the LSTM did not suffer on very long sentences, despite the recent experience of other
researchers with related architectures [26]. We were able to do well on long sentences because we
reversed the order of words in the source sentence but not the target sentences in the training and test
set. By doing so, we introduced many short term dependencies that made the optimization problem
much simpler (see sec. 2 and 3.3). As a result, SGD could learn LSTMs that had no trouble with
long sentences. The simple trick of reversing the words in the source sentence is one of the key
technical contributions of this work.
A useful property of the LSTM is that it learns to map an input sentence of variable length into
a fixed-dimensional vector representation. Given that translations tend to be paraphrases of the
source sentences, the translation objective encourages the LSTM to find sentence representations
that capture their meaning, as sentences with similar meanings are close to each other while different …
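A minimal encoder-decoder sketch in PyTorch along the lines of the excerpt: one LSTM reads the reversed source sentence into a fixed-size state, and a second LSTM conditioned on that state emits the output token by token. The toy vocabulary and greedy decoding are simplifications (the paper uses beam search over an ensemble of deep LSTMs).

```python
# Minimal seq2seq sketch: encoder LSTM -> fixed vector -> decoder LSTM.
import torch
import torch.nn as nn

V, H, EOS = 1000, 128, 0                    # toy vocab size, hidden size, <EOS> id
emb = nn.Embedding(V, H)
encoder = nn.LSTM(H, H, batch_first=True)
decoder = nn.LSTM(H, H, batch_first=True)
readout = nn.Linear(H, V)                   # hidden state -> next-token logits

def translate(src_ids, max_len=20):
    # Reverse the source sentence, as the paper found this eases optimization.
    src = torch.tensor(src_ids[::-1]).unsqueeze(0)
    _, state = encoder(emb(src))            # state is the fixed-size summary
    tok, out = torch.tensor([[EOS]]), []
    for _ in range(max_len):                # greedy decoding (paper: beam search)
        h, state = decoder(emb(tok), state)
        tok = readout(h[:, -1]).argmax(-1, keepdim=True)
        if tok.item() == EOS:
            break
        out.append(tok.item())
    return out

print(translate([5, 17, 42]))               # untrained model: arbitrary tokens
```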
20. Deep Learning × Natural Language Processing
■ Memory Network (Sukhbaatar+, 2015)
⁃ In tasks such as question answering, lets the model consult some resource beyond the question sentence itself when producing an answer
Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015.
http://www.thespermwhale.com/jaseweston/icml2016/icml2016-memnn-tutorial.pdf
Question answering that consults a passage of text
(Excerpt from Sukhbaatar et al., 2015, on language modelling:) … interest in using neural network based models for the task, with RNNs [14] and LSTMs [10, 20]
showing clear performance gains over traditional methods. Indeed, the current state-of-the-art is
held by variants of these models, for example very large LSTMs with Dropout [25] or RNNs with
diagonal constraints on the weight matrix [15]. With appropriate weight tying, our model can be
regarded as a modified form of RNN, where the recurrence is indexed by memory lookups to the
word sequence rather than indexed by the sequence itself.
4 Synthetic Question and Answering Experiments
We perform experiments on the synthetic QA tasks defined in [22] (using version 1.1 of the dataset).
A given QA task consists of a set of statements, followed by a question whose answer is typically
a single word (in a few tasks, answers are a set of words). The answer is available to the model at
training time, but must be predicted at test time. There are a total of 20 different types of tasks that
probe different forms of reasoning and deduction. Here are samples of three of the tasks:
(1) Sam walks into the kitchen. Sam picks up an apple. Sam walks into the bedroom. Sam drops the apple. Q: Where is the apple? A: Bedroom.
(2) Brian is a lion. Julius is a lion. Julius is white. Bernhard is green. Q: What color is Brian? A: White.
(3) Mary journeyed to the den. Mary went back to the kitchen. John journeyed to the bedroom. Mary discarded the milk. Q: Where was the milk before the den? A: Hallway.
Note that for each question, only some subset of the statements contain information needed for
the answer, and the others are essentially irrelevant distractors (e.g. the first sentence in the first
example). In the Memory Networks of Weston et al. [22], this supporting subset was explicitly
indicated to the model during training and the key difference between that work and this one is that
this information is no longer provided. Hence, the model must deduce for itself at training and test
time which sentences are relevant and which are not.
Formally, for one of the 20 QA tasks, we are given example problems, each having a set of $I$ sentences $\{x_i\}$ where $I \le 320$; a question sentence $q$ and answer $a$. Let the $j$th word of sentence $i$ be $x_{ij}$, represented by a one-hot vector of length $V$ (where the vocabulary is of size $V = 177$, reflecting the simplistic nature of the QA language). The same representation is used for the question $q$ and answer $a$. Two versions of the data are used, one that has 1000 training problems per task and a second larger one with 10,000 per task.
4.1 Model Details
Unless otherwise stated, all experiments used a K = 3 hops model with the adjacent weight sharing
scheme. For all tasks that output lists (i.e. the answers are multiple words), we take each possible …
Question answering that consults Wikipedia
Recent Work: New Models for QA on documents
Miller et al. "Key-Value Memory Networks for Directly Reading Documents." arXiv:1606.03126.
2.1 Single Layer
We start by describing our model in the single layer case, which implements a single memory hop
operation. We then show it can be stacked to give multiple hops in memory.
Input memory representation: Suppose we are given an input set $x_1, \ldots, x_i$ to be stored in memory. The entire set of $\{x_i\}$ are converted into memory vectors $\{m_i\}$ of dimension $d$ computed by embedding each $x_i$ in a continuous space, in the simplest case, using an embedding matrix $A$ (of size $d \times V$). The query $q$ is also embedded (again, in the simplest case via another embedding matrix $B$ with the same dimensions as $A$) to obtain an internal state $u$. In the embedding space, we compute the match between $u$ and each memory $m_i$ by taking the inner product followed by a softmax:
$$p_i = \mathrm{Softmax}(u^\top m_i), \tag{1}$$
where $\mathrm{Softmax}(z_i) = e^{z_i} / \sum_j e^{z_j}$. Defined in this way, $p$ is a probability vector over the inputs.
Output memory representation: Each $x_i$ has a corresponding output vector $c_i$ (given in the simplest case by another embedding matrix $C$). The response vector from the memory $o$ is then a sum over the transformed inputs $c_i$, weighted by the probability vector from the input:
$$o = \sum_i p_i c_i. \tag{2}$$
Because the function from input to output is smooth, we can easily compute gradients and backpropagate through it. Other recently proposed forms of memory or attention take this approach, notably Bahdanau et al. [2] and Graves et al. [8], see also [9].
Generating the final prediction: In the single layer case, the sum of the output vector $o$ and the input embedding $u$ is then passed through a final weight matrix $W$ (of size $V \times d$) and a softmax to produce the predicted label:
$$\hat{a} = \mathrm{Softmax}(W(o + u)). \tag{3}$$
The overall model is shown in Fig. 1(a). During training, all three embedding matrices $A$, $B$ and $C$, as well as $W$, are jointly learned by minimizing a standard cross-entropy loss between $\hat{a}$ and the true label $a$. Training is performed using stochastic gradient descent (see Section 4.2 for more details).
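Equations (1)-(3) amount to a soft attention lookup over memory and can be transcribed directly in NumPy. The bag-of-words inputs and random matrices below are illustrative stand-ins, not trained parameters.

```python
# Direct NumPy transcription of equations (1)-(3): one memory hop.
import numpy as np

rng = np.random.default_rng(0)
V, d, n_sent = 177, 20, 5                  # vocab size, embedding dim, #sentences
A = rng.normal(size=(d, V))                # input memory embedding
B = rng.normal(size=(d, V))                # question embedding
C = rng.normal(size=(d, V))                # output memory embedding
W = rng.normal(size=(V, d))                # final readout

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x_bow = rng.integers(0, 2, size=(n_sent, V)).astype(float)  # sentences (bag-of-words)
q_bow = rng.integers(0, 2, size=V).astype(float)            # question

m = x_bow @ A.T                            # memory vectors m_i
c = x_bow @ C.T                            # output vectors c_i
u = B @ q_bow                              # internal state u
p = softmax(m @ u)                         # eq. (1): attention over memories
o = p @ c                                  # eq. (2): weighted sum of outputs
a_hat = softmax(W @ (o + u))               # eq. (3): distribution over answers
print(a_hat.argmax())                      # predicted answer word id
```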
Figure 1: (a): A single layer version of our model. (b): A three layer version of our model. In
practice, we can constrain several of the embedding matrices to be the same (see Section 2.2).
2.2 Multiple Layers
We now extend our model to handle K hop operations. The memory layers are stacked in the
following way:
• The input to layers above the first is the sum of the output $o^k$ and the input $u^k$ from layer $k$ (different ways to combine $o^k$ and $u^k$ are proposed later):
$$u^{k+1} = u^k + o^k. \tag{4}$$
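Stacking hops per eq. (4) is then a short loop. A sketch with untied per-layer embeddings for clarity (the paper's experiments mostly tie adjacent layers' matrices):

```python
# K-hop extension per eq. (4): u^{k+1} = u^k + o^k (untied embeddings, toy data).
import numpy as np

rng = np.random.default_rng(1)
V, d, n_sent, K = 177, 20, 5, 3

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x_bow = rng.integers(0, 2, size=(n_sent, V)).astype(float)
q_bow = rng.integers(0, 2, size=V).astype(float)
B = rng.normal(size=(d, V))
W = rng.normal(size=(V, d))

u = B @ q_bow
for k in range(K):                         # one attention lookup per hop
    A_k = rng.normal(size=(d, V))          # layer-specific embeddings (toy)
    C_k = rng.normal(size=(d, V))
    p = softmax((x_bow @ A_k.T) @ u)       # eq. (1) at hop k
    o = p @ (x_bow @ C_k.T)                # eq. (2) at hop k
    u = u + o                              # eq. (4): feed summed state upward
a_hat = softmax(W @ u)                     # final prediction after K hops
```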
21. Deep Learning × AI That Masters Games
■ Deep Q Network (Mnih+, 2015)
⁃ Masters Breakout, Space Invaders, and similar games through reinforcement learning driven by image features of the game screen
■ AlphaGo (Silver+, 2016)
⁃ Learns Go strategy by combining supervised learning and reinforcement learning
⁃ Drew wide attention by defeating a professional human Go player with no handicap
Deep Q Network
Figure 1: Neural network training pipeline and architecture. a, A fast rollout policy $p_\pi$ and supervised learning (SL) policy network $p_\sigma$ are trained to predict human expert moves in a data-set of positions. A reinforcement learning (RL) policy network $p_\rho$ is initialised to the SL policy network, and is then improved by policy gradient learning to maximize the outcome (i.e. winning more games) against previous versions of the policy network. A new data-set is generated by playing games of self-play with the RL policy network. Finally, a value network $v_\theta$ is trained by regression to predict the expected outcome (i.e. whether the current player wins) in positions from the self-play data-set.
AlphaGo
Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.
Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.
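At the core of DQN is the Q-learning update, with a deep network replacing the Q table. A tabular toy version (the state and action counts and the generic `update`/`act` helpers are hypothetical) shows the update that DQN approximates:

```python
# Tabular toy version of the Q-learning update at DQN's core (no conv net,
# replay buffer, or target network; sizes are hypothetical).
import numpy as np

n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.99, 0.1         # learning rate, discount, exploration

def update(s, a, r, s_next, done):
    """One step of Q-learning: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

def act(s, rng=np.random.default_rng()):
    """Epsilon-greedy: mostly exploit the current Q, sometimes explore."""
    return rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())

# DQN replaces the table Q with a CNN over raw screen pixels and stabilizes
# training with experience replay and a periodically frozen target network.
update(s=0, a=act(0), r=1.0, s_next=1, done=False)
```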