Big Data at Riot Games – Using Hadoop to Understand Player Experience - StampedeCon 2013

BIG DATA @
RIOT GAMES
USING HADOOP TO IMPROVE THE PLAYER EXPERIENCE
BARRY LIVINGSTON & SANDEEP SHRESTHA | JULY 2013

CONTEXT
HIGH LEVEL ARCHITECTURE
PLAYER EXPERIENCE USE CASES
SUMMARY
QUICK DATA WAREHOUSE HISTORY

WHAT IS LEAGUE OF LEGENDS?
2009 LAUNCH
TEAM ORIENTED
100+ CHAMPS
MODERN FANTASY

LEAGUE OF LEGENDS GAMEPLAY - CHAMPIONS

LEAGUE OF LEGENDS GAMEPLAY - GAMEPLAY

INITIAL LAUNCH / SCRAPPY START UP PHASE
‣  Had a single, dedicated MySQL instance for the DW
‣  Data was ETL'd from production slaves into this instance
‣  Queries were run in MySQL
‣  Reporting was done in Excel
▾  All ETLs, queries and reporting were done by one person

HISTORY: START-UP

THIS WORKED GREAT!

THEN – CRAZY GROWTH
HISTORY: START-UP, CRAZY GROWTH
[Chart: total active players (# unique logins) over time, with June 2012 marked]

THE BREAKING POINT
HISTORY: START-UP, CRAZY GROWTH, BREAKING POINT
‣  Data warehouse reached a breaking point
▾  24 hours of data took 24.5 hours to ETL
‣  We couldn't handle…
▾  Multiple environments in a vertical MySQL instance
▾  A single environment in a vertical MySQL instance
‣  We needed to change
INTRODUCTION OF HADOOP
HISTORY: START-UP, CRAZY GROWTH, BREAKING POINT, HADOOP
‣  Hadoop has a number of great qualities
▾  Cost effective
▾  Scalable
▾  Open source
▾  We could execute quickly

HIGH LEVEL ARCHITECTURE – JUNE 2012
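[Architecture diagram: three regions (Europe, Korea, North America), each with Audit, Plat and LoL sources, are loaded by Pentaho + custom ETL + Sqoop into the Hive data warehouse. Other components shown: Tableau, MySQL, Pentaho. Users shown: analysts, business analyst.]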

BUT, THIS WASN’T GOOD ENOUGH
‣  The time to arrive at insight was too long!
‣  Our solution required too much data team involvement
▾  Schema changes
▾  ETL tweaks
▾  Hive metadata updates
‣  Hive is painful for ad-hoc or interactive analysis
▾  Especially for non-technical folks

GOALS
‣  Democratize data access
▾  Enable Self-service Data Collection and Analysis
‣  Create actionable insights
‣  Increase speed to insight

USE CASE:
GAME CLIENT PERFORMANCE

CLIENT FOOTPRINT
‣  Significant portion of our software runs directly on players' machines
▾  High performance graphics
▾  Responsiveness
‣  There is logic in these components that's ONLY exercised on the client-side
‣  Understanding the performance, reliability and stability of these features is paramount to improving the player experience

CHALLENGE: THE GAME IS ALIVE
The game is a living, breathing service that's always in motion
‣  New champions
‣  New items
‣  New effects/particles
‣  Changes in environment
‣  Changes in design and design balance
UPDATE: 2-3 WEEKS

CHALLENGE: PC VARIABILITY
‣  Hardware and OS profiles are significantly different even within regions
▾  OS and patch level
▾  CPU
▾  Memory
▾  Video card
▾  Video card memory
▾  Drivers

IMPROVING THE PLAYER EXPERIENCE
‣  We need to gather information across all of these dimensions in order to UNDERSTAND the player experience
‣  We use this info to:
▾  React quickly to changes
▾  Optimize performance
▾  Optimize designs
▾  Improve our testing
•  Like creating our compatibility testing lab

OPTIMIZING DESIGN AND PERFORMANCE

HOW DID WE SOLVE THIS
WE HAVE AN ARMY OF TEEMOS WATCHING PLAYERS’ MACHINES THROUGH THEIR TELESCOPES?!
(NOT REALLY, BUT WE DID CONSIDER IT)

HONU: GENERATE - COLLECT - ANALYZE
‣  Riot's self-service end-to-end Big Data pipeline
▾  Cloud-ready (AWS compatible)
▾  Internal data-center ready
▾  Persistent storage: HDFS/S3
▾  Batch processing: Apache Hadoop/AWS EMR
▾  Data publish: Apache Hive

EVENT GENERATION
‣  Honu SDKs: Java, C++, Erlang
‣  Collector discovery
‣  Failover
‣  Load balancing
‣  Buffering/Batching
‣  Dispatching
‣  Thrift transport
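
The SDK responsibilities listed above (buffering/batching, load balancing, failover) can be illustrated with a minimal Python sketch. This is not the Honu SDK: `BufferingClient`, `make_collector`, the batch size and the use of `ConnectionError` are all hypothetical, and plain callables stand in for collectors reached over Thrift.

```python
import random
from typing import Callable, Dict, List

Event = Dict[str, object]
# Assumption: a "collector" is modelled as a callable that accepts a batch
# of events and raises ConnectionError when it is unreachable.
Collector = Callable[[List[Event]], None]


class BufferingClient:
    """Hypothetical client that buffers events and sends them in batches."""

    def __init__(self, collectors: List[Collector], batch_size: int = 3) -> None:
        self.collectors = list(collectors)
        self.batch_size = batch_size
        self.buffer: List[Event] = []

    def track(self, event: Event) -> None:
        """Buffer one event; flush once a full batch has accumulated."""
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> bool:
        """Send the buffered batch; return True if some collector took it."""
        if not self.buffer:
            return True
        # Load balancing: visit the known collectors in random order.
        candidates = random.sample(self.collectors, len(self.collectors))
        for send in candidates:
            try:
                send(list(self.buffer))
            except ConnectionError:
                continue  # Failover: try the next collector.
            self.buffer.clear()
            return True
        return False  # Every collector failed; events stay buffered.


def make_collector(store: List[Event], healthy: bool = True) -> Collector:
    """Build a fake collector that appends batches to `store`."""
    def send(batch: List[Event]) -> None:
        if not healthy:
            raise ConnectionError("collector unavailable")
        store.extend(batch)
    return send
```

Keeping failed batches in the buffer means a later `flush()` can retry them; a real client would also need to bound the buffer size.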

HONU CLIENT SDK
Select avg(f['pingAvg']) from game_client_stats group by f['serverId'];

GAME_CLIENT_STATS (sample row)
timestamp: 1234567890
source: 99.123.456.78
app: game_client
pingAvg: 220.9542
serverId: 12.345.678.90
system: Intel64
…
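
For readers less familiar with Hive map columns, the query on this slide averages one key of the map column `f`, grouped by another key. The Python sketch below mirrors that aggregation over in-memory rows. The row layout and the sample values are assumptions for illustration, not Riot's data, and unlike the query it also returns the grouping key.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def average_ping_by_server(rows: Iterable[Mapping[str, object]]) -> Dict[str, float]:
    """Average f['pingAvg'] per f['serverId'], like the GROUP BY query."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for row in rows:
        facts = row["f"]  # the map column holding event-specific fields
        server = facts["serverId"]
        totals[server] += float(facts["pingAvg"])
        counts[server] += 1
    return {server: totals[server] / counts[server] for server in totals}


# Made-up sample rows; server names and ping values are illustrative only.
rows = [
    {"app": "game_client", "f": {"serverId": "server-a", "pingAvg": 200.0}},
    {"app": "game_client", "f": {"serverId": "server-a", "pingAvg": 240.0}},
    {"app": "game_client", "f": {"serverId": "server-b", "pingAvg": 90.0}},
]
print(average_ping_by_server(rows))
```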

EVENT COLLECTION
‣  Honu collector
‣  Online system
‣  High availability: 100% uptime
‣  Horizontally scalable
‣  Elastic
‣  Fault tolerant
‣  Netflix OSS Eureka discovery service

HONU COLLECTOR
‣  Collect events from multiple clients (Thrift/NIO)
‣  Save all events to one compressed file locally
‣  Upload that file every XX minutes to HDFS/S3
‣  Send a message to Queue/SQS for Demux

[Diagram: Honu Collectors, SQS, S3]
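
The collector steps on this slide (append events to one compressed local file, upload it periodically, then notify Demux through a queue) can be sketched as follows. This is an illustrative stand-in, not the Honu collector: `CollectorSketch` is a hypothetical name, a local directory plays the role of HDFS/S3, and an in-process `queue.Queue` plays the role of the queue/SQS. The timer that would call `roll()` every XX minutes is left out.

```python
import gzip
import json
import queue
import shutil
import tempfile
from pathlib import Path


class CollectorSketch:
    """Hypothetical collector: one gzip file per upload interval."""

    def __init__(self, local_dir: Path, store_dir: Path,
                 demux_queue: "queue.Queue[str]") -> None:
        self.local_dir = local_dir
        self.store_dir = store_dir      # stand-in for HDFS/S3
        self.demux_queue = demux_queue  # stand-in for the queue/SQS
        self.sequence = 0
        self._open_new_file()

    def _open_new_file(self) -> None:
        self.sequence += 1
        self.path = self.local_dir / f"events-{self.sequence:06d}.jsonl.gz"
        self.handle = gzip.open(self.path, "wt", encoding="utf-8")

    def receive(self, event: dict) -> None:
        """Append one event as a JSON line to the current compressed file."""
        self.handle.write(json.dumps(event) + "\n")

    def roll(self) -> Path:
        """Close the file, 'upload' it, notify Demux, start a new file."""
        self.handle.close()
        uploaded = self.store_dir / self.path.name
        shutil.copy(self.path, uploaded)
        self.demux_queue.put(str(uploaded))
        self._open_new_file()
        return uploaded
```

Sending only the file location through the queue keeps messages small; the consumer fetches the file itself from shared storage.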

EVENT ORGANIZATION
‣  Honu demux
‣  Multi-stage batch processing pipeline
‣  Elastic producer-consumer
‣  Apache Hadoop map reduce
‣  Standalone map reduce mode
‣  Apache Hive integration

HONU DEMUX
‣  Multi-Stage batch processing pipeline
‣  Bucket events to separate tables
‣  Write Hive partition files
‣  Add partitions to Hive metastore
‣  Merge partitions

[Diagram: Demux, fed by SQS and S3, fans out to four Standalone Demux workers that write to S3, followed by HIVE and MERGE steps]
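
The "bucket events to separate tables" and "write Hive partition files" steps can be pictured with a small Python sketch. The partitioning scheme is an assumption: the slides do not say how Honu partitions its tables, so a daily `dt=YYYY-MM-DD` partition derived from the event timestamp (UTC) is used here, with the table name taken from the lower-cased `messageType`. `bucket_events` and `partition_path` are hypothetical helpers.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Tuple

PartitionKey = Tuple[str, str]  # (table name, day as YYYY-MM-DD)


def bucket_events(events: Iterable[dict]) -> Dict[PartitionKey, List[dict]]:
    """Group events by target table and by day of their timestamp (UTC)."""
    buckets: Dict[PartitionKey, List[dict]] = defaultdict(list)
    for event in events:
        table = str(event["messageType"]).lower()
        day = datetime.fromtimestamp(
            event["timestamp"], tz=timezone.utc
        ).strftime("%Y-%m-%d")
        buckets[(table, day)].append(event)
    return dict(buckets)


def partition_path(key: PartitionKey) -> str:
    """Directory for one partition, in the usual Hive key=value layout."""
    table, day = key
    return f"{table}/dt={day}"
```

Each bucket would then be written under its `partition_path` and registered with the Hive metastore; those steps are omitted here.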

HONU PIPELINE
GENERATE: HONU CLIENT SDK
COLLECT: HONU COLLECTORS
ORGANIZE: HONU DEMUX

PLAYER BEHAVIOR INITIATIVES
TRIBUNAL JUSTICE
‣  Community regulated
‣  In-game chat log
‣  Player stats
‣  Inventory
‣  Game Info

PLAYER BEHAVIOR INITIATIVES
HONOR SYSTEM
‣  Recognize positive experience
‣  Improve sportsmanship

STARTUP TIPS
TEAMS THAT USE SMART PINGS TO ALERT OTHER PLAYERS TO THREATS ARE MORE LIKELY TO WIN GAME
PLAYERS WHO FOLLOW THE SUMMONER'S CODE WIN 27% MORE GAMES
THE TRIBUNAL BANS PLAYERS FOR NEGATIVE BEHAVIOR SUCH AS VERBAL HARASSMENT
PLAYERS WHO COOPERATE WITH THEIR TEAM WIN 31% MORE GAMES

HOW WE SOLVED IT – EXTEND HONU
GENERATE: HONU CLIENT SDK
COLLECT: HONU COLLECTORS
ORGANIZE: HONU DEMUX

HONU TOOLS: DRADIS
‣  HTTP based data collection
‣  Large volume of data from untrusted source
‣  C10K
‣  Nginx + Netty
‣  4+ billion API calls/day
‣  Peak 100K+ calls/sec
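
To put the two volume figures above on the same scale, the daily total can be converted to an average per-second rate. The sketch below uses the slide's numbers as lower bounds ("4+" and "100K+"), so the results are rough floors, not measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(per_day: float) -> float:
    """Convert a daily total into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


avg_calls = average_per_second(4e9)     # roughly 46,296 calls/sec on average
peak_to_average = 100_000 / avg_calls   # peak is roughly 2.16x the average
print(round(avg_calls), round(peak_to_average, 2))
```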

HONU TOOLS: DRADIS
‣  JSON messages:
▾  curl -d '[{"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello World!"}, {"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello Dradis!", "fiction": "Hello Honu!"}]'
‣  Hive query:
▾  Select * from foo where f['fact'] = 'Hello Dradis!'
Table: Foo
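
One way to read this slide: each JSON message names its target table in `messageType`, and the remaining fields become entries of the map column `f` that the Hive query filters on. The Python sketch below follows that reading. The mapping rules (lower-casing the table name, treating `messageType` and `timestamp` as reserved keys) are assumptions, not documented Dradis behavior.

```python
import json
from typing import Dict, List

# The payload from the curl example above.
PAYLOAD = """[
  {"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello World!"},
  {"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello Dradis!",
   "fiction": "Hello Honu!"}
]"""

RESERVED = {"messageType", "timestamp"}  # assumed non-map fields


def to_rows(payload: str) -> Dict[str, List[dict]]:
    """Split a JSON array of messages into per-table rows with a map `f`."""
    tables: Dict[str, List[dict]] = {}
    for message in json.loads(payload):
        facts = {k: v for k, v in message.items() if k not in RESERVED}
        row = {"timestamp": message["timestamp"], "f": facts}
        tables.setdefault(message["messageType"].lower(), []).append(row)
    return tables


tables = to_rows(PAYLOAD)
# Equivalent of: Select * from foo where f['fact'] = 'Hello Dradis!'
matches = [row for row in tables["foo"] if row["f"].get("fact") == "Hello Dradis!"]
print(matches)
```

Because extra keys such as `fiction` simply land in the map, a sender can add a field without a schema change.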

HONU TOOLS: ECHO SERVICE
‣  Web UI to easily and immediately visualize the data that has been sent to Honu collectors
‣  Self-service end-to-end pipeline

HONU TOOLS: METADATA SERVICE
‣  Data discovery
‣  Schema management
‣  Counter, time

HONU TOOLS: REAL-TIME SLICING/DICING
‣  Integration with Platfora
‣  End-user ad-hoc analysis tool
‣  Interactive visual feedback
‣ Realtime exploration/graphing @ 10^9 data points

HONU TOOLS: REAL-TIME SLICING/DICING

HONU TOOLS: WORKFLOW MANAGEMENT
ENTERPRISE WORKFLOW MANAGEMENT
MATT GOEKE @ LATER TODAY
[Diagram labels: Client, Mobile, WWW]

HONU STATS
‣  7+ billion events/day
‣  Tested @ 70+ billion events/day
‣  100+ tables
▾  10+ tables @ 100M – 1B rows/day
‣  7 Petabytes Game Event Dataset
‣  Semi-global deployment
‣  0 downtime
‣ Runs in cloud (AWS) + datacenter

GOALS
✓ Democratize Data Access
✓ Enable Self-service Data Collection and Analysis
✓ Create Actionable Insights
✓ Increase Speed to Insight

[Graphic: HONU, HONU CLIENT SDK]

FUTURE
‣  Improve self-service workflow & tooling
▾  Metadata management
▾  Discovery of captured data
▾  Workflow management
▾  Platfora to all teams
‣  Realtime event aggregation
‣  Global data infrastructure
‣  Replace legacy audit/event logging services

HANDLE INCREASING DATA VELOCITY
Metric | JUNE 2012 | JULY 2013
MySQL tables | 180 | 1200
Pipeline events/day | 0 | 7+ billion
Workflows | Cronjob + Pentaho | Oozie
Environment | Datacenter | DC + AWS
SLA | 1 day | 2 hours
Event tracking | 2+ weeks (DB update); dependencies: DBA teams + ETL teams + Tools teams; downtime (3h min.) | 10 minutes; self-service; no downtime

SHAMELESS HIRING PLUG
Like most everybody else at this conference… we’re hiring!
PLAYER EXPERIENCE FIRST
CHALLENGE CONVENTION
FOCUS ON TALENT AND TEAM
TAKE PLAY SERIOUSLY
STAY HUNGRY, STAY HUMBLE
THE RIOT MANIFESTO

SHAMELESS HIRING PLUG
AND YES, YOU CAN PLAY GAMES AT WORK
IT’S ENCOURAGED!

THANK YOU! QUESTIONS?
BARRY LIVINGSTON
blivingston@riotgames.com
SANDEEP SHRESTHA
sshrestha@riotgames.com
