SlideShare a Scribd company logo
1 of 69
BIG DATA @
RIOT GAMES
USING HADOOP TO IMPROVE THE PLAYER EXPERIENCE
BARRY LIVINGSTON & SANDEEP SHRESTHA | JULY 2013
SPEAKERS
CONTEXT
HIGH LEVEL ARCHITECTURE
PLAYER EXPERIENCE USE CASES
SUMMARY
QUICK DATA WAREHOUSE HISTORY
FIRST, A BIT OF CONTEXT…
WHAT IS LEAGUE OF LEGENDS?
2009
LAUNCH
TEAM
ORIENTED
100+
CHAMPS
MODERN
FANTASY
WHAT IS LEAGUE OF LEGENDS?
LEAGUE OF LEGENDS GAMEPLAY - CHAMPIONS
LEAGUE OF LEGENDS GAMEPLAY - GAMEPLAY
A QUICK HISTORY
INITIAL LAUNCH / SCRAPPY START UP PHASE
‣  Had	
  a	
  single,	
  dedicated	
  MySQL	
  instance	
  for	
  the	
  DW	
  
‣  Data	
  was	
  ETL’d	
  from	
  produc@on	
  slaves	
  into	
  this	
  instance	
  
‣  Queries	
  were	
  run	
  in	
  MySQL	
  
‣  Repor@ng	
  was	
  done	
  in	
  Excel	
  
▾  All	
  ETLs,	
  queries	
  and	
  repor@ng	
  were	
  done	
  by	
  one	
  person	
  
HISTORY	
   START-­‐UP	
  
THIS WORKED GREAT!
THEN – CRAZY GROWTH
HISTORY	
   START-­‐UP	
  
@me	
  
#	
  unique	
  logins	
  
TOTAL	
  ACTIVE	
  PLAYERS	
  
	
  June	
  2012	
  
CRAZY	
  
GROWTH	
  
THE BREAKING POINT
HISTORY	
   START-­‐UP	
  
CRAZY	
  
GROWTH	
  
BREAKING	
  
POINT	
  
‣  Data	
  warehouse	
  reached	
  a	
  breaking	
  point	
  
▾  24	
  hours	
  of	
  data	
  took	
  24.5	
  hours	
  to	
  ETL	
  
‣  We	
  couldn’t	
  handle…	
  
▾  Mul@ple	
  environments	
  in	
  a	
  ver@cal	
  MySQL	
  instance	
  	
  
▾  A	
  single	
  environment	
  in	
  a	
  ver@cal	
  MySQL	
  instance	
  
‣  We	
  needed	
  to	
  change	
  
	
  
INTRODUCTION OF HADOOP
HISTORY	
   START-­‐UP	
  
CRAZY	
  
GROWTH	
  
BREAKING	
  
POINT	
  
‣  Hadoop	
  has	
  a	
  number	
  of	
  great	
  quali@es	
  
▾  Cost	
  effec@ve	
  
▾  Scalable	
  
▾  Open	
  source	
  
▾  We	
  could	
  execute	
  quickly	
  
HADOOP	
  
HIGH LEVEL ARCHITECTURE – JUNE 2012
Tableau	
  
	
  
Hive	
  Data	
  Warehouse	
  
Pentaho	
  
	
  
+	
  
	
  
Custom	
  
ETL	
  
	
  
+	
  
	
  
Sqoop	
  
MySQL	
  Pentaho	
  
Analysts	
  
EUROPE	
  
Audit	
   Plat	
  
LoL	
  
KOREA	
  
Audit	
   Plat	
  
LoL	
  
NORTH	
  AMERICA	
  
Audit	
   Plat	
  
LoL	
  
Business	
  
Analyst	
  
BUT, THIS WASN’T GOOD ENOUGH
‣  The	
  @me	
  to	
  arrive	
  at	
  insight	
  was	
  too	
  long!	
  
‣  Our	
  solu@on	
  required	
  too	
  much	
  data	
  team	
  involvement	
  
▾  Schema	
  changes	
  
▾  ETL	
  tweaks	
  
▾  Hive	
  metadata	
  updates	
  
‣  Hive	
  is	
  painful	
  for	
  ad-­‐hoc	
  or	
  interac@ve	
  analysis	
  
▾  Especially	
  for	
  non-­‐technical	
  folks	
  
GOALS
‣  Democra@ze	
  data	
  access	
  
▾  Enable	
  Self-­‐service	
  Data	
  Collec@on	
  and	
  
Analysis	
  
‣  Create	
  ac@onable	
  insights	
  
‣  Increase	
  speed	
  to	
  insight	
  
USE CASE:
GAME CLIENT PERFORMANCE
CLIENT FOOTPRINT
‣  Significant	
  por@on	
  of	
  our	
  soware	
  runs	
  directly	
  on	
  players’	
  
machines	
  
▾  High	
  performance	
  graphics	
  
▾  Responsiveness	
  
‣  There	
  is	
  logic	
  in	
  these	
  components	
  that's	
  ONLY	
  exercised	
  
on	
  the	
  client-­‐side	
  
‣  Understanding	
  the	
  performance,	
  reliability	
  and	
  stability	
  of	
  
these	
  features	
  is	
  paramount	
  to	
  improving	
  the	
  player	
  
experience	
  
PATCHER
LOBBY CLIENT
GAME CLIENT
ITEM SHOP
CHALLENGE: THE GAME IS ALIVE
The	
  game	
  is	
  a	
  living,	
  breathing	
  service	
  that’s	
  always	
  in	
  mo@on	
  
‣  New	
  champions	
  
‣  New	
  items 	
  	
  
‣  New	
  effects/par@cles	
  
‣  Changes	
  in	
  environment	
  
‣  Changes	
  in	
  design	
  and	
  design	
  
balance	
  
	
  	
  
UPDATE
2-3WEEKS
CHALLENGE: WE’RE GLOBAL
CHALLENGE: PC VARIABILITY
‣  Hardware	
  and	
  OS	
  profiles	
  are	
  significantly	
  different	
  even	
  
within	
  regions	
  
▾  OS	
  and	
  patch	
  level	
  
▾  CPU	
  
▾  Memory	
  
▾  Video	
  card	
  
▾  Video	
  card	
  memory	
  
▾  Drivers	
  
CHALLENGE: GRAPHIC SETTINGS
CHALLENGE: CLIENT-SIDE LOGIC
IMPROVING THE PLAYER EXPERIENCE
‣  We	
  need	
  to	
  gather	
  informa@on	
  across	
  all	
  of	
  these	
  
dimensions	
  in	
  order	
  to	
  UNDERSTAND	
  the	
  player	
  experience	
  
‣  We	
  use	
  this	
  info	
  to:	
  
▾  React	
  quickly	
  to	
  changes	
  
▾  Op@mize	
  performance	
  
▾  Op@mize	
  designs	
  
▾  Improve	
  our	
  tes@ng	
  
•  Like	
  crea@ng	
  our	
  compa@bility	
  tes@ng	
  lab	
  
REACTING QUICKLY
GAME LOAD SCREEN
IMPROVING LOAD TIME
OPTIMIZING DESIGN AND PERFORMANCE
OPTIMIZING DESIGN AND PERFORMANCE
OPTIMIZING DESIGN AND PERFORMANCE
OPTIMIZING DESIGN AND PERFORMANCE
HOW DID WE SOLVE THIS
WE HAVE AN ARMY OF TEEMOS WATCHING PLAYERS’ MACHINES THROUGH THEIR TELESCOPES?!
(NOT REALLY, BUT WE DID CONSIDER IT)
HONU: GENERATE - COLLECT - ANALYZE
‣  Riot’s	
  self-­‐service	
  end-­‐to-­‐end	
  Big	
  Data	
  pipeline	
  
▾  Cloud-­‐ready	
  (AWS	
  compa@ble)	
  
▾  Internal	
  data-­‐center	
  ready	
  
▾  Persistent	
  storage:	
  HDFS/S3	
  
▾  Batch	
  processing:	
  Apache	
  Hadoop/AWS	
  EMR	
  
▾  Data	
  publish:	
  Apache	
  Hive	
  
	
  
EVENT GENERATION
‣  Honu	
  SDKs:	
  Java,	
  C++,	
  Erlang	
  
‣  Collector	
  discovery	
  
‣  Failover	
  
‣  Load	
  balancing	
  
‣  Buffering/Batching	
  
‣  Dispatching	
  
‣  Thri	
  transport	
  
HONU CLIENT SDK
Select	
  avg(f[‘pingAVG’])	
  from	
  game_client_stats	
  group	
  by	
  f[‘serverId’];	
  
pingAvg	
   serverId	
   system	
  source	
   	
  	
  app	
  @mestamp	
  
1234567890	
   99.123.456.78	
   game_client	
   220.9542	
   12.345.678.90	
   Intel64	
  …	
  
GAME_CLIENT_STATS	
  
EVENT COLLECTION
‣  Honu	
  collector	
  
‣  Online	
  system	
  
‣  High	
  availability	
  –	
  100%	
  up@me	
  
‣  Horizontally	
  scalable	
  
‣  Elas@c	
  
‣  Fault	
  tolerant	
  
‣  Neulix	
  OSS	
  Eureka	
  discovery	
  service	
  
HONU COLLECTOR
‣  Collect	
  events	
  from	
  mul@ple	
  clients	
  
(Thri/NIO)	
  
‣  Save	
  all	
  events	
  to	
  one	
  compressed	
  
file	
  locally	
  
‣  Upload	
  that	
  file	
  every	
  XX	
  minutes	
  to	
  
HDFS/S3	
  
‣  Send	
  a	
  message	
  to	
  Queue/SQS	
  for	
  
Demux	
  
H	
  o	
  n	
  u	
  C	
  o	
  l	
  l	
  e	
  c	
  t	
  o	
  r	
  s	
  
S	
  Q	
  S	
  
S	
  3	
  
EVENT ORGANIZATION
‣  Honu	
  demux	
  
‣  Mul@-­‐stage	
  batch	
  processing	
  pipeline	
  
‣  Elas@c	
  producer-­‐consumer	
  
‣  Apache	
  Hadoop	
  map	
  reduce	
  
‣  Standalone	
  map	
  reduce	
  mode	
  
‣  Apache	
  Hive	
  integra@on	
  
HONU DEMUX
‣  Mul@-­‐Stage	
  batch	
  
processing	
  pipeline	
  
‣  Bucket	
  events	
  to	
  separate	
  
tables	
  
‣  Write	
  Hive	
  par@@on	
  files	
  
‣  Add	
  par@@ons	
  to	
  Hive	
  
metastore	
  
‣  Merge	
  par@@ons	
  
	
  
Demux	
  
	
  SQS	
  
S3
S3	
  
Standalone
Demux
Standalone
Demux
Standalone
Demux
Standalone
Demux
S3 S3
S3 S3
HIVE	
  
MERGE	
  
HONU PIPELINE
HONU
CLIENT
SDK
HONU
COLLECTORS
HONU
DEMUX
ORGANIZECOLLECTGENERATE
USE CASE:
PLAYER BEHAVIOR
PLAYER BEHAVIOR
PLAYER BEHAVIOR INITIATIVES
TRIBUNAL JUSTICE
‣  Community	
  regulated	
  
‣  In-­‐game	
  chat	
  log	
  
‣  Player	
  stats	
  
‣  Inventory	
  
‣  Game	
  Info	
  
PLAYER BEHAVIOR INITIATIVES
HONOR SYSTEM
‣  Recognize	
  posi@ve	
  experience	
  
‣  Improve	
  sportsmanship	
  
STARTUP TIPS
TEAMS THAT USE SMART PINGS TO ALERT OTHER PLAYERS TO THREATS ARE MORE LIKELY TO WIN GAME
PLAYERS WHO FOLLOW THE SUMMONER'S CODE WIN 27% MORE GAMES
THE TRIBUNAL BANS PLAYERS FOR NEGATIVE BEHAVIOR SUCH AS VERBAL HARASSMENT
PLAYERS WHO COOPERATE WITH THEIR TEAM WIN 31% MORE GAMES
HOW WE SOLVED IT – EXTEND HONU
HONU
CLIENT
SDK
HONU
COLLECTORS
HONU
DEMUX
ORGANIZECOLLECTGENERATE
HONU TOOLS: DRADIS
‣  Hwp	
  based	
  data	
  collec@on	
  
‣  Large	
  volume	
  of	
  data	
  from	
  
untrusted	
  source	
  
‣  C10K	
  
‣  Nginx	
  +	
  Newy	
  
‣  4+	
  billion	
  API	
  calls/day	
  
‣  Peak	
  100K+	
  calls/sec	
  
	
  
HONU TOOLS: DRADIS
‣  Json	
  Messages:	
  
▾  curl	
  -­‐d	
  ’[	
  
{"messageType":	
  "Foo",	
  "@mestamp":	
  1369064555,	
  "fact":	
  "Hello	
  World!"},	
  {"messageType":	
  
"Foo",	
  "@mestamp":	
  1369064555,	
  "fact":	
  "Hello	
  Dradis!",	
  	
  
"fic@on":	
  "Hello	
  Honu!"}]’	
  	
  
‣  Hive	
  Query:	
  
▾  Select	
  *	
  from	
  foo	
  where	
  f[‘fact’]	
  =	
  ‘Hello	
  Dradis!’	
  
Table:	
  Foo	
  
HONU TOOLS: ECHO SERVICE
‣  Web	
  UI	
  to	
  easily	
  and	
  immediately	
  visualize	
  the	
  data	
  that	
  has	
  been	
  sent	
  
to	
  Honu	
  collectors	
  
‣  Self-­‐service	
  end-­‐to-­‐end	
  pipeline	
  
HONU TOOLS: ECHO SERVICE
‣  Web	
  UI	
  to	
  easily	
  and	
  immediately	
  visualize	
  the	
  data	
  that	
  has	
  been	
  sent	
  
to	
  Honu	
  collectors	
  
‣  Self-­‐service	
  end-­‐to-­‐end	
  pipeline	
  
HONU TOOLS: ECHO SERVICE
‣  Web	
  UI	
  to	
  easily	
  and	
  immediately	
  visualize	
  the	
  data	
  that	
  has	
  been	
  sent	
  
to	
  Honu	
  collectors	
  
‣  Self-­‐service	
  end-­‐to-­‐end	
  pipeline	
  
HONU TOOLS: METADATA SERVICE
‣  Data	
  discovery	
  
‣  Schema	
  management	
  
‣  Counter,	
  @me	
  
HONU TOOLS: REAL-TIME SLICING/DICING
‣  Integration with Platfora
‣  End-user ad-hoc analysis tool
‣  Interactive visual feedback
‣  Realtime exploration/graphing @ 109 data points
HONU TOOLS: REAL-TIME SLICING/DICING
HONU TOOLS: WORKFLOW MANAGEMENT
ENTERPRISE WORKFLOW
MANAGEMENT
MATT GOEKE
@ LATER TODAY
ClientMobile
WWW
HONU STATS
‣  7+ billion events/day
‣  Tested @ 70+ billion events/day
‣  100+ tables
▾  10+ tables @ 100M – 1B rows/day
‣  7 Petabytes Game Event Dataset
‣  Semi-global deployment
‣  0 downtime
‣  Runs in cloud (AWS) +
datacenter
SUMMARY
GOALS
ü Democra@ze	
  Data	
  Access	
  
ü Enable	
  Self-­‐service	
  Data	
  Collec@on	
  and	
  Analysis	
  
ü Create	
  Ac@onable	
  Insights	
  
ü Increase	
  Speed	
  to	
  Insight	
  
HONU
HONU
CLIENT
SDK
FUTURE
‣  Improve	
  self-­‐service	
  workflow	
  &	
  tooling	
  
▾  Metadata	
  management	
  
▾  Discovery	
  of	
  captured	
  data	
  
▾  Workflow	
  management	
  
▾  Plauora	
  to	
  all	
  teams	
  
‣  Real@me	
  event	
  aggrega@on	
  
‣  Global	
  data	
  infrastructure	
  
‣  Replace	
  legacy	
  audit/event	
  logging	
  services	
  
HANDLE INCREASING DATA VELOCITY
JUNE 2012 JULY 2013
MySQL	
  tables	
   180	
   1200	
  
Pipeline	
  Events/day	
   0	
   7+	
  Billion	
  
Workflows	
   Cronjob	
  +	
  Pentaho	
   Oozie	
  
Environment	
   Datacenter	
   DC	
  +	
  AWS	
  
SLA	
   1	
  day	
   2	
  hours	
  
Event	
  tracking	
   •  2+	
  weeks	
  (DB	
  
update)	
  
•  Dependencies:	
  DBA	
  
teams	
  +	
  ETL	
  teams	
  +	
  
Tools	
  teams	
  
•  Down@me	
  (3h	
  min.)	
  
•  10	
  minutes	
  
•  Self-­‐Service	
  
	
  
•  No	
  down@me	
  
DECREASE TEEMO DEATHS?
SHAMELESS HIRING PLUG
Like most everybody else at this conference… we’re hiring!
PLAYER EXPERIENCE FIRST
CHALLENGE CONVENTION
FOCUS ON TALENT AND TEAM
TAKE PLAY SERIOUSLY
STAY HUNGRY, STAY HUMBLE
THE RIOT MANIFESTO
SHAMELESS HIRING PLUG
AND YES, YOU CAN PLAY GAMES AT WORK
IT’S ENCOURAGED!
THANK YOU! QUESTIONS?
BARRY LIVINGSTON
blivingston@riotgames.com
SANDEEP SHRESTHA
sshrestha@riotgames.com

More Related Content

What's hot

Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheDremio Corporation
 
When Kafka Meets the Scaling and Reliability needs of World's Largest Retaile...
When Kafka Meets the Scaling and Reliability needs of World's Largest Retaile...When Kafka Meets the Scaling and Reliability needs of World's Largest Retaile...
When Kafka Meets the Scaling and Reliability needs of World's Largest Retaile...confluent
 
Airflow Clustering and High Availability
Airflow Clustering and High AvailabilityAirflow Clustering and High Availability
Airflow Clustering and High AvailabilityRobert Sanders
 
Pinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastorePinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastoreKishore Gopalakrishna
 
Spotify: Horizontal Scalability for Great Success
Spotify: Horizontal Scalability for Great SuccessSpotify: Horizontal Scalability for Great Success
Spotify: Horizontal Scalability for Great SuccessNick Barkas
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 
Log analysis using elk
Log analysis using elkLog analysis using elk
Log analysis using elkRushika Shah
 
Frame - Feature Management for Productive Machine Learning
Frame - Feature Management for Productive Machine LearningFrame - Feature Management for Productive Machine Learning
Frame - Feature Management for Productive Machine LearningDavid Stein
 
Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain confluent
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...DataWorks Summit/Hadoop Summit
 
cLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHousecLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHouseAltinity Ltd
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Ryan Blue
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best PracticesCloudera, Inc.
 

What's hot (20)

Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
 
When Kafka Meets the Scaling and Reliability needs of World's Largest Retaile...
When Kafka Meets the Scaling and Reliability needs of World's Largest Retaile...When Kafka Meets the Scaling and Reliability needs of World's Largest Retaile...
When Kafka Meets the Scaling and Reliability needs of World's Largest Retaile...
 
Airflow Clustering and High Availability
Airflow Clustering and High AvailabilityAirflow Clustering and High Availability
Airflow Clustering and High Availability
 
Pinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastorePinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastore
 
Spotify: Horizontal Scalability for Great Success
Spotify: Horizontal Scalability for Great SuccessSpotify: Horizontal Scalability for Great Success
Spotify: Horizontal Scalability for Great Success
 
HDFS Analysis for Small Files
HDFS Analysis for Small FilesHDFS Analysis for Small Files
HDFS Analysis for Small Files
 
Apache Spark Notes
Apache Spark NotesApache Spark Notes
Apache Spark Notes
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Intro to Pinot (2016-01-04)
Intro to Pinot (2016-01-04)Intro to Pinot (2016-01-04)
Intro to Pinot (2016-01-04)
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 
Log analysis using elk
Log analysis using elkLog analysis using elk
Log analysis using elk
 
Frame - Feature Management for Productive Machine Learning
Frame - Feature Management for Productive Machine LearningFrame - Feature Management for Productive Machine Learning
Frame - Feature Management for Productive Machine Learning
 
Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain
 
KSQL Intro
KSQL IntroKSQL Intro
KSQL Intro
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
cLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHousecLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHouse
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best Practices
 

Similar to Big Data at Riot Games – Using Hadoop to Understand Player Experience - StampedeCon 2013

(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...Amazon Web Services
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogC4Media
 
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLARiot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLAsean_seannery
 
Kafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
Kafka Summit SF 2017 - Riot's Journey to Global Kafka AggregationKafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
Kafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregationconfluent
 
Hadoop at Twitter (Hadoop Summit 2010)
Hadoop at Twitter (Hadoop Summit 2010)Hadoop at Twitter (Hadoop Summit 2010)
Hadoop at Twitter (Hadoop Summit 2010)Kevin Weil
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Nathan Bijnens
 
MySQL Performance Monitoring
MySQL Performance MonitoringMySQL Performance Monitoring
MySQL Performance Monitoringspil-engineering
 
Monitor OpenStack Environments from the bottom up and front to back
Monitor OpenStack Environments from the bottom up and front to backMonitor OpenStack Environments from the bottom up and front to back
Monitor OpenStack Environments from the bottom up and front to backIcinga
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineData Con LA
 
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...Amazon Web Services Korea
 
Using Event Streams in Serverless Applications
Using Event Streams in Serverless ApplicationsUsing Event Streams in Serverless Applications
Using Event Streams in Serverless ApplicationsJonathan Dee
 
Creating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleCreating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleSean Chittenden
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixC4Media
 
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNAFirst Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNATomas Cervenka
 
Understanding event data
Understanding event dataUnderstanding event data
Understanding event datayalisassoon
 
Lambda Architectures in Practice
Lambda Architectures in PracticeLambda Architectures in Practice
Lambda Architectures in PracticeC4Media
 
Webinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigraineWebinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigrainePeak Hosting
 
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidDataWorks Summit
 
Supersize your production pipe enjmin 2013 v1.1 hd
Supersize your production pipe    enjmin 2013 v1.1 hdSupersize your production pipe    enjmin 2013 v1.1 hd
Supersize your production pipe enjmin 2013 v1.1 hdslantsixgames
 

Similar to Big Data at Riot Games – Using Hadoop to Understand Player Experience - StampedeCon 2013 (20)

(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
 
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLARiot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
 
Kafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
Kafka Summit SF 2017 - Riot's Journey to Global Kafka AggregationKafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
Kafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
 
Hadoop at Twitter (Hadoop Summit 2010)
Hadoop at Twitter (Hadoop Summit 2010)Hadoop at Twitter (Hadoop Summit 2010)
Hadoop at Twitter (Hadoop Summit 2010)
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013
 
MySQL Performance Monitoring
MySQL Performance MonitoringMySQL Performance Monitoring
MySQL Performance Monitoring
 
Monitor OpenStack Environments from the bottom up and front to back
Monitor OpenStack Environments from the bottom up and front to backMonitor OpenStack Environments from the bottom up and front to back
Monitor OpenStack Environments from the bottom up and front to back
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
 
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
 
Using Event Streams in Serverless Applications
Using Event Streams in Serverless ApplicationsUsing Event Streams in Serverless Applications
Using Event Streams in Serverless Applications
 
Creating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleCreating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at Scale
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNAFirst Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
 
HDInsight for Architects
HDInsight for ArchitectsHDInsight for Architects
HDInsight for Architects
 
Understanding event data
Understanding event dataUnderstanding event data
Understanding event data
 
Lambda Architectures in Practice
Lambda Architectures in PracticeLambda Architectures in Practice
Lambda Architectures in Practice
 
Webinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigraineWebinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration Migraine
 
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
 
Supersize your production pipe enjmin 2013 v1.1 hd
Supersize your production pipe    enjmin 2013 v1.1 hdSupersize your production pipe    enjmin 2013 v1.1 hd
Supersize your production pipe enjmin 2013 v1.1 hd
 

More from StampedeCon

Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...StampedeCon
 
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017StampedeCon
 
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017StampedeCon
 
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...StampedeCon
 
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017StampedeCon
 
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017StampedeCon
 
Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017StampedeCon
 
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...StampedeCon
 
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...StampedeCon
 
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017StampedeCon
 
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017StampedeCon
 
A Different Data Science Approach - StampedeCon AI Summit 2017
A Different Data Science Approach - StampedeCon AI Summit 2017A Different Data Science Approach - StampedeCon AI Summit 2017
A Different Data Science Approach - StampedeCon AI Summit 2017StampedeCon
 
Graph in Customer 360 - StampedeCon Big Data Conference 2017
Graph in Customer 360 - StampedeCon Big Data Conference 2017Graph in Customer 360 - StampedeCon Big Data Conference 2017
Graph in Customer 360 - StampedeCon Big Data Conference 2017StampedeCon
 
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017StampedeCon
 
Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017
Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017
Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017StampedeCon
 
Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...
Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...
Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...StampedeCon
 
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...StampedeCon
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016StampedeCon
 
Creating a Data Driven Organization - StampedeCon 2016
Creating a Data Driven Organization - StampedeCon 2016Creating a Data Driven Organization - StampedeCon 2016
Creating a Data Driven Organization - StampedeCon 2016StampedeCon
 
Using The Internet of Things for Population Health Management - StampedeCon 2016
Using The Internet of Things for Population Health Management - StampedeCon 2016Using The Internet of Things for Population Health Management - StampedeCon 2016
Using The Internet of Things for Population Health Management - StampedeCon 2016StampedeCon
 

More from StampedeCon (20)

Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
 
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
 
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
 
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
 
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
 
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
 
Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017
 
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
 
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
 
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
 
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017
 
A Different Data Science Approach - StampedeCon AI Summit 2017
A Different Data Science Approach - StampedeCon AI Summit 2017A Different Data Science Approach - StampedeCon AI Summit 2017
A Different Data Science Approach - StampedeCon AI Summit 2017
 
Graph in Customer 360 - StampedeCon Big Data Conference 2017
Graph in Customer 360 - StampedeCon Big Data Conference 2017Graph in Customer 360 - StampedeCon Big Data Conference 2017
Graph in Customer 360 - StampedeCon Big Data Conference 2017
 
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
 
Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017
Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017
Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017
 
Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...
Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...
Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...
 
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016
 
Creating a Data Driven Organization - StampedeCon 2016
Creating a Data Driven Organization - StampedeCon 2016Creating a Data Driven Organization - StampedeCon 2016
Creating a Data Driven Organization - StampedeCon 2016
 
Using The Internet of Things for Population Health Management - StampedeCon 2016
Using The Internet of Things for Population Health Management - StampedeCon 2016Using The Internet of Things for Population Health Management - StampedeCon 2016
Using The Internet of Things for Population Health Management - StampedeCon 2016
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Recently uploaded (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Big Data at Riot Games – Using Hadoop to Understand Player Experience - StampedeCon 2013

  • 1. BIG DATA @ RIOT GAMES USING HADOOP TO IMPROVE THE PLAYER EXPERIENCE BARRY LIVINGSTON & SANDEEP SHRESTHA | JULY 2013
  • 3. CONTEXT HIGH LEVEL ARCHITECTURE PLAYER EXPERIENCE USE CASES SUMMARY QUICK DATA WAREHOUSE HISTORY
  • 4. FIRST, A BIT OF CONTEXT…
  • 5.
  • 6. WHAT IS LEAGUE OF LEGENDS? 2009 LAUNCH TEAM ORIENTED 100+ CHAMPS MODERN FANTASY
  • 7. WHAT IS LEAGUE OF LEGENDS?
  • 8. LEAGUE OF LEGENDS GAMEPLAY - CHAMPIONS
  • 9. LEAGUE OF LEGENDS GAMEPLAY - GAMEPLAY
  • 11. INITIAL LAUNCH / SCRAPPY START UP PHASE ‣  Had  a  single,  dedicated  MySQL  instance  for  the  DW   ‣  Data  was  ETL’d  from  produc@on  slaves  into  this  instance   ‣  Queries  were  run  in  MySQL   ‣  Repor@ng  was  done  in  Excel   ▾  All  ETLs,  queries  and  repor@ng  were  done  by  one  person   HISTORY   START-­‐UP   THIS WORKED GREAT!
  • 12. THEN – CRAZY GROWTH HISTORY   START-­‐UP   @me   #  unique  logins   TOTAL  ACTIVE  PLAYERS    June  2012   CRAZY   GROWTH  
  • 13. THE BREAKING POINT HISTORY   START-­‐UP   CRAZY   GROWTH   BREAKING   POINT   ‣  Data  warehouse  reached  a  breaking  point   ▾  24  hours  of  data  took  24.5  hours  to  ETL   ‣  We  couldn’t  handle…   ▾  Mul@ple  environments  in  a  ver@cal  MySQL  instance     ▾  A  single  environment  in  a  ver@cal  MySQL  instance   ‣  We  needed  to  change    
  • 14. INTRODUCTION OF HADOOP HISTORY   START-­‐UP   CRAZY   GROWTH   BREAKING   POINT   ‣  Hadoop  has  a  number  of  great  quali@es   ▾  Cost  effec@ve   ▾  Scalable   ▾  Open  source   ▾  We  could  execute  quickly   HADOOP  
  • 15. HIGH LEVEL ARCHITECTURE – JUNE 2012 Tableau     Hive  Data  Warehouse   Pentaho     +     Custom   ETL     +     Sqoop   MySQL  Pentaho   Analysts   EUROPE   Audit   Plat   LoL   KOREA   Audit   Plat   LoL   NORTH  AMERICA   Audit   Plat   LoL   Business   Analyst  
  • 16. BUT, THIS WASN’T GOOD ENOUGH ‣  The  @me  to  arrive  at  insight  was  too  long!   ‣  Our  solu@on  required  too  much  data  team  involvement   ▾  Schema  changes   ▾  ETL  tweaks   ▾  Hive  metadata  updates   ‣  Hive  is  painful  for  ad-­‐hoc  or  interac@ve  analysis   ▾  Especially  for  non-­‐technical  folks  
  • 17. GOALS ‣  Democra@ze  data  access   ▾  Enable  Self-­‐service  Data  Collec@on  and   Analysis   ‣  Create  ac@onable  insights   ‣  Increase  speed  to  insight  
  • 18. USE CASE: GAME CLIENT PERFORMANCE
  • 19. CLIENT FOOTPRINT ‣  Significant  por@on  of  our  soware  runs  directly  on  players’   machines   ▾  High  performance  graphics   ▾  Responsiveness   ‣  There  is  logic  in  these  components  that's  ONLY  exercised   on  the  client-­‐side   ‣  Understanding  the  performance,  reliability  and  stability  of   these  features  is  paramount  to  improving  the  player   experience  
  • 24. CHALLENGE: THE GAME IS ALIVE The  game  is  a  living,  breathing  service  that’s  always  in  mo@on   ‣  New  champions   ‣  New  items     ‣  New  effects/par@cles   ‣  Changes  in  environment   ‣  Changes  in  design  and  design   balance       UPDATE 2-3WEEKS
  • 26. CHALLENGE: PC VARIABILITY ‣  Hardware  and  OS  profiles  are  significantly  different  even   within  regions   ▾  OS  and  patch  level   ▾  CPU   ▾  Memory   ▾  Video  card   ▾  Video  card  memory   ▾  Drivers  
  • 29. IMPROVING THE PLAYER EXPERIENCE ‣  We  need  to  gather  informa@on  across  all  of  these   dimensions  in  order  to  UNDERSTAND  the  player  experience   ‣  We  use  this  info  to:   ▾  React  quickly  to  changes   ▾  Op@mize  performance   ▾  Op@mize  designs   ▾  Improve  our  tes@ng   •  Like  crea@ng  our  compa@bility  tes@ng  lab  
  • 33. OPTIMIZING DESIGN AND PERFORMANCE
  • 34. OPTIMIZING DESIGN AND PERFORMANCE
  • 35. OPTIMIZING DESIGN AND PERFORMANCE
  • 36. OPTIMIZING DESIGN AND PERFORMANCE
  • 37. HOW DID WE SOLVE THIS WE HAVE AN ARMY OF TEEMOS WATCHING PLAYERS’ MACHINES THROUGH THEIR TELESCOPES?! (NOT REALLY, BUT WE DID CONSIDER IT)
  • 38. HONU: GENERATE - COLLECT - ANALYZE ‣  Riot’s  self-­‐service  end-­‐to-­‐end  Big  Data  pipeline   ▾  Cloud-­‐ready  (AWS  compa@ble)   ▾  Internal  data-­‐center  ready   ▾  Persistent  storage:  HDFS/S3   ▾  Batch  processing:  Apache  Hadoop/AWS  EMR   ▾  Data  publish:  Apache  Hive    
  • 39. EVENT GENERATION ‣  Honu  SDKs:  Java,  C++,  Erlang   ‣  Collector  discovery   ‣  Failover   ‣  Load  balancing   ‣  Buffering/Batching   ‣  Dispatching   ‣  Thri  transport  
  • 40. HONU CLIENT SDK Select  avg(f[‘pingAVG’])  from  game_client_stats  group  by  f[‘serverId’];   pingAvg   serverId   system  source      app  @mestamp   1234567890   99.123.456.78   game_client   220.9542   12.345.678.90   Intel64  …   GAME_CLIENT_STATS  
  • 41. EVENT COLLECTION ‣  Honu  collector   ‣  Online  system   ‣  High  availability  –  100%  up@me   ‣  Horizontally  scalable   ‣  Elas@c   ‣  Fault  tolerant   ‣  Neulix  OSS  Eureka  discovery  service  
  • 42. HONU COLLECTOR ‣  Collect  events  from  mul@ple  clients   (Thri/NIO)   ‣  Save  all  events  to  one  compressed   file  locally   ‣  Upload  that  file  every  XX  minutes  to   HDFS/S3   ‣  Send  a  message  to  Queue/SQS  for   Demux   H  o  n  u  C  o  l  l  e  c  t  o  r  s   S  Q  S   S  3  
  • 43. EVENT ORGANIZATION ‣  Honu  demux   ‣  Mul@-­‐stage  batch  processing  pipeline   ‣  Elas@c  producer-­‐consumer   ‣  Apache  Hadoop  map  reduce   ‣  Standalone  map  reduce  mode   ‣  Apache  Hive  integra@on  
  • 44. HONU DEMUX ‣  Mul@-­‐Stage  batch   processing  pipeline   ‣  Bucket  events  to  separate   tables   ‣  Write  Hive  par@@on  files   ‣  Add  par@@ons  to  Hive   metastore   ‣  Merge  par@@ons     Demux    SQS   S3 S3   Standalone Demux Standalone Demux Standalone Demux Standalone Demux S3 S3 S3 S3 HIVE   MERGE  
  • 48. PLAYER BEHAVIOR INITIATIVES TRIBUNAL JUSTICE ‣  Community  regulated   ‣  In-­‐game  chat  log   ‣  Player  stats   ‣  Inventory   ‣  Game  Info  
  • 49. PLAYER BEHAVIOR INITIATIVES HONOR SYSTEM ‣  Recognize  posi@ve  experience   ‣  Improve  sportsmanship  
  • 50. STARTUP TIPS TEAMS THAT USE SMART PINGS TO ALERT OTHER PLAYERS TO THREATS ARE MORE LIKELY TO WIN GAME PLAYERS WHO FOLLOW THE SUMMONER'S CODE WIN 27% MORE GAMES THE TRIBUNAL BANS PLAYERS FOR NEGATIVE BEHAVIOR SUCH AS VERBAL HARASSMENT PLAYERS WHO COOPERATE WITH THEIR TEAM WIN 31% MORE GAMES
  • 51. HOW WE SOLVED IT – EXTEND HONU HONU CLIENT SDK HONU COLLECTORS HONU DEMUX ORGANIZECOLLECTGENERATE
  • 52. HONU TOOLS: DRADIS ‣  Hwp  based  data  collec@on   ‣  Large  volume  of  data  from   untrusted  source   ‣  C10K   ‣  Nginx  +  Newy   ‣  4+  billion  API  calls/day   ‣  Peak  100K+  calls/sec    
  • 53. HONU TOOLS: DRADIS ‣  Json  Messages:   ▾  curl  -­‐d  ’[   {"messageType":  "Foo",  "@mestamp":  1369064555,  "fact":  "Hello  World!"},  {"messageType":   "Foo",  "@mestamp":  1369064555,  "fact":  "Hello  Dradis!",     "fic@on":  "Hello  Honu!"}]’     ‣  Hive  Query:   ▾  Select  *  from  foo  where  f[‘fact’]  =  ‘Hello  Dradis!’   Table:  Foo  
  • 54. HONU TOOLS: ECHO SERVICE ‣  Web  UI  to  easily  and  immediately  visualize  the  data  that  has  been  sent   to  Honu  collectors   ‣  Self-­‐service  end-­‐to-­‐end  pipeline  
  • 55. HONU TOOLS: ECHO SERVICE ‣  Web  UI  to  easily  and  immediately  visualize  the  data  that  has  been  sent   to  Honu  collectors   ‣  Self-­‐service  end-­‐to-­‐end  pipeline  
  • 56. HONU TOOLS: ECHO SERVICE ‣  Web  UI  to  easily  and  immediately  visualize  the  data  that  has  been  sent   to  Honu  collectors   ‣  Self-­‐service  end-­‐to-­‐end  pipeline  
  • 57. HONU TOOLS: METADATA SERVICE ‣  Data  discovery   ‣  Schema  management   ‣  Counter,  @me  
  • 58. HONU TOOLS: REAL-TIME SLICING/DICING ‣  Integration with Platfora ‣  End-user ad-hoc analysis tool ‣  Interactive visual feedback ‣  Realtime exploration/graphing @ 109 data points
  • 59. HONU TOOLS: REAL-TIME SLICING/DICING
  • 60. HONU TOOLS: WORKFLOW MANAGEMENT ENTERPRISE WORKFLOW MANAGEMENT MATT GOEKE @ LATER TODAY ClientMobile WWW
  • 61. HONU STATS ‣  7+ billion events/day ‣  Tested @ 70+ billion events/day ‣  100+ tables ▾  10+ tables @ 100M – 1B rows/day ‣  7 Petabytes Game Event Dataset ‣  Semi-global deployment ‣  0 downtime ‣  Runs in cloud (AWS) + datacenter
  • 63. GOALS ü Democra@ze  Data  Access   ü Enable  Self-­‐service  Data  Collec@on  and  Analysis   ü Create  Ac@onable  Insights   ü Increase  Speed  to  Insight   HONU HONU CLIENT SDK
  • 64. FUTURE ‣  Improve  self-­‐service  workflow  &  tooling   ▾  Metadata  management   ▾  Discovery  of  captured  data   ▾  Workflow  management   ▾  Plauora  to  all  teams   ‣  Real@me  event  aggrega@on   ‣  Global  data  infrastructure   ‣  Replace  legacy  audit/event  logging  services  
  • 65. HANDLE INCREASING DATA VELOCITY JUNE 2012 JULY 2013 MySQL  tables   180   1200   Pipeline  Events/day   0   7+  Billion   Workflows   Cronjob  +  Pentaho   Oozie   Environment   Datacenter   DC  +  AWS   SLA   1  day   2  hours   Event  tracking   •  2+  weeks  (DB   update)   •  Dependencies:  DBA   teams  +  ETL  teams  +   Tools  teams   •  Down@me  (3h  min.)   •  10  minutes   •  Self-­‐Service     •  No  down@me  
  • 67. SHAMELESS HIRING PLUG Like most everybody else at this conference… we’re hiring! PLAYER EXPERIENCE FIRST CHALLENGE CONVENTION FOCUS ON TALENT AND TEAM TAKE PLAY SERIOUSLY STAY HUNGRY, STAY HUMBLE THE RIOT MANIFESTO
  • 68. SHAMELESS HIRING PLUG AND YES, YOU CAN PLAY GAMES AT WORK IT’S ENCOURAGED!
  • 69. THANK YOU! QUESTIONS? BARRY LIVINGSTON blivingston@riotgames.com SANDEEP SHRESTHA sshrestha@riotgames.com