
Gdmc v11 presentation

GDMC2017 results


  1. 2017 IEEE CIG Game Data Mining Competition (GDMC) (https://cilab.sejong.ac.kr/gdmc2017/)
     KyungJoong Kim, Dumim Yoon and Jihoon Jeon (Cognition & Intelligence Lab, Sejong University); Sung-il Yang and SangKwang Lee (Electronics and Telecommunications Research Institute); EunJo Lee and Yoonjae Jang (NCSOFT)
  2. Game Data Mining
     • Understanding game players' behaviors from data
     • In particular, predicting players' churn/retention or purchase behaviors from game log data
     • Few public datasets are available to researchers, which limits the growth of the field
  3. Game Data Mining Competition
     • Access to large-scale game log data (about 100 GB) from the commercially successful MMORPG Blade & Soul by NCSOFT, one of the biggest game companies in South Korea
     • Predict game players' churn (a binary classification problem) and survival time (a regression problem) from the massive game log data
  4. Blade & Soul (http://www.bladeandsoul.com/en/)
  5. Competition Tracks
     • Track 1: Churn Prediction. Participants predict players' churn or retention on the test datasets. The winner is determined by the average F1-measure.
     • Track 2: Survival Analysis. Participants predict the survival time (the number of days) of game players on the test datasets. The winner is determined by the average Root Mean Squared Logarithmic Error (RMSLE).
  6. GDMC 2017 Homepage (https://cilab.sejong.ac.kr/gdmc2017/)
     • Important Dates
     • Problem Description
     • Tutorial (with R)
     • Data Description
     • Rules
  7. GDMC 2017 Google Groups (https://groups.google.com/d/forum/gdmc2017)
     • Announcement
     • Sample Log
     • Log Schema
     • Log Data Download
     • Training Data
     • Test Data without Label
     • Question/Answer
     [Chart: number of group members by month. March 0, April 76, May 106, June 206, July 255, August 264]
  8. Test Server (http://web_cilab.sejong.ac.kr/gdmcServer/)
     • Test your predictions before the deadline
     • 10% of the test data is used for this test server (not used in the final rankings)
     • For security reasons, submissions are limited to a maximum of 48 trials per day (30-minute wait after the last submission)
  9. Problem Description
  10. Prediction Targets
      [Diagram labels: Expense, Loyalty, Light Users or Malicious Users (Bots), Prediction Targets]
  11. Predictions about 3 Weeks from Now
      [Timeline diagram labels: User Data, Two Months, Three Weeks, Churn/Retention; axis: Time]
  12. Churn/Retention
      • A long-term inactive state is treated as churn
      • How many weeks for the churn decision? Five weeks
      • Retention: logged into the game more than once during the five weeks
  13. Concept Drift (Dec 2016~)
      • Business model change: subscription model (monthly fixed charge payment) to free-to-play
  14. Data Description

      | Data Set   | Time Period               | Weeks | Number of Gamers | Data Size*          |
      |------------|---------------------------|-------|------------------|---------------------|
      | Training   | APR-1-2017 ~ MAY-11-2017  | 6     | 4000 (30% churn) | 48 GB (175M events) |
      | Test Set 1 | JULY-27-2016 ~ SEP-21-2016 | 8    | 3000 (30% churn) | 30 GB               |
      | Test Set 2 | DEC-14-2016 ~ FEB-08-2017 | 8     | 3000 (30% churn) | 30 GB               |

      * Uncompressed size
  15. Log Data Sample

      | Time                  | Event Type  | Details (up to 72 columns)         |
      |-----------------------|-------------|------------------------------------|
      | 2016-05-04 6:38:32 PM | Enter World | Login Type, Actor Data …           |
      | 2016-05-04 6:39:16 PM | Enter Zone  | Enter Zone Reason, Zone Type …     |
      | 2016-05-04 6:39:36 PM | Lose Item   | Item Type, Item Count, …           |
      | 2016-05-04 6:39:36 PM | Get Item    | Item Type, Item Count, …           |
      | 2016-05-04 6:39:40 PM | Get Item    | Item Type, Item Count, …           |
      | ⋮                     | ⋮           | ⋮                                  |

      82 event types (World, Zone, Item, Party, Quest, Guild)
  16. Competition Results: Track 1, Churn Prediction
  17. Participants (13 Teams)

      | Team name  | Team members | Affiliation             | Type     | Country     |
      |------------|--------------|-------------------------|----------|-------------|
      | GoAlone    | 1            | Yonsei University       | Academia | South Korea |
      | DTND       | 3            | DTND                    | ?        | South Korea |
      | goedle.io  | 2            | goedle.io GmbH          | Industry | Germany     |
      | IISLABSKKU | 3            | Sungkyunkwan University | Academia | South Korea |
      | leessang   | 2            | Yonsei University       | Academia | South Korea |
      | TheCowKing | 2            | KAIST                   | Academia | South Korea |
      | TripleS    | 3            | -                       | ?        | South Korea |
      | UTU        | 4            | University of Turku     | Academia | Finland     |
      | YD         | 6            | Silicon Studio          | Industry | Japan       |
      | YK         | 1            | Yonsei University       | Academia | South Korea |
      | suya       | 1            | Yonsei University       | Academia | South Korea |
      | NoJam      | 3            | Yonsei University       | Academia | South Korea |
      | MNDS       | 3            | Yonsei University       | Academia | South Korea |
  18. Track 1 Results

      | Rank | Team            | Test1 score | Test2 score | Total score |
      |------|-----------------|-------------|-------------|-------------|
      | 1    | YD (Japan)      | 0.61008     | 0.63326     | 0.62145     |
      | 2    | UTU (Finland)   | 0.60326     | 0.60370     | 0.60348     |
      | 3    | TripleS (Korea) | 0.57968     | 0.62459     | 0.60130     |
      | 4    | TheCowKing      | 0.59370     | 0.60718     | 0.60036     |
      | 5    | goedleio        | 0.57717     | 0.60095     | 0.58882     |
      | 6    | MNDS            | 0.55920     | 0.56205     | 0.56062     |
      | 7    | DTND            | 0.49937     | 0.58776     | 0.53997     |
      | 8    | IISLABSKKU      | 0.56643     | 0.48733     | 0.52391     |
      | 9    | suya            | 0.44460     | 0.40967     | 0.42642     |
      | 10   | YK              | 0.49099     | 0.33181     | 0.39600     |
      | 11   | GoAlone         | 0.42697     | 0.31019     | 0.35933     |
      | 12   | NoJam           | 0.30741     | 0.30930     | 0.30835     |
      | 13   | Lessang         | 0.29760     | 0.29202     | 0.29479     |
  19. YD (Winner)
      • Silicon Studio, Japan
      • Team members: Paul Bertens, Pei Pei Chen, Kexin Chen, Anna Guitart, Sovann Lay, Africa Perianez
      • Found features that have similar distributions in the training set and the test sets
      • Test 1: LSTM + DNN (implemented with Keras)
      • Test 2: Extra-Trees classifier (number of trees = 50)
  20. [Figure: LSTM+DNN, from the YD team's document]
  21. Track 1 Techniques

      | Rank | Team       | Techniques                                 |
      |------|------------|--------------------------------------------|
      | 1    | YD         | LSTM+DNN, Extra-Trees Classifier           |
      | 2    | UTU        | Logistic Regression                        |
      | 3    | TripleS    | Random Forest                              |
      | 4    | TheCowKing | LightGBM (Light Gradient Boosting Machine) |
      | 5    | goedleio   | Feed Forward Neural Network                |
      | 6    | MNDS       | Deep Neural Network                        |
      | 7    | DTND       | Generalized Linear Model                   |
      | 8    | IISLABSKKU | Tree Boosting                              |
      | 9    | suya       | Deep Neural Network                        |
      | 10   | YK         | Logistic Regression                        |
      | 11   | GoAlone    | Logistic Regression                        |
      | 12   | NoJam      | Decision Tree                              |
      | 13   | Lessang    | Deep Neural Network                        |

      Legend in the original slide: Neural Net, Tree Approach, Linear Models
  22. Competition Results: Track 2, Survival Analysis
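  23. Participants (5 Teams)

      | Team name  | Team members | Affiliation             | Country     |
      |------------|--------------|-------------------------|-------------|
      | DTND       | 3            | DTND                    | South Korea |
      | IISLABSKKU | 3            | Sungkyunkwan University | South Korea |
      | TripleS    | 3            | -                       | South Korea |
      | UTU        | 4            | University of Turku     | Finland     |
      | YD         | 6            | Silicon Studio          | Japan       |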
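  24. Track 2 Results

      | Rank | Team               | Test1 score | Test2 score | Total score |
      |------|--------------------|-------------|-------------|-------------|
      | 1    | YD (Japan)         | 0.883248    | 0.616499    | 0.726151    |
      | 2    | IISLABSKKU (Korea) | 1.034321    | 0.679214    | 0.819972    |
      | 3    | UTU (Finland)      | 0.927712    | 0.898471    | 0.912857    |
      | 4    | TripleS            | 0.958308    | 0.891106    | 0.923486    |
      | 5    | DTND               | 1.032688    | 0.930417    | 0.978888    |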
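  25. Track 2 Techniques

      | Rank | Team       | Techniques                                                     |
      |------|------------|----------------------------------------------------------------|
      | 1    | YD         | Ensemble of Conditional Inference Trees (number of trees = 900) |
      | 2    | IISLABSKKU | Tree Boosting                                                  |
      | 3    | UTU        | Linear Regression                                              |
      | 4    | TripleS    | Ensemble Tree Method                                           |
      | 5    | DTND       | Generalized Linear Model                                       |

      Legend in the original slide: Neural Net, Tree Approach, Linear Models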
  26. Future Data Use
      • Data download deadline: downloads remain active until the end of August; an extension of the deadline is under discussion
      • Data use for academic research: no restriction on data use for academic research (please include an acknowledgement of this competition and NCSOFT)
      • Test data labels: the test data labels will be opened soon
  27. Q & A
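
A minimal sketch of the two evaluation metrics named on slide 5, in plain Python. Assumptions not stated in the slides: churn is treated as the positive class (label 1) for F1, and RMSLE uses log(1 + x) on survival days. The `harmonic_mean` helper is also an assumption: the slides say only "average", but the Total score columns on slides 18 and 24 appear to match the harmonic mean of the Test1 and Test2 scores at the printed precision (e.g. YD in both tracks).

```python
import math


def f1_score(y_true, y_pred):
    """F1 for binary labels; assumes 1 = churn is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative values (days)."""
    squared = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))


def harmonic_mean(a, b):
    """Assumed way of combining the Test1 and Test2 scores into a total."""
    return 2 * a * b / (a + b)


if __name__ == "__main__":
    # Hypothetical predictions, for illustration only.
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
    print(rmsle([10, 30, 64], [12, 25, 64]))
    # Scores taken from the Track 1 table (YD).
    print(harmonic_mean(0.61008, 0.63326))
```

For F1, higher is better; for RMSLE, lower is better, which is why the Track 2 ranking is in ascending order of total score.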

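
A minimal sketch of the churn labeling rule from slides 11 and 12, under one reading of the timeline: the decision window starts three weeks after the end of the observed user data and lasts five weeks. The function name, its parameters, the half-open window, and the literal reading of "more than once" (at least two logins) are all assumptions for illustration, not the organizers' code.

```python
from datetime import date, timedelta


def is_churn(login_dates, observation_end, gap_weeks=3, window_weeks=5,
             min_logins=2):
    """Return True if the player counts as churned.

    Assumed timeline: observed data ends at `observation_end`, the
    decision window opens `gap_weeks` later and spans `window_weeks`.
    A player is retained with at least `min_logins` logins in the window.
    """
    window_start = observation_end + timedelta(weeks=gap_weeks)
    window_end = window_start + timedelta(weeks=window_weeks)
    logins_in_window = sum(
        1 for d in login_dates if window_start <= d < window_end
    )
    return logins_in_window < min_logins


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    end = date(2017, 5, 11)
    print(is_churn([date(2017, 6, 2), date(2017, 6, 20)], end))
    print(is_churn([date(2017, 5, 20)], end))
```

Setting `min_logins=1` gives the looser reading in which a single login during the five weeks counts as retention.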