In the online gaming industry we receive a vast number of transactions that must be handled in real time. Our customers choose from hundreds or even thousands of options, and providing a seamless experience is crucial in our industry. Recommendation systems can be the answer in such cases, but they require handling large volumes of data and substantial processing power. Toward this goal, over the last two years we have gone down the road of machine learning and AI in order to transform our customers' daily experience and upgrade our internal services.
3. About
▪ Kaizen is a top GameTech company in Greece and one of the fastest growing in Europe.
▪ At Kaizen we use technology to offer the best possible products and services to those who trust us for their entertainment.
5. A bit of history - initial workflow
▪ Several data sources
▪ Data Warehouse, DB’s, Files etc.
▪ Training on local workstation
▪ Model / application deployment (Docker)
6. Architecture Bottlenecks and Challenges
▪ Data
▪ Data availability
▪ Time traveling
▪ Noisy label / no label
▪ Features
▪ Recalculation
▪ Model
▪ Versioning
▪ Experiment tracking / logs
▪ Dedicated VMs
▪ Scalability
▪ Application dockerization
▪ Model versioning
(Diagram: bottlenecks split between the Application side and the Machine learning side)
7. Journey Log: Day 210
▪ Databricks & Azure
▪ Real-time Data flows
▪ Feature creation
▪ Model predictions
▪ Batch Data flows
▪ Model training
▪ ETL
▪ MLflow
▪ Experiment Tracking
▪ Model registry
▪ Delta Lake
▪ Single Source of Truth
▪ ACID transactions
▪ Time travel
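The Delta Lake properties above (versioned ACID commits and time travel) can be illustrated with a minimal stdlib sketch. This is a toy stand-in, not the Delta API: `ToyVersionedTable` and its methods are hypothetical names invented for this illustration.

```python
from copy import deepcopy

class ToyVersionedTable:
    """Toy stand-in for a Delta table: every commit creates a new
    immutable snapshot, so older versions stay readable (time travel)."""

    def __init__(self):
        self._versions = []  # snapshot after each commit

    def commit(self, rows):
        # All-or-nothing append: the new snapshot becomes visible only
        # once the copy is complete (ACID-style atomicity).
        base = deepcopy(self._versions[-1]) if self._versions else []
        base.extend(rows)
        self._versions.append(base)
        return len(self._versions) - 1  # version number of this commit

    def read(self, version_as_of=None):
        # Default: latest snapshot; otherwise read an older version.
        if version_as_of is None:
            version_as_of = len(self._versions) - 1
        return self._versions[version_as_of]

table = ToyVersionedTable()
v0 = table.commit([{"bet_id": 1, "odds": 2.5}])
v1 = table.commit([{"bet_id": 2, "odds": 1.8}])

print(len(table.read()))                   # latest snapshot: 2 rows
print(len(table.read(version_as_of=v0)))   # time travel: 1 row
```

The real-world counterpart in Databricks is reading a Delta table with `spark.read.format("delta").option("versionAsOf", 0).load(path)`.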
8. Designing Data Pipelines (What, Why) => How
▪ What, Why
▪ Input:
▪ Structured Data stored in Kafka in avro format
▪ Latency up to 10 sec
▪ Output:
▪ avro messages dispatched in Kafka
▪ directly consumed from microservices
▪ How
▪ Use structured streaming for both:
▪ feature generation
▪ model prediction
▪ Use Kafka for low latency and pipelining between data flows
Use case 1. Pipelines with low latency
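The low-latency flow above (Kafka in, feature generation, model prediction, Kafka out) can be sketched per message. This is a plain-Python stand-in, not Spark Structured Streaming code; the event fields, `extract_features`, and the stub model are all hypothetical, and JSON stands in for the Avro encoding.

```python
import json

def extract_features(event):
    # Hypothetical feature generation step: turn a raw event into a
    # numeric feature vector keyed by customer.
    return {
        "customer_id": event["customer_id"],
        "stake": float(event["stake"]),
        "n_selections": len(event["selections"]),
    }

def predict(features):
    # Stub standing in for the deployed model: flag high-stake,
    # multi-selection bets (illustration only, not a real rule).
    score = 1.0 if features["stake"] > 100 and features["n_selections"] > 3 else 0.0
    return {"customer_id": features["customer_id"], "score": score}

def process_stream(messages):
    # Each message is handled independently, which is what keeps
    # end-to-end latency low; output is dispatched back as messages.
    out = []
    for raw in messages:
        event = json.loads(raw)  # in production: Avro decode
        out.append(json.dumps(predict(extract_features(event))))
    return out

incoming = [json.dumps({"customer_id": 7, "stake": 150, "selections": [1, 2, 3, 4]})]
print(process_stream(incoming))
```

In the actual pipeline both stages run as Structured Streaming queries, with Kafka topics connecting feature generation to prediction.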
9. Designing Data Pipelines (What, Why) => How
▪ What, Why
▪ Input:
▪ Structured Data stored in Kafka in avro format
▪ Delta Tables
▪ Latency few minutes
▪ Output:
▪ Delta Tables
▪ PostgreSQL tables
▪ How
▪ Use structured streaming for both:
▪ feature generation
▪ model prediction
▪ Use Batch processing for feature vector generation
Use case 2. Pipelines with average latency
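The batch feature vector generation mentioned above can be sketched as a fold over events per customer. This is a hypothetical stdlib illustration of the idea, not the production job; the field names and the aggregates chosen are assumptions.

```python
from collections import defaultdict

def build_feature_vectors(events):
    """Hypothetical batch job: fold a window of events into one
    feature vector per customer (counts, totals, averages)."""
    acc = defaultdict(lambda: {"n_bets": 0, "total_stake": 0.0})
    for e in events:
        row = acc[e["customer_id"]]
        row["n_bets"] += 1
        row["total_stake"] += e["stake"]
    # Derive ratio features once the aggregates are complete.
    return {
        cid: {**row, "avg_stake": row["total_stake"] / row["n_bets"]}
        for cid, row in acc.items()
    }

events = [
    {"customer_id": 1, "stake": 10.0},
    {"customer_id": 1, "stake": 30.0},
    {"customer_id": 2, "stake": 5.0},
]
vectors = build_feature_vectors(events)
print(vectors[1]["avg_stake"])  # 20.0
```

In the pipeline above, the same aggregation would run as a Spark batch job reading from Delta Tables and writing the vectors to Delta and PostgreSQL.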
10. Personalization Journey
▪ Some numbers
▪ ~3K unique games per day
▪ each of which breaks down further per market
▪ ~300K unique events per year
▪ Our aim is to
▪ provide personalized content
▪ improve the customer experience
▪ increase loyalty
Sportsbook Personalization
12. Personalization Journey
▪ Reward increases loyalty
▪ ~ 40% of customer support communication
▪ ~ 4.5M bonus reward assessments per year
▪ Manual and periodic assessments
▪ Real-time decision on bonus eligibility and allocation
Real Time Bonus Computation
13. Architecture and technical overview
▪ Feature / prediction streaming
▪ Binary Classification / MLlib Gradient Boosting
▪ MLflow
▪ Experiment tracking
▪ Model deployment
▪ Model registry
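What experiment tracking and a model registry record can be shown with a toy stdlib sketch. This is not the MLflow API; `ToyTracker` and its methods are hypothetical names illustrating the two concepts: runs log their parameters and metrics, and the registry maps a stable model name to the run that currently backs it.

```python
import time

class ToyTracker:
    """Toy stand-in for MLflow-style tracking: each run records its
    parameters and metrics; the registry resolves a model name to a run."""

    def __init__(self):
        self.runs = []
        self.registry = {}

    def log_run(self, params, metrics):
        run = {"run_id": len(self.runs), "params": params,
               "metrics": metrics, "ts": time.time()}
        self.runs.append(run)
        return run["run_id"]

    def register_model(self, name, run_id):
        # "Promote" a run's model: consumers look it up by name, not
        # by run, so deployment stays decoupled from training.
        self.registry[name] = run_id

tracker = ToyTracker()
rid = tracker.log_run({"max_depth": 5, "n_trees": 200}, {"auc": 0.87})
tracker.register_model("bonus-eligibility", rid)
print(tracker.registry["bonus-eligibility"])  # 0
```

With MLflow itself, the equivalents are `mlflow.log_param` / `mlflow.log_metric` inside a run and registering the resulting model in the Model Registry.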
14. Future steps
▪ Real-time applications
▪ Feature store and reusability
▪ Cassandra
▪ MLflow Model Serving
▪ Use Redis for key-value lookup use cases
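The planned feature store with Redis-backed key-value lookup can be sketched with a stdlib stand-in. This is a toy illustration, not Redis or any feature-store product: `ToyFeatureStore`, its TTL behavior, and the key layout are all assumptions made for the example.

```python
import time

class ToyFeatureStore:
    """Toy stand-in for a Redis-backed online feature store: pipelines
    write feature vectors by key, prediction services fetch them by
    key, and a TTL lets stale features expire."""

    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self._data = {}  # key -> (features, written_at)

    def put(self, key, features, now=None):
        self._data[key] = (features, now if now is not None else time.time())

    def get(self, key, now=None):
        if key not in self._data:
            return None
        features, written_at = self._data[key]
        now = now if now is not None else time.time()
        if now - written_at > self.ttl:
            return None  # expired, like a key with a Redis TTL
        return features

store = ToyFeatureStore(ttl_seconds=60)
store.put("customer:42", {"avg_stake": 20.0}, now=0)
print(store.get("customer:42", now=30))   # {'avg_stake': 20.0}
print(store.get("customer:42", now=120))  # None (expired)
```

The point of the design is reusability: the same keyed feature vectors serve both real-time prediction and batch training, instead of each pipeline recalculating them.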