3. Deezer overview
RecSysFr #3
- 420 employees in 20 cities
- 5M albums
- 40M tracks
- 100M playlists
- 16M MAU
- 6M subscribers
4. Some technical numbers
- ~500 servers
- 4.5 PB storage for audio files
- 1.5 TB of logs / day
- ~1B requests / day
- ~30k new albums each week
- Hadoop cluster with 1.5 PB storage, 4 TB RAM, 1000+ vcores
10. Architecture overview
Content data:
- Tags
- Popularity
User data:
- Taste model
- Hot tracks
- Behaviors
Build tracklist
- Data cache
- User action history
- Update user models
- Consolidate tags data
- Build indexes
Action logs
11. A/B test evaluation metrics
- % of users listening for more than 10 min
- % of users who reconnected on more than 3 days last week
- % of users who like / dislike a track
=> Take care of statistical confidence!
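One possible way to check statistical confidence on a percentage metric such as "% of users listening for more than 10 min" is a two-proportion z-test. The sketch below is only an illustration, not the method used in the talk; the function name and the user counts are hypothetical.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value). A small p_value suggests the difference between
    group A (control) and group B (variant) is unlikely to be noise.
    """
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled proportion under the null hypothesis "both groups are equal".
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("standard error is zero: all users share one outcome")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: users who listened for more than 10 min in each group.
z, p = two_proportion_z_test(4000, 10000, 4150, 10000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

With a fixed threshold chosen before the test (0.05 is a common convention), the variant would only be declared better when the p-value falls below it; smaller groups need a larger observed difference to get there.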
12. Offline testing / benchmarking
- A/B tests are costly and slow
- We want to test more cases
Offline testing:
- Set up a benchmarking methodology
- Freeze the data and evaluate algorithms against users' future actions
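The "freeze data, evaluate against future actions" idea can be sketched as a temporal split followed by a ranking metric. This is a minimal illustration under assumptions: the `(user, track, timestamp)` event format, the precision@k metric, and the popularity baseline are all hypothetical choices, not details from the talk.

```python
from collections import defaultdict


def temporal_split(events, freeze_ts):
    """Split (user, track, timestamp) events into past and future per user."""
    past, future = defaultdict(set), defaultdict(set)
    for user, track, ts in events:
        (past if ts < freeze_ts else future)[user].add(track)
    return past, future


def make_popularity_recommender(past):
    """Hypothetical baseline: rank tracks by how many users played them."""
    counts = defaultdict(int)
    for tracks in past.values():
        for track in tracks:
            counts[track] += 1
    ranked = sorted(counts, key=lambda t: (-counts[t], t))

    def recommend(user, history):
        # Never recommend something the user already played.
        return [t for t in ranked if t not in history]

    return recommend


def precision_at_k(recommend, past, future, k=10):
    """Mean share of the top-k recommendations that the user played later."""
    scores = []
    for user, future_tracks in future.items():
        if user not in past:
            continue  # cold-start users are skipped in this sketch
        recs = recommend(user, past[user])[:k]
        hits = sum(1 for t in recs if t in future_tracks)
        scores.append(hits / k)
    return sum(scores) / len(scores) if scores else 0.0


# Toy action log; the freeze point is timestamp 3.
events = [
    ("u1", "a", 1), ("u1", "b", 2), ("u2", "b", 1), ("u2", "c", 2),
    ("u1", "c", 5), ("u2", "a", 6), ("u3", "a", 7),
]
past, future = temporal_split(events, freeze_ts=3)
recommend = make_popularity_recommender(past)
print(precision_at_k(recommend, past, future, k=1))
```

Because the recommender only sees events before the freeze point, several algorithms can be compared on the same frozen snapshot without running a live A/B test for each one.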