OSCON 2015
2. About Me
▪ Netflix
- I joined Netflix in 2011
- I spend my time working to make big data easy and efficient
- Usually from the perspective of someone trying to use the platform
▪ University of Florida
- Research in Information Retrieval
- How much information does a document have?
5. ~20 PB of compressed data
~500 billion events a day
~18K data sets
~4200 nodes in our clusters
11. Task Hour Cost = (cost of node)/(tasks per node) * sum(task duration ms)/(60*60*1000)
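A minimal sketch of the task hour cost formula above, in Python. The slide does not state units, so this assumes "cost of node" is an hourly price; the function name, parameter names, and example numbers are hypothetical and chosen only for illustration.

```python
MS_PER_HOUR = 60 * 60 * 1000  # the (60*60*1000) divisor from the slide


def task_hour_cost(node_cost_per_hour, tasks_per_node, task_durations_ms):
    """(cost of node) / (tasks per node) * sum(task duration ms) / MS_PER_HOUR.

    Assumption: node_cost_per_hour is the hourly price of one node.
    """
    if tasks_per_node <= 0:
        raise ValueError("tasks_per_node must be positive")
    cost_per_task_hour = node_cost_per_hour / tasks_per_node
    task_hours = sum(task_durations_ms) / MS_PER_HOUR
    return cost_per_task_hour * task_hours


# Hypothetical job: three tasks lasting 0.5 h, 1 h, and 1.5 h on nodes that
# cost 2.0 per hour and run 8 tasks each.
example_cost = task_hour_cost(2.0, 8, [1_800_000, 3_600_000, 5_400_000])
print(example_cost)  # 0.75
```

Dividing the node cost by the tasks per node gives the price of one task slot for one hour; the summed durations then convert a job's work into task hours.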
24. Dataset                     Distinct Queries
    …                           2000
    …                           1052
    prodhive/dse/geo_country_d  1009
    prodhive/dse/ttl_title_d    580
    …                           565
    …                           512
    …                           466
    …                           427
    …                           395
    …                           317
26. Related To geo_country_d           Shared Queries
    prodhive/dse/ttl_title_country_r   2277
    …                                  1697
    prodhive/dse/ttl_show_d            1540
    prodhive/dse/ttl_season_d          1405
    prodhive/dse/ttl_title_d           1392
    …                                  926
    …                                  817
    …                                  743
    prodhive/dse/ttl_season_country_r  638
    …                                  628
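One way a "shared queries" count like the one on slide 26 could be computed, sketched in Python. The slides do not say how the numbers were produced, so this is an assumption: a shared query is taken to be a distinct query that reads both the target dataset and the related dataset. The function name and the toy log are hypothetical.

```python
from collections import Counter


def shared_query_counts(query_datasets, target):
    """For each other dataset, count the distinct queries that read it
    together with `target`.

    query_datasets maps a query id to the set of datasets that query reads.
    """
    counts = Counter()
    for datasets in query_datasets.values():
        if target in datasets:
            counts.update(d for d in set(datasets) if d != target)
    return counts


# Hypothetical toy log, not real usage data.
toy_log = {
    "q1": {"geo_country_d", "ttl_title_d"},
    "q2": {"geo_country_d", "ttl_title_d", "ttl_show_d"},
    "q3": {"ttl_show_d"},
    "q4": {"geo_country_d"},
}
related = shared_query_counts(toy_log, "geo_country_d")
print(related.most_common())
```

Keying the log by query id means each distinct query is counted once per related dataset, however many times it runs.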
28. Datasets                    Input Jobs  Queries
    prodhive/cdn/occ…           2016        66
    teradata/gdw_stg_prod/seg…  1587        36
    prodhive/dse/msg…           1527        14
    prodhive/dse/msg…           1512        30
    teradata/gdw_stg_prod/seg…  1043        50
    teradata/gdw_stg_prod/cdn…  970         10
    teradata/gdw_tbl_prod/seg…  903         1
    prodhive/rpt/pbe…           811         11
    prodhive/gps/gro…           904         137
    prodhive/cdn/ttl…           631         39
30. Challenges
▪ Knowing what questions you should try to answer.
▪ Getting this data isn’t easy.
▪ The data is noisy.
31. Thanks
▪ Charles Smith – Big Data Platform Architecture, Netflix
▪ @charles_s_smith