Big Data Panel

Deepak Agarwal, LinkedIn
      JSM, 2012
    San Diego, USA
Disclaimer

• The opinions expressed here are mine and in no way
  represent the official position of LinkedIn
Example of user interaction




ts, user-id, <items shown at various slots>, <what was clicked?>, <what happened after the click>

user-id: covariates; item-id: covariates; user-id: social connections
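
One way to picture the record above is as a small typed structure. The sketch below is illustrative only: the class name, field names, and sample events are assumptions made for this example, not the actual logging schema. It computes a click-through rate per slot, the kind of simple summary such a log supports.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Impression:
    """One logged interaction (hypothetical field names)."""
    ts: int                      # timestamp of the event (units assumed)
    user_id: str
    items_by_slot: List[str]     # item shown at slot 0, 1, 2, ...
    clicked_slot: Optional[int]  # None when nothing was clicked
    post_click: Optional[str]    # what happened after the click, if anything


def ctr_by_slot(events: List[Impression]) -> Dict[int, float]:
    """Clicks divided by impressions, computed separately for each slot."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for e in events:
        for slot in range(len(e.items_by_slot)):
            shown[slot] += 1
        if e.clicked_slot is not None:
            clicked[e.clicked_slot] += 1
    return {slot: clicked[slot] / shown[slot] for slot in sorted(shown)}


# Made-up events, purely for illustration.
events = [
    Impression(1, "u1", ["a", "b"], 0, "shared"),
    Impression(2, "u2", ["b", "c"], None, None),
    Impression(3, "u1", ["c", "a"], 1, None),
    Impression(4, "u3", ["a", "b"], 0, None),
]
print(ctr_by_slot(events))  # {0: 0.5, 1: 0.25}
```

User and item covariates, and the social connections, would live in separate tables keyed by user-id and item-id and be joined in when needed.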
Statistical Challenges
• Exploratory Analysis (EDA), Visualization
  – Retrospective (on Terabytes)
  – More Real Time (every few minutes/hours)
• Statistical Modeling
  – Scale (computational challenge)
  – Dimensionality (a few categorical variables with a
    massive number of interacting levels)
  – Temporal Effects
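
To make the dimensionality point concrete: with, say, 10^6 users and 10^5 items (hypothetical counts), a user-by-item interaction has up to 10^11 levels, far too many to give each its own parameter. One common workaround, not mentioned in the slides and shown here only as a sketch, is feature hashing: map each interaction level into a fixed number of buckets and accept some collisions. The function name, bucket count, and IDs are assumptions for this example.

```python
import hashlib


def hashed_index(user_id: str, item_id: str, num_buckets: int = 2**20) -> int:
    """Map a (user, item) interaction level to one of num_buckets indices.

    Uses a stable hash (blake2b) rather than Python's built-in hash(),
    which is randomized per process for strings.
    """
    key = f"{user_id}\x1f{item_id}".encode("utf-8")
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_buckets


# Hypothetical sizes: the raw interaction space versus the hashed one.
n_users, n_items = 10**6, 10**5
print("raw interaction levels:", n_users * n_items)  # 100000000000
print("hashed buckets:", 2**20)                      # 1048576

# The same pair always lands in the same bucket.
idx = hashed_index("user-42", "item-7")
print(idx == hashed_index("user-42", "item-7"))      # True
```

The trade-off is that distinct interactions can share a bucket, so the bucket count is a tuning knob between memory and collision noise.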
Statistical Challenges continued
• Experiments
  – To test new methods and test hypotheses through
    randomized experiments
  – Adaptive experiments
• Forecasting
  – Planning, advertising
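
The slides do not say which adaptive designs are meant. As one familiar instance, the sketch below uses Thompson sampling with Beta priors on click-through rates: unlike a fixed randomized split, traffic shifts toward the better-performing variant as evidence accumulates. The true rates, round count, and function name are made up for illustration.

```python
import random
from typing import List


def thompson_sampling(true_ctrs: List[float], rounds: int, seed: int = 0) -> List[int]:
    """Simulate Beta-Bernoulli Thompson sampling; return pulls per arm."""
    rng = random.Random(seed)
    k = len(true_ctrs)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(rounds):
        # Draw a plausible CTR for each arm from its Beta(1 + s, 1 + f) posterior.
        draws = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)]
        arm = max(range(k), key=draws.__getitem__)
        # Simulate a click with the arm's (normally unknown) true rate.
        clicked = rng.random() < true_ctrs[arm]
        pulls[arm] += 1
        if clicked:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Hypothetical true click-through rates for three variants.
pulls = thompson_sampling([0.05, 0.10, 0.30], rounds=2000)
print(pulls)  # most of the traffic should go to the third variant
```

The price of adaptivity is that the data are no longer a simple random sample per arm, which complicates after-the-fact hypothesis testing.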
My 2 cents
•   Big Data problems are complex, messy, and inherently multi-disciplinary
•   Having a clear idea of the underlying scientific problem is important
•   Systems, Algorithms, Statistics, Machine Learning, Optimization,…
•   Statisticians could consume the wonderful tools created by our friends and
    develop the statistical aspects
     – Learn Hadoop and Pig; they have become easy to use (like R)
• Emphasis on areas like sampling, DOE, scalable model fitting

• More collaborative programs between academia/industry,
  academia/government
     – E.g. training programs for students working with problem owners
