Data Science in the Newsroom
Geetu Ambwani
Principal Data Scientist
geetu.ambwani@huffingtonpost.com
What is the Huffington Post?
Founded: May 2005
Ranking among digital-only news websites: 1
Cross-platform monthly unique visitors: over 187 million
Number of articles per day: over 500
Number of international editions: 15
Bloggers: over 100,000
News Industry - Trends
HuffPost has consistently been an innovator in the digital publishing space.
● Massive blogging network: more than 100K bloggers across the globe
● Google site rank
● Biggest social publisher
News Industry - Challenges
How Can Data Help ?
Ad campaigns
International editions
Social media promotion
Editors
User-experience
Blog moderators
Reporters
HuffPost Studio
Content Lifecycle
Creation, Distribution, Consumption
Content Creation: How Can Data Help ?
● Tools to help surface, discover trends in different parts of the web
● Content Enhancement with multimedia based on semantic matching (images, slideshows, videos)
● Optimizing headlines/images (RobinHood Platform)
Content Gap: Production Versus Consumption
Content Consumption: How Can Data Help?
Know Your Audience
● User Cohorts:
○ Social Traffic versus FrontPage Clickers consume different content
○ Desktop Vs Mobile consumption
● Recommendations/Personalization
● Can we use data to inform product design and interface ?
○ Rearrange share buttons based on traffic origin (Facebook vs Pinterest)
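The share-button idea above can be sketched in a few lines. This is only an illustration: the default button order, the referrer-to-network mapping, and the function name `order_share_buttons` are assumptions made here, not details from the talk.

```python
from urllib.parse import urlparse

# Hypothetical default order and referrer mapping (illustrative, not from the talk).
DEFAULT_ORDER = ["facebook", "twitter", "pinterest", "email"]
REFERRER_NETWORKS = {
    "facebook.com": "facebook",
    "pinterest.com": "pinterest",
    "twitter.com": "twitter",
    "t.co": "twitter",
}


def order_share_buttons(referrer_url, default_order=DEFAULT_ORDER):
    """Return the share buttons with the visitor's origin network first."""
    host = (urlparse(referrer_url).hostname or "").lower()
    for domain, network in REFERRER_NETWORKS.items():
        # Match the domain itself or any subdomain (m.facebook.com, www.pinterest.com).
        if host == domain or host.endswith("." + domain):
            return [network] + [b for b in default_order if b != network]
    # Unknown or missing referrer: fall back to the default order.
    return list(default_order)
```

A visitor arriving from a Pinterest pin would see the Pinterest button first, while direct or unrecognized traffic keeps the default order.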
Content Distribution: Can Data Help ?
● People’s attention is increasingly concentrated on social streams
○ More traffic to publishers from social than any other way
● Are Distributed Platforms the new home page ?
○ Facebook Instant, Apple News, Snapchat Discover, Google Amp
○ Messenger Bots
● You need to be where your audience is:
○ Identify the content mix that is maximally engaging on an external platform
○ Can we use data to seed these distribution networks ? (Facebook HuffPost Pages, Snapchat Discover)
● HuffPost produces 1000 articles a day - which of these do we promote ?
● Article PVs follow a very skewed distribution of success
○ Only 1% of our articles > 100k PVs
● Content performs differently on different networks.
● Can we predict the articles that will get traction in advance so
■ We can optimally seed multiple distribution channels (Facebook HP Pages, Snapchat Discover)
■ Target for premium/high value ads to maximize revenue
■ Populate Recommendation Widgets
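To make the skew concrete, one way to frame the prediction target is a binary label based on a page-view threshold. The sketch below assumes the 100k PV cut-off mentioned above as the success criterion; the function names are hypothetical, and any data passed in would be your own.

```python
def label_by_pageviews(pageviews, threshold=100_000):
    """Label an article 1 if its page views exceed the threshold, else 0."""
    return [1 if pv > threshold else 0 for pv in pageviews]


def positive_rate(labels):
    """Fraction of articles labelled as successes (0.0 for an empty list)."""
    return sum(labels) / len(labels) if labels else 0.0
```

With a positive rate near 1%, a model that always predicts "no traction" is about 99% accurate, which is why plain accuracy is a poor yardstick for this problem.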
Challenges
● Histogram of traffic distribution - highly skewed.
● The very act of promoting something causes a bump in traffic.
● Data normalization - how long do we want to wait before predicting ?
● Very imbalanced data set
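One possible reading of the promotion and normalization challenges is that every article should be judged on the same early, unpromoted slice of its traffic. The sketch below assumes that reading; the `Visit` record, its `promoted` flag, and the `wait_minutes` parameter are hypothetical and not taken from the talk.

```python
from typing import NamedTuple


class Visit(NamedTuple):
    minute: float    # minutes since the article was published
    referrer: str    # referring domain, "" for direct traffic
    promoted: bool   # True if the visit came via an in-house promotion


def organic_early_visits(visits, wait_minutes):
    """Keep only unpromoted visits inside the observation window.

    Using the same window for every article puts them on an equal footing,
    and dropping promoted visits avoids learning from the promotion bump.
    """
    return [v for v in visits if v.minute <= wait_minutes and not v.promoted]
```

A shorter window gives earlier predictions from less evidence, so the window length is itself something to tune.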
Our Approach
● Random Forest classifier.
● Multiple success criteria
● Historical examples of (+) and (-) articles. Downsampling.
● Different normalization thresholds
● Feature engineering: traffic growth ratios; initial organic social traffic per minute; distinct referrers;
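The feature and downsampling steps listed above might look roughly like this. It is a sketch under stated assumptions, not the HuffPost pipeline: the `SOCIAL` domain set, the half-window definition of the growth ratio, and both function names are illustrative choices.

```python
import random

# Assumed set of social referrers (illustrative only).
SOCIAL = {"facebook.com", "twitter.com", "pinterest.com"}


def extract_features(visits, wait_minutes):
    """Build features from (minute, referrer) pairs seen in the first wait_minutes."""
    window = [(m, r) for m, r in visits if m <= wait_minutes]
    half = wait_minutes / 2
    first = sum(1 for m, _ in window if m < half)
    second = len(window) - first
    social = sum(1 for _, r in window if r in SOCIAL)
    return {
        # Traffic in the second half of the window relative to the first half.
        "growth_ratio": second / max(first, 1),
        "social_per_minute": social / wait_minutes,
        "distinct_referrers": len({r for _, r in window if r}),
    }


def downsample(rows, labels, ratio=1.0, seed=0):
    """Keep all positives and a random sample of about ratio * n_pos negatives."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), int(len(pos) * ratio)))
    keep = sorted(pos + keep_neg)
    return [rows[i] for i in keep], [labels[i] for i in keep]
```

The balanced rows could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`, with one model per success criterion or normalization threshold.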
Slackbot for the social promotion team
● 20% lift in PVs per predicted article
Conclusion
A Data Driven Newsroom today means
● More than just keeping track of clicks and shares
● Using predictive analytics to drive product and content placement
Machine Learning will be a key driver for success with the advent of distributed content
Thanks !
MachineLearning@HuffPost

Geetu Ambwani, Principal Data Scientist, Huffington Post at MLconf NYC - 4/15/16
