Kaggle
The home of data science
GE Flight Quest 2
Optimize flight routes based on weather & traffic
$250,000, 122 teams
Hewlett Foundation: Automated Essay Scoring
Develop an automated scoring algorithm for student-written essays
$100,000, 155 teams
Allstate Purchase Prediction Challenge
Predict a purchased policy based on transaction history
$50,000, 1,570 teams
Merck Molecular Activity Challenge
Help develop safe and effective medicines by predicting molecular activity
$40,000, 236 teams
Higgs Boson Machine Learning Challenge
Use the ATLAS experiment to identify the Higgs boson
$13,000, 1,302 teams
The Kaggle Approach

Training Data
Age  Income   Default
58   $95,824  True
73   $20,708  False
59   $82,152  False
66   $25,334  True

Test Data
Age  Income   Default
73   $53,445
61   $36,679
47   $90,422
44   $79,040
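The training and test tables above can be read as a small scoring loop: competitors fit a model on labeled training rows, submit predictions for test rows whose Default column is withheld, and the host scores the submission against the hidden answers. The following is a minimal sketch of that idea, not Kaggle's actual scoring code. The names `predict_majority`, `score`, and `HIDDEN_LABELS` are hypothetical, and the hidden labels are invented for illustration because the slide does not give them.

```python
# Rows copied from the slide: (age, income, default).
TRAIN = [
    (58, 95824, True),
    (73, 20708, False),
    (59, 82152, False),
    (66, 25334, True),
]

# Test rows from the slide: competitors see the features but not the Default column.
TEST = [(73, 53445), (61, 36679), (47, 90422), (44, 79040)]

# HYPOTHETICAL held-out answers, invented for this sketch; the slide does not give them.
HIDDEN_LABELS = [False, True, False, False]


def predict_majority(train, test):
    """Naive baseline: predict the most common training label for every test row."""
    positives = sum(1 for _, _, label in train if label)
    majority = positives * 2 > len(train)  # ties fall back to False
    return [majority for _ in test]


def score(predictions, truth):
    """Host-side scoring: fraction of test rows predicted correctly."""
    if len(predictions) != len(truth):
        raise ValueError("submission must contain one prediction per test row")
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


submission = predict_majority(TRAIN, TEST)
print(score(submission, HIDDEN_LABELS))  # 0.75 with the hypothetical labels above
```

A real entry would replace `predict_majority` with a trained model; the point of the sketch is only that the labels used for scoring never leave the host.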
Mapping Dark Matter: Competition Progress
[Chart: accuracy (lower is better), y-axis ticks .0150 and .0170, x-axis Week 1 to End]
Martin O’Leary, PhD student in Glaciology, Cambridge U
“In less than a week, Martin O’Leary, a PhD student in glaciology, outperformed the state-of-the-art algorithms”
“The world’s brightest physicists have been working for decades on solving one of the great unifying problems of our universe”
Mapping Dark Matter: Competition Progress
[Chart: accuracy (lower is better), y-axis ticks .0150 and .0170, x-axis Week 1 to End. Legend:]
Martin O’Leary, PhD student in Glaciology, Cambridge U
Marius Cobzarenco, grad student in computer vision, UC London
Ali Hassaïne & Eu Jin Lok, Signature Verification, Qatar U & Grad Student @ Deloitte
deepZot (David Kirkby & Daniel Margala), Particle Physicist & Cosmologist
Other
We’ve worked with many of the world’s largest companies
Industries: Healthcare & Pharma, Consumer Internet, Finance, Industrial, Consumer Marketing, Oil & Gas
Clients include: $50b+ Beverage Co., Global Bank, Top Credit Card Issuer, Top 5 E&P, Top 20 E&P
That submit over 100K machine learning models per month
[Chart: Monthly Submissions to Kaggle Competitions, y-axis 0 to 160,000, x-axis May-10 to May-15]
There’s a cookbook for winning competitions on structured data. It starts with exploring the data.
2. Create and select features
3. Parameter tuning and ensembling
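The three cookbook steps can be sketched end to end on a toy table. This is an illustration under stated assumptions, not a winning recipe: the rows are invented, the derived feature `income_per_age` is hypothetical, the "model" is a one-feature threshold rule, and all function names are made up for this example.

```python
from statistics import mean

# HYPOTHETICAL toy rows, invented for illustration: (age, income, defaulted).
TRAIN = [(25, 18000, 1), (31, 22000, 1), (45, 60000, 0),
         (52, 75000, 0), (38, 30000, 1), (60, 90000, 0)]
VALID = [(29, 20000, 1), (48, 70000, 0), (35, 52000, 0), (41, 26000, 1)]


# Step 1: explore the data, e.g. compare average income by outcome.
def mean_income_by_label(rows):
    return {label: mean(inc for _, inc, y in rows if y == label) for label in (0, 1)}


# Step 2: create features: raw income plus a derived income-per-year-of-age.
def features(age, income):
    return {"income": income, "income_per_age": income / age}


def stump_predict(row, name, threshold):
    """Predict default (1) when the chosen feature falls below the threshold."""
    age, income = row
    return 1 if features(age, income)[name] < threshold else 0


def accuracy(preds, labels):
    return mean(1.0 if p == y else 0.0 for p, y in zip(preds, labels))


# Step 3a: parameter tuning: pick the threshold that scores best on held-out rows.
def tune_threshold(name, train, valid):
    candidates = sorted(features(a, i)[name] for a, i, _ in train)
    labels = [y for _, _, y in valid]

    def valid_acc(t):
        return accuracy([stump_predict((a, i), name, t) for a, i, _ in valid], labels)

    return max(candidates, key=valid_acc)


# Step 3b: ensembling: combine the tuned rules by averaging their votes.
def ensemble_predict(row, stumps):
    votes = [stump_predict(row, name, t) for name, t in stumps]
    return 1 if mean(votes) >= 0.5 else 0


stumps = [(name, tune_threshold(name, TRAIN, VALID))
          for name in ("income", "income_per_age")]
valid_preds = [ensemble_predict((a, i), stumps) for a, i, _ in VALID]
print(mean_income_by_label(TRAIN))
print(stumps, accuracy(valid_preds, [y for _, _, y in VALID]))
```

In practice the threshold rules would be replaced by stronger learners and the tuning would use cross-validation rather than a single four-row holdout; the sketch only shows how the steps hand off to each other.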
A second cookbook is emerging on computer vision and speech problems. It involves using convolutional neural networks.
The vast majority of time is spent training algorithms when CNNs are applied.
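For readers unfamiliar with CNNs, the core operation is a small filter slid across the input. The following is a minimal pure-Python sketch of one convolution step followed by a ReLU, assuming no padding and a stride of 1. The image and the hand-picked kernel are hypothetical; in a real CNN the kernel values are learned during the training that the slide says dominates the time.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1), summing elementwise products.

    As in most deep learning libraries, the kernel is not flipped, so this is
    technically cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Zero out negative responses, keeping only positive activations."""
    return [[max(0, v) for v in row] for row in feature_map]


# HYPOTHETICAL 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked vertical-edge filter; a trained CNN would learn such weights.
kernel = [[-1, 1], [-1, 1]]

feature_map = relu(conv2d_valid(image, kernel))
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]: the response peaks at the edge
```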
There are the problems that land in the middle…
Anthony Goldbloom
a@kaggle.com
650 283 9781


Editor's Notes

  1. People currently come to Kaggle
  2. We score their solutions in real time.
  3. People don’t come to us with churn or cross-sell problems; they typically come to us with their hardest problems, and I’ll talk more about this soon. It’s for these reasons that we continue to invest in the competition platform. It’s a very efficient operation: it currently runs with a headcount of 4, and we believe 6 is the right long-term number of people to invest in competitions. We decided to focus on Oil & Gas because, after working with ~25 Fortune 500s across 12 industries, we believe it’s the biggest opportunity for machine learning and the most ripe for disruption. Specifically because:
     • Greatest value add: there is a huge gap between what they’re doing and what’s possible.
     • Shale is disruptive: the industry is looking for new ideas, making it a good environment to be selling into.
  4. We score their solutions in real time.
  5–11. Kaggle Competitions: breakeven business
     • Access to most advanced and proven techniques
     • Recruiting the very best of a scarce resource
     • C-level access from leadership positioning in media