Andrew Fogg 
Import.io 
the ultimate 
christmas jumper?
andrew fogg
co-founder of import.io
import.io allows you to turn other people’s websites 
into apis without writing any code
ever used beautiful soup or scrapy?
use import.io and never do that again
yesterday i wrote to emlyn responding to the 
request for xmas content
i set myself a challenge
i promised to stand here and tell you the ultimate 
christmas jumper to buy for christmas jumper day
wtf
i started working on the code this morning
the analysis is running as i speak
the results are not in yet
but i want to share the design with you 
of how i did this
hopefully it will inspire you 
to do some crazy data work over the holidays
the first thing to do is to get some jumpers
we have a new version of import.io 
that requires no training
we call it magic
i did some googling for “christmas jumpers”
found lots of retailers linking to “christmas jumper” 
sections on their websites
copying these urls and pasting them into magic i 
can see that they parse very well
magic is also available over an api
here is the simplest possible integration in python 
(no authentication required, no limit on requests)
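a minimal sketch of what an integration like this could look like, assuming the api takes the retailer url as a query parameter on a GET request and answers with json. the endpoint `https://api.example.com/magic`, the `url` parameter name and the `rows` key are placeholders chosen for illustration, not the real magic api.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# hypothetical endpoint, used here only as a placeholder
MAGIC_ENDPOINT = "https://api.example.com/magic"


def magic_request_url(page_url, endpoint=MAGIC_ENDPOINT):
    """build the GET url that asks the api to parse one retailer page."""
    return endpoint + "?" + urlencode({"url": page_url})


def fetch_rows(page_url, fetch=None):
    """return the parsed rows for one retailer page.

    `fetch` takes a url and returns the response body as text. it defaults
    to a plain http GET and can be swapped for a stub when testing.
    """
    if fetch is None:
        def fetch(url):
            with urlopen(url) as response:
                return response.read().decode("utf-8")
    payload = json.loads(fetch(magic_request_url(page_url)))
    # assumed response shape: {"rows": [{...}, ...]}
    return payload.get("rows", [])
```

passing `fetch` in as an argument keeps the sketch runnable offline: a stub that returns a json string stands in for the network call.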
i passed 10 retailer urls into the magic api
magic works with pagination
iterating through the pages i was able to literally get 
1,000s of christmas jumpers
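the page loop can be sketched roughly as below. how magic exposes the next page is not shown here, so `fetch_page` is a hypothetical helper that returns the rows of one page plus the url of the next page (or `None` on the last one), and `max_pages` is an assumed safety limit.

```python
def collect_jumpers(first_page_urls, fetch_page, max_pages=50):
    """follow next-page links from each retailer url and collect every row.

    `fetch_page(url)` is a hypothetical helper returning (rows, next_url),
    where next_url is None once the last page has been reached.
    """
    jumpers = []
    for start_url in first_page_urls:
        url = start_url
        seen = set()  # guards against a site whose "next" link loops back
        while url and url not in seen and len(seen) < max_pages:
            seen.add(url)
            rows, url = fetch_page(url)
            jumpers.extend(rows)
    return jumpers
```

the `seen` set and the page cap mean a misbehaving retailer page cannot keep the loop running forever.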
the next thing to do is to analyse the jumpers
lots of great image libraries and apis for doing 
computer vision things
but there is no api or image library for christmas 
jumpers (and i was not about to write one myself)
i turned instead to a human api
amazon’s mechanical turk
mechanical turk is a marketplace 
for human micro-tasking
there are workers and requesters
as a requester, you programmatically 
specify your micro-task
you create a simple templated 
user interface in HTML
upload your dataset 
(christmas jumpers in my case)
specify a price per micro-task
fill your account with money
and release your micro-tasks to the market
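the requester steps above can be sketched offline, without touching the marketplace itself. this is a stand-in under stated assumptions: the `image_url` column name, the html snippet and the `budget` helper are all made up for illustration, and marketplace fees are left out.

```python
import csv
import html
import io
from string import Template

# a simple templated user interface: one placeholder per dataset column
TASK_TEMPLATE = Template(
    '<img src="${image_url}" alt="christmas jumper">\n'
    '<label>how many reindeer? <input name="reindeer" type="number"></label>'
)


def render_tasks(dataset_csv):
    """render one html micro-task per row of the uploaded dataset."""
    reader = csv.DictReader(io.StringIO(dataset_csv))
    tasks = []
    for row in reader:
        # escape values so a stray quote in the data cannot break the markup
        safe_row = {key: html.escape(value) for key, value in row.items()}
        tasks.append(TASK_TEMPLATE.substitute(safe_row))
    return tasks


def budget(task_count, price_per_task, workers_per_task=1):
    """money the account needs before release (marketplace fees excluded)."""
    return round(task_count * price_per_task * workers_per_task, 2)
```

for example, `render_tasks` on a csv with an `image_url` header and two rows gives two html tasks, and `budget` multiplies the task count by the price and by the number of workers asked per task.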
there are 500,000 registered workers 
programmatically available 24 hours a day
and let me repeat, 
they are available over an api
as soon as you release your micro-tasks to the 
market, you start getting data back
it is most commonly used for things like detecting 
adult content in images
$9 per hour 
a worker can sit there all day and crank through 
micro-tasks and make a decent wage
the questions that i am 
asking my workers
are there snowflakes?
how many reindeer?
how many snowmen?
is father christmas present?
transcribe the message
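one way to hold the answers to these questions is a small typed record per jumper. the sketch below also assumes that each jumper is shown to several workers and that their answers are merged by majority vote; that voting rule and the field names are assumptions for illustration, not something stated in the talk.

```python
from collections import Counter
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class JumperAnswer:
    snowflakes: bool        # are there snowflakes?
    reindeer: int           # how many reindeer?
    snowmen: int            # how many snowmen?
    father_christmas: bool  # is father christmas present?
    message: str            # transcribe the message


def consensus(answers):
    """combine several workers' answers into one by majority per question.

    ties go to the value that was seen first (an assumed rule).
    """
    if not answers:
        raise ValueError("need at least one answer")
    merged = {}
    for field in fields(JumperAnswer):
        votes = Counter(getattr(answer, field.name) for answer in answers)
        merged[field.name] = votes.most_common(1)[0][0]
    return JumperAnswer(**merged)
```

voting per question rather than per whole answer means one worker's typo in the message does not throw away their reindeer count.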
the workers are on this at the moment
when they are finished
imagine a dataviz!
check the import.io blog on 12th december to find 
out what the ultimate christmas jumper looks like
we are hiring for python developers 
– interviews on friday!
learn more at http://www.import.io
