Welcome To
Director of Engineering
Search Science Recall & Spam
April 3, 2015
BRIAN JOHNSON
With more than 100 million active users globally, eBay is
the world's largest online marketplace, where practically
anyone can buy and sell practically anything. Founded in
1995, eBay connects a diverse and passionate community
of individual buyers and sellers, as well as small
businesses. Their collective impact on ecommerce is
staggering: In 2014, the total value of goods sold on eBay
was $82 billion -- more than $2,500 every second.
http://www.ebayinc.com/who
DATA DRIVEN
DECISIONS
METRICS & TESTING
Why?
www.wallpapertimes.com
$’s per year in
incremental revenue
Data Collection
• Cell Tracking
• Eye Tracking
• Much Richer and more Detailed online
• Behavioral Log Files
We Are Data
Trends
Intelligence: Human → Machine
Data: Small → Big
Sources: Few → Many
Context: Aggregate → Detailed
SEARCH QUERY INTENT
Users
+
Documents
What is Special for eBay Search?
•Commercial Intentions
– Both sellers and buyers have strong and clear intentions
– Transactions happen on eBay, hence more behavior data
•Listings (Supply Side)
– Given by sellers
– Semi-structured data
•Buyers (Demand Side)
– Relevance matters (Browsing vs. Searching)
– Price matters
– Seller trust/credit matters
– 70% of eBay revenue starts with Search
Fish Sticks
Demand Category
42% Business & Industrial > Electrical & Test Equipment > Electrical Equipment & Tools > Electrical Tools > Cable Pullers
10% Business & Industrial > Construction > Building Materials & Supplies > Electrical
9% Home & Garden > Tools > Other
7% Pet Supplies > Fish & Aquariums > Fish Pond Supplies
6% Business & Industrial > Light Equipment & Tools > Air Tools > Staplers
Fishing Sticker
Demand Category
41% eBay Motors > Parts & Accessories > Car & Truck Parts > Decals/Emblems/License Frames > Decals & Stickers > Graphics Decals
28% Home & Garden > Home Decor > Decals, Stickers & Vinyl Art
12% Sporting Goods > Fishing > Novelties & Gifts
11% Sporting Goods > Fishing > Fishing Equipment > Decals, Stickers & Patches
3% Collectibles > Transportation > Automobilia > Decals & Stickers
Fish Stickers
Demand Category
53% Home & Garden > Home Decor > Decals, Stickers & Vinyl Art
24% Crafts > Scrapbooking & Paper Crafts > Embellishments > Stickers
16% eBay Motors > Parts & Accessories > Car & Truck Parts > Decals/Emblems/License Frames > Decals & Stickers > Graphics Decals
3% Sporting Goods > Fishing > Novelties & Gifts
1% Sporting Goods > Fishing > Fishing Equipment > Decals, Stickers & Patches
Keyword Expansion Test
Key | Expansion | Top Query
fishing sticker | fish stickers | fishing sticker
diaries | diary | vampire diaries
baggies | baggie | patagonia baggies
baggies | baggy | patagonia baggies
cranberries | cranberry | the cranberries
jogging | jog | jogging stroller
Context Matters
cowboys hats ≠ cowboy hats
Plastic toy cowboys = plastic toy cowboy
Context & Specificity
Context
ATC → Armored Troop Carrier in Toys and Hobbies
ATC → Artist Trading Card in Art
ATC → Automatic Tool Change in Business and Industrial
Specificity/Directionality
Old → Antique
Yoga towels/mats → Yogitoes
Category Expansion
Stamps > Commonwealth/ British Colonial > Bermuda
Before After
Compound/Decompound Expansions
acidwash → acid wash
Before After
German Compounds
•Syntactically, words can be combined and split in many ways
•Multiple candidates
Granitpflastersteine (granite paving stones)
Granit(granite) pflastersteine(cobblestones)
Granit(granite) pflaster(paving/band-aid) steine(stones)
•Binding characters
Hochzeitsschuhe (grammatically correct, 593 hits on ebay.de)
Hochzeitschuhe (129 hits on ebay.de)
•Some words shouldn’t be de-compounded.
beiden (both) – bei(at) den(the)
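The decompounding points above can be illustrated with a small sketch. It is not eBay's implementation: the vocabulary, the stop list, and the single binding character "s" are toy assumptions chosen to reproduce the slide's examples, and a production system would mine them from listings and query logs and then rank the candidates.

```python
# Toy data, assumed for illustration only.
VOCAB = {"granit", "pflaster", "steine", "pflastersteine",
         "hochzeit", "schuhe", "bei", "den"}
DO_NOT_SPLIT = {"beiden"}   # words that must never be de-compounded
BINDERS = ("", "s")         # optional binding character between parts


def decompound(word, vocab=VOCAB, stop=DO_NOT_SPLIT, binders=BINDERS):
    """Return every way to split `word` into vocabulary words."""
    word = word.lower()
    if word in stop:
        return [[word]]

    def split(rest):
        if not rest:
            return [[]]
        found = []
        for end in range(1, len(rest) + 1):
            head, tail = rest[:end], rest[end:]
            if head not in vocab:
                continue
            for binder in binders:
                # A binder only counts when another part follows it.
                if binder and not (tail.startswith(binder)
                                   and len(tail) > len(binder)):
                    continue
                for parts in split(tail[len(binder):]):
                    found.append([head] + parts)
        return found

    return split(word)


candidates = decompound("Granitpflastersteine")
with_binder = decompound("Hochzeitsschuhe")
without_binder = decompound("Hochzeitschuhe")
protected = decompound("beiden")
```

Both spellings of the wedding-shoes compound resolve to the same parts, the granite example yields two candidate splits, and the stop list keeps "beiden" whole even though "bei" and "den" are both in the vocabulary.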
Intent Preserving Query Relaxation
Hadoop Graph/Session Analysis
Bipartite Graphs
Keyword | Keyword: Synonym Expansion
Keyword | Attribute: Aspect Expansions
Keyword | Category: Category Expansions, Related Search Diversity
Query | Query: Related Search
Query | Item: Related Search
Query Session Analysis
Successive Queries: Synonyms, Related Search
Query Substitutions: Synonyms
Same Session Correlation: Related Search
Query Metrics
Click Through Rate
Purchase Attribution
Time to First Click, View Item, Purchase
Query Pair Price & Category Divergence
Query Pair Result Set Overlap
Result Count
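Two of the query-pair metrics listed above can be sketched in a few lines. The slides name the metrics but do not define them, so the formulas here are assumptions: Jaccard similarity for result set overlap and Jensen-Shannon divergence (base 2) for category divergence. The category shares come from the "Fishing Sticker" and "Fish Stickers" slides, with category paths shortened to hypothetical labels.

```python
import math


def result_set_overlap(items_a, items_b):
    """Jaccard similarity of two result sets (1.0 = identical sets)."""
    a, b = set(items_a), set(items_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def category_divergence(dist_a, dist_b):
    """Jensen-Shannon divergence, base 2: 0 = same mix, 1 = disjoint."""
    def normalize(dist):
        total = sum(dist.values())
        return {k: v / total for k, v in dist.items()}

    p, q = normalize(dist_a), normalize(dist_b)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in set(p) | set(q)}

    def kl(x):
        return sum(x[k] * math.log2(x[k] / mix[k]) for k in x if x[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Demand shares in percent; they are renormalized because the slides
# list only the top five categories per query.
fishing_sticker = {"Motors decals": 41, "Home decor decals": 28,
                   "Fishing novelties": 12, "Fishing equipment decals": 11,
                   "Automobilia decals": 3}
fish_stickers = {"Home decor decals": 53, "Scrapbooking stickers": 24,
                 "Motors decals": 16, "Fishing novelties": 3,
                 "Fishing equipment decals": 1}

divergence = category_divergence(fishing_sticker, fish_stickers)
overlap = result_set_overlap([101, 102, 103], [102, 103, 104])
```

A high divergence or a low overlap for a candidate rewrite pair suggests the expansion changes what the buyer is shopping for, which is the concern behind the keyword expansion test.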
Why we’re excited about data mining…
•We’re at an inflection point – customers are defining how they shop
– We are a data company
– 40+ PB of data (listings, pictures, queries, clicks, sales, feedback, …)
– Many tests running orthogonally (in parallel with overlapping user slices)
– Nearly all users in one or more tests
– Many users per test, often millions
•Finding patterns and insights drives our customer experience
•We’ve built successful teams of data scientists
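One common way to run many tests in parallel over overlapping user slices is to hash each user together with the test name, so that assignment in one test is independent of assignment in another. The sketch below assumes that approach; the slides do not say how eBay assigns users, and the test names and the 50% split are hypothetical.

```python
import hashlib


def bucket(user_id, test_name, buckets=100):
    """Deterministically map a (user, test) pair to a bucket number."""
    key = f"{test_name}:{user_id}".encode("utf-8")
    return int(hashlib.sha256(key).hexdigest(), 16) % buckets


def variant(user_id, test_name, treatment_pct=50):
    """Assign a user to 'treatment' or 'control' for one test."""
    if bucket(user_id, test_name) < treatment_pct:
        return "treatment"
    return "control"


# Every user takes part in both hypothetical tests at once.
users = range(10000)
in_ranking = [u for u in users if variant(u, "ranking_test") == "treatment"]
in_both = [u for u in in_ranking if variant(u, "spelling_test") == "treatment"]
```

Because the test name is part of the hash key, roughly half of the users treated in one test are also treated in the other, which keeps the effect of each test measurable on its own.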
THANKS!
QUESTIONS?
WE’RE HIRING
BJOHNSON@EBAY.COM
METRICS
•What should we optimize?
–Page Views
–Time on Site
–Click Through Rate
–Normalized Discounted Cumulative Gain
–Purchases per User per Session/Day/Week
–Revenue per User per Session/Day/Week
–Net Promoter Score
•How likely would you be to recommend …?
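Of the metrics listed, Normalized Discounted Cumulative Gain is the least self-explanatory, so a minimal sketch may help. It assumes the linear-gain form of DCG with a log2 rank discount; other variants (for example an exponential gain) exist, and the slides do not say which one is used.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains, start=1))


def ndcg(gains, k=None):
    """DCG of the ranking as shown, divided by the DCG of the ideal order."""
    gains = list(gains)
    ideal = sorted(gains, reverse=True)
    if k is not None:
        gains, ideal = gains[:k], ideal[:k]
    best = dcg(ideal)
    return dcg(gains) / best if best > 0 else 0.0


# Relevance grades for the results in the order they were displayed.
perfect_order = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
```

A ranking that already lists the most relevant items first scores 1.0; pushing relevant items down the page lowers the score.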
REVENUE
Every business focuses on and measures revenue.
Every business focuses on profit and loss.
And they should.
Experimental Variation By Day
Experimental Variation By Metric
Query Rewrites at eBay
Query Rewrite
Search
User Query
eBay Results
Search Query
User Query: pilzlampe {mushroom lamp}
Search Query: OR(pilzlampe, PHRASE(OR(pilz,pilze),OR(lampe,lampen)))
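The rewrite of "pilzlampe" above combines decompounding with plural variants. The following sketch shows how such an expression could be assembled from two lookup tables. The tables, function names, and string output format are assumptions made for illustration; only the final expression is taken from the slide.

```python
# Hypothetical lookup tables; in practice these would be mined offline.
DECOMPOUND = {"pilzlampe": ["pilz", "lampe"]}
VARIANTS = {"pilz": ["pilze"], "lampe": ["lampen"], "ipod": ["ipods"]}


def any_of(terms):
    """Wrap several alternatives in OR(...); leave a single term bare."""
    return terms[0] if len(terms) == 1 else "OR(" + ",".join(terms) + ")"


def rewrite_term(term):
    """Expand one user term into a search expression."""
    parts = DECOMPOUND.get(term)
    if not parts:
        return any_of([term] + VARIANTS.get(term, []))
    # Each part of the compound gets its own variants, and the parts
    # must appear together as a phrase.
    expanded = [any_of([p] + VARIANTS.get(p, [])) for p in parts]
    return "OR(" + term + ", PHRASE(" + ",".join(expanded) + "))"


search_query = rewrite_term("pilzlampe")
```

The original term stays in the outer OR, so listings that use the compound spelling still match alongside those that write the parts separately.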
Example Query Services/Rewrites
• Stemming (ipod OR ipods)
• Spelling (cannon OR canon)
• Condition (new OR condition=new)
• Synonyms (boat carpet OR marine carpet)
• Space Synonyms (MarioKart OR Mario-Kart)
• Item Specifics (blue OR color=blue)
• Acronyms (hp OR hewlett-packard OR horsepower)
• Category (shoes OR Category=Shoes)
• Cross Border ((site=0 AND category=123) OR (site=3 AND category=456))
• Fitment (fits model=corolla)
• Term Removal (Harry Potter and the Order of the Phoenix (daily deal))
Acronym/Abbreviation Mining
•Acronyms/Abbreviations mined from raw text and query logs
•Look for patterns of text:
long form (short form)
short form (long form)
• Employ intelligent matching algorithms to mine candidates
• Schwartz et al: Greedy Match Algorithm
new cheap Playstation portable (PSP)
PlayStation 3 (PS3)
• Acronym discovered
PSP => PlayStation Portable
PS3 => PlayStation 3
• Candidates mined are fed to an ML classifier to remove false positives
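The mining step for the "long form (short form)" pattern can be sketched as follows. The matching routine is modeled on the right-to-left character matching that Schwartz and Hearst describe, written here as an illustration and not as eBay's code. The function names and the short-form filter are assumptions, and the reverse pattern "short form (long form)" and the ML classifier are left out.

```python
import re


def find_best_long_form(short, text):
    """Match the short form's characters right to left inside `text`."""
    l_index = len(text) - 1
    for s_index in range(len(short) - 1, -1, -1):
        ch = short[s_index].lower()
        if not ch.isalnum():
            continue
        # Walk left until the character matches; the first character of
        # the short form must also sit at the start of a word.
        while ((l_index >= 0 and text[l_index].lower() != ch) or
               (s_index == 0 and l_index > 0
                and text[l_index - 1].isalnum())):
            l_index -= 1
        if l_index < 0:
            return None
        l_index -= 1
    start = text.rfind(" ", 0, l_index + 1) + 1
    return text[start:]


def mine_acronyms(text):
    """Collect {short form: long form} candidates from one piece of text."""
    found = {}
    for match in re.finditer(r"\(([^()]+)\)", text):
        short = match.group(1).strip()
        if not (2 <= len(short) <= 10 and short[0].isalnum()
                and any(c.isalpha() for c in short)):
            continue
        # Only look at a few words before the parenthesis.
        window = min(len(short) + 5, len(short) * 2)
        words = text[:match.start()].split()
        long_form = find_best_long_form(short, " ".join(words[-window:]))
        if long_form:
            found[short] = long_form
    return found


psp = mine_acronyms("new cheap Playstation portable (PSP)")
ps3 = mine_acronyms("PlayStation 3 (PS3)")
```

Note that the leading words "new cheap" are dropped because matching stops at the word that supplies the first character of the short form. Candidates such as these would still need the classifier mentioned above to filter false positives.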

Folding Cheat Sheet #4 - fourth in a series
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
How To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTROHow To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTRO
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 

eBay Search Query Intent

  • 1. Welcome To
      Brian Johnson, Director of Engineering, Search Science Recall & Spam
      April 3, 2015
      With more than 100 million active users globally, eBay is the world's largest online marketplace, where practically anyone can buy and sell practically anything. Founded in 1995, eBay connects a diverse and passionate community of individual buyers and sellers, as well as small businesses. Their collective impact on ecommerce is staggering: In 2014, the total value of goods sold on eBay was $82 billion -- more than $2,500 every second.
  • 9. Data Collection
      • Cell Tracking
      • Eye Tracking
      • Much richer and more detailed online
      • Behavioral Log Files
  • 11. Trends
      Intelligence: Human → Machine
      Data: Small → Big
      Sources: Few → Many
      Context: Aggregate → Detailed
  • 13. What is Special for eBay Search?
      • Commercial Intentions
        – Both sellers and buyers have strong and clear intentions
        – Transactions happen on eBay, hence more behavior data
      • Listings (Supply Side)
        – Given by sellers
        – Semi-structured data
      • Buyers (Demand Side)
        – Relevance matters (Browsing vs. Searching)
        – Price matters
        – Seller trust/credit matters
        – 70% of eBay revenue starts with Search
  • 14. Fish Sticks (Demand | Category)
      42% | Business & Industrial > Electrical & Test Equipment > Electrical Equipment & Tools > Electrical Tools > Cable Pullers
      10% | Business & Industrial > Construction > Building Materials & Supplies > Electrical
      9% | Home & Garden > Tools > Other
      7% | Pet Supplies > Fish & Aquariums > Fish Pond Supplies
      6% | Business & Industrial > Light Equipment & Tools > Air Tools > Staplers
  • 15. Fishing Sticker (Demand | Category)
      41% | eBay Motors > Parts & Accessories > Car & Truck Parts > Decals/Emblems/License Frames > Decals & Stickers > Graphics Decals
      28% | Home & Garden > Home Decor > Decals, Stickers & Vinyl Art
      12% | Sporting Goods > Fishing > Novelties & Gifts
      11% | Sporting Goods > Fishing > Fishing Equipment > Decals, Stickers & Patches
      3% | Collectibles > Transportation > Automobilia > Decals & Stickers
  • 16. Fish Stickers (Demand | Category)
      53% | Home & Garden > Home Decor > Decals, Stickers & Vinyl Art
      24% | Crafts > Scrapbooking & Paper Crafts > Embellishments > Stickers
      16% | eBay Motors > Parts & Accessories > Car & Truck Parts > Decals/Emblems/License Frames > Decals & Stickers > Graphics Decals
      3% | Sporting Goods > Fishing > Novelties & Gifts
      1% | Sporting Goods > Fishing > Fishing Equipment > Decals, Stickers & Patches
  • 17. Keyword Expansion Test
      Key | Expansion | Top Query
      fishing sticker | fish stickers | fishing sticker
      diaries | diary | vampire diaries
      baggies | baggie | patagonia baggies
      baggies | baggy | patagonia baggies
      cranberries | cranberry | the cranberries
      jogging | jog | jogging stroller
  • 18. Context Matters
      cowboys hats ≠ cowboy hats
      Plastic toy cowboys = plastic toy cowboy
  • 19. Context & Specificity
      Context
        ATC: Armored Troop Carrier in Toys and Hobbies
        ATC: Artist trading card in Art
        ATC: Automatic Tool Change in Business and Industrial
      Specificity/Directionality
        Old and Antique
        Yoga towels/mats and Yogitoes
  • 20. Category Expansion: Stamps > Commonwealth/British Colonial > Bermuda (Before / After)
  • 22. German Compounds
      • Syntactically, words can be combined and split in many ways
      • Multiple candidates
        Granitpflastersteine (granite paving stones)
        Granit (granite) + pflastersteine (cobblestones)
        Granit (granite) + pflaster (paving/band-aid) + steine (stones)
      • Binding characters
        Hochzeitsschuhe (grammatically correct, 593 hits on ebay.de)
        Hochzeitschuhe (129 hits on ebay.de)
      • Some words shouldn’t be de-compounded
        beiden (both), not bei (at) + den (the)
  • 24. Hadoop Graph/Session Analysis
      Bipartite Graphs
        Keyword | Keyword: Synonym Expansion
        Keyword | Attribute: Aspect Expansions
        Keyword | Category: Category Expansions, Related Search Diversity
        Query | Query: Related Search
        Query | Item: Related Search
      Query Session Analysis
        Successive Queries: Synonyms, Related Search
        Query Substitutions: Synonyms
        Same Session Correlation: Related Search
      Query Metrics
        Click Through Rate
        Purchase Attribution
        Time to First Click, View Item, Purchase
        Query Pair Price & Category Divergence
        Query Pair Result Set Overlap
        Result Count
  • 25. Why we’re excited about data mining…
      • We’re at an inflection point: customers are defining how they shop
        – We are a data company
        – 40+ PB of data (listings, pictures, queries, clicks, sales, feedback, …)
        – Many tests running orthogonally (in parallel with overlapping user slices)
        – Nearly all users are in one or more tests
        – Many users per test, often millions
      • Finding patterns and insights drives our customer experience
      • We’ve built successful teams of data scientists
  • 27. METRICS
      • What should we optimize?
        – Page Views
        – Time on Site
        – Click Through Rate
        – Normalized Discounted Cumulative Gain
        – Purchases per User per Session/Day/Week
        – Revenue per User per Session/Day/Week
        – Net Promoter Score
          • How likely would you be to recommend …?
  • 28. REVENUE Every business focuses on and measures revenue. Every business focuses on profit and loss. And they should.
  • 33. Query Rewrites at eBay
      Diagram labels: User Query, Query Rewrite, Search Query, Search, eBay Results
      User Query: pilzlampe {mushroom lamp}
      Search Query: OR(pilzlampe, PHRASE(OR(pilz,pilze),OR(lampe,lampen)))
  • 34. Example Query Services/Rewrites
      • Stemming (ipod OR ipods)
      • Spelling (cannon OR canon)
      • Condition (new OR condition=new)
      • Synonyms (boat carpet OR marine carpet)
      • Space Synonyms (MarioKart OR Mario-Kart)
      • Item Specifics (blue OR color=blue)
      • Acronyms (hp OR hewlett-packard OR horsepower)
      • Category (shoes OR Category=Shoes)
      • Cross Border ((site=0 AND category=123) OR (site=3 AND category=456))
      • Fitment (fits model=corolla)
      • Term Removal (Harry Potter and the Order of the Phoenix (daily deal))
  • 35. Acronym/Abbreviation Mining
      • Acronyms/abbreviations are mined from raw text and query logs
      • Look for patterns of text:
        long form (short form)
        short form (long form)
      • Employ intelligent matching algorithms to mine candidates
      • Schwartz et al: Greedy Match Algorithm
        new cheap Playstation portable (PSP)
        PlayStation 3 (PS3)
      • Acronyms discovered
        PSP => PlayStation Portable
        PS3 => PlayStation 3
      • Candidates mined are fed to an ML classifier to remove false positives
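
The greedy match on slide 35 can be sketched roughly as follows. This is a minimal Python illustration in the spirit of the Schwartz et al. right-to-left character matching, not eBay's implementation: the function names, the regular expression, the one-token rule for deciding which side is the short form, and the length limits are all assumptions made for the example. The ML classifier that filters false positives is not shown.

```python
import re
from typing import Dict, Optional

# Assumed pattern: some text followed by a parenthesized expression.
PAREN = re.compile(r"([^()]+?)\s*\(([^()]+)\)")


def find_best_long_form(short_form: str, candidate: str) -> Optional[str]:
    """Greedy right-to-left match of short-form characters in the candidate text."""
    s_index = len(short_form) - 1
    l_index = len(candidate) - 1
    while s_index >= 0:
        ch = short_form[s_index].lower()
        if not ch.isalnum():          # ignore punctuation inside the short form
            s_index -= 1
            continue
        # Walk left until the character matches; the first short-form
        # character must additionally sit at the start of a word.
        while l_index >= 0 and (
            candidate[l_index].lower() != ch
            or (s_index == 0 and l_index > 0 and candidate[l_index - 1].isalnum())
        ):
            l_index -= 1
        if l_index < 0:
            return None               # ran out of text: no valid long form
        l_index -= 1
        s_index -= 1
    start = candidate.rfind(" ", 0, l_index + 1) + 1
    return candidate[start:]


def is_valid_short(short_form: str) -> bool:
    """Assumed sanity limits for a short form (2 to 10 characters, one letter)."""
    return (
        2 <= len(short_form) <= 10
        and short_form[0].isalnum()
        and any(c.isalpha() for c in short_form)
    )


def mine_acronyms(text: str) -> Dict[str, str]:
    """Collect short form => long form candidates from both text patterns."""
    pairs: Dict[str, str] = {}
    for match in PAREN.finditer(text):
        before, inside = match.group(1).strip(), match.group(2).strip()
        if not before:
            continue
        if len(inside.split()) == 1:
            # Pattern "long form (short form)": search a bounded window of words.
            short = inside
            max_words = min(len(short) + 5, len(short) * 2)
            candidate = " ".join(before.split()[-max_words:])
        else:
            # Pattern "short form (long form)": short form is the preceding token.
            short = before.split()[-1]
            candidate = inside
        if not is_valid_short(short):
            continue
        long_form = find_best_long_form(short, candidate)
        if long_form:
            pairs[short] = long_form
    return pairs


if __name__ == "__main__":
    print(mine_acronyms("new cheap Playstation portable (PSP)"))
    print(mine_acronyms("PlayStation 3 (PS3)"))
```

On the slide's two examples the sketch drops the leading "new cheap" and keeps "Playstation portable" for PSP, and returns "PlayStation 3" for PS3. A pair such as "shipping (free)" yields no candidate because the characters of "free" cannot be matched in "shipping".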

Editor's Notes

  1. Why What How
  2. http://www.ebayinc.com/who
  3. You are in business to make money. How do you know if the changes you make, make money? You HAVE to test. You can’t manage what you don’t measure. Testing is crucial. Image: http://www.wallpapertimes.com/files/q/Yf/4j/qYf4jp9q86379020_800x600.jpg
  4. http://ww1.prweb.com/prfiles/2010/10/28/4177424/Skymobileeyesh.jpg https://pandodaily.files.wordpress.com/2014/02/store-tracking.jpg?w=900&h=675
  5. 1 week > 6 months, 50 GB > 100 TB, related search collaborative > collaborative + success + NLP + overlap/partition + …
  6. Documents are not enough anymore. We need behavioral data. Yandex is beating Google in Russia. Why? They have the users (e.g. refrigerators in Moscow vs. an isolated small town).
  7. (It was very hard to find a good example of this that brought in obviously wrong data above the fold: these issues are generally more subtle, showing up in deterministic sorts and in slower processing time. If you come up with another good example to include, that would be great.) There are many entity names, including many brands, which are identical to (Cowboys) or share components with (e.g. Red Bull) common terms that describe our inventory. By identifying entities and by using whole query context, we can provide expansions only when appropriate (e.g. no Redder Bull or Crimson Bull). We can also decide the confidence of an expansion compared to the original (e.g. as is usually done in spell check). For the cowboy(s) hats, Cowboys seems to mainly refer to the football team; there are a few cowboy hats where someone used “cowboys” instead of the possessive, but not many. For the toys, the plural form is definitely more common but the singular is also used in titles even in sets (bottom row of pictures has the singular; top row the plural); so, we want to use both forms to get the maximum inventory for this.
  8. Detail matters. Context is important.
  9. “beef labeling regulation & delegation of supervision law” - long word
  10. How do we do this? Simple counting. That’s it, you “just” have to count. Image: http://www.csie.ntnu.edu.tw/~u91029/Matching.html
  11. http://www.inc.com/jeff-haden/4-business-metrics-you-cant-afford-to-ignore.html
  12. We did some great query rewrite work for UK/DE in 2011. Germans love compound words and the changes we made in 2011 to support them really paid off. We’ve continued to ramp up our machine learning, data mining, and natural language processing efforts to make sure that eBay search delivers again in 2012.
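
As a rough illustration of the compound-word rewrites mentioned in the last note, the sketch below splits a German compound against a tiny lexicon and emits a search expression in the OR/PHRASE notation shown on slide 33. Everything except that notation and the example words is an assumption: the lexicon, the variant lists, the do-not-split list, the single binding character "s", and the function names are hypothetical and only stand in for whatever resources the real system uses.

```python
from typing import Dict, List, Optional

# Hypothetical mini-lexicon: known word -> surface variants to OR together.
LEXICON: Dict[str, List[str]] = {
    "pilz": ["pilz", "pilze"],
    "lampe": ["lampe", "lampen"],
    "hochzeit": ["hochzeit"],
    "schuhe": ["schuhe"],
    "bei": ["bei"],
    "den": ["den"],
}
# Words that look like compounds but must stay whole (slide 22: beiden).
DO_NOT_SPLIT = {"beiden"}
# Optional binding characters between parts (slide 22: Hochzeit-s-schuhe).
BINDING_CHARS = ("", "s")


def _split(rest: str) -> Optional[List[str]]:
    """Recursively cover `rest` with lexicon words, longest prefix first."""
    if rest == "":
        return []
    for end in range(len(rest), 0, -1):
        head = rest[:end]
        if head not in LEXICON:
            continue
        tail = rest[end:]
        for binder in BINDING_CHARS:
            if not tail.startswith(binder):
                continue
            remaining = tail[len(binder):]
            if binder and not remaining:
                continue  # a binding character must be followed by a word
            parts = _split(remaining)
            if parts is not None:
                return [head] + parts
    return None


def split_compound(word: str) -> Optional[List[str]]:
    """Return two or more lexicon parts for `word`, or None if it should not be split."""
    word = word.lower()
    if word in DO_NOT_SPLIT:
        return None
    parts = _split(word)
    return parts if parts and len(parts) > 1 else None


def _alternatives(part: str) -> str:
    variants = LEXICON[part]
    return variants[0] if len(variants) == 1 else "OR(" + ",".join(variants) + ")"


def rewrite_query(word: str) -> str:
    """Expand a compound into OR(original, PHRASE(part alternatives...))."""
    parts = split_compound(word)
    if parts is None:
        return word
    phrase = ",".join(_alternatives(p) for p in parts)
    return f"OR({word}, PHRASE({phrase}))"


if __name__ == "__main__":
    for query in ("pilzlampe", "hochzeitsschuhe", "hochzeitschuhe", "beiden"):
        print(query, "->", rewrite_query(query))
```

With this toy lexicon, pilzlampe reproduces the search query from slide 33, both spellings of the wedding-shoes compound split into the same two parts, and beiden is left untouched. Returning only the first split found is a simplification; slide 22 points out that real compounds often have multiple candidate splits that would need to be ranked.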