Taxonomy at AOL: Classifying the parts of a whole. Noel Agnew (@noelagnewny), Ashley Marty (@ashleykmarty). June 09, 2011
The problem: Aol did not have a common vocabulary
56+ Media brands, including: DAM New York 2011 Page 3
Multiple ad systems and content platforms. Content platforms: Blogsmith, Huffington Post (Movable Type), 5min, Truveo, StudioNow. Some ad systems: AdTech, Advertising.com, Feedpoint/Dynamic Banners.
All speaking different languages… Tag.aol.com: “beyonce”; Tag…: “beyonceknowles”; AOL Music: “beyonce”; AOL Music: “beyonceknowles”; Moviefone: “beyonceknowles”; Huffington Post: “beyonce”; H… Post: “beyonceknowles”.
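The mismatch above can be illustrated with a minimal sketch of tag resolution, in which every spelling of a tag is reduced to a normalized form and looked up in an alias table. This is not the system described in the talk: the alias table, the ID `entity:beyonce`, and the function names are all hypothetical.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical alias table: normalized surface form -> canonical entity ID.
# The ID "entity:beyonce" is invented for illustration.
ALIASES = {
    "beyonce": "entity:beyonce",
    "beyonceknowles": "entity:beyonce",
}


def normalize(tag: str) -> str:
    """Decompose accented characters, lowercase, and keep only a-z and 0-9."""
    decomposed = unicodedata.normalize("NFKD", tag)
    return re.sub(r"[^a-z0-9]", "", decomposed.lower())


def resolve(tag: str) -> Optional[str]:
    """Return the canonical entity ID for a tag, or None if the tag is unknown."""
    return ALIASES.get(normalize(tag))
```

Under these assumptions, `resolve("beyonce")`, `resolve("Beyonce Knowles")` and `resolve("Beyoncé")` all return the same ID, so content tagged on different platforms can be grouped under one entity.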
What we were asked to do: Effectively and granularly classify content: for improved ad sales; to relate content within and between the brands; in some cases, to assist editors with external-facing tags; all sorts of other bits of magic (which will be touched on later).
The solution: Classify all AOL content in the same way
Faceted Ontology: “…structural frameworks for organizing information on the semantic Web and within semantic enterprises. They provide unique benefits in discovery, flexible access, and information integration due to their inherent connectedness; that is, their ability to represent conceptual relationships.” -M.K. Bergman, “An Executive Intro to Ontologies” http://www.mkbergman.com/900/an-executive-intro-to-ontologies/
Subjects: We have approx. 6800 subjects. Generally hierarchical, but with some associative relationships. Iterative process with editors (subject specialists). 12 top levels (or classes): Arts and Humanities, Education, Entertainment, Health and Medicine, Lifestyle, Money and Finance, News and Politics, Science and Tech, Social Sciences, Sports, Transportation, Travel and Tourism.
Entities: Proper nouns (specific persons, places, things). Not hierarchical, but rather associative relationships. 7 entity vocabularies: Named Things (includes persons), Locations, Works, Events, Groups, Brands, Products.
Taxonomy/ontology mashup (diagram nodes: Sprint, HTC Evo 4G, OSX, iPhone, Verizon, Apple, AT&T)
Making it work
HELLO TEL AVIV! When we were tasked with this, we had very little direct communication with the team in Tel Aviv that runs the classification engine… We were also under the impression that auto-classification was their issue and they’d just have to classify with whatever we gave them. This was WRONG!
Train in vain? ‘Women's Shoes’: We had to find training data for each subject in the taxonomy… and are continually doing so to improve classification.
Getting to Know You. More contact with the classification team: providing feedback on tagging results; collaborating on priorities; what data is most valuable to the tagger?
Turning large amounts of data into an ontology. More data sources means multiple records for the same entity. More sources = more effort required in merging records. Example record: Name: Beyoncé; MusicPerson; MoviePerson; Alias (synonym): Beyonce Knowles; Alias (synonym): Beyonce; Source: Wikipedia; Source: AolMusicDB; Source: AolMovieDB. After merge, one record remains with metadata and relationships from all sources. More sources = more valuable records.
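The merge step can be sketched roughly as follows. This is a simplified illustration, not the actual pipeline: the `EntityRecord` class and `merge` function are hypothetical, the assignment of each value to a particular source is an assumption, and deciding which records refer to the same entity (the hard part) is taken as already done.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class EntityRecord:
    """Hypothetical shape for one source's record of an entity."""
    name: str
    node_types: Set[str] = field(default_factory=set)
    aliases: Set[str] = field(default_factory=set)
    sources: Set[str] = field(default_factory=set)


def merge(records: List[EntityRecord]) -> EntityRecord:
    """Fold records already known to describe the same entity into one.

    The first record's name becomes the preferred name; any other
    name is kept as an alias so that no spelling is lost.
    """
    merged = EntityRecord(name=records[0].name)
    for rec in records:
        merged.node_types |= rec.node_types
        merged.aliases |= rec.aliases
        merged.sources |= rec.sources
        if rec.name != merged.name:
            merged.aliases.add(rec.name)
    return merged


# Values taken from the slide; which source supplied which value is assumed.
records = [
    EntityRecord("Beyoncé", aliases={"Beyonce Knowles"}, sources={"Wikipedia"}),
    EntityRecord("Beyonce", node_types={"MusicPerson"}, sources={"AolMusicDB"}),
    EntityRecord("Beyonce Knowles", node_types={"MoviePerson"}, sources={"AolMovieDB"}),
]
merged = merge(records)
```

Using sets for node types, aliases and sources means a value contributed by several sources appears only once in the merged record.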
Where we are now
Integrating with Advertising systems: Our subjects can be mapped to Advertising categories to serve ads for related products. Current Department Store campaign:
Recommending Tags for Editorial
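One way to picture the subject-to-ad-category mapping is a lookup that walks up the subject hierarchy until it reaches a subject with a mapped category. The sketch below is only an assumption about how such a mapping could work: the intermediate subjects, the ad category name, and the fallback-to-parent behaviour are all hypothetical and do not come from the talk.

```python
from typing import Dict, Optional

# Hypothetical slice of the subject hierarchy: child -> parent.
# None marks a top-level class; "Shoes" and "Fashion" are invented levels.
PARENT: Dict[str, Optional[str]] = {
    "Women's Shoes": "Shoes",
    "Shoes": "Fashion",
    "Fashion": "Lifestyle",
    "Lifestyle": None,
}

# Hypothetical mapping from subjects to advertising categories.
AD_CATEGORY: Dict[str, str] = {
    "Fashion": "Apparel and Accessories",
}


def ad_category_for(subject: str) -> Optional[str]:
    """Return the ad category of the nearest mapped ancestor, or None."""
    current: Optional[str] = subject
    while current is not None:
        if current in AD_CATEGORY:
            return AD_CATEGORY[current]
        current = PARENT.get(current)
    return None
```

With this arrangement only a few subjects need an explicit ad category; an article tagged ‘Women's Shoes’ would inherit the category mapped to its nearest ancestor.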
Where we’re going
On the Roadmap… More projects with Advertising teams; more data in our ontology to make classification better; refining the ontology, because it’s a living thing.
Lessons learned
Life lessons… Keep your eye on the prize. Expect people to think this is a much smaller task than it is. Don’t reinvent the wheel. Never underestimate the power of the ability to manipulate data.


Editor's Notes

  1. How many of you knew that all of these are owned by Aol? How many of these were purchased since we started the taxonomy process?
  2. Photo platform (mention it). At a minimum, 3 ad systems that we’ve had to deal with.
  3. URL to link out here.
  4. (1) Ad sales: so products with some relation to the article can be served. (2) Relating content: within, e.g. someone on Aol Music can see all Beyonce articles; between, e.g. see Beyonce articles on Moviefone, Stylelist, Popeater, which keeps people on Aol sites instead of linking out. (3) Assist editors: standardize tags so content is not lost for lack of relationships; it can’t be found if it is not tagged properly.
  5. Difference between taxonomy and ontology.
  6. Be flexible and remember your purpose (for us it’s Aol content). Subjects may be called topics or categories in other places. Subjects describe the ‘aboutness’ of an article, e.g. a report on the World Series is about ‘Baseball’; an article about the best airlines is about ‘Air Travel’.
  7. We have around 3.8 million and counting. Together, subjects and entities make up the taxonomy.
  8. More contact with the classification team: providing feedback on tagging results, collaborating on priorities, focusing on what is most valuable to the tagger.
  9. Mix of NLP and machine learning. Picks up important related terms that imply content is about a subject (heels, flats, etc.), brands, etc. Mention that extracted entities can now actually improve subject tagging. DMOZ: voluntary human-edited directory of the web; lists of websites by subject.
  10. One record will have multiple node types; aliases and metadata will be brought together: albums, date of birth, married to, spokesperson for brand. Very rich records result: an opportunity to create multiple relationships.
  11. Subjects and entities. We met with teams; one thing they liked was that they could tag a ‘master version’ with a taxonomy ID. This brings all articles mentioning ‘Charlie Sheen’ together, just like the Beyonce example, instead of splitting them across versions like charliesheen, charlie sheen, charlie+sheen.
  12. Need title