SlideShare a Scribd company logo
1 of 33
Wikitology Wikipedia as an Ontology Zareen Syed, Tim Finin and Anupam Joshi zarsyed1@umbc.edu, finin@cs.umbc.edu, joshi@cs.umbc.edu University of Maryland Baltimore County
Outline ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],   intro     wikipedia    experiments    evaluation    next    conclusion  
Introduction ,[object Object],[object Object],[object Object],[object Object],[object Object],   intro     wikipedia    experiments    evaluation    next    conclusion  
Motivation ,[object Object],[object Object],[object Object],[object Object],[object Object],   intro     wikipedia    experiments    evaluation    next    conclusion  
Approach ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],   intro     wikipedia    experiments    evaluation    next    conclusion  
What’s a document about? ,[object Object],[object Object],[object Object],[object Object],[object Object],   intro     wikipedia    experiments    evaluation    next    conclusion  
Wikitology ! ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],   intro      wikipedia     experiments    evaluation    next    conclusion  
Wikipedia Graph Structures ,[object Object],   intro      wikipedia     experiments    evaluation    next    conclusion   ,[object Object]
Methods ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Spreading Activation ,[object Object],[object Object],   intro    wikipedia     experiments     evaluation    next    conclusion  
Spreading Activation Start with an initial set of activated nodes
Spreading Activation At each pulse/iteration, spread activation to adjacent nodes
Spreading Activation ,[object Object],[object Object],[object Object],[object Object],[object Object],Some nodes will have higher activation than others
Method 1 Query doc(s) similar to Cosine similarity Similar Wikipedia Articles Using Wikipedia Article Text and Categories to Predict Concepts 0.2 0.1 0.3 0.8 0.2 Input
Method 1 Query doc(s) similar to Cosine similarity Wikipedia Category Graph Similar Wikipedia Articles Using Wikipedia Article Text and Categories to Predict Concepts 0.2 0.1 0.3 0.8 0.2 Input
Method 1 Query doc(s) similar to ,[object Object],[object Object],[object Object],Cosine similarity Wikipedia Category Graph Similar Wikipedia Articles Using Wikipedia Article Text and Categories to Predict Concepts 0.2 0.1 0.3 0.8 0.2 0.9 3 Input Output
Method 2 Query doc(s) Similar to Cosine similarity Wikipedia Category Graph Using Spreading Activation on Category Links Graph to get Aggregated Concepts 0.2 0.1 0.3 0.8 0.2 Input Ranked Concepts based  on Final Activation Score Output Spreading Activation Input Function Output Function
[object Object],[object Object],[object Object]
Method 3  Query doc(s) Similar To Ranked Concepts based on Final Activation Score Spreading Activation Threshold :  Ignore Spreading Activation to articles with less than 0.4  Cosine similarity score Edge Weights : Cosine  similarity between linked articles Wikipedia Article Links Graph Using Spreading Activation on  Article Links  Graph Node Input Function Node Output Function Output Input
Prediction for Single Test Document Preliminary Experiments    intro    wikipedia    experiments     evaluation     next    conclusion   ,[object Object],[object Object],[object Object],More pulses -> More Generalized Concepts Test Document Title Method 1 Ranking Categories Directly  Method 2 Spreading Activation Pulses=2 Method 2 Spreading Activation Pulses=3 Weather Prediction of thunder storms (CNN) “ Weather_Hazards” “ Winds” “ Severe_weather_and_convection”  “ Weather_Hazards” “ Current_events” “ Types_of_cyclone” “ Meterology” “ Nature” “ Weather”
Test Document Titles in the Set: (Wikipedia Articles) Crop_rotation  Permaculture  Beneficial_insects Neem  Lady_Bird Principles_of_Organic_Agriculture Rhizobia Biointensive Inter­cropping Green_manure  Preliminary Experiments Prediction for Set of Test Documents    intro    wikipedia    experiments     evaluation     next    conclusion   Concept not in the Category Hierarchy Method 1 Ranking Categories Directly Method 2 (2 pulses) Spreading Activation on Category links Graph Method 3 (2 pulses) Spreading Activation on Article Links Graph Agriculture Sustainable_technologies Crops Agronomy Permaculture Skills Applied_sciences Land_management Food_industry Agriculture Organic_farming Sustainable_agriculture Organic_gardening Agriculture Companion_planting
Average Similarity Query doc(s) similar to Cosine similarity 0.9 0.5 0.7 0.8 0.2 0.5 + 0.9 + 0.7 + 0.2 + 0.8 5    intro    wikipedia    experiments     evaluation     next    conclusion   Evaluation ,[object Object],[object Object]
Evaluation ,[object Object],Antibiotics Medical Treatments Medicines Oxytetracyclin Tetracyclin 1st Observation Articles are linked often with super and sub categories both
Category Prediction Evaluation    intro    wikipedia    experiments     evaluation     next    conclusion   ,[object Object],[object Object],M1 Method 1 SA1 Spreading Activation pulse(s)= 1 SA2 Spreading Activation pulse(s)=2
Article Links Prediction Evaluation ,[object Object],[object Object],Similar Documents, N = 5 Spreading Activation pulses=1    intro    wikipedia    experiments     evaluation     next    conclusion  
Prediction Accuracy  ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],   intro    wikipedia    experiments     evaluation     next    conclusion  
Potential Applications ,[object Object],[object Object],[object Object]
Future Work ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],   intro    wikipedia    experiments    evaluation     next     conclusion  
Document Expansion  with Wikipedia Derived Ontology Terms   * ,[object Object],[object Object],*  In Collaboration with Paul McNamee, John Hopkins University Applied Physics Laboratory Doc: FT921-4598  (3/9/92) ... Alan Turing, described as a brilliant mathematician and a key figure in the breaking of the Nazis' Enigma codes. Prof IJ Good says it is as well that British security was unaware of Turing's homosexuality, otherwise he might have been fired 'and we might have lost the war'. In 1950 Turing wrote the seminal paper 'Computing Machinery And Intelligence', but in 1954 killed himself ... Turing_machine, Turing_test, Church_Turing_thesis, Halting_problem, Computable_number, Bombe, Alan_Turing, Recusion_theory, Formal_methods,  Computational_models, Theory_of_computation, Theoretical_computer_science, Artificial_Intelligence
Conclusion ,[object Object],[object Object],[object Object],[object Object],   intro    wikipedia    experiments    evaluation    next     conclusion   
References ,[object Object],[object Object],[object Object],[object Object]
References ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Thank you Questions and Suggestions?

More Related Content

What's hot

RAPID INDUCTION OF MULTIPLE TAXONOMIES FOR ENHANCED FACETED TEXT BROWSING
RAPID INDUCTION OF MULTIPLE TAXONOMIES FOR ENHANCED FACETED TEXT BROWSINGRAPID INDUCTION OF MULTIPLE TAXONOMIES FOR ENHANCED FACETED TEXT BROWSING
RAPID INDUCTION OF MULTIPLE TAXONOMIES FOR ENHANCED FACETED TEXT BROWSINGijaia
 
Twitter as a personalizable information service ii
Twitter as a personalizable information service iiTwitter as a personalizable information service ii
Twitter as a personalizable information service iiKan-Han (John) Lu
 
Natural Language Processing on Non-Textual Data
Natural Language Processing on Non-Textual DataNatural Language Processing on Non-Textual Data
Natural Language Processing on Non-Textual Datagpano
 
Discovering latent informaion by
Discovering latent informaion byDiscovering latent informaion by
Discovering latent informaion byijaia
 
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation SystemLatent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation SystemShailly Saxena
 
Investigating the Characteristics and Research Impact of Sentiments in Tweets...
Investigating the Characteristics and Research Impact of Sentiments in Tweets...Investigating the Characteristics and Research Impact of Sentiments in Tweets...
Investigating the Characteristics and Research Impact of Sentiments in Tweets...Aravind Sesagiri Raamkumar
 
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Parang Saraf
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerFrancesco Osborne
 
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...John McDonald
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsParang Saraf
 
EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...
EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...
EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...IJCI JOURNAL
 
Aspects of broad folksonomies
Aspects of broad folksonomiesAspects of broad folksonomies
Aspects of broad folksonomiesdermotte
 
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...IRJET Journal
 
Advantages of Query Biased Summaries in Information Retrieval
Advantages of Query Biased Summaries in Information RetrievalAdvantages of Query Biased Summaries in Information Retrieval
Advantages of Query Biased Summaries in Information RetrievalOnur Yılmaz
 
DOMAIN ONTOLOGY DEVELOPMENT FOR COMMUNICABLE DISEASES
DOMAIN ONTOLOGY DEVELOPMENT FOR COMMUNICABLE DISEASESDOMAIN ONTOLOGY DEVELOPMENT FOR COMMUNICABLE DISEASES
DOMAIN ONTOLOGY DEVELOPMENT FOR COMMUNICABLE DISEASEScscpconf
 
Domain ontology development for communicable diseases
Domain ontology development for communicable diseasesDomain ontology development for communicable diseases
Domain ontology development for communicable diseasescsandit
 
ODSC East 2017: Data Science Models For Good
ODSC East 2017: Data Science Models For GoodODSC East 2017: Data Science Models For Good
ODSC East 2017: Data Science Models For GoodKarry Lu
 
Query Dependent Pseudo-Relevance Feedback based on Wikipedia
Query Dependent Pseudo-Relevance Feedback based on WikipediaQuery Dependent Pseudo-Relevance Feedback based on Wikipedia
Query Dependent Pseudo-Relevance Feedback based on WikipediaYI-JHEN LIN
 

What's hot (20)

RAPID INDUCTION OF MULTIPLE TAXONOMIES FOR ENHANCED FACETED TEXT BROWSING
RAPID INDUCTION OF MULTIPLE TAXONOMIES FOR ENHANCED FACETED TEXT BROWSINGRAPID INDUCTION OF MULTIPLE TAXONOMIES FOR ENHANCED FACETED TEXT BROWSING
RAPID INDUCTION OF MULTIPLE TAXONOMIES FOR ENHANCED FACETED TEXT BROWSING
 
Twitter as a personalizable information service ii
Twitter as a personalizable information service iiTwitter as a personalizable information service ii
Twitter as a personalizable information service ii
 
Natural Language Processing on Non-Textual Data
Natural Language Processing on Non-Textual DataNatural Language Processing on Non-Textual Data
Natural Language Processing on Non-Textual Data
 
Discovering latent informaion by
Discovering latent informaion byDiscovering latent informaion by
Discovering latent informaion by
 
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation SystemLatent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
 
Investigating the Characteristics and Research Impact of Sentiments in Tweets...
Investigating the Characteristics and Research Impact of Sentiments in Tweets...Investigating the Characteristics and Research Impact of Sentiments in Tweets...
Investigating the Characteristics and Research Impact of Sentiments in Tweets...
 
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
 
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector Representations
 
EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...
EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...
EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...
 
Sybrandt Thesis Proposal Presentation
Sybrandt Thesis Proposal PresentationSybrandt Thesis Proposal Presentation
Sybrandt Thesis Proposal Presentation
 
Slideshow ire
Slideshow ireSlideshow ire
Slideshow ire
 
Aspects of broad folksonomies
Aspects of broad folksonomiesAspects of broad folksonomies
Aspects of broad folksonomies
 
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
 
Advantages of Query Biased Summaries in Information Retrieval
Advantages of Query Biased Summaries in Information RetrievalAdvantages of Query Biased Summaries in Information Retrieval
Advantages of Query Biased Summaries in Information Retrieval
 
DOMAIN ONTOLOGY DEVELOPMENT FOR COMMUNICABLE DISEASES
DOMAIN ONTOLOGY DEVELOPMENT FOR COMMUNICABLE DISEASESDOMAIN ONTOLOGY DEVELOPMENT FOR COMMUNICABLE DISEASES
DOMAIN ONTOLOGY DEVELOPMENT FOR COMMUNICABLE DISEASES
 
Domain ontology development for communicable diseases
Domain ontology development for communicable diseasesDomain ontology development for communicable diseases
Domain ontology development for communicable diseases
 
ODSC East 2017: Data Science Models For Good
ODSC East 2017: Data Science Models For GoodODSC East 2017: Data Science Models For Good
ODSC East 2017: Data Science Models For Good
 
Query Dependent Pseudo-Relevance Feedback based on Wikipedia
Query Dependent Pseudo-Relevance Feedback based on WikipediaQuery Dependent Pseudo-Relevance Feedback based on Wikipedia
Query Dependent Pseudo-Relevance Feedback based on Wikipedia
 

Similar to Wikipedia as an Ontology for Describing Documents

The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology:  A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology:  A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasAngelo Salatino
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasAngelo Salatino
 
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATIONONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATIONIJDKP
 
Effective Navigation of Query Results Based On Hierarchies
Effective Navigation of Query Results Based On HierarchiesEffective Navigation of Query Results Based On Hierarchies
Effective Navigation of Query Results Based On HierarchiesAkhil Ambekar
 
IRJET - Deep Collaborrative Filtering with Aspect Information
IRJET - Deep Collaborrative Filtering with Aspect InformationIRJET - Deep Collaborrative Filtering with Aspect Information
IRJET - Deep Collaborrative Filtering with Aspect InformationIRJET Journal
 
Law and Economics.docx
Law and Economics.docxLaw and Economics.docx
Law and Economics.docxwrite4
 
Automatically assigning Wikipedia articles to macrocategories.pdf
Automatically assigning Wikipedia articles to macrocategories.pdfAutomatically assigning Wikipedia articles to macrocategories.pdf
Automatically assigning Wikipedia articles to macrocategories.pdfCrystal Sanchez
 
Effective Extraction of Thematically Grouped Key Terms From Text
Effective Extraction of Thematically Grouped Key Terms From TextEffective Extraction of Thematically Grouped Key Terms From Text
Effective Extraction of Thematically Grouped Key Terms From Textmaria.grineva
 
Exploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsExploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsIRJET Journal
 
Mendeley: Recommendation Systems for Academic Literature
Mendeley: Recommendation Systems for Academic LiteratureMendeley: Recommendation Systems for Academic Literature
Mendeley: Recommendation Systems for Academic LiteratureKris Jack
 
Taxonomies for Publishing: Enhancing the User Experience
Taxonomies for Publishing: Enhancing the User ExperienceTaxonomies for Publishing: Enhancing the User Experience
Taxonomies for Publishing: Enhancing the User ExperienceTSoholt
 
Extracting Key Terms From Noisy and Multi-theme Documents
Extracting Key Terms From Noisy and Multi-theme DocumentsExtracting Key Terms From Noisy and Multi-theme Documents
Extracting Key Terms From Noisy and Multi-theme Documentsmaria.grineva
 
ScienceDirect Presentation: Seton Hall
ScienceDirect Presentation: Seton HallScienceDirect Presentation: Seton Hall
ScienceDirect Presentation: Seton Hallrachelmccullough
 
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...ijnlc
 
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCESFINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCESkevig
 
From Bibliometrics to Cybermetrics - a book chapter by Nicola de Bellis
From Bibliometrics to Cybermetrics - a book chapter by Nicola de BellisFrom Bibliometrics to Cybermetrics - a book chapter by Nicola de Bellis
From Bibliometrics to Cybermetrics - a book chapter by Nicola de BellisXanat V. Meza
 
Automatic and unsupervised topic discovery in social networks
Automatic and unsupervised topic discovery in social networksAutomatic and unsupervised topic discovery in social networks
Automatic and unsupervised topic discovery in social networksAntonio Moreno
 
Use of Wikipedia categories on information retrieval: a brief research
Use of Wikipedia categories on information retrieval: a brief researchUse of Wikipedia categories on information retrieval: a brief research
Use of Wikipedia categories on information retrieval: a brief researchJesús Tramullas
 
How to conduct_a_systematic_or_evidence_review
How to conduct_a_systematic_or_evidence_reviewHow to conduct_a_systematic_or_evidence_review
How to conduct_a_systematic_or_evidence_reviewEaglefly Fly
 

Similar to Wikipedia as an Ontology for Describing Documents (20)

The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology:  A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology:  A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
 
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATIONONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
 
Effective Navigation of Query Results Based On Hierarchies
Effective Navigation of Query Results Based On HierarchiesEffective Navigation of Query Results Based On Hierarchies
Effective Navigation of Query Results Based On Hierarchies
 
IRJET - Deep Collaborrative Filtering with Aspect Information
IRJET - Deep Collaborrative Filtering with Aspect InformationIRJET - Deep Collaborrative Filtering with Aspect Information
IRJET - Deep Collaborrative Filtering with Aspect Information
 
Law and Economics.docx
Law and Economics.docxLaw and Economics.docx
Law and Economics.docx
 
Automatically assigning Wikipedia articles to macrocategories.pdf
Automatically assigning Wikipedia articles to macrocategories.pdfAutomatically assigning Wikipedia articles to macrocategories.pdf
Automatically assigning Wikipedia articles to macrocategories.pdf
 
Effective Extraction of Thematically Grouped Key Terms From Text
Effective Extraction of Thematically Grouped Key Terms From TextEffective Extraction of Thematically Grouped Key Terms From Text
Effective Extraction of Thematically Grouped Key Terms From Text
 
Exploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsExploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining Applications
 
Mendeley: Recommendation Systems for Academic Literature
Mendeley: Recommendation Systems for Academic LiteratureMendeley: Recommendation Systems for Academic Literature
Mendeley: Recommendation Systems for Academic Literature
 
Taxonomies for Publishing: Enhancing the User Experience
Taxonomies for Publishing: Enhancing the User ExperienceTaxonomies for Publishing: Enhancing the User Experience
Taxonomies for Publishing: Enhancing the User Experience
 
Extracting Key Terms From Noisy and Multi-theme Documents
Extracting Key Terms From Noisy and Multi-theme DocumentsExtracting Key Terms From Noisy and Multi-theme Documents
Extracting Key Terms From Noisy and Multi-theme Documents
 
SciVerse @ TJU
SciVerse @ TJUSciVerse @ TJU
SciVerse @ TJU
 
ScienceDirect Presentation: Seton Hall
ScienceDirect Presentation: Seton HallScienceDirect Presentation: Seton Hall
ScienceDirect Presentation: Seton Hall
 
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
 
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCESFINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
 
From Bibliometrics to Cybermetrics - a book chapter by Nicola de Bellis
From Bibliometrics to Cybermetrics - a book chapter by Nicola de BellisFrom Bibliometrics to Cybermetrics - a book chapter by Nicola de Bellis
From Bibliometrics to Cybermetrics - a book chapter by Nicola de Bellis
 
Automatic and unsupervised topic discovery in social networks
Automatic and unsupervised topic discovery in social networksAutomatic and unsupervised topic discovery in social networks
Automatic and unsupervised topic discovery in social networks
 
Use of Wikipedia categories on information retrieval: a brief research
Use of Wikipedia categories on information retrieval: a brief researchUse of Wikipedia categories on information retrieval: a brief research
Use of Wikipedia categories on information retrieval: a brief research
 
How to conduct_a_systematic_or_evidence_review
How to conduct_a_systematic_or_evidence_reviewHow to conduct_a_systematic_or_evidence_review
How to conduct_a_systematic_or_evidence_review
 

Recently uploaded

FOREX FUNDAMENTALS: A BEGINNER'S GUIDE.pdf
FOREX FUNDAMENTALS: A BEGINNER'S GUIDE.pdfFOREX FUNDAMENTALS: A BEGINNER'S GUIDE.pdf
FOREX FUNDAMENTALS: A BEGINNER'S GUIDE.pdfCocity Enterprises
 
Group 8 - Goldman Sachs & 1MDB Case Studies
Group 8 - Goldman Sachs & 1MDB Case StudiesGroup 8 - Goldman Sachs & 1MDB Case Studies
Group 8 - Goldman Sachs & 1MDB Case StudiesNghiaPham100
 
Stock Market Brief Deck (Under Pressure).pdf
Stock Market Brief Deck (Under Pressure).pdfStock Market Brief Deck (Under Pressure).pdf
Stock Market Brief Deck (Under Pressure).pdfMichael Silva
 
Responsible Finance Principles and Implication
Responsible Finance Principles and ImplicationResponsible Finance Principles and Implication
Responsible Finance Principles and ImplicationNghiaPham100
 
Technology industry / Finnish economic outlook
Technology industry / Finnish economic outlookTechnology industry / Finnish economic outlook
Technology industry / Finnish economic outlookTechFinland
 
Seeman_Fiintouch_LLP_Newsletter_May-2024.pdf
Seeman_Fiintouch_LLP_Newsletter_May-2024.pdfSeeman_Fiintouch_LLP_Newsletter_May-2024.pdf
Seeman_Fiintouch_LLP_Newsletter_May-2024.pdfAshis Kumar Dey
 
logistics industry development power point ppt.pdf
logistics industry development power point ppt.pdflogistics industry development power point ppt.pdf
logistics industry development power point ppt.pdfSalimullah13
 
Call Girls Howrah ( 8250092165 ) Cheap rates call girls | Get low budget
Call Girls Howrah ( 8250092165 ) Cheap rates call girls | Get low budgetCall Girls Howrah ( 8250092165 ) Cheap rates call girls | Get low budget
Call Girls Howrah ( 8250092165 ) Cheap rates call girls | Get low budgetSareena Khatun
 
cost-volume-profit analysis.ppt(managerial accounting).pptx
cost-volume-profit analysis.ppt(managerial accounting).pptxcost-volume-profit analysis.ppt(managerial accounting).pptx
cost-volume-profit analysis.ppt(managerial accounting).pptxazadalisthp2020i
 
abortion pills in Riyadh Saudi Arabia (+919707899604)cytotec pills in dammam
abortion pills in Riyadh Saudi Arabia (+919707899604)cytotec pills in dammamabortion pills in Riyadh Saudi Arabia (+919707899604)cytotec pills in dammam
abortion pills in Riyadh Saudi Arabia (+919707899604)cytotec pills in dammamsamsungultra782445
 
Explore Dual Citizenship in Africa | Citizenship Benefits & Requirements
Explore Dual Citizenship in Africa | Citizenship Benefits & RequirementsExplore Dual Citizenship in Africa | Citizenship Benefits & Requirements
Explore Dual Citizenship in Africa | Citizenship Benefits & Requirementsmarketingkingdomofku
 
Q1 2024 Conference Call Presentation vF.pdf
Q1 2024 Conference Call Presentation vF.pdfQ1 2024 Conference Call Presentation vF.pdf
Q1 2024 Conference Call Presentation vF.pdfAdnet Communications
 
In Sharjah ௵(+971)558539980 *_௵abortion pills now available.
In Sharjah ௵(+971)558539980 *_௵abortion pills now available.In Sharjah ௵(+971)558539980 *_௵abortion pills now available.
In Sharjah ௵(+971)558539980 *_௵abortion pills now available.hyt3577
 
falcon-invoice-discounting-unlocking-prime-investment-opportunities
falcon-invoice-discounting-unlocking-prime-investment-opportunitiesfalcon-invoice-discounting-unlocking-prime-investment-opportunities
falcon-invoice-discounting-unlocking-prime-investment-opportunitiesFalcon Invoice Discounting
 
QATAR Pills for Abortion -+971*55*85*39*980-in Dubai. Abu Dhabi.
QATAR Pills for Abortion -+971*55*85*39*980-in Dubai. Abu Dhabi.QATAR Pills for Abortion -+971*55*85*39*980-in Dubai. Abu Dhabi.
QATAR Pills for Abortion -+971*55*85*39*980-in Dubai. Abu Dhabi.hyt3577
 
Avoidable Errors in Payroll Compliance for Payroll Services Providers - Globu...
Avoidable Errors in Payroll Compliance for Payroll Services Providers - Globu...Avoidable Errors in Payroll Compliance for Payroll Services Providers - Globu...
Avoidable Errors in Payroll Compliance for Payroll Services Providers - Globu...globusfinanza
 
abortion pills in Jeddah Saudi Arabia (+919707899604)cytotec pills in Riyadh
abortion pills in Jeddah Saudi Arabia (+919707899604)cytotec pills in Riyadhabortion pills in Jeddah Saudi Arabia (+919707899604)cytotec pills in Riyadh
abortion pills in Jeddah Saudi Arabia (+919707899604)cytotec pills in Riyadhsamsungultra782445
 
uk-no 1 kala ilam expert specialist in uk and qatar kala ilam expert speciali...
uk-no 1 kala ilam expert specialist in uk and qatar kala ilam expert speciali...uk-no 1 kala ilam expert specialist in uk and qatar kala ilam expert speciali...
uk-no 1 kala ilam expert specialist in uk and qatar kala ilam expert speciali...batoole333
 
Mahendragarh Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Mahendragarh Escorts 🥰 8617370543 Call Girls Offer VIP Hot GirlsMahendragarh Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Mahendragarh Escorts 🥰 8617370543 Call Girls Offer VIP Hot GirlsDeepika Singh
 
Law of Demand.pptxnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Law of Demand.pptxnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnLaw of Demand.pptxnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Law of Demand.pptxnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnTintoTom3
 

Recently uploaded (20)

FOREX FUNDAMENTALS: A BEGINNER'S GUIDE.pdf
FOREX FUNDAMENTALS: A BEGINNER'S GUIDE.pdfFOREX FUNDAMENTALS: A BEGINNER'S GUIDE.pdf
FOREX FUNDAMENTALS: A BEGINNER'S GUIDE.pdf
 
Group 8 - Goldman Sachs & 1MDB Case Studies
Group 8 - Goldman Sachs & 1MDB Case StudiesGroup 8 - Goldman Sachs & 1MDB Case Studies
Group 8 - Goldman Sachs & 1MDB Case Studies
 
Stock Market Brief Deck (Under Pressure).pdf
Stock Market Brief Deck (Under Pressure).pdfStock Market Brief Deck (Under Pressure).pdf
Stock Market Brief Deck (Under Pressure).pdf
 
Responsible Finance Principles and Implication
Responsible Finance Principles and ImplicationResponsible Finance Principles and Implication
Responsible Finance Principles and Implication
 
Technology industry / Finnish economic outlook
Technology industry / Finnish economic outlookTechnology industry / Finnish economic outlook
Technology industry / Finnish economic outlook
 
Seeman_Fiintouch_LLP_Newsletter_May-2024.pdf
Seeman_Fiintouch_LLP_Newsletter_May-2024.pdfSeeman_Fiintouch_LLP_Newsletter_May-2024.pdf
Seeman_Fiintouch_LLP_Newsletter_May-2024.pdf
 
logistics industry development power point ppt.pdf
logistics industry development power point ppt.pdflogistics industry development power point ppt.pdf
logistics industry development power point ppt.pdf
 
Call Girls Howrah ( 8250092165 ) Cheap rates call girls | Get low budget
Call Girls Howrah ( 8250092165 ) Cheap rates call girls | Get low budgetCall Girls Howrah ( 8250092165 ) Cheap rates call girls | Get low budget
Call Girls Howrah ( 8250092165 ) Cheap rates call girls | Get low budget
 
cost-volume-profit analysis.ppt(managerial accounting).pptx
cost-volume-profit analysis.ppt(managerial accounting).pptxcost-volume-profit analysis.ppt(managerial accounting).pptx
cost-volume-profit analysis.ppt(managerial accounting).pptx
 
abortion pills in Riyadh Saudi Arabia (+919707899604)cytotec pills in dammam
abortion pills in Riyadh Saudi Arabia (+919707899604)cytotec pills in dammamabortion pills in Riyadh Saudi Arabia (+919707899604)cytotec pills in dammam
abortion pills in Riyadh Saudi Arabia (+919707899604)cytotec pills in dammam
 
Explore Dual Citizenship in Africa | Citizenship Benefits & Requirements
Explore Dual Citizenship in Africa | Citizenship Benefits & RequirementsExplore Dual Citizenship in Africa | Citizenship Benefits & Requirements
Explore Dual Citizenship in Africa | Citizenship Benefits & Requirements
 
Q1 2024 Conference Call Presentation vF.pdf
Q1 2024 Conference Call Presentation vF.pdfQ1 2024 Conference Call Presentation vF.pdf
Q1 2024 Conference Call Presentation vF.pdf
 
In Sharjah ௵(+971)558539980 *_௵abortion pills now available.
In Sharjah ௵(+971)558539980 *_௵abortion pills now available.In Sharjah ௵(+971)558539980 *_௵abortion pills now available.
In Sharjah ௵(+971)558539980 *_௵abortion pills now available.
 
falcon-invoice-discounting-unlocking-prime-investment-opportunities
falcon-invoice-discounting-unlocking-prime-investment-opportunitiesfalcon-invoice-discounting-unlocking-prime-investment-opportunities
falcon-invoice-discounting-unlocking-prime-investment-opportunities
 
QATAR Pills for Abortion -+971*55*85*39*980-in Dubai. Abu Dhabi.
QATAR Pills for Abortion -+971*55*85*39*980-in Dubai. Abu Dhabi.QATAR Pills for Abortion -+971*55*85*39*980-in Dubai. Abu Dhabi.
QATAR Pills for Abortion -+971*55*85*39*980-in Dubai. Abu Dhabi.
 
Avoidable Errors in Payroll Compliance for Payroll Services Providers - Globu...
Avoidable Errors in Payroll Compliance for Payroll Services Providers - Globu...Avoidable Errors in Payroll Compliance for Payroll Services Providers - Globu...
Avoidable Errors in Payroll Compliance for Payroll Services Providers - Globu...
 
abortion pills in Jeddah Saudi Arabia (+919707899604)cytotec pills in Riyadh
abortion pills in Jeddah Saudi Arabia (+919707899604)cytotec pills in Riyadhabortion pills in Jeddah Saudi Arabia (+919707899604)cytotec pills in Riyadh
abortion pills in Jeddah Saudi Arabia (+919707899604)cytotec pills in Riyadh
 
uk-no 1 kala ilam expert specialist in uk and qatar kala ilam expert speciali...
uk-no 1 kala ilam expert specialist in uk and qatar kala ilam expert speciali...uk-no 1 kala ilam expert specialist in uk and qatar kala ilam expert speciali...
uk-no 1 kala ilam expert specialist in uk and qatar kala ilam expert speciali...
 
Mahendragarh Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Mahendragarh Escorts 🥰 8617370543 Call Girls Offer VIP Hot GirlsMahendragarh Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Mahendragarh Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
 
Law of Demand.pptxnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Law of Demand.pptxnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnLaw of Demand.pptxnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Law of Demand.pptxnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
 

Wikipedia as an Ontology for Describing Documents

  • 1. Wikitology Wikipedia as an Ontology Zareen Syed, Tim Finin and Anupam Joshi zarsyed1@umbc.edu, finin@cs.umbc.edu, joshi@cs.umbc.edu University of Maryland Baltimore County
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11. Spreading Activation Start with an initial set of activated nodes
  • 12. Spreading Activation At each pulse/iteration, spread activation to adjacent nodes
  • 13.
  • 14. Method 1 Query doc(s) similar to Cosine similarity Similar Wikipedia Articles Using Wikipedia Article Text and Categories to Predict Concepts 0.2 0.1 0.3 0.8 0.2 Input
  • 15. Method 1 Query doc(s) similar to Cosine similarity Wikipedia Category Graph Similar Wikipedia Articles Using Wikipedia Article Text and Categories to Predict Concepts 0.2 0.1 0.3 0.8 0.2 Input
  • 16.
  • 17. Method 2 Query doc(s) Similar to Cosine similarity Wikipedia Category Graph Using Spreading Activation on Category Links Graph to get Aggregated Concepts 0.2 0.1 0.3 0.8 0.2 Input Ranked Concepts based on Final Activation Score Output Spreading Activation Input Function Output Function
  • 18.
  • 19. Method 3 Query doc(s) Similar To Ranked Concepts based on Final Activation Score Spreading Activation Threshold : Ignore Spreading Activation to articles with less than 0.4 Cosine similarity score Edge Weights : Cosine similarity between linked articles Wikipedia Article Links Graph Using Spreading Activation on Article Links Graph Node Input Function Node Output Function Output Input
  • 20.
  • 21. Test Document Titles in the Set: (Wikipedia Articles) Crop_rotation Permaculture Beneficial_insects Neem Lady_Bird Principles_of_Organic_Agriculture Rhizobia Biointensive Inter­cropping Green_manure Preliminary Experiments Prediction for Set of Test Documents  intro  wikipedia  experiments  evaluation  next  conclusion  Concept not in the Category Hierarchy Method 1 Ranking Categories Directly Method 2 (2 pulses) Spreading Activation on Category links Graph Method 3 (2 pulses) Spreading Activation on Article Links Graph Agriculture Sustainable_technologies Crops Agronomy Permaculture Skills Applied_sciences Land_management Food_industry Agriculture Organic_farming Sustainable_agriculture Organic_gardening Agriculture Companion_planting
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33. Thank you Questions and Suggestions?