in collaboration with  Georgiana Ifrim, Gjergji Kasneci, Josiane Parreira, Maya Ramanath,  Ralf Schenkel, Fabian Suchanek, Martin Theobald
DB and IR: Two Parallel Universes
aspect | Database Systems | Information Retrieval
canonical application | accounting | libraries
data type | numbers, short strings | text
foundation | algebraic / logic based | probabilistic / statistics based
search paradigm | Boolean retrieval (exact queries, result sets/bags) | ranked retrieval (vague queries, result lists)
market leaders | Oracle, IBM DB2, MS SQL Server, etc. | Google, Yahoo!, MSN, Verity, Fast, etc.
Parallel universes forever?
Why DB&IR Now? – Application Needs
Simplify life for application areas like:
Typical data: Disease (DId, Name, Category, Pathogen …), UMLS-Categories ( … ), Patient (… Age, HId, Date, Report, TreatedDId), Hospital (HId, Address …)
Typical query: symptoms of tropical virus diseases and reported anomalies with young patients in central Europe in the last two weeks
Why DB&IR Now? – Platform Desiderata
Structured data (records), structured search (SQL, XQuery): DB Systems
Unstructured data (documents), unstructured search (keywords): IR Systems, Search Engines
Structured data (records), unstructured search (keywords): Keyword Search on Relational Graphs (IIT Bombay, UCSD, MSR, Hebrew U, CU Hong Kong, Duke U, ...)
Unstructured data (documents), structured search (SQL, XQuery): Querying entities & relations from IE (MSR Beijing, UW Seattle, IBM Almaden, UIUC, MPI, …)
Platform desiderata (from app developer's viewpoint):
Integrated DB&IR Platform
Why DB&IR Forever?
Turn the Web, Web 2.0, and Web 3.0 into the world's most comprehensive knowledge base ("semantic DB")!
source | 2000 | 2007
indexed Web | 2 billion | 20 billion
Flickr photos | --- | 100 million
digital photos | ? | 150 billion
Wikipedia | 8 000 | 1.8 million
OECD researchers | 7.4 million | 8.4 million
patents world-wide | ? | 60 million
US Library of Congress | 115 million | 134 million
Google Scholar | --- | 500 million
Outline
• Past: Matter, Antimatter, and Wormholes
• Present: XML and Graph IR
• Future: From Data to Knowledge
Parallel Universes: A Closer Look
Matter | Antimatter
Timeline of work between DB and IR, 1990 to 2005:
VAGUE (Motro); Proximal Nodes (Baeza-Yates et al.); WHIRL (Cohen); Prob. Datalog (Fuhr et al.); INEX; XPath; XPath Full-Text; Prob. DB (Cavallo & Pittarelli); Prob. Tuples (Barbara et al.)
Web Entity Search: Libra, Avatar, ExDB …
Faceted Search: Flamenco …
1st Gen. XML IR: XXL, XIRQL, Elixir, JuruXML
Multimedia IR
Web Query Languages: W3QS, WebOQL, Araneus …
Semistructured Data: Lore, Xyleme …
2nd Gen. XML IR: XRank, Timber, TIJAH, XSearch, FleXPath, CoXML, TopX, MarkLogic, Fast …
Uncertain & Prob. Relations: Mystiq, Trio …
Struct. Docs; Deep Web Search; Digital Libraries; Graph IR
WHIRL: IR over Relations [W.W. Cohen: SIGMOD'98]
Add text-similarity selection and join to relational algebra
Example: Select * From Movies M, Reviews R Where M.Plot ~ "fight" And M.Year > 1990 And R.Rating > 3 And M.Title ~ R.Title And M.Plot ~ R.Comment
Movies (Title, Plot, …, Year): titles Matrix, Hero, Shrek 2; years 1999, 2002, 2004; plot snippets such as "… In the near future … computer hacker Neo … fight training …", "… In ancient China … fights … sword fight … fights Broken Sword …", "In Far Far Away … our lovely hero fights with cat killer …"
Reviews (Title, Comment, …, Rating): titles Matrix 1, Matrix Reloaded, Matrix Eigenvalues, Ying xiong aka. Hero; ratings 4, 1, 5, 5; comment snippets such as "… matrix spectrum … orthonormal …", "… fight for peace …", "… sword fight … dramatic colors …", "… cool fights … new techniques …", "… fights … and more fights … fairly boring …"
Scoring and ranking:
x_j ~ tf (word j in x) · idf (word j), with dampening & normalization
s (<x,y>, q: A~B) = cosine (x.A, y.B)
s (<x,y>, q_1 ∧ … ∧ q_m) = product over i = 1..m of s (<x,y>, q_i)
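The scoring above can be illustrated with a small similarity join on titles. This is a minimal sketch, not WHIRL's implementation: the dampening (1 + log tf), the smoothed idf, and the function names `build_idf`, `tfidf_vector`, and `similarity_join` are assumptions made for illustration, and the rows are reduced to the titles from the slide.

```python
import math
from collections import Counter

def tokenize(text):
    return [w.lower() for w in text.split()]

def build_idf(docs):
    """Smoothed idf per word (assumed form: log(1 + N / df))."""
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(tokenize(d)))
    return {w: math.log(1 + n / c) for w, c in df.items()}

def tfidf_vector(text, idf):
    """Unit-length vector with dampened tf times idf per word."""
    tf = Counter(tokenize(text))
    vec = {w: (1 + math.log(c)) * idf.get(w, 0.0) for w, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {w: v / norm for w, v in vec.items()} if norm else {}

def cosine(u, v):
    # Both vectors are already normalized, so the dot product is the cosine.
    return sum(x * v.get(w, 0.0) for w, x in u.items())

def similarity_join(movies, reviews, k=3):
    """Rank (movie, review) pairs for the condition M.Title ~ R.Title."""
    idf = build_idf([m["title"] for m in movies] + [r["title"] for r in reviews])
    scored = []
    for m in movies:
        mv = tfidf_vector(m["title"], idf)
        for r in reviews:
            s = cosine(mv, tfidf_vector(r["title"], idf))
            if s > 0:
                scored.append((s, m["title"], r["title"]))
    return sorted(scored, reverse=True)[:k]

movies = [{"title": "Matrix"}, {"title": "Hero"}, {"title": "Shrek 2"}]
reviews = [{"title": "Matrix Reloaded"},
           {"title": "Ying xiong aka. Hero"},
           {"title": "Matrix Eigenvalues"}]
top = similarity_join(movies, reviews)
```

Because the result is a ranked list rather than a set, a review titled "Matrix Eigenvalues" still joins with "Matrix"; further similarity conditions in the same query would have to push such pairs down the ranking.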
XXL: Early XML IR [Anja Theobald, GW: Adding Relevance to XML, WebDB'00]
Which professors from Saarbruecken (SB) are teaching IR and have research projects on XML?
Union of heterogeneous sources without global schema
Similarity-aware XPath: //~Professor [//* = "~SB"] [//~Course [//* = "~IR"]] [//~Research [//* = "~XML"]]
Example document 1: Professor (Name: Gerhard Weikum; Address: ... City: SB, Country: Germany; Teaching: Course (Title: IR; Description: Information retrieval ...; Syllabus ...; Book; Article ...); Research: Project (Title: Intelligent Search of Heterogeneous XML Data; Funding: EU ...))
Example document 2: Lecturer (Name: Ralf Schenkel; Address: Max-Planck Institute for Informatics, Germany; Activities: Seminar (Contents: Ranked retrieval …; Literature: …); Scientific (Name: INEX task coordinator (Initiative for the Evaluation of XML …)); Other (Sponsor: EU …))
XXL: Early XML IR [Anja Theobald, GW: Adding Relevance to XML, WebDB'00]
Motivation: Union of heterogeneous sources has no schema
Which professors from Saarbruecken (SB) are teaching IR and have research projects on XML?
Similarity-aware XPath: //~Professor [//* = "~Saarbruecken"] [//~Course [//* = "~IR"]] [//~Research [//* = "~XML"]]
Example document 1: Professor (Name: Gerhard Weikum; Address: ... City: SB, Country: Germany; Teaching: Course (Title: IR; Description: Information retrieval ...; Syllabus ...; Book; Article ...); Research: Project (Title: Intelligent Search of Heterogeneous XML Data; Funding: EU ...))
Example document 2: Lecturer (Name: Ralf Schenkel; Address: Max-Planck Institute for Informatics, Germany; Activities: Seminar (Contents: Ranked retrieval …; Literature: …); Scientific (Name: INEX task coordinator (Initiative for the Evaluation of XML …)); Other (Sponsor: EU …))
Tag similarity: Wu&Palmer: |path| through lca(x,y); Dice coeff.: 2 #(x,y) / (#x + #y) on Web
Query expansion model: disjunction of tags
Ontology excerpt around "professor": magician, wizard, intellectual, artist, alchemist, director, primadonna, professor, teacher, scholar, academic / academician / faculty member, scientist, researcher, investigator, mentor, lecturer; weighted edges such as HYPONYM (0.749) and RELATED (0.48)
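The Dice coefficient and the "disjunction of tags" expansion can be sketched as follows. The counts, the similarity values in `related`, the threshold, and the helper names are hypothetical; the slide gives only the formula and two edge weights without saying which tag pairs they belong to.

```python
def dice(count_x, count_y, count_xy):
    """Dice coefficient 2 * #(x,y) / (#x + #y) from occurrence and co-occurrence counts."""
    total = count_x + count_y
    return 2.0 * count_xy / total if total else 0.0

def expand_tag(tag, related, threshold=0.4):
    """Expand ~tag into a weighted disjunction of sufficiently similar tags."""
    expansion = {tag: 1.0}
    for other, sim in related.get(tag, {}).items():
        if sim >= threshold:
            expansion[other] = sim
    return expansion

def tag_score(element_tag, expansion):
    """Score of an element for the expanded tag condition (0.0 if no tag in the disjunction matches)."""
    return expansion.get(element_tag, 0.0)

# Hypothetical similarity values for illustration only.
related = {"professor": {"lecturer": dice(100, 60, 40), "magician": 0.1}}
expansion = expand_tag("professor", related)
```

With these numbers a `Lecturer` element matches `//~Professor` with score 0.5, while `magician` falls below the threshold and is not part of the disjunction.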
The Past: Lessons Learned
precision vs. recall: //~Professor [...] compared with //{Professor, Researcher, Lecturer, Scientist, Scholar, Academic, ...}[...]
Example ontology terms: element, gold, produce, Golden Delicious, entity, food, substance, solid, edible fruit, apple, pome
Outline
✓ Past: Matter, Antimatter, and Wormholes
• Present: XML and Graph IR
• Future: From Data to Knowledge
TopX: 2nd Generation XML IR [Martin Theobald, Ralf Schenkel, GW: VLDB'05, VLDB Journal]
"Semantic" XPath Full-Text query: /Article [ftcontains(//Person, "Max Planck")] [ftcontains(//Work, "quantum physics")] //Children[@Gender = "female"]//Birthdates
supported by TopX engine: http://infao5501.ag5.mpi-sb.mpg.de:8080/topx/ http://topx.sourceforge.net
Commercial Break [Martin Theobald, Ralf Schenkel, GW: VLDB'05]
TopX demo today 3:30 – 5:30
Principled Ranking by Probabilistic IR
"God does not play dice." (Einstein) IR does.
odds for item d with terms d_i being relevant for query q = {q_1, …, q_m}
binary features, conditional independence of features [Robertson & Sparck-Jones 1976]
related to but different from statistical language models
Relationship to tf*idf
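The slide's own formulas did not survive, so the following is a sketch of the textbook form of the Robertson/Sparck Jones term weight under the binary independence assumptions named above, with the customary 0.5 smoothing. The function names and the toy statistics are assumptions. Without relevance information (R = r = 0) the weight reduces to log((N - n + 0.5) / (n + 0.5)), which is the idf-like quantity behind the relationship to tf*idf.

```python
import math

def rsj_weight(N, n, R=0, r=0):
    """Robertson/Sparck Jones weight of one term, with 0.5 smoothing.

    N: documents in the collection, n: documents containing the term,
    R: known relevant documents, r: relevant documents containing the term.
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))

def bim_score(query_terms, doc_terms, N, df):
    """Rank-equivalent log-odds: sum of weights of query terms present in the document."""
    return sum(rsj_weight(N, df[t]) for t in query_terms
               if t in doc_terms and t in df)

# Hypothetical collection statistics for illustration.
N = 1000
df = {"quantum": 10, "physics": 100, "the": 900}
score = bim_score(["quantum", "physics"], {"quantum", "physics", "the"}, N, df)
```

Rare terms dominate the score, and a term that occurs in more than half of the collection gets a negative weight under this smoothing.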
Probabilistic Ranking for SQL [S. Chaudhuri, G. Das, V. Hristidis, GW: TODS'06]
SQL queries that return many answers need ranking
odds for tuple d with attributes X ∪ Y relevant for query q: X_1=x_1 ∧ … ∧ X_m=x_m
Estimate probabilities, exploiting workload W:
From Tables and Trees to Graphs
Schema-agnostic keyword search over multiple tables: graph of tuples with foreign-key relationships as edges [BANKS, Discover, DBExplorer, KUPS, SphereSearch, BLINKS]
Example: Conferences (CId, Title, Location, Year), Journals (JId, Title), CPublications (PId, Title, CId), JPublications (PId, Title, Vol, No, Year), Authors (PId, Person), Editors (CId, Person)
Select * From * Where * Contains "Gray, DeWitt, XML, Performance" And Year > 95
Result is connected tree with nodes that contain as many query keywords as possible
Ranking: with nodeScore based on tf*idf or prob. IR and edgeScore reflecting importance of relationships (or confidence, authority, etc.)
Top-k querying: compute best trees, e.g. Steiner trees (NP-hard)
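A toy version of keyword search over a tuple graph is sketched below. It is a heuristic, not an exact Steiner-tree algorithm and not the method of any of the cited systems: for every candidate root it connects the nearest tuple matching each keyword by a shortest path and keeps the cheapest root. The tuple ids, texts, and function names are hypothetical, and edge and node scores are ignored (every foreign-key hop costs 1).

```python
from collections import deque

def bfs_parents(graph, start):
    """Breadth-first search; returns {node: parent} for all nodes reachable from start."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in parents:
                parents[v] = u
                queue.append(v)
    return parents

def path_to_root(parents, node):
    """Nodes on the path from node back to the BFS start."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path

def best_answer_tree(graph, text, keywords):
    """Return (cost, root, edges) of the cheapest root-centered tree covering all keywords."""
    best = None
    for root in graph:
        parents = bfs_parents(graph, root)
        edges, cost = set(), 0
        for kw in keywords:
            hits = [n for n in parents if kw.lower() in text[n].lower()]
            if not hits:
                break  # this root cannot reach a tuple containing the keyword
            path = min((path_to_root(parents, n) for n in hits), key=len)
            cost += len(path) - 1
            edges.update(frozenset(e) for e in zip(path, path[1:]))
        else:
            if best is None or cost < best[0]:
                best = (cost, root, edges)
    return best

# Hypothetical tuples: one publication, two author rows, one conference row.
text = {"p1": "XML Performance Study", "a1": "Gray", "a2": "DeWitt", "c1": "Conference 1996"}
graph = {"p1": ["a1", "a2", "c1"], "a1": ["p1"], "a2": ["p1"], "c1": ["p1"]}
answer = best_answer_tree(graph, text, ["Gray", "DeWitt", "XML"])
```

Here the publication tuple becomes the root, joined to both author tuples. Shared path segments are counted once per keyword, so the cost is only an upper bound on the true tree size.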
The Present: Observations & Opportunities
Example schema graph labels: actor, movie, plot, director
Keyword query: "life physicist Max Planck"
Structured query: //article[//person "Max Planck"] [//category "physicist"] //biography
Outline
✓ Past: Matter, Antimatter, and Wormholes
✓ Present: XML and Graph IR
• Future: From Data to Knowledge
Knowledge Queries
Turn the Web, Web 2.0, and Web 3.0 into the world's most comprehensive knowledge base ("semantic DB")!
Answer "knowledge queries" such as:
proteins that inhibit both protease and some other enzyme
neutron stars with Xray bursts > 10^40 erg s^-1 & black holes in 10''
differences in Rembetiko music from Greece and from Turkey
connection between Thomas Mann and Goethe
market impact of Web 2.0 technology in December 2006
sympathy or antipathy for Germany from May to August 2006
Nobel laureate who survived both world wars and his children
drama with three women making a prophecy to a British nobleman that he will become king
Three Roads to Knowledge
High-Quality Knowledge Sources
growing with strong momentum
High-Quality Knowledge Sources
General-purpose thesauri and concept networks: WordNet family
enzyme -- (any of several complex proteins that are produced by cells and act as catalysts in specific biochemical reactions)
  => protein -- (any of a large group of nitrogenous organic compounds that are essential constituents of living cells; ...)
    => macromolecule, supermolecule
    ...
      => organic compound -- (any compound of carbon and another element or a radical)
      ...
  => catalyst, accelerator -- ((chemistry) a substance that initiates or accelerates a chemical reaction without itself being affected)
    => activator -- ((biology) any agency bringing about activation; ...)
High-Quality Knowledge Sources Wikipedia  and other lexical sources
Exploit Hand-Crafted Knowledge
Wikipedia, WordNet, and other lexical sources
{{Infobox_Scientist
| name = Max Planck
| birth_date = [[April 23]], [[1858]]
| birth_place = [[Kiel]], [[Germany]]
| death_date = [[October 4]], [[1947]]
| death_place = [[Göttingen]], [[Germany]]
| residence = [[Germany]]
| nationality = [[Germany|German]]
| field = [[Physicist]]
| work_institution = [[University of Kiel]]</br> [[Humboldt-Universität zu Berlin]]</br> [[Georg-August-Universität Göttingen]]
| alma_mater = [[Ludwig-Maximilians-Universität München]]
| doctoral_advisor = [[Philipp von Jolly]]
| doctoral_students = [[Gustav Ludwig Hertz]]</br> …
| known_for = [[Planck's constant]], [[Quantum mechanics|quantum theory]]
| prizes = [[Nobel Prize in Physics]] (1918)
…
YAGO: Yet Another Great Ontology [F. Suchanek, G. Kasneci, GW: WWW 2007]
Examples:
entity1 | relation | entity2
Max_Planck | bornIn | Kiel
Kiel | isInstanceOf | City
YAGO Knowledge Representation
Three layers: concepts, individuals, words
Concepts: Entity, Person, Scientist, Physicist, Biologist, Location, City, Country
Individuals: Max_Planck, Erwin_Planck, Kiel, Nobel Prize, April 23, 1858, October 4, 1947
Words: "Max Planck", "Dr. Planck", "Max Karl Ernst Ludwig Planck"
Edge labels: subclass, instanceOf, means, bornOn, diedOn, bornIn, hasWon, FatherOf
Online access and download at http://www.mpi-inf.mpg.de/~suchanek/yago/
Accuracy: 97%
Knowledge Base | # Facts
KnowItAll | 30 000
SUMO | 60 000
WordNet | 200 000
OpenCyc | 300 000
Cyc | 5 000 000
YAGO | 6 000 000
NAGA: Graph IR on YAGO [G. Kasneci et al.: WWW'07]
Graph-based search on YAGO-style knowledge bases with built-in ranking based on confidence and informativeness; statistical language model for result graphs
conjunctive queries: $x isa scientist; $x bornIn Kiel
queries with regular expressions: $x (hasFirstName | hasLastName) Ling; $x isa scientist; $x worksFor $y; $y locatedIn* Zhejiang; $x (coAuthor | advisor)* Beng Chin Ooi
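The conjunctive query form can be illustrated with a tiny triple matcher. This is a sketch under stated assumptions, not NAGA's engine: facts are plain (subject, relation, object) tuples, `isa` is stored directly instead of being derived from instanceOf and subclass edges, regular-expression edges are not handled, and ranking by confidence and informativeness is left out. The function names are hypothetical.

```python
def is_var(term):
    return term.startswith("$")

def match(pattern, fact, binding):
    """Extend a variable binding so the triple pattern equals the fact, or return None."""
    new = dict(binding)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # Bind the variable on first use; reject conflicting later uses.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def query(facts, patterns):
    """Evaluate a conjunctive query given as a list of triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for fact in facts
                    if (b2 := match(pattern, fact, b)) is not None]
    return bindings

facts = [
    ("Max_Planck", "bornIn", "Kiel"),
    ("Max_Planck", "isa", "scientist"),
    ("Albert_Einstein", "bornIn", "Ulm"),
    ("Albert_Einstein", "isa", "scientist"),
    ("Kiel", "isa", "city"),
]
answers = query(facts, [("$x", "isa", "scientist"), ("$x", "bornIn", "Kiel")])
```

The query "$x isa scientist; $x bornIn Kiel" binds $x to Max_Planck only; Albert_Einstein satisfies the first pattern but fails the second.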
Ranking Factors
bornIn (Max Planck, Kiel) from "Max Planck was born in Kiel" (Wikipedia)
livesIn (Elvis Presley, Mars) from "They believe Elvis hides on Mars" (Martian Bloggeria)
q: isa (Einstein, $y): isa (Einstein, scientist), isa (Einstein, vegetarian)
q: isa ($x, vegetarian): isa (Einstein, vegetarian), isa (Al Nobody, vegetarian)
Example graph: nodes Einstein, vegetarian, Bohr, Nobel Prize, Tom Cruise, 1962; edge labels isa, bornIn, diedIn, won
Information Extraction (IE): Text to Records
combine NLP, pattern matching, lexicons, statistical learning
Person | BirthDate | BirthPlace | ...
Max Planck | 4/23, 1858 | Kiel
Albert Einstein | 3/14, 1879 | Ulm
Mahatma Gandhi | 10/2, 1869 | Porbandar
Person | ScientificResult
Max Planck | Quantum Theory
Person | Collaborator
Max Planck | Albert Einstein
Max Planck | Niels Bohr
Constant | Value | Dimension
Planck's constant | 6.626 × 10^-34 | Js
Knowledge Acquisition from the Web
Existing approaches and tools (Snowball [Gravano et al. 2000], KnowItAll [Etzioni et al. 2004], …): almost-unsupervised pattern matching and learning:
seeds (known facts) → patterns (in text) → (extraction) rule → (new) facts
Methods for Web-Scale Fact Extraction
seeds → text → rules → new facts
Example:
seed | text | rule
city (Seattle) | in downtown Seattle | in downtown X
city (Seattle) | Seattle and other towns | X and other towns
city (Las Vegas) | Las Vegas and other towns | X and other towns
plays (Zappa, guitar) | playing guitar: … Zappa | playing Y: … X
plays (Davis, trumpet) | Davis … blows trumpet | X … blows Y
city (Beijing) | old center of Beijing | old center of X
plays (Coltrane, sax) | sax player Coltrane | Y player X
New facts: in downtown Beijing → city (Beijing); Coltrane blows sax → plays (C., sax)
Assessment of facts & generation of rules based on statistics
Rules can be more sophisticated: playing NN: (ADJ|ADV)* NP & class(NN)=instrument & class(head(NP))=person → plays(head(NP), NN)
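The seeds → text → rules → new facts loop can be sketched with regular expressions. This is a simplified illustration, not the method of Snowball or KnowItAll: `induce_pattern` and `extract` are hypothetical helpers, the slot expressions (capitalized phrase for X, single word for Y) are assumptions, the "Davis … blows trumpet" sentence is shortened to "Davis blows trumpet", and the statistical assessment of rules and facts is omitted.

```python
import re

# Assumed slot shapes: X is a capitalized phrase, Y a single word.
X_SLOT = r"(?P<X>[A-Z]\w*(?: [A-Z]\w*)*)"
Y_SLOT = r"(?P<Y>\w+)"

def induce_pattern(sentence, x, y=None):
    """Generalize a sentence mentioning a seed fact into a regex with slots X (and Y)."""
    pattern = re.escape(sentence).replace(re.escape(x), X_SLOT, 1)
    if y is not None:
        pattern = pattern.replace(re.escape(y), Y_SLOT, 1)
    return re.compile(pattern)

def extract(relation, pattern, corpus):
    """Apply a learned pattern to new text and return the facts it yields."""
    facts = set()
    for sentence in corpus:
        m = pattern.search(sentence)
        if m:
            args = tuple(v for v in (m.group("X"), m.groupdict().get("Y"))
                         if v is not None)
            facts.add((relation,) + args)
    return facts

# Seeds with the text in which they occur (from the slide, simplified).
city_rule = induce_pattern("in downtown Seattle", "Seattle")
plays_rule = induce_pattern("Davis blows trumpet", "Davis", "trumpet")

corpus = ["a hotel in downtown Beijing", "Coltrane blows sax"]
new_facts = extract("city", city_rule, corpus) | extract("plays", plays_rule, corpus)
```

On this corpus the rules yield city(Beijing) and plays(Coltrane, sax). The same mechanism produces the bad facts shown on the next slide, which is why rules and facts need statistical assessment.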
Performance of Web-IE
State-of-the-art precision/recall results:
relation | precision | recall | corpus | systems
countries | 80% | 90% | Web | KnowItAll
cities | 80% | ??? | Web | KnowItAll
scientists | 60% | ??? | Web | KnowItAll
headquarters | 90% | 50% | News | Snowball, LEILA
birthdates | 80% | 70% | Wikipedia | LEILA
instanceOf | 40% | 20% | Web | Text2Onto, LEILA
Open IE | 80% | ??? | Web | TextRunner
precision value-chain: entities 80%, attributes 70%, facts 60%, events 50%
Anecdotal evidence:
invented (A.G. Bell, telephone)
married (Hillary Clinton, Bill Clinton)
isa (yoga, relaxation technique)
isa (zearalenone, mycotoxin)
contains (chocolate, theobromine)
contains (Singapore sling, gin)
invented (Johannes Kepler, logarithm tables)
married (Segolene Royal, Francois Hollande)
isa (yoga, excellent way)
isa (your day, good one)
contains (chocolate, raisins)
plays (the liver, central role)
makes (everybody, mistakes)
Beyond Surface Learning with LEILA
Learning to Extract Information by Linguistic Analysis [F. Suchanek, G. Ifrim, GW: KDD'06]
Almost-unsupervised Statistical Learning with Dependency Parsing
Limitation of surface patterns: who discovered or invented what
"Tesla's work formed the basis of AC electric power"
"Al Gore funded more work for a better basis of the Internet"
Examples: (Cologne, Rhine), (Cairo, Nile), …
Counterexamples: (Cairo, Rhine), (Rome, 0911), ( , [0..9]* ), …
Parsed sentences (link-grammar parse diagrams omitted):
Paris was founded on an island in the Seine: (Paris, Seine)
Cologne lies on the banks of the Rhine
People in Cairo like wine from the Rhine valley
IE Efficiency and Accuracy Tradeoffs
IE is cool, but what's in it for DB folks?
[see also tutorials by Cohen, Doan/Ramakrishnan/Vaithyanathan, Agichtein/Sarawagi]
The Future: Challenges
Outline
✓ Past: Matter, Antimatter, and Wormholes
✓ Present: XML and Graph IR
✓ Future: From Data to Knowledge
Major Trends in DB and IR
Database Systems and Information Retrieval, with trends: malleable schema (later); deep NLP, adding structure; record linkage; info extraction; graph mining; entity-relationship graph IR; ontologies; ranking; statistical language models; data uncertainty; programmability; search as Web Service; dataspaces; Web objects; Web 2.0
Conclusion
DB&IR: Both Sides Now
Thank You!

DB and IR platforms for flexible search

  • 1. in collaboration with Georgiana Ifrim, Gjergji Kasneci, Josiane Parreira, Maya Ramanath, Ralf Schenkel, Fabian Suchanek, Martin Theobald
  • 2. DB and IR: Two Parallel Universes. Database Systems vs. Information Retrieval:
      canonical application: accounting vs. libraries
      data type: numbers, short strings vs. text
      foundation: algebraic / logic based vs. probabilistic / statistics based
      search paradigm: Boolean retrieval (exact queries, result sets/bags) vs. ranked retrieval (vague queries, result lists)
      market leaders: Oracle, IBM DB2, MS SQL Server, etc. vs. Google, Yahoo!, MSN, Verity, Fast, etc.
      Parallel universes forever?
  • 3.
  • 4.
  • 5.
  • 6. Outline • Past: Matter, Antimatter, and Wormholes • Present: XML and Graph IR • Future: From Data to Knowledge
  • 7.
  • 8. DB and IR timeline (1990, 1995, 2000, 2005): VAGUE (Motro); Proximal Nodes (Baeza-Yates et al.); WHIRL (Cohen); Prob. Datalog (Fuhr et al.); INEX; XPath; XPath Full-Text; Prob. DB (Cavallo & Pittarelli); Prob. Tuples (Barbara et al.); Web Entity Search: Libra, Avatar, ExDB …; Faceted Search: Flamenco …; 1st Gen. XML IR: XXL, XIRQL, Elixir, JuruXML; Multimedia IR; Web Query Languages: W3QS, WebOQL, Araneus …; Semistructured Data: Lore, Xyleme …; 2nd Gen. XML IR: XRank, Timber, TIJAH, XSearch, FleXPath, CoXML, TopX, MarkLogic, Fast …; Uncertain & Prob. Relations: Mystiq, Trio …; Struct. Docs; Deep Web Search; Digital Libraries; Graph IR
  • 9.
  • 10. XXL: Early XML IR [Anja Theobald, GW: Adding Relevance to XML, WebDB’00] Union of heterogeneous sources without global schema. Query: Which professors from Saarbruecken (SB) are teaching IR and have research projects on XML? Similarity-aware XPath: //~Professor [//* = "~SB"] [//~Course [//* = "~IR"]] [//~Research [//* = "~XML"]]
      Example data, source 1: Professor: Name: Gerhard Weikum, Address ..., City: SB, Country: Germany, Teaching, Research, Course: Title: IR, Description: Information retrieval ..., Syllabus ..., Book, Article ..., Project: Title: Intelligent Search of Heterogeneous XML Data, Funding: EU ...
      Example data, source 2: Lecturer: Name: Ralf Schenkel, Address: Max-Planck Institute for Informatics, Germany, Activities, Seminar: Contents: Ranked retrieval …, Literature: …, Scientific: Name: INEX task coordinator (Initiative for the Evaluation of XML …), Other: Sponsor: EU …
  • 11.
  • 12.
  • 13. Outline ✓ Past: Matter, Antimatter, and Wormholes • Present: XML and Graph IR • Future: From Data to Knowledge
  • 14.
  • 15. Commercial Break [Martin Theobald, Ralf Schenkel, GW: VLDB’05] TopX demo today 3:30 – 5:30
  • 16.
  • 17.
  • 18.
  • 19.
  • 20. Outline ✓ Past: Matter, Antimatter, and Wormholes ✓ Present: XML and Graph IR • Future: From Data to Knowledge
  • 21. Knowledge Queries. Turn the Web, Web2.0, and Web3.0 into the world's most comprehensive knowledge base ("semantic DB")! Answer "knowledge queries" such as: Nobel laureate who survived both world wars and his children; drama with three women making a prophecy to a British nobleman that he will become king; proteins that inhibit both protease and some other enzyme; connection between Thomas Mann and Goethe; differences in Rembetiko music from Greece and from Turkey; neutron stars with X-ray bursts > 10^40 erg s^-1 & black holes in 10''; market impact of Web2.0 technology in December 2006; sympathy or antipathy for Germany from May to August 2006
  • 22.
  • 23.
  • 24.
  • 25. High-Quality Knowledge Sources Wikipedia and other lexical sources
  • 26. Exploit Hand-Crafted Knowledge {{Infobox_Scientist | name = Max Planck | birth_date = [[April 23]], [[1858]] | birth_place = [[Kiel]], [[Germany]] | death_date = [[October 4]], [[1947]] | death_place = [[Göttingen]], [[Germany]] | residence = [[Germany]] | nationality = [[Germany|German]] | field = [[Physicist]] | work_institution = [[University of Kiel]]</br> [[Humboldt-Universität zu Berlin]]</br> [[Georg-August-Universität Göttingen]] | alma_mater = [[Ludwig-Maximilians-Universität München]] | doctoral_advisor = [[Philipp von Jolly]] | doctoral_students = [[Gustav Ludwig Hertz]]</br> … | known_for = [[Planck's constant]], [[Quantum mechanics|quantum theory]] | prizes = [[Nobel Prize in Physics]] (1918)‏ … Wikipedia, WordNet, and other lexical sources
  • 27.
  • 28. YAGO Knowledge Representation. Figure: a graph in three layers (concepts, individuals, words). Concepts: Entity, Person, Location, City, Country, Scientist, Physicist, Biologist. Individuals and values: Max_Planck, Erwin_Planck, Kiel, Nobel Prize, April 23, 1858, October 4, 1947. Words: "Max Planck", "Dr. Planck", "Max Karl Ernst Ludwig Planck". Edge labels: subclass, instanceOf, means, bornOn, diedOn, bornIn, hasWon, FatherOf. Accuracy: 97%. Online access and download at http://www.mpi-inf.mpg.de/~suchanek/yago/
      Knowledge Base | # Facts
      KnowItAll | 30 000
      SUMO | 60 000
      WordNet | 200 000
      OpenCyc | 300 000
      Cyc | 5 000 000
      YAGO | 6 000 000
  • 29. NAGA: Graph IR on YAGO [G. Kasneci et al.: WWW’07] Graph-based search on YAGO-style knowledge bases with built-in ranking based on confidence and informativeness → statistical language model for result graphs. Conjunctive queries: $x isa scientist, $x bornIn Kiel. Queries with regular expressions: $x isa scientist, $x (hasFirstName | hasLastName) Ling, $x worksFor $y, $y locatedIn* Zhejiang; Beng Chin Ooi (coAuthor | advisor)*
  • 30.
  • 31. Information Extraction (IE): Text to Records. Combine NLP, pattern matching, lexicons, statistical learning.
      Person | BirthDate | BirthPlace
      Max Planck | 4/23, 1858 | Kiel
      Albert Einstein | 3/14, 1879 | Ulm
      Mahatma Gandhi | 10/2, 1869 | Porbandar
      Person | ScientificResult
      Max Planck | Quantum Theory
      Person | Collaborator
      Max Planck | Albert Einstein
      Max Planck | Niels Bohr
      Constant | Value | Dimension
      Planck's constant | 6.626 × 10^-34 | Js
  • 32.
  • 33. Methods for Web-Scale Fact Extraction. Pipeline: seeds → text → rules → new facts.
      seed fact | text | induced rule
      city(Beijing) | old center of Beijing | old center of X
      plays(Coltrane, sax) | sax player Coltrane | Y player X
      city(Seattle) | in downtown Seattle | in downtown X
      city(Seattle) | Seattle and other towns | X and other towns
      city(Las Vegas) | Las Vegas and other towns | X and other towns
      plays(Zappa, guitar) | playing guitar: … Zappa | playing Y: … X
      plays(Davis, trumpet) | Davis … blows trumpet | X … blows Y
      New facts: in downtown Beijing → city(Beijing); Coltrane blows sax → plays(C., sax)
      Assessment of facts and generation of rules are based on statistics. Rules can be more sophisticated: playing NN: (ADJ|ADV)* NP & class(NN)=instrument & class(head(NP))=person → plays(head(NP), NN)
  • 34. Performance of Web-IE. State-of-the-art precision/recall results:
      relation | precision | recall | corpus | systems
      countries | 80% | 90% | Web | KnowItAll
      cities | 80% | ??? | Web | KnowItAll
      scientists | 60% | ??? | Web | KnowItAll
      headquarters | 90% | 50% | News | Snowball, LEILA
      birthdates | 80% | 70% | Wikipedia | LEILA
      instanceOf | 40% | 20% | Web | Text2Onto, LEILA
      Open IE | 80% | ??? | Web | TextRunner
      Precision value-chain: entities 80%, attributes 70%, facts 60%, events 50%
      Anecdotal evidence: invented (A.G. Bell, telephone); married (Hillary Clinton, Bill Clinton); isa (yoga, relaxation technique); isa (zearalenone, mycotoxin); contains (chocolate, theobromine); contains (Singapore sling, gin); invented (Johannes Kepler, logarithm tables); married (Segolene Royal, Francois Hollande); isa (yoga, excellent way); isa (your day, good one); contains (chocolate, raisins); plays (the liver, central role); makes (everybody, mistakes)
  • 35.
  • 36.
  • 37.
  • 38. Outline ✓ Past: Matter, Antimatter, and Wormholes ✓ Present: XML and Graph IR ✓ Future: From Data to Knowledge
  • 39. Major Trends in DB and IR (Database Systems and Information Retrieval): malleable schema (later); deep NLP, adding structure; record linkage; info extraction; graph mining; entity-relationship graph IR; ontologies; ranking; statistical language models; data uncertainty; programmability; search as Web Service; dataspaces; Web objects; Web 2.0
  • 40.
  • 41.
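
The seeds → text → rules → new facts loop on slide 33 can be illustrated with a minimal sketch. It assumes plain string matching in place of the NLP and statistical assessment that the slide mentions. The function names (`induce_rules`, `compile_rule`, `apply_rules`), the `{X}`/`{Y}` slot syntax, the capitalized-name heuristic, and the toy snippets are all illustrative assumptions; none of them comes from the slides or from the systems named there.

```python
import re
from collections import Counter

# Hypothetical slot shapes: X is a capitalized name (possibly multi-word), Y is one word.
SLOT_REGEX = {
    "{X}": r"(?P<X>[A-Z]\w*(?: [A-Z]\w*)*)",
    "{Y}": r"(?P<Y>\w+)",
}

def induce_rules(seeds, snippets):
    """Replace seed arguments in matching snippets by slots; count supporting seeds."""
    support = Counter()
    for relation, args in seeds:
        for snippet in snippets:
            if all(arg in snippet for arg in args):
                pattern = snippet
                for slot, arg in zip(("{X}", "{Y}"), args):
                    pattern = pattern.replace(arg, slot)
                support[(relation, pattern)] += 1
    return support

def compile_rule(pattern):
    """Turn a slot pattern into a regex, escaping everything except the slots."""
    parts = re.split(r"(\{X\}|\{Y\})", pattern)
    return re.compile("".join(SLOT_REGEX.get(p, re.escape(p)) for p in parts))

def apply_rules(rules, text, min_support=1):
    """Extract new facts from text with every rule that has enough seed support."""
    facts = set()
    for (relation, pattern), count in rules.items():
        if count < min_support:
            continue
        for match in compile_rule(pattern).finditer(text):
            groups = match.groupdict()
            facts.add((relation, tuple(groups[k] for k in ("X", "Y") if k in groups)))
    return facts

# Seeds and snippets adapted from the slide; the ellipses in the slide's text are dropped.
seeds = [
    ("city", ("Seattle",)),
    ("city", ("Las Vegas",)),
    ("plays", ("Davis", "trumpet")),
]
snippets = [
    "in downtown Seattle",
    "Seattle and other towns",
    "Las Vegas and other towns",
    "Davis blows trumpet",
]

rules = induce_rules(seeds, snippets)
facts = apply_rules(rules, "She works in downtown Beijing. Coltrane blows sax every night.")
print(sorted(facts))
```

On this toy text the sketch yields city(Beijing) and plays(Coltrane, sax), the two new facts shown on the slide. The support counts are a crude stand-in for the statistical assessment of rules: raising `min_support` to 2 keeps only "X and other towns", which is backed by two seeds.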