making sense of text and data
Semantics 2018, Vienna
Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking
Presentation Outline
o Ontotext Introduction
o Technology and Portfolio
o Cognitive Analytics Meet Big Knowledge Graphs
o Big Company Data: Knowing, Matching and Cleaning
o Product Roadmap
Vision
o Global business information will be key for competitiveness tomorrow
o Adequate business decisions require global information!
 Analytics cannot deliver deep market/business insights based only on proprietary data
 Broader context and signals are needed
o Merging data requires concept and entity awareness
 Entity matching across databases requires rich knowledge about the entity
 Entity recognition in text requires even more context
o Ontotext makes this possible
Mission
We help enterprises to identify meaning across:
o Diverse databases & unstructured data
We combine:
o Proprietary & Global data
o Graph databases & Text mining
o Symbolic reasoning & Machine learning
History and Essential Facts
o Started in 2000 as a Semantic Web pioneer
 Part of Sirma Group: ~400 people, listed on the Sofia Stock Exchange
 Spun off and took VC investment in 2008
o R&D Center in Sofia, 80% of sales in the USA and UK
 Over 400 person-years invested in R&D
 Multiple innovation awards: Washington Post, BBC, FT, ...
o Member of multiple industry bodies
 W3C, EDMC, ODI, LDBC, STI, DBpedia Foundation
Best known for GraphDB
“Despite all of this attention the market is dominated by Neo4J and Ontotext (GraphDB), which are graph and RDF database providers respectively. These are the longest established vendors in this space (both founded in 2000) so they have a longevity and experience that other suppliers cannot yet match. How long this will remain the case remains to be seen.”
Bloor Group report, Graph Databases, April 2015
http://www.bloorresearch.com/technology/graph-databases/
Fancy Stuff and Heavy Lifting
o We do advanced analytics: We predicted BREXIT
 14 Jun 2016 whitepaper: #BRExit Twitter Analysis: More Twitter Users Want to Split with EU and Support #Brexit
https://ontotext.com/white-paper-brexit-twitter-analysis/
o But most of the time we do the heavy lifting of data integration and information extraction
 Enabling data scientists to do fancy things
Discovery in Knowledge Graphs
o Find suspicious patterns like:
 Company in USA
 Controls another company in USA
 Through a company in an off-shore zone
o Show news relevant to these companies
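The pattern above (a USA company controlling another USA company through an off-shore intermediary) can be sketched as a two-hop search over control relations. In a triple store this would typically be written as a SPARQL graph pattern; plain Python is used here only to keep the sketch self-contained. The company names, the `CONTROLS` pairs and the `OFFSHORE` list are hypothetical sample data, not taken from the deck.

```python
from collections import defaultdict

# Hypothetical sample data: company -> country of registration.
COUNTRY = {
    "AlphaCorp": "USA",
    "ShellCo": "Panama",
    "BetaInc": "USA",
    "GammaLtd": "UK",
}
# Hypothetical (controller, controlled) pairs.
CONTROLS = [
    ("AlphaCorp", "ShellCo"),
    ("ShellCo", "BetaInc"),
    ("AlphaCorp", "GammaLtd"),
]
# Assumed list of off-shore jurisdictions, for illustration only.
OFFSHORE = {"Panama", "British Virgin Islands"}


def suspicious_chains(country, controls, offshore, home="USA"):
    """Find (parent, intermediary, child) chains where parent and child
    are registered in `home` and the intermediary is off-shore."""
    subsidiaries = defaultdict(list)
    for controller, controlled in controls:
        subsidiaries[controller].append(controlled)

    chains = []
    for parent, intermediaries in list(subsidiaries.items()):
        if country[parent] != home:
            continue
        for mid in intermediaries:
            if country[mid] not in offshore:
                continue
            # .get avoids adding new keys while we iterate
            for child in subsidiaries.get(mid, []):
                if country[child] == home:
                    chains.append((parent, mid, child))
    return chains


print(suspicious_chains(COUNTRY, CONTROLS, OFFSHORE))
```

With the sample data the only chain found is AlphaCorp, ShellCo, BetaInc; the companies in such a chain could then be used to look up relevant news.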
Technology Excellence Delivered
o Unique technology mix: GraphDB™ engine + Text mining
o Robust technology: powers BBC.CO.UK/SPORT and FT.COM
o We serve the most knowledge intensive enterprises:
Linking Text to Big Knowledge Graphs
1. Integrate relevant structured data
 Build a Big Knowledge Graph from proprietary databases and taxonomies combined with Linked Open Data
2. Infer new facts and unveil relationships
 Performing reasoning across data from different sources
3. Link text mentions to the Knowledge Graph
 Using text-mining to automatically discover references to concepts and entities
4. Hybrid Queries and Search in GraphDB
Text Analytics: Semantic Disambiguation
o NLP Pipeline components: Language Detection, POS, Vocabulary Gazetteer, Disambiguation, Relevance Ranking, ...
o The Vocabulary Gazetteer uses a dynamic vocabulary from GraphDB
o Services: Get Suggestions, Annotate Content
o Sample text: “Apple CEO Tim Cook was at a conference with the CEO of Samsung. Tim explained how smart phones are changing the consumer electronics market.”
o Suggestions (entity detection from vocabulary):
 Apple : Organisation
 Tim Cook : Person, CEO
 Tim Cook : Person, Footballer
 Samsung : Organisation
o After disambiguation and relevance ranking:
 87% - Tim Cook : Person, CEO
 68% - Apple : Organisation
 56% - Samsung : Organisation
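The three stages named on this slide (entity detection from vocabulary, disambiguation, relevance) can be illustrated with a toy sketch. It is not Ontotext's pipeline: the `VOCABULARY` structure, the context-word sets and the overlap-based score are all assumptions made for illustration, and the scores it produces are not the percentages shown on the slide.

```python
import re

TEXT = ("Apple CEO Tim Cook was at a conference with the CEO of Samsung. "
        "Tim explained how smart phones are changing the consumer "
        "electronics market.")

# Hypothetical vocabulary: label -> candidate entities, each with a few
# context words that hint at the right sense (assumed, not from the deck).
VOCABULARY = {
    "Apple": [
        {"id": "Apple", "type": "Organisation",
         "context": {"ceo", "phones", "iphone", "market"}},
    ],
    "Samsung": [
        {"id": "Samsung", "type": "Organisation",
         "context": {"phones", "electronics", "galaxy", "korea"}},
    ],
    "Tim Cook": [
        {"id": "Tim Cook (CEO)", "type": "Person, CEO",
         "context": {"apple", "ceo", "conference"}},
        {"id": "Tim Cook (Footballer)", "type": "Person, Footballer",
         "context": {"football", "club", "match"}},
    ],
}


def tokens(text):
    """Lower-cased word set of the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def detect(text, vocabulary):
    """Gazetteer step: every vocabulary label that occurs in the text."""
    return [label for label in vocabulary if label in text]


def disambiguate(label, words, vocabulary):
    """Pick the candidate whose context words overlap most with the text."""
    return max(vocabulary[label], key=lambda c: len(c["context"] & words))


def annotate(text, vocabulary):
    """Return (entity id, type, toy relevance) sorted by relevance."""
    words = tokens(text)
    results = []
    for label in detect(text, vocabulary):
        best = disambiguate(label, words, vocabulary)
        score = len(best["context"] & words) / len(best["context"])
        results.append((best["id"], best["type"], score))
    return sorted(results, key=lambda r: r[2], reverse=True)


for entity, entity_type, score in annotate(TEXT, VOCABULARY):
    print(f"{score:.0%} - {entity} : {entity_type}")
```

On the sample sentence the footballer candidate is dropped because none of its context words occur in the text, and the remaining entities come out in the same order as on the slide (Tim Cook, Apple, Samsung).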
Sample Knowledge Graph with Metadata
[Diagram: a Document is linked by "mentions" and "about" edges to three Annotation nodes (textpos:123,142; relevance 87%, 68% and 56%), and each Annotation has a "target" entity. Entities shown: Tim Cook (Person), Apple (Organisation), Samsung. Other edge labels: tag, type, ceo, competitor, location (USA), exchange (NASDAQ), sector (Computer Hardware).]
Linking News to Big Knowledge Graphs
o Link text to knowledge graphs
o Navigate from news to concepts and from there to other news
Try it at http://now.ontotext.com
Semantic Media Monitoring
For each entity:
o popularity trends
o relevant news
o related entities
o knowledge graph information
Try it at http://now.ontotext.com
Visual Graph: Node details
Big KG Demonstration
o DBpedia (the English version) 496M
o Geonames (all geographic features on Earth) 150M
o owl:sameAs links between DBpedia and Geonames 471K
o GLEI (global company register data) 3M
o Panama Papers DB (#LinkedLeaks) 20M
o Other datasets and ontologies: WordNet, WorldFacts, FIBO
o News metadata (2000 articles/day enriched by NOW) 673M
o Total size (1.8B explicit + 328M inferred statements) 2 168M
GraphDB Workbench: Class Instances & Hierarchy
GraphDB Workbench: Class Relations
Context and Awareness
o Context allows concepts to be identified the way people identify them
o A big knowledge graph can provide context for the entities in it
 Differentiating features and similar nodes
 How important and how popular it is
 Related entities and concepts
 Entities it is typically mentioned together with (co-occurrence)
o This is awareness!
o The kind of knowledge that people mean when saying "I am aware of X" or "She is cognizant of Y"
The Critical Mass
Malcolm Gladwell claims that one needs to devote 10 000 hours to become an expert in something, e.g. violin or hockey (Outliers)
o A cognitive system needs:
 To know 1B facts
 About 100M concepts and entities
 To read 1M news articles
o In order to reach concept and entity awareness in a specific domain
 The level of awareness that people mean when saying “My background is X”
Let’s play an Awareness game!
o Important airports near London?
o The most popular banks in UK?
o Companies similar to Google?
o People mentioned together with IBM in news?
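A question such as "People mentioned together with IBM in news?" comes down to co-occurrence counting over annotated articles. The sketch below assumes each article has already been reduced to the set of entities it mentions; the `ARTICLES` and `ENTITY_TYPE` data and the placeholder person names are hypothetical.

```python
from collections import Counter

# Hypothetical annotated articles: the set of entities each one mentions.
ARTICLES = [
    {"IBM", "Person A", "Person B"},
    {"IBM", "Person A"},
    {"Google", "Person C"},
    {"IBM", "Google", "Person A"},
]
# Hypothetical entity types, as a knowledge graph would provide them.
ENTITY_TYPE = {
    "IBM": "Organization",
    "Google": "Organization",
    "Person A": "Person",
    "Person B": "Person",
    "Person C": "Person",
}


def mentioned_with(entity, articles, entity_type, wanted="Person", top=3):
    """Count entities of type `wanted` that co-occur with `entity`."""
    counts = Counter()
    for mentioned in articles:
        if entity in mentioned:
            counts.update(
                other for other in mentioned
                if other != entity and entity_type.get(other) == wanted
            )
    return counts.most_common(top)


print(mentioned_with("IBM", ARTICLES, ENTITY_TYPE))
```

Here "Person A" appears with IBM in three articles and "Person B" in one; "Person C" never co-occurs with IBM and is not returned.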
We are getting closer!
o Our Business Knowledge Model can already answer many of these questions better than you
o Most of this intelligence is available in the Ontotext Platform
o Knowledge model = KG + text mining + analytics
o We already offer two such knowledge models:
 Business and general news: one for processing general business master data (like people, organizations, locations and their mentions in the news)
 Life sciences and healthcare
Customized Cognitive Marketing Intelligence
o Developing a cognitive system with global knowledge from scratch is infeasible
o We can provide and “onboard” one for you:
 Suggest open and commercial data sources
 Integrate them with your proprietary data sources
 Tune text analytics
 Develop specific analytics, reports, dashboards, etc.
o We can also maintain it for you:
 Various support and maintenance options, including …
 Managed data service: updates, monitoring, data quality
Person, Organization, Location (POL) Data
o POL data is the most common type of master/reference data
 Considering business applications and news
o Open POL data is available in vast quantities
 Geonames covers locations exhaustively; DBpedia covers popular POL entities well; Wikidata, …
 Open company data grows: OpenCorporates, GLEI, open national registers, various “data leaks”
o Within 3 years exhaustive global POL data will be a commodity!
 And it will be widely used for BI and decision making
o Ontotext delivers Global POL data solutions today.
 We make them more affordable with more cognitive analytics
Oct 2016
Company Data Species (1/2)
Category | Representatives | Size (Orgs.)
Exhaustive Global Databases | Dun & Bradstreet, BvD, Factset | > 200M
Rich Company Databases | Capital IQ (S&P), Thomson Reuters (various) | 5-10M
Investment Databases | CrunchBase, PitchBook, CBI, DJ Venture Source | 200-600K
Very Big Open Databases | OpenCorporates | 130M
Global Official Open Databases | GLEI (Global Legal Entity Identifier), EU BRIS | 1-30M
Open Encyclopedic | DBpedia, Wikidata | 0.3-1.2M
Open Leaks and Investigations | Panama Papers (Offshore Leaks), Trump World Data | 3-300K
Company Data Species (2/2)
Category | Locations | Industry Classification | High Tech. Fields | Invest. Info | Org-Org Relations (e.g. Tree) | Org-Person Relations | Clean, Correct, Predictable
Exhaustive Global Databases | ++ | +/- | - | - | ++ | +/- | 6
Rich Company Databases | ++ | + | +/- | +/- | ++ | +/- | 8
Investment Databases | +/- | +/- | + | + | ++/- | +/- | 4-6
Very Big Open Databases | + | +/- | - | - | +/- | - | 8
Global Official Open Databases | + | - | - | - | +/- | - | 8
Open Encyclopedic | +/- | +/- | + | - | +/- | + | 3-5
Open Leaks and Investigations | +/- | - | - | - | +/- | + | 4-6
Matching and Overlap
o Organizations matched across: CrunchBase (CB), CB Insights (CBI), Capital IQ (CIQ), DJ Venture Source, …
o The Venn diagram presents the overlap between sources
 The size of the circle indicates the number of entities per source
 The level of overlap indicates the number of entities matched between the two sources
Data Consolidation Across Data Sources
Entity Matching Across Datasets
o Match IDs of one and the same real-world entity across different databases
o Data Challenges
 Different schemata
 Name variations
 Different classifications and codes
 Lack of unique identifiers (even ticker symbols are not unique)
o Technology challenges
 Pre-selection is needed; brute-force matching is not feasible for 1M against 5M companies (5 trillion candidate pairs)
 It is not trivial to come up with a good pre-selection mechanism
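One common form of pre-selection is blocking: records are grouped by a cheap key and only records sharing a key are compared in detail. The sketch below is a minimal illustration under assumed choices (the key is the first non-stopword token of the normalised name, and `STOPWORDS` is a made-up list); it is not the mechanism used in the project described here, and the record IDs are hypothetical.

```python
import re
from collections import defaultdict

# Assumed list of legal-form words to ignore when building the key.
STOPWORDS = {"inc", "ltd", "corp", "corporation", "llc", "co",
             "company", "plc", "the"}


def name_key(name):
    """Blocking key: first non-stopword token of the lower-cased name."""
    words = [w for w in re.findall(r"[a-z0-9]+", name.lower())
             if w not in STOPWORDS]
    return words[0] if words else ""


def candidate_pairs(left, right):
    """Pair only records that share a blocking key.

    `left` and `right` map record IDs to company names."""
    blocks = defaultdict(list)
    for right_id, name in right.items():
        blocks[name_key(name)].append(right_id)

    pairs = []
    for left_id, name in left.items():
        for right_id in blocks.get(name_key(name), []):
            pairs.append((left_id, right_id))
    return pairs


# Hypothetical records from two sources.
SOURCE_A = {"A1": "Apple Inc.", "A2": "The Boeing Company"}
SOURCE_B = {"B1": "APPLE INC", "B2": "Boeing Co",
            "B3": "Applied Materials, Inc."}

print(candidate_pairs(SOURCE_A, SOURCE_B))
```

With this key, 2 candidate pairs are produced instead of the 6 a brute-force comparison would examine. A key this coarse would miss name variations such as abbreviations, which is one reason a good pre-selection mechanism is not trivial.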
Company Matching Sample Project
o We matched 5+ big datasets within a couple of months
o Fully automated procedure, which takes a few hours to execute
 90% SPARQL and GraphDB’s FTS connectors
o Location normalization through matching to Geonames
 Also industry classification alignment across the sources
o About 85% F-Score with simple structural matching rules
o To get higher accuracy, you need:
 Massive amount of manual work and fine-tuning of weights … or
 Cognitive analytics (importance, similarity, highly accurate named entity recognition, etc.)
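For reference, an F-score combines the precision and recall of the predicted matches against a manually verified gold standard. The slide does not say which variant was used; the sketch below assumes the balanced F1 by default, and the match pairs are hypothetical.

```python
def f_score(gold, predicted, beta=1.0):
    """F-beta score of predicted match pairs against gold match pairs."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical (source A id, source B id) match pairs.
GOLD = {("A1", "B1"), ("A2", "B2"), ("A3", "B3"), ("A4", "B4")}
PREDICTED = {("A1", "B1"), ("A2", "B2"), ("A3", "B3"), ("A5", "B9")}

print(f_score(GOLD, PREDICTED))
```

In this toy case three of four predictions are correct and three of four gold matches are found, so precision, recall and F1 are all 0.75.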
Product Roadmap (short term)
o Ontotext platform
 Multi-tenant version of our Manual Annotation Tool
 Streamlined ETL and entity matching based on SPARK
 Configurable Semantic Search front end
o GraphDB
 Reconciliation
 Faster transactions on big knowledge graphs – 2x speed up of small transactions
 Faster SPARQL federation between local repositories
 Similarity based on Semantic Vectors
Reconciliation
GraphDB Semantic Similarity Plugin
o Statistical similarity on knowledge graphs using semantic vectors
o Creates statistical semantic models from your RDF data and searches for similar terms and documents
o Sample:
 Create an index from the news in FactForge
 Find similar news, find relevant terms for a news article, etc.
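The idea of vector-based similarity search can be shown with a deliberately simple stand-in: documents become term-count vectors and are compared by cosine similarity. This is a simplification for illustration and not the plugin's actual model or API; the `NEWS` headlines are made up.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-count vector of a text (whitespace tokenisation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, documents):
    """ID of the document whose vector is closest to the query."""
    q = vectorize(query)
    return max(documents, key=lambda d: cosine(q, vectorize(documents[d])))


# Hypothetical news headlines.
NEWS = {
    "n1": "apple unveils new phone",
    "n2": "central bank raises interest rates",
    "n3": "samsung launches new phone",
}

print(most_similar("new phone from apple", NEWS))
```

The query shares three terms with n1 and two with n3, so n1 is returned. A real semantic model would also relate terms that never co-occur literally, which plain term counts cannot do.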
Similar News
Take home
o Business needs global company data for market intelligence
o Linking proprietary and global data is rocket science
 Mainstream tech cannot deal with such diversity
 Semantic data integration and cognitive analytics are needed
o Ontotext is ready to help
 Consulting: help you build the concept for your next generation system
 Develop: build one for you or support you developing your platform
 Support and operations: from Level 3 support to Managed services
Thank you!
Experience the technology with our demonstrators
NOW: Semantic News Portal http://now.ontotext.com
RANK: News popularity ranking for companies http://rank.ontotext.com
FactForge: Hub for open data and news about People and Organizations http://factforge.net
Semantic Data Normalization For Efficient Clinical Trial ResearchSemantic Data Normalization For Efficient Clinical Trial Research
Semantic Data Normalization For Efficient Clinical Trial Research
 
Gaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive TechnologyGaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive Technology
 
Cooking up the Semantic Web
Cooking up the Semantic WebCooking up the Semantic Web
Cooking up the Semantic Web
 
Why Semantics Matter? Adding the semantic edge to your content, right from au...
Why Semantics Matter? Adding the semantic edge to your content,right from au...Why Semantics Matter? Adding the semantic edge to your content,right from au...
Why Semantics Matter? Adding the semantic edge to your content, right from au...
 
Adding Semantic Edge to Your Content – From Authoring to Delivery
Adding Semantic Edge to Your Content – From Authoring to DeliveryAdding Semantic Edge to Your Content – From Authoring to Delivery
Adding Semantic Edge to Your Content – From Authoring to Delivery
 

Recently uploaded

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 

Recently uploaded (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 

Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking

  • 1. making sense of text and data Semantics 2018, Vienna Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking
  • 2. Presentation Outline o Ontotext Introduction o Technology and Portfolio o Cognitive Analytics Meet Big Knowledge Graphs o Big Company Data: Knowing, Matching and Cleaning o Product Roadmap
  • 3. Vision o Global business information will be key for competitiveness tomorrow o Adequate business decisions require global information!  Analytics cannot deliver deep market/business insights based only on proprietary data  Broader context and signals are needed o Merging data requires concept and entity awareness  Entity matching across databases requires rich knowledge about the entity  Entity recognition in text requires even more context o Ontotext makes this possible
  • 4. Mission We help enterprises to identify meaning across: o Diverse databases & unstructured data We combine: o Proprietary & Global data o Graph databases & Text mining o Symbolic reasoning & Machine learning
  • 5. History and Essential Facts o Started in year 2000 as Semantic Web pioneer  Part of Sirma Group: ~400 persons, listed at Sofia Stock Exchange  Got spun-off and took VC investment in 2008 o R&D Center in Sofia, 80% sales in USA and UK  Over 400 person-years invested in R&D  Multiple innovation awards: Washington Post, BBC, FT, ... o Member of multiple industry bodies  W3C, EDMC, ODI, LDBC, STI, DBPedia Foundation
  • 6. Best known for GraphDB “Despite all of this attention the market is dominated by Neo4J and Ontotext (GraphDB), which are graph and RDF database providers respectively. These are the longest established vendors in this space (both founded in 2000) so they have a longevity and experience that other suppliers cannot yet match. How long this will remain the case remains to be seen.” Bloor Group report Graph Databases, April 2015 http://www.bloorresearch.com/technology/graph-databases/
  • 7. Fancy Stuff and Heavy Lifting o We do advanced analytics: We predicted BREXIT  14 Jun 2016 whitepaper: #BRExit Twitter Analysis: More Twitter Users Want to Split with EU and Support #Brexit https://ontotext.com/white-paper-brexit-twitter-analysis/ o But most of the time we do the heavy lifting of data integration and information extraction  Enabling data scientists to do fancy things
  • 8. Discovery in Knowledge Graphs o Find suspicious patterns like:  Company in USA  Controls another company in USA  Through a company in an off-shore zone o Show news relevant to these companies
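The suspicious pattern on slide 8 can be illustrated with a minimal Python sketch. It is not Ontotext's implementation (the slides suggest such patterns are queried in GraphDB); the company names, country codes, the `OFFSHORE` set and the `suspicious_chains` helper are all hypothetical and exist only to show the shape of the two-hop pattern.

```python
# Hypothetical toy data: company -> country code, and direct "controls" edges.
COUNTRY = {"AlphaCorp": "US", "ShellCo": "PA", "BetaInc": "US", "GammaLtd": "GB"}
CONTROLS = [("AlphaCorp", "ShellCo"), ("ShellCo", "BetaInc"), ("GammaLtd", "BetaInc")]
OFFSHORE = {"PA", "VG", "KY"}  # assumed list of off-shore jurisdictions


def suspicious_chains(country, controls, offshore, home="US"):
    """Return (parent, intermediary, child) triples where a home-country
    company controls another home-country company via an off-shore one."""
    # Index the edge list so each company maps to the companies it controls.
    children = {}
    for parent, child in controls:
        children.setdefault(parent, []).append(child)

    chains = []
    for parent, intermediaries in children.items():
        if country.get(parent) != home:
            continue
        for mid in intermediaries:
            if country.get(mid) not in offshore:
                continue
            for child in children.get(mid, []):
                if country.get(child) == home:
                    chains.append((parent, mid, child))
    return chains


result = suspicious_chains(COUNTRY, CONTROLS, OFFSHORE)
print(result)
```

In a graph database the same pattern would normally be written as a query over control relations rather than as nested loops; the loops here only make the matching logic explicit.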
  • 10. Technology Excellence Delivered o Unique technology mix: GraphDB™ engine + Text mining o Robust technology: powers BBC.CO.UK/SPORT and FT.COM o We serve the most knowledge intensive enterprises:
  • 11. Presentation Outline o Ontotext Introduction o Technology and Portfolio o Cognitive Analytics Meet Big Knowledge Graphs o Big Company Data: Knowing, Matching and Cleaning o Product Roadmap
  • 12. Linking Text to Big Knowledge Graphs 1. Integrate relevant structured data  Build a Big Knowledge Graph from proprietary databases and taxonomies combined with Linked Open Data 2. Infer new facts and unveil relationships  Performing reasoning across data from different sources 3. Link text mentions to the Knowledge Graph  Using text-mining to automatically discover references to concepts and entities 4. Hybrid Queries and Search in GraphDB
  • 13. Text Analytics: Semantic Disambiguation [Diagram: an NLP pipeline (Language Detection, POS, Vocabulary Gazetteer, Disambiguation, Relevance Ranking) backed by a dynamic vocabulary in GraphDB, with "Get Suggestions" and "Annotate Content" steps.] Sample text: “Apple CEO Tim Cook was at a conference with the CEO of Samsung. Tim explained how smart phones are changing the consumer electronics market.” o Entity detection from vocabulary: Apple : Organisation; Tim Cook : Person, CEO; Tim Cook : Person, Footballer; Samsung : Organisation o Relevance of the suggestions: 87% - Tim Cook : Person, CEO; 68% - Apple : Organisation; 56% - Samsung : Organisation
  • 14. Sample Knowledge Graph with Metadata [Diagram: a Document node linked through "mentions" and "about" edges to three Annotation nodes, each with textpos:123,142 and a relevance of 56%, 68% or 87%; the annotations are connected through target and tag edges to the entities Apple (Organisation), Samsung and Tim Cook (Person). Other edge labels shown: type, ceo, competitor, location (USA), exchange (NASDAQ), sector (Computer Hardware).]
  • 15. Linking News to Big Knowledge Graphs o Link text to knowledge graphs o Navigate from news to concepts and from there to other news Try it at http://now.ontotext.com
  • 16. Semantic Media Monitoring For each entity: o popularity trends o relevant news o related entities o knowledge graph information Try it at http://now.ontotext.com
  • 17. Visual Graph: Node details
  • 18. Big KG Demonstration o DBpedia (the English version) 496M o Geonames (all geographic features on Earth) 150M o owl:sameAs links between DBpedia and Geonames 471K o GLEI (global company register data) 3M o Panama Papers DB (#LinkedLeaks) 20M o Other datasets and ontologies: WordNet, WorldFacts, FIBO o News metadata (2000 articles/day enriched by NOW) 673M o Total size (1.8B explicit + 328M inferred statements) 2 168M
  • 19. GraphDB Workbench: Class Instances & Hierarchy
  • 20. GraphDB Workbench: Class Relations
  • 21. Presentation Outline o Ontotext Introduction o Technology and Portfolio o Cognitive Analytics Meet Big Knowledge Graphs o Big Company Data: Knowing, Matching and Cleaning o Product Roadmap
  • 22. Context and Awareness o Context allows concepts to be identified, the way people do o A big knowledge graph can provide context for the entities in it  Differentiating features and similar nodes  How important and how popular it is  Related entities and concepts  Entities it is typically mentioned together with (co-occurrence) o This is awareness! o The kind of knowledge that people mean when saying "I am aware of X" or "She is cognizant of Y"
  • 23. The Critical Mass Malcolm Gladwell claims that one needs to devote 10 000 hours to become an expert in something, e.g. violin or hockey (Outliers)
  • 24. The Critical Mass o A cognitive system needs:  To know 1B facts  About 100M concepts and entities  Read 1M news articles o In order to reach concept and entity awareness in a specific domain  The level of awareness that people mean when saying “My background is X”
  • 25. Let’s play an Awareness game! o Important airports near London? o The most popular banks in UK? o Companies similar to Google? o People mentioned together with IBM in news?
  • 26. We are getting closer! o Our Business Knowledge Model can already answer many of these questions better than you o Most of this intelligence is available in the Ontotext Platform o Knowledge model = KG + text mining + analytics o We already offer two such knowledge models:  Business and general news: one for processing general business master data (like people, organizations, locations and their mentions in the news)  Life sciences and healthcare
  • 27. Customized Cognitive Marketing Intelligence o Developing a cognitive system with global knowledge from scratch is infeasible o We can provide and “onboard” one for you:  Suggest open and commercial data sources  Integrate them with your proprietary data sources  Tune text analytics  Develop specific analytics, reports, dashboards, etc. o We can also maintain it for you:  Various support and maintenance options, including …  Managed data service: updates, monitoring, data quality
  • 28. Presentation Outline o Ontotext Introduction o Technology and Portfolio o Cognitive Analytics Meet Big Knowledge Graphs o Big Company Data: Knowing, Matching and Cleaning o Product Roadmap
  • 29. Person, Organization, Location (POL) Data o POL data is the most common type of master/reference data  Considering business applications and news o Open POL data is available in vast quantities  Geonames covers locations exhaustively; DBPedia covers popular POL entities well; Wikidata, …  Open company data grows: OpenCorporates, GLEI, open national registers, various “data leaks” o Within 3 years exhaustive global POL data will be a commodity!  And it will be widely used for BI and decision making o Ontotext delivers Global POL data solutions today.  We make them more affordable with more cognitive analytics
  • 30. Company Data Species (1/2), Oct 2016
      Category | Representatives | Size (Orgs.)
      Exhaustive Global Databases | Dun & Bradstreet, BvD, Factset | > 200M
      Rich Company Databases | Capital IQ (S&P), Thomson Reuters (various) | 5-10M
      Investment Databases | CrunchBase, PitchBook, CBI, DJ Venture Source | 200-600K
      Very Big Open Databases | OpenCorporates | 130M
      Global Official Open Databases | GLEI (Global Legal Entity Identifier), EU BRIS | 1-30M
      Open Encyclopedic | DBPedia, Wikidata | 0.3-1.2M
      Open Leaks and Investigations | Panama Papers (Offshore Leaks), Trump World Data | 3-300K
  • 31. Company Data Species (2/2), Oct 2016
      Category | Locations | Industry Classification | High Tech. Fields | Invest. Info | Org-Org Relations (e.g. Tree) | Org-Person Relations | Clean, Correct, Predictable
      Exhaustive Global Databases | ++ | +/- | - | - | ++ | +/- | 6
      Rich Company Databases | ++ | + | +/- | +/- | ++ | +/- | 8
      Investment Databases | +/- | +/- | + | + | ++/- | +/- | 4-6
      Very Big Open Databases | + | +/- | - | - | +/- | - | 8
      Global Official Open Databases | + | - | - | - | +/- | - | 8
      Open Encyclopedic | +/- | +/- | + | - | +/- | + | 3-5
      Open Leaks and Investigations | +/- | - | - | - | +/- | + | 4-6
  • 32. Matching and Overlap o Organizations matched across: CrunchBase (CB), CB Insights (CBI), Capital IQ (CIQ), DJ Venture Source, … o The Venn diagram presents the overlap between sources  The size of the circle indicates number of entities per source  The level of overlap indicates number of entities matched between the two sources
  • 34. Entity Matching Across Datasets o Match IDs of one and the same real-world entity across different databases o Data challenges  Different schemata  Name variations  Different classifications and codes  Lack of unique identifiers (even ticker symbols are not unique) o Technology challenges  Pre-selection is needed; brute-force matching is not practical for 1M against 5M companies  It is not trivial to come up with a good pre-selection mechanism
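Pre-selection (often called blocking) as described on slide 34 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the procedure used in the project: the record layout, the `LEGAL_SUFFIXES` list, the blocking key (country plus the first four characters of the normalized name) and the 0.85 threshold are all hypothetical choices. Only records that share a blocking key are compared, which avoids the full cross product of the two datasets.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Assumed list of legal-form tokens to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "corp", "corporation", "plc", "gmbh", "co"}


def normalize(name):
    """Lower-case, strip punctuation and drop legal-form tokens."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def blocking_key(record):
    """Cheap pre-selection key: country plus the start of the normalized name."""
    norm = normalize(record["name"])
    first = norm.split()[0] if norm else ""
    return (record["country"], first[:4])


def match(left, right, threshold=0.85):
    """Compare only records that fall into the same block."""
    blocks = defaultdict(list)
    for rec in right:
        blocks[blocking_key(rec)].append(rec)

    pairs = []
    for rec in left:
        for cand in blocks.get(blocking_key(rec), []):
            score = SequenceMatcher(
                None, normalize(rec["name"]), normalize(cand["name"])
            ).ratio()
            if score >= threshold:
                pairs.append((rec["id"], cand["id"], round(score, 2)))
    return pairs


# Hypothetical records from two sources.
left = [
    {"id": "A1", "name": "Globex Corp.", "country": "BG"},
    {"id": "A2", "name": "Acme Widgets Inc", "country": "US"},
]
right = [
    {"id": "B1", "name": "GLOBEX", "country": "BG"},
    {"id": "B2", "name": "Acme Widget, LLC", "country": "US"},
    {"id": "B3", "name": "Acme Widgets", "country": "GB"},
]

matched = match(left, right)
print(matched)
```

The blocking key trades recall for speed: a key that is too strict (here, an exact country match) silently drops true matches, which is one reason a good pre-selection mechanism is hard to design.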
  • 35. Company Matching Sample Project o We matched 5+ big datasets within a couple of months o Fully automated procedure, which takes a few hours to execute  90% SPARQL and GraphDB’s FTS connectors o Location normalization through matching to Geonames  Also industry classification alignment across the sources o About 85% F-Score with simple structural matching rules o To get higher accuracy, you need:  Massive amount of manual work and fine-tuning of weights … or  Cognitive analytics (importance, similarity, highly accurate named entity recognition, etc.)
  • 36. Presentation Outline o Ontotext Introduction o Technology and Portfolio o Cognitive Analytics Meet Big Knowledge Graphs o Big Company Data: Knowing, Matching and Cleaning o Product Roadmap
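For readers unfamiliar with the F-Score quoted on slide 35, the following Python sketch shows how it is typically computed for entity matching: predicted ID pairs are compared with a manually verified gold standard. The slides do not say how the 85% figure was measured, so the pair sets and the `f_score` helper below are purely hypothetical illustrations of the metric.

```python
def f_score(predicted, gold, beta=1.0):
    """Return (precision, recall, F-beta) for sets of matched ID pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0, 0.0, 0.0
    precision = true_positives / len(predicted)  # share of proposed matches that are right
    recall = true_positives / len(gold)          # share of true matches that were found
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical gold standard and matcher output.
gold = {("A1", "B1"), ("A2", "B2"), ("A3", "B3"), ("A4", "B4")}
predicted = {("A1", "B1"), ("A2", "B2"), ("A3", "B9")}

precision, recall, f1 = f_score(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```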
  • 37. Product Roadmap (short term) o Ontotext platform  Multi-tenant version of our Manual Annotation Tool  Streamlined ETL and entity matching based on SPARK  Configurable Semantic Search front end o GraphDB  Reconciliation  Faster transactions on big knowledge graphs – 2x speed up of small transactions  Faster SPARQL federation between local repositories  Similarity based on Semantic Vectors
  • 39. GraphDB Semantic Similarity Plugin o Statistical similarity on knowledge graphs using Semantic Vectors o Creates statistical semantic models from your RDF data and searches for similar terms and documents o Sample: o Create an index from the news in FactForge o Find similar news, find relevant terms for a news article, etc.
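The idea of vector-based similarity search from slide 39 can be illustrated with a small Python sketch. It deliberately swaps in a much simpler technique, plain bag-of-words vectors with cosine similarity, and does not reproduce the Semantic Vectors models or the GraphDB plugin's API; the documents and the `most_similar` helper are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (a stand-in for a learned semantic vector)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(query, docs):
    """Return the id of the document closest to the query text."""
    q = vectorize(query)
    return max(docs, key=lambda doc_id: cosine(q, vectorize(docs[doc_id])))


# Hypothetical news snippets.
news = {
    "n1": "apple unveils new phone",
    "n2": "central bank raises interest rates",
    "n3": "samsung ships new phone model",
}

best = most_similar("bank interest rates decision", news)
print(best)
```

Count-based vectors only match on shared words; the appeal of statistical semantic models is that terms with similar usage end up close together even when the documents share no vocabulary.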
  • 41. Take home o Business needs global company data for market intelligence o Linking proprietary and global data is rocket science  Mainstream tech cannot deal with such diversity  Semantic data integration and cognitive analytics needed o Ontotext is ready to help  Consulting: help you build the concept for your next generation system  Develop: build one for you or support you in developing your platform  Support and operations: from Level 3 support to Managed services
  • 42. Thank you! Experience the technology with our demonstrators NOW: Semantic News Portal http://now.ontotext.com RANK: News popularity ranking for companies http://rank.ontotext.com FactForge: Hub for open data and news about People and Organizations http://factforge.net