Graph databases are seeing a spike in popularity as their value in leveraging large data sets for key areas such as fraud detection, marketing, and network optimization becomes increasingly apparent. With graph databases, it’s been said that ‘the data model and the metadata are the database’. What does this mean in a practical application, and how can this technology be optimized for maximum business value?
1. Copyright Global Data Strategy, Ltd. 2020
Graph Databases: Practical Use Cases
Donna Burbank
Global Data Strategy, Ltd.
December 1st, 2020
Follow on Twitter @donnaburbank
@GlobalDataStrat
Twitter Event hashtag: #DAStrategies
2. December 2020
• Based in Boston
• Origins in IBM and Netezza
• Featuring enterprise-scale
OLAP graph database engine
About Cambridge Semantics
Scalable knowledge graphs for modern data
integration and analytics.
4. Single View Of
The Drug Product
A Knowledge Graph is a connected graph of data and
metadata that richly models real-world entities.
[Figure: a canonical PRODUCT node connected to related entities: Safety Report, Patient, Drug Reaction, Summary, Product, Exclusivity, Patent]
The canonical product concept connects related data
about drugs from siloed sources
5. DRUGS@FDA -Approved
Appl_No Approval_Date Applicant
205613 10/07/2014 Valeant Pharms
212379 10/18/2019 Foamix
… … …
ApplNo ProductNo DrugName ActiveIngredient
205612 007 ROPIVACAINE HYDROCHLORIDE
205613 001 UCERIS BUDESONIDE
… …
Orange Book-Products
12.2 Pharmacodynamics
Treatment with glucocorticosteroids,
including UCERIS rectal foam, is
associated with a suppression of
endogenous cortisol concentrations and
an impairment of the hypothalamic-
pituitary-adrenal (HPA) axis function.
DailyMed
Drugs@FDA-Products
205613
APPLNO
Drug
CONTAINS
BUDESONIDE
ACTIVEINGREDIENT
UCERIS
DRUGNAME
Product
2MG/ACTUATION
STRENGTH
ABOUT
Pharmacodynamics
glucocorticosteroids
endogenous cortisol
SUPPRESSION
TREATMENT
10/07/2014
APPROVAL_DATE
Product
APPLICANT
Valeant Pharms
Application
205613
APPL_NO
isSponsor
Product
(Canonical)
Knowledge graphs are connected graphs of data and metadata that
richly model real-world entities.
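As a rough illustration of how a connected graph links siloed sources, the sketch below turns the two example rows shown on this slide (the application row and the product row for 205613) into subject-predicate-object triples and follows the links from a drug name to its applicant. The node naming scheme and the `FILED_UNDER` edge label are assumptions made for this example; they are not AnzoGraph DB's data model or API.

```python
# Rows copied from the slide's example tables (one row from each source).
applications = [
    {"appl_no": "205613", "approval_date": "10/07/2014", "applicant": "Valeant Pharms"},
]
products = [
    {"appl_no": "205613", "product_no": "001",
     "drug_name": "UCERIS", "active_ingredient": "BUDESONIDE"},
]


def to_triples(applications, products):
    """Convert tabular rows into (subject, predicate, object) triples."""
    triples = []
    for app in applications:
        node = f"Application/{app['appl_no']}"
        triples.append((node, "APPROVAL_DATE", app["approval_date"]))
        triples.append((node, "APPLICANT", app["applicant"]))
    for prod in products:
        node = f"Product/{prod['appl_no']}-{prod['product_no']}"
        triples.append((node, "DRUGNAME", prod["drug_name"]))
        triples.append((node, "ACTIVEINGREDIENT", prod["active_ingredient"]))
        # Hypothetical edge label: links the product to its application node.
        triples.append((node, "FILED_UNDER", f"Application/{prod['appl_no']}"))
    return triples


def objects(triples, subject, predicate):
    """All objects reachable from a subject over a given predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(triples, predicate, obj):
    """All subjects that point at an object over a given predicate."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def applicants_for_drug(triples, drug_name):
    """Walk drug name -> product -> application -> applicant."""
    applicants = []
    for product in subjects(triples, "DRUGNAME", drug_name):
        for application in objects(triples, product, "FILED_UNDER"):
            applicants.extend(objects(triples, application, "APPLICANT"))
    return applicants


triples = to_triples(applications, products)
print(applicants_for_drug(triples, "UCERIS"))
```

The point of the sketch is that the question spans two sources, yet it is answered by following edges rather than by knowing either table's schema.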
7. What makes AnzoGraph DB powerful
BUILT ON
STANDARDS
• SPARQL/RDF
• SPARQL*/RDF*
• Cypher/BOLT
• RDFS+
DATA
CONNECTIVITY
• Remote access to 200+
data sources
• Data Virtualization
• ELT, ETL, Streaming
FASTEST DATA
LOADING
• Parallel data loading
• 250 GB/hr/32vCPU
server
HORIZONTAL
SCALABILITY
• Linear scaling to
handle billions or
trillions of triples
FASTEST QUERY &
RICH ANALYTICS
• Graph Algorithms
• Data Science Algorithms
• BI/DW Analytics
• Inferencing
• Geospatial Algorithms
• Build-Your-Own
Analytical Benchmarks
• 217x: AnzoGraph DB compared to Neo4j on an industry-standard TPC-H benchmark
• 113x: AnzoGraph DB LUBM benchmark over previous fastest results
• 10-300x: AnzoGraph DB vs. Spark SQL and Spark GraphFrames
8. A scalable, knowledge graph platform for modern
data integration and analytics
Anzo connects and models related data in a real-world
representation of data at scale, surfacing new insights
and fueling pervasive analytics.
Knowledge Graph
Management and
Metadata Catalog
AnzoGraph MPP OLAP
Knowledge Graph
Engine
Enterprise-grade cloud
deployment and
security
11. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Donna Burbank
Donna is a recognised industry expert in
information management with over 20 years
of experience in data strategy, information
management, data modeling, metadata
management, and enterprise architecture.
Her background is multi-faceted across
consulting, product development, product
management, brand strategy, marketing, and
business leadership.
She is currently the Managing Director at
Global Data Strategy, Ltd., an international
information management consulting company
that specializes in the alignment of business
drivers with data-centric technology. In past
roles, she has served in key brand strategy
and product management roles at CA
Technologies and Embarcadero Technologies
for several of the leading data management
products in the market.
As an active contributor to the data
management community, she is a long time
DAMA International member, Past President
and Advisor to the DAMA Rocky Mountain
chapter, and was awarded the Excellence in
Data Management Award from DAMA
International.
Donna is also an analyst at the Boulder BI
Brain Trust (BBBT), where she provides advice
and gains insight on the latest BI and Analytics
software in the market. She was on several
review committees for the Object
Management Group’s key information
management and process modeling
notations.
She has worked with dozens of Fortune 500
companies worldwide in the Americas,
Europe, Asia, and Africa and speaks regularly
at industry conferences. She has co-authored
several books and is a regular contributor to
industry publications. She can be reached at
donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, USA.
12. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
DATAVERSITY Data Architecture Strategies
This Year’s Lineup
• January 23 Emerging Trends in Data Architecture – What’s the Next Big Thing?
• February 27 Building a Data Strategy - Practical Steps for Aligning with Business Goals
• March 26 Cloud-Based Data Warehousing – What's New and What Stays the Same
• April 23 Master Data Management – Aligning Data, Process, and Governance
• May 28 Data Governance and Data Architecture – Alignment and Synergies
• June 25 Enterprise Architecture vs. Data Architecture
• July 22 Best Practices in Metadata Management
• August 27 Data Quality Best Practices
• September 24 Data Virtualization – Separating Myth from Reality
• October 22 Data Architect vs. Data Engineer vs. Data Modeler
• December 1 Graph Databases: Practical Use Cases
13. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
DATAVERSITY Data Architecture Strategies
Next Year’s Lineup - 2021
• January Emerging Trends in Data Architecture – What’s the Next Big Thing?
• February Building a Data Strategy - Practical Steps for Aligning with Business Goals
• March Data Modeling Case Study – Business Data Modeling at Kiewit
• April Master Data Management – Aligning Data, Process, and Governance
• May Data Architecture, Solution Architecture, Platform Architecture – What’s the Difference?
• June Enterprise Architecture vs. Data Architecture
• July Best Practices in Metadata Management
• August Data Quality Best Practices (with guest Nigel Turner)
• September Data Modeling Techniques
• October Data Governance: Aligning Technical & Business Approaches
• December Data Architecture for Digital Transformation
14. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
What We’ll Cover Today
• Graph databases are seeing a spike in popularity as their value in leveraging large data sets
for key areas such as fraud detection, marketing, and network optimization becomes
increasingly apparent.
• With graph databases, it’s been said that ‘the data model and the metadata are the database’.
• What does this mean in a practical application, and how can this technology be optimized for
maximum business value?
15. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
What is a Graph Database?
• A graph database uses a set of nodes, edges, and
properties to represent and store data.
• With graph databases, the relationships between data
points often matter more than the individual points
themselves. In order to leverage those data relationships,
your organization needs a database technology that stores
the relationships themselves.
• These relationships can help you discover new insights
from your data.
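The nodes, edges, and properties described above can be sketched in a few lines of Python. This is a minimal in-memory illustration only, not how any particular graph database stores data; the class names, the `IS_OWNER_OF` relationship type, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                       # e.g. "Customer" or "Account"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str                      # node_id of the start node
    target: str                      # node_id of the end node
    rel_type: str                    # name of the relationship
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Tiny in-memory property graph: nodes and edges both carry properties."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source, target, rel_type, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, rel_type, properties))

    def neighbors(self, node_id, rel_type=None):
        """Targets of outgoing edges, optionally filtered by relationship type."""
        return [
            e.target for e in self.edges
            if e.source == node_id and (rel_type is None or e.rel_type == rel_type)
        ]


graph = PropertyGraph()
graph.add_node("c1", "Customer", name="Example Customer")
graph.add_node("a1", "Account", status="open")
graph.add_edge("c1", "a1", "IS_OWNER_OF", since=2020)
print(graph.neighbors("c1", "IS_OWNER_OF"))
```

Note that the edge is stored as its own record with its own properties, which is what lets a query start from the relationship instead of reconstructing it.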
16. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Graph Database = Thing Relates to Thing
17. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Graph Database = Thing Relates to Thing
Node
Vertex
Edge
Relationship
The more formal way of referring to “thing relates to thing” is
“Nodes & Edges”, “Vertices & Relationships”, etc.
18. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Graph Databases Mirror the Way We Think
Squirrel!
I should go
visit Mary
I wonder how her
brother John is doing?
Is he still dating
Stephanie?
…In the mind, as in data,
there are always random
data points…
Do they still have that
house at the Lake?
Riding their boats on the lake was great.
Remember when John crashed the boat?
Like my toy
as a child.
Graph databases can be intuitive to many, since they mirror the way the human brain
typically thinks – through Association.
19. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
“Traditional” way of Looking at the World: Hierarchies
• Carolus Linnaeus in 1735 established a hierarchy/taxonomy for organizing and identifying
biological systems.
Kingdom
Phylum
Class
Order
Family
Genus
Species
20. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
“New” Way of Looking at the World - Emergence
In philosophy, systems theory, science, and art, emergence is
the way complex systems and patterns arise out of a
multiplicity of relatively simple interactions.
- Wikipedia
21. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Graph Databases Combine Flexibility w/ Structure & Meaning
• In many ways, graph databases provide the “best of both worlds”.
Flexibility of the “New World” of Discovery & “Emergence”
+
Structure & Meaning of the “Old World” through Ontologies
22. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
It’s All About Relationships
• In graph databases, relationships are first class constructs.
• Rather ironically, relational databases lack relationships.
• In relational databases, relationships are enforced through joins and constraints.
• NoSQL (e.g. Key Value) databases are also weak at supporting relationships.
“A relational database isn’t about relationships, it’s about constraints.”
– Karen Lopez
Customer Account
Is Owner Of
<Customer> <Owner Of> <Account>
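The `<Customer> <Owner Of> <Account>` statement can be read as a subject-predicate-object triple. The sketch below, using made-up identifiers, stores a few such triples and answers questions in either direction with a single pattern match; it illustrates the idea only and is not a query language.

```python
# Each relationship is stored directly as (subject, predicate, object).
# The identifiers below are hypothetical sample data.
triples = [
    ("Customer:1", "OWNER_OF", "Account:10"),
    ("Customer:1", "OWNER_OF", "Account:11"),
    ("Customer:2", "OWNER_OF", "Account:12"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Which accounts does Customer:1 own?
accounts_of_customer_1 = [
    o for (_, _, o) in match(triples, subject="Customer:1", predicate="OWNER_OF")
]

# Who owns Account:12? The same relationship, traversed the other way.
owner_of_account_12 = [
    s for (s, _, _) in match(triples, predicate="OWNER_OF", obj="Account:12")
]

print(accounts_of_customer_1, owner_of_account_12)
```

In a relational schema the same link would typically live in a foreign key column and be rebuilt by a join at query time; here the relationship is the stored record.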
23. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Use Cases for Graph Databases
24. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Social Networks
Donna
Sad, Lonely Person who
doesn’t like data
Who are the cool kids?
i.e. People linked with Donna
25. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
X Degrees of Separation – “The Bacon Number”
• What’s Audrey Hepburn’s “Bacon Number”? i.e. degrees of separation/relation to actor Kevin Bacon
• As always, metadata and data quality are important, i.e., which Audrey Hepburn?
Courtesy of oracleofbacon.org
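A "Bacon Number" is the length of the shortest path between two people in a co-starring graph, which a breadth-first search finds directly. The sketch below uses placeholder actor names, not real filmography data, and is not how oracleofbacon.org is implemented.

```python
from collections import deque

# Hypothetical co-starring graph: each key maps to the people they appeared with.
co_starred = {
    "Actor A": {"Actor B"},
    "Actor B": {"Actor A", "Actor C"},
    "Actor C": {"Actor B", "Target Actor"},
    "Target Actor": {"Actor C"},
    "Actor D": set(),                # no links to anyone in this sample
}


def degrees_of_separation(graph, start, target):
    """Shortest number of hops from start to target, or None if unconnected."""
    if start == target:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor == target:
                return distance + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


print(degrees_of_separation(co_starred, "Actor A", "Target Actor"))
```

The data quality point applies here too: if two different people share one node because their names match, the path lengths are wrong.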
26. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Fraud Detection in Online Transactions
• Online transactions typically have certain identifiers, e.g. User ID, IP address, geo location, tracking cookie, credit card number, etc.
• Graph patterns can help detect fraud, e.g.
• The more interconnections exist among identifiers, the greater the cause for concern.
• Typically the relationships between identifiers would be 1:1.
• Some variations may occur, e.g. one person with multiple credit cards, or a family using the same machine.
• Large and tightly-knit graphs are very strong indicators that fraud is taking place.
• Triggers can be put into place so that these patterns are uncovered before they cause damage.
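One way to sketch the pattern above: link every identifier seen in the same transaction, find the connected groups, and flag groups that are unusually large. The sample transactions and the `max_identifiers` threshold are hypothetical, and a real fraud system would weigh far more signals than group size.

```python
from collections import defaultdict

# Hypothetical transactions; each one ties a user, an IP address, and a card together.
transactions = [
    {"user": "u1", "ip": "ip1", "card": "cc1"},   # plain 1:1 case
    {"user": "u2", "ip": "ip2", "card": "cc2"},
    {"user": "u2", "ip": "ip2", "card": "cc3"},   # one person, two cards
    {"user": "u3", "ip": "ip3", "card": "cc4"},
    {"user": "u4", "ip": "ip3", "card": "cc5"},
    {"user": "u5", "ip": "ip3", "card": "cc6"},
    {"user": "u6", "ip": "ip3", "card": "cc7"},
    {"user": "u7", "ip": "ip3", "card": "cc8"},   # many users and cards on one IP
]


def build_identifier_graph(transactions):
    """Connect every pair of identifiers that appear in the same transaction."""
    graph = defaultdict(set)
    for tx in transactions:
        ids = [f"{kind}:{value}" for kind, value in tx.items()]
        for a in ids:
            for b in ids:
                if a != b:
                    graph[a].add(b)
    return dict(graph)


def connected_components(graph):
    """Group identifiers that are linked directly or indirectly."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


def suspicious_components(transactions, max_identifiers=6):
    """Flag groups with more linked identifiers than the (assumed) threshold."""
    graph = build_identifier_graph(transactions)
    return [c for c in connected_components(graph) if len(c) > max_identifiers]


flagged = suspicious_components(transactions)
print(sorted(flagged[0]))
```

With this sample data the 1:1 case and the two-card case stay below the threshold, while the cluster around `ip3` is flagged for review.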
[Figure: IP addresses linked to credit cards CC1 to CC17, with clusters labeled “Fraud?”, “Family”, and “Personal & Business Card”]
27. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Recommendation Engines
• Recommendation Engines are familiar to most of us who do any online shopping.
• These engines can be powered by a graph database, e.g.
• Capture a customer’s browsing behavior and demographics
• Combine those with their buying history to provide relevant recommendations
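A very small sketch of the idea, using buying history only (browsing behavior and demographics would be further edges in the same graph): customers and products form a graph, and a product is recommended when it was bought by customers whose purchases overlap with yours. The customers, products, and scoring rule are all made up for illustration.

```python
from collections import Counter

# Hypothetical purchase graph: customer -> set of products bought.
purchases = {
    "alice": {"tent", "sleeping bag"},
    "bob": {"tent", "camp stove"},
    "carol": {"tent", "sleeping bag", "lantern"},
    "dave": {"novel"},
}


def recommend(purchases, customer, top_n=3):
    """Rank products bought by customers who share purchases with `customer`."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        overlap = len(owned & items)       # shared purchases = strength of the link
        if overlap == 0:
            continue
        for item in items - owned:         # only suggest things not already bought
            scores[item] += overlap
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend(purchases, "alice"))
```

A customer with no overlapping purchases gets an empty list here, which is one concrete way small data sets produce weak recommendations.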
28. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Data Quality & Volume Matter
• Recommendation engines are based on evaluating data sets. If those data sets are faulty or of
poor quality, your results will be flawed.
• Especially if the data sets are small
29. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Master Data Management (MDM)
• Master Data Management (MDM) is the practice of identifying, cleansing, storing & governing
the core data assets of the organization (e.g. customer, product, etc.)
• There are many architectural approaches to MDM. Two are the following:
Centralized – Commonly Relational (MDM hub)
• Core data stored in a common schema in a centralized “hub”.
• Used as a common reference for operational systems, DW, etc.
Virtualized/Registry – Commonly Graph (Virtualization Layer)
• Data remains in source systems.
• Referenced through a common virtualization layer.
BOTH require the same core foundation of data quality, parsing & matching, semantic meaning,
data governance, etc. in order to be successful… and that’s usually the hardest stuff.
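The registry approach can be sketched as a small index that links one master identifier to the records that stay in each source system, with a combined view assembled only when asked for. The system names, record keys, and the "first value wins" merge rule below are assumptions for illustration; the matching step that populates the registry is the hard part and is not shown.

```python
# Hypothetical source systems: the data stays where it is.
source_systems = {
    "crm": {"C-17": {"name": "Example Customer", "email": "customer@example.com"}},
    "billing": {"B-204": {"name": "Example Customer", "balance": 120.0}},
}

# The registry stores only links from a master ID to (system, local ID) pairs.
registry = {
    "customer:1": [("crm", "C-17"), ("billing", "B-204")],
}


def resolve(master_id):
    """Assemble a combined view on demand by following the registry links."""
    view = {}
    for system, local_id in registry[master_id]:
        for key, value in source_systems[system][local_id].items():
            view.setdefault(key, value)    # assumed rule: first listed system wins
    return view


print(resolve("customer:1"))
```

Nothing is copied into a hub here, so a change in the CRM record shows up in the next call to `resolve` without a separate load step.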
30. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Graph Databases for Data Warehousing
When you have a hammer, everything looks like a nail,
i.e. Data Warehouses serve a particular purpose for aggregating & summarizing data. That is not an ideal use case for graph databases.
31. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Data Warehousing & Enterprise Knowledge Graph
Data Warehouse (Relational & Dimensional data model)
…Show me Total Sales by Region and by Customer each month in 2017
Enterprise Knowledge Graph (Graph data model)
…Who are my most influential customers (those with the most connections)?
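The "most connections" question maps to counting the edges attached to each node, often called degree. A minimal sketch with placeholder customers follows; real influence measures are usually more involved than a raw connection count.

```python
from collections import Counter

# Hypothetical, undirected connections between customers.
connections = [
    ("Customer A", "Customer B"),
    ("Customer A", "Customer C"),
    ("Customer A", "Customer D"),
    ("Customer B", "Customer C"),
]


def most_connected(connections, top_n=1):
    """Rank customers by number of connections (ties broken alphabetically)."""
    degree = Counter()
    for left, right in connections:
        degree[left] += 1
        degree[right] += 1
    ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


print(most_connected(connections, top_n=2))
```

The sales-by-region question, by contrast, is an aggregation over rows, which is what a dimensional model is built for.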
32. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Data Management & Ballroom Dancing
“First you dance with yourself, then with your partner, then you dance with the room.”
33. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
An Enterprise Knowledge Graph Provides a Holistic View of the Organization
through Relationships
“First you dance with yourself, then with your partner, then you dance with the room.”
Customer Data
Data Quality & Semantics are important
for core enterprise data assets.
Name: Audrey Hepburn
DOB: May 4, 1929
Current Customer: No
But the true value is in the
interrelationships between data assets.
Mother of
Name: Luca Dotti
DOB: February 8, 1970
Current Customer: Yes
Purchased Yacht Insurance
Purchased Home Insurance
Filed a Claim
34. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Who is Using Graph Databases?
Graph Databases currently have
lower adoption than other
platforms, according to a recent
DATAVERSITY survey.
* Trends in Data Management, a 2020 DATAVERSITY® Report, by Donna Burbank and Michelle Knight
35. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Who is Using Graph Databases?
For future implementations, there
is growing interest in graph
databases and technologies
18.5% of respondents are looking
to implement graph within the
next 1-2 years.
* Trends in Data Management, a 2020 DATAVERSITY® Report, by Donna Burbank and Michelle Knight
36. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Summary
• Graph Databases provide powerful enterprise-wide
association using simple constructs
• “Thing Relates to Thing”
• Relationships are first class constructs
• The best-suited enterprise use cases are those that
focus on interrelationships between data points
• Social Networks
• Fraud Detection
• Recommendation Engines
• Enterprise Knowledge Graph
• Graph adoption, while lower than that of traditional
technologies, is attracting growing interest.
37. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
About Global Data Strategy™, Ltd
• Global Data Strategy™ is an international information management consulting company that
specializes in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and
information.
• Our core values center around providing solutions that are:
• Business-Driven: We put the needs of your business first, before we look at any technology solution.
• Clear & Relevant: We provide clear explanations using real-world examples.
• Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s
size, corporate culture, and geography.
• High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of
technical expertise in the industry.
Data-Driven Business Transformation
Business Strategy
Aligned With
Data Strategy
Visit www.globaldatastrategy.com for more information
39. Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com
Questions?
• Thoughts? Ideas?