What Is GDS and Neo4j’s GDS Library
1. Neo4j for
Graph Data Science™
Alicia Frame @AliciaFrame1
Lead Product Manager for Graph Data Science
Amy E. Hodler @AmyHodler
Director, Product Marketing & Programs for Graph Data Science
2.
• Graph Data Science
(GDS)
• Neo4j for GDS and the
GDS Library
• DEMO!
• Questions
#GraphDataScience
#Neo4j
6. Photo by Helena Lopes on Unsplash
Network Structure
is highly predictive of
pay and promotions
• People Near Structural Holes
• Organizational Misfits
“Organizational Misfits and the Origins of Brokerage in Intrafirm Networks” A. Kleinbaum
“Structural Holes and Good Ideas” R. Burt
7. Relationships and Network Structure
Strongest Predictors of Behavior & Complex Outcomes
“Research into networks reveals that,
surprisingly, the most connected
people inside a tight group within a
single industry are less valuable than
the people who span the gaps ...”
“…jumping from ladder to ladder is a
more effective strategy, and that lateral
or even downward moves across an
organization are more promising in the
longer run . . .”
10. “Data science is an interdisciplinary
field that uses scientific methods,
processes, algorithms and systems
to extract knowledge and insights
from structured and unstructured
data.” - Wikipedia
What is data science?
Data scientists use data to
answer questions.
11. Graph Data Science is a
science-driven approach to gain
knowledge from the relationships
and structures in data, typically to
power predictions.
What is graph data science?
Data scientists use
relationships to answer
questions.
12. Query (e.g. Cypher/Python)
Local Patterns
Real-time, local decisioning and pattern matching
You know what you’re looking for and making a decision
Graph Algorithms
Global Computation
Global analysis and iterations
You’re learning the overall structure of a network, updating data, and predicting
13. Relationships and
network structures
are highly predictive
and underutilized
– and already in your data.
Graphs are a natural way to
store and use this predictive
information, but different
than what you’re doing today.
How do you continually put more
accurate, predictive models into
production quickly?
15. Improving Analytics, ML & AI for Enterprises
Caterpillar’s AI Supply Chain & Maintenance
• 27 million warranty & service documents parsed for text to knowledge graph
• Graph is context for AI to learn “prime examples” and anticipate maintenance
• Improves satisfaction and equipment lifespan
German Center for Diabetes Research (DZD)
• Connecting 50 research databases, 100k’s of Excel workbooks, 30 bio-sample databases
• Bytes 4 Diabetes Award for use of a knowledge graph, graph analytics, and AI
• Customized views for flexible research angles
Financial Fraud Detection & Recovery, Top 10 Bank
• Almost 70% of credit card fraud was missed
• ~1B nodes and relationships to analyse
• Graph analytics with queries & algorithms help find $ millions of fraud in 1st year
16. Evolution of Graph Data Science
Decision Support
Graph Based Predictions
Predictions within a Graph
Knowledge Graphs
Graph Analytics
Graph Feature Engineering
Graph Embeddings
Graph Native Learning
17. Evolution of Graph Data Science
Knowledge Graphs: Graph search and queries. Support domain experts.
Graph Analytics
Graph Feature Engineering
Graph Embeddings
Graph Networks
18. Deceptively Simple Queries
Collaborative filtering: users who
bought X, also bought Y (open-ended
pattern matching)
What items make you more likely to
buy additional items in subsequent
transactions?
Traverse hierarchies - what items are
similar 4+ hops out?
Difficult for RDBMS systems
Knowledge Graph Queries
e.g. Retail Recommendation
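The "users who bought X, also bought Y" pattern can be sketched in a few lines of plain Python. This is only an illustration of the traversal (item, back to customers, out to their other items), not Cypher and not the GDS Library; the `purchases` data and the `also_bought` helper are hypothetical names made up for this example.

```python
from collections import Counter

# Hypothetical toy data: customer -> set of items bought.
purchases = {
    "alice": {"teapot", "mug", "candle"},
    "bob": {"teapot", "mug"},
    "carol": {"teapot", "lantern"},
    "dave": {"candle", "lantern"},
}


def also_bought(purchases, item):
    """Count the other items bought by customers who bought `item`."""
    counts = Counter()
    for basket in purchases.values():
        if item in basket:
            # Every other item in this basket is one co-purchase.
            counts.update(basket - {item})
    # Most frequently co-purchased items first.
    return counts.most_common()


recommendations = also_bought(purchases, "teapot")
```

In a graph database the same question is a two-hop pattern match, which is why it stays simple there while needing self-joins in a relational schema.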
19. Evolution of Graph Data Science
Knowledge Graphs
Graph Analytics: Graph queries & algorithms for offline analysis. Understanding Structures.
Graph Feature Engineering
Graph Embeddings
Graph Networks
21. Graph Algorithms
e.g. Retail Recommendations
Graph algorithms enable reasoning
about network structure
Louvain to identify customer
segmentation based on topology
PageRank to measure
transaction volumes
Connected components
identify unique users
Jaccard to measure purchasing
similarity
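To make three of the algorithms named above concrete, here is a minimal pure-Python sketch of Jaccard similarity, connected components (via union-find), and PageRank (via power iteration). These are textbook versions on hypothetical toy data, not the GDS Library implementations, which are parallelized and run on an in-memory graph; Louvain is left out because it does not fit in a short example.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def connected_components(nodes, edges):
    """Group nodes into components, treating edges as undirected."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return list(groups.values())


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """PageRank by power iteration on a directed edge list."""
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[u] for u in nodes if not out[u])
        new = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for u in nodes:
            for v in out[u]:
                new[v] += damping * rank[u] / len(out[u])
        rank = new
    return rank


# Hypothetical toy inputs.
similarity = jaccard({"teapot", "mug"}, {"teapot", "mug", "candle"})
components = connected_components(["acct1", "acct2", "acct3"], [("acct1", "acct2")])
ranks = pagerank(["a", "b", "c", "d"], [("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")])
```

In the retail reading of the slide, the two accounts that end up in one component would be treated as the same user, and node "c" receives the highest rank because most edges point to it.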
22. Evolution of Graph Data Science
Knowledge Graphs
Graph Analytics
Graph Feature Engineering: Graph algorithms & queries for machine learning. Improve Prediction Accuracy.
Graph Embeddings
Graph Networks
23. Graph Feature Engineering
Feature Engineering is how we combine and process the
data to create new, more meaningful features, such as
clustering or connectivity metrics.
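As a rough illustration of what "connectivity metrics" as features can look like, the sketch below computes two per-node values from an undirected edge list: degree and local clustering coefficient. The function name `graph_features` and the toy edges are assumptions for this example; in practice such values would be added as columns to the feature table of a machine learning model.

```python
from itertools import combinations


def graph_features(edges):
    """Return {node: {"degree": ..., "clustering": ...}} for an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    features = {}
    for node, neighbors in adj.items():
        degree = len(neighbors)
        # Count pairs of neighbors that are themselves connected.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        possible = degree * (degree - 1) / 2
        features[node] = {
            "degree": degree,
            "clustering": links / possible if possible else 0.0,
        }
    return features


# Hypothetical toy graph: a triangle a-b-c with an extra node d attached to c.
features = graph_features([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```

Here node "a" sits inside a closed triangle (clustering 1.0), while "c" bridges the triangle and "d", so its clustering is lower even though its degree is higher.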
24. Evolution of Graph Data Science
Decision Support
Graph Based Predictions
Predictions within a Graph
Knowledge Graphs
Graph Analytics
Graph Feature Engineering
Graph Embeddings
Graph Native Learning
FUTURE
26. A graph catalog that creates an
efficient analytics workspace
• Reshape a transactional database into an in-memory analytics
graph, and manage these operations
• Optimized for analytics with global traversals and aggregation
Algorithms
• Run on the loaded graph to compute metrics about the
topology and connectivity
• Highly parallelized and scale to billions of nodes
What is the GDS Library?
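The idea of reshaping transactional data into an in-memory analytics graph can be sketched conceptually in plain Python. This is not how the GDS graph catalog is implemented or called; it only shows the kind of reshaping and relationship aggregation involved. The record layout `(customer, item, quantity)` and the name `project_copurchase` are hypothetical.

```python
from collections import defaultdict

# Hypothetical transactional rows: (customer, item, quantity).
records = [
    ("alice", "teapot", 1),
    ("alice", "mug", 2),
    ("alice", "teapot", 1),  # repeat purchase, deduplicated below
    ("bob", "teapot", 1),
    ("bob", "mug", 1),
    ("carol", "mug", 1),
    ("carol", "candle", 3),
]


def project_copurchase(records):
    """Reshape purchase rows into a weighted item-item co-purchase graph.

    Returns {(item_a, item_b): number of customers who bought both},
    with each pair stored once in sorted order.
    """
    baskets = defaultdict(set)
    for customer, item, _quantity in records:
        baskets[customer].add(item)  # deduplicates repeat purchases

    weights = defaultdict(int)
    for items in baskets.values():
        ordered = sorted(items)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                weights[(a, b)] += 1  # aggregate parallel relationships
    return dict(weights)


graph = project_copurchase(records)
```

Algorithms such as similarity or community detection would then run on this compact structure rather than on the original transactional rows.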
27. • Neo4j automates data
transformations
• Fast iterations & layering
• Production ready features,
parallelization & enterprise
support
Neo4j for GDS
enterprise-grade features and scale
A graph-specific analytics workspace that’s mutable – integrated
with a native-graph database
Mutable In-Memory Workspace
Computational Graph
Native Graph Store
28. Answer previously intractable questions
with the data you already have
• Deep Path Analytics & Structural Pattern Matching
• Community & Neighbors Detection
• Influencer and Risk Identification
• Disambiguation
• Link and Behavior Prediction
Massive scale to tens of billions of nodes with optimized
algorithms
Increase your predictive accuracy with
Neo4j GDS Algorithms
Take advantage of hardened, validated graph algorithms that
enable reasoning about network structure.
29. Find Value Faster with Neo4j’s
practical Graph Data Science framework
Drastically simplified and
standardized API that
enables custom, flexible
configurations
Documentation, training,
and examples so getting
started is simple
Explore graphs and
algorithm results visually
with Bloom
Share insights across
teams for better
collaboration
Friendly data science
experience with logical
guardrails like memory
management
Reshaping, node &
relationship aggregation /
deduplication and
multipartite algos
30. Simplify Your Data Science Experience
Data Modeling
How do we shape data into a graph in the first place?
Which Algorithms?
Dozens of libraries, hundreds of algos & no docs!
Learn Syntax
We’ve picked a library...good luck learning the syntax
Reshape
WTF? We have to build the entire ETL pipeline for this?
What Now?
Are the results right? How do we get into production?
31. Simplify Your Data Science Experience
Data Modeling
With Neo4j it’s already a graph
Which Algorithms?
Dozens of libraries, hundreds of algos & no docs!
We have validated algos, clear docs, and tutorials
Learn Syntax
We’ve picked a library...good luck learning the syntax
Neo4j syntax is standardized and simplified
Reshape
WTF? We have to build the entire ETL pipeline for this?
We seamlessly reshape your data with 1 command
What Now?
Are the results right? How do we get into production?
Easily write results to Neo4j & move straight into production
33. • Real data from online retailer in the UK with all-occasion items
• 28K nodes and 1.1M relationships
• “Data mining for the online retail industry” by D. Chen et al
• Neo4j 4.0 and Graph Data Science Library 1.2
Retail Data
34. 1. Customer segmentation
a. Break down the graph into customers with similar buying patterns
i. Similarities, Community Detection, Mutate & Export Graph
2. Item recommendations
a. What item should we recommend to different customer
segments?
i. Co-Purchase Similarity, Centralities, Mutate & Export Graph
3. Explore and answer our business questions
a. Recommendation for a person that buys something specific?
b. What to promote to drive sales in a category?
Retail Demo
35. Thank You
&
Questions
- O’Reilly Book on Graph Algorithms: neo4j.com/graph-algorithms-book/
- GDS website, whitepapers, links: neo4j.com/use-cases/graph-data-science-artificial-intelligence/
- Data for Retail example: github.com/AliciaFrame/GDS_Retail_Demo
- GDS Sandbox: sandbox.neo4j.com/?usecase=graph-data-science
Resources
Alicia Frame
Alicia.Frame@Neo4j.com
@AliciaFrame1
Amy E. Hodler
Amy.Hodler@Neo4j.com
@AmyHodler