Neo4j: What's Under the Hood & How Knowing This Can Help You
1. Neo4j: What’s Under the Hood
How knowing this can help you
Philip Rathle
VP of Product Management
@prathle
2. Today’s Takeaways:
1. Choose the right technology tool for the job
2. Solve intractable problems: (Business) <--> (IT)
3. Identify new business opportunities
5. Data Management in 1979
Paper Forms
Tiny RAM
Spinning Platters (Low Capacity / Slow, Sequential IO)
RDBMS
Relational Model
The RDBMS Era
Confidential - Neo4j, Inc.
6. Data Management Today
Dynamic Real-World Systems
Abundant RAM
Flash & IO Co-Processors (High-Capacity Storage & Ultra-Fast Random I/O)
A New Graph Era Emerging
Neo4j
Property Graph Model
Real-Time
Connected Data
9. IT Portfolio Perspective
TRADITIONAL DATABASES
Store and retrieve data
Real time storage & retrieval
Max # of hops: Up to 3
BIG DATA TECHNOLOGY
Aggregate and filter data
Long running queries; aggregation & filtering
Max # of hops: 1
10. IT Portfolio Perspective
TRADITIONAL DATABASES
Store and retrieve data
Real time storage & retrieval
Max # of hops: Up to 3
BIG DATA TECHNOLOGY
Aggregate and filter data
Long running queries; aggregation & filtering
Max # of hops: 1
Connections in data
Real-Time Connected Insights
Max # of hops: Millions
“Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code”
Volker Pacher, Senior Developer
11. Illustration by David Somerville based on the original by Hugh McLeod (@gapingvoid)
RDBMS & Aggregate-Oriented NoSQL
Hadoop / EDW / Columnar RDBMS
|<-- Graph Database & Graph Compute Engine (Graph Transactions & Analytics) -->|
12. 3. A Technical Architecture Perspective
Core Technology Differences
14. This Enables: “Minutes to Milliseconds” Real-Time Query Performance
[Chart: Response Time vs. Connectedness and Size of Data Set]
Relational and Other NoSQL Databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections
Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections
1000x Advantage: “Minutes to milliseconds”
15. And is Supported By: ACID Graph Writes: A Requirement for Graph Transactions
ACID Consistency
• Maintains Integrity Over Time
• Guaranteed Graph Consistency
Non ‘Graph-ACID’ DBMSs
• Becomes Corrupt Over Time
• Not ‘Good Enough’ for Graphs
16. A Language For Connected Data
Cypher Query Language
MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate,
       count(report) AS Total
Project Impact
Less time writing queries:
• More time understanding the answers
• Leaving time to ask the next question
Less time debugging queries:
• More time writing the next piece of code
• Improved quality of overall code base
Code that’s easier to read:
• Faster ramp-up for new project members
• Improved maintainability & troubleshooting
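The Cypher query on this slide can be read as two nested variable-length traversals: find everyone 0 to 3 MANAGES hops below the boss, then count everyone 1 to 3 hops below each of them. As a rough illustration only, the Python sketch below mimics that logic on a tiny in-memory org chart. The names MANAGES, reachable, and subordinate_totals and the sample people are hypothetical, and the sketch assumes the hierarchy is a tree, so counting paths and counting people give the same result. It is not a description of how Neo4j executes the query.

```python
from collections import Counter

# Hypothetical org chart: manager -> direct reports.
# Assumed to be a tree, so every person is reached by exactly one path.
MANAGES = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cy", "Di"],
    "Cy": ["Eve"],
}


def reachable(graph, start, min_hops, max_hops):
    """Return the end node of every path of min_hops..max_hops edges from start."""
    ends = []
    frontier = [start]  # nodes exactly `depth` hops away from start
    for depth in range(max_hops + 1):
        if depth >= min_hops:
            ends.extend(frontier)
        frontier = [child for node in frontier for child in graph.get(node, [])]
    return ends


def subordinate_totals(graph, boss):
    """Mimic the query: for each sub within 0..3 hops, count reports within 1..3 hops."""
    totals = Counter()
    for sub in reachable(graph, boss, 0, 3):
        for _report in reachable(graph, sub, 1, 3):
            totals[sub] += 1
    # People with no reports never get an entry, just as the MATCH drops them.
    return dict(totals)


totals = subordinate_totals(MANAGES, "John Doe")
for name, total in totals.items():
    print(name, total)
```

Even in this toy form the traversal needs two helper functions and explicit depth bookkeeping, which is the kind of code the two-line MATCH pattern replaces.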
17. Where in the Organization Do Graphs Add Value
18. The Connected Enterprise
Consumers of Connected Data
AI & Graph Analytics
• Sentiment analysis
• Customer segmentation
• Machine learning
• Cognitive computing
• Community detection
Transactional Graphs
• Fraud detection
• Real-time recommendations
• Network and IT operations management
• Knowledge Graphs
• Master Data Management
Discovery & Visualization
• Fraud detection
• Network and IT operations
• Product information management
• Risk and portfolio analysis
Data Scientists
Business Users
Applications
21. Neo4j Database Anatomy
Full Stack, Native Graph DB
Cost-Based Optimizer
Role-Based Security
Native Graph Engine
Transaction Logging/Backup/Recovery
Management & monitoring
Binary Wire Protocol
Clustering
Integrations
Cypher Query Language
Procedures
Programmatic Language Drivers
MATCH (a)-->(b)
22. Neo4j Graph Database: Enterprise Features
Neo4j Security Foundation
Multi-Clustering Support for Global Internet Apps
Rolling Upgrades
Schema Constraints
Concurrent/Transactional Write Performance
Auto Cache Reheating: For Restarts, Restores and Cluster Expansion
Neo4j 3.4 now supports rolling upgrades (diagram: 3.4 to 3.5)
Upgrade older instances while keeping other members stable and without requiring a restart of the environment
24. Neo4j Desktop: A Neo4j Developer’s Toolchest
• Mission control for developers
• Connect to both local and remote Neo4j servers
• Includes development license for Neo4j Enterprise Edition
• Manages updates, graph apps, and add-ons
• Free with registration
https://neo4j.com/download
29. Strictly Confidential
Query-Based Knowledge Graphs
Connecting the Dots
Multiple graph layers of financial information
Includes corporate data with cross-relationships, external news, and customized weighting
Dashboards and tools
• Credit risk
• Investment risk
• Portfolio news recommendations
has become...
31. Definitions
Feature Engineering: Developing machine learning inputs that have predictive value
Graph-Enhanced Features: ML inputs that express information about the connections
Features can describe facts:
Or they can describe relationships:
33. Query-Based Feature Engineering: Mining Data for Drug Discovery
het.io - HetioNet
Knowledge graph integrating 50+ years of biomedical data
Leveraged to predict new uses for drugs by using the graph topology to create features to predict new links
37. Graph Algorithms: Basis for Connected Features
Pathfinding and Search: Finds the optimal paths or evaluates route availability and quality
Centrality / Importance: Determines the importance of distinct nodes in the network
Community Detection: Detects group clustering or partition options
Heuristic Link Prediction: Estimates the likelihood of nodes forming a relationship
Similarity: Evaluates how alike nodes are
Embeddings: Learned representations of connectivity or topology
38. Graph-Enhanced Feature Engineering
Example: Detecting Financial Fraud
Add connected features to existing pipelines, to increase detection accuracy & reduce false positives
Graph algorithms that might add value in this situation:
• Connected components to identify disjointed graphs sharing identifiers
• PageRank to measure influence and transaction volumes
• Louvain to identify communities that frequently interact
• Jaccard to measure account similarity
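Two of the algorithms named for the fraud example, connected components and Jaccard similarity, are small enough to sketch directly. The Python below is only an illustration under stated assumptions: the ACCOUNTS data, the identifier strings, and the function names are all hypothetical, and a production pipeline would presumably run the database's own graph algorithms rather than this in-memory version.

```python
from collections import defaultdict

# Hypothetical accounts and the identifiers (phone, email, device) each one uses.
ACCOUNTS = {
    "acct1": {"phone:555-0100", "email:a@example.com"},
    "acct2": {"phone:555-0100", "device:d1"},
    "acct3": {"device:d1", "email:c@example.com"},
    "acct4": {"email:z@example.com"},
}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def connected_components(accounts):
    """Group accounts that share identifiers, directly or transitively (union-find)."""
    parent = {name: name for name in accounts}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    first_owner = {}  # identifier -> first account seen using it
    for name, identifiers in accounts.items():
        for ident in identifiers:
            if ident in first_owner:
                parent[find(name)] = find(first_owner[ident])
            else:
                first_owner[ident] = name

    groups = defaultdict(set)
    for name in accounts:
        groups[find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())


components = connected_components(ACCOUNTS)
similarity = jaccard(ACCOUNTS["acct1"], ACCOUNTS["acct2"])
print(components, round(similarity, 3))
```

The component an account belongs to (and its size) and its similarity to known-bad accounts are examples of connected features that could be appended as columns to an existing model's input.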
39. Graph Algorithms Available Today in Neo4j
Pathfinding & Search
• Parallel Breadth First Search
• Parallel Depth First Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• Minimum Spanning Tree
• A* Shortest Path
• Yen’s K Shortest Path
• K-Spanning Tree (MST)
• Random Walk
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• CC Variations: Harmonic, Dangalchev, Wasserman & Faust
• Betweenness Centrality
• Approximate Betweenness Centrality
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count
• Clustering Coefficients
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity – 1 Step & Multi-Step
• Balanced Triad (identification)
Similarity
• Euclidean Distance
• Cosine Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
https://neo4j.com/docs/graph-algorithms/current/
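Several of the link prediction measures listed on this slide are simple functions of two nodes' neighbor sets. The Python sketch below shows their textbook definitions on a small undirected graph. The NEIGHBORS data and function names are hypothetical, and this is not the Neo4j library's API, only an illustration of what each score computes.

```python
import math

# Hypothetical undirected graph as a symmetric adjacency map.
NEIGHBORS = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}


def common_neighbors(g, x, y):
    """Number of nodes adjacent to both x and y."""
    return len(g[x] & g[y])


def total_neighbors(g, x, y):
    """Number of distinct nodes adjacent to x or y."""
    return len(g[x] | g[y])


def preferential_attachment(g, x, y):
    """Product of the two degrees: well-connected nodes attract more links."""
    return len(g[x]) * len(g[y])


def adamic_adar(g, x, y):
    """Sum of 1/log(degree) over common neighbors; rare neighbors weigh more.

    A common neighbor of two distinct nodes has degree >= 2, so log() is never 0.
    """
    return sum(1 / math.log(len(g[u])) for u in g[x] & g[y])


def resource_allocation(g, x, y):
    """Sum of 1/degree over common neighbors."""
    return sum(1 / len(g[u]) for u in g[x] & g[y])


scores = {
    "common_neighbors": common_neighbors(NEIGHBORS, "a", "b"),
    "total_neighbors": total_neighbors(NEIGHBORS, "a", "b"),
    "preferential_attachment": preferential_attachment(NEIGHBORS, "a", "b"),
    "adamic_adar": adamic_adar(NEIGHBORS, "a", "b"),
    "resource_allocation": resource_allocation(NEIGHBORS, "a", "b"),
}
print(scores)
```

Higher scores suggest that the unconnected pair ("a", "b") is more likely to form a relationship; each score can be used on its own or as one feature among several.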
40. Data Engineering: Bringing it All Together
Data Lake
Morpheus
• Explore with Cypher
• Reshape & ship tables into graphs
• More tools coming in Spark 3.0 with SparkCypher
Graph DB Platform
Graph Transactions, Graph Analytics
• Persist relevant data as a graph
• Manage knowledge graphs
• Run connected feature extraction
• Carry out graph analytics
Machine Learning Platform
• Bring query-based graph features to ML pipeline
45. Four Pillars of Graph-Enhanced AI
1. Knowledge Graphs: Context for Decisions
2. Connected Feature Extraction: Context for Accuracy
3. Graph-Accelerated AI: Context for Efficiency
4. AI Explainability: Context for Credibility
46. Example: Spark and Neo4j Workflow
Spark Graph
Morpheus
• Cypher 9 in Spark 3.0 to create non-persistent graphs
Native Graph Platform
Graph Transactions, Graph Analytics
• Native Graph algorithms, processing, and storage
Machine Learning
• MLlib to train models
47. Explore Graphs / Build Graph Solutions
Explore Graphs
• Massively scalable
• Powerful data pipelining
• Robust ML Libraries
• Non-persistent, non-native graphs
Build Graph Solutions
• Persistent, dynamic graphs
• Graph native query and algorithm performance
• Constantly growing list of graph algorithms and embeddings
48. The Path to Graph Data Science
[Chart axes: Enterprise Maturity, Data Science Complexity]
• Query-Based Knowledge Graph
• Query-Based Feature Engineering
• Graph Algorithm Feature Engineering
• Graph Embeddings
• Graph Neural Networks
49. Graph Embeddings
Embedding transforms graphs into a feature vector, or set of vectors, describing topology, connectivity, or attributes of nodes and edges in the graph
• Vertex/node embeddings: describe connectivity of each node
• Path embeddings: traversals across the graph
• Graph embeddings: encode an entire graph into a single vector
Phases of Deep Walk Approach
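The diagram for the DeepWalk phases did not survive in this transcript. As commonly described, the approach starts by sampling short random walks from every node and then treats those walks as sentences for a skip-gram model that learns the node vectors. The Python sketch below covers only that first walk-sampling phase. The GRAPH data, the function name, and the parameter values are hypothetical, and the skip-gram training step is deliberately left out.

```python
import random

# Hypothetical undirected graph as adjacency lists.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}


def random_walks(graph, walks_per_node, walk_length, seed=0):
    """Sample fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    walks = []
    for _ in range(walks_per_node):
        nodes = list(graph)
        rng.shuffle(nodes)  # visit start nodes in a fresh order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = graph[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


walks = random_walks(GRAPH, walks_per_node=2, walk_length=5)
for walk in walks:
    print(" -> ".join(walk))
```

Nodes that appear in similar walk contexts end up with similar vectors after training, which is how the embedding comes to describe the connectivity of each node.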
50. Graph Embeddings: Recommendations
Explainable Reasoning over Knowledge Graphs for Recommendations
[Figure labels: nodes Pop, Folk, Castle on the Hill, ÷ Album, Ed Sheeran, I See Fire, Tony, Shape of You, Derek; relationships SungBy, IsSingerOf, Interact, Produce, WrittenBy; output: Recommendations for Derek]
52. Graph Native Learning
Graph Native Learning refers to deep learning models that take a graph as an input, perform computations, and return a graph
Battaglia et al., 2018
55. A Word on Graph Query Languages
openCypher
Most graph database applications use Cypher
Industry-shared open initiative since 2015
Makes Cypher available to databases & tools
http://www.opencypher.org
ISO GQL
Formal ISO Language Standard In-Progress. Sibling to SQL.
Expected to be highly compatible with Cypher
https://www.gqlstandards.org