SlideShare a Scribd company logo
1 of 47
Neo4j Graph Data Science
Graph Data Science (GDS) Library Overview
Procedures to let you reshape and subset your transactional graph
so you have the right data in the right shape.
Mutable In-Memory Workspace
Computational Graph
Native Graph Store
Neo4j’s Graph catalog
Understand the GDS Graph Catalog.
• Named Graphs vs Anonymous Graphs
• Native Projection vs Cypher Projection
• Flexibility vs Performance Trade Off
• In-memory graph mutability syntax and benefits
3
Objective
4
Execution of Graph Algorithms
Procedures
Neo4j
In-memory
Projected Graph
Read projected
graph
Load projected
graph
Graph
Loader
Execute
algorithm
Stored
Results
1
2
4
3
● Named graphs
○ are loaded once, with a name, and can be used by multiple
algorithms
5
Graph Loader
CALL gds.graph.create(
'got-interactions',
'Person',
'INTERACTS_1',
{
nodeProperties: 'birth_year',
relationshipProperties:
'weight'
}
)
- Projects a monopartite graph
with Person nodes and
“INTERACTS_1” relationships
- Loads properties “birth_year”
and “weight”
● Anonymous graphs
○ are created on-demand
○ are deleted after its execution is completed
6
Graph Loader
CALL gds.<algo>.<mode>(
{
nodeProjection: 'Person',
relationshipProjection:'INTERACTS_1',
nodeProperties: 'birth_year',
relationshipProperties: 'weight'
}
)
YIELD <results>
- Projects a monopartite graph
with Person nodes and
“INTERACTS_1” relationships
- Loads properties “birth_year”
and “weight”
● Native Projection
○ loaded directly from the data store using node labels and
relationship types
7
Graph Loader
- Projects a monopartite graph
with Person nodes and
“INTERACTS_1” relationships
- Loads properties “birth_year”
and “weight”
CALL gds.graph.create(
'got-interactions',
'Person',
'INTERACTS_1',
{
nodeProperties: 'birth_year',
relationshipProperties:
'weight'
}
)
● Cypher Projection
○ execute cypher queries to populate nodes and relationships
which are then loaded into the analytics graph
8
Graph Loader
- Projects a monopartite graph with Person nodes and
“INTERACTS_1” relationships
- Loads properties “birth_year” and “weight”
CALL gds.graph.create.cypher(
'got-interactions',
'MATCH (p:Person) RETURN id(p) as id, p.birth_year as birth_year',
'MATCH (p1:Person)-[r:INTERACTS_1]->(p2:Person) RETURN id(p1) as source,
id(p2) as target, r.weight as weight')
9
Native Projection
10
Native Projection
CALL gds.graph.create(
graphName: STRING,
nodeProjection: STRING, LIST, or MAP,
relationshipProjection: STRING, LIST, or MAP,
configuration: MAP
);
CALL gds.graph.create(
'got-interactions',
'Person',
'INTERACTS_1',
{
nodeProperties: 'birth_year',
relationshipProperties: 'weight',
readConcurrency: 4
}
)
Syntax
Example
Shorthand:
Long form:
<node-label>: specify the node label name you want to use in your analytics
graph
label: specify the node label(s) from the Neo4j database
properties: one or mode node properties to map from Neo4j
11
Native Projections: Nodes
CALL gds.graph.create('my-graph', {
<node-label>: {
label: <neo4j-label>,
properties: <node-property-mappings>}
},
relationshipProjection: STRING, LIST, or MAP);
CALL gds.graph.create('my-graph', 'nodeLabel','relationshipType') ;
Node Labels:
12
Native Projections: Nodes
CALL gds.graph.create('my-graph', {
<node-label>: {
label: ['Label1', 'Label2'],
properties: {
<property-key-1>: {
property: <neo-property-key>
defaultValue: <numeric-value>
},
<property-key-2>: {
property: <neo-property-key>
defaultValue: <numeric-value>
}
}
}
},relationshipProjection)
Node Labels:
13
Native Projections: Nodes
CALL gds.graph.create('my-graph', {
Client: {
label: ['ComercialClient', 'CosumerClient'],
properties: {
stateId,
seed: {
property: 'prevCommunityId'
},
}
}
},relationshipProjection)
neo-property-keywill default to the specified property key
defaultValuedefaults to NaN
CALL gds.graph.create('my-graph', 'nodeLabel','relationshipType');Shorthand:
Long form:
14
Native Projections: Relationships
CALLs gds.graph.create('my-graph’,
nodeProjection: STRING, LIST, or MAP, {
<relationship-type-1>: {
type: <neo4j-type>,
orientation: <projection-type>,
aggregation: <aggregation-type>,
properties: <relationship-property-mappings>
},
<relationship-type-2>: {
type: <neo4j-type>,
orientation: <projection-type>,
aggregation: <aggregation-type>,
properties: <relationship-property-mappings>
}
});
Long Form:
Aggregation specifies how parallel (duplicate) Neo4j relationships are
represented in the analytics graph:
- NONE: No aggregation (default)
- MIN, MAX, SUM, COUNT: A single relationship is retained with an aggregate
property
- SINGLE: A single, arbitrary relationship is projected15
Native Projections: Relationships
CALL gds.graph.create('my-graph',
nodeProjection: STRING, LIST, or MAP, {
<relationship-type-1>: {
type: <neo4j-type>,
orientation: <projection-type>,
aggregation: <aggregation-type>,
properties: <relationship-property-mappings>
}});
Aggregation is pretty important:
- How you deduplicate (or don’t) your relationships impacts results
- Much faster way to aggregate properties than via Cypher
16
Native Projections: Relationships
using ‘sum’ to sum up the amounts:using ‘single’ to pick a single relationship:
Useful when you have many relationships you want to flexibly combine into a weightUseful when you aren’t using weights and a single relationship is appropriate
Long Form:
properties works the same as with nodes - specifies a mapping
of neo4j relationship properties to the in-memory graph.
17
Native Projections: Relationships
CALL gds.graph.create('my-graph',
nodeProjection: STRING, LIST, or MAP, {
<relationship-type-1>: {
type: <neo4j-type>,
orientation: <projection-type>,
aggregation: <aggregation-type>,
properties: {
<property-key1>: {
property:<neo4j-property-key>,
defaultValue: <defaultValue>,
aggregation: <aggregation-type>
}
}
}})
18
...putting it all together
CALL gds.graph.create('my-graph’,{
Customer: {
label: ['commercialClient', 'consumerClient'],
properties: {
seed: {
property:'stateId’
}
}
}
},
INTERACTS: {
type: 'TRANSACTION',
orientation: 'NATURAL',
aggregation: 'SUM',
properties: {
weight: {
property:'amount',
defaultValue: 0.0,
aggregation: 'SUM'
}
}
}});
19
...putting it all together
CALL gds.graph.create('my-graph’,{
Customer: {
label: ['commercialClient', 'consumerClient'],
properties: {
seed: {
property:'stateId'
}
}
},{
INTERACTS: {
type: 'TRANSACTION',
orientation: 'NATURAL',
aggregation: 'SUM',
properties: {
weight: {
property:’amount',
defaultValue: 0.0,
aggregation: 'SUM'
}
}
}})
graph name
20
...putting it all together
CALL gds.graph.create('my-graph’,{
Customer: {
label: ['commercialClient', 'consumerClient'],
properties: {
seed: {
property:'stateId'
}
}
},{
INTERACTS: {
type: 'TRANSACTION',
orientation: 'NATURAL',
aggregation: 'SUM',
properties: {
weight: {
property:’amount',
defaultValue: 0.0,
aggregation: 'SUM'
}
}
}})
nodes
21
...putting it all together
CALL gds.graph.create('my-graph’,{
Customer: {
label: ['commercialClient', 'consumerClient'],
properties: {
seed: {
property:'stateId'
}
}
},{
INTERACTS: {
type: 'TRANSACTION',
orientation: 'NATURAL',
aggregation: 'SUM',
properties: {
weight: {
property:’amount',
defaultValue: 0.0,
aggregation: 'SUM'
}
}
}})
relationships
22
Shorthand syntax
CALL gds.graph.create('my-graph',['commercialClient', 'consumerClient'],
'TRANSACTS',
{
nodeProperties: {seed: 'stateId'},
relationshipProperties: { interacts: {
property: 'amount', dddd
aggregation:’SUM’,
defaultValue: 0.0
}
}
}) YIELD graphName, nodeCount, relationshipCount
CALL gds.graph.create(
'got-interactions-1',
'Person',
{
INTERACTS_1: {
orientation: 'UNDIRECTED'
}
}
);
23
Let’s try it out!
How many nodes are in your projected graph? How many labels?
CALL gds.graph.drop('got-interactions-1');
24
Try it yourself!
Exercise: Can you add in a second relationship type, INTERACTS_2,
into your projection?
How many nodes and relationships are in this graph?
CALL gds.graph.create(
'got-interactions-12',
'Person',
{
INTERACTS_1: {
orientation: 'UNDIRECTED'
},
INTERACTS_2: {
orientation: 'UNDIRECTED'}});
25
Try it yourself!
Exercise: Can you reverse the direction of INTERACTS_2, and rename
it INTERACTS2_BACKWARDS
How many nodes and relationships are in this graph?
CALL gds.graph.create(
'got-interactions-12_reverse',
'Person',
{
INTERACTS_1: {
orientation: 'UNDIRECTED'
},
INTERACTS_2_BACKWARDS: {
type: 'INTERACTS_2',
orientation: 'REVERSE'}});
26
Don’t forget!
CALL gds.graph.drop('got-interactions-12');
CALL gds.graph.drop('got-interactions-12_reverse');
1. Use the Native shorthand syntax to load a graph with the
following specifications:
• Name: Season1Interactions
• Node Label: Character
• Relationship type: INTERACTS_SEASON1
2. Edit exercise 1 by renaming the graph to
Season1InteractionsWeighted and adding the relationship
property weight.
27
Native Projection Exercise
28
Native Projection Exercise Solution
29
What if I want to
project a different
structure?
Native graphs are fast, but Cypher graphs are flexible -- use them if:
- you’re in the experimentation phase and trying to decide on a
data model
- performance and repeatability aren’t too important
Syntax overview:
30
Cypher Projections
CALL gds.graph.create.cypher(
graphName: STRING,
nodeQuery: STRING,
relationshipQuery: STRING,
configuration: MAP
)
CALL gds.graph.create.cypher(
'my-graph',
'MATCH (p:Person) RETURN id(p) as id',
'MATCH (p1:Person)-[:LIVES]->(:Place)<-[:Lives]-(p2:Person)
RETURN id(p1) as source, id(p2) as target'
);
Native graphs are fast, but Cypher graphs are flexible -- use them if:
- you’re in the experimentation phase and trying to decide on a
data model
- performance and repeatability aren’t too important
Syntax overview:
31
Cypher Projections
CALL gds.graph.create.cypher(
'my-graph',
'MATCH (p:Person) RETURN id(p) as id',
'MATCH (p1:Person)-[:LIVES]->(:Place)<-[:Lives]-(p2:Person)
RETURN id(p1) as source, id(p2) as target'
);
The node query defines the population of nodes to consider, the
relationship query joins them up
Requirements:
- Node query must return a column called id that uniquely
identifies a node
- Relationship query must return source and target columns
with unique node identifiers32
Cypher Projections
CALL gds.graph.create.cypher(
'my-graph',
'MATCH (p:Person) RETURN id(p) as id',
'MATCH (p1:Person)-[:LIVES]->(:Place)<-[:Lives]-(p2:Person)
RETURN id(p1) as source, id(p2) as target'
);
Node and relationship properties are specified in the queries
Multiple relationships are identified in the relationship query
Directionality is interpreted from the ordering of source, target
33
Cypher Projections
CALL gds.graph.create.cypher(
'my-graph’,
'MATCH (p:Person) RETURN id(p) as id, p.community as community’,
'MATCH (p1:Person)-[r:WORKS_WITH|FRIEND_OF]->(p2:Person)
RETURN id(p1) as source, id(p2) as target, type(r) AS type,
count(r) as weight'
);
The easiest way to aggregate relationships is in the relationship query
Cypher graphs also support aggregation functions like Native graphs:
34
Cypher Projections: relationship
aggregation
CALL gds.graph.create.cypher(
graphName: STRING,
nodeQuery: STRING,
relationshipQuery: STRING,
configuration: MAP
);
35
Cypher Projections: relationship
aggregation
CALL gds.graph.create.cypher(
'my-graph',
'MATCH (p:Person) RETURN id(p) as id',
'MATCH (p1:Person)-[r:WORKS_WITH|FRIEND_OF]->(p2:Person)
RETURN id(p1) as source, id(p2) as target, r.duration as time',
{ relationshipProperties: {
time_known: {
property: 'time',
aggregation: 'SUM'
defaultValue: 0.0
}
}
}
);
Why bother?
This lets you bypass using the
Cypher engine to do the
aggregation -- it’s much more
performant on big data sets!
CALL gds.graph.create.cypher(
'same-house-graph',
'MATCH (n:Person) RETURN id(n) AS id',
'MATCH
(p1:Person)-[:BELONGS_TO]-(:House)-[:BELONGS_TO]-(p2:Person
) RETURN id(p1) AS source, id(p2) AS target'
);
36
Let’s try it out!
How many nodes are in your projected graph? How many labels?
CALL gds.graph.drop('got-interactions-cypher');
37
Try it yourself!
Exercise: Can you modify the cypher projection to find people who
APPEARED_IN the same Book?
How many nodes and relationships are in this graph?
How could you modify this query to add a weight property with the
number of books people APPEARED_IN together?
`Match (p1:Person)-(:APPEARED_IN)->(b:Book)<-(:APPEARED_IN)-(p2:Person)
RETURN id(p1) AS source, id(p2) AS target, count(distinct b) AS weight`
• Use a Cypher projection to load a graph with the following
specifications:
• Name: InteractsWeighted
• Node labels: Character
• Relationships: All
• Relationship property: weight
• Relationship aggregation: SUM
• Relationship property default value: 0.0
38
Cypher Projection Exercise
39
Cypher Projection Exercise Solution
Every Named Graph is stored in memory (the heap)
Management procedures:
- gds.graph.list: for listing named graphs
- gds.graph.exists: check is the graph exists
- gds.graph.drop: remove named graph
Graph access:
- all graphs are user specific -- another user can’t drop or use my
graphs
40
Graph Management
• Use graph management procedures to remove the graph
InteractsWeighted from the Graph Catalog.
• Solution:
41
Graph Management Exercise
42
Flexibility vs. Performance Trade Offs
1.
2.3.
4.
43
Mutating the in-memory graph
Mutate is similar to write but instead adds a new property to the in-memory graph,
specified by the mutateProperty parameter (note: this must be a new property)
CALL gds[.<tier>].<algorithm>.mutate(
graphName: STRING,
configuration: MAP
);
CALL gds.pageRank.mutate(
graphName: 'my-graph',
{mutateProperty:'pageRank'});
After you’ve finished your workflow, you can write your results back
to Neo4j:
44
Writing your results back to Neo4j
CALL gds.graph.writeNodeProperties(
graphName, [<node_properties>])
CALL gds.graph.writeRelationship(
graphName, <RELATIONSHIP>, [<relationship_property>] )
After you’ve finished your workflow, you can write your results back
to Neo4j:
Note: you can also export your in-memory graph with
gds.beta.graph.export('graph-name',
{storeDir:'/some/dir',dbName:'persisted-graph'})45
Writing your results back to Neo4j
CALL gds.graph.writeNodeProperties(
'my-graph', ['componentId', 'pageRank',
'communityId']);
CALL gds.graph.writeRelationship(
'my-graph','SIMILAR_TO', 'similarity_score' );
Writing back to Neo4j is often the slowest step:
- When chaining algorithms, skip writing back the results you
don’t need
- Write optimization for writing multiple properties more
efficiently
Example: Similarity + Louvain
- Run node similarity, update the in-memory graph with
SIMILAR_TO
- Run Louvain on SIMILAR_TO relationship
- Only write back Louvain communities
47
Why bother?
48
Next: Graph Algorithms
● What are graph algorithms and how can we use
them?
● Neo4j GDS Procedures and Functions Overview
● Algorithms support tiers and execution modes
● Deep Dive on Graph Algorithms Using Game of
Thrones Dataset
○ Community Detection
○ Similarity
○ Centrality
○ Path Finding & Search
○ Link Prediction
● Best Practices Using Neo4j GDS

More Related Content

What's hot

What's hot (20)

Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data ScienceScaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
 
Boost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined ProceduresBoost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined Procedures
 
Base de données graphe et Neo4j
Base de données graphe et Neo4jBase de données graphe et Neo4j
Base de données graphe et Neo4j
 
Will Lyon- Entity Resolution
Will Lyon- Entity ResolutionWill Lyon- Entity Resolution
Will Lyon- Entity Resolution
 
FIWARE Global Summit - NGSI-LD – an Evolution from NGSIv2
FIWARE Global Summit - NGSI-LD – an Evolution from NGSIv2FIWARE Global Summit - NGSI-LD – an Evolution from NGSIv2
FIWARE Global Summit - NGSI-LD – an Evolution from NGSIv2
 
Top 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksTop 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & Tricks
 
How Expedia’s Entity Graph Powers Global Travel
How Expedia’s Entity Graph Powers Global TravelHow Expedia’s Entity Graph Powers Global Travel
How Expedia’s Entity Graph Powers Global Travel
 
NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4j
 
How Dell Used Neo4j Graph Database to Redesign Their Pricing-as-a-Service Pla...
How Dell Used Neo4j Graph Database to Redesign Their Pricing-as-a-Service Pla...How Dell Used Neo4j Graph Database to Redesign Their Pricing-as-a-Service Pla...
How Dell Used Neo4j Graph Database to Redesign Their Pricing-as-a-Service Pla...
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
Risk Signature Profiles in Health Care Claims(Risk_Signature_Profiles)_.pptx
Risk Signature Profiles in Health Care Claims(Risk_Signature_Profiles)_.pptxRisk Signature Profiles in Health Care Claims(Risk_Signature_Profiles)_.pptx
Risk Signature Profiles in Health Care Claims(Risk_Signature_Profiles)_.pptx
 
EY: Why graph technology makes sense for fraud detection and customer 360 pro...
EY: Why graph technology makes sense for fraud detection and customer 360 pro...EY: Why graph technology makes sense for fraud detection and customer 360 pro...
EY: Why graph technology makes sense for fraud detection and customer 360 pro...
 
Intro to Neo4j and Graph Databases
Intro to Neo4j and Graph DatabasesIntro to Neo4j and Graph Databases
Intro to Neo4j and Graph Databases
 
Neo4j in Depth
Neo4j in DepthNeo4j in Depth
Neo4j in Depth
 
Neo4j 4.1 overview
Neo4j 4.1 overviewNeo4j 4.1 overview
Neo4j 4.1 overview
 
A Primer on Entity Resolution
A Primer on Entity ResolutionA Primer on Entity Resolution
A Primer on Entity Resolution
 
How Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and BeyondHow Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and Beyond
 
Training Week: Introduction to Neo4j 2022
Training Week: Introduction to Neo4j 2022Training Week: Introduction to Neo4j 2022
Training Week: Introduction to Neo4j 2022
 
The Graph Traversal Programming Pattern
The Graph Traversal Programming PatternThe Graph Traversal Programming Pattern
The Graph Traversal Programming Pattern
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentation
 

Similar to Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog Overview

MongoDB MapReduce Business Intelligence
MongoDB MapReduce Business IntelligenceMongoDB MapReduce Business Intelligence
MongoDB MapReduce Business Intelligence
Shafaq Abdullah
 
MongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseMongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL Database
Ruben Inoto Soto
 
CouchDB on Rails - FrozenRails 2010
CouchDB on Rails - FrozenRails 2010CouchDB on Rails - FrozenRails 2010
CouchDB on Rails - FrozenRails 2010
Jonathan Weiss
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analytics
MongoDB
 

Similar to Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog Overview (20)

Understanding Connected Data through Visualization
Understanding Connected Data through VisualizationUnderstanding Connected Data through Visualization
Understanding Connected Data through Visualization
 
Multiple Graphs: Updatable Views
Multiple Graphs: Updatable ViewsMultiple Graphs: Updatable Views
Multiple Graphs: Updatable Views
 
Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4j
 
Multiple graphs in openCypher
Multiple graphs in openCypherMultiple graphs in openCypher
Multiple graphs in openCypher
 
Transformations and actions a visual guide training
Transformations and actions a visual guide trainingTransformations and actions a visual guide training
Transformations and actions a visual guide training
 
MongoDB MapReduce Business Intelligence
MongoDB MapReduce Business IntelligenceMongoDB MapReduce Business Intelligence
MongoDB MapReduce Business Intelligence
 
05 neo4j gds graph catalog
05   neo4j gds graph catalog05   neo4j gds graph catalog
05 neo4j gds graph catalog
 
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
 
MongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseMongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL Database
 
Dex Technical Seminar (April 2011)
Dex Technical Seminar (April 2011)Dex Technical Seminar (April 2011)
Dex Technical Seminar (April 2011)
 
GraphQL & DGraph with Go
GraphQL & DGraph with GoGraphQL & DGraph with Go
GraphQL & DGraph with Go
 
CouchDB on Rails - FrozenRails 2010
CouchDB on Rails - FrozenRails 2010CouchDB on Rails - FrozenRails 2010
CouchDB on Rails - FrozenRails 2010
 
CouchDB on Rails
CouchDB on RailsCouchDB on Rails
CouchDB on Rails
 
CouchDB on Rails - RailsWayCon 2010
CouchDB on Rails - RailsWayCon 2010CouchDB on Rails - RailsWayCon 2010
CouchDB on Rails - RailsWayCon 2010
 
Neo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExpNeo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExp
 
3rd Athens Big Data Meetup - 2nd Talk - Neo4j: The World's Leading Graph DB
3rd Athens Big Data Meetup - 2nd Talk - Neo4j: The World's Leading Graph DB3rd Athens Big Data Meetup - 2nd Talk - Neo4j: The World's Leading Graph DB
3rd Athens Big Data Meetup - 2nd Talk - Neo4j: The World's Leading Graph DB
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
 
SenchaCon 2016: Add Magic to Your Ext JS Apps with D3 Visualizations - Vitaly...
SenchaCon 2016: Add Magic to Your Ext JS Apps with D3 Visualizations - Vitaly...SenchaCon 2016: Add Magic to Your Ext JS Apps with D3 Visualizations - Vitaly...
SenchaCon 2016: Add Magic to Your Ext JS Apps with D3 Visualizations - Vitaly...
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analytics
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)
 

More from Neo4j

More from Neo4j (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansQIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge Graphs
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with Graph
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 

Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog Overview

  • 1. Neo4j Graph Data Science Graph Data Science (GDS) Library Overview
  • 2. Procedures to let you reshape and subset your transactional graph so you have the right data in the right shape. Mutable In-Memory Workspace Computational Graph Native Graph Store Neo4j’s Graph catalog
  • 3. Understand the GDS Graph Catalog. • Named Graphs vs Anonymous Graphs • Native Projection vs Cypher Projection • Flexibility vs Performance Trade Off • In-memory graph mutability syntax and benefits 3 Objective
  • 4. 4 Execution of Graph Algorithms Procedures Neo4j In-memory Projected Graph Read projected graph Load projected graph Graph Loader Execute algorithm Stored Results 1 2 4 3
  • 5. ● Named graphs ○ are loaded once, with a name, and can be used by multiple algorithms 5 Graph Loader CALL gds.graph.create( 'got-interactions', 'Person', 'INTERACTS_1', { nodeProperties: 'birth_year', relationshipProperties: 'weight' } ) - Projects a monopartite graph with Person nodes and “INTERACTS_1” relationships - Loads properties “birth_year” and “weight”
  • 6. ● Anonymous graphs ○ are created on-demand ○ are deleted after its execution is completed 6 Graph Loader CALL gds.<algo>.<mode>( { nodeProjection: 'Person', relationshipProjection:'INTERACTS_1', nodeProperties: 'birth_year', relationshipProperties: 'weight' } ) YIELD <results> - Projects a monopartite graph with Person nodes and “INTERACTS_1” relationships - Loads properties “birth_year” and “weight”
  • 7. ● Native Projection ○ loaded directly from the data store using node labels and relationship types 7 Graph Loader - Projects a monopartite graph with Person nodes and “INTERACTS_1” relationships - Loads properties “birth_year” and “weight” CALL gds.graph.create( 'got-interactions', 'Person', 'INTERACTS_1', { nodeProperties: 'birth_year', relationshipProperties: 'weight' } )
  • 8. ● Cypher Projection ○ execute cypher queries to populate nodes and relationships which are then loaded into the analytics graph 8 Graph Loader - Projects a monopartite graph with Person nodes and “INTERACTS_1” relationships - Loads properties “birth_year” and “weight” CALL gds.graph.create.cypher( 'got-interactions', 'MATCH (p:Person) RETURN id(p) as id, p.birth_year as birth_year', 'MATCH (p1:Person)-[r:INTERACTS_1]->(p2:Person) RETURN id(p1) as source, id(p2) as target, r.weight as weight')
  • 10. 10 Native Projection CALL gds.graph.create( graphName: STRING, nodeProjection: STRING, LIST, or MAP, relationshipProjection: STRING, LIST, or MAP, configuration: MAP ); CALL gds.graph.create( 'got-interactions', 'Person', 'INTERACTS_1', { nodeProperties: 'birth_year', relationshipProperties: 'weight', readConcurrency: 4 } ) Syntax Example
  • 11. Shorthand: Long form: <node-label>: specify the node label name you want to use in your analytics graph label: specify the node label(s) from the Neo4j database properties: one or mode node properties to map from Neo4j 11 Native Projections: Nodes CALL gds.graph.create('my-graph', { <node-label>: { label: <neo4j-label>, properties: <node-property-mappings>} }, relationshipProjection: STRING, LIST, or MAP); CALL gds.graph.create('my-graph', 'nodeLabel','relationshipType') ;
  • 12. Node Labels: 12 Native Projections: Nodes CALL gds.graph.create('my-graph', { <node-label>: { label: ['Label1', 'Label2'], properties: { <property-key-1>: { property: <neo-property-key> defaultValue: <numeric-value> }, <property-key-2>: { property: <neo-property-key> defaultValue: <numeric-value> } } } },relationshipProjection)
  • 13. Node Labels: 13 Native Projections: Nodes CALL gds.graph.create('my-graph', { Client: { label: ['ComercialClient', 'CosumerClient'], properties: { stateId, seed: { property: 'prevCommunityId' }, } } },relationshipProjection) neo-property-keywill default to the specified property key defaultValuedefaults to NaN
  • 14. CALL gds.graph.create('my-graph', 'nodeLabel','relationshipType');Shorthand: Long form: 14 Native Projections: Relationships CALLs gds.graph.create('my-graph’, nodeProjection: STRING, LIST, or MAP, { <relationship-type-1>: { type: <neo4j-type>, orientation: <projection-type>, aggregation: <aggregation-type>, properties: <relationship-property-mappings> }, <relationship-type-2>: { type: <neo4j-type>, orientation: <projection-type>, aggregation: <aggregation-type>, properties: <relationship-property-mappings> } });
  • 15. Long Form: Aggregation specifies how parallel (duplicate) Neo4j relationships are represented in the analytics graph: - NONE: No aggregation (default) - MIN, MAX, SUM, COUNT: A single relationship is retained with an aggregate property - SINGLE: A single, arbitrary relationship is projected15 Native Projections: Relationships CALL gds.graph.create('my-graph', nodeProjection: STRING, LIST, or MAP, { <relationship-type-1>: { type: <neo4j-type>, orientation: <projection-type>, aggregation: <aggregation-type>, properties: <relationship-property-mappings> }});
  • 16. Aggregation is pretty important: - How you deduplicate (or don’t) your relationships impacts results - Much faster way to aggregate properties than via Cypher 16 Native Projections: Relationships using ‘sum’ to sum up the amounts:using ‘single’ to pick a single relationship: Useful when you have many relationships you want to flexibly combine into a weightUseful when you aren’t using weights and a single relationship is appropriate
  • 17. Long Form: properties works the same as with nodes - specifies a mapping of neo4j relationship properties to the in-memory graph. 17 Native Projections: Relationships CALL gds.graph.create('my-graph', nodeProjection: STRING, LIST, or MAP, { <relationship-type-1>: { type: <neo4j-type>, orientation: <projection-type>, aggregation: <aggregation-type>, properties: { <property-key1>: { property:<neo4j-property-key>, defaultValue: <defaultValue>, aggregation: <aggregation-type> } } }})
  • 18. 18 ...putting it all together CALL gds.graph.create('my-graph’,{ Customer: { label: ['commercialClient', 'consumerClient'], properties: { seed: { property:'stateId’ } } } }, INTERACTS: { type: 'TRANSACTION', orientation: 'NATURAL', aggregation: 'SUM', properties: { weight: { property:'amount', defaultValue: 0.0, aggregation: 'SUM' } } }});
  • 19. 19 ...putting it all together CALL gds.graph.create('my-graph’,{ Customer: { label: ['commercialClient', 'consumerClient'], properties: { seed: { property:'stateId' } } },{ INTERACTS: { type: 'TRANSACTION', orientation: 'NATURAL', aggregation: 'SUM', properties: { weight: { property:’amount', defaultValue: 0.0, aggregation: 'SUM' } } }}) graph name
  • 20. 20 ...putting it all together CALL gds.graph.create('my-graph’,{ Customer: { label: ['commercialClient', 'consumerClient'], properties: { seed: { property:'stateId' } } },{ INTERACTS: { type: 'TRANSACTION', orientation: 'NATURAL', aggregation: 'SUM', properties: { weight: { property:’amount', defaultValue: 0.0, aggregation: 'SUM' } } }}) nodes
  • 21. 21 ...putting it all together CALL gds.graph.create('my-graph’,{ Customer: { label: ['commercialClient', 'consumerClient'], properties: { seed: { property:'stateId' } } },{ INTERACTS: { type: 'TRANSACTION', orientation: 'NATURAL', aggregation: 'SUM', properties: { weight: { property:’amount', defaultValue: 0.0, aggregation: 'SUM' } } }}) relationships
  • 22. 22 Shorthand syntax CALL gds.graph.create('my-graph',['commercialClient', 'consumerClient'], 'TRANSACTS', { nodeProperties: {seed: 'stateId'}, relationshipProperties: { interacts: { property: 'amount', dddd aggregation:’SUM’, defaultValue: 0.0 } } }) YIELD graphName, nodeCount, relationshipCount
  • 23. CALL gds.graph.create( 'got-interactions-1', 'Person', { INTERACTS_1: { orientation: 'UNDIRECTED' } } ); 23 Let’s try it out! How many nodes are in your projected graph? How many labels? CALL gds.graph.drop('got-interactions-1');
  • 24. 24 Try it yourself! Exercise: Can you add in a second relationship type, INTERACTS_2, into your projection? How many nodes and relationships are in this graph? CALL gds.graph.create( 'got-interactions-12', 'Person', { INTERACTS_1: { orientation: 'UNDIRECTED' }, INTERACTS_2: { orientation: 'UNDIRECTED'}});
  • 25. 25 Try it yourself! Exercise: Can you reverse the direction of INTERACTS_2, and rename it INTERACTS2_BACKWARDS How many nodes and relationships are in this graph? CALL gds.graph.create( 'got-interactions-12_reverse', 'Person', { INTERACTS_1: { orientation: 'UNDIRECTED' }, INTERACTS_2_BACKWARDS: { type: 'INTERACTS_2', orientation: 'REVERSE'}});
  • 26. 26 Don’t forget! CALL gds.graph.drop('got-interactions-12'); CALL gds.graph.drop('got-interactions-12_reverse');
  • 27. 1. Use the Native shorthand syntax to load a graph with the following specifications: • Name: Season1Interactions • Node Label: Character • Relationship type: INTERACTS_SEASON1 2. Edit exercise 1 by renaming the graph to Season1InteractionsWeighted and adding the relationship property weight. 27 Native Projection Exercise
  • 29. 29 What if I want to project a different structure?
  • 30. Native graphs are fast, but Cypher graphs are flexible -- use them if: - you’re in the experimentation phase and trying to decide on a data model - performance and repeatability aren’t too important Syntax overview: 30 Cypher Projections CALL gds.graph.create.cypher( graphName: STRING, nodeQuery: STRING, relationshipQuery: STRING, configuration: MAP ) CALL gds.graph.create.cypher( 'my-graph', 'MATCH (p:Person) RETURN id(p) as id', 'MATCH (p1:Person)-[:LIVES]->(:Place)<-[:Lives]-(p2:Person) RETURN id(p1) as source, id(p2) as target' );
  • 31. Native graphs are fast, but Cypher graphs are flexible -- use them if: - you’re in the experimentation phase and trying to decide on a data model - performance and repeatability aren’t too important Syntax overview: 31 Cypher Projections CALL gds.graph.create.cypher( 'my-graph', 'MATCH (p:Person) RETURN id(p) as id', 'MATCH (p1:Person)-[:LIVES]->(:Place)<-[:Lives]-(p2:Person) RETURN id(p1) as source, id(p2) as target' );
  • 32. The node query defines the population of nodes to consider, the relationship query joins them up Requirements: - Node query must return a column called id that uniquely identifies a node - Relationship query must return source and target columns with unique node identifiers32 Cypher Projections CALL gds.graph.create.cypher( 'my-graph', 'MATCH (p:Person) RETURN id(p) as id', 'MATCH (p1:Person)-[:LIVES]->(:Place)<-[:Lives]-(p2:Person) RETURN id(p1) as source, id(p2) as target' );
  • 33. Node and relationship properties are specified in the queries Multiple relationships are identified in the relationship query Directionality is interpreted from the ordering of source, target 33 Cypher Projections CALL gds.graph.create.cypher( 'my-graph’, 'MATCH (p:Person) RETURN id(p) as id, p.community as community’, 'MATCH (p1:Person)-[r:WORKS_WITH|FRIEND_OF]->(p2:Person) RETURN id(p1) as source, id(p2) as target, type(r) AS type, count(r) as weight' );
  • 34. The easiest way to aggregate relationships is in the relationship query Cypher graphs also support aggregation functions like Native graphs: 34 Cypher Projections: relationship aggregation CALL gds.graph.create.cypher( graphName: STRING, nodeQuery: STRING, relationshipQuery: STRING, configuration: MAP );
  • 35. 35 Cypher Projections: relationship aggregation CALL gds.graph.create.cypher( 'my-graph', 'MATCH (p:Person) RETURN id(p) as id', 'MATCH (p1:Person)-[r:WORKS_WITH|FRIEND_OF]->(p2:Person) RETURN id(p1) as source, id(p2) as target, r.duration as time', { relationshipProperties: { time_known: { property: 'time', aggregation: 'SUM' defaultValue: 0.0 } } } ); Why bother? This lets you bypass using the Cypher engine to do the aggregation -- it’s much more performant on big data sets!
  • 36. CALL gds.graph.create.cypher( 'same-house-graph', 'MATCH (n:Person) RETURN id(n) AS id', 'MATCH (p1:Person)-[:BELONGS_TO]-(:House)-[:BELONGS_TO]-(p2:Person ) RETURN id(p1) AS source, id(p2) AS target' ); 36 Let’s try it out! How many nodes are in your projected graph? How many labels? CALL gds.graph.drop('got-interactions-cypher');
  • 37. 37 Try it yourself! Exercise: Can you modify the cypher projection to find people who APPEARED_IN the same Book? How many nodes and relationships are in this graph? How could you modify this query to add a weight property with the number of books people APPEARED_IN together? `Match (p1:Person)-(:APPEARED_IN)->(b:Book)<-(:APPEARED_IN)-(p2:Person) RETURN id(p1) AS source, id(p2) AS target, count(distinct b) AS weight`
  • 38. • Use a Cypher projection to load a graph with the following specifications: • Name: InteractsWeighted • Node labels: Character • Relationships: All • Relationship property: weight • Relationship aggregation: SUM • Relationship property default value: 0.0 38 Cypher Projection Exercise
  • 40. Every Named Graph is stored in memory (the heap) Management procedures: - gds.graph.list: for listing named graphs - gds.graph.exists: check is the graph exists - gds.graph.drop: remove named graph Graph access: - all graphs are user specific -- another user can’t drop or use my graphs 40 Graph Management
  • 41. • Use graph management procedures to remove the graph InteractsWeighted from the Graph Catalog. • Solution: 41 Graph Management Exercise
  • 42. 42 Flexibility vs. Performance Trade Offs 1. 2.3. 4.
  • 43. 43 Mutating the in-memory graph Mutate is similar to write but instead adds a new property to the in-memory graph, specified by the mutateProperty parameter (note: this must be a new property) CALL gds[.<tier>].<algorithm>.mutate( graphName: STRING, configuration: MAP ); CALL gds.pageRank.mutate( graphName: 'my-graph', {mutateProperty:'pageRank'});
  • 44. After you’ve finished your workflow, you can write your results back to Neo4j: 44 Writing your results back to Neo4j CALL gds.graph.writeNodeProperties( graphName, [<node_properties>]) CALL gds.graph.writeRelationship( graphName, <RELATIONSHIP>, [<relationship_property>] )
  • 45. After you’ve finished your workflow, you can write your results back to Neo4j: Note: you can also export your in-memory graph with gds.beta.graph.export('graph-name', {storeDir:'/some/dir',dbName:'persisted-graph'})45 Writing your results back to Neo4j CALL gds.graph.writeNodeProperties( 'my-graph', ['componentId', 'pageRank', 'communityId']); CALL gds.graph.writeRelationship( 'my-graph','SIMILAR_TO', 'similarity_score' );
  • 46. Writing back to Neo4j is often the slowest step: - When chaining algorithms, skip writing back the results you don’t need - Write optimization for writing multiple properties more efficiently Example: Similarity + Louvain - Run node similarity, update the in-memory graph with SIMILAR_TO - Run Louvain on SIMILAR_TO relationship - Only write back Louvain communities 47 Why bother?
  • 47. 48 Next: Graph Algorithms ● What are graph algorithms and how can we use them? ● Neo4j GDS Procedures and Functions Overview ● Algorithms support tiers and execution modes ● Deep Dive on Graph Algorithms Using Game of Thrones Dataset ○ Community Detection ○ Similarity ○ Centrality ○ Path Finding & Search ○ Link Prediction ● Best Practices Using Neo4j GDS