APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library
Michael Hunger
Developer Relations Engineering, Neo4j
APOC Unicorns
All Images by
& Unstable Unicorns
Power Up
Extending Neo4j
Neo4j Execution Engine
User Defined Procedure
Applications
Bolt
User Defined Procedures let you write
custom code that is:
• Written in any JVM language
• Deployed to the Database
• Accessed by applications via Cypher
APOC History
• My Unicorn Moment
• 3.0 was about to have
User Defined Procedures
• Add the missing utilities
• Grew quickly 50 - 150 - 450
• Active OSS project
• Many contributors
Available On
• Neo4j Sandbox
• Neo4j Desktop
• Neo4j Cloud
What's in the Box?
• Utilities & Converters
• Data Integration
• Import / Export
• Graph Generation / Refactoring
• Transactions / Jobs / TTL
Where can I learn more?
• Videos
• Documentation
• Browser Guide
• APOC Training
• Neo4j Community Forum
If you learn one thing: apoc.help("keyword")
Video Series
Youtube Playlist
APOC Docs
• installation instructions
• videos
• searchable overview table
• detailed explanation
• examples
Browser Guide
:play apoc
• live examples
The Pearls - That give you Superpowers
Data Integration
• Relational / Cassandra
• MongoDB, Couchbase, ElasticSearch
• JSON, XML, CSV, XLS
• Cypher, GraphML
• ...
apoc.load.json
• load json from web-apis and files
• JSON Path
• streaming JSON
• compressed data
WITH "" AS url
CALL apoc.load.json(url) YIELD value
UNWIND value.items AS q
MERGE (question:Question {id:q.question_id})
ON CREATE SET question.title = q.title,
question.share_link = q.share_link,
question.favorite_count = q.favorite_count
MERGE (owner:User {id:q.owner.user_id})
ON CREATE SET owner.display_name = q.owner.display_name
MERGE (owner)-[:ASKED]->(question)
FOREACH (tagName IN q.tags |
MERGE (tag:Tag {name:tagName}) MERGE (question)-[:TAGGED]->(tag))
Run large scale updates
CALL apoc.periodic.iterate(
'MATCH (n:Person) RETURN n',
'SET n.name = n.firstName + " " + n.lastName',
{batchSize:10000, parallel:true})
Run large scale updates
CALL apoc.periodic.iterate(
'LOAD CSV … AS row',
'MERGE (n:Node {id:row.id})
SET n.name = row.name',
{batchSize:10000, concurrency:10})
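The batching idea behind the two calls above can be sketched in plain Java. This is only an illustration of splitting a stream of rows into batches and handling each batch as one unit of work; it is not APOC's implementation, and the class and method names (`BatchSketch`, `partition`, `iterate`) are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Plain-Java sketch of batch-wise processing; all names here are hypothetical.
class BatchSketch {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    // Applies the action once per batch and returns the number of batches processed.
    static <T> int iterate(List<T> items, int batchSize, Consumer<List<T>> action) {
        List<List<T>> batches = partition(items, batchSize);
        for (List<T> batch : batches) {
            action.accept(batch); // stands in for "run the update statement and commit"
        }
        return batches.size();
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            rows.add(i);
        }
        int count = iterate(rows, 10, batch -> System.out.println("committing " + batch.size() + " rows"));
        System.out.println(count + " batches");
    }
}
```

Smaller batches keep each unit of work short; larger batches reduce per-batch overhead. The right size depends on the data and the update statement.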
Text Functions - apoc.text.*
indexOf, indexesOf
split, replace, regexpGroups
capitalize, decapitalize
random, lpad, rpad
snakeCase, camelCase, upperCase
charAt, hexCode
base64, md5, sha1,
Collection Functions - apoc.coll.*
sum, avg, min,max,stdev,
zip, partition, pairs
sort, toSet, contains, split
indexOf, .different
occurrences, frequencies, flatten
disjunct, subtract, union, …
set, insert, remove
Map Functions - apoc.map.*
• .fromNodes, .fromPairs,
.fromLists, .fromValues
• .merge
• .setKey, .removeKey
• .clean(map,[keys],[values])
• .groupBy
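JSON - apoc.convert.*
.toJson([1,2,3])
.fromJsonList('[1,2,3]')
.fromJsonMap( '{"a":42,"b":"foo","c":[1,2,3]}')
.toTree([paths],[lowerCaseRels=true])
.getJsonProperty(node,key)
.setJsonProperty(node,key,complexValue)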
Graph Refactoring - apoc.refactor.*
• .cloneNodes
• .mergeNodes
• .extractNode
• .collapseNode
• .categorize
Relationship Modifications
• .to(rel, endNode)
• .from(rel, startNode)
• .invert(rel)
• .setType(rel, 'NEW-TYPE')
apoc.refactor.mergeNodes
MATCH (n:Person)
WITH n.email AS email, collect(n) as people
WHERE size(people) > 1
CALL apoc.refactor.mergeNodes(people)
YIELD node
RETURN node
apoc.create.addLabels
MATCH (n:Movie)
CALL apoc.create.addLabels( id(n), [ n.genre ] ) YIELD node
REMOVE node.genre
RETURN node
Triggers
CALL apoc.trigger.add(
name, statement,{phase:before/after})
• pause/resume/list/remove
• Transaction-Event-Handler calls cypher statement
• parameters: createdNodes, assignedNodeProperties, deletedNodes,...
• utility functions to extract entities/properties from update-records
• stores in graph properties
Time to Live
enable in config: apoc.ttl.enabled=true
Label :TTL
apoc.date.expire(node, time, unit)
Creates Index on :TTL(ttl)
Time To Live TTL
background job (every 60s - configurable)
that runs:
MATCH (n:TTL)
WHERE n.ttl < timestamp()
WITH n LIMIT 1000
DETACH DELETE n
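The selection done by the background job can be sketched in plain Java: a node counts as expired once its ttl timestamp lies before the current time, and each run handles at most a fixed number of nodes. This is a sketch under those assumptions, not APOC's code; `TtlSketch` and `expired` are hypothetical names, and ttl values are taken to be epoch milliseconds as returned by timestamp().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the expiry check; names are hypothetical.
class TtlSketch {

    // Returns up to `limit` ttl values that are already in the past relative to `now`.
    static List<Long> expired(List<Long> ttls, long now, int limit) {
        List<Long> result = new ArrayList<>();
        for (long ttl : ttls) {
            if (result.size() >= limit) {
                break; // per-run cap, similar in spirit to LIMIT 1000
            }
            if (ttl < now) {
                result.add(ttl); // expiry time has passed, so this entry would be deleted
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        List<Long> ttls = Arrays.asList(now - 5_000L, now + 60_000L, now - 1L);
        System.out.println(expired(ttls, now, 1000).size() + " of " + ttls.size() + " entries are expired");
    }
}
```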
Aggregation Function - apoc.agg.*
• more efficient variants of collect(x)[a..b]
• .nth,.first,.last,.slice
• .median(x)
• .percentiles(x,[0.5,0.9])
• .product(x)
• .statistics() provides a full
numeric statistic
Graph Grouping
MATCH (p:Person) SET p.decade = p.born / 10;
MATCH (p1:Person)-->()<--(p2:Person)
WITH p1,p2,count(*) as c
MERGE (p1)-[r:INTERACTED]-(p2)
ON CREATE SET r.count = c;
CALL apoc.nodes.group(['Person'],['decade'])
YIELD node, relationship RETURN *;
Custom Procedures (WIP)
apoc.custom.asProcedure/asFunction
(name, statement, columns, params)
• Register statements as real procedures & functions
• 'custom' namespace prefix
• Pass parameters, configure result columns
• Stored in graph and distributed across cluster
call apoc.custom.asProcedure('neighbours',
'MATCH (n:Person {name:$name})-->(nb)
RETURN nb AS neighbour',
[['neighbour','NODE']],[['name','STRING']]);
call custom.neighbours('Joe') YIELD neighbour;
Report Issues
Ask Questions
APOC on GitHub
Join the
Any Questions?
gets a box!
Expand Operation
Expand Operations
Customized path expansion from start node(s)
• Min/max traversals
• Limit number of results
• Optional (no rows removed if no results)
• Choice of BFS/DFS expansion
• Custom uniqueness (restrictions on visitations of nodes/rels)
• Relationship and label filtering
• Supports repeating sequences
Expand Operations
apoc.path.expand(startNode(s), relationshipFilter, labelFilter, minLevel, maxLevel) YIELD path
• The original, when you don’t need much customization
apoc.path.expandConfig(startNode(s), configMap) YIELD path
• Most flexible, rich configuration map
apoc.path.subgraphNodes(startNode(s), configMap) YIELD node
• Only distinct nodes, don't care about paths
apoc.path.spanningTree(startNode(s), configMap) YIELD path
• Only one distinct path to each node
apoc.path.subgraphAll(startNode(s), configMap) YIELD nodes, relationships
• Only (collected) distinct nodes (and all rels between them)
Config map values
• minLevel: int
• maxLevel: int
• relationshipFilter
• labelFilter
• uniqueness: ('RELATIONSHIP_PATH', 'NODE_GLOBAL', 'NODE_PATH', etc)
• bfs: boolean
• filterStartNode: boolean
• limit: int
• optional: boolean
• endNodes: [nodes]
• terminatorNodes: [nodes]
• sequence
• beginSequenceAtStart: boolean
Relationship Filter
• '<ACTED_IN' - Incoming Rel
• 'DIRECTED>' - Outgoing Rel
• 'REVIEWED' - Any direction
• '<ACTED_IN | DIRECTED> | REVIEWED' - Multiple, in varied directions
• You can't do that with Cypher: -[:ACTED_IN|DIRECTED|REVIEWED]->
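How a relationship filter string breaks down into type and direction pairs can be shown with a small plain-Java parser. This is a sketch of the notation described above ('<' prefix for incoming, '>' suffix for outgoing, '|' between alternatives), not APOC's own parser; `RelFilterSketch` and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of reading a relationship filter; names are hypothetical.
class RelFilterSketch {

    enum Direction { INCOMING, OUTGOING, BOTH }

    static final class TypeAndDirection {
        final String type;
        final Direction direction;

        TypeAndDirection(String type, Direction direction) {
            this.type = type;
            this.direction = direction;
        }

        @Override
        public String toString() {
            return type + ":" + direction;
        }
    }

    // Splits on '|' and derives the direction from a leading '<' or a trailing '>'.
    static List<TypeAndDirection> parse(String filter) {
        List<TypeAndDirection> result = new ArrayList<>();
        for (String part : filter.split("\\|")) {
            String entry = part.trim();
            if (entry.isEmpty()) {
                continue;
            }
            if (entry.startsWith("<")) {
                result.add(new TypeAndDirection(entry.substring(1), Direction.INCOMING));
            } else if (entry.endsWith(">")) {
                result.add(new TypeAndDirection(entry.substring(0, entry.length() - 1), Direction.OUTGOING));
            } else {
                result.add(new TypeAndDirection(entry, Direction.BOTH));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("<ACTED_IN | DIRECTED> | REVIEWED"));
    }
}
```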
Label Filter
What is/isn't allowed during expansion, and what is/isn't returned
• '-Director' – Blacklist, not allowed in path
• '+Person' – Whitelist, only allowed in path (no whitelist = all allowed)
• '>Reviewer' – End node, only return these, and continue expansion
• '/Actor:Producer' – Terminator node, only return these, stop expansion
'Person|Movie|-Director|>Reviewer|/Actor:Producer' – Combine them
Sequences
Repeating sequences of relationships, labels, or both.
Uses labelFilter and relationshipFilter, just add commas
Or use sequence for both together
labelFilter:'Post | -Blocked, Reply, >Admin'
relationshipFilter:'NEXT>,<FROM,POSTED>|REPLIED>'
sequence:'Post |-Blocked, NEXT>, Reply, <FROM, >Admin, POSTED>| REPLIED>'
End nodes / Terminator nodes
What if we already have the nodes that should end the expansion?
endNodes – like filter, but takes a collection of nodes (or ids)
terminatorNodes – like filter (stop expand), but also takes a collection
(whitelistNodes and blacklistNodes too! )
Can be used with labelFilter or sequence, but continue or include must be unanimous
Bolt Connector
Bolt Connector
CALL apoc.bolt.execute(url, statement, params, config) YIELD row
CALL apoc.bolt.load(url, statement, params, config) YIELD row
call apoc.bolt.load("bolt://user:password@localhost:7687","
match(p:Person {name:$name}) return p", {name:'Michael'})
supports bolt connector parameters
returns: scalars, Maps (row), virtual nodes,rels,paths
Connect to Community
and load all Meetup Group
Conversion Functions
Turn "[1,2,3]" into a Cypher List in plain Cypher
Turn JSON List into Cypher List
with "[1,2,3]" as str
with split(substring(str,1, length(str)-2),",") as numbers
return [x IN numbers| toInteger(x)]
JSON Conversion Functions
Conversion Functions
Gephi Integration
Gephi Integration
match path = (:Person)-[:ACTED_IN]->(:Movie)
WITH path LIMIT 1000
with collect(path) as paths
call apoc.gephi.add(null,'workspace0', paths) yield nodes,
relationships, time
return nodes, relationships, time
incremental send to Gephi, needs Gephi Streaming extension
Graph Refactorings
Refactor the movie
Cypher Execution
apoc.cypher.run(fragment, params)
apoc.cypher.doIt(fragment, params)
apoc.cypher.runTimeboxed
apoc.cypher.runFile(file or url,{config})
apoc.cypher.runSchemaFile(file or url,{config})
apoc.cypher.runMany('cypher;\nstatements;',{params},{config})
apoc.cypher.mapParallel(fragment, params, list-to-parallelize)
Check out the other periodic procs
Try the apoc.periodic.iterate example
Graph Grouping
Warmup
• load page-cache
• page-skipping
• new implementation based on PageCache.*
• nodes + rels + rel-groups
• properties
• string / array properties
• index pages
Monitoring
• apoc.monitor.ids
• apoc.monitor.kernel
• apoc.monitor.tx
• apoc.monitor.locks(minWaitTime long)
Conditional Cypher
Conditional Cypher Execution
CALL apoc.[do.]when(condition, ifQuery, elseQuery, params)
CALL apoc.[do.]case([condition, query, condition, query, …​],
elseQuery, params)
Graph Generation
• apoc.generate.er(noNodes, noEdges, 'label', 'type')
Erdos-Renyi model (uniform)
• apoc.generate.ws(noNodes, degree, beta, 'label', 'type')
Watts-Strogatz model (clusters)
• apoc.generate.ba(noNodes, edgesPerNode, 'label', 'type')
Barabasi-Albert model (preferential attachment)
• apoc.generate.complete(noNodes, 'label', 'type')
• apoc.generate.simple([degrees], 'label', 'type')
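As an illustration of the uniform (Erdos-Renyi) model mentioned above, here is a plain-Java sketch that draws a fixed number of distinct random edges between a fixed number of nodes. It shows only the idea; the internals of APOC's generators are not reproduced here. `ErdosRenyiSketch` and `generate` are hypothetical names, and treating edges as directed with no self-loops is an assumption made for the example.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Plain-Java sketch of a uniform random graph with a fixed edge count; names are hypothetical.
class ErdosRenyiSketch {

    // Returns noEdges distinct directed edges, encoded as "from->to", without self-loops.
    static Set<String> generate(int noNodes, int noEdges, long seed) {
        long maxEdges = (long) noNodes * (noNodes - 1);
        if (noNodes < 2 || noEdges < 0 || noEdges > maxEdges) {
            throw new IllegalArgumentException("cannot place " + noEdges + " edges on " + noNodes + " nodes");
        }
        Random random = new Random(seed);
        Set<String> edges = new HashSet<>();
        while (edges.size() < noEdges) {
            int from = random.nextInt(noNodes);
            int to = random.nextInt(noNodes);
            if (from != to) {
                edges.add(from + "->" + to); // the set silently drops repeated pairs
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        Set<String> edges = generate(10, 20, 42L);
        System.out.println(edges.size() + " edges generated");
    }
}
```

Rejecting duplicates and self-loops and redrawing is simple, but it slows down when the requested edge count approaches the maximum possible; for dense graphs, sampling from the full list of candidate edges would be the better choice.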
call apoc.lock.nodes([nodes])
call apoc.lock.rels([relationships])
call apoc.lock.all([nodes],[relationships])
Export
apoc.export.csv .all / .data / .query
apoc.export.cypher
apoc.export.graphml
leaving off the filename streams the Cypher to the client
Data Creation
Data Creation
CALL apoc.create.node(['Label'], {key:value,…​})
CALL apoc.create.nodes(['Label'], [{key:value,…​}])
CALL apoc.create.addLabels, .removeLabels
CALL apoc.create.setProperty
CALL apoc.create.setProperties
CALL apoc.create.relationship(from,'TYPE',{key:value,…​}, to)
CALL apoc.nodes.link([nodes],'REL_TYPE')
Virtual Entities
Virtual Entities
Function AND Procedure
apoc.create.vNode(['Label'], {key:value,…​}) YIELD node
apoc.create.vRelationship(from,TYPE,{key:value,…​}, to)
apoc.create.vPattern({_labels:[Label],key:value},'TYPE',
{key:value,…​}, {_labels:['LabelB'],key:value})
Try apoc.date.* with datetime()
text, coll, map, convert funcs
And many more!
Latest Releases
Summer Release (Aug 8)
Spring Release (May 16)
Winter Release (Feb 23)
Aggregation Functions
Latest Additions
• apoc.diff graph
• new text similarity functions
• CSV loader based on neo4j-import format
• apoc.load.xls
• Accessor functions for (virtual) entities
• S3 Support
• HDFS Support
• apoc.index.addNodeMap
• apoc.path.create
• apoc.path.slice
• apoc.path.combine
• apoc.text.code(codepoint)
• stream apoc.export.cypher
• apoc.coll.combinations(), apoc.coll.frequencies()
Which of these are you
interested in?
Ask / Try
Procedures / Functions from Cypher
CALL apoc.custom.asProcedure('answer','RETURN 42 as answer');
CALL custom.answer();
works also with parameters, and return columns declarations
CALL apoc.custom.asFunction('answer','RETURN $input','long',
[['input','number']]);
RETURN custom.answer(42) as answer;
Neo4j Developer Surface
Native Language Drivers
BOLT User Defined Procedure
2000-2010 0.x Embedded Java API
2010-2014 1.x REST
2014-2015 2.x Cypher over HTTP
2016 3.0.x Bolt, Official Language Drivers, User Defined Procedures
2016 3.1.x User Defined Functions
2017 3.2.x User Defined Aggregation Functions
Aggregate Functions
Can be written in any JVM language
User Defined Procedures
Callable standalone and in Cypher statements
How to build them
Developer Manual
Build a procedure or function you'd like
start with the template repo
User Defined Procedures
User-defined procedures are
● @Procedure annotated, named Java Methods
○ default name: package + method
● take @Name'ed parameters (default values since 3.1)
● return a Stream of value objects
● fields are turned into columns
● can use @Context injected GraphDatabaseService etc
● run within Transaction
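The point that fields of the returned value objects become result columns can be illustrated with plain Java reflection. This is a sketch of the idea only and says nothing about how Neo4j implements the mapping internally; `ColumnsSketch`, `SearchHitLike` and `columns` are hypothetical names, and the column names are sorted here only to make the output deterministic.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java sketch: derive column names from the public fields of a result class.
class ColumnsSketch {

    // Hypothetical result record; String stands in for a graph Node.
    public static class SearchHitLike {
        public final String node;
        public final double score;

        public SearchHitLike(String node, double score) {
            this.node = node;
            this.score = score;
        }
    }

    // Collects the names of all public, non-static fields of the given class.
    static List<String> columns(Class<?> recordType) {
        List<String> names = new ArrayList<>();
        for (Field field : recordType.getFields()) { // getFields() returns public fields only
            if (!Modifier.isStatic(field.getModifiers())) {
                names.add(field.getName());
            }
        }
        Collections.sort(names); // reflection order is unspecified, so sort for stable output
        return names;
    }

    public static void main(String[] args) {
        System.out.println(columns(SearchHitLike.class));
    }
}
```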
public class FullTextIndex {
    // Injected by Neo4j when the procedure is called
    @Context
    public GraphDatabaseService db;

    @Procedure( name = "example.search", mode = Procedure.Mode.READ )
    public Stream<SearchHit> search( @Name("index") String index,
                                     @Name("query") String query ) {
        if( !db.index().existsForNodes( index )) {
            return Stream.empty();
        }
        return db.index().forNodes( index ).query( query ).stream()
                 .map( SearchHit::new );
    }

    // Public fields of the result class become the result columns
    public static class SearchHit {
        public final Node node;
        SearchHit(Node node) { this.node = node; }
    }
}
// The second driver argument (auth / config) is elided in the source
try ( Driver driver = GraphDatabase.driver( "bolt://localhost" ) ) {
    try ( Session session = driver.session() ) {
        String call = "CALL example.search('User',$query)";
        Map<String,Object> params = singletonMap( "query", "name:Brook*" );
        StatementResult result = session.run( call, params );
        while ( result.hasNext() ) {
            Record record = result.next();
            // process results
        }
    }
}
Deploy & Register in Neo4j Server via neo4j-harness
Call & test via neo4j-java-driver
Deploying User Defined Procedures
Build or download (shadow) jar
● Drop jar-file into $NEO4J_HOME/plugins
● Restart server
● Procedure should be available
● Otherwise check neo4j.log / debug.log
User Defined Functions
Usable in any Cypher
expression or lightweight
RETURN example.join(['Hello', 'World'],' ')
=> "Hello World"
public class Join {
    @UserFunction
    @Description("example.join(['s1','s2',...], delimiter) "
               + "- join the given strings with the given delimiter.")
    public String join(
            @Name("strings") List<String> strings,
            @Name(value = "delimiter", defaultValue = ",") String delimiter ) {
        // A null input yields a null result
        if ( strings == null || delimiter == null ) {
            return null;
        }
        return String.join( delimiter, strings );
    }
}
// The second driver argument (auth / config) is elided in the source
try ( Driver driver = GraphDatabase.driver( "bolt://localhost" ) ) {
    try ( Session session = driver.session() ) {
        String query = "RETURN example.join(['Hello', 'World']) AS result";
        String result = session.run( query )
                               .single().get( "result" ).asString();
    }
}
User Defined Aggregation Functions
Custom, efficient aggregations for Data Science and BI
Aggregation Function In APOC
• more efficient variants of collect(x)[a..b]
• apoc.agg.nth, apoc.agg.first, apoc.agg.last, apoc.agg.slice
• apoc.agg.median(x)
• apoc.agg.percentiles(x,[0.5,0.9])
• apoc.agg.product(x)
• apoc.agg.statistics() provides a full numeric statistic
UNWIND ['abc', 'abcd', 'ab'] AS string
RETURN example.longestString(string)
=> 'abcd'
public class LongestString {
    @UserAggregationFunction
    @Description( "aggregates the longest string found" )
    public LongStringAggregator longestString() {
        return new LongStringAggregator();
    }

    public static class LongStringAggregator {
        private int longest;
        private String longestString;

        // Called once per row in the group
        @UserAggregationUpdate
        public void findLongest( @Name( "string" ) String string ) {
            if ( string != null && string.length() > longest) {
                longest = string.length();
                longestString = string;
            }
        }

        // Called once after all rows have been seen
        @UserAggregationResult
        public String result() { return longestString; }
    }
}
// The second driver argument (auth / config) is elided in the source
try ( Driver driver = GraphDatabase.driver( "bolt://localhost" ) ) {
    try ( Session session = driver.session() ) {
        String query = "UNWIND ['abc', 'abcd', 'ab'] AS string " +
                       "RETURN example.longestString(string) AS result";
        String result = session.run( query )
                               .single().get( "result" ).asString();
    }
}
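The create / update / result lifecycle of an aggregation function can be tried out without a database. The sketch below mirrors the aggregator logic from the slide in plain Java and plays the role of the engine by hand: one aggregator instance per group, one update call per row, one result call at the end. The class name `AggregationLifecycleSketch` and the `aggregate` helper are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the aggregation lifecycle; no Neo4j classes involved.
class AggregationLifecycleSketch {

    // Same logic as the aggregator on the slide, minus the annotations.
    static class LongStringAggregator {
        private int longest;
        private String longestString;

        void findLongest(String string) {
            if (string != null && string.length() > longest) {
                longest = string.length();
                longestString = string;
            }
        }

        String result() {
            return longestString;
        }
    }

    // Plays the engine's part: create, update per row, then fetch the result.
    static String aggregate(List<String> rows) {
        LongStringAggregator aggregator = new LongStringAggregator();
        for (String row : rows) {
            aggregator.findLongest(row);
        }
        return aggregator.result();
    }

    public static void main(String[] args) {
        System.out.println(aggregate(Arrays.asList("abc", "abcd", "ab")));
    }
}
```

Because the state lives in the aggregator instance, only the current best candidate is kept in memory, rather than the whole collected list.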
One Question / Comment
from each!

APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library

  • 1. APOC Pearls Michael Hunger Developer Relations Engineering, Neo4j Follow @mesirii APOC Unicorns
  • 2. All Images by & Unstable Unicorns
  • 4. Extending Neo4j Neo4j Execution Engine User Defined Procedure Applications Bolt User Defined Procedures let you write custom code that is: • Written in any JVM language • Deployed to the Database • Accessed by applications via Cypher
  • 5. APOC History • My Unicorn Moment • 3.0 was about to have User Defined Procedures • Add the missing utilities • Grew quickly 50 - 150 - 450 • Active OSS project • Many contributors
  • 6.
  • 7. • Neo4j Sandbox • Neo4j Desktop • Neo4j Cloud Available On
  • 9. • Utilities & Converters • Data Integration • Import / Export • Graph Generation / Refactoring • Transactions / Jobs / TTL What's in the Box?
  • 10. • Videos • Documentation • Browser Guide • APOC Training • Neo4j Community Forum • Where can I learn more?
  • 11. If you learn one thing:"keyword)")
  • 13. APOC Docs • installation instructions • videos • searchable overview table • detailed explaination • examples
  • 15. The Pearls - That give you Superpowers 17
  • 17. • Relational / Cassandra • MongoDB, Couchbase, ElasticSearch • JSON, XML, CSV, XLS • Cypher, GraphML • ... Data Integration
  • 18. apoc.load.json • load json from web-apis and files • JSON Path • streaming JSON • compressed data
  • 19.
  • 20. WITH "" AS url CALL apoc.load.json(url) YIELD value UNWIND value.items AS q MERGE (question:Question {id:q.question_id}) ON CREATE SET question.title = q.title, question.share_link = q.share_link, question.favorite_count = q.favorite_count MERGE (owner:User {id:q.owner.user_id}) ON CREATE SET owner.display_name = q.owner.display_name MERGE (owner)-[:ASKED]->(question) FOREACH (tagName IN q.tags | MERGE (tag:Tag {name:tagName}) MERGE (question)-[:TAGGED]->(tag)) …
  • 22. Run large scale updates CALL apoc.periodic.iterate( 'MATCH (n:Person) RETURN n', 'SET = n.firstName + " " + n.lastName', {batchSize:10000, parallel:true})
  • 23. Run large scale updates CALL apoc.periodic.iterate( 'LOAD CSV … AS row', 'MERGE (n:Node {}) SET =', {batchSize:10000, concurrency:10})
  • 25. Text Functions - apoc.text.* indexOf, indexesOf split, replace, regexpGroups format capitalize, decapitalize random, lpad, rpad snakeCase, camelCase, upperCase charAt, hexCode base64, md5, sha1,
  • 26. Collection Functions - apoc.coll.* sum, avg, min,max,stdev, zip, partition, pairs sort, toSet, contains, split indexOf, .different occurrences, frequencies, flatten disjunct, subtract, union, … set, insert, remove
  • 27. Map Functions -* • .fromNodes, .fromPairs, .fromLists, .fromValues • .merge • .setKey,removeKey • .clean(map,[keys],[values]) • .groupBy
  • 28. JSON - apoc.convert.* .toJson([1,2,3]) .fromJsonList('[1,2,3]') .fromJsonMap( '{"a":42,"b":"foo","c":[1,2,3]}') .toTree([paths],[lowerCaseRels=true]) .getJsonProperty(node,key) .setJsonProperty(node,key,complexValue)
  • 30. • .cloneNodes • .mergeNodes • .extractNode • .collapseNode • .categorize Relationship Modifications • .to(rel, endNode) • .from(rel, startNode) • .invert(rel) • .setType(rel, 'NEW-TYPE') Aggregation Function - apoc.refactor.*
  • 31. apoc.refactor.mergeNodes MATCH (n:Person) WITH AS email, collect(n) as people WHERE size(people) > 1 CALL apoc.refactor.mergeNodes(people) YIELD node RETURN node
  • 32. apoc.create.addLabels MATCH (n:Movie) CALL apoc.create.addLabels( id(n), [ n.genre ] ) YIELD node REMOVE node.genre RETURN node
  • 34. Triggers CALL apoc.trigger.add( name, statement,{phase:before/after}) • pause/resume/list/remove • Transaction-Event-Handler calls cypher statement • parameters: createdNodes, assignedNodeProperties, deletedNodes,... • utility functions to extract entities/properties from update-records • stores in graph properties
  • 36. enable in config: apoc.ttl.enabled=true Label :TTL, time, unit) Creates Index on :TTL(ttl) Time To Live TTL
  • 37. background job (every 60s - configurable) that runs: MATCH (n:TTL) WHERE n.ttl > timestamp() WITH n LIMIT 1000 DET DELETE n Time To Live TTL
  • 39. Aggregation Function - apoc.agg.* • more efficient variants of collect(x)[a..b] • .nth,.first,.last,.slice • .median(x) • .percentiles(x,[0.5,0.9]) • .product(x) • .statistics() provides a full numeric statistic
  • 41. Graph Grouping MATCH (p:Person) set p.decade = b.born / 10; MATCH (p1:Person)-->()<--(p2:Person) WITH p1,p2,count(*) as c MERGE (p1)-[r:INTERACTED]-(p2) ON CREATE SET r.count = c CALL['Person'],['decade']) YIELD node, relationship RETURN *;
  • 43. apoc.custom.asProcedure/asFunction (name,statement, columns, params) • Register statements as real procedures & functions • 'custom' namespace prefix • Pass parameters, configure result columns • Stored in graph and distributed across cluster Custom Procedures (WIP)
  • 44. call apoc.custom.asProcedure('neighbours', 'MATCH (n:Person {name:$name})-->(nb) RETURN neighbour', [['neighbour','NODE']],[['name','STRING']]); call custom.neighbours('Joe') YIELD neighbour; Custom Procedures (WIP)
  • 52. Expand Operations Customized path expansion from start node(s) • Min/max traversals • Limit number of results • Optional (no rows removed if no results) • Choice of BFS/DFS expansion • Custom uniqueness (restrictions on visitations of nodes/rels) • Relationship and label filtering • Supports repeating sequences
  • 53. Expand Operations apoc.path.expand(startNode(s), relationshipFilter, labelFilter, minLevel, maxLevel) YIELD path • The original, when you don’t need much customization apoc.path.expandConfig(startNode(s), configMap) YIELD path • Most flexible, rich configuration map apoc.path.subgraphNodes(startNode(s), configMap) YIELD node • Only distinct nodes, don't care about paths apoc.path.spanningTree(startNode(s), configMap) YIELD path • Only one distinct path to each node apoc.path.subgraphAll(startNode(s), configMap) YIELD nodes, relationships • Only (collected) distinct nodes (and all rels between them)
  • 54. Config map values • minLevel: int • maxLevel: int • relationshipFilter • labelFilter • uniqueness: (‘RELATIONSHIP_PATH’, ’NODE_GLOBAL’, ‘NODE_PATH’, etc) • bfs: boolean, • filterStartNode: boolean • limit: int • optional: boolean • endNodes: [nodes] • terminatorNodes: [nodes] • sequence • beginSequenceAtStart: boolean
  • 55. Relationship Filter • '<ACTED_IN' - Incoming Rel • 'DIRECTED>' - Outgoing Rel • 'REVIEWED' - Any direction • '<ACTED_IN | DIRECTED> | REVIEWED' - Multiple, in varied directions • You can't do that with Cypher -[ACTED_IN|DIRECTED|REVIEWED]->
  • 56. Label Filter What is/isn't allowed during expansion, and what is/isn't returned • '-Director' – Blacklist, not allowed in path • '+Person' –Whitelist, only allowed in path (no whitelist = all allowed) • '>Reviewer' – End node, only return these, and continue expansion • '/Actor:Producer' – Terminator node, only return these, stop expansion 'Person|Movie|-Director|>Reviewer|/Actor:Producer' – Combine them
  • 57. Sequences Repeating sequences of relationships, labels, or both. Uses labelFilter and relationshipFilter, just add commas Or use sequence for both together labelFilter:'Post | -Blocked, Reply, >Admin' relationshipFilter:'NEXT>,<FROM,POSTED>|REPLIED>' sequence:'Post |-Blocked, NEXT>, Reply, <FROM, >Admin, POSTED>| REPLIED>'
  • 58. End nodes / Terminator nodes What if we already have the nodes that should end the expansion? endNodes – like filter, but takes a collection of nodes (or ids) terminatorNodes – like filter (stop expand), but also takes a collection (whitelistNodes and blacklistNodes too! ) Can be used with labelFilter or sequence, but continue or include must be unanimous
  • 59. End nodes / Terminator nodes What if we already have the nodes that should end the expansion? endNodes – like filter, but takes a collection of nodes (or ids) terminatorNodes – like filter (stop expand), but also takes a collection (whitelistNodes and blacklistNodes too! ) Can be used with labelFilter or sequence, but continue or include must be unanimous
  • 61. Bolt Connector CALL apoc.bolt.execute(url, statement, params, config) YIELD row CALL apoc.bolt.load(url, statement, params, config) YIELD row call apoc.bolt.load("bolt://user:password@localhost:7687"," match(p:Person {name:{name}}) return p", {name:'Michael'}) supports bolt connector parameters returns: scalars, Maps (row), virtual nodes,rels,paths
  • 64. Turn "[1,2,3]" into a Cypher List in plain Cypher 66
  • 65. Turn JSON List into Cypher List with "[1,2,3]" as str with split(substring(str,1, length(str)-2),",") as numbers return [x IN numbers| toInteger(x)]
  • 72. Gephi Integration match path = (:Person)-[:ACTED_IN]->(:Movie) WITH path LIMIT 1000 with collect(path) as paths call apoc.gephi.add(null,'workspace0', paths) yield nodes, relationships, time return nodes, relationships, time incremental send to Gephi, needs Gephi Streaming extension
  • 76., params) apoc.cypher.doIt(fragment, params) apoc.cypher.runTimeboxed apoc.cypher.runFile(file or url,{config}) apoc.cypher.runSchemaFile(file or url,{config}) apoc.cypher.runMany('cypher;nstatements;',{params},{config}) apoc.cypher.mapParallel(fragment, params, list-to-parallelize) Cypher Execution
  • 77. Check out the other periodic procs Try apoc.periodic.iterate example 79
  • 80. Warmup • load page-cache • page-skipping • new implementation based on PageCache.* • nodes + rels + rel-groups • properties • string / array properties • index pages
  • 82. Monitoring • apoc.monitor.ids • apoc.monitor.kernel • • apoc.monitor.tx • apoc.monitor.locks(minWaitTime long)
  • 84. Conditional Cypher Execution CALL apoc.[do.]when(condition, ifQuery, elseQuery, params) CALL apoc.[do.]case([condition, query, condition, query, …​], elseQuery, params)
  • 86. Graph Generation •, noEdges, 'label', 'type') Erdos-Renyi model (uniform) •, degree, beta, 'label', 'type') Watts-Strogatz model (clusters) •, edgesPerNode, 'label', 'type') Barabasi-Albert model (preferential attachment • apoc.generate.complete(noNodes, 'label', 'type') • apoc.generate.simple([degrees], 'label', 'type')
  • 91. Export apoc.export.csv .all / .data / .query apoc.export.cypher apoc.export.graphml leaving off filename does stream cypher to client
  • 93. Data Creation CALL apoc.create.node(['Label'], {key:value,…​}) CALL apoc.create.nodes(['Label'], [{key:value,…​}]) CALL apoc.create.addLabels, .removeLabels CALL apoc.create.setProperty CALL apoc.create.setProperties CALL apoc.create.relationship(from,'TYPE',{key:value,…​}, to) CALL[nodes],'REL_TYPE')
  • 95. Virtual Entities Function AND Procedure apoc.create.vNode(['Label'], {key:value,…​}) YIELD node apoc.create.vRelationship(from,TYPE,{key:value,…​}, to) apoc.create.vPattern({_labels:[Label],key:value},'TYPE', {key:value,…​}, {_labels:['LabelB'],key:value})
  • 96. Try* with datetime() text, coll, map, convert funcs 98
  • 98. Latest Releases Summer Release (Aug 8) Spring Release (May 16) Winter Release (Feb 23)
  • 100. Latest Additions • apoc.diff graph • new text similarity functions • CSV loader based on neo4j-import format • apoc.load.xls • • Accessor functions for (virtual) entities • S3 Support • HDFS Support • apoc.index.addNodeMap • apoc.path.create • apoc.path.slice • apoc.path.combine • apoc.text.code(codepoint) • stream apoc.export.cypher • apoc.coll.combinations(), apoc.coll.frequencies()
  • 101. TASK Which of these are you interested in? Ask / Try
  • 102. Procedures / Functions from Cypher CALL apoc.custom.asProcedure('answer','RETURN 42 as answer'); CALL custom.answer(); This also works with parameters and return-column declarations: CALL apoc.custom.asFunction('answer','RETURN $input','long', [['input','number']]); RETURN custom.answer(42) as answer;
  • 103. Neo4j Developer Surface
      Native Language Drivers, BOLT, User Defined Procedure
      2000-2010  0.x    Embedded Java API
      2010-2014  1.x    REST
      2014-2015  2.x    Cypher over HTTP
      2016       3.0.x  Bolt, Official Language Drivers, User Defined Procedures
      2016       3.1.x  User Defined Functions
      2017       3.2.x  User Defined Aggregation Functions
  • 105. Can be written in any JVM language
  • 109. How to build them Developer Manual
  • 110. Build a procedure or function you'd like; start with the template repo
  • 111. User Defined Procedures User-defined procedures are ● @Procedure annotated, named Java methods ○ default name: package + method ● take @Name'ed parameters (default values since 3.1) ● return a Stream of value objects ● fields are turned into columns ● can use @Context injected GraphDatabaseService etc. ● run within a transaction
  • 112.
      public class FullTextIndex {
          // Injected by Neo4j before the procedure is invoked
          @Context
          public GraphDatabaseService db;

          @Procedure( name = "", mode = Procedure.Mode.READ )
          public Stream<SearchHit> search( @Name("index") String index,
                                           @Name("query") String query ) {
              // An unknown index yields no rows rather than an error
              if ( !db.index().existsForNodes( index ) ) {
                  return Stream.empty();
              }
              return db.index().forNodes( index ).query( query ).stream()
                       .map( SearchHit::new );
          }

          // Result type: the public field "node" becomes the result column
          public static class SearchHit {
              public final Node node;
              SearchHit( Node node ) { this.node = node; }
          }
      }
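The `SearchHit` class above carries a single public field, which is what the caller sees as a column. The following plain-Java sketch only mimics that idea with reflection, outside any database; it is not how Neo4j maps records internally. `ColumnsSketch`, `Hit`, `search`, and `toRow` are hypothetical names, and the sample data is made up for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Stream;

final class ColumnsSketch {

    // Hypothetical result class: each public field plays the role of one column.
    public static class Hit {
        public final String name;
        public final long score;

        Hit(String name, long score) {
            this.name = name;
            this.score = score;
        }
    }

    // Stand-in for a procedure body: returns a stream of result objects.
    static Stream<Hit> search() {
        return Stream.of(new Hit("Brooke", 2L), new Hit("Brooklyn", 1L));
    }

    // Turns the public instance fields of a result object into a column -> value map.
    static Map<String, Object> toRow(Object record) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Field field : record.getClass().getFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            try {
                row.put(field.getName(), field.get(record));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        search().map(ColumnsSketch::toRow).forEach(System.out::println);
    }
}
```

Each element of the stream corresponds to one result row, and the field names double as the column names a caller would `YIELD`.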
  • 113.
      try ( Driver driver = GraphDatabase.driver( "bolt://localhost", ) ) {
          try ( Session session = driver.session() ) {
              String call = "CALL'User',$query)";
              Map<String,Object> params = singletonMap( "query", "name:Brook*" );
              StatementResult result = call, params);
              while ( result.hasNext() ) {
                  // process results
              }
          }
      }
      Deploy & Register in Neo4j Server via neo4j-harness
      Call & test via neo4j-java-driver
  • 114. Deploying User Defined Procedures Build or download (shadow) jar ● Drop jar-file into $NEO4J_HOME/plugins ● Restart server ● Procedure should be available ● Otherwise check neo4j.log / debug.log
  • 116. Usable in any Cypher expression or lightweight computation
  • 118.
      public class Join {
          @UserFunction
          @Description("example.join(['s1','s2',...], delimiter) - join the given strings with the given delimiter.")
          public String join( @Name("strings") List<String> strings,
                              @Name(value = "delimiter", defaultValue = ",") String delimiter ) {
              if ( strings == null || delimiter == null ) {
                  return null;
              }
              return String.join( delimiter, strings );
          }
      }
  • 121.
      try ( Driver driver = GraphDatabase.driver( "bolt://localhost", ) ) {
          try ( Session session = driver.session() ) {
              String query = "RETURN example.join(['Hello', 'World']) AS result";
              String result = query )
                      .single().get( "result" ).asString();
          }
      }
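The query above passes only the list, so the delimiter falls back to its declared default. A plain-Java sketch of that behavior, runnable without a database: `JoinSketch` and the one-argument overload are hypothetical stand-ins for the `defaultValue = ","` on the `@Name` annotation, not Neo4j API.

```java
import java.util.Arrays;
import java.util.List;

final class JoinSketch {

    static final String DEFAULT_DELIMITER = ",";

    // Same body as the user-defined function: null in, null out.
    static String join(List<String> strings, String delimiter) {
        if (strings == null || delimiter == null) {
            return null;
        }
        return String.join(delimiter, strings);
    }

    // Stand-in for calling the function without the optional delimiter argument.
    static String join(List<String> strings) {
        return join(strings, DEFAULT_DELIMITER);
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("Hello", "World"))); // prints Hello,World
    }
}
```

Returning null for null input, rather than throwing, keeps the function usable inside larger Cypher expressions where missing values are common.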
  • 123. Custom, efficient aggregations for Data Science and BI
  • 124. Aggregation Functions in APOC • more efficient variants of collect(x)[a..b] • apoc.agg.nth, apoc.agg.first, apoc.agg.last, apoc.agg.slice • apoc.agg.median(x) • apoc.agg.percentiles(x,[0.5,0.9]) • apoc.agg.product(x) • apoc.agg.statistics() provides full numeric statistics
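Why an aggregation can beat `collect(x)[a..b]`: collecting materializes every value before slicing, while an aggregator only needs a counter and the value it is after. A minimal plain-Java sketch of that idea follows. It is not APOC's code; the class name `NthAggregatorSketch`, the zero-based offset, and the choice to skip nulls are assumptions.

```java
import java.util.Arrays;
import java.util.List;

final class NthAggregatorSketch {

    private final int wanted; // zero-based position to keep
    private int seen;         // number of non-null values consumed so far
    private Object value;     // the only value retained

    NthAggregatorSketch(int wanted) {
        this.wanted = wanted;
    }

    // Called once per row; stores a value only when the wanted position is reached.
    void update(Object item) {
        if (item == null) {
            return; // assumption: nulls are skipped
        }
        if (seen == wanted) {
            value = item;
        }
        seen++;
    }

    // Called once at the end; null if fewer than wanted + 1 values were seen.
    Object result() {
        return value;
    }

    static Object nth(List<?> rows, int wanted) {
        NthAggregatorSketch aggregator = new NthAggregatorSketch(wanted);
        for (Object row : rows) {
            aggregator.update(row);
        }
        return aggregator.result();
    }

    public static void main(String[] args) {
        System.out.println(nth(Arrays.asList("a", "b", "c"), 1)); // prints b
    }
}
```

Memory use stays constant regardless of how many rows flow through, which is the point of the more efficient variants listed above.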
  • 125. UNWIND ['abc', 'abcd', 'ab'] AS string RETURN example.longestString(string) => 'abcd'
  • 126.
      public class LongestString {
          @UserAggregationFunction
          @Description( "aggregates the longest string found" )
          public LongStringAggregator longestString() {
              return new LongStringAggregator();
          }

          public static class LongStringAggregator {
              private int longest;
              private String longestString;

              @UserAggregationUpdate
              public void findLongest( @Name( "string" ) String string ) {
                  if ( string != null && string.length() > longest ) {
                      longest = string.length();
                      longestString = string;
                  }
              }

              @UserAggregationResult
              public String result() {
                  return longestString;
              }
          }
      }
  • 130.
      try ( Driver driver = GraphDatabase.driver( "bolt://localhost", ) ) {
          try ( Session session = driver.session() ) {
              String query = "UNWIND ['abc', 'abcd', 'ab'] AS string " +
                             "RETURN example.longestString(string) AS result";
              String result ="result").asString();
          }
      }
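The aggregator's lifecycle (one instance per group, one update call per row, one result call at the end) can be tried without a server. This plain-Java sketch copies the update logic of `LongStringAggregator` but drops the Neo4j annotations; the wrapper class `LongestStringSketch` and the `aggregate` helper are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class LongestStringSketch {

    // Same state and logic as the aggregator on the slide, minus annotations.
    static final class Aggregator {
        private int longest;
        private String longestString;

        void findLongest(String string) {
            if (string != null && string.length() > longest) {
                longest = string.length();
                longestString = string;
            }
        }

        String result() {
            return longestString;
        }
    }

    // Plays the role of the engine: feed every row, then ask for the result once.
    static String aggregate(List<String> rows) {
        Aggregator aggregator = new Aggregator();
        for (String row : rows) {
            aggregator.findLongest(row);
        }
        return aggregator.result();
    }

    public static void main(String[] args) {
        System.out.println(aggregate(Arrays.asList("abc", "abcd", "ab"))); // prints abcd
    }
}
```

Two details follow from the strict `>` comparison: on a tie the first string seen wins, and an empty string never replaces the initial null because its length is not greater than zero.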
  • 131. One Question / Comment from each!