APOC has become the de facto standard utility library for Neo4j. In this talk, I will demonstrate some of the lesser-known but very useful components of APOC that will save you a lot of work. You will also learn how to combine individual functions into powerful constructs to achieve impressive feats.
This will be a fast-paced demo/live-coding talk.
Video: https://neo4j.com/graphconnect-2018/session/neo4j-utility-library-apoc-pearls
Unicorn images by TeeTurtle.com (Unstable Unicorns is a fun game & cool t-shirts)
4. Extending Neo4j
[Diagram: Applications, Bolt, Neo4j Execution Engine, User Defined Procedure]
User Defined Procedures let you write custom code that is:
• Written in any JVM language
• Deployed to the Database
• Accessed by applications via Cypher
5. APOC History
• My Unicorn Moment
• Neo4j 3.0 was about to have User Defined Procedures
• Add the missing utilities
• Grew quickly: 50, then 150, then 450 procedures and functions
• Active OSS project
• Many contributors
41. Graph Grouping
MATCH (p:Person) SET p.decade = p.born / 10;
MATCH (p1:Person)-->()<--(p2:Person)
WITH p1, p2, count(*) AS c
MERGE (p1)-[r:INTERACTED]-(p2)
ON CREATE SET r.count = c;
CALL apoc.nodes.group(['Person'],['decade'])
YIELD node, relationship RETURN *;
52. Expand Operations
Customized path expansion from start node(s)
• Min/max traversals
• Limit number of results
• Optional (no rows removed if no results)
• Choice of BFS/DFS expansion
• Custom uniqueness (restrictions on visitations of nodes/rels)
• Relationship and label filtering
• Supports repeating sequences
53. Expand Operations
apoc.path.expand(startNode(s), relationshipFilter, labelFilter, minLevel, maxLevel) YIELD path
• The original, when you don’t need much customization
apoc.path.expandConfig(startNode(s), configMap) YIELD path
• Most flexible, rich configuration map
apoc.path.subgraphNodes(startNode(s), configMap) YIELD node
• Only distinct nodes, don't care about paths
apoc.path.spanningTree(startNode(s), configMap) YIELD path
• Only one distinct path to each node
apoc.path.subgraphAll(startNode(s), configMap) YIELD nodes, relationships
• Only (collected) distinct nodes (and all rels between them)
55. Relationship Filter
• '<ACTED_IN' - Incoming Rel
• 'DIRECTED>' - Outgoing Rel
• 'REVIEWED' - Any direction
• '<ACTED_IN | DIRECTED> | REVIEWED' - Multiple, in varied directions
• You can't do that with Cypher: a pattern allows several types but only one direction
-[:ACTED_IN|DIRECTED|REVIEWED]->
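To make the notation concrete, here is a minimal Java sketch of how such a filter string could be split into (type, direction) pairs. It is an illustration only, not APOC's actual parser; the class RelFilterSketch, its Entry type and the parse method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the relationship filter notation; not APOC code.
class RelFilterSketch {
    enum Direction { INCOMING, OUTGOING, BOTH }

    static final class Entry {
        final String type;
        final Direction direction;

        Entry(String type, Direction direction) {
            this.type = type;
            this.direction = direction;
        }

        @Override
        public String toString() {
            return type + ":" + direction;
        }
    }

    // Splits on '|' and reads a leading '<' or trailing '>' as the direction marker.
    static List<Entry> parse(String filter) {
        List<Entry> entries = new ArrayList<>();
        for (String raw : filter.split("\\|")) {
            String token = raw.trim();
            if (token.isEmpty()) {
                continue; // ignore empty segments such as a trailing '|'
            }
            if (token.startsWith("<")) {
                entries.add(new Entry(token.substring(1), Direction.INCOMING));
            } else if (token.endsWith(">")) {
                entries.add(new Entry(token.substring(0, token.length() - 1), Direction.OUTGOING));
            } else {
                entries.add(new Entry(token, Direction.BOTH));
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // prints [ACTED_IN:INCOMING, DIRECTED:OUTGOING, REVIEWED:BOTH]
        System.out.println(parse("<ACTED_IN | DIRECTED> | REVIEWED"));
    }
}
```

Each entry carries its own direction, which is exactly what a single Cypher pattern cannot express.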
56. Label Filter
What is/isn't allowed during expansion, and what is/isn't returned
• '-Director' – Blacklist, not allowed in path
• '+Person' – Whitelist, only these allowed in path (no whitelist = all allowed)
• '>Reviewer' – End node, only return these, and continue expansion
• '/Actor:Producer' – Terminator node, only return these, stop expansion
• 'Person|Movie|-Director|>Reviewer|/Actor:Producer' – Combine them
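The prefix symbols can be read off mechanically. The following Java sketch classifies each entry of a combined label filter by its prefix. It is only an illustration under two assumptions: an entry without a prefix is treated as a whitelist entry, and compound labels such as Actor:Producer are kept as one string. The class LabelFilterSketch and its methods are hypothetical and not part of APOC.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the label filter prefixes; not APOC code.
class LabelFilterSketch {
    enum Kind { WHITELIST, BLACKLIST, END_NODE, TERMINATOR }

    // Maps a single non-empty entry such as "-Director" to "Director=BLACKLIST".
    static String classify(String token) {
        switch (token.charAt(0)) {
            case '-': return token.substring(1) + "=" + Kind.BLACKLIST;
            case '+': return token.substring(1) + "=" + Kind.WHITELIST;
            case '>': return token.substring(1) + "=" + Kind.END_NODE;
            case '/': return token.substring(1) + "=" + Kind.TERMINATOR;
            default:  return token + "=" + Kind.WHITELIST; // assumption: no prefix means whitelist
        }
    }

    // Splits the filter on '|' and classifies every non-empty entry.
    static List<String> parse(String filter) {
        List<String> result = new ArrayList<>();
        for (String raw : filter.split("\\|")) {
            String token = raw.trim();
            if (!token.isEmpty()) {
                result.add(classify(token));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("Person|Movie|-Director|>Reviewer|/Actor:Producer"));
    }
}
```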
57. Sequences
Repeating sequences of relationships, labels, or both.
Uses labelFilter and relationshipFilter, just add commas
Or use sequence for both together
labelFilter:'Post | -Blocked, Reply, >Admin'
relationshipFilter:'NEXT>,<FROM,POSTED>|REPLIED>'
sequence:'Post |-Blocked, NEXT>, Reply, <FROM, >Admin, POSTED>| REPLIED>'
58. End nodes / Terminator nodes
What if we already have the nodes that should end the expansion?
endNodes – like filter, but takes a collection of nodes (or ids)
terminatorNodes – like filter (stop expand), but also takes a collection
(whitelistNodes and blacklistNodes too!)
Can be used with labelFilter or sequence, but continue or include must be unanimous
65. Turn JSON List into Cypher List
with "[1,2,3]" as str
with split(substring(str,1, length(str)-2),",") as numbers
return [x IN numbers| toInteger(x)]
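For comparison, the same three steps (strip the brackets, split on commas, convert each piece) can be sketched in plain Java. This is only an illustration: the class JsonListSketch and the method parseNumberList are hypothetical names, and the sketch handles flat lists of integers only. For real JSON, APOC offers apoc.convert.fromJsonList.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration mirroring the Cypher string manipulation above.
class JsonListSketch {
    static List<Long> parseNumberList(String str) {
        // Drop the leading '[' and the trailing ']'
        String inner = str.substring(1, str.length() - 1);
        List<Long> numbers = new ArrayList<>();
        if (inner.trim().isEmpty()) {
            return numbers; // "[]" yields an empty list
        }
        // Split on commas and convert each piece to an integer value
        for (String part : inner.split(",")) {
            numbers.add(Long.parseLong(part.trim()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        // prints [1, 2, 3]
        System.out.println(parseNumberList("[1,2,3]"));
    }
}
```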
72. Gephi Integration
match path = (:Person)-[:ACTED_IN]->(:Movie)
WITH path LIMIT 1000
with collect(path) as paths
call apoc.gephi.add(null,'workspace0', paths) yield nodes,
relationships, time
return nodes, relationships, time
Sends the paths incrementally to Gephi; needs the Gephi Streaming extension
102. Procedures / Functions from Cypher
CALL apoc.custom.asProcedure('answer','RETURN 42 as answer');
CALL custom.answer();
Also works with parameters and return-column declarations
CALL apoc.custom.asFunction('answer','RETURN $input','long',
[['input','number']]);
RETURN custom.answer(42) as answer;
103. Neo4j Developer Surface
[Diagram: Native Language Drivers, Bolt, User Defined Procedure]
2000-2010 0.x Embedded Java API
2010-2014 1.x REST
2014-2015 2.x Cypher over HTTP
2016 3.0.x Bolt, Official Language Drivers, User Defined Procedures
2016 3.1.x User Defined Functions
2017 3.2.x User Defined Aggregation Functions
110. Build a procedure or function you'd like
Start with the template repo:
github.com/neo4j-examples/neo4j-procedure-template
111. User Defined Procedures
User-defined procedures are
● @Procedure-annotated, named Java methods
○ default name: package + method
● take @Name'ed parameters (default values since 3.1)
● return a Stream of value objects
● fields are turned into columns
● can use @Context-injected GraphDatabaseService etc.
● run within a transaction
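112.
import java.util.stream.Stream;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.procedure.Context;
import org.neo4j.procedure.Mode;
import org.neo4j.procedure.Name;
import org.neo4j.procedure.Procedure;

public class FullTextIndex {
    // Injected by Neo4j when the procedure is invoked
    @Context
    public GraphDatabaseService db;

    // Callable from Cypher as: CALL example.search(index, query)
    @Procedure( name = "example.search", mode = Mode.READ )
    public Stream<SearchHit> search( @Name("index") String index,
                                     @Name("query") String query ) {
        // Uses the legacy index API of Neo4j 3.x
        if( !db.index().existsForNodes( index )) {
            return Stream.empty();
        }
        return db.index().forNodes( index ).query( query ).stream()
                 .map( SearchHit::new );
    }

    // Each public field becomes a result column
    public static class SearchHit {
        public final Node node;
        SearchHit(Node node) { this.node = node; }
    }
}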
113.
// Needs org.neo4j.driver.v1.* and a static import of java.util.Collections.singletonMap
try ( Driver driver = GraphDatabase.driver( "bolt://localhost",
        Config.build().toConfig() ) ) {
    try ( Session session = driver.session() ) {
        String call = "CALL example.search('User',$query)";
        Map<String,Object> params = singletonMap( "query", "name:Brook*" );
        StatementResult result = session.run( call, params );
        while ( result.hasNext() ) {
            Record record = result.next(); // advance the cursor
            // process results
        }
    }
}
Deploy & Register in Neo4j Server via neo4j-harness
Call & test via neo4j-java-driver
114. Deploying User Defined Procedures
● Build or download (shadow) jar
● Drop jar-file into $NEO4J_HOME/plugins
● Restart server
● Procedure should be available
● Otherwise check neo4j.log / debug.log
124. Aggregation Functions in APOC
• more efficient variants of collect(x)[a..b]
• apoc.agg.nth, apoc.agg.first, apoc.agg.last, apoc.agg.slice
• apoc.agg.median(x)
• apoc.agg.percentiles(x,[0.5,0.9])
• apoc.agg.product(x)
• apoc.agg.statistics() provides full numeric statistics