Dr. Jim Webber presented the closing keynote at a conference. He recapped Neo4j's recent releases, including version 3.0 which delivered new graph capabilities. He discussed future hardware trends and how native graph technology provides performance advantages. He also showed how Neo4j can handle workloads at massive scales of over 1 trillion relationships and 1 million writes per second on a single machine. Finally, he outlined Neo4j's new causal clustering architecture in version 3.1 which provides security, scalability, and fault tolerance for enterprise graph applications.
8. Graph Insanity**
Dr. Jim Webber
Chief Scientist, Neo4j
** = “insanity” in this context refers to scientifically responsible jubilation
9. Overview
• A brief recap
• Future hardware trends
• Performance advantages of native graph
technology
• Looking to the future
• Drinks
10.
11. Neo4j 3.0 Recap APRIL 2016 RELEASE
Delivering New Graph Capabilities
Developers
Develop applications
faster and easier
Architects
Design bigger and
faster applications
Administrators
Deploy Neo4j
anywhere easily
Neo4j 3.0 enables and accelerates large-scale graph initiatives
Giant graphs,
fast performance
Easy full-stack
development
Cloud, container
and on-premise
12. Introducing Neo4j 3.1
New Security and Clustering Architecture
Build and deploy graph applications across
an entire enterprise
• Compliance with internal and external
enterprise Information Security needs
• Robust and flexible new clustering
architecture for diverse operational
scenarios and application needs
A foundation that enables mainstream
enterprise solutions on-premises and
in the cloud
ENTERPRISE GRAPH FOUNDATION
Operational, Analytic, and Transactional Uses
Security Clustering Operability
Enterprise
Graph Applications
The Graph Foundation for the Enterprise
20. Neo4j IBM POWER8 CAPI Flash
• Enables ultra-large in-memory graphs
• High performance, ultra-high
throughput graph processing on
56TB of near memory
• IBM CAPI Flash is a specialized IO co-processor that provides IO gains similar to those GPUs provide for graphics
Significant improvements in
concurrency and scale
21. Pushing Neo4j to the Limits
• Asymptotic benchmarking effort
• “What can Neo4j do when pushed to its limits?”
• And the results are pretty amazing
This is our CTO Johan, please talk to
him. He’s totally a people person.
22. Traversals
• Realistic retail dataset from Amazon
• Commodity dual Xeon processor server
• Social recommendation (Java procedure) equivalent to:
MATCH (you)-[:BOUGHT]->(something)<-[:BOUGHT]-(other)-[:BOUGHT]->(reco)
WHERE id(you)={id}
RETURN reco
Threads   Hops/second
1         3-4M
10        17-29M
20        34-50M
30        36-60M
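The recommendation pattern above can be sketched over a toy in-memory adjacency list in plain Python (hypothetical data; the real benchmark used a compiled Java procedure walking native relationship records, where the reverse hop follows incoming pointers instead of scanning):

```python
# Toy adjacency-list model of the BOUGHT graph.
bought = {
    "you":   ["book"],
    "alice": ["book", "lamp"],
    "bob":   ["book", "mug"],
}

def recommend(user):
    """(you)-[:BOUGHT]->(x)<-[:BOUGHT]-(other)-[:BOUGHT]->(reco)"""
    mine = set(bought[user])
    recos = set()
    for item in mine:
        # In a native graph store this reverse hop is pointer chasing;
        # the toy version scans all users instead.
        for other, basket in bought.items():
            if other != user and item in basket:
                recos.update(b for b in basket if b not in mine)
    return recos

print(sorted(recommend("you")))  # ['lamp', 'mug']
```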
23. Trillions!
@profbriancox
Read Scale
• Can comfortably handle 1 trillion
relationships on a single server
• 24x2TB SSDs, 33TB size on disk.
• Compiled Cypher query
• Random reads
• Sustains over 100k user
transactions/sec
• Even with 99.8% page faults, because the machine has only 512 GB of RAM
24. Write Scale
• Import highly connected
Friendster dataset
• 1.8 billion relationships
takes around 20 minutes
• That is 1M writes/second!
Millions and
billions!
@profbriancox
27. Graph-native advantages
• Prioritize graph workloads
• Adapt at any point in the stack for graphs
• Disks, RAM, NVRAM, Coprocessors, RDMA, drivers,
query language, consensus protocol…
• Non-native approaches will adapt for their
primary use case
• Columns, documents
28.
29. Comparison on a ~10M node, ~100M relationship graph
Non-native graph DB: 6 machines, each with 48 VCPUs, 256 GB disk and 256 GB of RAM
Neo4j: single thread

Workload                          Non-native graph DB   Neo4j
Count nodes                       201s                  < 1ms
Count outgoing rels               202s                  < 1ms
Count outgoing rels at depth 2    276s                  23s
Count outgoing rels at depth 3    511s                  423s*
Group nodes by property val       212s                  8s
Group rels by type                198s                  54s
Count depth 2 knows-likes         324s                  149s*
Page Rank                         2571s                 27s*
30. • Consider the possum
• What’s that Emil? If I bring another animal into this venue, this keynote is
over?
• (emil’s head saying those words?)
31. Raft-based architecture
• Continuously available
• Consensus commits
• Third-generation cluster architecture
Cluster-aware stack
• Seamless integration among drivers,
Bolt protocol and cluster
• Eliminates need for external load balancer
• Cluster-aware sessions with encrypted
connections
Streamlined development
• Relieves developers from complex infrastructure concerns
• Faster and easier to develop distributed graph applications
Neo4j Causal Clustering Architecture
Fault-Tolerant and Scalable.
ENTERPRISE EDITION
32. How Causal Clustering Works
[Topology diagram: a Graph App talks to the cluster through a Bolt driver. Writes go to a synced cluster of read-write Core Servers; reads are served by Read Replica servers, which also feed reporting and analysis.]
Built-in load balancing
• Spreads reads to core and replica servers
• Directs writes to core servers
Causal consistency
• Always-consistent view of data at any scale
• Stronger than eventual consistency
• Best model for graphs:
• Reliability >> Availability
Large heterogeneous clusters
• Non-blocking & asynchronous
protocols
• Mix and match instance types
App servers, reporting servers,
IoT devices…
ENTERPRISE EDITION
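The built-in load balancing above can be sketched roughly as follows (hypothetical server names; in practice the cluster-aware Bolt drivers do this routing for you):

```python
import itertools

# Hypothetical topology: three core servers, two read replicas.
CORES = ["core1", "core2", "core3"]      # accept writes (and reads)
REPLICAS = ["replica1", "replica2"]      # read-only

_write_pool = itertools.cycle(CORES)
_read_pool = itertools.cycle(CORES + REPLICAS)  # spread reads over everything

def route(query_kind):
    """Direct writes to core servers, spread reads across cores and replicas."""
    return next(_write_pool) if query_kind == "write" else next(_read_pool)

print(route("write"))  # core1
print(route("read"))   # core1
```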
33. Causal Clustering Architecture Optimizes for Cost-Consistency at Query Time
[Diagram: consistency levels ordered by increasing query cost. Replica queries offer "Read Any" and "Read Your Own Writes"; core queries offer the same, plus quorate "Linearizable" reads (future 3.x) – "the holy grail of distributed systems".]
ENTERPRISE EDITION
34. Causal Clustering Topology Awareness
• Today cluster round-robin load balances based on
consistency level
• Defaults to a core instance for writes, a read-replica for reads
• Tomorrow cluster will load balance by:
• Network topology
• Geography
• Bandwidth
• Server load
• Server capacity
• User preference
• Etc.
35. Efficient Fan-Out for Very Large Clusters
• Replica-to-replica catchup
• Chains, trees…
• Exploit DC locality
• Retain causal consistency
• Never see earlier versions
of the data
• Even over WAN latencies
38. Neo4j 3.1 Creates a New Foundation
Enables Graphs Across the Enterprise
The graph database has gone mainstream
Has become a core enterprise technology spanning
a wide variety of business domains
Neo4j is the leading graph database
Extensive track record of graph leadership and
innovation
Neo4j 3.1 is the graph foundation for the enterprise
Provides the security, scalability, integration,
administration and operability required
to support enterprise graph applications
World-Class Research and Development
Reactions:
First kindly
Then awkward
Then distaste
My friend and colleague Max Sumrall has an ironic Trump hat and gets more respect than me!
Tough days.
But my Silicon Valley CEO told me to stop worrying about piffling things like Brexit, collapsing currency, social division, austerity, inequality, and a new prime minister that makes Thatcher seem cuddly
Focus on the important stuff, which all happens in the Silicon Valley tech scene.
OK boss. Game on.
Of course I do have to be a little careful, [since with $35M in his pocket Emil could really fund the engineering department]
Emil could make us build a box
Bulldog
What neo4j engineers and neo4j contributors have done to neo4j since we last met at the neo4j conference to talk with neo4j people about neo4j things
You folks might have missed this – we announced it in London at GraphConnect “pre-Brexit edition”
But there’s a lot to like in the 3.x series.
For Architects, we bring giant graphs with no practical limits and fast performance.
For Developers, we bring easy full-stack development (drivers).
And for Administrators, 3.0 provides better support for cloud, container and on premise deployments.
Establishes Neo4j as the enterprise standard graph technology
There’s a lot of useful stuff in the 3.x releases. Clustering, security, IO, modelling, performance.
But there’s relentless innovation in the computing industry. And as a native graph database, we’re well positioned to benefit from that.
Let’s take a look at some near-future hardware trends.
Consider the elephant.
Legend has it, its memory is so robust it never forgets. And something very interesting is happening with computer memory
For the last few years something profound happened with memory and it passed most of us by
Circa 2012 Sandy Bridge: cache sub-system is considered the "source of truth" for mainstream systems
Cache miss costs 500 instructions per socket – optimise for cache!
Neo4j responds:
Compact representations for native graph data storage – yields cache friendliness!
In 3.0 with the enterprise format we use relative addressing which means generally smaller pointers and more efficient use of cache space.
We can do this because we’re graph native – we can optimise all the way down the stack for graph traversals.
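As an illustration of why relative addressing helps (a generic varint-style sketch, not Neo4j's actual record format): neighbouring records produce small offsets, which encode in far fewer bytes than absolute ids, so more pointers fit per cache line.

```python
def encoded_size(value):
    """Bytes needed for a varint-style encoding: 7 payload bits per byte."""
    value = abs(value)
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size

# Hypothetical record ids: a record and a nearby neighbour.
record_id, neighbour_id = 1_000_000_000, 1_000_000_042

absolute = encoded_size(neighbour_id)              # address the record directly
relative = encoded_size(neighbour_id - record_id)  # address it as an offset

print(absolute, relative)  # 5 1
```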
Super important macro-level trend:
Jure Leskovec from Stanford: analysis from GRADES 2016
Over the past 10 years, there are indications that in data analytics the size of RAM has been growing by 50% every year, while the size of the data has grown by only 20%.
“Maybe your data increases faster. Maybe you think data is bigger and increasing faster. But facts should trump opinions” – Szilard Pafka
Let’s look at the memory.
100G commodity, soon to be high-end laptop territory.
10T off the shelf (Jure’s group bought a 12TB machine)
1P not yet available in a commodity single machine, but 50% per year growth means it's not far away (~7 years)
Another very interesting trend: Large disks are coming.
Next year Seagate will release a 60TB SSD.
3.0 effectively removes the upper limit for how many nodes, relationships and properties you can have in a single graph – quadrillions of items
Actually pushes past the ext4 boundaries!
Not for everyone, but cost effective for valuable graph production use cases.
We are well placed to benefit from this – neo4j loves SSDs; its on-disk format is optimised for rapid, low-overhead traversals
BECAUSE NATIVE we can perform optimisations for graphs ALL THE WAY DOWN TO HARDWARE
Large ram and large disks are Neo4j’s sweet spot – we’re optimized for index-free adjacency
Native databases support “index free adjacency” c.f. Neubauer and Rodriguez 2010 Graph Traversal Pattern. Few index lookups; most hops cost O(1) – in Neo4j that’s cheap pointer chasing through memory or on-disk.
Index free adjacency isn’t some academic curiosity, it’s important for performance. Links need to be first class citizens in the database, natively.
And pointer chasing in a memory space is mechanically cheap – way cheaper than hash lookups over a network, for example.
Some non-native databases use global indexes – one index per direction, even. This means every hop is O(log n). For graphs with millions of elements, that's 6x more IO; for graphs with billions (typical), that's 9x more IO.
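The IO arithmetic behind those factors can be sketched as follows (assuming, to match the 6x/9x figures, that each index lookup touches roughly log10(n) pages; real index fan-outs vary):

```python
import math

def hops_io(depth):
    """Index-free adjacency: each hop is one pointer dereference, O(1)."""
    return depth

def index_hops_io(depth, n):
    """Global index per hop: each hop costs ~log(n) page touches."""
    return depth * math.ceil(math.log10(n))

# One 3-hop traversal:
print(hops_io(3))               # 3
print(index_hops_io(3, 10**6))  # 18 -> 6x more IO on a millions-scale graph
print(index_hops_io(3, 10**9))  # 27 -> 9x more IO on a billions-scale graph
```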
(green river colorado river confluence)
NVRAM is coming: promising near-RAM latency (4x slower, but 10x the size).
Massive paradigm shift for database folks
Neo4j has an in-memory representation of graph and on-disk representation
These can converge with NVRAM
Result: less time spent translating formats, more time spent doing useful graph workloads
RAM is getting bigger; NVRAM will follow suit
Most graphs today can be accommodated in large RAM (including yours)
Medium term: practically all graphs will be hosted in durable RAM
And finally we’re also seeing the emergence of specialised coprocessors on the bus like IBM’s P8 and soon P9.
Right now that means 56TB of fast SSD and lots of RAM too (16TB)
But this isn’t just some “Big Iron” play.
CAPI Flash is IBM's innovation, doing for IO what GPUs do for graphics workloads
And since we’re native all the way down the stack we can and have built a plugin for Neo4j that takes advantage of the hardware acceleration.
IBM engineers measured 2x better performance with hardware acceleration.
We’re a very unconventional database in many respects.
We’re graph-native, transactional, and we can even be embedded.
Because of that, there’s a lot of misleading FUD.
Our CTO Johan Svensson is a very chilled out Swedish guy, unrecognisable from his Viking ancestors. He’s slow to anger.
But he did want to document that our unconventional design has real-world benefits.
So he ran a benchmarking project to see what happens to the database when you accelerate along various axes of scale
Neo4j best in class on traversal speed and scaling reads.
User transaction means real units of work that are meaningful and valuable to the application.
Lots of traversals involved.
Not some artificial to-first-byte delivery benchmark.
Random reads are the hardest for a database to optimise so this is a truly challenging benchmark.
But don’t take my word for it, please give it up for Professor Brian Cox!
You can get so much work done so quickly with numbers like those.
Consider Tortoise and his nemesis the insolent hare.
Well, we’ve seen neo4j’s no slow-coach. But how does it compare to non-native graph tech?
I saw this on the internet and thought it looked like a neat laptop-sized exercise.
We had the Dbpedia dataset to hand which is comparable in size (slightly larger but from the real world, 11M nodes 116M links)
The original experiment ran on 288 cores with 1.5TB RAM. I ran on one core with 8GB given to neo4j (half for heap, half for cache).
*Michael Hunger ran on one core with 128GB given over to the database when I needed more heap space than my laptop could give for the larger queries - I join the many thousands of you that have been on the benefitting end of Michael’s amazing tech support!
That itself is a remarkable illustration of how efficient neo4j can be. Sure, it's macho to run 6 large machines, but it's more sensible not to.
*** Describe what’s going on *** then:
This is not really a fair comparison.
The work undertaken by the non-native store is far higher than the work undertaken by neo4j.
But that’s the whole point!
Because neo4j can optimise for graphs all the way down the stack, we can and have implemented all kinds of shortcuts that databases optimised for tables or columns or keys and values or documents can’t do.
And in our forthcoming compiled runtime, you'll see results an order of magnitude better still for aggregation.
Consider the data centre of the future.
No possums.
Non-blocking consensus, “Masterless” system
Raft at the core
Async replication to the edges
But with read consistency!
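Raft's fault tolerance comes from majority quorums among the core servers; a quick sketch of the arithmetic:

```python
def majority(cluster_size):
    """A Raft cluster commits once a majority of core servers acknowledge."""
    return cluster_size // 2 + 1

def tolerated_failures(cluster_size):
    """How many core servers can fail while the cluster stays available."""
    return cluster_size - majority(cluster_size)

for n in (3, 5, 7):
    print(f"{n} cores: quorum {majority(n)}, tolerates {tolerated_failures(n)} failures")
```

This is why core clusters come in odd sizes: adding a fourth server raises the quorum without tolerating any extra failures.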
Leading-edge modern architecture. Doug Terry, inventor of eventual consistency at Xerox PARC in 1994.
Further reading: https://littlemindslargeclouds.wordpress.com/2014/05/27/implementing-causal-consistency/
RYOW is a major step forward for developers
With eventual consistency you have to figure this out in your app.
Now you always see your changes to the graph with no special work.
Makes reasoning about your app as easy as if you just had one computer, even when you have hundreds.
Trade off: latency
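Read-your-own-writes can be modelled with a bookmark the client carries between requests: the replica serves a read only once it has caught up to the client's last write. A toy sketch (hypothetical class names, not the driver API):

```python
class Core:
    """Accepts writes; hands the client back a bookmark (transaction id)."""
    def __init__(self):
        self.last_tx = 0
        self.data = {}

    def write(self, key, value):
        self.last_tx += 1
        self.data[key] = value
        return self.last_tx  # the bookmark

class Replica:
    """Asynchronously replays the core's transactions."""
    def __init__(self):
        self.applied_tx = 0
        self.data = {}

    def catch_up(self, core):
        self.data = dict(core.data)
        self.applied_tx = core.last_tx

def causal_read(replica, key, bookmark):
    # Refuse to serve a read from a replica that hasn't seen our writes yet.
    if replica.applied_tx < bookmark:
        raise RuntimeError("replica behind bookmark; retry or reroute")
    return replica.data[key]

core, replica = Core(), Replica()
bookmark = core.write("name", "Jim")
replica.catch_up(core)
print(causal_read(replica, "name", bookmark))  # Jim
```

The latency trade-off is visible here: a lagging replica makes the client wait (or reroute) instead of returning stale data.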
Oh, just one more thing…
Cypher is by far the leading graph query language, and growing with the openCypher consortium
As Cypher-aware humans you can see the opportunities for parallelism here.
Two patterns can be evaluated in parallel.
Pipeline parallelism is possible from pattern evaluation via WITH into MERGE.
But a computer can do better, much better.
Cypher represented as a tree of operators
Subtrees can be evaluated at runtime, by cost, for suitability for dispatch to different threads
Run threads in parallel, dynamically check whether more subtrees can be cost-effectively extracted, return results and aggregate
Based on:
Morsel-driven parallelism: a NUMA-aware query evaluation framework for the many-core age
Dynamic subtree allocation, locality aware – thread affinity
Partnering with several universities, hardware orgs to develop deeply scalable tech for mixed OLTP and analytic workloads.
You choose whether to:
All resources to a single query (analytics)
Single thread per query (classic OLTP)
Some queries get > 1 thread (optimise for difficult OLTP queries)
Target: next-gen Cypher (compiled) runtime
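A toy sketch of the idea: two independent operator subtrees evaluated on separate threads, with their results aggregated afterwards (standard-library threads over toy rows, not the actual Cypher runtime):

```python
from concurrent.futures import ThreadPoolExecutor

# Toy row stream; each "subtree" is an independent aggregation over it.
rows = list(range(1_000))

def subtree_even(chunk):
    return sum(1 for r in chunk if r % 2 == 0)

def subtree_big(chunk):
    return sum(1 for r in chunk if r > 500)

# Dispatch the independent subtrees to separate worker threads.
with ThreadPoolExecutor(max_workers=2) as pool:
    evens = pool.submit(subtree_even, rows)
    bigs = pool.submit(subtree_big, rows)
    print(evens.result(), bigs.result())  # 500 499
```

The morsel-driven scheme in the paper goes further: work is carved into small chunks so idle threads can steal from busy ones, with NUMA-local data preferred.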
Graphs mainstream: emergence of enterprise standard database
Leading graph database: mature and with a wonderful community (that’s YOU folks!)
But the foundation for all of this is the neo4j research and development team’s efforts
World-class research and development
I’m almost done now, but…
For those of you old hands expecting me to talk about triangles, I bet you’ve been disappointed this far.
So let me leave you with a little something they “now teach at business school”
Let’s ride this graph unicorn!
Onwards to the afterparty disConnect!