3. The Evolution of Databases
Timeline (1960 to 2030):
- Hierarchical
- Relational (+ Object Relational and Object Mapping)
- Desktop
- Object
- NoSQL/NOSQL/NewSQL/Specialised
- Graph Databases
- Cloud Databases
6. The Evolution of Databases (timeline, 1960 to 2030)
TRADITIONAL OLTP/RELATIONAL
Store and retrieve data
2-3 hops in a query
BIG DATA TECHNOLOGY
Aggregate and filter data
1 hop in a query
CONNECTIONS IN DATA
Millions of hops
7. The Evolution of Databases
TRADITIONAL OLTP/RELATIONAL
Store and retrieve data
2-3 hops in a query
Real-time storage & retrieval
BIG DATA TECHNOLOGY
Aggregate and filter data
1 hop in a query
Long-running queries
CONNECTIONS IN DATA
Millions of hops
Real-time connected insights
“Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code.”
Volker Pacher, Senior Developer
9. Requirements of Next Gen Applications
Unbounded Scale: Ability to scale up and scale out
Intelligence and Learning: Incorporate data science into operational systems
Revealed Context: Use richness of data to reveal context and causality in real time
Security and Data Privacy: Ensure data protection and regulatory compliance
10. Multi-tenancy with Neo4j 4.0
• Physical data and transaction isolation underneath shared infrastructure
• Multiple databases run in parallel on a single Neo4j Cluster (or standalone server)
• Agility through cloud-friendly management: quickly move databases from one online cluster to another
Great for SaaS providers, Enterprises, data scientists and developers
12. The Quirkiness of [Maybe] Eventual Consistency
Art by Pascal Jousselin - Europe Comics Publisher Dupuis - Creative Commons
13. Causal Clustering with Neo4j
Raft-based cluster and ACID database:
- Graph transactions over ACID consistency
- Maintains integrity over time
Core Servers: Synced Cluster
Replica Servers: Query, View
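Raft commits a write once a majority of the voting members have persisted it, so the number of core servers determines how many failures the cluster can absorb while still accepting writes. The arithmetic can be sketched in a few lines of Python. The function names below are illustrative assumptions and are not part of any Neo4j API.

```python
def majority(core_servers: int) -> int:
    """Smallest number of core servers that forms a Raft majority."""
    return core_servers // 2 + 1


def is_committed(acknowledgements: int, core_servers: int) -> bool:
    """A write counts as committed once a majority has acknowledged it."""
    return acknowledgements >= majority(core_servers)


def tolerated_failures(core_servers: int) -> int:
    """Core servers that can fail while a majority remains available."""
    return core_servers - majority(core_servers)


if __name__ == "__main__":
    for size in (3, 5, 7):
        print(size, "cores: majority", majority(size),
              "tolerates", tolerated_failures(size), "failures")
```

Under this model an even-sized cluster tolerates no more failures than the odd-sized cluster one server smaller, which is a common reason for choosing odd cluster sizes.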
14. Introducing Sharding and Federated Graphs
• Brings unbounded scalability to Neo4j by adding horizontal scale-out to Neo4j’s existing replicated scale-up model.
• Two modes of operation:
SHARDING: operate over a single large graph
FEDERATION: query across disjoint graphs
Data Scientists: Run analysis on large, distributed databases.
Developers: Develop large-scale applications on their laptops/desktops and deploy them in a network of Neo4j clusters.
Enterprises: Keep data in designated geographies. Analyse graphs without replicating or moving them.
15. Multi-graph Cypher Queries
• Brings unbounded scalability to Neo4j by adding horizontal scale-out to Neo4j’s existing replicated scale-up model.
• Executes queries in parallel on multiple databases, combining or aggregating results.
• Chains queries together from multiple databases for sophisticated real-time analyses.
SQL
Cypher in Neo4j 3.5:
MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total
Cypher in Neo4j 4.0:
UNWIND corporate.graphIds() AS gid
CALL {
  USE corporate.graph(gid)
  MATCH (boss)-[:MANAGES*0..3]->(sub),
        (sub)-[:MANAGES*1..3]->(report)
  RETURN boss.name AS Boss,
         sub.name AS Subordinate,
         count(report) AS Total
}
RETURN Boss, Subordinate, Total ORDER BY Total
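To make the shape of the multi-graph query concrete, here is a minimal Python sketch that emulates it over in-memory data. It is not how Neo4j executes the query. Each graph is assumed to be a plain dictionary mapping a manager to their direct reports, the hierarchy is assumed to be a tree, and every function name is hypothetical. Ties are broken by name so the result is deterministic, which the Cypher `ORDER BY Total` alone does not guarantee.

```python
from collections import Counter


def reachable(manages, start, min_hops, max_hops):
    """Yield the end node of every MANAGES path of min_hops..max_hops edges."""
    if min_hops == 0:
        yield start
    frontier = [start]
    for hops in range(1, max_hops + 1):
        frontier = [child for node in frontier for child in manages.get(node, [])]
        if hops >= min_hops:
            yield from frontier


def report_counts(manages):
    """Per-graph part: count reports 1..3 levels below each (boss, sub) pair."""
    people = set(manages) | {c for children in manages.values() for c in children}
    totals = Counter()
    for boss in people:
        for sub in reachable(manages, boss, 0, 3):
            for _report in reachable(manages, sub, 1, 3):
                totals[(boss, sub)] += 1
    return totals


def federated_report_counts(graphs):
    """Run the per-graph query on every graph and combine the rows."""
    rows = []
    for manages in graphs.values():
        rows.extend((boss, sub, total)
                    for (boss, sub), total in report_counts(manages).items())
    return sorted(rows, key=lambda row: (row[2], row[0], row[1]))


if __name__ == "__main__":
    graphs = {
        "emea": {"Ann": ["Bob", "Cy"], "Bob": ["Dee"]},
        "apac": {"Eve": ["Fay"]},
    }
    for row in federated_report_counts(graphs):
        print(row)
```

The outer loop plays the role of `UNWIND corporate.graphIds()`, and the final sort corresponds to the combining step after the `CALL { ... }` subquery.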
18. Security and Data Privacy
• Designed with new cases in mind - PII, credit card, patient information, etc.
• Based on Role-based Access Control for graphs
• Restrictions on what data can be seen by different users, applied to all database interactions
• Implicit security view of the data for each user through schema-based security definitions
• Grant/Deny permissions to traverse, read or write data based on node labels, relationship types or database and property names
• Security rules are replicated across the cluster via roles that are associated with the users
Diagram labels: Bob, Joe; Baseline_Personnel_Security_Standard, Security_Check, Counter_Terrorism_Check, Developed_Vetting
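The grant/deny model described on this slide can be illustrated with a toy evaluator. This is a sketch of the general idea only, assuming that a deny rule always overrides a grant and that rules are keyed by role, node label and property name. The `Rule` class, the `visible_properties` function, the role names and the sample data are all hypothetical and do not reflect Neo4j's internal implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    effect: str  # "grant" or "deny"
    role: str    # role the rule is attached to
    label: str   # node label the rule applies to
    prop: str    # property name, or "*" for every property


def visible_properties(node_labels, properties, roles, rules):
    """Return the properties that a user holding `roles` may read on one node."""
    visible = {}
    for name, value in properties.items():
        matching = [
            rule for rule in rules
            if rule.role in roles
            and rule.label in node_labels
            and rule.prop in ("*", name)
        ]
        granted = any(rule.effect == "grant" for rule in matching)
        denied = any(rule.effect == "deny" for rule in matching)
        if granted and not denied:  # a deny always overrides a grant
            visible[name] = value
    return visible


if __name__ == "__main__":
    rules = [
        Rule("grant", "staff", "Patient", "*"),
        Rule("deny", "receptionist", "Patient", "diagnosis"),
    ]
    patient = {"name": "Ann", "diagnosis": "flu"}
    # Two hypothetical users with different role sets see different views.
    print(visible_properties({"Patient"}, patient, {"staff"}, rules))
    print(visible_properties({"Patient"}, patient, {"staff", "receptionist"}, rules))
```

Because the filter is applied per user on every read, each user effectively gets their own view of the same underlying data.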
22. The Largest Investment in Graph Databases
Built on Neo4j’s robust native foundation, Neo4j 4.0 brings:
Unlimited scalability with sharding & federation
A fully reactive architecture for modern applications
Granular security controls for security & privacy
Deployment flexibility with multi-database
Neo4j 4.0
25. Neo4j Desktop: essential tools
- Neo4j Browser for Cypher editing workflows
- Neo4j Bloom for interactive graph exploration
- Neo4j Labs tools: Neo4j ETL, Halin, Query Log Analyzer, Neuler
- Awesome 3rd party applications from Kineviz, yWorks, and others!
26. Neo4j Desktop: connect to Neo4j databases anywhere
- locally managed single-instances for development and experimentation
- on-premise in test and staging environments
- in production deployments in the cloud using Aura
27. Neo4j Desktop: developer workflows in 2020
- integration with local filesystem
- CLI tooling to integrate with build systems
- project templates for bootstrapping application development
- easy prototyping of graph application user interfaces
- first-class GraphQL development
- fantastic integration with Aura
28. Neo4j Desktop: from graph idea to Neo4j in production
- essential tools for working with Neo4j
- locally managed Neo4j Enterprise Edition for Developers
- connect your development to Neo4j anywhere
29. Neo4j Aura
Flexible
● Zero Administration
● On-Demand Scaling
● Simple, capacity-based pricing
Reliable
● Always-On and self-healing, clustered configuration
● Data Integrity & Durability
● Secure including end-to-end encryption
Developer Friendly
● Native graph performance
● World’s most popular graph query language
● Broad language support - drivers for Java, .NET, JavaScript, Python, Go, Spring, etc.
31. Visualizing Graphs in Neo4j
Neo4j Bloom
- Out-of-the-box from Neo4j
- Exclusively for Neo4j graphs
- Focused on ad-hoc graph exploration needs
- “Search first” exploration paradigm
- For data analysts, data scientists and developers
Visualization Toolkits
- 3rd party e.g. Ogma, Keylines, vis.js, d3.js, sigma.js, yFiles
- Custom app development or visualization embedding needs
- Some are vendor supported
- Some have Neo4j data hooks, others require development
- Offer robust APIs for flexible control over the app experience
- For developers who build apps
BI Tools
- 3rd party supported e.g. Tableau, Power BI, Qlik
- Dashboard and report creation combining various data sources and many visualization types
- Not optimized for graph data - need a special Neo4j connector
- For data analysts and business users
Graph Visualization Apps
- 3rd party supported e.g. Linkurious, Kineviz, Graphileon
- Purpose-built applications for specific domains and graph visualization needs
- Support multiple graph models and sources
- Customization and integration capable
- For data analysts, data scientists and business users
Spectrum labels on the slide:
- Little technical expertise to Most technically involved
- Smaller user scale to Larger user scale
- Self-guided, low curation to Highly guided and curated
33. Neo4j Bloom’s Intuitive User Interface
- Search with type-ahead suggestions
- Flexible Color, Size and Icon schemes
- Visualize, Explore and Discover
- Pan, Zoom and Select
- Property Browser and editor
34. What’s New in Bloom 1.2
- Flexible colors and sizes
- Style using properties
- Expand by relationship or neighbor type
- Case insensitive search
- Specify parameter types
- Export CSV data
35. Graph Data Science
GDS is a science-driven approach to gain knowledge from the relationships and structures in data, typically to power predictions. It uses multi-disciplinary workflows that may include queries, statistics, algorithms and machine learning.
Graph Recipes & Analytics
- Answers specific questions to gain insights from connections in existing/historical data
- Approaches typically include global queries and algorithms and direct use of results
Graph Enhanced ML & AI
- Training models (ML) with “graphy” data to be used to emulate human, probabilistic decisions within a solution/application (AI system)
36. The Graph Algos Library becomes the Graph Data Science Library
Product Supported & Under Active Development
Optimized for Analytics
- Leverage custom data structures optimized for global traversals and aggregation
- Flexibly subset and reshape your graph for specific use cases
Algorithms for Insights
- Robust algorithms that are highly parallelized and scale to billions of nodes
- Early access to dozens of experimental implementations
Intuitive Interface
- Drastically simplified and standardized API that enables custom configurations
- Documentation, training, and examples so getting started is simple
37. Graph Data Science in 2020
GA in early March (preview release February 6!)
Analytics projections:
- Specialized data structure for algorithms, capable of supporting billions of nodes
- Cypher loaders for experimentation
- Quickly reshape, combine, aggregate, and deduplicate your transactional data
- Support for multiple node labels, relationship types, and properties
- Manage multiple in-memory analytics graphs for different workloads
- Drastically reduced memory footprint
Graph algorithms & more:
- 40+ algorithms in 5 categories: community, centrality, similarity, pathfinding, and link prediction
- Helper algorithms like graph generation, one-hot encoding, and random walk
- Early previews of new implementations in the alpha & beta namespaces
- Supported, scalable algorithms include seeding, determinism, and incremental calculations
- Estimate mode for memory requirements
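As an illustration of what a centrality algorithm computes, the sketch below implements textbook PageRank by power iteration over an edge list. It is a plain-Python teaching example under stated assumptions (a damping factor of 0.85, a fixed iteration count, dangling-node rank spread evenly) and is not the Graph Data Science Library's implementation or API. The function name and parameters are hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration; ranks sum to 1."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is shared by everyone.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


if __name__ == "__main__":
    # "c" is linked to by both "a" and "b", so it should rank highest.
    ranks = pagerank([("a", "c"), ("b", "c"), ("c", "a")])
    for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
        print(node, round(score, 3))
```

A production library would add convergence checks, parallel execution and compact in-memory graph structures, which is where the projections and memory estimates listed above come in.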
39. Neo4j 4.0-all-the-things!
Ready Today:
• Java, .NET, JavaScript, Spring Data Neo4j Drivers ✅
• Neo4j Desktop ✅
• Neo4j Browser ✅
Work in Progress:
• Python (Q2) & Go (Q4) - Coming!
• Graph Data Science Library (Q2) - Coming!
• Bloom (Q2) - Coming!
• Aura (Q2) - Coming!