A quick review of
Python and Graph
Databases
NIC CROUCH
@FPHHOTCHIPS
Who am I?
◦ Consultant at Deloitte Melbourne
in Enterprise Information Management
◦ Recent graduate of Flinders University in Adelaide
◦ Casual/Enthusiast reviewer of Graph Databases
What is a graph?
“A set of objects connected by links” – Wikipedia
Objects: Vertices, nodes, points
Links: Edges, arcs, lines, relationships
Prior Work on Graphs in Python
Graph Database Patterns in Python – Elizabeth Ramirez, PyCon US 2015
Practical Graph/Network Analysis Made Simple – Eric Ma, PyCon US 2015
Graphs, Networks and Python: The Power of Interconnection – Lachlan Blackhall, PyCon AU 2014
An introduction to Python and graph databases with Neo4j - Holger Spill, PyCon NZ 2014
Mogwai: Graph Databases in your App – Cody Lee, PyTexas 2014
Today: Pythonic Graphs
An exploration of graph storage in Python:
◦ API must be Pythonic
◦ execute(“<Not Python>”) doesn’t count.
◦ As little configuration as possible
Caveats:
◦ No configuration means no tuning
◦ Can’t compare distributed performance on a single node
◦ Limited to rough comparisons of performance – not a lab environment!
The Simple
1) Set up a dictionary of nodes
2) Each node keeps a list of relationships (or two, if you want a directed graph)
3) Set up add and get convenience methods
Pros:
• Sometimes the simplest ways are the best
• Very quick
Cons:
• Not consistent
• Probably going to need to be maintained
• Not persistent
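The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the talk: the class name SimpleGraph and its method names are assumptions made up for this example.

```python
class SimpleGraph:
    """In-memory directed graph: a dictionary of nodes, each holding two relationship lists."""

    def __init__(self):
        # Step 1: node name -> {"out": [...], "in": [...]}
        self.nodes = {}

    def add_node(self, name):
        # setdefault keeps the existing relationship lists if the node is already present
        return self.nodes.setdefault(name, {"out": [], "in": []})

    def add_relationship(self, source, target):
        # Step 2: two lists per node make the graph directed
        self.add_node(source)["out"].append(target)
        self.add_node(target)["in"].append(source)

    def get_node(self, name):
        # Step 3: convenience getter
        return self.nodes[name]


graph = SimpleGraph()
graph.add_relationship("alice", "bob")
graph.add_relationship("alice", "carol")
print(graph.get_node("alice")["out"])  # ['bob', 'carol']
```

Nothing stops a caller from editing one of the two lists without the other, which is one way the "not consistent" drawback shows up.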
The (slightly less) Simple
1) Set up a Shelf of nodes
2) Each node keeps a list of relationships (or two, if you want a directed graph)
3) Set up add and get convenience methods
Pros:
• Still reasonably quick
Cons:
• Not consistent
• Probably going to need to be maintained
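A rough sketch of the same idea on top of the standard-library shelve module. The helper name add_relationship and the temporary file location are assumptions for illustration only.

```python
import os
import shelve
import tempfile


def add_relationship(shelf, source, target):
    # Values read from a shelf are copies, so each node is written back after it is changed
    src = shelf.get(source, {"out": [], "in": []})
    src["out"].append(target)
    shelf[source] = src
    dst = shelf.get(target, {"out": [], "in": []})
    dst["in"].append(source)
    shelf[target] = dst


# Hypothetical location for the shelf file(s)
path = os.path.join(tempfile.mkdtemp(), "graph_shelf")

with shelve.open(path) as shelf:
    add_relationship(shelf, "alice", "bob")

# Reopen the shelf: the relationship has survived the close
with shelve.open(path) as shelf:
    reloaded = shelf["alice"]["out"]

print(reloaded)  # ['bob']
```

Shelf keys must be strings, and every update is pickled to disk, which is a likely reason this is only "reasonably" quick.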
Off-topic: NetworkX
All the advantages of using a dictionary with none of the custom code.
◦ Comes with graph generators
◦ BSD Licenced
◦ Loads of standard analysis algorithms
◦ 90% test coverage
◦ … no persistence (except Pickle).
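A short sketch of the points above, assuming NetworkX is installed. The node names and the seed are arbitrary choices for this example.

```python
import pickle

import networkx as nx

# A small directed graph built by hand
g = nx.DiGraph()
g.add_edge("alice", "bob")
g.add_edge("bob", "carol")

# One of the bundled analysis algorithms
path = nx.shortest_path(g, "alice", "carol")

# One of the bundled generators: 100 nodes, each pair linked with probability 0.2
random_graph = nx.gnp_random_graph(100, 0.2, seed=42)

# Persistence via pickle: round-trip the graph through bytes
restored = pickle.loads(pickle.dumps(g))

print(path)
print(random_graph.number_of_nodes(), restored.number_of_edges())
```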
The Popularity Test
DBMS        Score (Jul 2015)
Neo4j       31.34
OrientDB    4.46
Titan       3.89
ArangoDB    1.29
Giraph      1.03
The Incumbent: Neo4j
Released in 2007
Written in Java
GPLv3/AGPLv3 or a commercial license
Runs as a server that exposes a REST Interface
Natively uses Cypher – an in-house developed graph query language
Best established, most popular graph-database
Easy to install – unzip and run a script
High Availability, but a little difficult to scale
Neo4j from Python
Py2Neo:
◦ Built by Nigel Small from Neo4j
◦ Actively maintained
Neo4j-rest-client
◦ Javier de la Rosa from University of Western Ontario
◦ Last maintained 9 months ago
neo4jdb-python
◦ Jacob Hansson of Neo4j
◦ Last maintained 8 months ago
◦ Mostly just wrappers around Cypher
Bulbflow:
◦ Built by James Thornton of Pipem/Espeed
◦ Last maintained 8 months ago
◦ Connects to multiple backends
Py2Neo: Syntax
Set up a connection:
◦ graph=Graph("http://neo4j:password@localhost:7474/db/data/")
Create a node:
◦ graph.create(Node("node_label", name=node_name))
◦ Node labels are like classes
Find a node:
◦ graph.find_one("node_label", property_key="name",property_value=node_name)
Create a relationship:
◦ graph.create(Relationship(node1, relationship, node2))
Find a relationship:
◦ graph.match_one(node1, relationship, node2, bidirectional=False)
Py2Neo: Good and Bad
The good:
Simple API
Well documented
Easy to connect and get started.
Cool (if preliminary) spatial support
Not so much:
◦ Skinny API
◦ No transaction support for Pythonic calls
◦ Performance struggles on large inputs
◦ No ORM (kinda)
neo4j-rest-client Syntax
Set up a connection:
◦ graph=GraphDatabase("http://localhost:7474/db/data/", username="username", password="password")
Create a node:
◦ node=graph.nodes.create(name=node_name)
◦ Node labels are like classes
Find a node:
◦ graph.nodes.filter(Q("name", iexact=node)).elements[0]
Create a relationship:
◦ relationship=node1.is_related_to(node2)
neo4j-rest-client: Good and Bad
Transaction support with a context manager*
Strong filtering syntax
Very strong labelling syntax – searchable tags for nodes
Lazy evaluation of queries
Still REST based – still difficult to make it perform
*Seemingly. Somewhat difficult to make it work.
Py2Neo vs Neo4j-Rest-Client: Performance
100 nodes with 20% connection:
Loading:
Py2Neo: ~8 seconds
Neo4j-rest-client: ~5 seconds
Postgres: 4s
Retrieving:
Py2Neo: ~6 seconds
Neo4j-rest-client: ~5 seconds
Postgres: 4s
1000 nodes with 20% connection:
Loading:
Py2Neo: ~7 minutes
Neo4j-rest-client: ~50 minutes
Postgres: 6 minutes
Retrieving:
Py2Neo: ~7 minutes
Neo4j-rest-client: ~50 minutes
Postgres: 6 minutes
Machine:
AWS Memory Optimised xLarge node (30GB RAM) on Ubuntu Server using iPython2 3.0.0
Important note
Completely unoptimised! No indexes, no attempt to chunk, only a couple of OS optimisations.
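For anyone wanting to repeat a rough comparison like this, a test graph and a timing loop might look something like the sketch below. The talk's actual harness is not shown, so everything here is an assumption: "20% connection" is read as "each ordered pair of distinct nodes is linked with probability 0.2", and the names random_edges, timed_load and add_to_dict are invented for this example. A plain dictionary stands in for the database binding being measured.

```python
import random
import time


def random_edges(node_count, probability, seed=0):
    """Yield (source, target) pairs; each ordered pair of distinct nodes is kept with the given probability."""
    rng = random.Random(seed)
    for source in range(node_count):
        for target in range(node_count):
            if source != target and rng.random() < probability:
                yield source, target


def timed_load(edges, add_relationship):
    """Feed every edge to add_relationship and return (edge_count, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for source, target in edges:
        add_relationship(source, target)
        count += 1
    return count, time.perf_counter() - start


# Stand-in backend (assumption): a plain dictionary of adjacency lists
adjacency = {}


def add_to_dict(source, target):
    adjacency.setdefault(source, []).append(target)


edge_count, seconds = timed_load(random_edges(100, 0.2), add_to_dict)
print(edge_count, "edges loaded in", round(seconds, 4), "seconds")
```

Swapping add_to_dict for a function that calls a real binding would give a like-for-like loading time, with the same caveats as above.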
OrientDB
PyOrient:
◦ Official OrientDB Driver for Python
◦ Binary Driver
◦ Not Pythonic
Released in 2011
More NoSQL than Neo and Titan (Documents as well as graphs)
Scalable across multiple servers
Supports SQL
Titan
First released in 2012
Written in Java
Licenced under Apache Licence
Many storage backends, including Cassandra, HBase and BerkeleyDB
Hadoop integration
Large number of search back-ends
Built for scalability
Commercially supported by DataStax (formerly Aurelius)
Titan and Python
Mogwai:
◦ Written by Cody Lee of wellaware
◦ Binary Driver for RexPro Server
◦ Very pythonic!
Bulbflow:
◦ Built by James Thornton of Pipem/Espeed
◦ REST-based interface
◦ Last maintained 8 months ago
◦ Connects to multiple backends
RexPro and the Tinkerpop Stack
Apache Incubator Open Source Graph Framework
◦ Built around Gremlin
◦ Written in Java
◦ Extensively documented
Mogwai Performance
100 nodes with 20% connection:
Loading:
14 seconds
Retrieving:
18 seconds
1000 nodes with 20% connection:
Loading:
~9 minutes
Retrieving:
~25 minutes
So, what should I use?*
It depends.
Neo4j:
◦ Good, relatively quick bindings
◦ Well supported
◦ Could be expensive
◦ May not scale
Titan:
◦ Good bindings
◦ Support in doubt
◦ Should be cheaper
◦ Proven scalability
Orient:
◦ Poor bindings
◦ Well supported
◦ Open pricing structure
◦ Should scale well
*The full title of this slide is “What should I research further to ensure it meets my specific needs and then consider using?” In any case, the answer is still “It depends”.
What about Python Graph Databases?
Not just Python bindings – pure(ish) Python.
GrapheekDB: https://bitbucket.org/nidusfr/grapheekdb
◦ Uses local memory, Kyoto Cabinet or Symas LMDB as backend
◦ Under active development
◦ Exposes client/server interface
◦ Code is Beta quality at best
◦ Documentation is very spotty
Ajgu: https://bitbucket.org/amirouche/ajgu-graphdb/
◦ Uses Berkeley Database backend
◦ Under active development
◦ “This program is alpha becarful”
◦ Python 3 only
Ajgu
Set up a connection:
◦ graph = GraphDatabase(Storage('./BSDDB/graph'))
Create a node:
◦ transaction = self.graph.transaction(sync=True)
◦ node = transaction.vertex.create(node)
Find a node:
◦ transaction.vertex.label(start)
Create a relationship:
◦ relationship=transaction.edge.create(node1,node2)
Take-aways
Graphs match plenty of data sets
The big three Graph Databases are Neo4j, Titan and Orient
All three have upsides and downsides – depending on the use case.
If you want to have a bit more fun, try Ajgu or Grapheek!
Thanks!
Questions?
nic@niccrouch.com
@fphhotchips
Py2Neo: Performance and Transactional Support
Large imports should be done in one transaction to decrease overhead:
graph.create(*long_list_of_nodes_and_relationships)
This kills the client (essentially hangs in string processing).
So:
from itertools import izip_longest  # Python 2; named zip_longest in Python 3
# Split the input into fixed-size chunks, padding the last one with ''
for chunk in izip_longest(*[iter(iterator)]*size, fillvalue=''):
    try:
        # Trim the '' padding from the final chunk
        chunk = chunk[0:chunk.index('')]
    except ValueError:
        pass  # no padding found: this chunk is full
    try:
        self.graph.create(*chunk)
    except Exception as ex:
        pass  # chunk dividing goes here
We lose ACID at this point.
What if this fails? Have to chunk it up again to find what failed.
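One way the "chunk dividing" step could look is sketched below, in Python 3. This is not the code from the talk: chunks, create_with_bisect and fake_create are hypothetical names, and fake_create is a stand-in for graph.create that is assumed to store nothing when a batch fails.

```python
from itertools import islice


def chunks(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def create_with_bisect(create, chunk, failed):
    """Try the whole chunk; on error, split it in half and retry each half."""
    try:
        create(*chunk)
    except Exception:
        if len(chunk) == 1:
            failed.append(chunk[0])  # isolated the item that cannot be created
            return
        middle = len(chunk) // 2
        create_with_bisect(create, chunk[:middle], failed)
        create_with_bisect(create, chunk[middle:], failed)


# Hypothetical stand-in for graph.create: rejects any batch containing None
stored = []


def fake_create(*entities):
    if None in entities:
        raise ValueError("bad entity in batch")
    stored.extend(entities)


failed = []
for chunk in chunks(["a", "b", None, "c", "d"], size=2):
    create_with_bisect(fake_create, chunk, failed)

print(stored, failed)  # ['a', 'b', 'c', 'd'] [None]
```

Each chunk still commits on its own, so the import as a whole is not atomic; the bisection only narrows down which items failed.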

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 

A quick review of Python and Graph Databases

  • 1. A quick review of Python and Graph Databases NIC CROUCH @FPHHOTCHIPS
  • 2. Who am I?
      ◦ Consultant at Deloitte Melbourne in Enterprise Information Management
      ◦ Recent graduate of Flinders University in Adelaide
      ◦ Casual/Enthusiast reviewer of Graph Databases
  • 3. What is a graph?
      “A set of objects connected by links” – Wikipedia
      Objects: Vertices, nodes, points
      Links: Edges, arcs, lines, relationships
  • 4. Prior Work on Graphs in Python
      ◦ Graph Database Patterns in Python – Elizabeth Ramirez, PyCon US 2015
      ◦ Practical Graph/Network Analysis Made Simple – Eric Ma, PyCon US 2015
      ◦ Graphs, Networks and Python: The Power of Interconnection – Lachlan Blackhall, PyCon AU 2014
      ◦ An introduction to Python and graph databases with Neo4j – Holger Spill, PyCon NZ 2014
      ◦ Mogwai: Graph Databases in your App – Cody Lee, PyTexas 2014
  • 5. Today: Pythonic Graphs
      An exploration of graph storage in Python:
      ◦ API must be Pythonic
      ◦ execute(“<Not Python>”) doesn’t count.
      ◦ As little configuration as possible
      Caveats:
      ◦ No configuration means no tuning
      ◦ Can’t compare distributed performance on a single node
      ◦ Limited to rough comparisons of performance – not a lab environment!
  • 6. The Simple
      1) Set up a dictionary of nodes
      2) Each node keeps a list of relationships (or two, if you want a directed graph)
      3) Set up add and get convenience methods
      Pros:
      • Sometimes the simplest ways are the best
      • Very quick
      Cons:
      • Not consistent
      • Probably going to need to be maintained
      • Not persistent
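The three steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the speaker's code: the class name `DictGraph` and the method names are assumptions, and the two lists per node (`"out"` and `"in"`) correspond to the directed-graph variant mentioned in step 2.

```python
class DictGraph:
    """Directed graph held in a plain dictionary of nodes (hypothetical sketch)."""

    def __init__(self):
        # Step 1: a dictionary of nodes, keyed by node name.
        # Step 2: each node keeps two lists of relationships (outgoing, incoming).
        self.nodes = {}

    # Step 3: add and get convenience methods.
    def add_node(self, name):
        self.nodes.setdefault(name, {"out": [], "in": []})

    def add_edge(self, source, target):
        self.add_node(source)
        self.add_node(target)
        self.nodes[source]["out"].append(target)
        self.nodes[target]["in"].append(source)

    def get_successors(self, name):
        return list(self.nodes[name]["out"])

    def get_predecessors(self, name):
        return list(self.nodes[name]["in"])


graph = DictGraph()
graph.add_edge("a", "b")
graph.add_edge("a", "c")
graph.add_edge("c", "a")
```

Nothing stops a caller from editing `graph.nodes` directly and leaving the two lists out of step, which is one way to read the "Not consistent" point, and everything is lost when the process exits.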
  • 7. The (slightly less) Simple
      1) Set up a Shelf of nodes
      2) Each node keeps a list of relationships (or two, if you want a directed graph)
      3) Set up add and get convenience methods
      Pros:
      • Still reasonably quick
      Cons:
      • Not consistent
      • Probably going to need to be maintained
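A rough sketch of the same idea on top of the standard-library `shelve` module follows. The function names and the record layout are assumptions made for illustration, and the shelf is written to a temporary directory so the example cleans up nothing of yours.

```python
import os
import shelve
import tempfile


def add_node(shelf, name):
    # Shelf keys must be strings; each value is a small picklable record.
    if name not in shelf:
        shelf[name] = {"out": [], "in": []}


def add_edge(shelf, source, target):
    add_node(shelf, source)
    add_node(shelf, target)
    # Without writeback=True a shelf does not notice in-place mutation,
    # so read the record, change it, and assign it back.
    record = shelf[source]
    record["out"].append(target)
    shelf[source] = record
    record = shelf[target]
    record["in"].append(source)
    shelf[target] = record


def get_successors(shelf, name):
    return list(shelf[name]["out"])


path = os.path.join(tempfile.mkdtemp(), "graph_shelf")

with shelve.open(path) as shelf:
    add_edge(shelf, "a", "b")
    add_edge(shelf, "a", "c")

# Reopen the shelf to show that the graph survived being closed.
with shelve.open(path) as shelf:
    successors = get_successors(shelf, "a")
    stored_nodes = sorted(shelf.keys())
```

The read-modify-write step in `add_edge` is the main difference from the dictionary version: forgetting it silently drops edges, which is part of why this approach still needs maintaining.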
  • 8. Off-topic: NetworkX
      All the advantages of using a dictionary with none of the custom code.
      ◦ Comes with graph generators
      ◦ BSD Licenced
      ◦ Loads of standard analysis algorithms
      ◦ 90% test coverage
      ◦ … no persistence (except Pickle).
  • 9. The Popularity Test
      DBMS        Score (Jul 2015)
      Neo4j       31.34
      OrientDB     4.46
      Titan        3.89
      ArangoDB     1.29
      Giraph       1.03
  • 10. The Incumbent: Neo4j
      ◦ Released in 2007
      ◦ Written in Java
      ◦ GPLv3/AGPLv3 or a commercial license
      ◦ Runs as a server that exposes a REST interface
      ◦ Natively uses Cypher – a graph query language developed in-house
      ◦ Best established, most popular graph database
      ◦ Easy to install – unzip and run a script
      ◦ High Availability, but a little difficult to scale
  • 11. Neo4j from Python
      Py2Neo:
      ◦ Built by Nigel Small from Neo4j
      ◦ Actively maintained
      Neo4j-rest-client:
      ◦ Javier de la Rosa from University of Western Ontario
      ◦ Maintained until 9 months ago
      neo4jdb-python:
      ◦ Jacob Hansson of Neo4j
      ◦ Maintained until 8 months ago
      ◦ Mostly just wrappers around Cypher
      Bulbflow:
      ◦ Built by James Thornton of Pipem/Espeed
      ◦ Maintained until 8 months ago
      ◦ Connects to multiple backends
  • 12. Py2Neo: Syntax
      Set up a connection:
      ◦ graph=Graph("http://neo4j:password@localhost:7474/db/data/")
      Create a node:
      ◦ graph.create(Node("node_label", name=node_name))
      ◦ Node labels are like classes
      Find a node:
      ◦ graph.find_one("node_label", property_key="name", property_value=node_name)
      Create a relationship:
      ◦ graph.create(Relationship(node1, relationship, node2))
      Find a relationship:
      ◦ graph.match_one(node1, relationship, node2, bidirectional=False)
  • 13. Py2Neo: Good and Bad
      The good:
      ◦ Simple API
      ◦ Well documented
      ◦ Easy to connect and get started
      ◦ Cool (if preliminary) spatial support
      Not so much:
      ◦ Skinny API
      ◦ No transaction support for Pythonic calls
      ◦ Performance struggles on large inputs
      ◦ No ORM (kinda)
  • 14. neo4j-rest-client Syntax
      Set up a connection:
      ◦ graph=GraphDatabase("http://localhost:7474/db/data/", username="username", password="password")
      Create a node:
      ◦ node=graph.nodes.create(name=node_name)
      ◦ Node labels are like classes
      Find a node:
      ◦ graph.nodes.filter(Q("name", iexact=node)).elements[0]
      Create a relationship:
      ◦ relationship=node1.is_related_to(node2)
  • 15. neo4j-rest-client: Good and Bad
      ◦ Transaction support with a context manager*
      ◦ Strong filtering syntax
      ◦ Very strong labelling syntax – searchable tags for nodes
      ◦ Lazy evaluation of queries
      ◦ Still REST based – still difficult to make it perform
      *Seemingly. Somewhat difficult to make it work.
  • 16. Py2Neo vs Neo4j-Rest-Client: Performance
      100 nodes with 20% connection:
        Loading:
        ◦ Py2Neo: ~8 seconds
        ◦ Neo4j-rest-client: ~5 seconds
        ◦ Postgres: 4 seconds
        Retrieving:
        ◦ Py2Neo: ~6 seconds
        ◦ Neo4j-rest-client: ~5 seconds
        ◦ Postgres: 4 seconds
      1000 nodes with 20% connection:
        Loading:
        ◦ Py2Neo: ~7 minutes
        ◦ Neo4j-rest-client: ~50 minutes
        ◦ Postgres: 6 minutes
        Retrieving:
        ◦ Py2Neo: ~7 minutes
        ◦ Neo4j-rest-client: ~50 minutes
        ◦ Postgres: 6 minutes
      Machine: AWS Memory Optimised xLarge node (30GB RAM) on Ubuntu Server using iPython2 3.0.0
      Important note: Completely unoptimised! No indexes, no attempt to chunk, only a couple of OS optimisations.
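The slide does not say how the test graphs were generated. One plausible reading of "100 nodes with 20% connection" is that every ordered pair of distinct nodes is linked with probability 0.2; the sketch below builds such an edge list and times a load into an in-memory dictionary. The function names, the fixed seed and the in-memory target are all assumptions for illustration, not the benchmark that produced the figures above.

```python
import random
import time


def random_edges(node_count, probability, seed=0):
    """Return (source, target) pairs; each ordered pair is kept with the given probability."""
    rng = random.Random(seed)  # fixed seed so repeated runs load the same graph
    return [
        (source, target)
        for source in range(node_count)
        for target in range(node_count)
        if source != target and rng.random() < probability
    ]


def time_load(edges):
    """Load edges into an adjacency dict and return (graph, elapsed seconds)."""
    start = time.perf_counter()
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)
    return graph, time.perf_counter() - start


edges = random_edges(100, 0.2)
graph, elapsed = time_load(edges)
print(f"{len(edges)} edges loaded in {elapsed:.4f} s")
```

Swapping the body of `time_load` for calls to a real driver would give a like-for-like comparison, since every backend would then receive the same edge list.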
  • 17. OrientDB
      ◦ Released in 2011
      ◦ More NoSQL than Neo and Titan (Documents as well as graphs)
      ◦ Scalable across multiple servers
      ◦ Supports SQL
      PyOrient:
      ◦ Official OrientDB Driver for Python
      ◦ Binary Driver
      ◦ Not Pythonic
  • 18. Titan
      ◦ First released in 2012
      ◦ Written in Java
      ◦ Licenced under Apache Licence
      ◦ Many storage backends, including Cassandra, HBase and BerkeleyDB
      ◦ Hadoop integration
      ◦ Many search back-ends
      ◦ Built for scalability
      ◦ Commercially supported by DataStax (formerly Aurelius)
  • 19. Titan and Python
      Mogwai:
      ◦ Written by Cody Lee of wellaware
      ◦ Binary Driver for RexPro Server
      ◦ Very pythonic!
      Bulbflow:
      ◦ Built by James Thornton of Pipem/Espeed
      ◦ REST-based interface
      ◦ Maintained until 8 months ago
      ◦ Connects to multiple backends
  • 20. RexPro and the Tinkerpop Stack
      Apache Incubator Open Source Graph Framework
      ◦ Built around Gremlin
      ◦ Written in Java
      ◦ Extensively documented
  • 21. Mogwai Performance
      100 nodes with 20% connection:
      ◦ Loading: 14 seconds
      ◦ Retrieving: 18 seconds
      1000 nodes with 20% connection:
      ◦ Loading: ~9 minutes
      ◦ Retrieving: ~25 minutes
  • 22. So, what should I use?*
      It depends.
      Neo4j:
      ◦ Good, relatively quick bindings
      ◦ Well supported
      ◦ Could be expensive
      ◦ May not scale
      Titan:
      ◦ Good bindings
      ◦ Support in doubt
      ◦ Should be cheaper
      ◦ Proven scalability
      Orient:
      ◦ Poor bindings
      ◦ Well supported
      ◦ Open pricing structure
      ◦ Should scale well
      *The full title of this slide is “What should I research further to ensure it meets my specific needs and then consider using?” In any case, the answer is still “It depends”.
  • 23. What about Python Graph Databases?
      Not just Python bindings – pure(ish) Python.
      GrapheekDB: https://bitbucket.org/nidusfr/grapheekdb
      ◦ Uses local memory, Kyoto Cabinet or Symas LMDB as backend
      ◦ Under active development
      ◦ Exposes client/server interface
      ◦ Code is Beta quality at best
      ◦ Documentation is very spotty
      Ajgu: https://bitbucket.org/amirouche/ajgu-graphdb/
      ◦ Uses Berkeley Database backend
      ◦ Under active development
      ◦ “This program is alpha becarful”
      ◦ Python 3 only
  • 24. Ajgu
      Set up a connection:
      ◦ graph = GraphDatabase(Storage('./BSDDB/graph'))
      Create a node:
      ◦ transaction = self.graph.transaction(sync=True)
      ◦ node = transaction.vertex.create(node)
      Find a node:
      ◦ transaction.vertex.label(start)
      Create a relationship:
      ◦ relationship=transaction.edge.create(node1,node2)
  • 25. Take-aways
      ◦ Graphs match plenty of data sets
      ◦ The big three Graph Databases are Neo4j, Titan and Orient
      ◦ All three have upsides and downsides, depending on the use case.
      ◦ If you want to have a bit more fun, try Ajgu or Grapheek!
  • 27. Py2Neo: Performance and Transactional Support
      Large imports should be done in one transaction to decrease overhead:
          Graph.create(long_list_of_nodes_and_relationships)
      This kills the client (essentially hangs in string processing). So:
          for chunk in izip_longest(*[iter(iterator)]*size, fillvalue=''):
              try:
                  chunk = chunk[0:chunk.index('')]
              except ValueError:
                  pass
              try:
                  self.graph.create(*chunk)
              except Exception as ex:
                  pass  # chunk dividing goes here
      We lose ACID at this point. What if this fails? Have to chunk it up again to find what failed.
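The "chunk dividing goes here" step is left open on the slide. One way it might look in Python 3 is sketched below: split the input into chunks with `itertools.islice`, and when a chunk fails, halve it and retry until the failing items are isolated. The helper names and the `fake_create` stand-in are hypothetical; the sketch assumes each create call either stores the whole batch or nothing, and it does not restore the ACID guarantees that the slide says are lost.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def create_with_bisect(create, chunk, failed):
    """Try to create a chunk; on failure, split it in half and retry each half."""
    try:
        create(*chunk)
    except Exception:
        if len(chunk) == 1:
            # A single item that still fails is recorded and skipped.
            failed.append(chunk[0])
            return
        middle = len(chunk) // 2
        create_with_bisect(create, chunk[:middle], failed)
        create_with_bisect(create, chunk[middle:], failed)


# Hypothetical stand-in for graph.create: rejects any batch containing "bad".
created = []


def fake_create(*entities):
    if "bad" in entities:
        raise ValueError("batch rejected")
    created.extend(entities)


failed = []
for chunk in chunked(["n1", "n2", "bad", "n3", "n4"], size=2):
    create_with_bisect(fake_create, chunk, failed)
```

Bisecting keeps the number of retries roughly logarithmic in the chunk size for a single bad item, at the cost of the import no longer being one transaction.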