Neo4j - 7 databases in 7 weeks


A presentation of Neo4j based on my experience working through the book 7 Databases in 7 Weeks.

  1. 7 databases in 7 weeks. Neo4j - A graph database presentation. Nicolas Landier. © OCTO 2012, 50, avenue des Champs-Elysées, 75008 Paris, FRANCE. Tel: +33 (0)1 58 56 10 00, Fax: +33 (0)1 58 56 10 01
  2. Outline
     - Neo4j, WTF?
     - BTW, what is a graph?
     - Day-1: WebUI
     - Day-2: REST API
     - Day-3: Consistency and HA
     - Demo?!
     - Strengths & Weaknesses
  3. Neo4j, WTF?
     - Graph-oriented database
     - Key-value minded
     - Schemaless
     - Graph theory based
     - Whiteboard friendly…
  4. Neo4j loves whiteboards
  5. BTW, what is a graph?
     - Node = Vertex
     - Relationship = Edge
     - Property = Attribute
     - Example: node "Nexus 4" (manufacturer: 'LG', color: 'Black'); node "Nicolas" (traits: {'nice', 'awesome'}); relationship "Loves" (how_much: 'you don’t want to know')
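The node / relationship / property vocabulary of this slide can be sketched in a few lines of plain Python. This is only an illustration of the data model, not Neo4j's API: the `Node` and `Relationship` classes are hypothetical, and the direction of the "Loves" relationship (Nicolas towards the Nexus 4) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A vertex carrying free-form properties (hypothetical class)."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed edge that can carry its own properties."""
    start: Node
    kind: str
    end: Node
    properties: dict = field(default_factory=dict)


# The example from the slide; the edge direction is assumed.
nicolas = Node("Nicolas", {"traits": {"nice", "awesome"}})
nexus4 = Node("Nexus 4", {"manufacturer": "LG", "color": "Black"})
loves = Relationship(nicolas, "Loves", nexus4,
                     {"how_much": "you don't want to know"})

print(f"{loves.start.name} -[{loves.kind}]-> {loves.end.name}")
```

Note that properties live on the relationship as well as on the nodes, which is what distinguishes this model from a bare adjacency list.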
  6. Day-1: Head first in the web UI!
     - Great and useful WebUI to get familiar and test
     - Create a first data model
     - Gremlin: a Groovy-based syntax => exploratory queries
     - Cypher: a SQL-like syntax => criteria queries
     - Create your own DSL
     - g.V.filter{'Prancing Wolf Ice Wine 2007'}
     - ###CRASH!! $$DB files fucked up### PLEASE FORMAT###
  7. You can query Neo4j from Java, but you had better stick with Cypher if you do so!
       ice_wine = g.v(0)
       ice_wine.out('grape_type').in('grape_type').filter{ !it.equals(ice_wine) }
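To make the traversal on this slide easier to read, here is a rough Python equivalent over an in-memory edge list: follow the outgoing `grape_type` edge, come back along incoming `grape_type` edges, then drop the starting wine. The helper names `out` and `in_` and the sample wines are assumptions made for illustration; this mimics the shape of the Gremlin pipeline, not its implementation.

```python
# Each edge is (source, label, target); the sample data is illustrative only.
edges = [
    ("Prancing Wolf Ice Wine 2007", "grape_type", "riesling"),
    ("Prancing Wolf Kabinett 2002", "grape_type", "riesling"),
    ("Prancing Wolf Spatlese 2007", "grape_type", "riesling"),
]


def out(nodes, label):
    """Follow outgoing edges with the given label from each node."""
    return [t for n in nodes for (s, l, t) in edges if s == n and l == label]


def in_(nodes, label):
    """Follow incoming edges with the given label into each node."""
    return [s for n in nodes for (s, l, t) in edges if t == n and l == label]


ice_wine = "Prancing Wolf Ice Wine 2007"

# Same three steps as the Gremlin query: out, in, filter out the start node.
similar = [w for w in in_(out([ice_wine], "grape_type"), "grape_type")
           if w != ice_wine]
print(similar)
```

The result is the list of wines that share a grape type with the starting wine.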
  8. Day-2: Let’s move to the REST API!
     - REST API
     - Path algorithms: shortest, all paths, Dijkstraaaahhh (500)
     - Index: manual indexing
     - Full-text search
     - Cypher
     - Movie database exercise using external graph algorithms
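For readers who have not met Dijkstra's algorithm, the weighted shortest-path idea mentioned on this slide can be sketched as follows. This is a generic textbook version in Python using a priority queue, not Neo4j's implementation or its REST call; the `dijkstra` function and the small sample graph are assumptions.

```python
import heapq


def dijkstra(graph, start, goal):
    """Return (total_cost, path) of the cheapest path, or (inf, []) if none."""
    queue = [(0, start, [start])]   # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue                # a cheaper route already reached this node
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue,
                               (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical weighted graph: node -> {neighbor: edge cost}
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}

cost, path = dijkstra(graph, "A", "D")
print(cost, path)
```

Unlike a plain shortest-path search that counts hops, this version minimizes the sum of edge weights, which is why the three-hop route through B and C beats the two-hop alternatives here.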
  9. Day-3: Let’s be HA!
     - 1 server: ACID
       - One-liners are implicitly transactions (like PostgreSQL)
       - Transactions can be handled manually
     - Cluster: no more ACID (enterprise only)
       - Master gets elected by servers
       - R/W can be handled by every server, but the Master holds the GOLD version of the data
       - Replication is strictly handled by the Master
       - Eventual consistency
       - Zookeeper-based
  10. To relax your eyes
  11. Neo4j HA summed up
      - Write transactions can be performed on any database instance in a cluster.
      - Neo4j HA is fault tolerant and can continue to operate from any number of machines down to a single machine.
      - Slaves will be automatically synchronized with the master on write operations.
      - If the master fails, a new master will be elected automatically.
      - The cluster automatically handles instances becoming unavailable (for example due to network issues), and also makes sure to accept them as members in the cluster when they are available again.
      - Transactions are atomic, consistent and durable, but eventually propagated out to other slaves.
      - Updates to slaves are eventually consistent by nature, but can be configured to be pushed optimistically from the master during commit.
      - If the master goes down, any running write transaction will be rolled back, and new transactions will block or fail until a new master has become available.
      - Reads are highly available, and the ability to handle read load scales with more database instances in the cluster.
      Source:
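The behaviour summed up on this slide (writes committed on the master, slaves catching up later, a new master elected after a failure) can be pictured with a toy simulation. Everything here is an assumption made for illustration: the `Instance` and `Cluster` classes, the "first live instance wins" election rule, and the explicit `sync()` step have nothing to do with Neo4j's actual ZooKeeper-based protocol.

```python
class Instance:
    """One database instance holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class Cluster:
    def __init__(self, names):
        self.instances = [Instance(n) for n in names]
        self.master = self.instances[0]

    def elect_master(self):
        # Toy election rule (an assumption): the first live instance wins.
        live = [i for i in self.instances if i.alive]
        if not live:
            raise RuntimeError("no instance left in the cluster")
        self.master = live[0]

    def write(self, key, value):
        # Writes are committed on the master and fail while it is down.
        if not self.master.alive:
            raise RuntimeError("master unavailable, write rejected")
        self.master.data[key] = value

    def sync(self):
        # Slaves catch up with the master later: eventual consistency.
        for inst in self.instances:
            if inst.alive and inst is not self.master:
                inst.data.update(self.master.data)


cluster = Cluster(["a", "b", "c"])
cluster.write("wine", "riesling")
slave = cluster.instances[1]
stale = "wine" not in slave.data   # the slave has not seen the write yet
cluster.sync()
fresh = slave.data["wine"]         # the write has now been propagated

cluster.master.alive = False       # the master goes down
cluster.elect_master()             # a new master is elected
cluster.write("color", "white")    # writes are accepted again
new_master = cluster.master.name
```

The window between `write` and `sync` is where a read on a slave can return stale data, which is the practical meaning of "eventually consistent" in the list above.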
  12. Demo
  13. Strengths & Weaknesses
      - 2 languages: Groovy support (Gremlin) and Cypher
      - Simple data model: « JIT » for developers
      - « Promised scalability »
      - Great for human-related network data
      - DSL definitions are written in stone (not editable, not updatable)
      - Licence cost: 6,000 USD/node/year for Commercial; 24,000 USD/node/year for Enterprise (HA)
      - Index support??
  15. Source: This information is based on my experience with the book 7 Databases in 7 Weeks, authored by Eric Redmond and Jim R. Wilson. Thanks for their work.