Graph processing and graph databases have been with us for a while. However, because their physical implementations look the same for every database in production (nodes connected to nodes, or triples), there's a perception that data modeling (and data modelers) have no role on projects where graph databases are used.
This month we'll talk about where graph databases are the best fit in a modern data architecture and where data models add value.
Graph Databases - Where Do We Do the Modeling Part?
1. Karen Lopez @datachick #HeartData
Heart of Data Modeling
Graph Databases: Where does the modeling go?
2. Yes, Please do Tweet/Share today’s event
@datachick #heartdata
3. Karen López
Karen has 20+ years of data and information architecture experience on large, multi-project programs.
She is a frequent speaker on data modeling, data-driven methodologies and pattern data models.
She wants you to love your data…
She loves new tech and gadgets
4. How new tech are you?
...so let’s get to know you….
14. Ragged Hierarchies
A hierarchy where there is variability in the number of levels across branches.
[Diagram: a hierarchy of Node boxes]
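A minimal Python sketch of the definition above, assuming the hierarchy is held as a simple parent-to-children map. The node names and the helper functions `leaf_depths` and `is_ragged` are hypothetical and do not come from the slides; the idea is only that a hierarchy is ragged when its leaves sit at different depths.

```python
from typing import Dict, List

# Hypothetical parent -> children map; the node names are placeholders.
children: Dict[str, List[str]] = {
    "root": ["a", "b", "c"],
    "a": ["a1"],
    "a1": ["a1x"],
    "b": [],
    "c": ["c1"],
}


def leaf_depths(tree: Dict[str, List[str]], node: str, depth: int = 0) -> Dict[str, int]:
    """Return {leaf name: depth} for every leaf at or below `node`."""
    kids = tree.get(node, [])
    if not kids:
        return {node: depth}
    depths: Dict[str, int] = {}
    for kid in kids:
        depths.update(leaf_depths(tree, kid, depth + 1))
    return depths


def is_ragged(tree: Dict[str, List[str]], root: str) -> bool:
    """A hierarchy is ragged when its leaves are not all at the same depth."""
    return len(set(leaf_depths(tree, root).values())) > 1


print(leaf_depths(children, "root"))
print(is_ragged(children, "root"))
```

In this sample the three branches end at depths 3, 1 and 2, so the check reports a ragged hierarchy.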
15. What Happens When…….?
Sometimes we take a group of “sibling widgets” and make them a widget just for them. Think “subassembly”. Then we have to think of this new group as a widget.
[Diagram 1, parts hierarchy: Automobile; Engine; Fuel Line, Valve, Injector, Fan; Fan Blade, Bearing, Bolt, Fanbelt; Entertainment System; Bolt, Radio, Satellite Radio, Media Player, Backup camera]
[Diagram 2, parts hierarchy: Automobile; Engine; Injection System; Fuel Line, Valve, Injector; Fan; Fan Blade, Fan Bearing, Fanbelt; Entertainment System; Bolt, Radio, Satellite Radio, Media Player, Backup camera]
[Diagram label: Energy Graph]
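The subassembly idea can be sketched in Python as follows. This is an illustration only: the exact parent-child layout of the slide's diagram is not recoverable, so the `parts` map below is an assumed arrangement of some of the slide's part names, and `group_siblings` is a hypothetical helper.

```python
from typing import Dict, List

# Assumed parent -> children layout using part names from the slide.
parts: Dict[str, List[str]] = {
    "Automobile": ["Engine", "Entertainment System"],
    "Engine": ["Fuel Line", "Valve", "Injector", "Fan"],
    "Fan": ["Fan Blade", "Bearing", "Fanbelt"],
}


def group_siblings(tree: Dict[str, List[str]], parent: str,
                   siblings: List[str], new_name: str) -> Dict[str, List[str]]:
    """Return a copy of `tree` where `siblings` hang under a new child of `parent`."""
    missing = [s for s in siblings if s not in tree.get(parent, [])]
    if missing:
        raise ValueError(f"not children of {parent}: {missing}")
    result = {k: list(v) for k, v in tree.items()}  # leave the input untouched
    kept = [c for c in result[parent] if c not in siblings]
    result[parent] = [new_name] + kept   # the new group is itself a widget
    result[new_name] = list(siblings)    # the old siblings move one level down
    return result


after = group_siblings(parts, "Engine",
                       ["Fuel Line", "Valve", "Injector"], "Injection System")
print(after["Engine"])
print(after["Injection System"])
```

Note that the grouped parts end up one level deeper than their former sibling, which is one way a tidy hierarchy becomes ragged.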
16. How we Model Graph in Relational
Lots of tricks and tips happening here.
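The slide does not say which tricks it has in mind. One common approach, shown here as an assumption rather than as the presenter's method, is an adjacency-list table (each row points at its parent) walked with a recursive query. The sketch uses Python's built-in `sqlite3` module; the `part` table, its columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (
        part_id   INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES part(part_id)  -- self-reference = the edge
    );
    INSERT INTO part VALUES
        (1, 'Automobile', NULL),
        (2, 'Engine', 1),
        (3, 'Fan', 2),
        (4, 'Fan Blade', 3),
        (5, 'Entertainment System', 1),
        (6, 'Radio', 5);
""")


def descendants(connection, root_name):
    """Return (name, depth) for every part below `root_name`."""
    rows = connection.execute("""
        WITH RECURSIVE subtree(part_id, name, depth) AS (
            SELECT part_id, name, 0 FROM part WHERE name = ?
            UNION ALL
            SELECT p.part_id, p.name, s.depth + 1
            FROM part AS p JOIN subtree AS s ON p.parent_id = s.part_id
        )
        SELECT name, depth FROM subtree WHERE depth > 0 ORDER BY depth, name
    """, (root_name,))
    return rows.fetchall()


print(descendants(conn, "Engine"))
```

The recursive common table expression is what stands in for a graph traversal here; every extra level of depth is another self-join that the database performs on our behalf.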
28. Labeled Property Graph
Nodes have properties (think key-value pairs)
Nodes have labels (think metadata and categories)
Relationships are directed
Relationships have names
Relationships have a start and end node
Relationships have properties
[Diagram: nodes with properties and labels]
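The parts of a labeled property graph listed above can be sketched as two small Python data classes. This is a teaching sketch, not how any graph database stores data; the class names `Node` and `Relationship` and their fields are hypothetical, and the sample values borrow from the movie example used later in this deck.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Node:
    labels: Set[str] = field(default_factory=set)              # categories
    properties: Dict[str, Any] = field(default_factory=dict)   # key-value pairs


@dataclass
class Relationship:
    name: str     # relationships have names
    start: Node   # direction runs from start...
    end: Node     # ...to end
    properties: Dict[str, Any] = field(default_factory=dict)


keanu = Node(labels={"Actor"}, properties={"name": "Keanu Reeves"})
matrix = Node(labels={"Movie"}, properties={"title": "The Matrix"})
acts_in = Relationship("ACTS_IN", start=keanu, end=matrix,
                       properties={"role": "Neo"})

print(acts_in.start.properties["name"], acts_in.name,
      acts_in.end.properties["title"])
```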
32. TripleStores
Come from semantic technologies movement
A triple is a subject:predicate:object data structure
Individually, triples are semantically poor
En masse, they provide a rich dataset from which to harvest knowledge and infer connections
Use RDF and XML; SPARQL for queries
Ginger dances with Fred
Fred likes ice cream
Karen loves data
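To make the subject:predicate:object structure concrete, here is a small Python sketch that holds the three example statements as tuples and matches them against a pattern. It is not RDF or SPARQL; the `match` helper is hypothetical and only imitates the idea of querying by leaving some positions open.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

triples: Set[Triple] = {
    ("Ginger", "dances with", "Fred"),
    ("Fred", "likes", "ice cream"),
    ("Karen", "loves", "data"),
}


def match(store: Set[Triple], subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


print(match(triples, subject="Fred"))  # what do we know about Fred as a subject?
print(match(triples, obj="Fred"))      # who is connected to Fred?
```

A single triple says little, but asking for everything that mentions Fred in either position already links Ginger to ice cream through him.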
33. Graph Databases – Neo4j
CREATE (matrix1:Movie { title : 'The Matrix', year : '1999-03-31' })
CREATE (matrix2:Movie { title : 'The Matrix Reloaded', year : '2003-05-07' })
CREATE (matrix3:Movie { title : 'The Matrix Revolutions', year : '2003-10-27' })
CREATE (keanu:Actor { name:'Keanu Reeves' })
CREATE (laurence:Actor { name:'Laurence Fishburne' })
CREATE (carrieanne:Actor { name:'Carrie-Anne Moss' })
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix1)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix2)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix3)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix1)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix2)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix3)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix1)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix2)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix3)
http://neo4j.com/docs/stable/cypherdoc-movie-database.html
40. Tools and Graph Databases
• ERwin: No native support
• ER/Studio: No native support
• PowerDesigner: No native support
“the data model is the database”
“the database is the data model”
ODBC / JDBC connectivity for querying.
41. So what about modeling for graph?
Marketing vs. Real Project
There is a model
The model isn’t the structure
The model would be used to design the graph(s)
Same modeling issues:
Naming
Properties
Rules
Consistency
Governance
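One way to picture a model that is separate from the graph structure is a small declared set of rules that graph data is checked against. The Python sketch below is only an illustration under that assumption: `MODEL`, `check_node` and `check_relationship` are hypothetical names, and the rules reuse the Movie and Actor example from the Neo4j slide.

```python
# Hypothetical model: allowed labels with their required properties, and
# allowed relationships as (start label, name, end label).
MODEL = {
    "nodes": {"Movie": {"title"}, "Actor": {"name"}},
    "relationships": {("Actor", "ACTS_IN", "Movie")},
}


def check_node(label, properties, model=MODEL):
    """Return a list of problems for one node; an empty list means it conforms."""
    required = model["nodes"].get(label)
    if required is None:
        return [f"unknown label: {label}"]  # naming rule
    return [f"{label} is missing property: {key}"
            for key in sorted(required - set(properties))]  # property rule


def check_relationship(start_label, name, end_label, model=MODEL):
    """Return a list of problems for one relationship."""
    if (start_label, name, end_label) in model["relationships"]:
        return []
    return [f"relationship not in model: ({start_label})-[:{name}]->({end_label})"]


print(check_node("Movie", {"title": "The Matrix"}))
print(check_node("Film", {"title": "The Matrix"}))
print(check_relationship("Movie", "ACTS_IN", "Actor"))
```

The graph database would accept all three inputs; it is the model that says the second label is misnamed and the third relationship points the wrong way.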
42. Data Modeling & Graph
No* Logical + Physical Data model
The graph is the data model...and the database
Whiteboard data modeling
Traditional data models still have a role
45. 10+ Tips for Architects
1. Understand the use cases for graph technologies
2. Evaluate/profile your data requirements for suitability for graph databases and/or graph processing
3. ACID support varies across products. You’ll want to test your use cases.
4. Your query data stories will guide your decisions
5. Test your current development tools for support
46. 10+ Tips for Architects
6. Test your database design/data modeling tools
7. Leverage your existing metadata/models
8. True hierarchies are VERY RARE in the real world.
9. Know the questions you have to ask about all the exceptions
10. Keep asking where the data integrity happens/is relevant
49. Fun with Graphs
Scotch Whiskies
Belgian Beer
Bank Fraud Detection
Access Control Management
…and 100 more….
GraphGist Project - http://gist.neo4j.org/