When the data management needs of our web applications go beyond simple persistence in a relational database (RDBMS), alternative technologies known as "NoSQL" become necessary. We will cover the four main NoSQL families (Key/Value, Document, Column-oriented and Graph) and how to integrate them into PHP applications.
3. Association Française des Utilisateurs de PHP
• Created in 2001
• Forum PHP (21 & 22 November 2013 in Paris)
• AperoPHP and Rendez-Vous
• Local chapters
• President in 2009
www.afup.org
Association Francophone des utilisateurs de SYmfony
• Started in 2010 by Hugo Hamon
• Not yet a formal association
• Monthly Sfpot: a talk followed by drinks
• Chapters in Marseille, Lyon ??
www.afsy.fr
4. Elao
• Founder in 2005
• Lyon & Paris
• Technical web agency of 15 people
• Symfony since 2006
• Official SensioLabs partner
www.elao.com
10. Data Size
• 500 million page views a day
• ~3TB of new data to store a day
• Posts are about 50GB a day. Follower list updates are about 2.7TB a day.
12. Uniformity
• Semi-structured data
• Different data lifecycle
• Store more data about each entity
• Individualization & decentralization of content generation
16. [Diagram: the four NoSQL families. Key/Value: keys mapped to values. Column-oriented: a key mapped to Column 1, Column 2, Column 3 values. Document-oriented: a key mapped to fields (Field 1, Field A, Field B, Field 2). Graph: Node 1 to Node 5 linked together.]
18. Key-value databases
• Inspired by Amazon’s Dynamo (2007)
• Global collection of key-value pairs
• Big scalable HashMap
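To make the "big scalable HashMap" idea concrete, here is a minimal Python sketch, not tied to any real product. The class name `ShardedKeyValueStore`, the shard count and the key names are assumptions made up for illustration; each in-memory dict stands in for one server.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store: one hash map split across several shards."""

    def __init__(self, shard_count: int = 4) -> None:
        # Each dict stands in for one server holding part of the keys.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key: str) -> dict:
        # A stable hash sends the same key to the same shard every time.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def set(self, key: str, value: bytes) -> None:
        self._shard_for(key)[key] = value

    def get(self, key: str, default=None):
        return self._shard_for(key).get(key, default)

    def delete(self, key: str) -> None:
        self._shard_for(key).pop(key, None)


store = ShardedKeyValueStore(shard_count=3)
store.set("session:42", b'{"user": "alice"}')
store.set("cache:home", b"<html>...</html>")
```

The store never looks inside a value: every operation goes through the key, which is what keeps the model simple and easy to spread over several machines.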
19. Key-value databases
• Strengths
  • Simple data model
  • High performance
  • Great at scaling out horizontally
• Weaknesses
  • Simplistic data model
  • Poor for complex data
20. Key-value databases: Redis
• Written in C - BSD License - 2009
• Very fast and lightweight
• All data in memory
• Persistence
• Master/Slave Replication
• Used for caching, sessions, or work queues
http://redis.io/
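The caching, session and queue use cases can be sketched without a server. The following is an in-memory stand-in written for illustration only: `ExpiringCache` and `WorkQueue` are hypothetical names, not the Redis API, and the injected fake clock is only there to make the example deterministic.

```python
import time
from collections import deque


class ExpiringCache:
    """In-memory stand-in for a key-value cache with per-key expiry."""

    def __init__(self, clock=time.monotonic) -> None:
        self.clock = clock
        self.entries = {}  # key -> (value, expires_at)

    def set(self, key: str, value, ttl_seconds: float) -> None:
        self.entries[key] = (value, self.clock() + ttl_seconds)

    def get(self, key: str):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[key]  # lazily evict the expired key
            return None
        return value


class WorkQueue:
    """First-in, first-out job queue backed by a deque."""

    def __init__(self) -> None:
        self.jobs = deque()

    def push(self, job: str) -> None:
        self.jobs.append(job)

    def pop(self):
        return self.jobs.popleft() if self.jobs else None


now = [0.0]  # fake clock, advanced by hand below
sessions = ExpiringCache(clock=lambda: now[0])
sessions.set("session:42", {"user": "alice"}, ttl_seconds=30)
fresh = sessions.get("session:42")
now[0] = 31.0
expired = sessions.get("session:42")

queue = WorkQueue()
queue.push("send-email:1")
queue.push("resize-image:7")
first_job = queue.pop()
```

A session is just a value that disappears after its time to live, and a work queue is a list consumed from one end; a key-value server offers both behind simple keyed commands.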
23. Document databases
• Inspired by IBM Lotus Notes/Domino
• Same as key-value, but the value is a document
• A document is a key-value collection
• Flexible schema
• Non-relational, data is de-normalized
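As a rough illustration of "value as a document" with a flexible, de-normalized schema, here is a small Python sketch. `DocumentCollection`, the `_id` field name and the sample posts are assumptions for the example, not the interface of any particular database.

```python
import uuid


class DocumentCollection:
    """Toy document collection: document id -> nested dict."""

    def __init__(self) -> None:
        self.documents = {}

    def insert(self, document: dict) -> str:
        # No schema check: each document may carry different fields.
        doc_id = document.get("_id") or uuid.uuid4().hex
        self.documents[doc_id] = {**document, "_id": doc_id}
        return doc_id

    def get(self, doc_id: str):
        return self.documents.get(doc_id)

    def find(self, **criteria) -> list:
        # Full scan on top-level fields; a real store would rely on indexes.
        return [
            doc for doc in self.documents.values()
            if all(doc.get(name) == value for name, value in criteria.items())
        ]


posts = DocumentCollection()
posts.insert({
    "_id": "post-1",
    "title": "Hello NoSQL",
    "author": {"name": "alice", "city": "Lyon"},  # embedded, not joined
    "tags": ["php", "nosql"],
})
posts.insert({"_id": "post-2", "title": "Untitled draft", "draft": True})
```

The two documents share a collection but not a shape, and the author is embedded in the post instead of living in a separate table.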
24. Document databases
• Strengths
  • Simple, powerful data model
  • Good scaling, easy/auto sharding
  • Usually “ACID” compliant
• Weaknesses
  • Unsuited for interconnected data
  • Query model limited to keys (and indexes)
25. Document databases: MongoDB
• Written in C++ - AGPL License - 2009
• JSON-style documents
• Full Index Support
• Fast In-Place Updates
• Auto-Sharding
• Replication & High Availability
• Many connectors
• Big Community
• Commercial Support
http://www.mongodb.org
26. Document databases
• Lotus Notes / Domino
• CouchDB: written in Erlang, JavaScript for queries
• OrientDB: written in Java, relationships stored as a graph
28. Graph databases
• Nodes with properties
• Named relationships with properties
• Focus on the data structure
• Each node holds direct pointers to its adjacent elements, so no index lookups are necessary
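The "direct pointer" idea can be sketched in a few lines of Python. The `Node` and `Relationship` classes, the `FOLLOWS` relationship name and the sample users are all assumptions chosen for illustration; a real graph database adds storage, transactions and a query language on top.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    name: str
    target: "Node"
    properties: dict = field(default_factory=dict)


@dataclass
class Node:
    properties: dict
    # Direct references to adjacent nodes: traversal needs no index lookup.
    outgoing: list = field(default_factory=list)

    def relate(self, name: str, target: "Node", **properties) -> None:
        self.outgoing.append(Relationship(name, target, properties))

    def neighbours(self, name: str) -> list:
        return [rel.target for rel in self.outgoing if rel.name == name]


def friends_of_friends(start: Node) -> set:
    """Names reached by following two FOLLOWS hops, excluding the start node."""
    names = set()
    for friend in start.neighbours("FOLLOWS"):
        for candidate in friend.neighbours("FOLLOWS"):
            if candidate is not start:
                names.add(candidate.properties["name"])
    return names


alice = Node({"name": "alice"})
bob = Node({"name": "bob"})
carol = Node({"name": "carol"})
alice.relate("FOLLOWS", bob, since=2012)
bob.relate("FOLLOWS", carol, since=2013)
bob.relate("FOLLOWS", alice, since=2013)
```

The traversal only walks references held by the nodes themselves, so its cost depends on the neighbourhood visited, not on the total size of the graph.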
29. Graph databases
• Strengths
  • Powerful data model
  • Fast for connected data
  • A new data architecture
• Weaknesses
  • No sharding: all data in one instance
  • Querying on node/relationship properties kills performance
  • A new data architecture
33. Column-oriented database
• A big table, with column families
• Data stored by column instead of row
• Built for distributed architectures
• Map-reduce for querying/processing
• Flexible schema
• Easy sharding (partitioning)
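A minimal Python sketch of the "big table with column families" layout follows. `ColumnFamilyTable`, the family names `profile` and `stats` and the sample rows are assumptions for illustration; distribution, partitioning and map-reduce are left out.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy wide-column table: storage is grouped by column family."""

    def __init__(self, families: list) -> None:
        # One independent store per family: family -> row key -> {column: value}
        self.families = {name: defaultdict(dict) for name in families}

    def put(self, row_key: str, family: str, column: str, value) -> None:
        # Columns are created on write; rows need not share the same columns.
        self.families[family][row_key][column] = value

    def get_row(self, row_key: str, family: str) -> dict:
        # .get avoids creating an empty row as a side effect of reading.
        return dict(self.families[family].get(row_key, {}))

    def scan_column(self, family: str, column: str) -> dict:
        # Reads one column across all rows without touching other families.
        return {
            row_key: columns[column]
            for row_key, columns in self.families[family].items()
            if column in columns
        }


users = ColumnFamilyTable(families=["profile", "stats"])
users.put("alice", "profile", "city", "Lyon")
users.put("alice", "stats", "followers", 120)
users.put("bob", "stats", "followers", 7)
users.put("bob", "stats", "posts", 3)
```

Reading the `followers` column never loads profile data, which is the point of grouping storage by column family instead of by row.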
34. Column-oriented database
• Strengths
  • Data model supports semi-structured data
  • Naturally indexed (columns)
  • Horizontally scalable: read/write throughput increases linearly
  • Fault tolerant: no single point of failure
• Weaknesses
  • Unsuited for interconnected data
35. Column-oriented database: Cassandra
• Written in Java - Apache License 2.0 - 2008
• Originally developed by Facebook
• Decentralized
• Supports replication and multi data center replication
• Scalability
• Fault-tolerant
• MapReduce support
http://cassandra.apache.org/
37. Conclusion
• NoSQL choices impact the application architecture
• Store your data in the way you want to query it
• Denormalize your data and try to keep it up to date!
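The last two points can be illustrated with a short Python sketch. The structure names (`posts_by_author`, `timeline_by_follower`), the follower list and the rename scenario are assumptions invented for the example: data is copied at write time into the shape each query needs, and every copy then has to be kept in sync.

```python
from collections import defaultdict

# Two "tables", each shaped for one query (hypothetical names).
posts_by_author = defaultdict(list)       # "show me this author's posts"
timeline_by_follower = defaultdict(list)  # "show me my home timeline"
followers = {"alice": ["bob", "carol"]}


def publish(author: str, text: str) -> None:
    post = {"author": author, "text": text}
    posts_by_author[author].append(post)
    # Denormalize at write time: copy the post into each follower's timeline.
    for follower in followers.get(author, []):
        timeline_by_follower[follower].append(dict(post))


def rename_author(old: str, new: str) -> None:
    # The cost of denormalization: every copy must be updated.
    posts_by_author[new] = posts_by_author.pop(old, [])
    followers[new] = followers.pop(old, [])
    for post in posts_by_author[new]:
        post["author"] = new
    for timeline in timeline_by_follower.values():
        for post in timeline:
            if post["author"] == old:
                post["author"] = new


publish("alice", "NoSQL and PHP")
rename_author("alice", "alice_d")
```

Reads become a single lookup per query, while a change such as a rename has to touch every place the data was copied to.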