The following might sound all too familiar: we’re taking and sharing more photos than ever before, and as a result we’re drowning in them. Somewhere in that pile or stream of photos are the ones that really matter – if only we could find them. To make this easier, Albumprinter created a solution that recently launched with more than 500 million photos for 1 million existing customers.
Neo4j is a key component of this new solution. It holds all the information about how photos relate to each other, which user they belong to and which users they are shared with. In total the store size is over 1.3 TB. We will explain why we chose Neo4j, how we use it and how we scaled it to handle the massive import. Of course we will also share the lessons learned and how we solved some of our challenges.
4. Who are we
• Wouter Crooy – Solution Architect
• Ruben Heusinkveld – Technical Lead
• Neo4j Certified Professionals
5. The photo organizer
• Deliver well organized, easy to use and secure storage for all your images
• Ease the process of selecting photos for creating photo products
• Started as part of an R&D ‘skunk works’ project
12. The challenge
• Replace legacy system with the new photo organizer
• Move 1.3 PB of photos from on-premises to cloud storage
• Analyze & organize all photos (511 million)
• Data cleansing while importing
• Using the same technology / architecture during import and after
• Ability to add features while importing
• The core of the system is built in .NET
13. The import
• Hard deadline
• The factory housing the data center with all the photos was closing
• Started 1st of April
• Minimum processing of 150 images / second
• ~500 queries / second to Neo4j
• Up to 700 EC2 instances on AWS
14. How we did it
• Micro services
• Command Query Responsibility Segregation (CQRS)
• Cluster
• Multiple write nodes
• Single master, read-only nodes
• HAProxy
• Cypher only via REST interface
• .NET Neo4jClient
16. Why we chose Neo4j
• Close to domain model
• Not an ordinary (relational) database
• Looking for relations between photos/users
• Scalable
• Flexible schema
• Natural / fluent queries
• ACID / data consistency
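To illustrate how close Cypher stays to this domain model, here is a minimal sketch of the kind of question the organizer needs to answer. The :BelongsTo relationship matches the schema shown later; :SharedWith and the property names are only assumptions for the example, not necessarily the production schema.

  // Which users has this owner shared photos with, and how many photos can each of them see?
  match (owner:User { Id: "001" })<-[:BelongsTo]-(p:Photo)-[:SharedWith]->(friend:User)
  return friend.Id, count(p) as sharedPhotos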
21. Our Neo4j database
• More than 1 billion nodes
• 4.1 billion properties
• 2.6 billion relations
• Total store size of 863 GB
22. Command Query Responsibility Segregation
• Separation between writing and reading data
• Different models for the Query and Command APIs
• Independent scaling
[Diagram: CQRS architecture – the UI sends Commands to a write component that updates the DB and publishes changes, while Queries are served by a separate read component backed by a cache]
24. CQRS: Separate Reads & Writes
• No active event publishing in place
• Specific scenarios for updating / writing data
• Ability to create separate models for read and write
• Updates (pieces of) the user graph
• Requires reliable and consistent reads
• Scale out -> excessive locking of the (user) graph
• After import
• Low performance scenarios -> cache with lower update priority
25. Read after write consistency
• All reads should contain the very latest and most accurate data
• Replication delay between servers
• Split on consistency
• Article by Aseem Kishore:
• https://neo4j.com/blog/advanced-neo4j-fiftythree-reading-writing-scaling/
26. Graph locking
• Concurrency challenge
• Scale-out => more images from the same user processed concurrently
• Manage the input
• High spread of user/image combination
• Prevent concurrent analysis of multiple images from the same user
• :GET /db/manage/server/jmx/domain/org.neo4j/instance%3Dkernel%230%2Cname%3DLocking
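The contention comes from how Neo4j takes locks: creating (or deleting) a relationship write-locks both end nodes, so every photo written for the same user briefly locks that user’s node. A minimal sketch of such a write (labels and properties follow the earlier slides; the exact production query differs):

  // Creating the [:BelongsTo] relationship write-locks the shared (:User) node.
  // Many of these running concurrently for the same user serialize on that lock,
  // which is why the import spreads out images belonging to the same user.
  match (u:User { Id: "001" })
  create (p:Photo { Id: "photo-123", SecondsSinceEpoch: 1467331200 })
  create (p)-[:BelongsTo]->(u)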
27. Batch insert vs single insert
• Cypher CSV import per 1000 records (see the sketch below)
• Prevent locking caused by concurrency issues
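As a rough sketch of what such a batched import can look like (the file name, columns and properties are assumptions for the example, not the actual import files):

  // Commit every 1,000 rows instead of one huge, or one-per-row, transaction.
  using periodic commit 1000
  load csv with headers from 'file:///photos.csv' as row
  merge (u:User { Id: row.UserId })
  // toInt() on older Neo4j versions, toInteger() on 3.1+
  create (p:Photo { Id: row.PhotoId, SecondsSinceEpoch: toInt(row.SecondsSinceEpoch) })
  create (p)-[:BelongsTo]->(u)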
28. No infinite scale out
• Find the sweet spot for the amount of cluster nodes
• +1 node => more replication updates => higher load on the write master
29. Timeline
• We’re looking for photos that should belong together based on date taken.
• Moving from a full property scan to graph walking via the timeline.
• For large collections, 75% fewer DB hits
• Walking the timeline when looking for photos within a certain timeframe (see the sketch below)
• Fewer photos to evaluate for the property scan (SecondsSinceEpoch)
• Works perfectly for year, month, day selections
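A minimal sketch of what walking the timeline can look like, assuming year/month/day nodes hang off the user; the labels and relationship types (:HasTimeline, :HasMonth, :HasDay, :TakenOn) are illustrative, not necessarily the production schema:

  // Walk down to the days in the requested range first, then only evaluate
  // the SecondsSinceEpoch property on the photos attached to those days.
  match (u:User { Id: "001" })-[:HasTimeline]->(:Year { Value: 2016 })
        -[:HasMonth]->(:Month { Value: 7 })-[:HasDay]->(d:Day)<-[:TakenOn]-(p:Photo)
  where p.SecondsSinceEpoch >= {fromSeconds} and p.SecondsSinceEpoch < {toSeconds}
  return p order by p.SecondsSinceEpoch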
30. .NET & REST interface
• Custom headers to the REST Cypher endpoint (filtered by HAProxy)
• To route to multiple write servers
• Sticky session per user
• Custom additions to the .NET Neo4jClient
• Managing the JSON result set
31. Graph design considerations
• Property scan
• (User)<-[:BelongsTo]-(Photo)
• More photos
• Property search => full-graph-scan
• Differentiating property
• Create node
• No path/clustered indexes (yet…)
• Making changes to the schema…
• For 550+ million nodes
32. Graph design improvements
Property search:
  match (u:User { Id: "001" })<-[:BelongsTo]-(p:Photo)
  where p.Favourite = true
  return p
  => 2812 db hits

Node/relationship search:
  match (u:User { Id: "001" })-[:HasFavourites]-(f:Favourites)<-[:IsFavourite]-(p:Photo)
  return p
  => 13 db hits
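Moving to this design means migrating hundreds of millions of photos, so it helps to run the change in small batches. Below is a minimal, illustrative sketch of such a batched migration; it assumes each user already has a Favourites node and that :HasFavourites points from the user to that node, and it is not the exact production query:

  // Run repeatedly until it returns 0: each run migrates at most 10,000 photos,
  // so no single transaction has to touch the whole graph.
  match (p:Photo)-[:BelongsTo]->(u:User)-[:HasFavourites]->(f:Favourites)
  where p.Favourite = true
  with p, f limit 10000
  merge (p)-[:IsFavourite]->(f)
  remove p.Favourite
  return count(*) as migrated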
• dbms.logs.query.* (don’t forget to enable parameter logging)
• Our alternative: Integrate with Kibana / Elasticsearch
• https://neo4j.com/docs/operations-manual/current/reference/
Ruben
At Albelli we want to inspire people to relive and share life’s moments by easily creating beautiful personalized photo products. Vision: To brighten up the world by bringing people’s moments to life.
Albumprinter is a Cimpress company. The best-known Cimpress brand here in the US is Vistaprint. I’m sure you all know it.
Albumprinter is based in Amsterdam, The Netherlands.
We have multiple consumer brands to serve the European market
Albumprinter acquired FotoKnudsen in June 2014
Ruben
Goal: Deliver well organized, easy to use and secure storage for all your images
Built by a team of 5 (1 designer, 1 frontend developer, 1 quality engineer, and Wouter and myself focusing on the backend)
Ruben
Launched June of this year
Available on all devices
Ruben
Photos are automatically grouped together into events
Ruben:
Easy to share photos with friends or publicly if you want
Privately via invites
Ruben:
The photos can be used to create any product like a photo book, calendar or wall decor
Ruben:
The photos can be used to create any product like a photo book, calendar or wall decor
Wouter
Wouter
Not uploading duplicates
Wouter
Wouter
Wouter
In Neo4j we only store the metadata. The actual photos are stored in Amazon Simple Storage Service (S3).
Wouter
Ruben
Ruben
Ruben
Ruben
Ruben
For all those photos this resulted in:
More than 1 billion nodes
4.1 billion properties
2.6 billion relations
Total store size of 863 GB
Ruben
I know it’s really ambitious to explain CQRS within 2 slides, but I would still like to explain why and how it can work with Neo4j.
Event sourcing.
Double update to the DB and the cache.
In our case we used a cache update/flush based on certain rules.
Pro: less work; the database is too large to cache fully.
Con: the cache is not always a reliable source.
Wouter
Wouter
Neo4j at its core is very capable of handling CQRS interfaces, since you’re not updating a table but (parts of) the graph.
Due to its ACID nature it should also be able to make sure there are no race conditions.
But since this architecture allows you to massively scale out, that does not always match the capabilities of an ACID DB.
Especially in cases where writes occur more often than reads.
Make sure the read is consistent.
In our situation, CQRS is extra complex since we have an ordered crawler (5+ steps) which also does the writes. But the crawler(s) and the query API are still allowed to do reads.
https://www.infoq.com/news/2015/05/cqrs-advantages
http://udidahan.com/2011/04/22/when-to-avoid-cqrs/
http://udidahan.com/2009/12/09/clarified-cqrs/
http://udidahan.com/2010/08/31/race-conditions-dont-exist/
See also the consistent read solution. In cases where we don’t need a consistent read we can use the cache.
Wouter
Reads vastly outnumber writes in our application, as in many applications.
Split on consistency, not read vs. write
Track user last write time for read after write consistency
Monitor and tune slave lag, via push/pull configs
Stick slaves by user for read after read consistency
https://neo4j.com/blog/advanced-neo4j-fiftythree-reading-writing-scaling/
Credits to Aseem Kishore and his team at FiftyThree for sharing this at the conference last year.