SlideShare a Scribd company logo
1 of 22
Graph at Scale 
10/22/14 
Brad Nussbaum 
CTO | MediaHound 
brad@mediahound.com | @bradnussbaum 
www.mediahound.com
MediaHound started with a search…
But Entertainment is more than just Film & TV
Entertainment Preferences Are Scattered 
Music Movies Television Books
All Content 
one platform that connects: 
All Sources All Devices 
All Brands All Artists All Users
The Entertainment Graph 
A comprehensive database that brings together movies, 
books, games, music, and TV, including the cast & crew, 
sources, reviews, categories, genres, lists and more! 
The Entertainment Graph powers meaningful recommendations, 
exciting data insights and comprehensive social discovery.
Use Cases: 
Academy Award Winners On Netflix 
MATCH (c:Collection)-[:CONTAINS]-(m:Movie) WHERE (m)- 
[:MATCHED_SOURCE]-(n:NETFLIX) AND c.name=“Academy Award 
Winners” RETURN m; 
Movies and Shows Based On Zombie Books 
MATCH (m:Media)-[:BASED_ON]-(b:Book)-[:HAS_TRAIT]-(z:Trait) 
WHERE (m.type=“Movie” OR m.type=“Show”) AND z.name=“Zombies” 
RETURN m; 
To access ‘The Movie Graph’ mini app (which uses a different model than above), from your browser, run 
:play movie graph
Entertainment Recommendations 
Collaborative Filtering: 
If Joe, Amy and Steve like Gladiator, AND Joe and Amy like Toy Story, 
THEN MediaHound recommends Toy Story to Steve. 
Graph Influencers 
Joe is an early adopter. 
Joe, Amy and Steve like several things in common. 
MediaHound recommends Amy and Steve follow Joe and Amy 
Joe discovers the next big hit and shares it on his feed 
Amy and Steve see Steve’s post and give it a listen.
We needed our graph database to perform 
under sustained user write load AND during 
heavy batch update operations. 
We needed to recommend media content in 
real-time which required many concurrent 
pattern matching operations on the graph.
We realized through trial and error that using 
the Transactional Cypher HTTP Endpoint was 
the BEST solution to control batch writes. 
POST http://localhost:7474/db/data/transaction/commit 
Accept: application/json; charset=UTF-8 
Content-Type: application/json 
{ 
"statements": [ 
{ 
"statement": "MERGE(n:User{username:"bradnussbaum"})-[:FOLLOWS]->(m:User{username:"bennussbaum"});" 
}, 
{ 
"statement": "MERGE(n:User{username:"bradnussbaum"})<-[:FOLLOWS]-(m:User{username:"bennussbaum"});" 
} 
] 
}
We ran tests using the low-level kernel and 
found that sustained transaction writes 
performed optimally between 400-2,000 nodes 
and relationships per transaction. 
Compare the difference between… 
• Writing a single relationship (33 bytes) per transaction f 
or 10k iterations compared to 
• Writing 1k relationships per transaction for 10 iterations. 
**As of 2.2, Neo4j will batch writes on the server 
http://neo4j.com/docs/stable/linux-performance-guide.html 
git clone git@github.com:neo4j-contrib/tooling.git
We used Enterprise Integration Patterns (EIP) 
to create optimal batch sizes for each 
transaction. 
• Splitters to break down larger messages 
• Aggregators to combine single CQL statements t 
ogether into a single batch transaction 
• Throttling to control concurrent requests and requests p 
er second
We frequently run 40+ concurrent write 
transactions to a 3 instance cluster for hours 
at time. 
Deadlocks can occur often with many concurrent write o 
perations. 
• Retry Transient Errors after a small period of time. 
• Use the Error Index to split failed TX statements. 
Read here to learn all the error status codes, seriously. 
http://neo4j.com/docs/stable/status-codes.html
Test write throughput on your cluster with the 
push factor you plan to use in production and 
intentionally kill your master under load. 
You need to have two load balancers: 
• One for all the instances you want performing reads 
• One for your master ONLY  (send writes here) 
Master Check: /db/manage/server/ha/master 
- Returns true|false 
Slave Check: /db/manage/server/ha/slave 
- Returns true|false 
Available Check: /db/manage/server/ha/available 
- Returns master|slave
Check your driver for transaction support. 
Embedded mode has full transaction support 
but most remote drivers do not at this time. 
This will be changing in the near future…depending on 
which driver you use. 
**Spring Data Neo4j is actively being developed to included these features as part of 2.2.
Graph at Scale 
10/22/14 
Ben Nussbaum 
Director of Engineering | MediaHound 
ben@mediahound.com | @bennussbaum 
www.mediahound.com
We built custom algorithms that needed run-time 
decision making as Neo4j Extensions 
with Spring Data Neo4j. 
• Cache abstraction with Google’s Guava to build large 
in-memory indexes of nodes and relationships. 
• Integration for jobs instructions and results to and from 
the broker. 
• Async for batch job processing. 
https://github.com/AtomRain/neo4j-extensions
We took advantage of spot processing from 
AWS to run our custom extension algoritms. 
• On-demand graph processing with as many instances 
at a time as needed (we have used up to 9). 
• Concurrent job operations per spot. 
• Cache optimizations based on Labels and context of 
the jobs.
We built a flexible job controller that enables 
concurrent job processing on spot instances 
• Large jobs are broken into smaller jobs that can be 
processed by a single spot instance. 
• Spots process unit jobs and return results. If a spot 
dies, the job stays in the queue and another spot picks 
it up. 
• Memory and CPU constraints on an instance make this 
a necessity, especially when processing 30M+ songs.
Spot instances run Neo4j in SINGLE mode 
and stay up to date using a Topic. 
Neo4j HA 
Spot 1 
SINGLE 
Job 
Result 
MQ 
Spot 2 
SINGLE 
Job 
Process 
MQ 
ESB 
SDN SDN 
TX 
Topic 
1 
2 
3 
4 
5 
1. ESB sends Jobs to MQ 
2. Spots consume job instructions, 
process and send results b 
ack to MQ 
3. ESB posts jobs results to HA 
4. On successful post, send u 
pdates to Topic 
5. Spots consume from Topic to s 
tay u to date
Batch jobs return thousands of CQL 
statements which must not be dependent on 
any statements before or after. 
• Compound statements to create nodes and 
relationships for specific sub-graphs to avoid the need 
for layering wherever possible. If not… 
• Run jobs in a linear phases (layering) to create nodes 
first then connect relationships
Graph at Scale 
10/22/14 
Q/A

More Related Content

What's hot

3.1.Performance and BigData Ecosystem
3.1.Performance and BigData Ecosystem3.1.Performance and BigData Ecosystem
3.1.Performance and BigData Ecosystem振东 刘
 
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...Flink Forward
 
Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...
Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...
Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...Flink Forward
 
Flink Forward San Francisco 2019: Scaling a real-time streaming warehouse wit...
Flink Forward San Francisco 2019: Scaling a real-time streaming warehouse wit...Flink Forward San Francisco 2019: Scaling a real-time streaming warehouse wit...
Flink Forward San Francisco 2019: Scaling a real-time streaming warehouse wit...Flink Forward
 
Cowboy dating with big data
Cowboy dating with big data Cowboy dating with big data
Cowboy dating with big data b0ris_1
 
Audience counting at Scale
Audience counting at ScaleAudience counting at Scale
Audience counting at Scaleb0ris_1
 
Flink Forward San Francisco 2019: Elastic Data Processing with Apache Flink a...
Flink Forward San Francisco 2019: Elastic Data Processing with Apache Flink a...Flink Forward San Francisco 2019: Elastic Data Processing with Apache Flink a...
Flink Forward San Francisco 2019: Elastic Data Processing with Apache Flink a...Flink Forward
 
Putting Kafka Together with the Best of Google Cloud Platform
Putting Kafka Together with the Best of Google Cloud Platform Putting Kafka Together with the Best of Google Cloud Platform
Putting Kafka Together with the Best of Google Cloud Platform confluent
 
Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...
Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...
Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...Till Rohrmann
 
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward
 
Flink Forward San Francisco 2019: Developing and operating real-time applicat...
Flink Forward San Francisco 2019: Developing and operating real-time applicat...Flink Forward San Francisco 2019: Developing and operating real-time applicat...
Flink Forward San Francisco 2019: Developing and operating real-time applicat...Flink Forward
 
Flink Forward San Francisco 2019: The Trade Desk's Year in Flink - Jonathan ...
Flink Forward San Francisco 2019: The Trade Desk's Year in Flink -  Jonathan ...Flink Forward San Francisco 2019: The Trade Desk's Year in Flink -  Jonathan ...
Flink Forward San Francisco 2019: The Trade Desk's Year in Flink - Jonathan ...Flink Forward
 
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Thomas Weise
 
Flink Forward San Francisco 2019: Building production Flink jobs with Airstre...
Flink Forward San Francisco 2019: Building production Flink jobs with Airstre...Flink Forward San Francisco 2019: Building production Flink jobs with Airstre...
Flink Forward San Francisco 2019: Building production Flink jobs with Airstre...Flink Forward
 
Apache Beam @ GCPUG.TW Flink.TW 20161006
Apache Beam @ GCPUG.TW Flink.TW 20161006Apache Beam @ GCPUG.TW Flink.TW 20161006
Apache Beam @ GCPUG.TW Flink.TW 20161006Randy Huang
 
Flink Forward SF 2017: Bill Liu & Haohui Mai - AthenaX : Uber’s streaming pro...
Flink Forward SF 2017: Bill Liu & Haohui Mai - AthenaX : Uber’s streaming pro...Flink Forward SF 2017: Bill Liu & Haohui Mai - AthenaX : Uber’s streaming pro...
Flink Forward SF 2017: Bill Liu & Haohui Mai - AthenaX : Uber’s streaming pro...Flink Forward
 
Administrative techniques to reduce Kafka costs | Anna Kepler, Viasat
Administrative techniques to reduce Kafka costs | Anna Kepler, ViasatAdministrative techniques to reduce Kafka costs | Anna Kepler, Viasat
Administrative techniques to reduce Kafka costs | Anna Kepler, ViasatHostedbyConfluent
 

What's hot (20)

3.1.Performance and BigData Ecosystem
3.1.Performance and BigData Ecosystem3.1.Performance and BigData Ecosystem
3.1.Performance and BigData Ecosystem
 
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
 
Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...
Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...
Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...
 
Flink Forward San Francisco 2019: Scaling a real-time streaming warehouse wit...
Flink Forward San Francisco 2019: Scaling a real-time streaming warehouse wit...Flink Forward San Francisco 2019: Scaling a real-time streaming warehouse wit...
Flink Forward San Francisco 2019: Scaling a real-time streaming warehouse wit...
 
Cowboy dating with big data
Cowboy dating with big data Cowboy dating with big data
Cowboy dating with big data
 
Audience counting at Scale
Audience counting at ScaleAudience counting at Scale
Audience counting at Scale
 
Flink Forward San Francisco 2019: Elastic Data Processing with Apache Flink a...
Flink Forward San Francisco 2019: Elastic Data Processing with Apache Flink a...Flink Forward San Francisco 2019: Elastic Data Processing with Apache Flink a...
Flink Forward San Francisco 2019: Elastic Data Processing with Apache Flink a...
 
Optimized Hive replication
Optimized Hive replicationOptimized Hive replication
Optimized Hive replication
 
Putting Kafka Together with the Best of Google Cloud Platform
Putting Kafka Together with the Best of Google Cloud Platform Putting Kafka Together with the Best of Google Cloud Platform
Putting Kafka Together with the Best of Google Cloud Platform
 
Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...
Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...
Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...
 
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
 
Flink Forward San Francisco 2019: Developing and operating real-time applicat...
Flink Forward San Francisco 2019: Developing and operating real-time applicat...Flink Forward San Francisco 2019: Developing and operating real-time applicat...
Flink Forward San Francisco 2019: Developing and operating real-time applicat...
 
Flink Forward San Francisco 2019: The Trade Desk's Year in Flink - Jonathan ...
Flink Forward San Francisco 2019: The Trade Desk's Year in Flink -  Jonathan ...Flink Forward San Francisco 2019: The Trade Desk's Year in Flink -  Jonathan ...
Flink Forward San Francisco 2019: The Trade Desk's Year in Flink - Jonathan ...
 
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019
 
Flink Forward San Francisco 2019: Building production Flink jobs with Airstre...
Flink Forward San Francisco 2019: Building production Flink jobs with Airstre...Flink Forward San Francisco 2019: Building production Flink jobs with Airstre...
Flink Forward San Francisco 2019: Building production Flink jobs with Airstre...
 
Gobblin on-aws
Gobblin on-awsGobblin on-aws
Gobblin on-aws
 
Apache Beam @ GCPUG.TW Flink.TW 20161006
Apache Beam @ GCPUG.TW Flink.TW 20161006Apache Beam @ GCPUG.TW Flink.TW 20161006
Apache Beam @ GCPUG.TW Flink.TW 20161006
 
Flink Forward SF 2017: Bill Liu & Haohui Mai - AthenaX : Uber’s streaming pro...
Flink Forward SF 2017: Bill Liu & Haohui Mai - AthenaX : Uber’s streaming pro...Flink Forward SF 2017: Bill Liu & Haohui Mai - AthenaX : Uber’s streaming pro...
Flink Forward SF 2017: Bill Liu & Haohui Mai - AthenaX : Uber’s streaming pro...
 
Running Cloudbreak on Kubernetes
Running Cloudbreak on KubernetesRunning Cloudbreak on Kubernetes
Running Cloudbreak on Kubernetes
 
Administrative techniques to reduce Kafka costs | Anna Kepler, Viasat
Administrative techniques to reduce Kafka costs | Anna Kepler, ViasatAdministrative techniques to reduce Kafka costs | Anna Kepler, Viasat
Administrative techniques to reduce Kafka costs | Anna Kepler, Viasat
 

Viewers also liked

Graph all the things
Graph all the thingsGraph all the things
Graph all the thingsNeo4j
 
Neo4j Makes Graphs Easy
Neo4j Makes Graphs EasyNeo4j Makes Graphs Easy
Neo4j Makes Graphs EasyNeo4j
 
Transparency One : La (re)découverte de la chaîne d'approvisionnement
Transparency One : La (re)découverte de la chaîne d'approvisionnementTransparency One : La (re)découverte de la chaîne d'approvisionnement
Transparency One : La (re)découverte de la chaîne d'approvisionnementNeo4j
 
Graph Your Business - GraphDay JimWebber
Graph Your Business - GraphDay JimWebberGraph Your Business - GraphDay JimWebber
Graph Your Business - GraphDay JimWebberNeo4j
 
GraphConnect 2014 SF: Betting the Company on a Graph Database - Part 2
GraphConnect 2014 SF: Betting the Company on a Graph Database - Part 2GraphConnect 2014 SF: Betting the Company on a Graph Database - Part 2
GraphConnect 2014 SF: Betting the Company on a Graph Database - Part 2Neo4j
 
Neo4j Makes Graphs Easy? - GraphDay AmandaLaucher
Neo4j Makes Graphs Easy? - GraphDay AmandaLaucherNeo4j Makes Graphs Easy? - GraphDay AmandaLaucher
Neo4j Makes Graphs Easy? - GraphDay AmandaLaucherNeo4j
 
Graph your business
Graph your businessGraph your business
Graph your businessNeo4j
 
GraphTalk Frankfurt - Master Data Management bei der Bayerischen Versicherung
GraphTalk Frankfurt - Master Data Management bei der Bayerischen VersicherungGraphTalk Frankfurt - Master Data Management bei der Bayerischen Versicherung
GraphTalk Frankfurt - Master Data Management bei der Bayerischen VersicherungNeo4j
 
Metadata and Access Control
Metadata and Access ControlMetadata and Access Control
Metadata and Access ControlNeo4j
 
GraphConnect 2014 SF: The Business Graph
GraphConnect 2014 SF: The Business GraphGraphConnect 2014 SF: The Business Graph
GraphConnect 2014 SF: The Business GraphNeo4j
 
Graph Search and Discovery for your Dark Data
Graph Search and Discovery for your Dark DataGraph Search and Discovery for your Dark Data
Graph Search and Discovery for your Dark DataNeo4j
 
GraphDay Noble/Coolio
GraphDay Noble/CoolioGraphDay Noble/Coolio
GraphDay Noble/CoolioNeo4j
 
Meetup Analytics with R and Neo4j
Meetup Analytics with R and Neo4jMeetup Analytics with R and Neo4j
Meetup Analytics with R and Neo4jNeo4j
 
Graphs fun vjug2
Graphs fun vjug2Graphs fun vjug2
Graphs fun vjug2Neo4j
 
GraphTalk - Semantische Netze mit structr
GraphTalk - Semantische Netze mit structrGraphTalk - Semantische Netze mit structr
GraphTalk - Semantische Netze mit structrNeo4j
 
GraphTalk Frankfurt - Einführung in Graphdatenbanken
GraphTalk Frankfurt - Einführung in GraphdatenbankenGraphTalk Frankfurt - Einführung in Graphdatenbanken
GraphTalk Frankfurt - Einführung in GraphdatenbankenNeo4j
 
Graph all the things - PRathle
Graph all the things - PRathleGraph all the things - PRathle
Graph all the things - PRathleNeo4j
 
GraphTalks - Einführung
GraphTalks - EinführungGraphTalks - Einführung
GraphTalks - EinführungNeo4j
 
GraphTalk - Semantisches PDM bei Schleich
GraphTalk - Semantisches PDM bei Schleich GraphTalk - Semantisches PDM bei Schleich
GraphTalk - Semantisches PDM bei Schleich Neo4j
 
RDBMS to Graphs
RDBMS to GraphsRDBMS to Graphs
RDBMS to GraphsNeo4j
 

Viewers also liked (20)

Graph all the things
Graph all the thingsGraph all the things
Graph all the things
 
Neo4j Makes Graphs Easy
Neo4j Makes Graphs EasyNeo4j Makes Graphs Easy
Neo4j Makes Graphs Easy
 
Transparency One : La (re)découverte de la chaîne d'approvisionnement
Transparency One : La (re)découverte de la chaîne d'approvisionnementTransparency One : La (re)découverte de la chaîne d'approvisionnement
Transparency One : La (re)découverte de la chaîne d'approvisionnement
 
Graph Your Business - GraphDay JimWebber
Graph Your Business - GraphDay JimWebberGraph Your Business - GraphDay JimWebber
Graph Your Business - GraphDay JimWebber
 
GraphConnect 2014 SF: Betting the Company on a Graph Database - Part 2
GraphConnect 2014 SF: Betting the Company on a Graph Database - Part 2GraphConnect 2014 SF: Betting the Company on a Graph Database - Part 2
GraphConnect 2014 SF: Betting the Company on a Graph Database - Part 2
 
Neo4j Makes Graphs Easy? - GraphDay AmandaLaucher
Neo4j Makes Graphs Easy? - GraphDay AmandaLaucherNeo4j Makes Graphs Easy? - GraphDay AmandaLaucher
Neo4j Makes Graphs Easy? - GraphDay AmandaLaucher
 
Graph your business
Graph your businessGraph your business
Graph your business
 
GraphTalk Frankfurt - Master Data Management bei der Bayerischen Versicherung
GraphTalk Frankfurt - Master Data Management bei der Bayerischen VersicherungGraphTalk Frankfurt - Master Data Management bei der Bayerischen Versicherung
GraphTalk Frankfurt - Master Data Management bei der Bayerischen Versicherung
 
Metadata and Access Control
Metadata and Access ControlMetadata and Access Control
Metadata and Access Control
 
GraphConnect 2014 SF: The Business Graph
GraphConnect 2014 SF: The Business GraphGraphConnect 2014 SF: The Business Graph
GraphConnect 2014 SF: The Business Graph
 
Graph Search and Discovery for your Dark Data
Graph Search and Discovery for your Dark DataGraph Search and Discovery for your Dark Data
Graph Search and Discovery for your Dark Data
 
GraphDay Noble/Coolio
GraphDay Noble/CoolioGraphDay Noble/Coolio
GraphDay Noble/Coolio
 
Meetup Analytics with R and Neo4j
Meetup Analytics with R and Neo4jMeetup Analytics with R and Neo4j
Meetup Analytics with R and Neo4j
 
Graphs fun vjug2
Graphs fun vjug2Graphs fun vjug2
Graphs fun vjug2
 
GraphTalk - Semantische Netze mit structr
GraphTalk - Semantische Netze mit structrGraphTalk - Semantische Netze mit structr
GraphTalk - Semantische Netze mit structr
 
GraphTalk Frankfurt - Einführung in Graphdatenbanken
GraphTalk Frankfurt - Einführung in GraphdatenbankenGraphTalk Frankfurt - Einführung in Graphdatenbanken
GraphTalk Frankfurt - Einführung in Graphdatenbanken
 
Graph all the things - PRathle
Graph all the things - PRathleGraph all the things - PRathle
Graph all the things - PRathle
 
GraphTalks - Einführung
GraphTalks - EinführungGraphTalks - Einführung
GraphTalks - Einführung
 
GraphTalk - Semantisches PDM bei Schleich
GraphTalk - Semantisches PDM bei Schleich GraphTalk - Semantisches PDM bei Schleich
GraphTalk - Semantisches PDM bei Schleich
 
RDBMS to Graphs
RDBMS to GraphsRDBMS to Graphs
RDBMS to Graphs
 

Similar to GraphConnect 2014 SF: Neo4j at Scale using Enterprise Integration Patterns

Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
Introduction to LAVA Workload Scheduler
Introduction to LAVA Workload SchedulerIntroduction to LAVA Workload Scheduler
Introduction to LAVA Workload SchedulerNopparat Nopkuat
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsDatabricks
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Landon Robinson
 
Flink Forward SF 2017: Malo Deniélou - No shard left behind: Dynamic work re...
Flink Forward SF 2017: Malo Deniélou -  No shard left behind: Dynamic work re...Flink Forward SF 2017: Malo Deniélou -  No shard left behind: Dynamic work re...
Flink Forward SF 2017: Malo Deniélou - No shard left behind: Dynamic work re...Flink Forward
 
Closing Keynote
Closing KeynoteClosing Keynote
Closing KeynoteNeo4j
 
Malo Denielou - No shard left behind: Dynamic work rebalancing in Apache Beam
Malo Denielou - No shard left behind: Dynamic work rebalancing in Apache BeamMalo Denielou - No shard left behind: Dynamic work rebalancing in Apache Beam
Malo Denielou - No shard left behind: Dynamic work rebalancing in Apache BeamFlink Forward
 
Hadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log projectHadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log projectMao Geng
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist SoftServe
 
Unified, Efficient, and Portable Data Processing with Apache Beam
Unified, Efficient, and Portable Data Processing with Apache BeamUnified, Efficient, and Portable Data Processing with Apache Beam
Unified, Efficient, and Portable Data Processing with Apache BeamDataWorks Summit/Hadoop Summit
 
Productionizing your Streaming Jobs
Productionizing your Streaming JobsProductionizing your Streaming Jobs
Productionizing your Streaming JobsDatabricks
 
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringApache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringDatabricks
 
The Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceThe Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceDataWorks Summit/Hadoop Summit
 
NodeJS guide for beginners
NodeJS guide for beginnersNodeJS guide for beginners
NodeJS guide for beginnersEnoch Joshua
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...Chris Fregly
 
Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Databricks
 
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloAzure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloITCamp
 
RUCUG: 10. Robert Morris:Жизнь в окопах виртуализационной войны
RUCUG: 10. Robert Morris:Жизнь в окопах виртуализационной войныRUCUG: 10. Robert Morris:Жизнь в окопах виртуализационной войны
RUCUG: 10. Robert Morris:Жизнь в окопах виртуализационной войныDenis Gundarev
 

Similar to GraphConnect 2014 SF: Neo4j at Scale using Enterprise Integration Patterns (20)

Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Introduction to LAVA Workload Scheduler
Introduction to LAVA Workload SchedulerIntroduction to LAVA Workload Scheduler
Introduction to LAVA Workload Scheduler
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous Applications
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
 
Flink Forward SF 2017: Malo Deniélou - No shard left behind: Dynamic work re...
Flink Forward SF 2017: Malo Deniélou -  No shard left behind: Dynamic work re...Flink Forward SF 2017: Malo Deniélou -  No shard left behind: Dynamic work re...
Flink Forward SF 2017: Malo Deniélou - No shard left behind: Dynamic work re...
 
Closing Keynote
Closing KeynoteClosing Keynote
Closing Keynote
 
Malo Denielou - No shard left behind: Dynamic work rebalancing in Apache Beam
Malo Denielou - No shard left behind: Dynamic work rebalancing in Apache BeamMalo Denielou - No shard left behind: Dynamic work rebalancing in Apache Beam
Malo Denielou - No shard left behind: Dynamic work rebalancing in Apache Beam
 
Hadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log projectHadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log project
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist
 
Why meteor
Why meteorWhy meteor
Why meteor
 
Unified, Efficient, and Portable Data Processing with Apache Beam
Unified, Efficient, and Portable Data Processing with Apache BeamUnified, Efficient, and Portable Data Processing with Apache Beam
Unified, Efficient, and Portable Data Processing with Apache Beam
 
Productionizing your Streaming Jobs
Productionizing your Streaming JobsProductionizing your Streaming Jobs
Productionizing your Streaming Jobs
 
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringApache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
 
The Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceThe Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open Source
 
NodeJS guide for beginners
NodeJS guide for beginnersNodeJS guide for beginners
NodeJS guide for beginners
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
 
Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...
 
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloAzure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
 
RUCUG: 10. Robert Morris:Жизнь в окопах виртуализационной войны
RUCUG: 10. Robert Morris:Жизнь в окопах виртуализационной войныRUCUG: 10. Robert Morris:Жизнь в окопах виртуализационной войны
RUCUG: 10. Robert Morris:Жизнь в окопах виртуализационной войны
 

More from Neo4j

Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...Neo4j
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosNeo4j
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Neo4j
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Neo4j
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeNeo4j
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsNeo4j
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...Neo4j
 
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AIDeloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AINeo4j
 
Ingka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignIngka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignNeo4j
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Neo4j
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxNeo4j
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxNeo4j
 

More from Neo4j (20)

Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge Graphs
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with Graph
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
 
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AIDeloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
 
Ingka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignIngka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by Design
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
 

Recently uploaded

The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 

Recently uploaded (20)

The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 

GraphConnect 2014 SF: Neo4j at Scale using Enterprise Integration Patterns

  • 1. Graph at Scale 10/22/14 Brad Nussbaum CTO | MediaHound brad@mediahound.com | @bradnussbaum www.mediahound.com
  • 3. But Entertainment is more than just Film & TV
  • 4. Entertainment Preferences Are Scattered Music Movies Television Books
  • 5. All Content one platform that connects: All Sources All Devices All Brands All Artists All Users
  • 6. The Entertainment Graph A comprehensive database that brings together movies, books, games, music, and TV, including the cast & crew, sources, reviews, categories, genres, lists and more! The Entertainment Graph powers meaningful recommendations, exciting data insights and comprehensive social discovery.
  • 7. Use Cases: Academy Award Winners On Netflix MATCH (c:Collection)-[:CONTAINS]-(m:Movie) WHERE (m)- [:MATCHED_SOURCE]-(n:NETFLIX) AND c.name=“Academy Award Winners” RETURN m; Movies and Shows Based On Zombie Books MATCH (m:Media)-[:BASED_ON]-(b:Book)-[:HAS_TRAIT]-(z:Trait) WHERE (m.type=“Movie” OR m.type=“Show”) AND z.name=“Zombies” RETURN m; To access ‘The Movie Graph’ mini app (which uses a different model than above), from your browser, run :play movie graph
  • 8. Entertainment Recommendations Collaborative Filtering: If Joe, Amy and Steve like Gladiator, AND Joe and Amy like Toy Story, THEN MediaHound recommends Toy Story to Steve. Graph Influencers Joe is an early adopter. Joe, Amy and Steve like several things in common. MediaHound recommends Amy and Steve follow Joe and Amy Joe discovers the next big hit and shares it on his feed Amy and Steve see Steve’s post and give it a listen.
  • 9. We needed our graph database to perform under sustained user write load AND during heavy batch update operations. We needed to recommend media content in real-time which required many concurrent pattern matching operations on the graph.
  • 10. We realized through trial and error that using the Transactional Cypher HTTP Endpoint was the BEST solution to control batch writes. POST http://localhost:7474/db/data/transaction/commit Accept: application/json; charset=UTF-8 Content-Type: application/json { "statements": [ { "statement": "MERGE(n:User{username:"bradnussbaum"})-[:FOLLOWS]->(m:User{username:"bennussbaum"});" }, { "statement": "MERGE(n:User{username:"bradnussbaum"})<-[:FOLLOWS]-(m:User{username:"bennussbaum"});" } ] }
  • 11. We ran tests using the low-level kernel and found that sustained transaction writes performed optimally between 400-2,000 nodes and relationships per transaction. Compare the difference between… • Writing a single relationship (33 bytes) per transaction f or 10k iterations compared to • Writing 1k relationships per transaction for 10 iterations. **As of 2.2, Neo4j will batch writes on the server http://neo4j.com/docs/stable/linux-performance-guide.html git clone git@github.com:neo4j-contrib/tooling.git
  • 12. We used Enterprise Integration Patterns (EIP) to create optimal batch sizes for each transaction. • Splitters to break down larger messages • Aggregators to combine single CQL statements t ogether into a single batch transaction • Throttling to control concurrent requests and requests p er second
  • 13. We frequently run 40+ concurrent write transactions to a 3 instance cluster for hours at time. Deadlocks can occur often with many concurrent write o perations. • Retry Transient Errors after a small period of time. • Use the Error Index to split failed TX statements. Read here to learn all the error status codes, seriously. http://neo4j.com/docs/stable/status-codes.html
  • 14. Test write throughput on your cluster with the push factor you plan to use in production and intentionally kill your master under load. You need to have two load balancers: • One for all the instances you want performing reads • One for your master ONLY  (send writes here) Master Check: /db/manage/server/ha/master - Returns true|false Slave Check: /db/manage/server/ha/slave - Returns true|false Available Check: /db/manage/server/ha/available - Returns master|slave
  • 15. Check your driver for transaction support. Embedded mode has full transaction support but most remote drivers do not at this time. This will be changing in the near future…depending on which driver you use. **Spring Data Neo4j is actively being developed to included these features as part of 2.2.
  • 16. Graph at Scale 10/22/14 Ben Nussbaum Director of Engineering | MediaHound ben@mediahound.com | @bennussbaum www.mediahound.com
  • 17. We built custom algorithms that needed run-time decision making as Neo4j Extensions with Spring Data Neo4j. • Cache abstraction with Google’s Guava to build large in-memory indexes of nodes and relationships. • Integration for jobs instructions and results to and from the broker. • Async for batch job processing. https://github.com/AtomRain/neo4j-extensions
  • 18. We took advantage of spot processing from AWS to run our custom extension algoritms. • On-demand graph processing with as many instances at a time as needed (we have used up to 9). • Concurrent job operations per spot. • Cache optimizations based on Labels and context of the jobs.
  • 19. We built a flexible job controller that enables concurrent job processing on spot instances • Large jobs are broken into smaller jobs that can be processed by a single spot instance. • Spots process unit jobs and return results. If a spot dies, the job stays in the queue and another spot picks it up. • Memory and CPU constraints on an instance make this a necessity, especially when processing 30M+ songs.
  • 20. Spot instances run Neo4j in SINGLE mode and stay up to date using a Topic. Neo4j HA Spot 1 SINGLE Job Result MQ Spot 2 SINGLE Job Process MQ ESB SDN SDN TX Topic 1 2 3 4 5 1. ESB sends Jobs to MQ 2. Spots consume job instructions, process and send results b ack to MQ 3. ESB posts jobs results to HA 4. On successful post, send u pdates to Topic 5. Spots consume from Topic to s tay u to date
  • 21. Batch jobs return thousands of CQL statements which must not be dependent on any statements before or after. • Compound statements to create nodes and relationships for specific sub-graphs to avoid the need for layering wherever possible. If not… • Run jobs in a linear phases (layering) to create nodes first then connect relationships
  • 22. Graph at Scale 10/22/14 Q/A