SlideShare a Scribd company logo
1 of 38
Download to read offline
http://blazegraph.com/http://blazegraph.com/
Global Knowledge Collaboration to Cure
Cancer: How GPUs Impact Graph &
Predictive Analytics
Database Camp
July 10, 2016
Blazegraph
Brad Bebee, CEO
http://blazegraph.com/
Exploding Data Volumes Requires New Approach
for Relationship Insight
SYSTAP™, LLC. © 2006-2015 All Rights Reserved
Graph databases are designed to analyze diverse entities and relationships. 
Today’s datasets have billions of edges and nodes.
A (very small) Knowledge Graph
2
http://blazegraph.com/
Powers Business
with Billion Edge
Datasets
​ Information
Management
​ Retrieval
​ Industrial
​ IoT
​ Cyber
​ Defense
​ Intelligence
​ Financial
Services
​ Fraud
Detection
​ Life Sciences
​ Precision
Medicine
SYSTAP™, LLC. © 2006-2016 All Rights Reserved 3
Blazegraph Vision: The Scalable Solution for Graphs
500+ weekly downloads
Thousands of active deployments
Blazegraph
​ Enterprise
High Availability
​ Enterprise
GPU Acceleration
​ Embedded and
Single Server
Deployments
http://blazegraph.com/
Precision Medicine and Hospitals
•  5,686 Hospitals in the United States
•  Top 10% have 100B+ Edge Graph Problems
•  256 K-80 GPUs for a 100B Edge Graph Integration Application
•  Syapse Precision Medicine Video
4
3/10/15
5-10	B	Edge	Graph	Problems
http://blazegraph.com/
Uncovering influence links in molecular knowledge networks to streamline personalized medicine | Shin, Dmitriy et al.Journal of Biomedical
Informatics , Volume 52 , 394 - 405"
Finding the Next Cure for Cancer is a
Billion+ Edge Graph Challenge
http://blazegraph.com/
Graphs	Enable	People	to	Find	Knowledge	
A Bunch of Pages! An Answer
http://blazegraph.com/
Graph Analytic Applications: Cyber Defense Example
•  Path algorithms (BFS, SSSP,
APSP, CC) and their
extensions (closeness,
betweenness, etc)
•  Ranking algorithms (cardinality,
PageRank, BadRank,
SecureRank)
•  Clustering algorithms (canopy,
k-means, Jaccard, community
detection, etc)
•  Many others
7
http://blazegraph.com/
http://bit.ly/1UdP2Sn
Blazegraph Stands Out!
Wikimedia
Evaluation:
SYSTAP, LLC DBA Blazegraph. © 2006-2016 All Rights Reserved 8
http://blazegraph.com/
Blazegraph™: Embedded and Single Server
•  High performance, Scalable
–  50B edges/node
–  RDF/SPARQL level query language
–  Efficient Graph Traversal
–  High 9s solution
•  Property graphs
–  Blueprints, gremlin, rextser
•  REST API (NSS)
•  Extension points
–  Stored queries for custom application logic on
the server.
–  Custom services & indices
–  Custom functions
–  Vertex-centric programs
Embedded Server
Standalone Server
9
JVM	
Journal	
WAR	
Journal
http://blazegraph.com/
Blazegraph™: High Availability
•  Shared nothing architecture
–  Same data on each node
–  Coordinate only at commit
–  Transparent load balancing
•  Scaling
–  50 billion triples or quads
–  Query throughput scales linearly
•  Self healing
–  Automatic failover
–  Automatic resync after disconnect
–  Online single node disaster recovery
•  Online Backup
–  Online snapshots (full backups)
–  HA Logs (incremental backups)
•  Point in time recovery (offline)
10
HAService	
Quorum	
k=3	
size=3	
follower	
leader	
HAService	
HAService
http://blazegraph.com/
The Billion-Edge Graph Challenge:
Scaling Up Requires the Right Paradigm and Hardware
SYSTAP, LLC DBA Blazegraph. © 2006-2016 All Rights Reserved 11
https://datatake.files.wordpress.com/2015/09/latency.png
TypeofCacheorMemory
Access Latency Per Clock Cycle
4.3
3
4
11
11
11
14
18
38
167
0 50 100 150
L1 Cache sequential access
L1 Cache in Page Random access
L1 Cache in Full Random access
L2 Cache sequential access
L2 Cache in Page Random access
L2 Cache in Full Random access
L3 Cache sequential access
L3 Cache in Page Random access
L3 Cache in Full Random access
Main memory
CPU Cache Access Latencies in Clock Cycles
Graph Cache Thrash
The CPU just waits for graph
data from main memory...
http://blazegraph.com/
Blazegraph Multi-GPU: Extreme Scale
Traverse 1B Edges/Sec (GTEP) 40x more affordably!
SYSTAP, LLC DBA Blazegraph. © 2006-2016 All Rights Reserved
Cray XMT-2
$~180K per GTEP
COST
COST
Large Hadoop Cluster
$~18M per GTEP 1 GTEP = 1 Billion Traversed Edges Per Second
Blazegraph with
GPU Clusters
$16K per GTEP (K40)
$4K per GTEP (Pascal)
http://blazegraph.com/
With familiar graph APIs; 200-300x acceleration with almost no code changes.
Enabling GPU Acceleration without Code Changes
SYSTAP, LLC DBA Blazegraph. © 2006-2016 All Rights Reserved 13
Blazegraph GPU 
Acceleration
NVIDIA 

Tesla GPU
Graph DB
Blazegraph Plug-in for GPU Acceleration Solves “Graph Cache Thrash” 
Funded by and 
co-developed with:
http://blazegraph.com/
172,632 results
187ms on Blazegraph GPU
172,632 results
53,960ms on Blazegraph CPU
LUBM Query #9 U1000 (167,697,263 Edges w/Inference)
Just Add GPUs –Testing Shows 200-300x Speed-up
SYSTAP, LLC DBA Blazegraph. © 2006-2016 All Rights Reserved 14
Graph
Database
Graph
Database
290X
http://blazegraph.com/
1403
1843
700
0
200
400
600
800
1000
1200
1400
1600
1800
2000
700M Edges Single Node
Xeon 2650 (128G) vs 2 K80
(48G)
1.98 Edges 16 EC2
r3.xlarge (488G) vs 16 K40s
(192G)
1.98 Edges 16 EC2
r3.4xlarge (1952G) vs 16
K40s (192G)
1.98 Edges Spark CPU
Baseline
GPUs 700x-1800x Faster for Graphs
Compared to Apache Spark on 700M and 1.9B Edge Graph
SYSTAP™, LLC. © 2006-2015 All Rights Reserved
Speed-up(Higherisfaster)
1
15
Speed-up over Baseline Spark CPU Configuration
http://blazegraph.com/
Did you hear the one about the Analytics Developer who is
also a Parallel Programming expert?
SYSTAP™, LLC
© 2006-2015 All Rights Reserved
16
Need:		Simpler,	smarter	algorithms	that	run	efficiently	on	GPUs	without	requiring	knowledge	of	the	parallel	
programming	or	device-specific	opTmizaTon.	
Algorithm Developer Parallel Programmer
http://blazegraph.com/
Blazegraph DASL for GPU Acceleration
SYSTAP™, LLC. © 2006-2015 All Rights Reserved 17
Blazegraph DASL – A Domain specific language for graphs with Accelerated Scala using Linear algebra
…you can see why we call it “DASL”
Automation Enabling Analytics Coders to Optimize for GPU
Delivering ease of use of Spark and Scala for graph algorithms and predictive
analytics… your analytic code can work on GPUs and Blazegraph.
DASL Executor Multi-GPU Extension
DASL Translator
DASL Graph Algorithms
…with the speed of CUDA…
…Graph and Machine
Learning Algorithms…
http://blazegraph.com/
DASL your analytics
•  DASL is functional, domain-specific language (DSL) for graph and machine
learning algorithms
•  Provides of linear algebra primitives supporting the DSL designed for efficient,
multi-GPU execution
•  Reduce the barrier of graphs on GPUS, enabling simpler and smarter
algorithms
•  Co-opt existing data ecosystems, such as Apache Spark and Hadoop, for ease
of adoption and mission impact
SYSTAP™, LLC
© 2006-2015 All Rights Reserved
18
3/10/1
5
Graph and Machine Learning
Algorithms
1000X
DASL Executor Multi-GPU Extension
DASL Translator
DASL Graph Algorithms
http://blazegraph.com/
Anatomy of a DASL
algorithm
19
http://blazegraph.com/
GPU-Accelerated DASL Algorithms
SYSTAP™, LLC. © 2006-2015 All Rights Reserved 20
Graph
•  Hierarchical
partitioning/graph
coarsening
•  Louvain modularity
•  Jaccard Similarity
•  Triangle counting
Collaborative
Filtering
(Recommendation Systems)
•  User-based, Item-
based
•  Matrix Factorization
with ALS (alternating
least squares)
•  Weighted Matrix
Factorization, SVD+
Supervised Neural
Network Techniques
•  Hidden Markov
Models
•  Multilayer Perceptron
Clustering
•  Canopy, k-Means,
Fuzzy k-Means,
Streaming k-Means
•  Spectral Clustering
Dimensionality
Reduction
•  Singular Value
Decomposition (top-k)
•  Lanczos Algorithm
•  Stochastic SVD
•  QR Decomposition
•  Non-Negative matrix
factorization (NNMF)
•  Distance Geometry
…and many others!
http://blazegraph.com/
Stay in Touch
Brad Bebee, CEO
beebs@blazegraph.com
http://blazegraph.com
@blazegraph
http://blazegraph.com/
Case Study - NetFlow Data Analysis
•  What is NetFlow data?
•  NetFlow is a network protocol developed by Cisco for
collecting IP traffic information and monitoring network
traffic. http://www.solarwinds.com/what-is-netflow.aspx
•  Sample NetFlow record
{"start_time":"2007-08-01 14:31:02.946”,”duration":0.000,
“src_addr":"122.166.71.110", “dst_addr":"9.155.118.136",
"src_port":10822,"dst_port":13567,"protocol":"UDP
“,"tcp_flags":"......","input_packets":1,"input_bytes":46}
•  How is NetFlow collected?
•  Generated by routers and forwarded to collection point
•  Processed out of pcap records using tools such as nfcapd/
nfdump
22
http://blazegraph.com/
Graph Algorithms Applicable to NetFlow
•  Path algorithms (BFS, SSSP, APSP,
CC) and their extensions (closeness,
betweenness, etc)
•  Ranking algorithms (cardinality,
PageRank, BadRank, SecureRank)
•  Clustering algorithms (canopy, k-
means, Jaccard, community detection,
etc)
•  Many others!
23
http://blazegraph.com/
Challenges of Network Flow Analysis
24
•  The volume of NetFlow packets could not previously be efficiently analyzed or
visualized to identify threats in real time
–  One hour of collection on a modest network can easily generate more than 100M NetFlow
records
–  Visualizing this data to identify points of interest is nearly impossible due to the “hairball
effect”, even after time windowing the data
–  These issues largely led to attempts at batch processing the data on large clusters of
machines using Hadoop and/or Spark
–  Pushing the problem into the batch space puts organizations in the unenviable position of
hoping to identify why and by whom their network was attacked last month rather than being
able to identify an attack as it is occurring.
http://blazegraph.com/
Initial Graph Filtered by PageRank using DASL
25
http://blazegraph.com/
Interactive Visual Query Session
26
http://blazegraph.com/
Visual Analysis - Identification of Exfiltrated Traffic
27
http://blazegraph.com/
Other new capabilities for high-speed, large-scale
graph processing
New technologies to deliver to deliver an HPC capability for
processing very large scale graphs at high speed.
•  Blazegraph DASL provides a high-level way to author graph
analytics and integrate with Apache Spark and Hadoop Data
ecosystems.
•  Power8 / OpenPOWER bring high speed interconnects for
moving data quickly (3X faster)
•  New NVIDIA P100 GPUs provide stacked memory, 720 GB/s
of memory bandwidth, and unified memory addressing.
SYSTAP™, LLC. © 2006-2015 All Rights Reserved 28
http://blazegraph.com/
Research Space
SYSTAP™, LLC. © 2006-2015 All Rights Reserved 29
http://blazegraph.com/
Users
Original
data
sources
FrontendBackend
Knowledge Graph Creation
Museum visitor
Museums and other sources
•  Data crawling
•  Data transformation (CIDOC-CRM)
•  Data Interlinking
•  Data enrichment / Information
extraction
Cards
Social networks
DBpedia	BriTsh	Museum	Data	 User	Data	
Mobile App
•  HTML5 Templates + CSS for mobile
devices
•  Google Glass App
•  QR Code recognition
•  Pattern / image recognition
•  Context reasoning
Knowledge Graph Exploration
•  Templates for visualization (CIDOC-CRM
and external data)
•  Timeline, Maps
•  PivotViewer for visual exploration
•  Semantic Search
•  Graph Analytics (GAS)
Website visitorData Expert
Research Space / British Museum Project Architecture
and Use Cases
http://blazegraph.com/
System will either have
their own terminology to
qualify place
Or
More commonly it will be
implicit and not explicit.
Relationships both
support and bypass this
heterogeneity at the same
time
http://blazegraph.com/
http://blazegraph.com/
FROM means
•  Thing has been
located at a place
or
•  Thing was created at a
place
or
•  Thing was created by
a person from a place
or
•  Thing was part of an
event that happened at
a place
or
•  Thing was acquired at
a place
Research Space / British Museum Video
http://blazegraph.com/
This subset of FROM is just “is/was located at” – These
are photographs that are located in India
http://blazegraph.com/
This subset of FROM is just “refers to” India – These
objects may not be in India, or from India, but make
some reference to India. India is a subject.
http://blazegraph.com/
Things from India created in the 17th century
http://blazegraph.com/
This is a life boat from a ship
(The Nancy Packet) wrecked
while retuning from India.
http://blazegraph.com/
Getting to 1 Trillion Edge Graphs on GPUs
-  1 GPU
-  K20 (6G) : 125M edges
-  K40 (12G) : 250M edges
-  K80 (24G) : 500M edges (Gemini)
-  M40 (24G) : 500M edges
-  Pascal (32G) : 750M edges (Future)
-  1 Node
-  8 x K80 : 4B edges
-  8 x Pascal : 6B edges
-  Cluster with 1T edges
-  2000 Pascal GPUs (242 nodes) (Future)
38

More Related Content

What's hot

Efficiently Building Machine Learning Models for Predictive Maintenance in th...
Efficiently Building Machine Learning Models for Predictive Maintenance in th...Efficiently Building Machine Learning Models for Predictive Maintenance in th...
Efficiently Building Machine Learning Models for Predictive Maintenance in th...Databricks
 
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UNMariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN✔ Eric David Benari, PMP
 
Optimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystemOptimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystemDataWorks Summit
 
Druid Overview by Rachel Pedreschi
Druid Overview by Rachel PedreschiDruid Overview by Rachel Pedreschi
Druid Overview by Rachel PedreschiBrian Olsen
 
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDBHBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDBHBaseCon
 
Optimize Data for the Logical Data Warehouse
Optimize Data for the Logical Data WarehouseOptimize Data for the Logical Data Warehouse
Optimize Data for the Logical Data WarehouseAttunity
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataDataWorks Summit/Hadoop Summit
 
Which data should you move to Hadoop?
Which data should you move to Hadoop?Which data should you move to Hadoop?
Which data should you move to Hadoop?Attunity
 
Qubole - Big data in cloud
Qubole - Big data in cloudQubole - Big data in cloud
Qubole - Big data in cloudDmitry Tolpeko
 
From Batch to Streaming ET(L) with Apache Apex
From Batch to Streaming ET(L) with Apache ApexFrom Batch to Streaming ET(L) with Apache Apex
From Batch to Streaming ET(L) with Apache ApexDataWorks Summit
 
Lessons Learned from Modernizing USCIS Data Analytics Platform
Lessons Learned from Modernizing USCIS Data Analytics PlatformLessons Learned from Modernizing USCIS Data Analytics Platform
Lessons Learned from Modernizing USCIS Data Analytics PlatformDatabricks
 
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesQubole
 
Healthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache SparkHealthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache SparkDatabricks
 
Building Robust Production Data Pipelines with Databricks Delta
Building Robust Production Data Pipelines with Databricks DeltaBuilding Robust Production Data Pipelines with Databricks Delta
Building Robust Production Data Pipelines with Databricks DeltaDatabricks
 
Big Data Business Transformation - Big Picture and Blueprints
Big Data Business Transformation - Big Picture and BlueprintsBig Data Business Transformation - Big Picture and Blueprints
Big Data Business Transformation - Big Picture and BlueprintsAshnikbiz
 

What's hot (20)

Efficiently Building Machine Learning Models for Predictive Maintenance in th...
Efficiently Building Machine Learning Models for Predictive Maintenance in th...Efficiently Building Machine Learning Models for Predictive Maintenance in th...
Efficiently Building Machine Learning Models for Predictive Maintenance in th...
 
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UNMariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
 
What's new in SQL on Hadoop and Beyond
What's new in SQL on Hadoop and BeyondWhat's new in SQL on Hadoop and Beyond
What's new in SQL on Hadoop and Beyond
 
Optimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystemOptimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystem
 
Druid Overview by Rachel Pedreschi
Druid Overview by Rachel PedreschiDruid Overview by Rachel Pedreschi
Druid Overview by Rachel Pedreschi
 
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDBHBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
 
Optimize Data for the Logical Data Warehouse
Optimize Data for the Logical Data WarehouseOptimize Data for the Logical Data Warehouse
Optimize Data for the Logical Data Warehouse
 
High-Scale Entity Resolution in Hadoop
High-Scale Entity Resolution in HadoopHigh-Scale Entity Resolution in Hadoop
High-Scale Entity Resolution in Hadoop
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch data
 
What database
What databaseWhat database
What database
 
Which data should you move to Hadoop?
Which data should you move to Hadoop?Which data should you move to Hadoop?
Which data should you move to Hadoop?
 
Qubole - Big data in cloud
Qubole - Big data in cloudQubole - Big data in cloud
Qubole - Big data in cloud
 
From Batch to Streaming ET(L) with Apache Apex
From Batch to Streaming ET(L) with Apache ApexFrom Batch to Streaming ET(L) with Apache Apex
From Batch to Streaming ET(L) with Apache Apex
 
Presto: SQL-on-anything
Presto: SQL-on-anythingPresto: SQL-on-anything
Presto: SQL-on-anything
 
Hadoop and HBase @eBay
Hadoop and HBase @eBayHadoop and HBase @eBay
Hadoop and HBase @eBay
 
Lessons Learned from Modernizing USCIS Data Analytics Platform
Lessons Learned from Modernizing USCIS Data Analytics PlatformLessons Learned from Modernizing USCIS Data Analytics Platform
Lessons Learned from Modernizing USCIS Data Analytics Platform
 
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slides
 
Healthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache SparkHealthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache Spark
 
Building Robust Production Data Pipelines with Databricks Delta
Building Robust Production Data Pipelines with Databricks DeltaBuilding Robust Production Data Pipelines with Databricks Delta
Building Robust Production Data Pipelines with Databricks Delta
 
Big Data Business Transformation - Big Picture and Blueprints
Big Data Business Transformation - Big Picture and BlueprintsBig Data Business Transformation - Big Picture and Blueprints
Big Data Business Transformation - Big Picture and Blueprints
 

Similar to Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph

The Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataThe Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataInside Analysis
 
DIY Netflow Data Analytic with ELK Stack by CL Lee
DIY Netflow Data Analytic with ELK Stack by CL LeeDIY Netflow Data Analytic with ELK Stack by CL Lee
DIY Netflow Data Analytic with ELK Stack by CL LeeMyNOG
 
Five Fabulous Sinks for Your Kafka Data. #3 will surprise you! (Rachel Pedres...
Five Fabulous Sinks for Your Kafka Data. #3 will surprise you! (Rachel Pedres...Five Fabulous Sinks for Your Kafka Data. #3 will surprise you! (Rachel Pedres...
Five Fabulous Sinks for Your Kafka Data. #3 will surprise you! (Rachel Pedres...confluent
 
Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL
Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATLBryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL
Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATLMLconf
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysisAmazon Web Services
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineData Con LA
 
Тарас Кльоба "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
Тарас Кльоба  "ETL — вже не актуальна; тривалі живі потоки із системою Apache...Тарас Кльоба  "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
Тарас Кльоба "ETL — вже не актуальна; тривалі живі потоки із системою Apache...Lviv Startup Club
 
Webinar: Large Scale Graph Processing with IBM Power Systems & Neo4j
Webinar: Large Scale Graph Processing with IBM Power Systems & Neo4jWebinar: Large Scale Graph Processing with IBM Power Systems & Neo4j
Webinar: Large Scale Graph Processing with IBM Power Systems & Neo4jNeo4j
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGaiKohei KaiGai
 
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...Equnix Business Solutions
 
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Databricks
 
Model-driven Telemetry: The Foundation of Big Data Analytics
Model-driven Telemetry: The Foundation of Big Data AnalyticsModel-driven Telemetry: The Foundation of Big Data Analytics
Model-driven Telemetry: The Foundation of Big Data AnalyticsCisco Canada
 
What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 Databricks
 
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...DataWorks Summit
 
Architecture at Scale
Architecture at ScaleArchitecture at Scale
Architecture at ScaleElasticsearch
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...DataStax
 
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemXDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemDan Eaton
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsAntonio Severien
 
Scale Your Load Balancer from 0 to 1 million TPS on Azure
Scale Your Load Balancer from 0 to 1 million TPS on AzureScale Your Load Balancer from 0 to 1 million TPS on Azure
Scale Your Load Balancer from 0 to 1 million TPS on AzureAvi Networks
 

Similar to Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph (20)

The Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataThe Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big Data
 
DIY Netflow Data Analytic with ELK Stack by CL Lee
DIY Netflow Data Analytic with ELK Stack by CL LeeDIY Netflow Data Analytic with ELK Stack by CL Lee
DIY Netflow Data Analytic with ELK Stack by CL Lee
 
Five Fabulous Sinks for Your Kafka Data. #3 will surprise you! (Rachel Pedres...
Five Fabulous Sinks for Your Kafka Data. #3 will surprise you! (Rachel Pedres...Five Fabulous Sinks for Your Kafka Data. #3 will surprise you! (Rachel Pedres...
Five Fabulous Sinks for Your Kafka Data. #3 will surprise you! (Rachel Pedres...
 
Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL
Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATLBryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL
Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysis
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
 
Тарас Кльоба "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
Тарас Кльоба  "ETL — вже не актуальна; тривалі живі потоки із системою Apache...Тарас Кльоба  "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
Тарас Кльоба "ETL — вже не актуальна; тривалі живі потоки із системою Apache...
 
AMD It's Time to ROC
AMD It's Time to ROCAMD It's Time to ROC
AMD It's Time to ROC
 
Webinar: Large Scale Graph Processing with IBM Power Systems & Neo4j
Webinar: Large Scale Graph Processing with IBM Power Systems & Neo4jWebinar: Large Scale Graph Processing with IBM Power Systems & Neo4j
Webinar: Large Scale Graph Processing with IBM Power Systems & Neo4j
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai
 
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
 
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
 
Model-driven Telemetry: The Foundation of Big Data Analytics
Model-driven Telemetry: The Foundation of Big Data AnalyticsModel-driven Telemetry: The Foundation of Big Data Analytics
Model-driven Telemetry: The Foundation of Big Data Analytics
 
What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017
 
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
 
Architecture at Scale
Architecture at ScaleArchitecture at Scale
Architecture at Scale
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
 
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemXDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 
Scale Your Load Balancer from 0 to 1 million TPS on Azure
Scale Your Load Balancer from 0 to 1 million TPS on AzureScale Your Load Balancer from 0 to 1 million TPS on Azure
Scale Your Load Balancer from 0 to 1 million TPS on Azure
 

More from ✔ Eric David Benari, PMP

SVP of Couchbase: The Exciting World of NoSQL: Scaling NoSQL Data, N1QL vs. S...
SVP of Couchbase: The Exciting World of NoSQL: Scaling NoSQL Data, N1QL vs. S...SVP of Couchbase: The Exciting World of NoSQL: Scaling NoSQL Data, N1QL vs. S...
SVP of Couchbase: The Exciting World of NoSQL: Scaling NoSQL Data, N1QL vs. S...✔ Eric David Benari, PMP
 
Database Camp 2016 @ United Nations, NYC - Javier de la Torre, CEO, CARTO
Database Camp 2016 @ United Nations, NYC - Javier de la Torre, CEO, CARTODatabase Camp 2016 @ United Nations, NYC - Javier de la Torre, CEO, CARTO
Database Camp 2016 @ United Nations, NYC - Javier de la Torre, CEO, CARTO✔ Eric David Benari, PMP
 
Database Camp 2016 @ United Nations, NYC - Minerva Tantoco, CTO of the City o...
Database Camp 2016 @ United Nations, NYC - Minerva Tantoco, CTO of the City o...Database Camp 2016 @ United Nations, NYC - Minerva Tantoco, CTO of the City o...
Database Camp 2016 @ United Nations, NYC - Minerva Tantoco, CTO of the City o...✔ Eric David Benari, PMP
 
NoSQL Object DB & NewSQL Columnar DB, A Tale of Two Databases
NoSQL Object DB & NewSQL Columnar DB, A Tale of Two DatabasesNoSQL Object DB & NewSQL Columnar DB, A Tale of Two Databases
NoSQL Object DB & NewSQL Columnar DB, A Tale of Two Databases✔ Eric David Benari, PMP
 
MySQL to Cassandra: Big Data, High Scale, Data Migration... Oh My! Scott Bonn...
MySQL to Cassandra: Big Data, High Scale, Data Migration... Oh My! Scott Bonn...MySQL to Cassandra: Big Data, High Scale, Data Migration... Oh My! Scott Bonn...
MySQL to Cassandra: Big Data, High Scale, Data Migration... Oh My! Scott Bonn...✔ Eric David Benari, PMP
 
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...✔ Eric David Benari, PMP
 

More from ✔ Eric David Benari, PMP (6)

SVP of Couchbase: The Exciting World of NoSQL: Scaling NoSQL Data, N1QL vs. S...
SVP of Couchbase: The Exciting World of NoSQL: Scaling NoSQL Data, N1QL vs. S...SVP of Couchbase: The Exciting World of NoSQL: Scaling NoSQL Data, N1QL vs. S...
SVP of Couchbase: The Exciting World of NoSQL: Scaling NoSQL Data, N1QL vs. S...
 
Database Camp 2016 @ United Nations, NYC - Javier de la Torre, CEO, CARTO
Database Camp 2016 @ United Nations, NYC - Javier de la Torre, CEO, CARTODatabase Camp 2016 @ United Nations, NYC - Javier de la Torre, CEO, CARTO
Database Camp 2016 @ United Nations, NYC - Javier de la Torre, CEO, CARTO
 
Database Camp 2016 @ United Nations, NYC - Minerva Tantoco, CTO of the City o...
Database Camp 2016 @ United Nations, NYC - Minerva Tantoco, CTO of the City o...Database Camp 2016 @ United Nations, NYC - Minerva Tantoco, CTO of the City o...
Database Camp 2016 @ United Nations, NYC - Minerva Tantoco, CTO of the City o...
 
NoSQL Object DB & NewSQL Columnar DB, A Tale of Two Databases
NoSQL Object DB & NewSQL Columnar DB, A Tale of Two DatabasesNoSQL Object DB & NewSQL Columnar DB, A Tale of Two Databases
NoSQL Object DB & NewSQL Columnar DB, A Tale of Two Databases
 
MySQL to Cassandra: Big Data, High Scale, Data Migration... Oh My! Scott Bonn...
MySQL to Cassandra: Big Data, High Scale, Data Migration... Oh My! Scott Bonn...MySQL to Cassandra: Big Data, High Scale, Data Migration... Oh My! Scott Bonn...
MySQL to Cassandra: Big Data, High Scale, Data Migration... Oh My! Scott Bonn...
 
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
 

Recently uploaded

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph

  • 1. http://blazegraph.com/http://blazegraph.com/ Global Knowledge Collaboration to Cure Cancer: How GPUs Impact Graph & Predictive Analytics Database Camp July 10, 2016 Blazegraph Brad Bebee, CEO
  • 2. http://blazegraph.com/ Exploding Data Volumes Requires New Approach for Relationship Insight SYSTAP™, LLC. © 2006-2015 All Rights Reserved Graph databases are designed to analyze diverse entities and relationships. Today’s datasets have billions of edges and nodes. A (very small) Knowledge Graph 2
  • 3. http://blazegraph.com/ Powers Business with Billion Edge Datasets ​ Information Management ​ Retrieval ​ Industrial ​ IoT ​ Cyber ​ Defense ​ Intelligence ​ Financial Services ​ Fraud Detection ​ Life Sciences ​ Precision Medicine SYSTAP™, LLC. © 2006-2016 All Rights Reserved 3 Blazegraph Vision: The Scalable Solution for Graphs 500+ weekly downloads Thousands of active deployments Blazegraph ​ Enterprise High Availability ​ Enterprise GPU Acceleration ​ Embedded and Single Server Deployments
  • 4. http://blazegraph.com/ Precision Medicine and Hospitals •  5,686 Hospitals in the United States •  Top 10% have 100B+ Edge Graph Problems •  256 K-80 GPUs for a 100B Edge Graph Integration Application •  Syapse Precision Medicine Video 4 3/10/15 5-10 B Edge Graph Problems
  • 5. http://blazegraph.com/ Uncovering influence links in molecular knowledge networks to streamline personalized medicine | Shin, Dmitriy et al.Journal of Biomedical Informatics , Volume 52 , 394 - 405" Finding the Next Cure for Cancer is a Billion+ Edge Graph Challenge
  • 7. http://blazegraph.com/ Graph Analytic Applications: Cyber Defense Example •  Path algorithms (BFS, SSSP, APSP, CC) and their extensions (closeness, betweenness, etc) •  Ranking algorithms (cardinality, PageRank, BadRank, SecureRank) •  Clustering algorithms (canopy, k-means, Jaccard, community detection, etc) •  Many others 7
  • 9. http://blazegraph.com/ Blazegraph™: Embedded and Single Server •  High performance, Scalable –  50B edges/node –  RDF/SPARQL level query language –  Efficient Graph Traversal –  High 9s solution •  Property graphs –  Blueprints, gremlin, rextser •  REST API (NSS) •  Extension points –  Stored queries for custom application logic on the server. –  Custom services & indices –  Custom functions –  Vertex-centric programs Embedded Server Standalone Server 9 JVM Journal WAR Journal
  • 10. http://blazegraph.com/ Blazegraph™: High Availability •  Shared nothing architecture –  Same data on each node –  Coordinate only at commit –  Transparent load balancing •  Scaling –  50 billion triples or quads –  Query throughput scales linearly •  Self healing –  Automatic failover –  Automatic resync after disconnect –  Online single node disaster recovery •  Online Backup –  Online snapshots (full backups) –  HA Logs (incremental backups) •  Point in time recovery (offline) 10 HAService Quorum k=3 size=3 follower leader HAService HAService
  • 11. http://blazegraph.com/ The Billion-Edge Graph Challenge: Scaling Up Requires the Right Paradigm and Hardware SYSTAP, LLC DBA Blazegraph. © 2006-2016 All Rights Reserved 11 https://datatake.files.wordpress.com/2015/09/latency.png TypeofCacheorMemory Access Latency Per Clock Cycle 4.3 3 4 11 11 11 14 18 38 167 0 50 100 150 L1 Cache sequential access L1 Cache in Page Random access L1 Cache in Full Random access L2 Cache sequential access L2 Cache in Page Random access L2 Cache in Full Random access L3 Cache sequential access L3 Cache in Page Random access L3 Cache in Full Random access Main memory CPU Cache Access Latencies in Clock Cycles Graph Cache Thrash The CPU just waits for graph data from main memory...
  • 12. http://blazegraph.com/ Blazegraph Multi-GPU: Extreme Scale Traverse 1B Edges/Sec (GTEP) 40x more affordably! SYSTAP, LLC DBA Blazegraph. © 2006-2016 All Rights Reserved Cray XMT-2 $~180K per GTEP COST COST Large Hadoop Cluster $~18M per GTEP 1 GTEP = 1 Billion Traversed Edges Per Second Blazegraph with GPU Clusters $16K per GTEP (K40) $4K per GTEP (Pascal)
  • 13. http://blazegraph.com/ With familiar graph APIs; 200-300x acceleration with almost no code changes. Enabling GPU Acceleration without Code Changes SYSTAP, LLC DBA Blazegraph. © 2006-2016 All Rights Reserved 13 Blazegraph GPU Acceleration NVIDIA 
 Tesla GPU Graph DB Blazegraph Plug-in for GPU Acceleration Solves “Graph Cache Thrash” Funded by and co-developed with:
  • 14. http://blazegraph.com/ 172,632 results 187ms on Blazegraph GPU 172,632 results 53,960ms on Blazegraph CPU LUBM Query #9 U1000 (167,697,263 Edges w/Inference) Just Add GPUs –Testing Shows 200-300x Speed-up SYSTAP, LLC DBA Blazegraph. © 2006-2016 All Rights Reserved 14 Graph Database Graph Database 290X
  • 15. http://blazegraph.com/ 1403 1843 700 0 200 400 600 800 1000 1200 1400 1600 1800 2000 700M Edges Single Node Xeon 2650 (128G) vs 2 K80 (48G) 1.98 Edges 16 EC2 r3.xlarge (488G) vs 16 K40s (192G) 1.98 Edges 16 EC2 r3.4xlarge (1952G) vs 16 K40s (192G) 1.98 Edges Spark CPU Baseline GPUs 700x-1800x Faster for Graphs Compared to Apache Spark on 700M and 1.9B Edge Graph SYSTAP™, LLC. © 2006-2015 All Rights Reserved Speed-up(Higherisfaster) 1 15 Speed-up over Baseline Spark CPU Configuration
  • 16. http://blazegraph.com/ Did you hear the one about the Analytics Developer who is also a Parallel Programming expert? SYSTAP™, LLC © 2006-2015 All Rights Reserved 16 Need: Simpler, smarter algorithms that run efficiently on GPUs without requiring knowledge of the parallel programming or device-specific opTmizaTon. Algorithm Developer Parallel Programmer
  • 17. http://blazegraph.com/ Blazegraph DASL for GPU Acceleration SYSTAP™, LLC. © 2006-2015 All Rights Reserved 17 Blazegraph DASL – A Domain specific language for graphs with Accelerated Scala using Linear algebra …you can see why we call it “DASL” Automation Enabling Analytics Coders to Optimize for GPU Delivering ease of use of Spark and Scala for graph algorithms and predictive analytics… your analytic code can work on GPUs and Blazegraph. DASL Executor Multi-GPU Extension DASL Translator DASL Graph Algorithms …with the speed of CUDA… …Graph and Machine Learning Algorithms…
  • 18. http://blazegraph.com/ DASL your analytics •  DASL is functional, domain-specific language (DSL) for graph and machine learning algorithms •  Provides of linear algebra primitives supporting the DSL designed for efficient, multi-GPU execution •  Reduce the barrier of graphs on GPUS, enabling simpler and smarter algorithms •  Co-opt existing data ecosystems, such as Apache Spark and Hadoop, for ease of adoption and mission impact SYSTAP™, LLC © 2006-2015 All Rights Reserved 18 3/10/1 5 Graph and Machine Learning Algorithms 1000X DASL Executor Multi-GPU Extension DASL Translator DASL Graph Algorithms
  • 20. http://blazegraph.com/ GPU-Accelerated DASL Algorithms SYSTAP™, LLC. © 2006-2015 All Rights Reserved 20 Graph •  Hierarchical partitioning/graph coarsening •  Louvain modularity •  Jaccard Similarity •  Triangle counting Collaborative Filtering (Recommendation Systems) •  User-based, Item- based •  Matrix Factorization with ALS (alternating least squares) •  Weighted Matrix Factorization, SVD+ Supervised Neural Network Techniques •  Hidden Markov Models •  Multilayer Perceptron Clustering •  Canopy, k-Means, Fuzzy k-Means, Streaming k-Means •  Spectral Clustering Dimensionality Reduction •  Singular Value Decomposition (top-k) •  Lanczos Algorithm •  Stochastic SVD •  QR Decomposition •  Non-Negative matrix factorization (NNMF) •  Distance Geometry …and many others!
  • 21. http://blazegraph.com/ Stay in Touch Brad Bebee, CEO beebs@blazegraph.com http://blazegraph.com @blazegraph
  • 22. http://blazegraph.com/ Case Study - NetFlow Data Analysis •  What is NetFlow data? •  NetFlow is a network protocol developed by Cisco for collecting IP traffic information and monitoring network traffic. http://www.solarwinds.com/what-is-netflow.aspx •  Sample NetFlow record {"start_time":"2007-08-01 14:31:02.946”,”duration":0.000, “src_addr":"122.166.71.110", “dst_addr":"9.155.118.136", "src_port":10822,"dst_port":13567,"protocol":"UDP “,"tcp_flags":"......","input_packets":1,"input_bytes":46} •  How is NetFlow collected? •  Generated by routers and forwarded to collection point •  Processed out of pcap records using tools such as nfcapd/ nfdump 22
  • 23. http://blazegraph.com/ Graph Algorithms Applicable to NetFlow •  Path algorithms (BFS, SSSP, APSP, CC) and their extensions (closeness, betweenness, etc) •  Ranking algorithms (cardinality, PageRank, BadRank, SecureRank) •  Clustering algorithms (canopy, k- means, Jaccard, community detection, etc) •  Many others! 23
  • 24. http://blazegraph.com/ Challenges of Network Flow Analysis 24 •  The volume of NetFlow packets could not previously be efficiently analyzed or visualized to identify threats in real time –  One hour of collection on a modest network can easily generate more than 100M NetFlow records –  Visualizing this data to identify points of interest is nearly impossible due to the “hairball effect”, even after time windowing the data –  These issues largely led to attempts at batch processing the data on large clusters of machines using Hadoop and/or Spark –  Pushing the problem into the batch space puts organizations in the unenviable position of hoping to identify why and by whom their network was attacked last month rather than being able to identify an attack as it is occurring.
  • 27. http://blazegraph.com/ Visual Analysis - Identification of Exfiltrated Traffic 27
  • 28. http://blazegraph.com/ Other new capabilities for high-speed, large-scale graph processing New technologies to deliver to deliver an HPC capability for processing very large scale graphs at high speed. •  Blazegraph DASL provides a high-level way to author graph analytics and integrate with Apache Spark and Hadoop Data ecosystems. •  Power8 / OpenPOWER bring high speed interconnects for moving data quickly (3X faster) •  New NVIDIA P100 GPUs provide stacked memory, 720 GB/s of memory bandwidth, and unified memory addressing. SYSTAP™, LLC. © 2006-2015 All Rights Reserved 28
  • 29. http://blazegraph.com/ Research Space SYSTAP™, LLC. © 2006-2015 All Rights Reserved 29
  • 30. http://blazegraph.com/ Users Original data sources FrontendBackend Knowledge Graph Creation Museum visitor Museums and other sources •  Data crawling •  Data transformation (CIDOC-CRM) •  Data Interlinking •  Data enrichment / Information extraction Cards Social networks DBpedia BriTsh Museum Data User Data Mobile App •  HTML5 Templates + CSS for mobile devices •  Google Glass App •  QR Code recognition •  Pattern / image recognition •  Context reasoning Knowledge Graph Exploration •  Templates for visualization (CIDOC-CRM and external data) •  Timeline, Maps •  PivotViewer for visual exploration •  Semantic Search •  Graph Analytics (GAS) Website visitorData Expert Research Space / British Museum Project Architecture and Use Cases
  • 31. http://blazegraph.com/ System will either have their own terminology to qualify place Or More commonly it will be implicit and not explicit. Relationships both support and bypass this heterogeneity at the same time
  • 33. http://blazegraph.com/ FROM means •  Thing has been located at a place or •  Thing was created at a place or •  Thing was created by a person from a place or •  Thing was part of an event that happened at a place or •  Thing was acquired at a place Research Space / British Museum Video
  • 34. http://blazegraph.com/ This subset of FROM is just “is/was located at” – These are photographs that are located in India
  • 35. http://blazegraph.com/ This subset of FROM is just “refers to” India – These objects may not be in India, or from India, but make some reference to India. India is a subject.
  • 36. http://blazegraph.com/ Things from India created in the 17th century
  • 37. http://blazegraph.com/ This is a life boat from a ship (The Nancy Packet) wrecked while retuning from India.
  • 38. http://blazegraph.com/ Getting to 1 Trillion Edge Graphs on GPUs -  1 GPU -  K20 (6G) : 125M edges -  K40 (12G) : 250M edges -  K80 (24G) : 500M edges (Gemini) -  M40 (24G) : 500M edges -  Pascal (32G) : 750M edges (Future) -  1 Node -  8 x K80 : 4B edges -  8 x Pascal : 6B edges -  Cluster with 1T edges -  2000 Pascal GPUs (242 nodes) (Future) 38