SlideShare a Scribd company logo
1 of 21
Download to read offline
Lizard 
Clustering an RDF Triplestore 
Andy Seaborne 
andy@a.o
Why Cluster? 
➢ Service Resilience 
○ Failures 
○ Server admin and security patches 
➢ Performance / scale 
○ More hardware : CPU, RAM, system bus
Why Cluster? 
➢ Alternatives 
○ Full clustering 
○ Master-slave (load scale only ; update issues) 
○ Application visible partitioning
Who am I? 
➢ Committer on Apache Jena 
○ Deploy/operate Jena/TDB in ££job. 
➢ W3C 
○ Co-editor on SPARQ 1.0 and 1.1 query lang 
spaces 
○ RDF 1.1 (on syntax, inc. SPARQL alignment) 
○ ASF’s W3C AC representative
Acknowledgements 
➢ Apache 
➢ Partial funding : InnovateUK* 
➢ Users 
○ For the discussion and encouragement 
* Used to be the Technology Strategy Board. 
UK Department for Business, Innovation & Skills
Outline 
➢ TDB Design 
➢ SPARQL Execution 
➢ Lizard Design 
➢ Back to SPARQL
TDB 
➢ Custom RDF Database 
○ Quads + Triples 
➢ Custom index code 
○ Threaded B+Trees 
➢ Dictionary terms 
○ NodeId – 8 bytes 
○ Inline terms (numbers, dates, dateTimes)
TDB : Node Table
TDB : Indexes 
➢ Indexes are covering 
○ Range scans 
○ All key, no value 
○ No "triple table"
SPARQL Execution 
{ ?x :p 123 . } 
Convert to NodeIds 
Look in POS to get all PO?, assign S to ?x 
123 is an inline constant in TDB. 
{ ?x :p 123 . 
?x :q ?v . } 
A database join 
Index join (Loop+substitution) 
Index join (= loop) on 
:x1 :q ?v 
where :x1 is the value of ?x
Index Implementation 
➢ TDB uses threaded B+Trees for indexes 
○ 8K blocks 100-way B+Tree 
○ Threaded : scan only touches data blocks 
SPO SPO SPO ------ ------ ------ 
Ptr Ptr ------ ------ ------ 
SPO SPO SPO SPO ------ ------ 
Ptr Ptr Ptr ------ ------ 
SPO SPO SPO SPO SPO SPO SPO SPO SPO SPO ------ ------
Choices 
Query and Update 
Indexes / B+Trees Node table / Objects 
Blocks Key → Value Store 
Where to introduce distribution?
This Does Not Work (very well) 
Query and Update 
B+Trees Objects 
Blocks Key→Value 
➢ Impedance mismatch 
Distribute the storage 
K->V store 
Index access on query processor 
○ Too much data moving about 
○ Little parallelism 
○ Bad cold-start
Distribute 
Query and Update 
B+Trees Objects 
Blocks Key→Value 
➢ Distribute the indexes 
○ With modified index access 
➢ Distribute the nodes 
➢ Comms : Apache Thrift
Clustered Node Table 
➢ Node Table 
○ N replicas; Read R / Write W 
e.g. W=N and R =1 => 
Complete copies of node table on each data server 
○ Replaceable 
Requirement: NodeId for naming
Clustered Indexes 
➢ Indexes 
○ Shard by subject 
○ Replica shards 
○ Compound access operations
Clustered Indexes 
Index 
Shard 1 Shard 2 Shard 3 
Machine 1 Machine 2
Configuration 1
Configuration 2
Modified SPARQL Execution 
➢ Different unit of index access 
○ subject + several predicates 
(subj, pred1, pred2, pred3, …) 
➢ Different join algorithms 
○ Merge join 
○ Parallel hash join
Paul Hirst / CC-BY-SA-2.5

More Related Content

What's hot

First impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithmFirst impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithmInfoFarm
 
Presentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKV
Presentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKVPresentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKV
Presentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKVKevin Xu
 
Terark Product and Technology
Terark Product and TechnologyTerark Product and Technology
Terark Product and TechnologyXinyuan Fu
 
Interoperability with netCDF-4 - Experience with NPP and HDF-EOS5 products
Interoperability with netCDF-4 - Experience with NPP and HDF-EOS5 productsInteroperability with netCDF-4 - Experience with NPP and HDF-EOS5 products
Interoperability with netCDF-4 - Experience with NPP and HDF-EOS5 productsThe HDF-EOS Tools and Information Center
 
FIWARE Wednesday Webinars - Interface With Machines and Robots: Building Inte...
FIWARE Wednesday Webinars - Interface With Machines and Robots: Building Inte...FIWARE Wednesday Webinars - Interface With Machines and Robots: Building Inte...
FIWARE Wednesday Webinars - Interface With Machines and Robots: Building Inte...FIWARE
 
Partitioning SKA Dataflows for Optimal Graph Execution
Partitioning SKA Dataflows for Optimal Graph ExecutionPartitioning SKA Dataflows for Optimal Graph Execution
Partitioning SKA Dataflows for Optimal Graph Execution Chen Wu
 
Presto Bangalore Meetup1 Presto Raptor@ola
Presto Bangalore Meetup1 Presto Raptor@olaPresto Bangalore Meetup1 Presto Raptor@ola
Presto Bangalore Meetup1 Presto Raptor@olaShubham Tagra
 
(JVM) Garbage Collection - Brown Bag Session
(JVM) Garbage Collection - Brown Bag Session(JVM) Garbage Collection - Brown Bag Session
(JVM) Garbage Collection - Brown Bag SessionJens Hadlich
 
Improve data engineering work with Digdag and Presto UDF
Improve data engineering work with Digdag and Presto UDFImprove data engineering work with Digdag and Presto UDF
Improve data engineering work with Digdag and Presto UDFKentaro Yoshida
 
Managing Machine Learning workflows on Treasure Data
Managing Machine Learning workflows on Treasure DataManaging Machine Learning workflows on Treasure Data
Managing Machine Learning workflows on Treasure DataAki Ariga
 
Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17
Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17
Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17Muga Nishizawa
 
Hadoop and Hive Development at Facebook
Hadoop and Hive Development at FacebookHadoop and Hive Development at Facebook
Hadoop and Hive Development at Facebookelliando dias
 
Introducing Koalas 1.0 (and 1.1)
Introducing Koalas 1.0 (and 1.1)Introducing Koalas 1.0 (and 1.1)
Introducing Koalas 1.0 (and 1.1)Takuya UESHIN
 

What's hot (20)

Usage of NCL, IDL, and MATLAB to access NASA HDF4/HDF-EOS2/HDF-EOS5 data
Usage of NCL, IDL, and MATLAB to access NASA HDF4/HDF-EOS2/HDF-EOS5 dataUsage of NCL, IDL, and MATLAB to access NASA HDF4/HDF-EOS2/HDF-EOS5 data
Usage of NCL, IDL, and MATLAB to access NASA HDF4/HDF-EOS2/HDF-EOS5 data
 
First impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithmFirst impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithm
 
Implementing HDF5 in MATLAB
Implementing HDF5 in MATLABImplementing HDF5 in MATLAB
Implementing HDF5 in MATLAB
 
Presentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKV
Presentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKVPresentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKV
Presentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKV
 
Data Are from Mars, Tools Are from Venus
Data Are from Mars, Tools Are from VenusData Are from Mars, Tools Are from Venus
Data Are from Mars, Tools Are from Venus
 
Terark Product and Technology
Terark Product and TechnologyTerark Product and Technology
Terark Product and Technology
 
Interoperability with netCDF-4 - Experience with NPP and HDF-EOS5 products
Interoperability with netCDF-4 - Experience with NPP and HDF-EOS5 productsInteroperability with netCDF-4 - Experience with NPP and HDF-EOS5 products
Interoperability with netCDF-4 - Experience with NPP and HDF-EOS5 products
 
FIWARE Wednesday Webinars - Interface With Machines and Robots: Building Inte...
FIWARE Wednesday Webinars - Interface With Machines and Robots: Building Inte...FIWARE Wednesday Webinars - Interface With Machines and Robots: Building Inte...
FIWARE Wednesday Webinars - Interface With Machines and Robots: Building Inte...
 
Introduction to HDF5 Data and Programming Models
Introduction to HDF5 Data and Programming ModelsIntroduction to HDF5 Data and Programming Models
Introduction to HDF5 Data and Programming Models
 
HDF Group Support for NPP/NPOESS/JPSS
HDF Group Support for NPP/NPOESS/JPSSHDF Group Support for NPP/NPOESS/JPSS
HDF Group Support for NPP/NPOESS/JPSS
 
Partitioning SKA Dataflows for Optimal Graph Execution
Partitioning SKA Dataflows for Optimal Graph ExecutionPartitioning SKA Dataflows for Optimal Graph Execution
Partitioning SKA Dataflows for Optimal Graph Execution
 
Presto Bangalore Meetup1 Presto Raptor@ola
Presto Bangalore Meetup1 Presto Raptor@olaPresto Bangalore Meetup1 Presto Raptor@ola
Presto Bangalore Meetup1 Presto Raptor@ola
 
Moving form HDF4 to HDF5/netCDF-4
Moving form HDF4 to HDF5/netCDF-4Moving form HDF4 to HDF5/netCDF-4
Moving form HDF4 to HDF5/netCDF-4
 
(JVM) Garbage Collection - Brown Bag Session
(JVM) Garbage Collection - Brown Bag Session(JVM) Garbage Collection - Brown Bag Session
(JVM) Garbage Collection - Brown Bag Session
 
Improve data engineering work with Digdag and Presto UDF
Improve data engineering work with Digdag and Presto UDFImprove data engineering work with Digdag and Presto UDF
Improve data engineering work with Digdag and Presto UDF
 
Managing Machine Learning workflows on Treasure Data
Managing Machine Learning workflows on Treasure DataManaging Machine Learning workflows on Treasure Data
Managing Machine Learning workflows on Treasure Data
 
Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17
Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17
Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17
 
Hadoop and Hive Development at Facebook
Hadoop and Hive Development at FacebookHadoop and Hive Development at Facebook
Hadoop and Hive Development at Facebook
 
HDF5 Performance Enhancements with the Elimination of Unlimited Dimension
HDF5 Performance Enhancements with the Elimination of Unlimited DimensionHDF5 Performance Enhancements with the Elimination of Unlimited Dimension
HDF5 Performance Enhancements with the Elimination of Unlimited Dimension
 
Introducing Koalas 1.0 (and 1.1)
Introducing Koalas 1.0 (and 1.1)Introducing Koalas 1.0 (and 1.1)
Introducing Koalas 1.0 (and 1.1)
 

Similar to 2014-11 ApacheConEU : Lizard - Clustering an RDF TripleStore

Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...Danielle Womboldt
 
Ceph Day Beijing - Optimizing Ceph performance by leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph performance by leveraging Intel Optane and...Ceph Day Beijing - Optimizing Ceph performance by leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph performance by leveraging Intel Optane and...Ceph Community
 
3.INTEL.Optane_on_ceph_v2.pdf
3.INTEL.Optane_on_ceph_v2.pdf3.INTEL.Optane_on_ceph_v2.pdf
3.INTEL.Optane_on_ceph_v2.pdfhellobank1
 
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course PROIDEA
 
Hadoop 3 @ Hadoop Summit San Jose 2017
Hadoop 3 @ Hadoop Summit San Jose 2017Hadoop 3 @ Hadoop Summit San Jose 2017
Hadoop 3 @ Hadoop Summit San Jose 2017Junping Du
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateDataWorks Summit
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureCeph Community
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitecturePatrick McGarry
 
Real World Performance - Data Warehouses
Real World Performance - Data WarehousesReal World Performance - Data Warehouses
Real World Performance - Data WarehousesConnor McDonald
 
Zero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightZero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightDataWorks Summit
 
Explore big data at speed of thought with Spark 2.0 and Snappydata
Explore big data at speed of thought with Spark 2.0 and SnappydataExplore big data at speed of thought with Spark 2.0 and Snappydata
Explore big data at speed of thought with Spark 2.0 and SnappydataData Con LA
 
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftBest Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftAmazon Web Services
 
RDFox Poster
RDFox PosterRDFox Poster
RDFox PosterDBOnto
 
Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Cloudera, Inc.
 
Scaling Ceph at CERN - Ceph Day Frankfurt
Scaling Ceph at CERN - Ceph Day Frankfurt Scaling Ceph at CERN - Ceph Day Frankfurt
Scaling Ceph at CERN - Ceph Day Frankfurt Ceph Community
 
Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions Ceph Community
 
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake SolutionCeph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake SolutionKaran Singh
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataHakka Labs
 
AltaVista Search Engine Architecture
AltaVista Search Engine ArchitectureAltaVista Search Engine Architecture
AltaVista Search Engine ArchitectureChangshu Liu
 

Similar to 2014-11 ApacheConEU : Lizard - Clustering an RDF TripleStore (20)

Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
 
Ceph Day Beijing - Optimizing Ceph performance by leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph performance by leveraging Intel Optane and...Ceph Day Beijing - Optimizing Ceph performance by leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph performance by leveraging Intel Optane and...
 
3.INTEL.Optane_on_ceph_v2.pdf
3.INTEL.Optane_on_ceph_v2.pdf3.INTEL.Optane_on_ceph_v2.pdf
3.INTEL.Optane_on_ceph_v2.pdf
 
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
 
Hadoop 3 @ Hadoop Summit San Jose 2017
Hadoop 3 @ Hadoop Summit San Jose 2017Hadoop 3 @ Hadoop Summit San Jose 2017
Hadoop 3 @ Hadoop Summit San Jose 2017
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community Update
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference Architecture
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference Architecture
 
The state of SQL-on-Hadoop in the Cloud
The state of SQL-on-Hadoop in the CloudThe state of SQL-on-Hadoop in the Cloud
The state of SQL-on-Hadoop in the Cloud
 
Real World Performance - Data Warehouses
Real World Performance - Data WarehousesReal World Performance - Data Warehouses
Real World Performance - Data Warehouses
 
Zero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightZero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsight
 
Explore big data at speed of thought with Spark 2.0 and Snappydata
Explore big data at speed of thought with Spark 2.0 and SnappydataExplore big data at speed of thought with Spark 2.0 and Snappydata
Explore big data at speed of thought with Spark 2.0 and Snappydata
 
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftBest Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
 
RDFox Poster
RDFox PosterRDFox Poster
RDFox Poster
 
Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013
 
Scaling Ceph at CERN - Ceph Day Frankfurt
Scaling Ceph at CERN - Ceph Day Frankfurt Scaling Ceph at CERN - Ceph Day Frankfurt
Scaling Ceph at CERN - Ceph Day Frankfurt
 
Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions
 
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake SolutionCeph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
 
AltaVista Search Engine Architecture
AltaVista Search Engine ArchitectureAltaVista Search Engine Architecture
AltaVista Search Engine Architecture
 

More from andyseaborne

SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020andyseaborne
 
2016-02 Graphs - PG+RDF
2016-02 Graphs - PG+RDF2016-02 Graphs - PG+RDF
2016-02 Graphs - PG+RDFandyseaborne
 
Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphsandyseaborne
 
Graph Data -- RDF and Property Graphs
Graph Data -- RDF and Property GraphsGraph Data -- RDF and Property Graphs
Graph Data -- RDF and Property Graphsandyseaborne
 
SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)andyseaborne
 
NoSQL and Triple Stores
NoSQL and Triple StoresNoSQL and Triple Stores
NoSQL and Triple Storesandyseaborne
 

More from andyseaborne (6)

SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020
 
2016-02 Graphs - PG+RDF
2016-02 Graphs - PG+RDF2016-02 Graphs - PG+RDF
2016-02 Graphs - PG+RDF
 
Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphs
 
Graph Data -- RDF and Property Graphs
Graph Data -- RDF and Property GraphsGraph Data -- RDF and Property Graphs
Graph Data -- RDF and Property Graphs
 
SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)
 
NoSQL and Triple Stores
NoSQL and Triple StoresNoSQL and Triple Stores
NoSQL and Triple Stores
 

Recently uploaded

Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...SofiyaSharma5
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGAPNIC
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445ruhi
 
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.soniya singh
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024APNIC
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
horny (9316020077 ) Goa Call Girls Service by VIP Call Girls in Goa
horny (9316020077 ) Goa  Call Girls Service by VIP Call Girls in Goahorny (9316020077 ) Goa  Call Girls Service by VIP Call Girls in Goa
horny (9316020077 ) Goa Call Girls Service by VIP Call Girls in Goasexy call girls service in goa
 
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya Shirtrahman018755
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebJames Anderson
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Delhi Call girls
 
AlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsAlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsThierry TROUIN ☁
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersDamian Radcliffe
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...Diya Sharma
 
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663Call Girls Mumbai
 

Recently uploaded (20)

Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOG
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
 
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
 
horny (9316020077 ) Goa Call Girls Service by VIP Call Girls in Goa
horny (9316020077 ) Goa  Call Girls Service by VIP Call Girls in Goahorny (9316020077 ) Goa  Call Girls Service by VIP Call Girls in Goa
horny (9316020077 ) Goa Call Girls Service by VIP Call Girls in Goa
 
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICECall Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
 
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
 
AlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsAlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with Flows
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
 
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
 
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
 

2014-11 ApacheConEU : Lizard - Clustering an RDF TripleStore

  • 1. Lizard Clustering an RDF Triplestore Andy Seaborne andy@a.o
  • 2. Why Cluster? ➢ Service Resilience ○ Failures ○ Server admin and security patches ➢ Performance / scale ○ More hardware : CPU, RAM, system bus
  • 3. Why Cluster? ➢ Alternatives ○ Full clustering ○ Master-slave (load scale only ; update issues) ○ Application visible partitioning
  • 4. Who am I? ➢ Committer on Apache Jena ○ Deploy/operate Jena/TDB in ££job. ➢ W3C ○ Co-editor on SPARQ 1.0 and 1.1 query lang spaces ○ RDF 1.1 (on syntax, inc. SPARQL alignment) ○ ASF’s W3C AC representative
  • 5. Acknowledgements ➢ Apache ➢ Partial funding : InnovateUK* ➢ Users ○ For the discussion and encouragement * Used to be the Technology Strategy Board. UK Department for Business, Innovation & Skills
  • 6. Outline ➢ TDB Design ➢ SPARQL Execution ➢ Lizard Design ➢ Back to SPARQL
  • 7. TDB ➢ Custom RDF Database ○ Quads + Triples ➢ Custom index code ○ Threaded B+Trees ➢ Dictionary terms ○ NodeId – 8 bytes ○ Inline terms (numbers, dates, dateTimes)
  • 8. TDB : Node Table
  • 9. TDB : Indexes ➢ Indexes are covering ○ Range scans ○ All key, no value ○ No "triple table"
  • 10. SPARQL Execution { ?x :p 123 . } Convert to NodeIds Look in POS to get all PO?, assign S to ?x 123 is an inline constant in TDB. { ?x :p 123 . ?x :q ?v . } A database join Index join (Loop+substitution) Index join (= loop) on :x1 :q ?v where :x1 is the value of ?x
  • 11. Index Implementation ➢ TDB uses threaded B+Trees for indexes ○ 8K blocks 100-way B+Tree ○ Threaded : scan only touches data blocks SPO SPO SPO ------ ------ ------ Ptr Ptr ------ ------ ------ SPO SPO SPO SPO ------ ------ Ptr Ptr Ptr ------ ------ SPO SPO SPO SPO SPO SPO SPO SPO SPO SPO ------ ------
  • 12. Choices Query and Update Indexes / B+Trees Node table / Objects Blocks Key → Value Store Where to introduce distribution?
  • 13. This Does Not Work (very well) Query and Update B+Trees Objects Blocks Key→Value ➢ Impedance mismatch Distribute the storage K->V store Index access on query processor ○ Too much data moving about ○ Little parallelism ○ Bad cold-start
  • 14. Distribute Query and Update B+Trees Objects Blocks Key→Value ➢ Distribute the indexes ○ With modified index access ➢ Distribute the nodes ➢ Comms : Apache Thrift
  • 15. Clustered Node Table ➢ Node Table ○ N replicas; Read R / Write W e.g. W=N and R =1 => Complete copies of node table on each data server ○ Replaceable Requirement: NodeId for naming
  • 16. Clustered Indexes ➢ Indexes ○ Shard by subject ○ Replica shards ○ Compound access operations
  • 17. Clustered Indexes Index Shard 1 Shard 2 Shard 3 Machine 1 Machine 2
  • 20. Modified SPARQL Execution ➢ Different unit of index access ○ subject + several predicates (subj, pred1, pred2, pred3, …) ➢ Different join algorithms ○ Merge join ○ Parallel hash join
  • 21. Paul Hirst / CC-BY-SA-2.5