Public
11th May 2017
Syed Haniff
Creating a Data Distribution Knowledge Base using Neo4j
Using graph technologies to map and manage data flows within the Bank
1
• Reference data at UBS
• Building an integrated data distribution platform
• Creating a Knowledge Base using Neo4j
Overview
2
• Founded 1854
• Headquarters: Zurich, Switzerland
• Operates in 50+ countries
• Around 60,000 employees
• 6 Businesses
– Wealth Management
– Wealth Management Americas
– Personal & Corporate Banking
– Asset Management
– Investment Bank
– Corporate Centre
About UBS
3
GDS manages the mastering and distribution of reference data to consumers
within the Bank.
About Group Data Services
4
• Externally and internally sourced non-transactional data:
– Account
– Book
– Calendar
– Client
– Confirms
– Financial Instrument
– Legal Entity
– Group Dictionary
– Prices
– Product
– Trading Agreement
– Settlement Instruction
Reference Data at UBS
5
12 Data Domains
18 Datasets
7 Distribution Channels
400+ Integrations
000s Attributes
Group Data Services in Numbers
6
Providing timely, accurate, and complete reference data to users, systems, and
processes through a number of channels.
Reference Data Distribution
7
[Figure: Overview]
Features
• Masters send normalized, canonical datasets.
• Consumers land and join datasets themselves.
• Good for producers (master data sources) … not so good for consumers.
Data Distribution – Previously
8
Example – Consumer joins
Consumers store multiple messages from multiple domains and resolve joins
themselves
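A minimal Python sketch of what this slide implies for each consumer. The message shapes, the `instrument` and `rating` domain names, and the `ConsumerStore` class are illustrative assumptions, not taken from the deck: the consumer keeps a local store per domain and resolves the join itself.

```python
# Hypothetical sketch: a consumer that lands messages from two reference-data
# domains and resolves the join locally. All names are illustrative only.
from collections import defaultdict


class ConsumerStore:
    """Local landing area that each consumer maintains for itself."""

    def __init__(self):
        # One keyed table per domain, e.g. store["instrument"]["I1"] = {...}
        self.store = defaultdict(dict)

    def land(self, domain, key, message):
        """Keep the latest message per (domain, key)."""
        self.store[domain][key] = message

    def joined_view(self, left, right):
        """Inner-join two landed domains on their shared key."""
        rows = []
        for key, left_msg in self.store[left].items():
            right_msg = self.store[right].get(key)
            if right_msg is not None:
                rows.append({"key": key, **left_msg, **right_msg})
        return rows


consumer = ConsumerStore()
consumer.land("instrument", "I1", {"name": "Bond A"})
consumer.land("instrument", "I2", {"name": "Equity B"})
consumer.land("rating", "I1", {"rating": "AA"})

# Only I1 has both an instrument and a rating message, so only I1 is joined.
view = consumer.joined_view("instrument", "rating")
```

In this model every consumer repeats the same landing and join logic for itself.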
9
Driver | Situation | Impact
Simplification | Multiple components doing the same or similar tasks | Cost+, Complexity+
Risk Reduction | Consumers have to store and join reference data | Data staleness+, Potential for errors+
Efficiency | Consumers receive updates they are not interested in | Storage volumes+, Processing volumes+
Business Drivers for Change
10
[Figure: Overview]
Features
• Single platform consuming data from masters
• Platform integrates datasets
• Custom or normalized datasets sent via standardized channels
Distribution Platform – Blueprint
11
Example – Platform joins
Data joined at source and available for multiple consumers – simplifies consumption
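The idea of joining once at the platform can be sketched in Python as follows. This is only an illustration under assumed names: the datasets, the `id` key, and the consumer names `risk_system` and `client_portal` are hypothetical. It also shows the flexible attribute subscription that the deck lists among the platform benefits.

```python
# Hypothetical sketch: the platform joins datasets once and each consumer
# subscribes only to the attributes it needs. All names are illustrative.

def pre_join(left_rows, right_rows, key):
    """Join two datasets once, at the platform, on a shared key."""
    right_by_key = {row[key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is not None:
            joined.append({**row, **match})
    return joined


def project(rows, attributes):
    """Return only the attributes a consumer has subscribed to."""
    return [{name: row[name] for name in attributes} for row in rows]


instruments = [{"id": "I1", "name": "Bond A"}, {"id": "I2", "name": "Equity B"}]
ratings = [{"id": "I1", "rating": "AA"}, {"id": "I2", "rating": "BBB"}]

platform_dataset = pre_join(instruments, ratings, key="id")

# Two consumers share the same pre-joined dataset but see different attributes.
subscriptions = {
    "risk_system": ["id", "rating"],
    "client_portal": ["id", "name"],
}
deliveries = {
    consumer: project(platform_dataset, attrs)
    for consumer, attrs in subscriptions.items()
}
```

The join runs once regardless of how many consumers subscribe; only the projection differs per consumer.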
12
• Single Platform
• Pre-joined datasets
• Flexible subscription to attributes
• More consumer-oriented …
But there are still things we'd like to know …
Platform Benefits
13
What datasets and attributes do we provide?
How are the different datasets related?
How are users receiving our data?
Which consumers are using which attributes?
Knowledge Base
Data Distribution – Questions
18
A system component that lets us describe the journey of the
datasets and attributes from master systems to consumers
What is the Knowledge Base?
19
Building the Knowledge Base – Example Model
20
• Initially driven by platform (not human) requirements
• XLS + custom DSL (Domain Specific Language)
• E.g. composite INSTRUMENT dataset
– BOND_BONDRATING, EQUITY_EQUITYRATING
– "," → union between two datasets
– "_" → join between two datasets
• (+) Innovative and allowed us to build the platform
• (−) Limited, complex, inflexible
Physical Model – 1.0
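To make the notation concrete, here is a small Python sketch of how an expression such as `BOND_BONDRATING, EQUITY_EQUITYRATING` could be read. It is not the Bank's parser. It assumes that `_` binds tighter than `,` and that dataset names contain neither character; the function names are hypothetical.

```python
# Hypothetical parser for the composite-dataset notation shown on the slide.
# Assumptions (not confirmed by the deck): "_" binds tighter than ",", and
# dataset names themselves contain neither character.

def parse_composite(expression):
    """Return a list of union branches, each a list of datasets to join."""
    unions = []
    for branch in expression.split(","):
        datasets = [name.strip() for name in branch.split("_") if name.strip()]
        if not datasets:
            raise ValueError(f"empty branch in expression: {expression!r}")
        unions.append(datasets)
    return unions


def describe(expression):
    """Render the parsed structure as readable text."""
    branches = [" JOIN ".join(names) for names in parse_composite(expression)]
    return " UNION ".join(f"({branch})" for branch in branches)


instrument = parse_composite("BOND_BONDRATING, EQUITY_EQUITYRATING")
summary = describe("BOND_BONDRATING, EQUITY_EQUITYRATING")
```

Under these assumptions a dataset name could never contain an underscore, which hints at why a notation like this can become limiting.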
21
Can it answer our questions …?
22
• Challenging to build a relational model that answers all the (diverse) questions
• Lots of different entities …
• Lots of different relationships …
• Not all data flows are the same …
• Tough to get the performance needed with a generic relational model
… Not really or easily anyway
23
The "Eureka!" moment …
Looks like a graph … maybe we should store it as a graph(!)
24
• Store the metamodel in a graph database
• Neo4j
– Used in the Bank
– Mature
– Comprehensive resources online
– Drivers / Adapters matching language choices
Physical Model – 2.0
25
Example Dataset – Equity
26
Answers to the questions …
What datasets and attributes do we provide?
Cypher query:
MATCH (d:Dataset)-[:OWNS]->(a:PhysicalAttribute)
RETURN d, a;
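For readers new to Cypher, the pattern in the query above can be mimicked over a tiny in-memory edge list. The sketch below is a hypothetical stand-in, not how Neo4j evaluates queries. The node names are invented; only the `Dataset` and `PhysicalAttribute` labels and the `OWNS` relationship come from the slide.

```python
# Hypothetical in-memory stand-in for the graph. Node names are invented;
# the labels and the OWNS relationship type come from the slide's query.
labels = {
    "EQUITY": "Dataset",
    "BOND": "Dataset",
    "isin": "PhysicalAttribute",
    "ticker": "PhysicalAttribute",
    "coupon": "PhysicalAttribute",
}
edges = [
    ("EQUITY", "OWNS", "isin"),
    ("EQUITY", "OWNS", "ticker"),
    ("BOND", "OWNS", "isin"),
    ("BOND", "OWNS", "coupon"),
]


def datasets_and_attributes(labels, edges):
    """Mirror (d:Dataset)-[:OWNS]->(a:PhysicalAttribute), returning (d, a)."""
    return [
        (source, target)
        for source, rel_type, target in edges
        if rel_type == "OWNS"
        and labels.get(source) == "Dataset"
        and labels.get(target) == "PhysicalAttribute"
    ]


pairs = datasets_and_attributes(labels, edges)
```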
27
Answers to the questions …
How are the different datasets related?
Cypher query:
MATCH (d1:Dataset)<-[:JOINS]-(j:JoinRelation),
      (d2:Dataset)<-[:JOINS]-(j)
RETURN d1, j, d2;
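The same pattern can be sketched in Python over an assumed edge list. Node names here are hypothetical. Each related pair is emitted in both orders, since a pattern of this shape can bind `d1` and `d2` either way round.

```python
# Hypothetical sketch: find datasets that share a JoinRelation node.
# Node names are invented; labels and JOINS come from the slide's query.
from itertools import permutations

labels = {
    "EQUITY": "Dataset",
    "EQUITYRATING": "Dataset",
    "BOND": "Dataset",
    "j1": "JoinRelation",
}
edges = [
    ("j1", "JOINS", "EQUITY"),
    ("j1", "JOINS", "EQUITYRATING"),
]


def related_datasets(labels, edges):
    """Return (d1, j, d2) rows for every ordered pair joined through j."""
    joined = {}
    for source, rel_type, target in edges:
        if (
            rel_type == "JOINS"
            and labels.get(source) == "JoinRelation"
            and labels.get(target) == "Dataset"
        ):
            joined.setdefault(source, []).append(target)
    rows = []
    for join, datasets in joined.items():
        # Ordered pairs: (EQUITY, EQUITYRATING) and (EQUITYRATING, EQUITY).
        for d1, d2 in permutations(datasets, 2):
            rows.append((d1, join, d2))
    return rows


rows = related_datasets(labels, edges)
```

BOND has no JoinRelation in this toy graph, so it does not appear in the result.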
28
Answers to the questions …
How are users receiving our data?
Cypher query:
MATCH (c:Consumer)-[:RECEIVES_VIA|INTERESTED_IN]->(v)
RETURN c, v;
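As a rough Python analogue, the query follows either relationship type from each consumer and collects what it reaches. The consumer, channel, and view names below are hypothetical, as is the `Channel` label; the deck leaves the target node unlabelled.

```python
# Hypothetical sketch: group what each consumer reaches via RECEIVES_VIA or
# INTERESTED_IN. Node names and the "Channel" label are assumptions.
from collections import defaultdict

labels = {
    "risk_system": "Consumer",
    "client_portal": "Consumer",
    "message_bus": "Channel",
    "file_drop": "Channel",
    "INSTRUMENT_VIEW": "Dataset",
}
edges = [
    ("risk_system", "RECEIVES_VIA", "message_bus"),
    ("risk_system", "INTERESTED_IN", "INSTRUMENT_VIEW"),
    ("client_portal", "RECEIVES_VIA", "file_drop"),
    ("INSTRUMENT_VIEW", "SELECTS", "INSTRUMENT"),
]


def consumer_links(labels, edges, rel_types=("RECEIVES_VIA", "INTERESTED_IN")):
    """Return {consumer: {relationship type: [targets]}}."""
    links = defaultdict(lambda: defaultdict(list))
    for source, rel_type, target in edges:
        if labels.get(source) == "Consumer" and rel_type in rel_types:
            links[source][rel_type].append(target)
    return {consumer: dict(by_type) for consumer, by_type in links.items()}


links_by_consumer = consumer_links(labels, edges)
```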
29
Answers to the questions …
Which consumers are using which attributes?
Cypher query:
MATCH (c:Consumer)-[:INTERESTED_IN]->(view:Dataset),
      (view)-[:SELECTS]->(output:Dataset),
      (output)<-[:TARGET_OF]-(aggregation:Transformer)-[:SOURCE_OF]->(aggregate:Dataset),
      (aggregate)-[:OWNS]->(parts:Dataset),
      (parts)-[:OWNS]->(a:PhysicalAttribute)
RETURN c, view, output, aggregation, aggregate, parts, a
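The multi-hop pattern above reads as a fixed sequence of relationship steps from consumer to attribute. The Python sketch below walks the same sequence over an assumed edge list. Node names are invented and label checks are left out for brevity, so it is an illustration of the traversal rather than an equivalent of the query.

```python
# Hypothetical sketch: walk the slide's relationship chain from each consumer
# to the physical attributes it ends up using. Node names are invented.
PATTERN = [
    ("INTERESTED_IN", "out"),  # consumer -> view
    ("SELECTS", "out"),        # view -> output
    ("TARGET_OF", "in"),       # output <- aggregation
    ("SOURCE_OF", "out"),      # aggregation -> aggregate
    ("OWNS", "out"),           # aggregate -> parts
    ("OWNS", "out"),           # parts -> attribute
]

edges = [
    ("risk_system", "INTERESTED_IN", "RISK_VIEW"),
    ("RISK_VIEW", "SELECTS", "INSTRUMENT_OUT"),
    ("aggregate_instruments", "TARGET_OF", "INSTRUMENT_OUT"),
    ("aggregate_instruments", "SOURCE_OF", "INSTRUMENT"),
    ("INSTRUMENT", "OWNS", "EQUITY"),
    ("INSTRUMENT", "OWNS", "BOND"),
    ("EQUITY", "OWNS", "isin"),
    ("EQUITY", "OWNS", "ticker"),
    ("BOND", "OWNS", "isin"),
    ("BOND", "OWNS", "coupon"),
]


def step(node, rel_type, direction, edges):
    """Follow one relationship type from node, forwards or backwards."""
    if direction == "out":
        return [t for s, r, t in edges if s == node and r == rel_type]
    return [s for s, r, t in edges if t == node and r == rel_type]


def attributes_by_consumer(consumers, edges):
    """Return {consumer: sorted attribute names reached via PATTERN}."""
    usage = {}
    for consumer in consumers:
        frontier = [consumer]
        for rel_type, direction in PATTERN:
            frontier = [
                nxt
                for node in frontier
                for nxt in step(node, rel_type, direction, edges)
            ]
        usage[consumer] = sorted(set(frontier))
    return usage


usage = attributes_by_consumer(["risk_system", "idle_consumer"], edges)
```

A consumer with no `INTERESTED_IN` relationship, such as the invented `idle_consumer`, ends up with an empty attribute list.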
30
• Single source of truth
• Governance and lineage easier
• New insights for consumers
• New insights for producers!
Knowledge Base – Benefits
31
• Coverage – not all datasets entered yet
• Lots of data – we store source, interim, target datasets
• Concept can be a bit intangible at times
Knowledge Base – Challenges
32
• Data Distribution is a natural "flow" from one processing node to another
• Ad-hoc relationship traversal is difficult in relational databases
• Flexibility essential
– New sources, datasets, consumers, rules, …
• Everything is an instance
– The model grows organically by focusing on the relationships between processing nodes rather than on structure
How did a graph database help?
33
• Answers our questions … and more
• Flexible schema → can model different flows
• Easy(-ish) query language → Cypher
• Easy to create platform service layer
• Good performance
• Good support from vendor
Neo4j – Benefits
34
• Loading data required manual work
• No out-of-the-box tools to manage the data
• Skills rare … but easy to grow
Neo4j – Challenges
35
• Focus on human interactions
– Better search
– Better visualisation
• Widen coverage of datasets
• Offer to other parts of Bank
• Impact Analysis tools
• Self-service data integration
Next steps
36
Thank you!

Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 

Recently uploaded (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 

Creating a Data Distribution Knowledge Base using Neo4j, UBS

  • 1. Creating a Data Distribution Knowledge Base using Neo4j: Using graph technologies to map and manage data flows within the Bank. Syed Haniff, 11th May 2017. Public.
  • 2. Overview: Reference data at UBS; Building an integrated data distribution platform; Creating a Knowledge Base using Neo4j.
  • 3. About UBS: Founded 1854; Headquarters: Zurich, Switzerland; Operates in 50+ countries; Around 60,000 employees; 6 Businesses – Wealth Management, Wealth Management Americas, Personal & Corporate Banking, Asset Management, Investment Bank, Corporate Centre.
  • 4. About Group Data Services: GDS manages the mastering and distribution of reference data to consumers within the Bank.
  • 5. Reference Data at UBS: Externally and internally sourced non-transactional data: Account, Book, Calendar, Client, Confirms, Financial Instrument, Legal Entity, Group Dictionary, Prices, Product, Trading Agreement, Settlement Instruction.
  • 6. Group Data Services in Numbers: 12 data domains; 18 datasets; 7 distribution channels; 400+ integrations; thousands of attributes.
  • 7. Reference Data Distribution: Providing timely, accurate, and complete reference data to users, systems, and processes through a number of channels.
  • 8. Data Distribution – Previously. Features: Masters send normalized, canonical datasets; Consumers land and join datasets themselves; Good for producers (master data sources) … not so good for consumers.
  • 9. Example – Consumer joins: Consumers store multiple messages from multiple domains and resolve joins themselves.
  • 10. Business Drivers for Change (Driver | Situation | Impact): Simplification | Multiple components doing the same or similar tasks | Cost+, Complexity+; Risk Reduction | Consumers have to store and join reference data | Data staleness+, Potential for errors+; Efficiency | Consumers have to receive updates they are not interested in | Storage volumes+, Processing volumes+.
  • 11. Distribution Platform – Blueprint. Features: Single platform consuming data from masters; Platform integrates datasets; Custom or normalized datasets sent via standardized channels.
  • 12. Example – Platform joins: Data joined at source and available for multiple consumers, which simplifies consumption.
  • 13. Platform Benefits: Single platform; Pre-joined datasets; Flexible subscription to attributes; More consumer-oriented … But there are still things we'd like to know …
  • 14. Data Distribution – Questions: What datasets and attributes do we provide?
  • 15. Data Distribution – Questions: What datasets and attributes do we provide? How are the different datasets related?
  • 16. Data Distribution – Questions: What datasets and attributes do we provide? How are the different datasets related? How are users receiving our data?
  • 17. Data Distribution – Questions: What datasets and attributes do we provide? How are the different datasets related? How are users receiving our data? Which consumers are using which attributes?
  • 18. Data Distribution – Questions: What datasets and attributes do we provide? How are the different datasets related? How are users receiving our data? Which consumers are using which attributes? Knowledge Base.
  • 19. What is the Knowledge Base? A system component that lets us describe the journey of the datasets and attributes from master systems to consumers.
  • 20. Building the Knowledge Base – Example Model
  • 21. Physical Model – 1.0: Initially, platform (not human) requirements; XLS + custom DSL (Domain Specific Language); e.g. composite INSTRUMENT dataset – BOND_BONDRATING, EQUITY_EQUITYRATING, where "," is a union between two datasets and "_" is a join between two datasets. Innovative and allowed us to build the platform, but limited, complex, inflexible.
  • 22. Can it answer our questions …?
  • 23. Not really or easily anyway: Challenging to make a relational model that answers all the (diverse) questions; Lots of different entities …; Lots of different relationships …; Not all data flows are the same …; Tough to get the performance needed with a generic relational model …
  • 24. The "Eureka!" moment … Looks like a graph … maybe we should store as a graph(!)
  • 25. Physical Model – 2.0: Store the metamodel in a graph database; Neo4j – used in the Bank, mature, comprehensive resources online, drivers / adapters matching language choices.
  • 27. Answers to the questions … What datasets and attributes do we provide? Cypher query: MATCH (d:Dataset)-[:OWNS]->(a:PhysicalAttribute) RETURN d, a;
  • 28. Answers to the questions … How are the different datasets related? Cypher query: MATCH (d1:Dataset)<-[:JOINS]-(j:JoinRelation), (d2:Dataset)<-[:JOINS]-(j) RETURN d1, j, d2;
  • 29. Answers to the questions … How are users receiving our data? Cypher query: MATCH (c:Consumer)-[:RECEIVES_VIA|:INTERESTED_IN]->(v) RETURN c, v
  • 30. Answers to the questions … Which consumers are using which attributes? Cypher query: MATCH (c:Consumer)-[:INTERESTED_IN]->(view:Dataset), (view)-[:SELECTS]->(output:Dataset), (output)<-[:TARGET_OF]-(aggregation:Transformer)-[:SOURCE_OF]->(aggregate:Dataset), (aggregate)-[:OWNS]->(parts:Dataset), (parts)-[:OWNS]->(a:PhysicalAttribute) RETURN c, view, output, aggregation, aggregate, parts, a
  • 31. Knowledge Base – Benefits: Single source of truth; Governance and lineage easier; New insights for consumers; New insights for producers!
  • 32. Knowledge Base – Challenges: Coverage – not all datasets entered yet; Lots of data – we store source, interim, target datasets; Concept can be a bit intangible at times.
  • 33. How did a graph database help? Data Distribution is a natural "flow" from one processing node to another; Ad-hoc relationship traversal difficult in relational databases; Flexibility essential – new sources, datasets, consumers, rules, …; Everything is an instance – model very organic by focusing on relationship between processing nodes rather than structure.
  • 34. Neo4j – Benefits: Answers our questions … and more; Flexible schema; Can model different flows; Easy(-ish) query language – Cypher; Easy to create platform service layer; Good performance; Good support from vendor.
  • 35. Neo4j – Challenges: Loading data required manual work; No out-of-the-box tools to manage the data; Skills rare … but easy to grow.
  • 36. Next steps: Focus on human interactions – better search, better visualisation; Widen coverage of datasets; Offer to other parts of Bank; Impact Analysis tools; Self-service data integration.
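
Slide 21 describes the 1.0 model's custom DSL only by example: a composite such as BOND_BONDRATING, EQUITY_EQUITYRATING, with "," meaning a union and "_" meaning a join. The Python sketch below shows one way such an expression might be read. It assumes "_" binds tighter than "," and that dataset names contain no underscores. The slides state neither, and `parse_composite` is a hypothetical helper, not part of the UBS platform.

```python
# Hypothetical reading of the slide-21 DSL: "," = union, "_" = join.
# Assumption: "_" binds tighter than ",", and names contain no "_".

def parse_composite(expression: str) -> list[list[str]]:
    """Return the union branches; each branch is a list of datasets to join."""
    branches = []
    for branch in expression.split(","):
        branch = branch.strip()
        if not branch:
            continue  # tolerate a stray or trailing comma
        branches.append([name for name in branch.split("_") if name])
    return branches


instrument = parse_composite("BOND_BONDRATING, EQUITY_EQUITYRATING")
print(instrument)  # [['BOND', 'BONDRATING'], ['EQUITY', 'EQUITYRATING']]
```

Under these assumptions a dataset name that itself contains an underscore would be split incorrectly. That is one plausible illustration of why a string-encoded model can become limiting, though the slides do not say this was the actual problem.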
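
The Cypher query on slide 30 walks from a consumer to the physical attributes it ultimately uses. As a rough illustration of that traversal, the sketch below replays the same hops over a tiny in-memory graph in plain Python. The labels and relationship types come from the slides. Every node name (TradingApp, InstrumentView and so on) is invented for the example, and the set-based `follow` helper is an assumption that returns only the final attributes rather than the full paths the Cypher query returns.

```python
from collections import defaultdict

# Node labels and relationship types follow the slides; node names are invented.
LABELS = {
    "TradingApp": "Consumer",
    "InstrumentView": "Dataset",
    "InstrumentOut": "Dataset",
    "Aggregator": "Transformer",
    "Instrument": "Dataset",
    "Bond": "Dataset",
    "isin": "PhysicalAttribute",
    "coupon": "PhysicalAttribute",
}
EDGES = [  # (source, relationship type, target)
    ("TradingApp", "INTERESTED_IN", "InstrumentView"),
    ("InstrumentView", "SELECTS", "InstrumentOut"),
    ("Aggregator", "TARGET_OF", "InstrumentOut"),
    ("Aggregator", "SOURCE_OF", "Instrument"),
    ("Instrument", "OWNS", "Bond"),
    ("Bond", "OWNS", "isin"),
    ("Bond", "OWNS", "coupon"),
]

# Index edges in both directions so a hop can be followed either way.
out_edges = defaultdict(list)
in_edges = defaultdict(list)
for src, rel, dst in EDGES:
    out_edges[(src, rel)].append(dst)
    in_edges[(dst, rel)].append(src)


def follow(nodes, rel, label, reverse=False):
    """One hop along `rel` from every node in `nodes`, keeping nodes with `label`."""
    index = in_edges if reverse else out_edges
    return {m for n in nodes for m in index.get((n, rel), []) if LABELS[m] == label}


def attributes_used_by(consumer):
    """Replay the slide-30 pattern hop by hop and return the attribute names."""
    views = follow({consumer}, "INTERESTED_IN", "Dataset")
    outputs = follow(views, "SELECTS", "Dataset")
    transformers = follow(outputs, "TARGET_OF", "Transformer", reverse=True)
    aggregates = follow(transformers, "SOURCE_OF", "Dataset")
    parts = follow(aggregates, "OWNS", "Dataset")
    return sorted(follow(parts, "OWNS", "PhysicalAttribute"))


print(attributes_used_by("TradingApp"))  # ['coupon', 'isin']
```

Each hop here is written out by hand. In the graph database the same chain is a single declarative pattern, which is the convenience the slides credit to Cypher.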