Metadata is everywhere, yet traditional approaches to managing it have been disparate, siloed and often ineffective.
In this talk James will discuss the opportunities for using graph technology to address the fundamental challenges and questions of metadata management such as impact analysis, data lineage and definitions.
Data to Value are a data consultancy based in London that specialise in applying lean and agile techniques to complex data requirements. Connected Data is a particular focus for the firm, which they see as the new frontier for data leaders.
James Phare has over 15 years’ experience of creating and leading data teams in various roles in Financial Services. Prior to co-founding the data consultancy Data to Value, he was Head of Information Management and Data Architecture at Man Group – one of the world’s largest hedge funds. James started his career at Thomson Reuters after graduating in Economics from the University of York.
2. We help organisations get more value from their Data
Architecture
Lean Data specialists
Service delivery through:
Systems Integration
Onsite Consulting
Onsite / Offsite Managed Services
Data Strategy consulting
Regulatory, compliance & Financial Crime
Investment Management, Operations & Research
Risk Management
KYC, SCV
6. Who’s worked on a metadata-centric project?
Data Governance
Enterprise Architecture
Master Data Management
Enterprise Data Modelling
Enterprise Data Definition
Enterprise Data Warehousing
Enterprise Data Quality
Enterprise Data Integration
Legal / Regulatory e.g. GDPR, BCBS 239
Enterprise Knowledge Management
8. Why are we so bad at Enterprise Metadata Management?
9. Most organisations’ attempts at managing metadata have failed. Why?
Past failures
Scope & definitions
Data people don’t play well with other data people
Business case
Approach & tooling
11. Where do we typically bury store metadata?
Data & IT Governance tools
Enterprise Architecture
Business Process modelling
Data Modelling
Log storage
CMDB
Policy & Standards documents
12. How do we typically try to integrate & report on metadata?
13. The ‘so what’ questions of metadata
Tell me which Data Elements are most critical
Tell me where this value originated & where it goes
Help me understand & enforce business & technical rules
Tell me to which level standards & policies are adhered to and help me
Provide me with rich & interactive visualisations rather than long policies that sit on shared drives…
Help me understand the context & meaning of my data
Tell me which people, processes & IT components are impacted by an IT event
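Two of these questions (where a value originated and where it goes, and which data elements are most critical) can be read as graph traversals. The following is a minimal in-memory sketch, not the speaker's implementation: the element names and the `FLOWS` edge list are hypothetical, and "criticality" is assumed here to mean simply the number of downstream elements that depend on an element.

```python
# Hypothetical lineage edges: (source, target) means data flows source -> target.
FLOWS = [
    ("crm.customer.address", "staging.customer.address"),
    ("staging.customer.address", "warehouse.dim_customer.address"),
    ("warehouse.dim_customer.address", "report.mailing_list"),
    ("warehouse.dim_customer.address", "report.risk_summary"),
]


def reachable(start, edges):
    """Return every element reachable from start by following edges forwards."""
    found, stack = set(), [start]
    while stack:
        node = stack.pop()
        for src, dst in edges:
            if src == node and dst not in found:
                found.add(dst)
                stack.append(dst)
    return found


def downstream(element):
    """Where does this value go?"""
    return reachable(element, FLOWS)


def upstream(element):
    """Where did this value originate? Follows the same edges in reverse."""
    return reachable(element, [(dst, src) for src, dst in FLOWS])


def criticality():
    """Assumed measure: how many downstream elements depend on each element."""
    elements = {e for flow in FLOWS for e in flow}
    return {e: len(downstream(e)) for e in elements}
```

Under this assumed measure, the element with the most dependents can be picked out with `max(criticality(), key=criticality().get)`. A graph database would express the same idea as variable-length path queries rather than Python loops.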
15. There’s lots of exciting (& scary) stuff happening in the world of metadata right now
Forward Engineering / Metadata OLTP
Reverse Engineering / Metadata OLAP
16. The era of cheap storage & Data Lakes
Why don’t we just retain EVERYTHING to be on the safe side?
18. Why don’t we treat & manage metadata like ‘real’ data!?
19. The ‘so what’ questions of metadata
20. Which downstream databases & processes are affected by this data event / defect?
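// Start from the 'Address Check' data quality test, walk to the columns it
// covers, up to their table and database, then out 1 to 3 hops.
MATCH (n:DQTest)-[*]->(c:Column)--(t:Table)--(d:Database)-[*1..3]-(p)
WHERE n.name = 'Address Check'
  AND (p:Database OR p:Process)
RETURN DISTINCT p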
[Diagram: graph of DQTest, Column, Table, Database and Process nodes]
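To make the traversal behind this query concrete, here is a rough in-memory approximation in Python. It is only a sketch under stated assumptions: the node names, `LABELS` and `EDGES` are hypothetical sample metadata, every relationship is treated as undirected, the DQ test is assumed to link directly to its columns, and a breadth-first search by hop count stands in for Cypher's variable-length path matching, which it does not replicate exactly.

```python
from collections import deque

# Hypothetical sample metadata: node name -> label.
LABELS = {
    "Address Check": "DQTest",
    "customer.address": "Column",
    "customer": "Table",
    "CRM": "Database",
    "Nightly Load": "Process",
    "Warehouse": "Database",
    "Billing Run": "Process",
    "Archive": "Database",
}

# Hypothetical relationships, treated as undirected for this sketch.
EDGES = [
    ("Address Check", "customer.address"),
    ("customer.address", "customer"),
    ("customer", "CRM"),
    ("CRM", "Nightly Load"),
    ("Nightly Load", "Warehouse"),
    ("Warehouse", "Billing Run"),
    ("Billing Run", "Archive"),
]


def neighbours(node, label=None):
    """Nodes adjacent to node, optionally filtered by label."""
    adjacent = [b for a, b in EDGES if a == node] + [a for a, b in EDGES if b == node]
    return [n for n in adjacent if label is None or LABELS[n] == label]


def impacted(test_name, max_hops=3):
    """Databases and processes within max_hops of the databases a DQ test touches."""
    # Step 1: DQTest -> Column -> Table -> Database.
    databases = {
        db
        for col in neighbours(test_name, "Column")
        for tab in neighbours(col, "Table")
        for db in neighbours(tab, "Database")
    }
    # Step 2: breadth-first search outwards from each database, up to max_hops.
    found = set()
    for db in databases:
        seen, queue = {db}, deque([(db, 0)])
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue
            for nxt in neighbours(node):
                if nxt not in seen:
                    seen.add(nxt)
                    if LABELS[nxt] in ("Database", "Process"):
                        found.add(nxt)
                    queue.append((nxt, hops + 1))
    return found
```

With this sample data, `impacted("Address Check")` reaches the processes and database within three hops of CRM and leaves out Archive, which sits four hops away.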
21. Neo4j for metadata OLTP & OLAP requirements
Architecture
Forward Engineering / OLTP
Schemaless Graph model offers flexibility as metadata requirements evolve
Suitable for complex business rules & data structures – hierarchies, taxonomies etc.
Suitable for real-time metadata requirements – alerting, schema validation, real-time MDM / ETL etc.
Highly scalable
Reverse Engineering / OLAP
Flexible data model makes defining constraints simple
Cypher – very simple & intuitive
Can apply empirical techniques to traditionally contentious issues, e.g. definitions
Community support & online content is great
22. A Neo based metadata lake
Metadata Scientist
Architects, Modellers & BAs
Reports & self-service visualisations
Harvesting Metadata
‘Data’ & ‘IT’ teams
Enrich
Analyse & build apps
Analyse
Ingest
DDL
Enhance
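The "Ingest DDL" step in this picture could look something like the sketch below, which turns CREATE TABLE statements into Table and Column nodes ready to load into a graph. This is an illustration only: the sample DDL, the `harvest` function and the `HAS_COLUMN` relationship name are all hypothetical, and the regex-based parsing assumes simple DDL with no commas inside type definitions (a type such as DECIMAL(10,2) would need a real SQL parser).

```python
import re

# Hypothetical DDL; a real harvester would read this from the source systems.
DDL = """
CREATE TABLE customer (
    id INTEGER PRIMARY KEY,
    name VARCHAR(100),
    address VARCHAR(255)
);
CREATE TABLE account (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER
);
"""

# Captures the table name and the text between the outer parentheses.
TABLE_PATTERN = re.compile(
    r"CREATE\s+TABLE\s+(\w+)\s*\((.*?)\)\s*;", re.IGNORECASE | re.DOTALL
)


def harvest(ddl):
    """Return (nodes, edges) describing the tables and columns found in ddl."""
    nodes, edges = [], []
    for table, body in TABLE_PATTERN.findall(ddl):
        nodes.append(("Table", table, {}))
        for definition in body.split(","):
            parts = definition.split()
            if len(parts) < 2:
                continue  # skip blank or malformed column definitions
            column = f"{table}.{parts[0]}"
            nodes.append(("Column", column, {"type": parts[1]}))
            edges.append((table, "HAS_COLUMN", column))
    return nodes, edges


nodes, edges = harvest(DDL)
```

The resulting node and edge tuples could then be written to the graph, after which the Enrich and Analyse steps would add further relationships on top of the harvested structure.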
23. Demo – interesting Open Data demo.
Got an idea? Speak to us
Rapid POCs – often in weeks
Connected Data – July 12th in Mayfair
Questions
http://connected-data.london/