Watch this webinar and learn how Neo4j and ICC Technology can help you remove risk from your data governance by improving the way you approach data lineage. We’ll cover some of the common approaches, the driving regulations, and the biggest risks for banks and financial services.
-Find out how Data Lineage is becoming more complex for Banks and Financial Services companies
-Learn how a native-graph model can improve tracing data sources to targets as well as store transformations.
-Watch a demonstration on how you might approach regulations such as BCBS 239
4. Protect Your Enterprise, Illustrate “Trustworthy”
You must show good data management practices, policies, processes (systems) and awareness
What data do you have?
• Data Asset Inventory
Why do you have this data?
• Trace data to its usage, cleanse the data
Where is this data?
• Which system(s), physical location of data, data movement
How did you get this data?
• Traceability and irrefutable proof of data source
When did you get this data?
• Timestamped data acquisition, access, transfer
Who has access to this data?
• People (training), processes & systems
Is the data secure?
• Robust data management lifecycle and security practices
Do you maintain a map of this data?
• Is all of this metadata available in a connected fashion?
Acquire/Use/Transfer with permission
Respect “right to be forgotten” / “right to modification”
Plan for Data Breaches
5. Great! So what do we really need to do?
• Implement a data governance platform
• Data definition via business glossary mapped to implementation detail
• Tracking create / update / access / deletes of data
• Tying relevant processes that operate on regulated data
• Building reverse lineage capability to map the data flow
• Define data lifecycle management process and policies
• Implement a visual dashboard of KPIs
• Provide a portal and programmatic interface for individuals to access/update their data, provide/revoke consent, transfer data & view rights
• Create a regulatory governance steering group
6. Today’s Presenters
Senior Director – Global Solutions (Neo4j, Inc.)
• Responsible for solutions development to enhance the core value of the Neo4j connected data platform
• 20+ years of experience in Solutions Development & Sales with leading consulting companies
Senior Data Scientist
• Consultant generating business insights and solutions through analytics
• Neo4j product lead at ICC
• Former Neuroscience professor and researcher
Data Governance Analyst
• Consultant generating data governance and data lineage solutions
• Consensus product lead at ICC
Banking Principal
• CCAR & Regulatory Reporting Solutions Specialist
• Consultant for Banking Data Analysis (Finance & Risk)
• Process Automation Engineering
Presenters: Nav Mathur, Kelsey Bieri, Jonathan Renner, Lee Hong
7. Who is ICC?
A User Centered, Data Driven, Technology Development Company
• Scale: 570+ Consultants
• 35+ years Tested Technology Consulting Platform
• End-to-End Technology Development
• Research and Design Discipline
• Comprehensive Analytics Practice
• Vertical Subject Matter Expertise and Accelerators
• Neo4j Solutions Partner
Information, data, and graphics/drawings embodied in this document are strictly confidential and are supplied on the understanding that they will be held confidential and not disclosed to third parties without prior written consent of ICC.
8. Agenda for Today’s Webinar
• Business Problem – Regulatory Compliance in Financial Services
• Specific focus – BCBS 239
• Comprehensive data lineage tracking as a cornerstone solution
• Data Lineage in Practice
• Pain points and hurdles of existing solutions
• Costs and inefficiencies
• Innovative Data Lineage with Neo4j
• Data modeling
• Getting Return on Investment
10. Business Challenges with Regulatory Compliance in Banking
• Compliance failure is a larger issue than merely failing to meet a rule
• Without the ability to understand exposures to risk (e.g., credit derivatives), the inability to make timely decisions about a firm’s aggregate exposures can be catastrophic
• Avoiding MRAs = ROI improvement (allowing banks to perform desirable capital actions such as M&A)
• Cost – estimated to be upwards of $100 billion in 2016 across all banks in the U.S. for regulatory compliance in general
• Additional regulatory requirements already include:
• Dodd-Frank: ~20,000 pages of regulation comprising granular financial activity information requirements by the FRB
• Comprehensive Capital Analysis and Review (CCAR): top 34 financial institutions in the U.S. (those with assets >$50 billion)
• Basel III: common reference data to drive operational, market, credit, and liquidity risk (data quality is a significant challenge)
11. Specific Use Case: BCBS 239
• What is BCBS 239?
• The Principles for effective risk data aggregation and risk reporting (G-SIBs & D-SIBs)
• Highly data focused
• Tieback/traceability via data lineage is critical
• What is the cost of violating the rules?
• January 2016 was the deadline for G-SIBs; adoption has been slow
• Value add to banks is likely to be significant
• Banks stand to improve the bottom line from a variety of sources:
• Increased revenue from improved analytics (better data composite)
• Capital management lift from reduced RWA buffers
• Operational cost optimization through elimination of redundancy
• IT cost reduction through streamlining of data assets and tools
14. Ideally, Data Lineage Should Be Completely Hierarchical
Easily represented as a tree with a root and branches
[Diagram: tree with levels Business Area, Entity, Attribute, Column, Table, Database]
15. Ideally, Data Lineage Should Be Completely Hierarchical
Easily represented as a tree with a root and branches
[Diagram: tree with levels Domain, Report, Attribute, Column, Table, Database; example labels Account, Report, Loan Number, loan_number, Loan, Mortgage Data Hub]
16. In Reality, Not So Much…
[Diagram labels: Loan Number, Mortgage Data Hub, Loan, loan_number, Report, Account]
17. Pain Points in Data Lineage
Lower levels map to multiple higher levels and vice versa; tree traversal becomes impossible
Importing legacy ETL left behind – compatibility issues across platforms
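The first pain point can be checked mechanically. Below is a minimal Python sketch, assuming lineage mappings are available as (child, parent) pairs. The function name and the sample pairs are hypothetical and only reuse labels from the earlier example slide.

```python
from collections import defaultdict


def children_with_multiple_parents(pairs):
    """Return {child: sorted parents} for every child mapped to more than one parent."""
    parents = defaultdict(set)
    for child, parent in pairs:
        parents[child].add(parent)
    return {child: sorted(found) for child, found in parents.items() if len(found) > 1}


# Hypothetical mappings: one column rolls up to two higher-level objects.
mappings = [
    ("loan_number", "Loan"),
    ("loan_number", "Account"),
    ("Loan", "Mortgage Data Hub"),
]

violations = children_with_multiple_parents(mappings)
# A strict tree allows at most one parent per node.
is_strict_tree = not violations
```

Any non-empty result means the structure is no longer a tree, so root-to-leaf traversal alone cannot answer lineage questions.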
18. Pain Points in Data Lineage
Columns can be stored across multiple tables
Single column can become many columns and vice versa
19. Pain Points in Data Lineage
Transformation logic:
Difficult to store
Difficult to model in traditional RDBMS
Don’t know direction!
[Diagram: Source ? Target]
21. Pain Points in Data Lineage
Transformation logic:
Difficult to store
Difficult to model in traditional RDBMS
Don’t know direction!
[Diagram: Source, Target, Transformation]
22. Pain Points in Data Lineage
[Diagram: Source, Target, Transformation, Target, Source]
24. Pain Points in Data Lineage
Expensive
Time Consuming
Manual Labor to Map Relationships
False and usually circular hierarchies
25. Data Lineage is a Network Graph
Many-to-many relationships that need to be accounted for
Relationships between columns, tables, databases
Easily trace source to target and store transformations as properties on relationships
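One way to picture the network-graph model is a directed graph whose relationships carry the transformation as a property. The following is a minimal in-memory Python sketch, not Neo4j code. The class, its method names, the qualified column names and the transformation strings are all hypothetical.

```python
class LineageGraph:
    """Directed graph of data flows; each relationship runs source -> target."""

    def __init__(self):
        # target -> list of (source, relationship properties)
        self._incoming = {}

    def add_flow(self, source, target, **properties):
        """Record that `target` is derived from `source`, with properties on the relationship."""
        self._incoming.setdefault(target, []).append((source, properties))

    def trace_to_sources(self, target):
        """Return all (source, target, properties) relationships upstream of `target`."""
        found, seen, stack = [], set(), [target]
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # also guards against circular lineage
            seen.add(node)
            for source, properties in self._incoming.get(node, []):
                found.append((source, node, properties))
                stack.append(source)
        return found


graph = LineageGraph()
# Many-to-one: a report column fed by two hub columns (hypothetical names).
graph.add_flow("staging.loan.loan_number", "hub.loan.loan_number",
               transformation="TRIM(loan_number)")
graph.add_flow("hub.loan.loan_number", "report.account.loan_key",
               transformation="CONCAT(branch_code, loan_number)")
graph.add_flow("hub.branch.branch_code", "report.account.loan_key",
               transformation="CONCAT(branch_code, loan_number)")

lineage = graph.trace_to_sources("report.account.loan_key")
upstream = {source for source, _, _ in lineage}
```

Because the transformation lives on the relationship rather than on either column, the same column can take part in any number of flows without duplicating its node.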
46. Mapping and Tracking Data Type Changes
Data type changes are the norm in financial services
47. Mapping and Tracking Data Type Changes
NUMERIC and DECIMAL types are functionally equivalent
48. Mapping and Tracking Data Type Changes
Scale and precision transforms used for reporting purposes
Source: Precision = 5, Scale = 3
Target: Precision = 3, Scale = 2
49. Mapping and Tracking Data Type Changes
Changes in scale or precision could be a problem if untracked
Source: Precision = 5, Scale = 3 (99.999)
Target: Precision = 3, Scale = 2 (Error)
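The 99.999 example can be reproduced with a small check. This is a hedged Python sketch that assumes the usual SQL meaning of the terms: precision is the total number of digits and scale is the number of digits after the decimal point. The function name `fits` is hypothetical, and the sketch treats any lost digit as a failure, whereas a real database may round excess fractional digits instead of raising an error.

```python
from decimal import Decimal


def fits(value, precision, scale):
    """True if `value` fits DECIMAL(precision, scale) with no digits lost."""
    number = Decimal(value)
    if not number.is_finite():
        return False
    if number == 0:
        return True
    _, digits, exponent = number.normalize().as_tuple()
    fractional_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    return fractional_digits <= scale and integer_digits <= precision - scale


# Values are passed as strings to avoid binary floating-point noise.
ok_at_source = fits("99.999", precision=5, scale=3)
ok_at_target = fits("99.999", precision=3, scale=2)
```

Storing precision and scale on both ends of a lineage relationship makes it possible to run a check like this for every source-to-target hop.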
58. Additional Applications – Impact Analysis
Find all nodes connected to a technology or platform
[Diagram node types: Business Initiative, Column, Database, Subject Area, Canonical Attribute, Table]
59. Additional Applications – Impact Analysis
Quickly determine technological dependencies
[Diagram node types: Business Initiative, Column, Database, Subject Area, Canonical Attribute, Table]
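Impact analysis of this kind amounts to a reachability search from the platform node. The Python sketch below uses a plain breadth-first search over an undirected edge list. The function name, the "Type: name" node labels and the sample edges are hypothetical; only the node types come from the slide.

```python
from collections import defaultdict, deque


def connected_nodes(edges, start):
    """Return every node reachable from `start`, ignoring relationship direction."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


# Hypothetical metadata graph with one unrelated database.
edges = [
    ("Database: Mortgage Data Hub", "Table: Loan"),
    ("Table: Loan", "Column: loan_number"),
    ("Column: loan_number", "Canonical Attribute: Loan Number"),
    ("Canonical Attribute: Loan Number", "Business Initiative: Risk Reporting"),
    ("Database: Card Ledger", "Table: Transactions"),
]

impacted = connected_nodes(edges, "Database: Mortgage Data Hub")
```

Ignoring direction keeps the sketch short but can overreach when two platforms share a canonical attribute; a production query would normally restrict relationship types or hop count.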
60. Additional Applications – Fault Tolerance
[Figure axes: x = Number of Relationships Attached to Node; y = Probability of Finding a Node with k Relationships]
J.-P. Onnela et al. PNAS 2007;104:7332-7336. Characterizing the large-scale structure and the tie strengths of the mobile call graph.
61. Additional Applications – Fault Tolerance
Example of a fault tolerant network
Lots of nodes with few relationships
Few nodes with lots of relationships
J.-P. Onnela et al. PNAS 2007;104:7332-7336. Characterizing the large-scale structure and the tie strengths of the mobile call graph.
62. Additional Applications – Fault Tolerance
This network lets you take care of the big things…
Proportionally fewer nodes as relationship count increases
J.-P. Onnela et al. PNAS 2007;104:7332-7336. Characterizing the large-scale structure and the tie strengths of the mobile call graph.
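The curve described on these slides is the probability of finding a node with k relationships. A minimal Python sketch of how that distribution could be computed from an edge list follows. The function name and the toy star-shaped graph are hypothetical, and nodes with no relationships are not counted because they never appear in the edge list.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {k: fraction of nodes that have exactly k relationships}."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    nodes_per_degree = Counter(degree.values())
    total_nodes = len(degree)
    return {k: nodes_per_degree[k] / total_nodes for k in sorted(nodes_per_degree)}


# Toy graph: one hub connected to four otherwise unconnected nodes.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
distribution = degree_distribution(edges)
```

In this toy graph most nodes have a single relationship and one node has many, which is the shape the slides associate with knowing in advance which few nodes matter most.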
64. ROI – Efficiency, Agility, Innovation
Beyond risk mitigation
Reduced time and personnel costs
Singular, 360 view of metadata and connected components
Easy end-to-end exploration of the entire enterprise data universe
Gaining lift with graph data lineage
70. Agile Core Architecture with Neo4j
Loans
Small Business
Commercial
Personal
Mortgages
Commercial
Retail
Jumbo
Deposits
Institutional
Retail
Retirement
Trusts
Cards & Payments
Debit
Credit
ePayments
Online banking
Fraud Prevention
Risk Management
Regulatory Compliance
New Service
New Product
New Connections Required
Services
Products
71. “Why Neo4j”: What We Hear From Users
ACID Transactions
• ACID transactions with causal consistency
• Neo4j Security Foundation delivers enterprise-class security and control
Performance
• Index-free adjacency delivers millions of hops per second
• In-memory pointer chasing for fast query results
Agility
• Native property graph model
• Modify schema as business changes without disrupting existing data
Developer Productivity
• Easy to learn, declarative openCypher graph query language
• Procedural language extensions
• Open library of procedures and functions (APOC)
• Neo4j support and training
• Worldwide developer community
… all backed by Neo’s track record of leadership and product roadmap
Hardware Efficiency
• Native graph query processing and storage requires 10x less hardware
• Index-free adjacency requires 10x less CPU
74. How to get Started
- Suggested Approach
Determine applicability
• Review regulations & 14 principles
• Enterprise view of risk data
• Documented aggregation process, touchpoints & systems involved
Assess
• Evaluate regulatory requirements
• Evaluate current process and data movements to determine control points
• Assess organizational capabilities
• Understand the financial and operational consequences
Implement
• Data Governance Platform
• Data Lineage Capability
• Connected Graph of People, process, applications, systems, locations, access rights, etc.
• Track key risk KPIs
• CRO role & processes
Comply
• Publish summarized assessment and implementation report
• Provide management and board risk and impact assessment report
• Notify individuals
• Notify regulators
People, Process, Technology
75. How to get started
- Next Steps
• Register for a brown-bag graph talk with your team @ https://neo4j.com/brownbag/
• Spend 1 hr. to discuss your regulatory compliance initiative with us and validate your solution / approach. Email us, limited to first 5.
Thanks!
Nav Mathur
Sr. Director Global Solutions
nav@neo4j.com
Lee Hong
Sr. Data Scientist
lhong@icct.com
76. Summary of Services
ICC core capabilities allow us to provide tailored, end-to-end solutions around our clients’ specific needs.
Advanced Analytics
• Predictive Analytics
• Process Optimization
• Demand Forecasting
• Customer Analytics
• Supply Chain Optimization
Foundational Analytics
• Reports
• Dashboards
• Scorecards
• On-Line Analytical Processing
• Functional Solutions
Enterprise Data Management
• Data Integration
• Enterprise Data Assets
• Data Governance & Quality
• Analytic Cubes
• Master Data Management
81. Additional Applications – Fault Tolerance
Evenly distributed connections = Low fault tolerance
Same number of nodes as relationships increase
If any one node goes offline, the impact is unpredictable
Adapted from: J.-P. Onnela et al. PNAS 2007;104:7332-7336. Characterizing the large-scale structure and the tie strengths of the mobile call graph.
Editor's Notes
BCBS = Basel Committee on Banking Supervision
Outcome of weaknesses identified in risk data aggregation and reporting capabilities from the global financial crisis
14 principles covering Governance, risk data aggregation, reporting, tools & supervision
Need to bring data silos together to assess enterprise risk
Opaque counterparty risk
Risk = Geo/Location exposure + Vertical exposure + GeoPolitical event + Commodity/asset price changes + Company financials + Investor profile + Lender profile + Dealmaker/Trader profile
How many people have exercised their rights
How much new data has been added
How many new process/systems are operating on GDPR data
How often does data pass country boundaries
Purpose:
Demonstrate the range of expertise we bring to an engagement
Demonstrate the thorough approach
False or Circular Hierarchies Exist
Neo4j allows flexible data models that are easily extended to accommodate more table properties
Blend projects together
Basic Nodes
Breaking the hierarchy
Data types
Transformation rules
Where did the data come from?
Add names
Cross channel solutions