The London Financial Modelling Group meeting of 2015-04-30: Model-driven solutions to BCBS239.
3. You will learn how to:
• Show your management team how data architecture helps meet BCBS239 using the presenter's "BCBS239 Model-driven solutions checklist" tool.
• Demonstrate compliance with each of the principles by re-purposing your information architecture
• Meet the obligations more efficiently by leveraging FIBO and semantic technology
4. Deutsche Bank Libor fine: seven crazy things traders said about Libor that cost the bank billions
On 21 February 2005, a trader requested of another trader who performed submitter duties on a back-up basis:
Trader 1: "can we have a high 6mth libor today pls gezzer?"
Trader 2: "sure dude, where wld you like it mate?"
Trader 1: "think it shud be 095?"
Trader 2: "cool, was going 9, so 9.5 it is."
Trader 1: "super – don't get that level of flexibility when [the usual submitter] is in the chair fyg [for your guidance]!"
(City A.M., London, 2015-04-23)
5. Merrill Lynch hit with £13.2m fine
• The watchdog said the amount levelled at the firm "reflects the severity of MLI's misconduct, failure to adequately address the root causes over several years despite substantial FCA guidance to the industry, and a poor history of transaction reporting compliance".
• Georgina Philippou, the FCA's acting director of enforcement and market oversight, said: "Proper transaction reporting really matters. Merrill Lynch International has failed to get this right again – despite a private warning, a previous fine, and extensive FCA guidance and enforcement action in this area."
(City A.M., London, 2015-04-23)
6. Many sources
Multiple views of multiple sources is the data challenge.
[Diagram: source systems – OpenGamma (Postgres), Reference Data, COREP/FINREP (XBRL Taxonomy), spreadsheets – feeding Views 1–4]
8. Data warehouse
The warehouse duplicates and transforms the data.

Front office system:
Client ID | Name     | Client type
1         | Thatcher | Gold
2         | Blair    | Silver
3         | Cameron  | Bronze

Risk system:
Client ID | LEI | Name     | Client type | Risk Rating
1         | 987 | Thatcher | Gold        | High
2         | 654 | Blair    | Silver      | Medium
3         | 654 | Cameron  | Bronze      | Medium
          | 321 | Obama    |             | Low

Warehouse:
LEI | Name     | Risk rating
987 | Thatcher | High
654 | Blair    | Medium
321 | Obama    | Low
9. Therefore BCBS239
• Why?
• What?
o 1 Governance
o 2 Data architecture and IT infrastructure
o 3 Accuracy and integrity
o 4 Completeness
o 5 Timeliness
o 6 Adaptability
o 7 Accuracy
o 8 Comprehensiveness
o 9 Clarity and usefulness
o 10 Frequency
o 11 Distribution
o 12 Review
o 13 Remedial actions and supervisory measures
o 14 Home/host cooperation
• Who?
o Systemically important banks
o Others to follow
• When?
o Jan 2016 for SIBs
o Others later
• How?
o Periodic review by supervisor
o Supervisor-to-supervisor cooperation
http://www.bis.org/publ/bcbs239.pdf
10. Show your management team how data architecture helps meet BCBS239 using the presenter's "BCBS239 Model-driven solutions checklist" tool.
11. BCBS239 Programme management tool
• Each row is an obligation paragraph
• Columns for
o Output
• Obligation #
• Section
• Description etc.
• Evidence required
o Input
• Level of compliance
• Level of evidence
• Priority
• Owner
• Tool calculates (see the sketch after this list)
o Readiness score for each obligation
o Programme readiness score
o Averages
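A minimal sketch of the arithmetic such a checklist can perform; the field names, scales and scoring formula below are illustrative assumptions, not the tool's actual schema:

```python
# Illustrative readiness arithmetic; field names, scales and the
# formula are assumptions, not the checklist tool's actual schema.
from dataclasses import dataclass

@dataclass
class Obligation:
    number: str               # e.g. "Para 33"
    section: str
    level_of_compliance: int  # self-assessed input, 0..4
    level_of_evidence: int    # self-assessed input, 0..4
    priority: int             # 1 = highest

def readiness(ob: Obligation, max_level: int = 4) -> float:
    """Readiness per obligation: mean of compliance and evidence, as a %."""
    return 100 * (ob.level_of_compliance + ob.level_of_evidence) / (2 * max_level)

obligations = [
    Obligation("Para 33", "Data architecture", 3, 2, 1),
    Obligation("Para 34", "Data architecture", 1, 1, 2),
]

# Programme readiness as the average over all obligation scores.
programme = sum(readiness(o) for o in obligations) / len(obligations)
print(f"Programme readiness: {programme:.0f}%")   # 44%
```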
12. Breakouts
For your BCBS239 paragraphs, report back on:
• Appropriate evidence
• How to implement it in MagicDraw
13. Meet the obligations more efficiently by leveraging FIBO and semantic technology
14. The Data Point Model
The data point meta-model separates the data points and their values, and treats concept relationships as data.
[Diagram: a Data Point sits in a Context of Aspects; many Data Points reference one Value in a Value Set]
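A minimal sketch, in Python, of how that separation might be represented; the class and attribute names are assumptions, not the official DPM schema:

```python
# Illustrative DPM separation of data points from values;
# names are assumptions, not the official DPM schema.
from dataclasses import dataclass, field

@dataclass
class Aspect:
    """A dimension of meaning, e.g. 'Client' or 'DoB'."""
    name: str

@dataclass
class ValueSet:
    """The distinct values an aspect can take, stored once and shared."""
    aspect: Aspect
    values: list = field(default_factory=list)

    def intern(self, value) -> int:
        """Store each distinct value once; return its index."""
        if value not in self.values:
            self.values.append(value)
        return self.values.index(value)

@dataclass
class DataPoint:
    """A data point is a context: (aspect name -> value index) coordinates."""
    coordinates: dict

dob = ValueSet(Aspect("DoB"))
dp = DataPoint({"DoB": dob.intern("1953-05-06")})  # many points, one value
print(dp, dob.values)
```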
15. Traditional strategy
Example DPM with instance data.

Traditional table:
DoB        | Client
1925-10-13 | Thatcher
1953-05-06 | Blair
1953-05-06 | Cameron

Data Point Model:
Aspects: DoB, Client
DoB aspect values: 1925-10-13, 1953-05-06
Client aspect values: Thatcher, Blair, Cameron
Data points: !6#^, 7f%$, ^rrA, FQ!@, ~A¬0
Rules
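A short sketch of the decomposition shown on this slide: three traditional rows yield only two distinct DoB values, and each data point becomes a pair of indices into the shared value sets:

```python
# The slide's example: the DPM de-duplicates aspect values.
rows = [
    ("1925-10-13", "Thatcher"),
    ("1953-05-06", "Blair"),
    ("1953-05-06", "Cameron"),   # duplicate DoB is stored only once
]

dob_values, client_values, data_points = [], [], []
for dob, client in rows:
    if dob not in dob_values:
        dob_values.append(dob)
    if client not in client_values:
        client_values.append(client)
    # a data point is just a pair of indices into the value sets
    data_points.append((dob_values.index(dob), client_values.index(client)))

print(dob_values)   # ['1925-10-13', '1953-05-06'] - two values for three rows
print(data_points)  # [(0, 0), (1, 1), (1, 2)]
```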
16. Data Point Model architecture
[Diagram: existing systems – OpenGamma (Postgres), EBA COREP (Access), COREP/FINREP (XBRL Taxonomy), spreadsheets – are reverse-engineered by ModelDR into data point models (Open Gamma, EBA COREP, COREP/FINREP, spreadsheets) linked by FIBO semantics, with in-place access serving Views 1–4]
17. Demonstration subject matter
Para 33: Must have a taxonomy
Para 34: Data is aligned to that taxonomy
Para 40: Data quality
Para 50: On-demand slice and dice
…
EDM Council Data Quality standard:
• Completeness
• Coverage
• Conformity
• Consistency
• Accuracy
• Duplication
• Timeliness
http://www.bankofengland.co.uk/statistics/Documents/about/dqf.pdf
http://www.edmcouncil.org/dataquality
18. Demonstration
1) Model a spreadsheet report
2) Model a database table
3) Enrich the data points of the reports with their source system
4) Enrich the data points of the reports with their time cycles
5) Synthesise a common view of the two data sources using FIBO
6) Assign the data points to US GAAP
7) Provide analysis, reports and data quality metrics
19. 1) Untangle a spreadsheet
[Diagram: reverse-engineering a report grid into Report Name, X and Y Axis Aspects, X and Y Axis Value Sets, X and Y Axis Coordinate Aspect Values, the Reportable Aspect and the Reportable Aspect Values]
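A sketch of this reverse-engineering step, assuming a simple grid whose column and row headers carry the X and Y axis aspect values and whose cells become the reportable values; the layout and names are illustrative:

```python
# Reverse-engineer a report grid into data points.
# Header layout (row 0 / column 0) and names are assumptions.
grid = [
    ["",       "2014", "2015"],   # X axis value set: reporting period
    ["Assets",  100.0,  120.0],   # Y axis value set: line item
    ["Equity",   40.0,   55.0],
]

x_values = grid[0][1:]                    # X axis value set
y_values = [row[0] for row in grid[1:]]   # Y axis value set

data_points = []
for yi, row in enumerate(grid[1:]):
    for xi, cell in enumerate(row[1:]):
        # each cell is one data point located by its axis coordinates
        data_points.append({"y": y_values[yi], "x": x_values[xi], "value": cell})

print(data_points[0])   # {'y': 'Assets', 'x': '2014', 'value': 100.0}
```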
20. 2) Untangle an existing database
[Diagram: reverse-engineering a Business Entity through Class Level Adaptors and Attribute Adaptors into Class Level Aspects, Attribute Level Aspects, Attribute Level Value Sets and Resources]
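A sketch of the same idea for a relational schema: tables become class-level aspects and columns attribute-level aspects. The table and column names are invented for illustration:

```python
# Reverse-engineer a table schema into class- and attribute-level aspects.
# The schema below is invented for illustration.
schema = {"CLIENT": ["CLIENT_ID", "NAME", "CLIENT_TYPE"]}

aspects = []
for table, columns in schema.items():
    aspects.append({"level": "class", "aspect": table})   # class-level aspect
    for column in columns:
        aspects.append({"level": "attribute",             # attribute-level aspect
                        "aspect": column, "of": table})

for a in aspects:
    print(a)
```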
21. 3) Enrich the data points of the reports with their source system
[Diagram: wiring the source-system metadata onto the data points]
22. 4) Enrich the data points of the reports with their time cycles
[Diagram: wiring the time-cycle metadata onto the data points]
Both enrichment steps follow the pattern sketched below.
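A sketch of the wiring for both enrichment steps, attaching source-system and time-cycle metadata to data points; all names are illustrative:

```python
# Wire source-system and time-cycle metadata onto data points.
# Aspect names, sources and cycles are illustrative.
data_points = [
    {"aspect": "RiskRating", "value": "High"},
    {"aspect": "DoB", "value": "1925-10-13"},
]

enrichment = {
    "RiskRating": {"source": "Risk system", "cycle": "daily"},
    "DoB":        {"source": "Front office", "cycle": "on change"},
}

for dp in data_points:
    dp.update(enrichment[dp["aspect"]])   # wire the metadata onto the point

print(data_points[0])
# {'aspect': 'RiskRating', 'value': 'High', 'source': 'Risk system', 'cycle': 'daily'}
```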
23. 5) Synthesise a common view of the two data sources using FIBO
[Diagram: the FIBO model and the spreadsheet model are wired together into an integrated model]
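A sketch of the synthesis using rdflib: each local concept is declared equivalent to a single FIBO concept, so one query over the FIBO term reaches data from both sources. The FIBO IRI and class name are assumptions; the published ontology should be checked for the exact terms:

```python
# Synthesise a common view by mapping both models to one FIBO concept.
# The FIBO IRI and fibo:Client term are assumptions for illustration.
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF

FIBO = Namespace("https://spec.edmcouncil.org/fibo/ontology/")
SHEET = Namespace("http://example.com/spreadsheet#")
DB = Namespace("http://example.com/opengamma#")

g = Graph()
# Both local concepts are declared equivalent to the FIBO concept...
g.add((SHEET.Client, OWL.equivalentClass, FIBO.Client))
g.add((DB.COUNTERPARTY, OWL.equivalentClass, FIBO.Client))
# ...and each source contributes its own instances.
g.add((SHEET.row42, RDF.type, SHEET.Client))
g.add((DB.cp7, RDF.type, DB.COUNTERPARTY))

# One query against the FIBO term finds instances from either source.
q = """
SELECT ?x WHERE {
  ?local owl:equivalentClass fibo:Client .
  ?x a ?local .
}"""
for row in g.query(q, initNs={"owl": OWL, "fibo": FIBO}):
    print(row.x)
```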
24. 6) Assign the data points to US GAAP
[Diagram: the US GAAP model is reverse-engineered and wired together with the integrated model]
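A sketch of the assignment step; the US GAAP element names are illustrative stand-ins for the taxonomy's actual concepts:

```python
# Assign integrated-model data points to US GAAP concepts.
# The GAAP element names are illustrative stand-ins.
gaap_map = {"Assets": "us-gaap:Assets",
            "Equity": "us-gaap:StockholdersEquity"}

data_point = {"y": "Assets", "x": "2015", "value": 120.0}
data_point["gaap"] = gaap_map[data_point["y"]]   # wire to the GAAP concept
print(data_point)
```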
25. 7) Provide analysis, reports and data quality metrics
You now have a complete database supporting:
• Reports
• Ad-hoc queries
• Inferencing queries
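A sketch of two of the EDM Council dimensions (completeness and duplication) computed over integrated data points; the records echo the warehouse example from earlier in the deck:

```python
# Compute simple data quality metrics over integrated data points.
points = [
    {"lei": "987", "name": "Thatcher", "rating": "High"},
    {"lei": "654", "name": "Blair",    "rating": "Medium"},
    {"lei": "654", "name": "Cameron",  "rating": "Medium"},  # duplicated LEI
    {"lei": "321", "name": "Obama",    "rating": None},      # missing rating
]

completeness = sum(p["rating"] is not None for p in points) / len(points)
duplication = 1 - len({p["lei"] for p in points}) / len(points)

print(f"Completeness: {completeness:.0%}")   # 75%
print(f"Duplication:  {duplication:.0%}")    # 25%
```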
26. The model in the BCBS239 ecosystem
[Diagram: systems (trading systems, risk systems, calculation engine, data warehouses, spreadsheets, reg report needs such as an XBRL taxonomy) feed the data management tooling (modelDR, holding FIBO, data modelling, ontology management, metadata analysis and report designs); roles (Data Architect, Business Domain Expert, Data Owner) contribute the definition of good data quality; outputs flow to reporting and analytics (report writer, cross-DB DQ queries, and data quality metrics covering duplication, coverage, consistency, accuracy, completeness and conformity) and on to the regulator]
27. Resource search terms
• Data Quality Webinar: May 19th "How can I measure the quality and integrity of my data"
• Paper: BCBS239 Principles of Risk Data Aggregation
• Progress report: BCBS Progress in adopting the principles for effective risk data aggregation and risk reporting
• Data Point Model: ECB Data Point Model
• XBRL: XBRL Abstract Model
Editor's Notes
It is really great to see you all here – to talk BCBS239 with you all makes the inner data nerd in me leap for joy.
I say this only because I have always wanted to…
So why BCBS239?
Nine years ago, when I started with one of the big banks here in London, I asked which data modelling tool they used. "What's data modelling?" was the answer.
I should not have been so surprised at the "what's data modelling" line. I have been doing data models for 30 years. For 25 of those years it was always the same depressing story – to me it was obvious.
"Separate the logical view of what you want to achieve from the physical implementation!" – yes, of course, brilliant, that's obvious!
"Give everyone in the firm fingertip access to 100% of the data architecture!" – wow, compared to all the IP of the firm being locked up in the heads of developers, that's a total game changer – brilliant!
Why? Why has data architecture been such a failure?
One common reason I hear is the “I am a simple person” reason.
Person 1 in front – would you choose a simple solution to a problem or a complex solution?
Person 2 in front – which is the simpler tool, MagicDraw or a spreadsheet?
You have just proved MagicDraw is a big waste of time, because the best solution to BCBS239 is a spreadsheet.
2) Another is the mis-diagnosis of the root cause of failure. There was a project in Sydney for a large insurance company…
So if a project fails, it fails because of a…
But no. While every time I used data modelling and tooling it worked great, just as common sense and the theory they teach you in Computer Science 101 say it should, in 25 years I never saw a firm make a success of it.
Firstly, I think public servant types are community-minded. They think of bankers as ……
So they say: "If we are going to indemnify you against losing all the public's money, then we want to monitor you very closely."
Hence all the regulatory reporting. You have to understand the scale of reporting they are asking for.
Corep and Finrep and the Data Point Model are not really reports – they are data cubes – little star schema databases.
So what do these community minded public servants get?
Next slides
Deutsche Bank has been slapped with a multi-billion pound fine by regulators, including the largest ever handed out by UK authorities, over Libor and Euribor rate-fixing.
Germany's biggest lender has been fined £227m by the Financial Conduct Authority (FCA) over Libor and Euribor misconduct and for "misleading" the regulator during its investigation.
Three regulators in the US have slapped the bank with an even larger fine of $2.175bn (£1.4bn), bringing the total penalty to more than £1.6bn, the largest total fine handed out to any financial institution over the scandal.
Merrill Lynch International (MLI) has been fined £13.2m for failing to properly report a number of transactions between 2007 and 2014.
This is the highest fine imposed by the FCA for this type of failing. The watchdog said the amount levelled at the firm “reflects the severity of MLI's misconduct, failure to adequately address the root causes over several years despite substantial FCA guidance to the industry, and a poor history of transaction reporting compliance”.
The fine equates to £1.50 per line of incorrect or non-reported data, instead of the usual £1, because “past fines have not been high enough to achieve credible deterrence”.
The fine relates to two groups of failings – the first totalling 35,034,810 transactions and the second a further 121,387 transactions.
Georgina Philippou, FCA's acting director of enforcement and market oversight, said: “Proper transaction reporting really matters. Merrill Lynch International has failed to get this right again – despite a private warning, a previous fine, and extensive FCA guidance and enforcement action in this area.
"The size of the fine sends a clear message that we expect to be heard and understood across the industry.
"Accurate and timely reporting of transactions is crucial for us to perform effective surveillance for insider trading and market manipulation in support of our objective to ensure that markets work well and with integrity."
MLI received a 30 per cent reduction in their overall fine because it agreed to an early settlement. Without this discount the fine would have been nearly £19m.
Traditional solutions to this challenge create problems of data quality, slow rates of change and high costs of change.
The left-hand examples of as-is systems are from our demonstration site.
Thought experiment:
Number of databases: 500
Number of tables per database: 50
Number of columns per table: 30
Number of columns: 750,000
Number of relationships between columns = 750,000! = #NUM!
This example is from OpenGamma, which is a modern system. It uses a warehouse in the traditional approach.
The reason they are doing this is that a query in current approaches uses SQL.
SQL can only run against one database at a time. Therefore, for complex reg reports across multiple databases the data must be copied out to a data warehouse.
The source data is all in different formats – it must also be made congruent, which requires complex transforms during the load. Expensive!
And then the data is in another hard coded format. New business requirement = new data warehouse. Oooh, that hurts!
How many cans do you need to open to test a batch of baked beans?
Secret sauce:
Separating each data point from its value is unique to the DPM approach.
Allows data points to be re-purposed without duplication = agile
Massively reduces data duplication
It shares with RDF and other semantic technologies the idea of making structural and meta data explicit, in data, which:
Enables queries to reason based on "knowledge"
Allows new rules and relationships to be introduced without database changes
Current architecture strategies treat the value as the data point. If there are 1,000,000 clients there are 1,000,000 DoB instances.
But between 1915 and 2015 there are only 36,500 days on which to have a birthday – roughly a 30-fold duplication of data.
This 30-fold inefficiency is necessary because the Client and DoB concepts are hard-wired together in the Client table. The DPM breaks the data down to its atomic level, negating this constraint. The rule that all clients must have a DoB lives in a Rule, not hard-coded into the database structure = agile.
In the Data Point Model based approach:
Data congruence is created in the DPM layer. This is a design layer – no heavyweight technology = agile.
The data is used in place by preference rather than transformed. The SPARQL query language (a) allows access across multiple databases and (b) has the extra power of semantics and inference.
This means no data warehouse.
Likewise ModelDR can convert regulatory reports, spreadsheets and any other data format to a Data Point Model.
modelDR can easily transform all kinds of data, in this case an Oracle database, into a Data Point Model
My 3 main points:
- The scale of data architecture demanded by BCBS239 is massive
- Modelling is the only way to make it scale
- Within modelling, the only way to scale is with the semantic DPM approach
- The benefit is fantastic agility and the elimination of the data warehouse
The seven key dimensions associated with data quality (completeness, coverage, conformity, consistency, accuracy, duplication, timeliness).
These dimensions facilitate better communications among and between stakeholders about data quality objectives, challenges and remediation approaches.
Inputs to ModelDR are:
Schemas from databases, data models, FIBO, reg report obligations etc.
The definition of good from business domain experts
Outputs will be:
Cross-database queries
Data quality report designs
Metadata analysis
Data quality metrics
Definitions:
Core data attributes – a composite set of all critical data attributes for all primary applications (i.e. research, trade, clear, settle, price, value, risk, compliance, books and records) throughout the full transaction lifecycle. Definitions of core attributes will be aligned with FIBO.
Standard measurement criteria – working on the development of standard data quality metric categories, standard root-cause categories, a standard set of data governance criteria and a standard set of business rules for both reference and transactional data attributes.