Presentation on general use cases of MongoDB in the Financial Services industry. In this presentation we discuss why MongoDB is well suited to large-dataset analytics, real-time processing, quantitative analysis, and other aspects that make it a good fit for FS projects.
5. The Database of the Post-Relational Era
Combines the foundation of relational databases with the innovations of NoSQL
NoSQL: Flexible Data Model, Performance, Scalability
Relational: Strong Consistency, Powerful Query Language, Rich Indexes
6. MongoDB Features
• JSON Document Model with Dynamic Schemas
• Auto-Sharding for Horizontal Scalability
• Text Search
• Aggregation Framework and MapReduce
• Full, Flexible Index Support and Rich Queries
• Built-In Replication for High Availability
• Advanced Security
• Large Media Storage with GridFS
11. Relational Database Challenges
• Data Types
– Unstructured data
– Semi-structured data
– Polymorphic data
• Agile Development
– Iterative
– Short development cycles
– New workloads
• Volume of Data
– Petabytes of data
– Trillions of records
– Millions of queries/sec
• New Architectures
– Horizontal scaling
– Commodity servers
– Cloud computing
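To make "polymorphic data" concrete, here is a minimal sketch of three trades from different asset classes held side by side as documents. The field names and values are invented for illustration, and `find` is only a pure-Python stand-in for an equality query, not a MongoDB driver call.

```python
import json

# Hypothetical trade documents: each asset class carries its own fields,
# yet all of them can live in the same collection.
trades = [
    {"_id": 1, "asset_class": "equity", "ticker": "ABC",
     "quantity": 100, "price": 10.5},
    {"_id": 2, "asset_class": "fx_forward", "pair": "EURUSD",
     "notional": 1_000_000, "settle_date": "2015-06-30"},
    {"_id": 3, "asset_class": "irs", "notional": 5_000_000,
     "legs": [{"type": "fixed", "rate": 0.02},
              {"type": "float", "index": "LIBOR-3M"}]},
]


def find(docs, query):
    """Return the documents whose top-level fields equal every query value."""
    return [d for d in docs
            if all(k in d and d[k] == v for k, v in query.items())]


if __name__ == "__main__":
    # Documents with different shapes serialize to JSON without a shared schema.
    for doc in find(trades, {"asset_class": "irs"}):
        print(json.dumps(doc))
```

A relational design would typically need one table per shape or many nullable columns; here a new asset class is just a new document shape.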
12. Ps(x, s, e) = eng^e * s / x * C
Application change is a constant in today's development process!
17. Storage Engine API
• Allows different storage engines to be plugged in
– Different workloads require different performance characteristics
– MMAPv1 is not ideal for all workloads
– More flexibility
• Can mix storage engines in the same replica set or sharded cluster
• Opportunity to integrate further (HDFS, native encrypted, hardware optimized …)
18. What is WiredTiger?
• Storage engine company founded by BerkeleyDB alums
• Recently acquired by MongoDB
• Available as a storage engine option in MongoDB 3.0
19. Improving Concurrency Control
• 2.2 – Global
• 2.4 – Database-level
• 3.0 MMAPv1 – Collection-level
• 3.0 WT – Document-level
– Writes no longer block all other writes
– Higher level of concurrency leads to more CPU usage
20. Compression
• WT uses snappy compression by default
• Data is compressed on disk
• Two supported compression algorithms
– snappy: default. Good compression, relatively low overhead
– zlib: better compression, at the cost of more overhead
• Indexes are compressed using prefix compression
– Allows compression in memory
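As an illustration of choosing a compressor per collection, the sketch below builds the `create` command document with a WiredTiger `block_compressor` setting. The helper name and the collection name `ticks` are hypothetical, and the code only builds the document; sending it to a server (for example with a driver's generic command call) is left out.

```python
# Block compressors named on the slide, plus "none" to disable compression.
ALLOWED_COMPRESSORS = {"none", "snappy", "zlib"}


def create_collection_command(name, block_compressor="snappy"):
    """Build a `create` command that sets the WiredTiger block compressor.

    Hypothetical helper: it returns the command document and does not
    talk to a server.
    """
    if block_compressor not in ALLOWED_COMPRESSORS:
        raise ValueError(f"unsupported compressor: {block_compressor!r}")
    return {
        "create": name,
        "storageEngine": {
            "wiredTiger": {
                "configString": f"block_compressor={block_compressor}"
            }
        },
    }


if __name__ == "__main__":
    # Trade higher CPU overhead for a smaller on-disk footprint.
    print(create_collection_command("ticks", "zlib"))
```

zlib is a plausible choice for data that is written once and read rarely, while the snappy default suits hotter collections.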
21. Consistency without Journaling
• MMAPv1 uses a write-ahead log (journal) to guarantee consistency
• WT doesn't have this need: no in-place updates
– Write-ahead log committed at checkpoints
• 2 GB or 60 sec by default – configurable!
– No journal commit interval: writes are written to the journal as they come in
– Better for insert-heavy workloads
• Replication guarantees durability
24. Wider Range of Use Cases
How: Flexible Storage Architecture
• Fundamental rearchitecture, with new pluggable storage engine API
• Same data model, same query language, same ops
• But under the hood, many storage engines optimized for many use cases
Use cases: Single View, Content Management, Real-Time Analytics, Catalog, Internet of Things (IoT), Messaging, Log Data, Tick Data
33. Risk Aggregation & Reporting
• Intraday Controls
– Reporting in less than 1 minute
• Aggregate vast amounts of data from different trading desks (asset classes)
• Manage exposure to counter-party entities
– Can be thousands depending on the trade
– Challenge for existing RDBMS systems
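One way the intraday aggregation could look is an aggregation pipeline that keeps only risk measures less than a minute old and sums exposure per counter-party across desks. The field names (`as_of`, `counterparty`, `exposure`) are assumptions made for this sketch, and `aggregate_in_memory` is a pure-Python mirror of the pipeline so the logic can be checked without a server.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def exposure_pipeline(now, max_age=timedelta(minutes=1)):
    """Aggregation pipeline: total exposure per counter-party for fresh measures."""
    return [
        {"$match": {"as_of": {"$gte": now - max_age}}},
        {"$group": {"_id": "$counterparty",
                    "total_exposure": {"$sum": "$exposure"}}},
        {"$sort": {"total_exposure": -1}},
    ]


def aggregate_in_memory(measures, now, max_age=timedelta(minutes=1)):
    """Pure-Python equivalent of the pipeline above, for illustration only."""
    totals = defaultdict(float)
    for m in measures:
        if m["as_of"] >= now - max_age:  # same cut-off as the $match stage
            totals[m["counterparty"]] += m["exposure"]
    rows = [{"_id": cp, "total_exposure": t} for cp, t in totals.items()]
    return sorted(rows, key=lambda r: r["total_exposure"], reverse=True)


if __name__ == "__main__":
    now = datetime(2015, 1, 1, 12, 0, 0)
    sample = [
        {"desk": "rates", "counterparty": "CP-A", "exposure": 100.0,
         "as_of": now - timedelta(seconds=10)},
        {"desk": "fx", "counterparty": "CP-A", "exposure": 50.0,
         "as_of": now - timedelta(seconds=30)},
    ]
    print(aggregate_in_memory(sample, now))
```

With a driver, the list returned by `exposure_pipeline` would be passed to the collection's aggregate call. An index on the assumed `as_of` field would keep the `$match` stage cheap.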
35. Trade Repository
• Scalable Database
– Size
– Velocity
– Variety
• Regulatory Requirements
– Dodd-Frank and EMIR
• Any trade, any point in time
• Unified view of product and trades across time
42. Retail Bank Transactions Log
• Data needs to be fetched from the mainframe
– That costs money!
• Read Requests
– Mobile Apps
– Home Banking
– Analytics
– Marketing Workloads
47. Reference Data Management
Data
Securities Master, Corporate Actions, Market Data, Counter-Party Information, Economic Calendar, Legal Entity Identifier.
Problem
Replicating reference data across geographies in a timely and efficient manner. Ensuring that data replication meets service level agreements. Ensuring a congruent view across all trading entities in a global organisation.
Business Benefit
Reduced cost of managing infrastructure. Timely reference data replicated within SLA. The company in question will save about $40m in costs and penalties over 5 years. Charged only once for data from TR / Bloomberg / etc. instead of regionally as before.
Why MongoDB?
Dynamic data model means no schema changes across geographies. Built-in, robust replication mechanism simplifies infrastructure and removes the requirement for additional integration technologies. Data is replicated for each change, not batch-oriented. Both cache and database are always up to date. Simple data modelling and analysis: easy to change and understand.
Case Studies: Large American Investment / Retail bank
48. Risk Aggregation & Reporting
Data
Risk metrics from upstream systems. For instance, data from a front office system for monitoring counter-party exposure.
Problem
Investment banks need a congruent view of exposures across their business in order to manage risk effectively. Need for intraday controls: risk measures less than 1 minute old. Could not scale with RDBMS. Data was distributed across multiple silos and consequently needed to be aggregated. Need for versioning for data lineage and auditing. Auditors require a longer time window.
Business Benefit
Single view of exposure / risk data across the business. Can make application changes much faster. Can hedge / trade with more confidence and be more competitive. Can hold lower capital reserves.
Why MongoDB?
Scalable, replicable, flexible (a quick time-to-market). Can handle more data and users easily.
Dynamic Schema: can store disparate data and make changes easily.
Replication: local reads and high availability.
Sharding: can add data and users easily by scaling out.
Case Studies: Tier-1 Bank - Prime Services; Large American Banking Group; Swiss Bank (Equity Derivatives)
49. Trade Repository
Data
Trade data for each new or updated trade.
Problem
Dodd-Frank and EMIR (European Market Infrastructure Regulation) have mandated firms to store all trade data (including updates) for seven years. Investment banks are also required to be able to query and report to the regulators at any time in a bi-temporal manner. Each application builds its own persistence and audit trail. As an example, one customer wanted one unified framework and persistence for all trades and products, and found it hard to find a solution that could handle the many variable structures across all securities.
Business Benefit
Quick access to data and reporting to ensure that the regulators have what they need in a timely manner. Ensure compliance with regulatory mandates, and help to avoid the consequences of not complying.
Why MongoDB?
Scalable. Dynamic schema: trade information can vary over time. Scalable cost structure as the data volumes grow, “pay as you grow”.
Case Studies: Global leader in institutional research and investment management; Large Australian Bank
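The bi-temporal requirement can be sketched by storing every trade update as its own version with two timestamps: when the change takes effect and when it was recorded. This is a simplified model with assumed field names (`trade_id`, `valid_from`, `recorded_at`), not a description of any customer's schema. `trade_as_of` is a pure-Python version of the lookup, and `as_of_filter` with `AS_OF_SORT` shows the equivalent query shape.

```python
from datetime import datetime

# Sort order for the equivalent database query: newest effective version first,
# then the most recently recorded correction of that version.
AS_OF_SORT = [("valid_from", -1), ("recorded_at", -1)]


def as_of_filter(trade_id, valid_at, known_at):
    """Query filter: versions effective by `valid_at` and recorded by `known_at`."""
    return {
        "trade_id": trade_id,
        "valid_from": {"$lte": valid_at},
        "recorded_at": {"$lte": known_at},
    }


def trade_as_of(versions, trade_id, valid_at, known_at):
    """Return the version in force at `valid_at`, as it was known at `known_at`."""
    candidates = [
        v for v in versions
        if v["trade_id"] == trade_id
        and v["valid_from"] <= valid_at
        and v["recorded_at"] <= known_at
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda v: (v["valid_from"], v["recorded_at"]))


if __name__ == "__main__":
    versions = [
        {"trade_id": "T1", "notional": 100,
         "valid_from": datetime(2015, 1, 1), "recorded_at": datetime(2015, 1, 1)},
        {"trade_id": "T1", "notional": 110,  # late correction of the first version
         "valid_from": datetime(2015, 1, 1), "recorded_at": datetime(2015, 1, 15)},
    ]
    print(trade_as_of(versions, "T1", datetime(2015, 1, 5), datetime(2015, 1, 12)))
```

Because versions are only ever appended, the audit trail falls out of the data model instead of each application building its own.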
50. DBaaS
Data
Market, client/customer, trade, any data.
Problem
Wanted application groups in the bank to focus on building apps, not data access logic. It takes 6 months for application groups to get new infrastructure ordered and delivered. Application developers are not very interested in speaking with hardware/DBA groups. Horizontal scaling is done by each application.
Business Benefit
Time-to-market decreased by at least 50%. Object persistence included in the framework. DB capacity added in minutes, not months. Same environment from prototype to production.
Why MongoDB?
For new datamarts and single views, a flexible schema simplifies integrating disparate systems and keeps them “loosely coupled”, i.e. changes to upstream systems won't break downstream applications. Native language drivers: groups can focus on agile application development. Auto-replication: data distributed globally in real time.
Case Studies: Large US Investment and Retail Bank.
51. Single View of Customer
Data
Client/customer data, addresses, personal details, purchase history, status, etc.
Problem
Siloed data across the organisation, with no consistent view of the customer. Difficult to identify the needs of the customer for cross-sell / up-sell opportunities. Not able to deal positively with the customer, as source systems are hard to change or touch, so the business and IT are normally stuck. In the customer example, they had 70 source systems and 20 screens to view customer policies, so couldn’t feasibly see a single view.
Business Benefit
Provide the business with an accurate view of their customer base.
Why MongoDB?
Flexible schema simplifies integrating any disparate systems and keeps them “loosely coupled”, i.e. changes to upstream systems won't break downstream applications.
Performance: can handle all data in one DB.
Replication: local reads and high availability.
Sharding: can add more data and users globally by scaling out.
Case Studies: MetLife.
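A rough sketch of the "loosely coupled" single view: records from each source system are folded into one customer document, with every system's fields kept under its own key. The record layout, the source names and the `by_source` key are hypothetical, chosen only to show why a new or changed upstream field does not break readers of the others.

```python
def build_single_view(customer_id, source_records):
    """Fold per-system records into one document for a customer.

    Each source system's fields are nested under `by_source[<system>]`,
    so systems can add or change fields independently.
    """
    view = {"_id": customer_id, "by_source": {}}
    for record in source_records:
        if record["customer_id"] != customer_id:
            continue
        # Keep whatever fields the source sends; no shared schema is imposed.
        fields = {k: v for k, v in record.items()
                  if k not in ("customer_id", "source")}
        view["by_source"].setdefault(record["source"], {}).update(fields)
    return view


if __name__ == "__main__":
    records = [
        {"customer_id": "C1", "source": "crm", "name": "Ann"},
        {"customer_id": "C1", "source": "policies",
         "policies": [{"id": "P1", "type": "life"}]},
    ]
    print(build_single_view("C1", records))
```

Nesting by source also keeps lineage visible, since every field can be traced back to the system it came from.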
54. For More Information
Resource: Location
Case Studies: mongodb.com/customers
Presentations: mongodb.com/presentations
Free Online Training: education.mongodb.com
Webinars and Events: mongodb.com/events
Documentation: docs.mongodb.org
MongoDB Downloads: mongodb.com/download
Additional Info: info@mongodb.com
We are not in the business of doing things like before
We are in the disruptive technology business
MongoDB provides agility, scalability, and performance without sacrificing the functionality of relational databases, like full index support and rich queries
Indexes: secondary, compound, text search, geospatial, and more
x = number of features
s = size of the team
e = expertise
In 1985, storage was the key expense: $100,000 per GB; developer salary: $28,000 per year
So relational databases were built to optimize for storage
In 2013, storage is cheap: $0.05 per GB. Developers are expensive: $90,000 per year
So MongoDB was built to optimize for developer productivity
This is what the ratio of those expenses looks like, in 1985 and today
Assumptions:
3-year TCO
1985: 2 developers and 5 GB
2013: 2 developers and 5 TB
Developer costs comprise the lion’s share relative to storage today. So optimize for developer productivity
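The ratio can be checked with a few lines of arithmetic. This sketch uses the figures from the notes and adds two assumptions that the notes do not state: storage is bought once while salaries are paid in each of the three years, and 5 TB is taken as 5,000 GB.

```python
def three_year_tco(developers, salary_per_year, storage_gb, price_per_gb, years=3):
    """Return (developer_cost, storage_cost) over the TCO period.

    Assumption: storage is a one-off purchase, salaries recur every year.
    """
    developer_cost = developers * salary_per_year * years
    storage_cost = storage_gb * price_per_gb
    return developer_cost, storage_cost


def developer_share(developer_cost, storage_cost):
    """Fraction of the combined cost that goes to developers."""
    return developer_cost / (developer_cost + storage_cost)


# 1985: 2 developers, 5 GB at $100,000 per GB, $28,000 salary.
dev_1985, storage_1985 = three_year_tco(2, 28_000, 5, 100_000)
# 2013: 2 developers, 5 TB (taken as 5,000 GB) at $0.05 per GB, $90,000 salary.
dev_2013, storage_2013 = three_year_tco(2, 90_000, 5_000, 0.05)

if __name__ == "__main__":
    for year, dev, storage in (("1985", dev_1985, storage_1985),
                               ("2013", dev_2013, storage_2013)):
        print(f"{year}: developers ${dev:,.0f}, storage ${storage:,.0f}, "
              f"developer share {developer_share(dev, storage):.2%}")
```

Under these assumptions, developers cost $168,000 against $500,000 of storage in 1985 (about a quarter of the total), and $540,000 against $250 of storage in 2013 (over 99.9% of the total).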
Analysis of large sets of information
New streams of data from big data scenarios and IoT
Data formats that are very variable and constantly changing
Enrichment of an existing feed and feed onboarding take months
Data updates reach the traders with intra-day frequency
Sub-optimal data access and global availability
Licensing agreements are not effective