In the world of big data, legacy modernization, siloed organizations, empowered customers, and mobile devices, making informed choices about your enterprise infrastructure has become more important than ever. The alternatives are abundant, and the successful Enterprise Architect must constantly discern which new technology is just a shiny object and which will add true business value.
Enterprise architects view 2015-apr
1. An Enterprise Architect’s View of MongoDB
Matt Kalan
Sr. Solution Architect
matt.kalan@mongodb.com
@matthewkalan
2. Agenda
• Modern drivers of change on enterprises
• Requirements these create
• How traditional databases are handling changes
• New capabilities needed
• How MongoDB provides these capabilities
• Case studies
• Enterprise adoption
5. More Technologies and Requirements Than Ever
Big Data
NoSQL
Key-value
Wide-column
Document Data Stores
MongoDB
Mobile
Cloud Computing
Social networking
JSON
Internet of Things
Hadoop
Graph
Agile Development
ODS
Datawarehouse
Analytics
Consumerization
Gamification
Globalization
Emerging markets
Faster Competition
Regulation
Cross-channel
Empowered customers
Lowering TCO
More with less
New Revenue Streams
Customer 360
Opportunity cost
Data Monetization
Common Services
6. Questions for Enterprise Architects
• What current and future requirements does all this raise?
• How do I prepare my enterprise to handle these?
• Which technologies and products will help me?
• How do I bring them into my enterprise successfully?
• How do old and new technologies work together?
• What does the future state architecture look like?
8. The World Has Changed
Data
• Volume
• Velocity
• Variety
Time
• Iterative
• Agile
• Short Cycles
Risk
• Always On
• Scale
• Global
Cost
• Open-Source
• Cloud
• Commodity
15. Documents Support Modern Requirements
Relational | Document Data Structure
{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  location : [40.74, -73.97],
  image : <binary>,
  phones : [ {
      number : "1-212-777-1212",
      dnc : true,
      type : "home"
    },
    {
      number : "1-212-777-1213",
      type : "cell"
    } ]
}
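To make the relational-versus-document contrast concrete, here is a minimal sketch in plain JavaScript (no MongoDB driver) of how rows from two normalized tables might be folded into one customer document like the one above. The row layout and the helper name `toCustomerDocument` are assumptions for illustration, not part of the original deck.

```javascript
// Hypothetical relational rows: one customer row plus its phone rows.
const customerRows = [
  { customer_id: 1, first_name: "Mark", last_name: "Smith", city: "San Francisco" },
];
const phoneRows = [
  { customer_id: 1, number: "1-212-777-1212", dnc: true, type: "home" },
  { customer_id: 1, number: "1-212-777-1213", type: "cell" },
];

// Fold the child rows into an array embedded in the parent document,
// which is what a join would otherwise reassemble on every read.
function toCustomerDocument(customer, phones) {
  const embedded = phones
    .filter((p) => p.customer_id === customer.customer_id)
    .map(({ customer_id, ...phone }) => phone); // drop the foreign key
  return { ...customer, phones: embedded };
}

const customerDoc = toCustomerDocument(customerRows[0], phoneRows);
console.log(JSON.stringify(customerDoc, null, 2));
```

The embedded `phones` array replaces the foreign-key relationship, so a single read returns the customer and all phone numbers together.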
16. Basic Insert/Query Examples

Objective: Insert a map
Java example:
    Map m;
    …
    collection.insert(new BasicDBObject(m));
Command-line JavaScript shell:
    db.collection.insert(m);

Objective: Find all contacts with at least one mobile phone
Java example:
    DBObject expr = new BasicDBObject();
    expr.put("phones.type", "cell");
    List<DBObject> L = collection.find(expr).toArray();
Command-line JavaScript shell:
    db.contact.find({"phones.type":"cell"});
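The query above uses dot notation to reach into an array of sub-documents. As a rough illustration of that matching rule, the following plain JavaScript sketch filters an in-memory array; it does not call MongoDB. The helper names `valuesAtPath` and `findEq` and the sample contacts are assumptions made for this example, and the real server evaluates such queries with indexes and many more operators.

```javascript
// In-memory stand-in for a "contact" collection (sample data is illustrative).
const contacts = [
  { first_name: "Mark", phones: [{ number: "1-212-777-1212", type: "home" },
                                 { number: "1-212-777-1213", type: "cell" }] },
  { first_name: "Ann", phones: [{ number: "1-212-777-1300", type: "home" }] },
  { first_name: "Raj", phones: [] },
];

// Resolve a dotted path such as "phones.type". When a step lands on an array,
// descend into every element and collect all values found.
function valuesAtPath(value, path) {
  if (path.length === 0) return Array.isArray(value) ? value : [value];
  if (Array.isArray(value)) return value.flatMap((v) => valuesAtPath(v, path));
  if (value === null || typeof value !== "object") return [];
  const [head, ...rest] = path;
  return head in value ? valuesAtPath(value[head], rest) : [];
}

// Equality filter: a document matches if any value at the path equals the target.
function findEq(collection, dottedPath, target) {
  const path = dottedPath.split(".");
  return collection.filter((doc) => valuesAtPath(doc, path).includes(target));
}

const withCell = findEq(contacts, "phones.type", "cell");
console.log(withCell.map((c) => c.first_name));
```

A contact matches as soon as any one of its phones has the requested type, which is why the query reads as "at least one mobile phone".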
24. MongoDB-Hadoop Connector
Applications
• Low latency
• Rich fast querying
• Flexible indexing
• Aggregations in database
• Known data relationships
• Great for any subset of data
Distributed Analytics
• Longer jobs
• Batch analytics
• Highly parallel processing
• Unknown data relationships
• Great for looking at all data or large subsets
MongoDB Connector for Hadoop
28. Architecture Patterns
1. Operational Data Store (ODS)
2. Enterprise Data Service
3. Datamart/Cache
4. Master Data Distribution
5. Single Operational View
System of Record
System of Engagement
32. Criteria for benefitting most from MongoDB instead of RDBMS
Data
• Variably structured or unstructured
• Hierarchical
• Geo-coordinates
• Disparate sources
• Schema changes often
Querying
• Real-time analytics & aggregations
• Location-based
• Lowest latency
• Performance affects user experience
Requirements
• Agile development & fastest time-to-market
• Data will grow quickly
• Best performance for request/response
• Lowest TCO
• Multiple sources aggregated
• Challenges today with RDBMS
33. ADP’s Global Mobile Platform
One of the world's largest providers of payments solutions constructs a completely reliable and robust mobile experience
Problem
• Needed a signature mobile app for customers
• Must support millions of users
• Needed to quickly change features & functionality
• High availability was critically important
Why MongoDB
• Built-in high availability architecture optimized for global, multi-data center distribution
• Dynamic schema & rich querying – deep functionality from launch & new features easily added
• Much lower TCO, especially with commodity hardware
Results
• iTunes App Store “Top 15” business app since 2012 launch
• Over 1 million active users, 17 countries, 23 languages
• Extremely high performance through predictive caching
• Maintenance much easier => simple codebase, less hardware
• New functionality easy and quick to add
35. Challenge: Siloed operational applications
[Diagram: Silo 1, Silo 2, … Silo N systems, each with its own data (Silo 1 Data, Silo 2 Data, … Silo N Data) and its own reporting]
Impact
• Views are siloed
• Duplicate management and data access layer
• Need another layer to aggregate
36. Solution: Unified data service
[Diagram: Silo 1, Silo 2, … Silo N systems; Reporting]
Benefit
• Each application can still save its own data
• Data is already aggregated for cross-silo reporting
• One cluster and data access layer to manage
37. Case Study: Global Broker Dealer - Trade Mart for all OTC Trades
Distribute reference data globally in real-time for fast local accessing and querying
Problem
• Each application had its own persistence and audit trail
• Wanted one unified framework and persistence for all trades and products
• Needed to handle many variable structures across all securities
Why MongoDB
• Dynamic schema: can save trade for all products in one data service
• Easy scaling: can easily keep trades as long as required with high performance
• Rich querying: can query on any fields each business requires
Results
• Fast time-to-market using the persistence framework
• Store any structure of products/trades without changing a schema
• One consolidated trade store for auditing and reporting
39. Challenge: Response From Data Warehouse or Other System is Slow
[Diagram: Cards (Silo 1), Loans (Silo 2), Deposits (Silo 3), …; Data Warehouse; Reporting]
Issues
• Data stored normalized
• Reports slow to generate
• Data updated daily but user response must be fast
Impact
• Lost productivity
• Dissatisfied users and business
40. Solution: Optimize Data Structure as a Datamart In-memory or On-disk
[Diagram: Cards (Silo 1), Loans (Silo 2), Deposits (Silo 3), …; Data Warehouse; Datamart/Cache; Fast Reporting]
Solution
• Data stored in optimal structure for reports
• Optionally in memory
Impact
• Response times are as fast as possible
• Users and business satisfied
41. Case Study: Tier 1 Global Bank - Personalized In-memory Datamart
Needed fast reporting for finance on global banking transaction data (about 2 petabytes)
Problem
• Data warehouse was too slow for reporting
• No visibility into how long reports took
• Could not generate multiple ad hoc reports
• Users included regulators so even more demanding
Why MongoDB
• Dynamic schema: store data in optimal structure
• Performance: storing report results optimally
• In-memory caching of results
• Rich querying: can query on any field
• Easy scaling: results spread across shards to generate report in parallel
Results
• Create a personalized in-memory data mart
• Reports configured and notified when results ready
• Data all in memory so fast to manipulate
• Data spread across shards for ultra-fast reporting
43. Challenge: Master data can be hard to change and distribute
[Diagram: Golden Copy with eight links, each labeled Batch]
Common issues
• Hard to change schema of master data
• Data copied everywhere and gets out of sync
Impact
• Processes break from out-of-sync data
• Business doesn’t have data it needs
• Many copies create more management
44. Solution: Persistent dynamic cache replicated globally
[Diagram: eight links, each labeled Real-time]
Solution
• Load into primary with any schema
• Replicate to and read from secondaries
Benefits
• Easy & fast change at speed of business
• Easy scale out for one stop shop for data
• Low TCO
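The "load into primary, read from secondaries" pattern can be pictured with a toy model. The sketch below is plain JavaScript with hypothetical class names (`Member`, `ReplicaSet`) and made-up region names; it uses no MongoDB API. One deliberate simplification: changes are copied to secondaries immediately, whereas real MongoDB replication is asynchronous.

```javascript
// Toy replica set member: holds documents keyed by _id.
class Member {
  constructor(region) {
    this.region = region;
    this.docs = new Map(); // _id -> document
  }
  read(id) {
    return this.docs.get(id);
  }
}

class ReplicaSet {
  constructor(primary, secondaries) {
    this.primary = primary;
    this.secondaries = secondaries;
  }
  // Writes go to the primary only, with whatever fields the document has.
  write(doc) {
    this.primary.docs.set(doc._id, doc);
    // Copy the change to every secondary. Real replication is asynchronous;
    // it is applied immediately here to keep the sketch short.
    for (const s of this.secondaries) s.docs.set(doc._id, { ...doc });
  }
  // Reads are served by the secondary in the caller's region, if there is one.
  readLocal(region, id) {
    const local = this.secondaries.find((s) => s.region === region);
    return (local || this.primary).read(id);
  }
}

const golden = new ReplicaSet(new Member("new-york"), [new Member("london"), new Member("tokyo")]);
golden.write({ _id: "USD/EUR", kind: "fx-pair" });
golden.write({ _id: "IBM", kind: "equity", exchange: "NYSE" }); // different shape, no schema change
console.log(golden.readLocal("tokyo", "IBM"));
```

The second write carries a field the first one lacks, which is the "any schema" point: the golden copy can evolve without a coordinated schema migration across regions.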
45. Case Study: Global bank - Reference Data Distribution
Distribute reference data globally in real-time for fast local accessing and querying
Problem
• Delays up to 36 hours in distributing data by batch
• Charged multiple times globally for same data
• Incurring regulatory penalties from missing SLAs
• Had to manage 20 distributed systems with same data
Why MongoDB
• Dynamic schema: easy to load initially & over time
• Auto-replication: data distributed in real-time, read locally
• Both cache and database: cache always up-to-date
• Simple data modeling & analysis: easy changes and understanding
Results
• Will save about $40,000,000 in costs and penalties over 5 years
• Only charged once for data
• Data in sync globally and read locally
• Capacity to move to one global shared data service
49. Case Study
Insurance leader generates coveted 360-degree view of customers in 90 days – “The Wall”
Problem
• No single view of customer
• 145 yrs of policy data, 70+ systems, 15+ apps
• 2 years, $25M in failing to aggregate in RDBMS
• Poor customer experience
Why MongoDB
• Agility – prototype in 5 days; production in 90 days
• Dynamic schema: imperative to combine disparate data
• Rich querying: necessary to match data across silos
• Hot tech to attract top talent
Results
• Unified customer view available to all channels
• Increased call center productivity
• Better customer experience, reduced churn, more upsell opps
• Dozens more projects on same data platform
50. Expanded Single View of ….
[Diagram labels: Single CSR Application; Unified Customer Portal; Operational Reporting; …]
Operational Data Layer
• Request/response
• Millisecond latency
• Easily scalable
• Flexible schema
• Low TCO
• Rich querying with indexes
DW/Data Lake
• Analytical/batch processing
• Seconds to hours latency
• Also scalable, low TCO, & flexible schema
• Pre-defined slices of data (limited indexes)
[Diagram labels: Cards … Silo 1, Silo 2, … Silo N; ETL; Pub-sub/ETL; MongoDB Hadoop Connector; Customer Clustering; Churn Analysis; Predictive analytics; …]
53. Processing + Data Access Paradigm
Typical MongoDB Use Case
• Processing model: Request/response
• Data access model: Latency important (e.g. user waiting); milliseconds to seconds; small to large subsets of data; indexes valuable
Typical Hadoop Use Case
• Processing model: Map-reduce; batch, ETL, etc.; analytical jobs
• Data access model: Multiple seconds to hours; processing all or large sets of data; indexes not used
Data Discovery
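The difference between the two access models can be sketched in a few lines of plain JavaScript. This is an in-memory illustration only, using made-up trade records; it involves neither MongoDB nor Hadoop. The first half answers a single request through an index, the second half processes every record with a map step and a reduce step.

```javascript
// Hypothetical trade records.
const trades = [
  { id: 1, customer: "acme", amount: 120 },
  { id: 2, customer: "globex", amount: 75 },
  { id: 3, customer: "acme", amount: 30 },
];

// Request/response: build an index once, then answer each request by key lookup
// without touching unrelated records.
const byCustomer = new Map();
for (const t of trades) {
  if (!byCustomer.has(t.customer)) byCustomer.set(t.customer, []);
  byCustomer.get(t.customer).push(t);
}
const acmeTrades = byCustomer.get("acme") || [];

// Batch/analytical: every record is read anyway, so an index does not help.
// Map each record to a (key, value) pair, then reduce per key.
const totals = trades
  .map((t) => [t.customer, t.amount])
  .reduce((acc, [k, v]) => acc.set(k, (acc.get(k) || 0) + v), new Map());

console.log(acmeTrades.length, Object.fromEntries(totals));
```

The lookup cost stays roughly constant as the data grows, while the batch job's cost grows with the total volume, which is why the slide pairs indexes with request/response work and full scans with analytical jobs.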
56. Traditional Data Integrity Enforcement
[Diagram: Application 1, Application 2, Application 3; RDBMS]
• Apps access DB directly
• Data Integrity must be in the RDBMS
• Schema implemented by a DBA
57. Modern Apps (SOA) - Data Access Layer Should Enforce Data Integrity
[Diagram: Application 1, Application 2, … Application N; REST/API/WS API on TCP/IP; Data Access Layer; MongoDB Cluster]
• Data Integrity and validations done in Data Access Layer
• Implemented in code
• Greater adoption from offering an easy-to-use
developer framework on common data models
• Easier for master data or upstream changes to
flow into MongoDB-backed apps
• MongoDB useful for distributing master data
• ETL providers support MongoDB most in NoSQL
Data Governance Benefits
60. Factors to Consider in Adoption
• SDLC and data governance for an application
• Enterprise-wide data governance (inter-app)
• Enterprise-wide security
• Roles and responsibilities
• Training requirements
• Operations/production support
• Center of Excellence (COE)
• Process for choosing which DB to use
• How to work with other technologies in-house
62. Summary
• The world has changed dramatically in 40 years
• Old technologies not suited for many uses today
• MongoDB is purpose-built for today’s and future applications
• And can help solve common architectural challenges
• Firms using MongoDB benefit from 50% time-to-market, 70% lower TCO, less risk, and substantial competitive advantage
• MongoDB, Inc. can help optimize the value and adoption in your enterprise
64. For More Information
Resource | Location
MongoDB Downloads | mongodb.com/download
Free Online Training | university.mongodb.com
Webinars and Events | mongodb.com/events
White Papers | mongodb.com/white-papers
Case Studies | mongodb.com/customers
Presentations | mongodb.com/presentations
Documentation | docs.mongodb.org
Additional Info | info@mongodb.com
Editor's Notes
Now that we understand some of the challenges you’re facing and where you’d like to get, perhaps I can tell you a bit about why MongoDB exists and where we might be able to help.
Our founders observed some technological and business changes in the market. We built MongoDB to address the way the world is changing…
Data [tie back to what you’ve heard from customer if possible]
90% data created in last 2 years
80% enterprise data is unstructured
Unstructured data growing 2X rate of structured data
Time [tie back to what you’ve heard from customer if possible]
Development methods shifted from waterfall (12-24 months) to iterative
Leading edge companies like Facebook + Etsy shipping code multiple times a day
Risk [tie back to what you’ve heard from customer if possible]
User bases shifted from internal (thousands) to external (millions)
Can’t go down
All across the globe
Cost [tie back to what you’ve heard from customer if possible]
Shift to open-source business models to pay for value over time
Ability to leverage cloud and commodity architectures to lower infrastructure costs
Looking at the other technologies in the market…
Relational databases laid the foundation for what you’d want out of your database
Rich and fast access to the data, using an expressive query language and secondary indexes
Strong consistency, so you know you’re always getting the most up to date version of the data
But they weren’t built for the world we just talked about
Built for waterfall dev cycles, structured data
Built for internal users, not large numbers of users all across the globe
(From vendors who want large license fees upfront)
--> So what they have in data access and consistency, they lack in flexibility, scalability and performance
Here’s a relational model for an application. It has hundreds of tables.
If you are the new developer who just joined the team, congratulations!!
Here’s a map of the database, now go figure out how to add your new feature (or fix a bug).
Good luck!
NoSQL databases have tried to address the new world…
They all have relatively flexible data models
They were all built to scale out horizontally
And they were built for performance
But in doing so, they have sacrificed the core database capabilities you’ve come to expect and rely on in order to build fully functional apps, like rich querying, secondary indexes and strong consistency
MongoDB was built to address the way the world has changed while preserving the core database capabilities required to build functional apps
MongoDB is the only database that harnesses the innovations of NoSQL and maintains the foundation of relational databases
One of the main reasons is the data model.
Documents are just easier.
If my app tracks car collections, I don’t need to know dozens of tables – all the data for an individual and their collection is in one document. (Walk through this example)
Dynamic schema
Single view of a customer
Compared to distributed cache - $ and fixed schema
Single view of a customer
Can store all accounts in one table
Have performance capacity and easy scaling to do real-time, not just batch
Single view of a customer
In terms of reporting, a number of Business Intelligence (BI) vendors have developed connectors that integrate MongoDB as a data source with their suites, alongside traditional relational databases. This integration provides reporting, visualization, and dashboarding of MongoDB data.