The 10-step methodology provides a repeatable process for building a single view of data from multiple disconnected sources. The steps include defining the scope, identifying data producers and consumers, developing a data model, loading and standardizing the data, merging and reconciling records, designing infrastructure, modifying consuming systems, and implementing maintenance processes. Following this methodology allows organizations to improve business visibility, enable real-time analytics, and provide a foundation for further digital transformation.
10-Step Methodology to Building a Single View with MongoDB
1. 10-Step Methodology to Building a Single View
Mat Keep, Director of Product & Market Analysis. mat.keep@mongodb.com @matkeep
Jon Rangel, Director of Professional Services, EMEA. jon.rangel@mongodb.com
2. What You Will Learn
1. Single View: Opportunities & Challenges
2. Repeatable 10-Step Methodology
3. Required Technical Capabilities
4. Single View Defined
• What
– Single, real-time representation of a business entity or domain
– Customer, product, supply chain, financial asset class, & more
• How
– Gathers and organizes data from multiple, disconnected sources
– Aggregates information into a standardized format and joint information model
• Why
– Improves business visibility
– Serves operational applications
– Provides a foundation for analytics
5. Single View Use Cases
Finance
• Comparative view of traders or products
• Firm-wide view of asset exposure
• Aggregated transactions for fraud models
Retail
• Omni-channel view of customers for personalized marketing
• Inventory control & management
• Single view of product across channels & demographics
Healthcare
• Management of patient medical records for treatment plans
• Macro-analysis view for public health
• Medical history to identify insurance risk
6. Challenges
• Current State
– Data dispersed across a multitude of systems
– Different structures, different attributes
– Apps built to meet specific business requirements, not integrated
– New data sources from new apps, M&A
• Governance Processes
– How to deliver & maintain a single view in the face of constant business change
• Technology Limitations
– Traditional databases are not well suited to the capabilities a single view requires
9. 10-Step Methodology
Discover
Step 1: Define Scope
Step 2: Identify Data Consumers
Step 3: Identify Data Producers
Develop
Step 4: Appoint Data Stewards
Step 5: Develop Data Model
Step 6: Load & Standardize
Step 7: Merge, Test & Reconcile
Deploy
Step 8: Infrastructure Design
Step 9: Modify Consuming Systems
Step 10: Maintenance Processes
10. Step 1: Define Scope & Sponsorship
• Scope needs to be realistic and defined by a specific success metric
– Long term: aggregate all customer data into a single view, serving all business functions
– Initial phase: collect all customer interactions on digital channels over the past 3 months to improve call center MTTR
• Appoint executive sponsors
– Senior: able to allocate resources and command credibility
– Combination of a senior title from the business and one from the technology group
Discover
11. Steps 2 & 3: Identify Data Consumers & Producers
Step 2: Data Consumers
• Single view consumers define
– Typical queries and SLAs
– Required data attributes
– Current data sources
Step 3: Data Producers
• Identify apps generating the source data
– Identify application owners + associated databases
– Profile apps: operational, analytical
Source Systems (diagram): Web, Mobile, CRM, Mainframe
Discover
12. Step 4: Appoint Data Stewards
• A data steward is appointed for each data source
• Deep knowledge of:
– Source system schema
– Which tables store the required attributes, and in what format
– Clients and apps that generate & consume the source data
• Advises on data loading strategies
Develop
13. Step 5: Develop Single View Data Model
• Key inputs
– Required data attributes
– Query patterns
• Define common fields & data types
– Create rules to validate common data
• Define primary & secondary indexes
• Identify dynamic fields
– No need to pre-declare when using a document database
• Localize data into a single document (where appropriate)
{
  _id : "mark.smith@mongodb.com",
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones : [
    {
      number : "1-212-777-1212",
      dnc : true,
      type : "home"
    },
    {
      number : "1-212-777-1213",
      type : "cell"
    }
  ]
}
Single View
Develop
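The points on this slide can be sketched in a few lines of Python. This is only an illustration, not a MongoDB feature or API: `COMMON_FIELDS` and `validate_customer` are hypothetical names, and the rules are assumptions made for the example. The sketch validates the common fields and their types, and lets dynamic fields through without declaring them.

```python
# Hypothetical rule set: the common fields and the type each must have.
COMMON_FIELDS = {
    "_id": str,
    "first_name": str,
    "last_name": str,
    "city": str,
    "phones": list,
}


def validate_customer(doc):
    """Return a list of problems found in the common fields of doc."""
    problems = []
    for field, expected in COMMON_FIELDS.items():
        if field not in doc:
            problems.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    # Phone entries are embedded in the customer document.
    phones = doc.get("phones")
    if isinstance(phones, list):
        for i, phone in enumerate(phones):
            if not isinstance(phone, dict) or "number" not in phone:
                problems.append(f"phones[{i}] needs a number")
    return problems


customer = {
    "_id": "mark.smith@mongodb.com",
    "first_name": "Mark",
    "last_name": "Smith",
    "city": "San Francisco",
    "phones": [
        {"number": "1-212-777-1212", "dnc": True, "type": "home"},
        {"number": "1-212-777-1213", "type": "cell"},
    ],
    "loyalty_tier": "gold",  # dynamic field: not declared anywhere above
}
print(validate_customer(customer))
```

Only the common fields are checked, so a source system can add a new attribute such as `loyalty_tier` without any change to the rule set.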
14. Resources to Support Schema Design
• MongoDB Documentation
• MongoDB Development Rapid Start
Develop
15. Step 6: Load
2 phases: Initial Load & Delta Load
Initial Load
• ETL Tools
• Custom Loaders
Delta Load
• Batch loads: use tools above
• Real-time loads: Message queue
Emit JSON from the source systems; use Extended JSON to preserve data types
Diagram: source data is loaded into the Single View via ETL or Message Queue
Develop
16. Step 6 (cont’d): Standardize
Data Source A (record 14)
cust_id: 14
f_name: James
l_name: Bond
dob: 07/14/1968
eMail: 007@spook.com
Data Source B (record 77)
fno: 77
first: Jim
last: Bond
born: 1968-07-14
email: 007@spook.com
Data Source C (record 26)
xc_id: 26
name: James Bind
bdate: July 14, 68
Email: 007@spook.com
Develop
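The standardization of the three source records can be sketched as follows. The mapping tables and the `standardize` function are hypothetical, written only to illustrate renaming fields and normalizing dates to ISO 8601; the field names come from the slide, while the century correction assumes a date of birth can never lie in the future.

```python
from datetime import date, datetime

# Hypothetical per-source mappings from source field names to standard names.
FIELD_MAPS = {
    "A": {"cust_id": "id", "f_name": "first_name", "l_name": "last_name",
          "dob": "dob", "eMail": "eMail"},
    "B": {"fno": "id", "first": "first_name", "last": "last_name",
          "born": "dob", "email": "eMail"},
    "C": {"xc_id": "id", "name": "name", "bdate": "dob", "Email": "eMail"},
}
DATE_FORMATS = {"A": "%m/%d/%Y", "B": "%Y-%m-%d", "C": "%B %d, %y"}


def standardize(source, record):
    """Rename fields and normalize types for one record from one source."""
    out = {FIELD_MAPS[source][field]: value for field, value in record.items()}
    out["source_id"] = f"{source}_{out.pop('id')}"
    if "name" in out:
        # Source C keeps the full name in a single field.
        out["first_name"], out["last_name"] = out.pop("name").split(" ", 1)
    parsed = datetime.strptime(out["dob"], DATE_FORMATS[source]).date()
    if parsed > date.today():
        # Two-digit years can be parsed into the wrong century.
        parsed = parsed.replace(year=parsed.year - 100)
    out["dob"] = parsed.isoformat()
    return out


source_c = {"xc_id": 26, "name": "James Bind", "bdate": "July 14, 68",
            "Email": "007@spook.com"}
print(standardize("C", source_c))
```

The misspelled last name from Source C is deliberately left untouched here: standardization fixes field names and data types, and the conflict between "Bond" and "Bind" is resolved later, in the match and merge step.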
17. Step 7: Match, Merge & Reconcile
Source Data
Source A: cust_id: 14, f_name: James, l_name: Bond, dob: 07/14/1968, eMail: 007@spook.com
Source B: fno: 77, first: Jim, last: Bond, born: 1968-07-14, email: 007@spook.com
Source C: xc_id: 26, name: James Bind, bdate: July 14, 68, Email: 007@spook.com
Standardized Data (field names & data types)
source_id: A_14, first_name: James, last_name: Bond, dob: 1968-07-14, eMail: 007@spook.com
source_id: B_77, first_name: Jim, last_name: Bond, dob: 1968-07-14, eMail: 007@spook.com
source_id: C_26, first_name: James, last_name: Bind, dob: 1968-07-14, eMail: 007@spook.com
Single View (data merged, tested & reconciled)
_id: 007@spook.com, first_name: James, last_name: Bond, dob: 1968-07-14
Develop
18. Step 7 (cont’d): Match, Merge & Reconcile
• Use iterative grouping functions to cluster records with similar attributes
1. Match against unique, authoritative attributes (email address, credit card #)
2. Match by combining attributes (last name, DoB, zip code)
3. Use fuzzy matching to catch errors in source data (e.g. different spellings of a customer name)
• Apply a confidence factor to dictate merging
– Automatically merge records with 95%+ confidence
– Manually inspect records with lower confidence
Develop
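The three matching tiers and the confidence threshold can be sketched as below. Only the 95% auto-merge threshold comes from the slide. The scores assigned to each tier, the review floor, and the function names are assumptions chosen for illustration, and `difflib.SequenceMatcher` stands in for whatever fuzzy matcher a project actually uses.

```python
from difflib import SequenceMatcher

AUTO_MERGE_THRESHOLD = 0.95  # from the slide: merge automatically at 95%+
REVIEW_FLOOR = 0.5           # assumption: below this, treat as unrelated


def match_confidence(a, b):
    """Score how likely two standardized records describe the same entity."""
    # 1. Unique, authoritative attribute.
    if a.get("eMail") and a["eMail"].lower() == b.get("eMail", "").lower():
        return 1.0
    # 2. Combination of attributes (assumed score).
    if all(a.get(k) and a.get(k) == b.get(k) for k in ("last_name", "dob", "zip")):
        return 0.97
    # 3. Fuzzy name match, considered only when the date of birth agrees.
    if a.get("dob") and a.get("dob") == b.get("dob"):
        name_a = f"{a.get('first_name', '')} {a.get('last_name', '')}".lower()
        name_b = f"{b.get('first_name', '')} {b.get('last_name', '')}".lower()
        return 0.9 * SequenceMatcher(None, name_a, name_b).ratio()
    return 0.0


def decide(a, b):
    """Turn a confidence score into a merge decision."""
    confidence = match_confidence(a, b)
    if confidence >= AUTO_MERGE_THRESHOLD:
        return "auto-merge"
    if confidence >= REVIEW_FLOOR:
        return "manual-review"
    return "no-match"


rec_a = {"first_name": "James", "last_name": "Bond", "dob": "1968-07-14",
         "eMail": "007@spook.com"}
rec_b = {"first_name": "Jim", "last_name": "Bond", "dob": "1968-07-14",
         "eMail": "007@spook.com"}
rec_c = {"first_name": "James", "last_name": "Bind", "dob": "1968-07-14"}
print(decide(rec_a, rec_b), decide(rec_a, rec_c))
```

In this sketch a record that shares the email address merges automatically, while the misspelled record without an email is routed to a person for inspection.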
19. Step 7 (cont’d): MongoDB Tools
• Workers framework to parallelize document comparisons
• Grouping tool to cluster documents based on attribute similarity
– Levenshtein to calculate distances, single-linkage clustering for matching
Develop
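The MongoDB tools named on this slide are not shown in the deck, so the following is an independent sketch of the two techniques it mentions: Levenshtein distance and single-linkage clustering. The function names and the distance threshold are assumptions. Single linkage is implemented here by joining any two records whose distance is within the threshold and taking the connected groups.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def single_linkage(items, max_distance):
    """Group items so that each one is within max_distance of another member."""
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if levenshtein(items[i].lower(), items[j].lower()) <= max_distance:
            parent[find(i)] = find(j)

    clusters = {}
    for i, item in enumerate(items):
        clusters.setdefault(find(i), []).append(item)
    return list(clusters.values())


names = ["James Bond", "Jim Bond", "James Bind", "Mark Smith"]
print(single_linkage(names, max_distance=3))
```

With single linkage, "Jim Bond" and "James Bind" end up in the same cluster even though they are four edits apart, because each is close to "James Bond". Comparing every pair is quadratic in the number of records, which is presumably why the slide mentions a framework for parallelizing the comparisons.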
20. Step 8: Architecture Design
Deploy
• Deployment infrastructure
• MongoDB Production Readiness Consulting Package provides recommendations:
– Hardware sizing
– HA/DR strategies
– Scaling
– Security for corporate and regulatory compliance
• Follow-on services for implementation
21. Step 9: Modify Consuming Systems
Deploy
• Modify the apps that consume the single view
– Create an API that exposes the single view (e.g. a RESTful web service)
– Re-point apps to the web service (reads initially)
• Modify one consuming application at a time
Diagram: Consuming Systems (Call Center, Analytics, Technical Support, Billing) read from the Single View
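A read-only web service in front of the single view might look roughly like this. Everything here is an assumption made for illustration: the `/customers/<id>` route, the in-memory `SINGLE_VIEW` dictionary standing in for the database, and the function names. The sketch uses only the Python standard library and separates routing from the HTTP plumbing so the routing can be exercised without starting a server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

# Stand-in for the single view; a real service would query the database.
SINGLE_VIEW = {
    "007@spook.com": {"first_name": "James", "last_name": "Bond",
                      "dob": "1968-07-14"},
}


def handle_get(path):
    """Map a request path to a (status, body) pair."""
    prefix = "/customers/"
    if path.startswith(prefix):
        doc = SINGLE_VIEW.get(unquote(path[len(prefix):]))
        if doc is not None:
            return 200, doc
    return 404, {"error": "not found"}


class SingleViewHandler(BaseHTTPRequestHandler):
    """Serves reads only, matching the 'reads initially' bullet."""

    def do_GET(self):
        status, body = handle_get(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the service; call this explicitly, it blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), SingleViewHandler).serve_forever()


print(handle_get("/customers/007@spook.com"))
```

Because consuming applications talk to the service rather than to the database, each one can be re-pointed independently, one at a time.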
22. Step 10: Implement Maintenance Processes
Deploy
• Frequency of application launch & evolution is accelerating
• Impacts to single view
– Adding new attributes from source systems
– Onboarding new data sources or digital channels
– Creating new apps that consume the single view
• Single view team needs to institutionalize governance around on-going maintenance
– Repeat the 10-step process
– Dynamic schema is HUGE!
24. Single View Maturity Model
Axes: Scope, Business Benefits
Transforming the role of the single view
Read Centric
• Single View Application: Records are copied via ETL or message queue mechanisms from the source systems into the single view, serving read queries. The single view serves one specific application
• Single View Platform: The single view data model is enriched with additional sources to serve more applications, including real-time analytics. The single view becomes a platform serving multiple applications
Reads & Writes
• Dual Writes: Writes are performed concurrently to the source systems as well as the single view
• Single View First: Transactions are written first to the single view, which propagates the data back to the source system of record
• Advantages of writing to the single view
– Fresher data
– Reduced app complexity
– Improved application agility
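25. Architecture for Writes to the Single View
• Source Systems: Web, Mobile, CRM, Mainframe
• Load: source data reaches the Single View via ETL or Message Queue
• Consuming Systems: Call Center, Analytics, Technical Support, Billing
• Reads and Writes: consuming systems read from and write to the Single View
• Update Queue: carries writes from the Single View back to the source systems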
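The write path in this architecture can be sketched as follows, under the assumption that a write lands in the single view first and is then queued for the source system of record. The in-memory dictionary, the standard-library `queue.Queue`, and the function names are all stand-ins chosen for illustration; a real deployment would use a database and a message queue.

```python
import queue

single_view = {}              # stand-in for the single view collection
update_queue = queue.Queue()  # stand-in for the update queue


def write_customer(customer_id, changes, source_system):
    """Apply a write to the single view, then queue it for the source system."""
    single_view.setdefault(customer_id, {}).update(changes)
    update_queue.put({"target": source_system,
                      "customer_id": customer_id,
                      "changes": changes})


def drain_updates(apply_to_source):
    """Deliver queued updates to the source systems, oldest first."""
    while not update_queue.empty():
        apply_to_source(update_queue.get())


delivered = []
write_customer("007@spook.com", {"city": "London"}, source_system="CRM")
drain_updates(delivered.append)
print(single_view, delivered)
```

Readers of the single view see the change immediately, while the source system catches up asynchronously once the queue is drained.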
28. Required Database Capabilities
• Data model flexibility with a dynamic schema
• Real-time analytics
• Performance, scale & always-on
• Enterprise deployment model
29. Enterprise Deployment Model
• MongoDB Compass: Schema Visualization, Data Exploration, Ad-Hoc Queries
• MongoDB Connector for BI: Visualization, Analysis, Reporting
• MongoDB Enterprise Server: Authorization, Auditing, Encryption (In Flight & at Rest), Authentication
• MongoDB Ops Manager: Monitoring & Alerting, Query Optimization, Backup & Recovery, Automation & Configuration, REST API
• 24x7 Support (1 hour SLA): Emergency Patches, Customer Success Program, On-Demand Online Training
• Commercial License (No AGPL Copyleft Restrictions): Warranty, Limitation of Liability, Indemnification
• Platform Certifications
31. Single View of Customer
Insurance leader generates coveted single view of customers in 90 days – “The Wall”
Problem
• No single view of customer, leading to poor customer experience and churn
• 145 years of policy data, 70+ systems, 24 toll-free 800 numbers, 15+ front-end apps that are not integrated
• Spent 2 years and $25M trying to build a single view with an RDBMS, and failed
Why MongoDB (Solution)
• Built “The Wall,” pulling in disparate data and serving a single view to customer service reps in real time
• Flexible data model to aggregate disparate data into a single data store
• Expressive query language and secondary indexes to serve any field in real time
Results
• Prototyped in 2 weeks
• Deployed to production in 90 days
• Decreased churn and improved ability to upsell/cross-sell
32. Single View of LHC Analytics
Data aggregation system to accelerate scientific research & discovery
Problem
• Raw data from the LHC & experiments distributed across a multitude of source systems
• Scientists don’t know the location of source data, or how to extract it
• The rigid data model of relational databases prevented aggregation of data from different sources
Why MongoDB (Solution)
• Data Aggregation System built on MongoDB, consolidating analytics into a single view
• Dynamic schema represents data of any structure
• MongoDB query language supports simple lookups to complex search, traversals & analytics
Results
• A single query to MongoDB can return 10,000 documents from different data sources for real-time analytics
• Accelerates scientific time to insight
• Accessed by 3,000 physicists from 200 research institutions across the globe
34. Where to Go from Here?
• Single view projects are challenging
– Partner with a vendor offering proven methodology, tools & technologies
• Learn More
– Download the whitepaper: 10-Step Methodology to Building a Single View
• Engage
– MongoDB Global Consulting Services can help you scope the project and get started
– Book a workshop
36. Single View of the Customer
Global Airline
360° view of the customer increases customer satisfaction, cross-sell & up-sell with MongoDB, Spark, & Hadoop
Problem
• Customer data scattered across 100+ different systems
• Poor customer experience: no personalization, no consistent experience across brands or devices
• No way to analyze customer behavior to deliver targeted offers
Why MongoDB (Solution)
• Single View application on MongoDB: flexible data model, expressive query language, secondary indexes, & horizontal scalability
• Data from old relational systems fed into Spark for analysis and then stored in MongoDB to support real-time CRM
• Customer data synced from MongoDB to Hadoop for nightly batch jobs, then fed back to MongoDB for personalized recommendations
Results
• Single view serves customers from any channel
• Stores 10s of TBs of customer data across multiple data centers
• Increased revenues from improved customer intimacy, driving cross-sell and upsell