2. Data Quality Facts
● Cost of poor data quality in the US: $600 Billion
● Poor Data/Lack of visibility cited as #1 reason for
project cost overruns
● Poor data quality costs the US Economy $3.1 Trillion a
year
● Implementing data quality best practices boosts
revenue by 66%
● Median Fortune 1000 company could increase
revenue by $2.01 Billion if they improved usability of
data by 10%
Source: http://www.webmastat.com/blog/2012/09/07/7-facts-about-data-quality/
3. What is Data Quality?
Measuring data to determine if it is
“fit for purpose”
4. Fit For Purpose?
● “Bad” data is a myth!
● Two Questions
● What is the data used for?
● What can be measured to make sure it meets
the need?
● Application use vs. Reporting/Analysis
5. Data Quality Dimensions
● Consistency ● Accuracy
● Correctness ● Objectivity
● Timeliness ● Conciseness
● Precision ● Usefulness
● Unambiguous ● Usability
● Completeness ● Relevance
● Reliability ● Amount of data
Source: Data Quality Fundamentals, The Data Warehousing Institute
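Two of the dimensions above, completeness and timeliness, are easy to turn into numbers. The sketch below is only an illustration: the sample records, the field names, and the 90-day freshness window are assumptions, not part of the original slides.

```python
from datetime import date

# Hypothetical sample records; None marks a missing value.
records = [
    {"name": "Ann Lee", "email": "ann@example.com", "updated": date(2024, 1, 10)},
    {"name": "Bob Roy", "email": None, "updated": date(2023, 6, 1)},
    {"name": None, "email": "cy@example.com", "updated": date(2024, 1, 5)},
    {"name": "Dee Poe", "email": "dee@example.com", "updated": date(2022, 12, 31)},
]


def completeness(rows, field):
    """Fraction of rows where the field is present and non-empty."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def timeliness(rows, field, as_of, max_age_days):
    """Fraction of rows whose date field is within max_age_days of as_of."""
    if not rows:
        return 0.0
    fresh = sum(1 for r in rows if (as_of - r[field]).days <= max_age_days)
    return fresh / len(rows)


if __name__ == "__main__":
    print("email completeness:", completeness(records, "email"))
    print("90-day timeliness:", timeliness(records, "updated", date(2024, 1, 31), 90))
```

Whether a given score is acceptable depends on the purpose: an application may need every email present, while a report may tolerate gaps.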
6. Measuring Data Quality
● Profiling – understanding metadata
● Point in time shows what data looks like now
● Automating shows trends
– Alert to new/potential issues as they happen
– Potentially fix issues in near real time
– Six Sigma Principles
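One way to automate the trend-and-alert idea is a control-chart style check in the spirit of Six Sigma: keep a history of a profiling metric and flag a new reading that falls outside the mean plus or minus three standard deviations. This is a minimal sketch; the metric (daily null rate), the sample history, and the function names are assumptions for illustration.

```python
from statistics import mean, stdev


def control_limits(history, sigmas=3.0):
    """Return (low, high) limits around the historical mean.

    Needs at least two readings, because the sample standard deviation
    is undefined for a single value.
    """
    center = mean(history)
    spread = stdev(history)
    return center - sigmas * spread, center + sigmas * spread


def in_control(history, latest, sigmas=3.0):
    """True if the latest reading is inside the control limits."""
    low, high = control_limits(history, sigmas)
    return low <= latest <= high


# Hypothetical daily null rate (percent) for one column.
null_rate_history = [2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1]

if __name__ == "__main__":
    for reading in (2.05, 5.4):
        status = "ok" if in_control(null_rate_history, reading) else "ALERT"
        print(reading, status)
```

A point-in-time profile would only report the latest number; keeping the history is what makes the alert possible.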
8. Data Profiling Analysis
● Duplication ● Character Set
● Pattern matching ● Reference Data Matching
● Boolean/String/Number ● Value Distribution
● Date Gap ● Inter-Data Set Comparisons
● Date/time
● Day of Week
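Several of these analyses can be sketched with the Python standard library alone. The functions below cover duplication, pattern matching, value distribution, date gaps, and day of week; their names and the sample inputs are hypothetical, and a real profiling tool would do considerably more.

```python
import re
from collections import Counter
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def duplicates(values):
    """Values that occur more than once, with their counts."""
    return {v: n for v, n in Counter(values).items() if n > 1}


def pattern_mismatches(values, pattern):
    """String values that do not fully match the regular expression."""
    rx = re.compile(pattern)
    return [v for v in values if not rx.fullmatch(v)]


def value_distribution(values):
    """How often each distinct value occurs."""
    return Counter(values)


def date_gaps(dates):
    """Calendar days missing between the earliest and latest date."""
    present = set(dates)
    if not present:
        return []
    day, end = min(present), max(present)
    missing = []
    while day <= end:
        if day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


def day_of_week_counts(dates):
    """Number of records falling on each weekday."""
    return Counter(DAY_NAMES[d.weekday()] for d in dates)


if __name__ == "__main__":
    zips = ["30301", "3030A", "30301"]
    days = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4)]
    print(duplicates(zips))
    print(pattern_mismatches(zips, r"\d{5}"))
    print(date_gaps(days))
    print(day_of_week_counts(days))
```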
9. Master Data Management
● Create a gold standard for data
● Distribute data so that all sources are uniform
● Names
● Addresses
● Phone Numbers
● Products
● Can hook into third party sources
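A toy version of the gold-standard idea: normalize names and phone numbers into one format, then merge records from different sources that resolve to the same key. Everything here is an assumption made for illustration, including the US-style 10-digit phone rule, matching on phone alone, keeping the first name seen, and the source names `crm` and `billing`.

```python
import re


def normalize_phone(raw):
    """Reduce a US-style phone number to 10 digits, or None if it does not fit."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    return digits if len(digits) == 10 else None


def normalize_name(raw):
    """Collapse whitespace and apply simple title casing."""
    return " ".join(raw.split()).title()


def build_master(sources):
    """Merge rows from several sources into one record per normalized phone.

    Simplification: the first name seen for a phone number wins.
    Rows whose phone cannot be normalized are skipped.
    """
    master = {}
    for source_name, rows in sources.items():
        for row in rows:
            key = normalize_phone(row["phone"])
            if key is None:
                continue
            entry = master.setdefault(
                key,
                {"name": normalize_name(row["name"]), "phone": key, "sources": []},
            )
            entry["sources"].append(source_name)
    return master


if __name__ == "__main__":
    sources = {
        "crm": [{"name": "ann  LEE", "phone": "(555) 123-4567"}],
        "billing": [{"name": "Ann Lee", "phone": "+1 555.123.4567"}],
    }
    print(build_master(sources))
```

The merged record can then be distributed back to each source so that all of them carry the same name and phone format.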
10. Data Governance Program
● Central authority for data quality control
● Applies information collected from data
profiling, MDM, etc. uniformly across the
business
● Communication channels between business
and IT groups