Data Modeling, Data Governance, & Data Quality

Data Governance is often referred to as the people, processes, and policies around data and information, and these aspects are critical to the success of any data governance implementation. But just as critical is the technical infrastructure that supports the diverse data environments that run the business. Data models can be the critical link between business definitions and rules and the technical data systems that support them. Without the valuable metadata these models provide, data governance often lacks the “teeth” to be applied in operational and reporting systems.

Join Donna Burbank and her guest, Nigel Turner, as they discuss how data models & metadata-driven data governance can be applied in your organization in order to achieve improved data quality.

Published in: Technology

  1. 1. Data Modeling, Data Governance & Data Quality Donna Burbank & Nigel Turner Global Data Strategy Ltd. Lessons in Data Modeling DATAVERSITY Series December 5th, 2017
  2. 2. Global Data Strategy, Ltd. 2017 Donna Burbank Donna is a recognised industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture. Her background is multi-faceted across consulting, product development, product management, brand strategy, marketing, and business leadership. She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. In past roles, she has served in key brand strategy and product management roles at CA Technologies and Embarcadero Technologies for several of the leading data management products in the market. As an active contributor to the data management community, she is a long-time DAMA International member, Past President and Advisor to the DAMA Rocky Mountain chapter, and was awarded the Excellence in Data Management Award from DAMA International in 2016. She was on the review committee for the Object Management Group’s Information Management Metamodel (IMM) and the Business Process Modeling Notation (BPMN). Donna is also an analyst at the Boulder BI Brain Trust (BBBT), where she provides advice and gains insight on the latest BI and Analytics software in the market. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co-authored two books, Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler, and is a regular contributor to industry publications. She can be reached at donna.burbank@globaldatastrategy.com. Donna is based in Boulder, Colorado, USA. Follow on Twitter @donnaburbank Today’s hashtag: #LessonsDM
  3. 3. Nigel Turner Nigel Turner has worked in Information Management (IM) and related areas for over 20 years. This experience has embraced Data Governance, Information Strategy, Data Quality, Master Data Management, & Business Intelligence. He spent much of his career in British Telecommunications Group (BT), where he led a series of enterprise-wide IM & data governance initiatives. After leaving BT in 2010, Nigel became VP of Information Management Strategy at Harte Hanks Trillium Software, a leading global provider of Data Quality & Data Governance tools and consultancy. Here he engaged with over 150 customer organizations from all parts of the globe. Currently Principal Consultant for EMEA at Global Data Strategy, Ltd, he has been a principal consultant at such firms as FromHereOn and IPL, where he has led Data Governance engagements with customers such as First Great Western. Nigel is a well-known thought leader in Information Management and has presented at many international conferences. He has also lectured part-time at Cardiff University, where he taught Data Governance modules to both undergraduate and graduate students. In addition, he was a part-time Associate Lecturer at the UK Open University, where he taught Systems & Management. Nigel is very active in professional Data Management organizations and is an elected Data Management Association (DAMA) UK Committee member. He was the joint winner of DAMA International’s 2015 Community Award for the work he initiated and led in setting up a mentoring scheme in the UK where experienced DAMA professionals coach and support newer data management professionals. Nigel is based in Cardiff, Wales, UK. Follow on Twitter @NigelTurner8 Today’s hashtag: #LessonsDM
  4. 4. DATAVERSITY Lessons in Data Modeling Series: This Year’s Line Up • January (on demand): How Data Modeling Fits Into an Overall Enterprise Architecture • February (on demand): Data Modeling and Business Intelligence • March (on demand): Conceptual Data Modeling – How to Get the Attention of Business Users • April (on demand): The Evolving Role of the Data Architect – What does it mean for your Career? • May (on demand): Data Modeling & Metadata Management • June (on demand): Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling • July (on demand): Data Modeling & Metadata for Graph Databases • August (on demand): Data Modeling & Data Integration • Sept (on demand): Data Modeling & Master Data Management (MDM) • October (on demand): Agile & Data Modeling – How Can They Work Together? • December: Data Modeling, Data Quality & Data Governance
  5. 5. DATAVERSITY Data Architecture Strategies: Next Year’s Line Up for 2018 – New, Broader Focus • January: Panel: Emerging Trends in Data Architecture – What’s the Next Big Thing? • February: Building an Enterprise Data Strategy – Where to Start? • March: Modern Metadata Strategies • April: The Rise of the Graph Database: Practical Use Cases & Approaches to Benefit your Business • May: Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape • June: Artificial Intelligence: Real-World Applications for Your Organization • July: Panel: Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic Asset • August: Data Lake Architecture – Modern Strategies & Approaches • Sept: Master Data Management: Practical Strategies for Integrating into Your Data Architecture • October: Business-Centric Data Modeling: Strategies for Maximizing Business Benefit • December: Panel: Self-Service Reporting and Data Prep – Benefits & Risks
  6. 6. What We’ll Cover Today • Data Governance is often referred to as the people, processes, and policies around data and information, and these aspects are critical to the success of any data governance implementation. • But just as critical is the technical infrastructure that supports the diverse data environments that run the business. • Data models can be the critical link between business definitions and rules and the technical data systems that support them. Without the valuable metadata these models provide, data governance often lacks the “teeth” to be applied in operational and reporting systems. • Self-service data prep and analytics add further complexity, as a more diverse set of users has access to manipulate, model, and report on enterprise data. • This presentation will offer some practical guidance on how to integrate governance to balance Enterprise Standards with Self-Service Agility.
  7. 7. Business Drivers for Data Architecture: What’s Driving the Need? • As more organizations see data as a strategic asset, and with the drive towards Digital Business Transformation on the rise, the need to analyze, understand & govern core data assets continues to be a key goal. From Trends in Data Architecture 2017, by Donna Burbank & Charles Roe
  8. 8. Who is Responsible for Creating a Data Architecture? Collaboration is Key • With a greater business focus on data and a wider range of technologies associated with Data Management… • … it is not surprising that there is a concomitant rise in the diversity of roles responsible for developing a Data Architecture. • … the role of the data architect, not surprisingly, continues to play a large role. • The wide range of responses, covering a wide range of roles, shows the need for collaboration. From Trends in Data Architecture 2017, by Donna Burbank & Charles Roe
  9. 9. Data Modeling, Data Governance & Data Quality: the Virtuous Circle • What is Data Modeling? A process for translating business rules & definitions to the technical data systems & structures that support them • What is Data Quality? Data that is demonstrably fit for business purposes • What is Data Governance? A continuous process of managing and improving data for the benefit of all stakeholders • Links shown around the circle: “Scopes & helps prioritize”, “Provides the means to deliver”, “Drives the need for”
  10. 10. How Data Modeling, Governance & Quality Interact
      DATA MODELING
      • Maps out the overall relationships between data entities and their attributes
      • Helps to scope and prioritize the data that really matters for Governance and DQ improvement
      • Starts to identify the key data stakeholders who may become data owners & data stewards
      • Acts as a communication tool to improve understanding of the data estate
      • First step in defining DQ KPIs and metrics
      • Creates the link from business rules > data definitions > database design & implementation
      DATA QUALITY
      • Data profiling identifies & baselines the current state of key data entities and attributes
      • Raises awareness of DQ issues and problems in source data, and their impact
      • Delivers the real benefits of better data through data cleanse, enrichment & sustenance
      • Enables automation of business rules enforcement via the deployment of data quality tools
      • Provides an empirical foundation for action and improvement – KPIs and metrics
      • Helps build the business case for investment in a more strategic approach
      DATA GOVERNANCE
      • Provides an overarching strategic framework for data improvement
      • Assigns accountable data owners and data stewards to lead data improvement efforts
      • Ensures the business knowledge to define business rules and DQ thresholds
      • Ensures data improvement aligns and evolves with changing business needs
      • Creates the cross-business teams needed to tackle data problems & issues
      • Helps to build and deliver the business case for improvement
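The profiling, threshold, and KPI ideas in the Data Quality and Data Governance columns can be sketched in a few lines. This is a minimal illustration only: the sample records, the attribute names, and the 90% completeness threshold are assumptions made for the example, not figures from the presentation.

```python
# Hypothetical sample records; None or an empty string counts as missing.
customers = [
    {"customer_id": 1, "customer_name": "Acme Ltd", "customer_zip": "80301"},
    {"customer_id": 2, "customer_name": "", "customer_zip": None},
    {"customer_id": 3, "customer_name": "Widget Co", "customer_zip": "CF101AA"},
    {"customer_id": 4, "customer_name": "Smith & Sons", "customer_zip": None},
]


def completeness(rows, attribute):
    """Return the fraction of rows in which the attribute is populated."""
    if not rows:
        return 0.0
    populated = sum(1 for row in rows if row.get(attribute) not in (None, ""))
    return populated / len(rows)


def profile(rows, attributes, threshold):
    """Baseline each attribute and flag those below the agreed DQ threshold."""
    report = {}
    for attribute in attributes:
        score = completeness(rows, attribute)
        report[attribute] = {
            "completeness": score,
            "meets_threshold": score >= threshold,
        }
    return report


# The threshold would normally be agreed with the data owner; 0.9 is illustrative.
baseline = profile(
    customers, ["customer_id", "customer_name", "customer_zip"], threshold=0.9
)
for attribute, result in baseline.items():
    print(attribute, result)
```

Completeness is only one possible metric; the same pattern could hold validity or uniqueness checks once the business rules for them are defined.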
  11. 11. Data Governance: Overarching Framework. Managing the Complex Interactions between Technology, Process and People • Vision & Strategy • Organization & People • Process & Workflows • Data Management & Measures • Culture & Communication • Tools & Technology • Inputs: Business Goals & Objectives; Data Issues & Challenges
  12. 12. Data Improvement - From Firefighting to Fire Prevention
  13. 13. What is a Data Model? Translates Business Rules & Definitions… …to the Technical Data Systems & Structures that Support Them
  14. 14. Data Modeling is Hotter than Ever • In a recent DATAVERSITY survey, over 96% of respondents were engaged in Data Modeling in their organizations.
  15. 15. What is a Data Model? Translates Regulations, Policies & Procedures… …to the Technical Data Systems & Structures that Support Them • Regulation: e.g. GDPR • Policy: “All Personally Identifiable Information (PII) must be anonymized for the purpose of information sharing between departments.” • Question for the data model: Which data fields constitute PII in our databases?
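One way a data model can answer the "which fields are PII?" question is by carrying a classification tag on every column. The sketch below is hypothetical: the tag values, the `MODEL_METADATA` structure, and the helper names are assumptions for illustration, and simple masking stands in for whatever anonymization technique a real policy would require.

```python
# Hypothetical model metadata: each column carries a classification tag
# assigned in the data model. The tags are illustrative assumptions.
MODEL_METADATA = {
    "EMPLOYEE": {
        "employee_id": "internal",
        "department_id": "internal",
        "employee_fname": "pii",
        "employee_lname": "pii",
        "employee_ssn": "pii",
    },
    "CUSTOMER": {
        "customer_id": "internal",
        "customer_name": "pii",
        "customer_address": "pii",
        "customer_state": "public",
    },
}


def pii_columns(metadata):
    """List (table, column) pairs tagged as PII, sorted for stable output."""
    return sorted(
        (table, column)
        for table, columns in metadata.items()
        for column, tag in columns.items()
        if tag == "pii"
    )


def mask_pii(record, table, metadata, mask="***"):
    """Return a copy of the record with PII-tagged fields masked.

    Masking is a stand-in here; it is not full anonymization.
    """
    tags = metadata[table]
    return {
        field: (mask if tags.get(field) == "pii" else value)
        for field, value in record.items()
    }


print(pii_columns(MODEL_METADATA))
print(mask_pii({"employee_id": 7, "employee_fname": "John"}, "EMPLOYEE", MODEL_METADATA))
```

Keeping the tags in the model rather than in each application means the policy is defined once and can be applied wherever the model metadata is published.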
  16. 16. Technical & Business Metadata
      • Technical Metadata describes the structure, format, and rules for storing data.
      • Business Metadata describes the business definitions, rules, and context for data.
      • Data represents actual instances (e.g. John Smith).
      Technical Metadata:
          CREATE TABLE EMPLOYEE (
              employee_id    INTEGER NOT NULL,
              department_id  INTEGER NOT NULL,
              employee_fname VARCHAR(50) NULL,
              employee_lname VARCHAR(50) NULL,
              employee_ssn   CHAR(9) NULL);
          CREATE TABLE CUSTOMER (
              customer_id      INTEGER NOT NULL,
              customer_name    VARCHAR(50) NULL,
              customer_address VARCHAR(150) NULL,
              customer_city    VARCHAR(50) NULL,
              customer_state   CHAR(2) NULL,
              customer_zip     CHAR(9) NULL);
      Data: John Smith
      Business Metadata (Term: Definition):
      • Employee: An employee is an individual who currently works for the organization or who has been recently employed within the past 6 months.
      • Customer: A customer is a person or organization who has purchased from the organization within the past 2 years and has an active loyalty card or maintenance contract.
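The three layers on this slide can be seen side by side in a small runnable sketch. It uses Python's built-in `sqlite3` module as a stand-in database, which is an assumption made for convenience: the slide does not name a platform. The DDL is a simplified copy of the EMPLOYEE table (explicit `NULL` keywords dropped, since columns are nullable by default), and the department value is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Technical metadata: structure, types, and nullability live in the DDL.
conn.execute(
    """
    CREATE TABLE EMPLOYEE (
        employee_id    INTEGER NOT NULL,
        department_id  INTEGER NOT NULL,
        employee_fname VARCHAR(50),
        employee_lname VARCHAR(50),
        employee_ssn   CHAR(9)
    )
    """
)

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
technical = {
    row[1]: {"type": row[2], "nullable": not row[3]}
    for row in conn.execute("PRAGMA table_info(EMPLOYEE)")
}

# Business metadata: the glossary definition from the slide.
glossary = {
    "EMPLOYEE": (
        "An employee is an individual who currently works for the organization "
        "or who has been recently employed within the past 6 months."
    )
}

# Data: an actual instance. Department 10 is a made-up value.
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 10, 'John', 'Smith', NULL)")
data = conn.execute(
    "SELECT employee_fname, employee_lname FROM EMPLOYEE"
).fetchall()

print("Business:", glossary["EMPLOYEE"])
print("Technical:", technical)
print("Data:", data)
```

Note that nothing in the database itself records the "within the past 6 months" rule; that context only exists in the business metadata, which is why the two need to be linked.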
  17. 17. Business vs. Technical Metadata • The following are examples of types of business & technical metadata.
      Business Metadata
      • Definitions & Glossary
      • Data Steward
      • Organization
      • Privacy Level
      • Security Level
      • Acronyms & Abbreviations
      • Business Rules
      • Etc.
      Technical Metadata
      • Column structure of a database table
      • Data Type & Length (e.g. VARCHAR(20))
      • Domains
      • Standard abbreviations (e.g. CUSTOMER -> CUST)
      • Nullability
      • Keys (primary, foreign, alternate, etc.)
      • Validation Rules
      • Data Movement Rules
      • Permissions
      • Etc.
  18. 18. Human Metadata • Much business metadata and the history of the business exists in employees’ heads. • It is important to capture this metadata in an electronic format for sharing with others. • Avoid the dreaded “I just know”, e.g. “Part Number is what used to be called Component Number before the acquisition.” • Places to capture it: Business Glossary, Metadata Repository, Data Models, Collaboration Tools, etc.
  19. 19. Global Data Strategy, Ltd. 2017 Business Definitions From Data Modeling for the Business by Hoberman, Burbank, Bradley, Technics Publications, 2009
  20. 20. Publishing Business Definitions in a Data Model • Data Models are a great place to store business definitions • Display them on the model for a business audience • Store them in the model repository for reuse across the organization (various users, tools, etc.)
  21. 21. Creating a Technical Data Inventory: Linking business definitions to technical implementations • Data models & the associated metadata can create a real-world inventory of the data storage associated with key business data domains in the control of a data governance program. • Example: the business term Customer links to Customer Database (Oracle), Sales Database (DB2), SAP, Data Lake on Hadoop, Customer Database (SQL Server), CRM Database, POS Data Store, and Marketing Database (Netezza).
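A technical data inventory of this kind is essentially a lookup from a business term to the physical stores that hold it. The sketch below shows one possible shape for it. The store-to-platform pairings are read from the slide's diagram as best they can be recovered, and the `INVENTORY` structure and function names are hypothetical.

```python
from collections import defaultdict

# (business term, data store, platform). Pairings are inferred from the
# slide's diagram and should be treated as illustrative.
INVENTORY = [
    ("Customer", "Customer Database", "Oracle"),
    ("Customer", "Sales Database", "DB2"),
    ("Customer", "Customer Database", "SQL Server"),
    ("Customer", "Marketing Database", "Netezza"),
    ("Customer", "Data Lake", "Hadoop"),
]


def stores_for(term, inventory):
    """Return every (store, platform) pair that holds the business term."""
    return [(store, platform) for t, store, platform in inventory if t == term]


def platforms_by_term(inventory):
    """Group the distinct platforms under each business term."""
    grouped = defaultdict(set)
    for term, _store, platform in inventory:
        grouped[term].add(platform)
    return dict(grouped)


for store, platform in stores_for("Customer", INVENTORY):
    print(f"Customer -> {store} ({platform})")
```

A governance team could use such a lookup to see at a glance how many systems a change to the Customer definition would touch.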
  22. 22. Data Lineage • In the data warehouse example below, metadata for CUSTOMER exists in a number of tools & data stores. • This lineage can be tracked in many data modeling tools & associated metadata & governance solutions. [Figure: CUSTOMER traced across a Business Glossary, a Logical Data Model, Physical Data Models, database tables (CUSTOMER, CUST, TBL_C1), ETL Tools, a Dimensional Data Model, and a Sales Report in a BI Tool]
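Lineage tracking of this kind amounts to walking a dependency graph. The following is a minimal sketch under stated assumptions: the node names borrow the table names from the slide, but the specific edges (which table feeds which) are hypothetical, since the transcript does not preserve the arrows in the diagram.

```python
from collections import deque

# Hypothetical lineage edges: each key is derived from the items it maps to.
UPSTREAM = {
    "Sales Report (BI Tool)": ["CUSTOMER (warehouse table)"],
    "CUSTOMER (warehouse table)": ["CUST (staging table)"],
    "CUST (staging table)": ["TBL_C1 (source table)", "CUSTOMER (source table)"],
}


def trace_upstream(node, upstream):
    """Breadth-first walk returning every upstream dependency of node."""
    seen = {node}
    order = []
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


for source in trace_upstream("Sales Report (BI Tool)", UPSTREAM):
    print(source)
```

Reversing the edges gives impact analysis instead: starting from a source table, the same walk lists every report that would be affected by a change.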
  23. 23. Technical Metadata Makes Data Governance Actionable • Data models can help take the business rules & definitions defined in policies and make them actionable in physical systems, maintaining a lineage & audit trail. Data models are a good vehicle for this. • Flow: Policies & Procedures > Business Rules & Definitions > Technical Implementation > Audit & Lineage
  24. 24. Data Quality Improvement: Why bother? • 90% of all data has been created in the last 2 years • Average business data volumes double every 1.2 years • 7.5 quintillion grains of sand on Earth • 2.5 quintillion bytes of new data created every day
  25. 25. Data Quality Problems - Recent Evidence. Source: “Only 3% of Companies’ Data Meets Basic Quality Standards”, Tadhg Nagle, Thomas C. Redman & David Sammon, Harvard Business Review, September 11, 2017
  26. 26. Some Industry Statistics • Raw data used in Self-Service Analytics and BI environments is often so poor that many data scientists and BI professionals spend an estimated 50 – 90% of their time cleaning and reformatting data to make it fit for purpose. Source: DataCenterJournal.com • Correcting poor data quality is a Data Scientist’s least favorite task, consuming on average 80% of their working day. Source: Forbes 2016 • Lack of effective Data Governance and the absence of shared data definitions and metadata are cited as main impediments to the success of Data Lakes. Source: Radiant Advisors 2015 • The US economy loses $3.1 trillion a year because of poor data quality. Source: Artemis Ventures
  27. 27. Traps for the Unwary – Why DQ & Data Governance Can Fail • Lack of business leadership and commitment • Failure to link DQ / DG to organizational goals and benefits • Failure to focus on the data that really matters • Giving people data responsibility but not equipping them to succeed • Placing too much emphasis on data monitoring and not data improvement • Thinking new technology alone will solve the problems • Forgetting DQ / DG must embrace all who use data across an organization • Not delivering business value early and regularly
  28. 28. Why It Can Be Hard - the Horizontal Data Flow • Business functions: Sales, Operations, Dispatch, Finance • Data flowing across them: Customer Data, Product Data, Finance Data, Employee Data
  29. 29. The Newton’s Cradle Effect • Problems often emerge far away from the cause
  30. 30. Creating the Data Improvement ‘Sweet Spot’: Focus on Key Data. Improving core data through Data Modeling, Data Governance & Data Quality • DATA GOVERNANCE: A management framework for data accountability & data improvement • DATA QUALITY: Approaches & tools for improving data accuracy, completeness & consistency • DATA MODELING: The visual representation of data relationships & their physical storage in technical platforms • CORE DATA, the ‘Sweet Spot’: Data which is widely used by many people & processes across the business and which is critical to business success
  31. 31. Implement “Just Enough” Data Governance
      • Know what to manage closely and what to leave alone
      • As a general rule, the more the data is shared across & beyond the organization, the more formal governance needs to be
      Core Enterprise Data
      • Common data elements used by multiple stakeholders across BUs, LOBs, functional areas, applications, etc.
      • Highly governed
      • Highly published & shared
      • Examples: Common Financial Metrics, for Financial & Regulatory Reporting; Common Attributes, i.e. core attributes reused across multiple areas (e.g. Customer name, Account ID, Address)
      Master & Reference Data
      • Common data elements used by multiple stakeholders across functional areas, applications, etc.
      • Highly governed
      • Highly published & shared
      • Examples: Reference Data (Procedure codes, Country Codes, etc.); Master Data (Location, Customer, Product)
      Functional & Operational Data
      • Lightly modeled & prepared data for limited sharing & reuse
      • Collaboration-based governance
      • May be future candidates for core data
      • Examples: Operational Reporting; Non-productionized analytical model data; Ad hoc reporting & discovery
      Exploratory Data
      • Raw or lightly prepped data for exploratory analysis
      • Mainly ad hoc, one-off analysis
      • Light touch governance
      • Examples: Raw data sets for exploratory analytics; External & Open data sources
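The "more sharing means more formal governance" rule of thumb can be expressed as a simple classification. The sketch below is purely illustrative: the tier names and governance styles come from the slide, but the cut-off values (number of consuming teams, external sharing) are assumptions invented for the example and would need to be agreed within each organization.

```python
from enum import Enum


class Tier(Enum):
    CORE = "Core Enterprise Data"
    FUNCTIONAL = "Functional & Operational Data"
    EXPLORATORY = "Exploratory Data"


# Governance style per tier, as described on the slide.
GOVERNANCE_STYLE = {
    Tier.CORE: "highly governed",
    Tier.FUNCTIONAL: "collaboration-based governance",
    Tier.EXPLORATORY: "light touch governance",
}


def suggest_tier(consuming_teams, shared_externally=False):
    """Suggest a governance tier from how widely a data set is shared.

    The cut-offs are assumed values for illustration, not a standard.
    """
    if shared_externally or consuming_teams >= 3:
        return Tier.CORE
    if consuming_teams == 2:
        return Tier.FUNCTIONAL
    return Tier.EXPLORATORY


for teams in (1, 2, 5):
    tier = suggest_tier(teams)
    print(teams, tier.value, "->", GOVERNANCE_STYLE[tier])
```

In practice the decision would also weigh regulatory exposure and business criticality, so a function like this is at most a starting point for discussion.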
  32. 32. Global Data Strategy, Ltd. 2017 The Rise of Self-Service BI, Analytics, & Data Prep • The interest in self-service data reporting has increased among data-savvy business users. • The availability of tools & data sets has made it easier for business people to do their own data manipulation & reporting • Self Service BI & Data Manipulation – the tools are slick! • Accessible Data & Open Data Sets – the amount of data available is amazing! • Tech-Savvy Business Users – this isn’t any harder than a spreadsheet! • While this offers great opportunities, it can also be fraught with challenges. • Data modelers and the models & metadata they create can make the job of business intelligence easier for both BI professionals and the casual BI reporting user • Particularly for enterprise-wide, standardized data • But what about non-standard, non-relational, and discovery data? 32
  33. 33. The Self-Service User • “If there are standardized data sets, I’d love to use them!” e.g. Master Data, Data Warehouse • “Published documentation, metadata, & standard definitions are super-helpful!” e.g. Glossaries, data models, etc. • “I want to integrate these data sets with my own exploratory data for analysis & modeling!” e.g. Self-Service Data Prep & Analysis Tools • “How can I leverage what other people have done, and see what is most relevant?” e.g. Data Cataloguing & Crowdsourcing
  34. 34. Crowdsourcing Governance & Metadata Definitions • Many data governance projects (& vendors) are embracing the concept of “crowdsourcing”, i.e. the Wikipedia vs. Encyclopedia approach: open editing, popularity & usage rankings, dynamically changing. • Encyclopedia (for standardized, enterprise data sets): created by a few, then published as read-only; single source of “vetted” truth; static. • Wikipedia (for self-service data prep & analytics): created by many, edited by many; eventual consistency with multiple inputs; dynamic.
  35. 35. Harnessing “Tribal Knowledge”: Collaboration & Crowdsourcing
      • Helpfulness Ranking: Which definitions are most complete & helpful? Which algorithms offer a helpful starting point? Which queries offer great logic to share? Etc.
      • Usage Ranking: Which queries are others using? Which tables are accessed the most? Which glossary terms are most often searched? Etc.
      • Example glossary entry. Term: Part Number. Alternate Names: Component Number. Definition: A part number is an 8 digit alphanumeric field that uniquely identifies a machine part used in the manufacturing process.
      • Example discussion on that entry. Comment: “Is this truly the same as the old Component Number? That was a 10 digit numeric field. It didn’t have letters.” Reply: “Yes, it is. I had the same problem for the finance app, and I wrote a quick program to convert the numbers. We just strip off the first two chars now. Click here to find it.”
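The "quick program" mentioned in the discussion thread is not shown on the slide. A minimal sketch of what such a conversion might look like follows; the function name and the validation rules are assumptions, and only the "strip off the first two chars" step comes from the slide.

```python
def component_to_part_number(component_number: str) -> str:
    """Convert a legacy 10-digit Component Number to an 8-character Part Number.

    Follows the rule quoted in the discussion: drop the first two characters.
    The input validation is an added assumption.
    """
    if (
        len(component_number) != 10
        or not component_number.isascii()
        or not component_number.isdigit()
    ):
        raise ValueError(
            f"expected a 10-digit numeric Component Number, got {component_number!r}"
        )
    return component_number[2:]


print(component_to_part_number("0012345678"))
```

Sharing a snippet like this alongside the glossary entry is the kind of tribal knowledge the slide describes: the next person with the same question finds both the answer and the working logic.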
  36. 36. Finding the Right Balance • When implementing successful data governance in today’s rapidly-changing, self-service data landscape, it is important to find a balance between: • Standards-based Governance: well-suited for enterprise-wide data standards • Collaboration-based Governance: well-suited for self-service data preparation & analytics • The two methods work well together, using the right approach depending on the data usage.
  37. 37. Global Data Strategy, Ltd. 2017 Summary • Data governance requires a mix of people, processes, and technologies • Data models & metadata support the policies & procedures defined by data governance • Data model metadata supports actionable data governance through • Linking business & technical definitions & business rules • Providing standardization & consistency • Supporting data lineage & audit trails • It is important to establish the right level of governance for each unique data use case • Self-Service data prep & analytics require a new paradigm for “crowdsourcing” metadata • A combination of standards-driven + collaborative governance provides a powerful mix that offers value across the organization.
  38. 38. About Global Data Strategy, Ltd: Data-Driven Business Transformation. Business Strategy Aligned With Data Strategy • Global Data Strategy is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. • Our passion is data, and helping organizations enrich their business opportunities through data and information. • Our core values center around providing solutions that are: • Business-Driven: We put the needs of your business first, before we look at any technology solution. • Clear & Relevant: We provide clear explanations using real-world examples. • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography. • High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of technical expertise in the industry. Visit www.globaldatastrategy.com for more information
  40. 40. White Paper: Trends in Data Architecture • Free Download • Available for download on dataversity.net
  41. 41. White Paper: Emerging Trends in Metadata Management • Free Download • Download from www.globaldatastrategy.com, under ‘Whitepapers’
  42. 42. Questions? Thoughts? Ideas?
