
SQLite Module 4


Module 4 out of 9; SQLite training slides, databases, SQL, ERD, software design, database

Published in: Software


  1. 1. MODULE 4: DATA MODELING AND THE ERD SQL and Scripting Training (C) 2020-2021 Highervista, LLC 1
  2. 2. 2 TOPICS Questions from Prior Day Getting Started with a Database Project (the Capstone) Data Modeling Concepts: Part 2 Cardinality Steps for Creating an ERD Normalization Getting Started with a Database Project (the Capstone) Draw.io Preview of Afternoon Activities
  3. 3. DATA MODELING CONCEPTS: PART 2 SQL and Scripting Training
  4. 4. 4 DATA MODELING OBJECTIVES Understand concepts and purpose of data modeling Learn how relationships between entities are defined and refined, and how such relationships are incorporated into the database design process Learn how ERD components affect database design and implementation Learn how to interpret the modeling symbols
  5. 5. 5 PROCEDURE OF ERD Relatively simple representations of complex real-world data structures Data modeling is an iterative process A “complete” and “100% error-free” data model is not possible! Only an optimized data model is possible
  6. 6. 6 DATA MODEL: REVIEW Model: an abstraction of a real-world object or event • Useful in understanding complexities of the real-world environment Data model • A diagram that displays a set of tables and the relationships between them • A foundation! • Next slides: Draw.io entity relationship diagram (ERD) examples West and Fowler (1999)
  7. 7. 7 REVIEW: WHAT IS AN ENTITY RELATIONSHIP DIAGRAM (ERD)? ERD is a data modeling technique used in software engineering to produce a conceptual data model of an information system. ERDs illustrate the logical structure of databases. ERDs represent business or use cases. Source: Data Model (McFarland, 2020)
  8. 8. 8 EXAMPLE ERD (CHEN NOTATION) Source: Data Model (McFarland, 2020)
  9. 9. 9 THE IMPORTANCE OF A DATA MODEL Blueprint: official documentation • Like the blueprint of a house Employees without DB knowledge can understand it • A data model diagram vs. a list of tables • Used as an effective communication tool • Improves interaction among the managers, the designers, and the end users Independence from a particular DBMS • Network DB, object-oriented DB, etc. Source: Data Model (McFarland, 2020)
  10. 10. 10 DATA MODEL (CONT'D) Data modeling revolves around discovering and analyzing the data requirements (use cases) of the organization and its users. Requirements are based on policies, meetings, procedures, system specifications, etc. • Identify what data is important • Identify what data should be maintained Source: Data Model (McFarland, 2020)
  11. 11. 11 ERD The major activity of this phase is identifying entities, attributes, and their relationships to construct a model using the Entity Relationship Diagram. “Logical” (or design) names include: Entity/Attribute/Relationship “Physical” implementation names include: Table, Column, Line Entity → table Attribute → column Relationship → line Source: ERD (McFarland, 2020)
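The logical-to-physical mapping on this slide can be sketched in SQLite. This is a minimal illustration, not part of the original deck: the `department` and `employee` tables and their columns are hypothetical names chosen for the example, and the relationship "line" is assumed to be implemented as a foreign key.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE department (              -- entity -> table
    dept_id   INTEGER PRIMARY KEY,     -- identifier attribute -> primary key column
    dept_name TEXT NOT NULL            -- attribute -> column
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id)  -- relationship line -> foreign key
);
""")

# List the tables that now exist in the physical model.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```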
  12. 12. CARDINALITIES SQL and Scripting Training
  13. 13. 13 CARDINALITY (AND OPTIONALITY): CROW'S FOOT
  14. 14. 14 CLARIFICATION: HOW TO FIND CARDINALITIES? Cardinality: • The cardinality is the number of occurrences in one entity that are associated with the number of occurrences in another. • There are three basic cardinalities (degrees of relationship): • one-to-one (1:1) • one-to-many (1:M) • many-to-many (M:N) Note: In Crow’s Foot notation, the relationship lines between entities are collapsed into one bidirectional line. Source: Example of Cardinality (McFarland, 2020)
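As a rough sketch of how 1:1 and 1:M cardinalities might be enforced physically (M:N needs a bridge table, covered later in the deck): a `UNIQUE` foreign key is one common way to get 1:1, and a plain foreign key gives 1:M. The `person`, `passport`, and `phone` tables are hypothetical and not from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- 1:1, via a UNIQUE foreign key: each person has at most one passport
CREATE TABLE passport (
    passport_no TEXT PRIMARY KEY,
    person_id   INTEGER NOT NULL UNIQUE REFERENCES person(person_id)
);

-- 1:M, via a plain foreign key: one person can have many phone numbers
CREATE TABLE phone (
    phone_no  TEXT PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(person_id)
);

INSERT INTO person VALUES (1, 'Ada');
INSERT INTO passport VALUES ('P-1', 1);
INSERT INTO phone VALUES ('555-0100', 1), ('555-0101', 1);
""")

# A second passport for the same person violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO passport VALUES ('P-2', 1)")
    second_passport_allowed = True
except sqlite3.IntegrityError:
    second_passport_allowed = False

phone_count = conn.execute(
    "SELECT COUNT(*) FROM phone WHERE person_id = 1").fetchone()[0]
print(second_passport_allowed, phone_count)
```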
  15. 15. 15 CROW’S FOOT: OPTIONALITY Optionality is a property of a relationship that specifies whether participation is mandatory or optional. To identify an optional relationship, look for an auxiliary verb such as “can” or “may”.
  16. 16. 16 DEGREE OF RELATIONSHIP The degree of a relationship is the number of entity types that participate in it. • Unary (recursive) relationship: one instance is related to another instance of the same entity type • Binary relationship: instances of two different entity types are related to each other • Ternary relationship: instances of three different entity types are related to each other
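One way optionality tends to show up in a physical schema is in whether a foreign key column may be NULL. The sketch below assumes that mapping; the `employer`, `person`, and `badge` tables are hypothetical examples, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employer (employer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person (
    person_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    employer_id INTEGER REFERENCES employer(employer_id)     -- nullable: a person MAY work for an employer
);
CREATE TABLE badge (
    badge_id  INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(person_id)  -- NOT NULL: a badge MUST belong to a person
);
""")

# Optional side: a person with no employer is accepted.
conn.execute("INSERT INTO person (person_id, name, employer_id) VALUES (1, 'Lee', NULL)")

# Mandatory side: a badge with no person is rejected.
try:
    conn.execute("INSERT INTO badge (badge_id, person_id) VALUES (1, NULL)")
    mandatory_enforced = False
except sqlite3.IntegrityError:
    mandatory_enforced = True

person_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(mandatory_enforced, person_count)
```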
  17. 17. 17 Degree of Relationship … Source: McFarland, 2020
  18. 18. 18 BINARY RELATIONSHIP
  19. 19. 19 UNARY (RECURSIVE) RELATIONSHIP An entity can have a relationship to itself; this is called a recursive relationship. Diagram labels: supervises / is supervised by
  20. 20. 20 TERNARY RELATIONSHIP In a ternary relationship, three different entities take part in one relationship.
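A recursive relationship such as "supervises / is supervised by" is commonly implemented as a foreign key that points back at the same table. The following is a small sketch under that assumption; the `employee` table, its columns, and the sample names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    emp_id        INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    supervisor_id INTEGER REFERENCES employee(emp_id)  -- points back at the same table
);
INSERT INTO employee VALUES (1, 'Avery', NULL), (2, 'Blake', 1), (3, 'Casey', 1);
""")

# Self-join: alias the table twice to pair each employee with their supervisor.
pairs = conn.execute("""
    SELECT s.name AS supervisor, e.name AS employee
    FROM employee AS e
    JOIN employee AS s ON e.supervisor_id = s.emp_id
    ORDER BY e.emp_id
""").fetchall()
print(pairs)
```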
  21. 21. 21 CROW’S FOOT: WEAK ENTITY RELATIONSHIP A weak entity is an entity that cannot be uniquely identified, and cannot exist, on its own. It exists only if it is related to a set of uniquely determined entities (the owners of the weak entity). • More examples in the textbook Each employee might have zero or more dependents. However, each dependent must belong to at least one employee. Diagram: EMP, DEP (weak entity notation)
  22. 22. 22 TRANSFORMATION OF M:N A logical model will often contain many M:N relationships. These need to be transformed when moving to a physical model. If an M:N relationship is carried directly into the relational model, many redundancies can be generated. • The relational operations become very complex and are likely to cause efficiency problems and output errors. • Break the M:N down into 1:N and N:1 relationships using a bridge entity (weak entity). Diagram: CLASS, STUDENT, ENROLL
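The EMP/DEP weak-entity idea can be sketched as a table whose primary key includes the owner's key, so a dependent cannot exist without its employee. The column names and the `ON DELETE CASCADE` choice are assumptions made for this example rather than something the slide prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Weak entity: identified only in combination with its owner's key.
CREATE TABLE dep (
    emp_id   INTEGER NOT NULL REFERENCES emp(emp_id) ON DELETE CASCADE,
    dep_name TEXT NOT NULL,
    PRIMARY KEY (emp_id, dep_name)
);

INSERT INTO emp VALUES (1, 'Jordan');
INSERT INTO dep VALUES (1, 'Sam'), (1, 'Riley');
""")

dependents_before = conn.execute("SELECT COUNT(*) FROM dep").fetchone()[0]

# Removing the owner removes its dependents as well.
conn.execute("DELETE FROM emp WHERE emp_id = 1")
dependents_after = conn.execute("SELECT COUNT(*) FROM dep").fetchone()[0]
print(dependents_before, dependents_after)
```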
  23. 23. 23 CONVERTING ONE M:N RELATIONSHIP TO TWO 1:M RELATIONSHIPS Association Entity (Join Table) Converting M:N Relationships (FileMaker, 2020)
  24. 24. 24 BRIDGE (ASSOCIATIVE) ENTITY The ENROLL entity becomes a weak entity of both the STUDENT entity and the CLASS entity. It MUST have a composite (unique) identifier: • STU_NUM (from the STUDENT entity) and CLASS_CODE (from the CLASS entity)
  25. 25. 25 M:N WITH OPTIONALITY ON BOTH SIDES A person might or might not work for an employer, but could certainly moonlight for multiple companies. An employer might have no employees, but could have any number of them. After the relationship is broken down, the optional-relationship notation appears on both sides of the associative entity (Association).
  26. 26. 26 RECURSIVE RELATIONSHIP Each student is taught by an STA (student teaching assistant). Each STA can teach several students. A recursive relationship is one in which an entity is associated with itself. Diagram: Student; teaches / is taught by
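The STUDENT / CLASS / ENROLL bridge described here could look roughly like this in SQLite. The key columns `stu_num` and `class_code` follow the slide; the other columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (stu_num INTEGER PRIMARY KEY, stu_name TEXT NOT NULL);
CREATE TABLE class   (class_code TEXT PRIMARY KEY, class_title TEXT NOT NULL);

-- Bridge (associative) entity: one row per student-class pairing.
CREATE TABLE enroll (
    stu_num    INTEGER NOT NULL REFERENCES student(stu_num),
    class_code TEXT    NOT NULL REFERENCES class(class_code),
    PRIMARY KEY (stu_num, class_code)   -- composite identifier
);

INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO class VALUES ('DB101', 'Databases'), ('PY101', 'Python');
INSERT INTO enroll VALUES (1, 'DB101'), (1, 'PY101'), (2, 'DB101');
""")

# One student, many classes; one class, many students.
classes_for_ana = [r[0] for r in conn.execute(
    "SELECT class_code FROM enroll WHERE stu_num = 1 ORDER BY class_code")]
students_in_db101 = [r[0] for r in conn.execute(
    "SELECT stu_num FROM enroll WHERE class_code = 'DB101' ORDER BY stu_num")]

# The composite key rejects enrolling the same student in the same class twice.
try:
    conn.execute("INSERT INTO enroll VALUES (1, 'DB101')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

print(classes_for_ana, students_in_db101, duplicate_allowed)
```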
  27. 27. 27 RECAP: EXAMPLE ERD MODEL (CROW'S FOOT) Example ERD Model (McFarland, 2020)
  28. 28. 28 RECAP: CHEN STYLE ERD
  29. 29. STEPS FOR CREATING AN ERD SQL and Scripting Training
  30. 30. 30 STEPS FOR CREATING AN ERD 1. Identify entities: look for singular nouns (but avoid nouns without attributes, and avoid proper nouns). 2. Identify attributes: look for descriptors whose values are associated with individual entities of a specific entity type. 3. Identify relationships: typically, a relationship is indicated by a verb connecting two or more entities. 4. Identify cardinality: look for the number of occurrences in one entity that are associated with the number of occurrences in another.
  31. 31. 31 WHAT ARE THE ENTITIES? ATTRIBUTES? ANG Laboratory has several chemists who work on one or more projects. Chemists also may use certain kinds of equipment on each project. The organization would like to store the chemist’s employee identification number, his/her name, up to three phone numbers, his/her project identification number and the date on which the project started. Every piece of equipment the chemist uses has a serial number and a cost.
  32. 32. 32 ENTITIES Chemist Project Equipment
  33. 33. 33 ATTRIBUTES? ANG Laboratory has several chemists who work on one or more projects. Chemists also may use certain kinds of equipment on each project. The organization would like to store the chemist’s employee identification number, his/her name, up to three phone numbers, his/her project identification number and the date on which the project started. Every piece of equipment the chemist uses has a serial number and a cost.
  34. 34. 34 ENTITIES, ATTRIBUTES AND IDENTIFIERS (IN CHEN NOTATION) Diagram: Chemist (Emp#, Phone# [multivalued]); Project (Proj#, Start-Date); Equipment (Serial#, cost)
  35. 35. 35 RELATIONSHIPS? ANG Laboratory has several chemists who work on one or more projects. Chemists also may use certain kinds of equipment on each project. The organization would like to store the chemist’s employee identification number, his/her name, up to three phone numbers, his/her project identification number and the date on which the project started. Every piece of equipment the chemist uses has a serial number and a cost.
  36. 36. 36 ENTITIES/RELATIONSHIPS & THEIR ATTRIBUTES Diagram: Chemist (Emp#, Phone# [multivalued]); Project (Proj#, Start-Date); Equipment (Serial#, cost); relationships Works-On and Uses, with relationship attributes Date-Assigned and Assign-Date
  37. 37. 37 CARDINALITY The organization would like to store the date the chemist was assigned to the project and the date an equipment item was assigned to a particular chemist working on a particular project. A chemist must be assigned to at least one project and at least one piece of equipment. Each project and each piece of equipment must be managed by only one chemist. A given project need not be assigned any equipment.
  38. 38. 38 COMPLETE ER DIAGRAM (CHEN) Diagram: Chemist (Emp#, Phone# [multivalued]); Project (Proj#, Start-Date); Equipment (Serial#, cost); relationships Works-On and Uses, with relationship attributes Date-Assigned and Assign-Date; each relationship is labeled with cardinalities 1 and N
  39. 39. NORMALIZATION SQL and Scripting Training
  40. 40. 40 NORMALIZATION DEFINED Normalization is a process for evaluating and correcting relational structures to minimize data redundancies, reducing the likelihood of data inconsistencies or anomalies.
  41. 41. 41 DATABASE NORMALIZATION
  42. 42. 42 NORMALIZATION During the design process, we often create entities (tables) with inconsistencies and anomalies. • Anomaly: an inconsistent, incomplete, or contradictory state of data in a database. Anomalies can cause significant problems in running the database, including the incorrect deletion or inappropriate updating of data within a table. Normalization is a process that we can step through to reduce anomalies in the relational database.
  43. 43. 43 WELL-STRUCTURED RELATIONS What constitutes a well-structured relation? Intuitively, a well-structured relation contains minimal redundancy and allows users to insert, modify, and delete rows in a table without errors or inconsistencies.
      EMPLOYEE1 Table
      EmpID  Name       Dept        Salary
      230    Pillsbury  Marketing   58,000
      241    Marshall   Finance     68,400
      277    Marco      Accounting  66,000
      279    Gunston    Marketing   42,400
      290    Jaffe      Planning    49,000
  44. 44. 44 WELL-STRUCTURED RELATIONS EMPLOYEE1 is a well-structured relation. Each row of the table contains data describing one employee, and any modification of an employee’s data (such as a change in salary) is confined to one row in the table.
      EMPLOYEE1 Table
      EmpID  Name       Dept        Salary
      230    Pillsbury  Marketing   58,000
      241    Marshall   Finance     68,400
      277    Marco      Accounting  66,000
      279    Gunston    Marketing   42,400
      290    Jaffe      Planning    49,000
  45. 45. 45 WELL-STRUCTURED RELATIONS In contrast, EMPLOYEE2 is not a well-structured relation. Notice the redundancy. For example, values for EmpID, Name, Dept, and Salary appear in two separate rows for employees 241 and 290.
      EMPLOYEE2 Table
      EmpID  Name       Dept        Salary  Course      Date
      230    Pillsbury  Marketing   58,000  C++         2/12/06
      241    Marshall   Finance     68,400  SPSS        5/30/07
      241    Marshall   Finance     68,400  Web Design  11/2/08
      277    Marco      Accounting  66,000  C#          12/8/07
      279    Gunston    Marketing   42,400  Java        9/10/06
      290    Jaffe      Planning    49,000  Tax Acct    4/22/06
      290    Jaffe      Planning    49,000  Bus Adm     6/6/08
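To see the redundancy in EMPLOYEE2 concretely, the sketch below loads the rows shown on the slide and splits them into two tables. The split into `employee` and `emp_course`, and the lowercase column names, are one plausible decomposition assumed for illustration, not necessarily the one the course uses.

```python
import sqlite3

# Rows copied from the EMPLOYEE2 slide (salaries written as plain integers).
employee2_rows = [
    (230, "Pillsbury", "Marketing", 58000, "C++", "2/12/06"),
    (241, "Marshall", "Finance", 68400, "SPSS", "5/30/07"),
    (241, "Marshall", "Finance", 68400, "Web Design", "11/2/08"),
    (277, "Marco", "Accounting", 66000, "C#", "12/8/07"),
    (279, "Gunston", "Marketing", 42400, "Java", "9/10/06"),
    (290, "Jaffe", "Planning", 49000, "Tax Acct", "4/22/06"),
    (290, "Jaffe", "Planning", 49000, "Bus Adm", "6/6/08"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee2 (emp_id INTEGER, name TEXT, dept TEXT, salary INTEGER,
                        course TEXT, date_completed TEXT);
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER);
CREATE TABLE emp_course (
    emp_id         INTEGER REFERENCES employee(emp_id),
    course         TEXT,
    date_completed TEXT,
    PRIMARY KEY (emp_id, course)
);
""")
conn.executemany("INSERT INTO employee2 VALUES (?, ?, ?, ?, ?, ?)", employee2_rows)

# Employee facts are stored once; course completions keep one row each.
conn.executescript("""
INSERT INTO employee SELECT DISTINCT emp_id, name, dept, salary FROM employee2;
INSERT INTO emp_course SELECT emp_id, course, date_completed FROM employee2;
""")

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
course_count = conn.execute("SELECT COUNT(*) FROM emp_course").fetchone()[0]
print(employee_count, course_count)
```

After the split, a salary change for employee 241 touches a single row in `employee` instead of two rows in EMPLOYEE2.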
  46. 46. 46 EXAMPLE TABLE: DENTIST-PATIENT (WITH ANOMALIES) Insert anomaly: No new dentist or patient record can be added unless an appointment has been made for that patient or dentist. Delete anomaly: If the appointment on 1/9/05 at 10 is canceled and deleted, the information about patient P100 would be gone, as the patient has only one appointment. Update anomaly: If patient P108 has a name change, it is possible only row 2 will get updated, not row 4.
  47. 47. 47 NORMALIZING TABLES On the previous four slides we presented an intuitive discussion of well-structured relations. We need a more formal procedure for designing them. Normalization is the process of successively reducing relations with anomalies to produce smaller, well-structured relations. Some of the goals are: • Minimize data redundancy, thereby avoiding anomalies and conserving storage space. • Simplify the enforcement of referential integrity constraints. • Make it easier to maintain data (insert, delete, update). • Provide a better design that is an improved representation of the real world and a stronger basis for future growth.
  48. 48. 48 NORMAL FORMS: 1NF, 2NF, AND 3NF We work through the normal forms successively for each table (entity) in our model. While there are more normal forms in addition to 1NF, 2NF, and 3NF, these three are essential. Work through 1NF first. Progressive: 1NF, then 2NF, then 3NF. There are 11 normal forms (we’ll focus on 1NF, 2NF, and 3NF only).
  49. 49. 49 1NF (FIRST NORMAL FORM) For 1NF, ensure the following: • Every attribute (or field) holds a single value in each row. • There are no repeating attributes. • Each attribute is ‘atomic’ (as small as it can get).
  50. 50. 50 FIRST NORMAL FORM All fields describe the entity represented by the table. All fields contain the simplest possible values. No multivalued attributes (also called repeating groups). Example: NOT 1st normal: Home Town = “Chicago, IL”. 1st normal: City = “Chicago”, State = “IL”.
  51. 51. 51 NOT 1NF
      EmpID  Dept          CourseName          DateCompleted
      203    Finance       Tax Accounting      6/22/07
      421    Info Systems  Java, Database Mgt  10/7/07, 6/4/06
      666    Marketing
      CourseName is a multivalued attribute; DateCompleted is another multivalued attribute.
  52. 52. 52 1NF - ELIMINATING MULTIVALUED ATTRIBUTES
      EmpID  Dept          CourseName      DateCompleted
      203    Finance       Tax Accounting  6/22/07
      421    Info Systems  Java            10/7/07
      421    Info Systems  Database Mgt    6/4/06
      666    Marketing
      This new table has only single-valued attributes and so satisfies 1NF. However, as we saw, the table still has some undesirable properties.
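The step from the "NOT 1NF" table to the 1NF table can be sketched as a small flattening routine. The data comes from the two slides above; the function name `to_1nf` and the use of Python lists to stand in for multivalued cells are assumptions made for this example.

```python
# Each row holds lists in the multivalued columns, mirroring the "NOT 1NF" slide.
non_1nf_rows = [
    (203, "Finance", ["Tax Accounting"], ["6/22/07"]),
    (421, "Info Systems", ["Java", "Database Mgt"], ["10/7/07", "6/4/06"]),
    (666, "Marketing", [], []),
]


def to_1nf(rows):
    """Emit one row per (employee, course) pair so every cell holds a single value."""
    flat = []
    for emp_id, dept, courses, dates in rows:
        if not courses:
            # Keep employees with no courses, using None for the empty cells.
            flat.append((emp_id, dept, None, None))
        for course, date_completed in zip(courses, dates):
            flat.append((emp_id, dept, course, date_completed))
    return flat


flat_rows = to_1nf(non_1nf_rows)
for row in flat_rows:
    print(row)
```

The repeated `421, Info Systems` values in the output are the redundancy that the later normal forms address.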
  53. 53. 53 UN-NORMALIZED ORDERS TABLE (FROM 1NF TO 2NF) Issues: • Making a change to a part description … • A part that appears in many rows. • The primary key in this table is (OrderNum, PartNum). So if we wanted to insert a new part into the table… • What if we deleted an order? Un-normalized Orders Table (McFarland, 2020)
  54. 54. 54 1NF TO 2NF When transitioning from 1NF to 2NF, what happens to the number of tables? Transitioning from 1NF to 2NF (McFarland, 2020)
  55. 55. 55 SECOND NORMAL FORM The table must be in 1st normal form first. 2nd normal form: no partial dependencies exist. (No non-key fields are determined by only part of a multiple-field primary key, i.e., non-key fields are identified by the whole primary key.) Examples (* = primary key): (Student ID*, Course#*) determines Grade, e.g., (12345, CIS 101) determines B, so Grade depends on the whole key. NOT 2nd normal: (Student ID*, Course#*, Name), e.g., (12345, CIS 101, Higgins), because Name is determined by Student ID alone.
  56. 56. 56 THIRD NORMAL FORM The table must be in 2nd normal form. No transitive dependencies (no non-key fields are determined by other non-key fields, i.e., non-key fields are identified by only the primary key). Examples (* = primary key): Course#* determines Textbook and Credits, e.g., (CIS 101, Intro to CIS, 3). NOT 3rd normal: (Course#*, Textbook, Book Price), e.g., (CIS 101, Intro to CIS, $45.99), because Course# determines Textbook and Textbook determines Book Price.
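A partial dependency like the Name example is usually removed by moving the partially dependent field into its own table. The sketch below assumes that decomposition; the table names are hypothetical, and the second course row (`CIS 102`, `A`) and the new name are invented so the effect of an update is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Name depends only on student_id, so it lives with the student.
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Grade depends on the whole key (student_id, course_no).
CREATE TABLE course_grade (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_no  TEXT NOT NULL,
    grade      TEXT,
    PRIMARY KEY (student_id, course_no)
);

INSERT INTO student VALUES (12345, 'Higgins');
INSERT INTO course_grade VALUES (12345, 'CIS 101', 'B'), (12345, 'CIS 102', 'A');
""")

# A name change now touches one row, however many courses the student has taken.
cursor = conn.execute(
    "UPDATE student SET name = 'Higgins-Lee' WHERE student_id = 12345")
rows_touched = cursor.rowcount
grade_rows = conn.execute("SELECT COUNT(*) FROM course_grade").fetchone()[0]
print(rows_touched, grade_rows)
```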
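The transitive dependency in the textbook example (Course# determines Textbook, Textbook determines Book Price) can be removed by giving the textbook its own table. This is a sketch under that assumption; table and column names are hypothetical, and the values are the ones shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Book price depends on the textbook, not on the course.
CREATE TABLE textbook (title TEXT PRIMARY KEY, book_price REAL NOT NULL);

CREATE TABLE course (
    course_no TEXT PRIMARY KEY,
    credits   INTEGER NOT NULL,
    textbook  TEXT NOT NULL REFERENCES textbook(title)
);

INSERT INTO textbook VALUES ('Intro to CIS', 45.99);
INSERT INTO course VALUES ('CIS 101', 3, 'Intro to CIS');
""")

# The price is still reachable from the course through a join.
row = conn.execute("""
    SELECT c.course_no, c.textbook, t.book_price
    FROM course AS c
    JOIN textbook AS t ON t.title = c.textbook
""").fetchone()
print(row)
```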
  57. 57. 57 FOURTH AND FIFTH NORMAL FORM • Fourth Normal Form: Table is in 3NF and has at most one multivalued dependency. Can produce records with many blank values. • Fifth Normal Form: Table is in 4NF and cannot be split into further tables without losing information.
  58. 58. 58 HOW TO GET STARTED WITH A DATABASE PROJECT 1. Explore the project a) Size, scope, depth and breadth b) Executive sponsor and/or funding 2. Capstone Specific: Review data to be modeled (for Capstone, NIH database) 3. Develop Statement of Work (define scope, depth, limits) 4. Develop ERD (entities, attributes, relationships) - Crow’s foot
  59. 59. 59 HOW TO GET STARTED WITH A DATABASE PROJECT 6. Normalize the Model (remove anomalies from the model) 7. Apply the Normalized Model a. Create the Database (create database) b. Create tables with fields using Data Definition Language (DDL) c. Write Data Manipulation Language (DML) statements to query (question) the data 8. Implement Functionality a. Use Python to extract data from a data source b. Load extracted data into the database c. Be able to report on the data loaded into the database
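Steps 7 and 8 can be sketched end to end with Python's standard library. Everything specific here is a placeholder assumption: the inline CSV text stands in for a real data source, and the `record` table and its columns are invented for the example rather than taken from the Capstone.

```python
import csv
import io
import sqlite3

# Hypothetical data source: inline CSV text standing in for a real file or API.
RAW_CSV = "item,category,amount\nalpha,A,120\nbeta,A,80\ngamma,B,45\n"

# 7a/7b: create the database and its table with DDL.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE record (
        item     TEXT PRIMARY KEY,
        category TEXT NOT NULL,
        amount   INTEGER NOT NULL
    )
""")

# 8a: extract rows from the source and convert types.
reader = csv.DictReader(io.StringIO(RAW_CSV))
rows = [(r["item"], r["category"], int(r["amount"])) for r in reader]

# 8b: load the extracted rows using parameterized inserts.
conn.executemany("INSERT INTO record VALUES (?, ?, ?)", rows)
conn.commit()

# 7c/8c: report on the loaded data with a DML query.
report = conn.execute("""
    SELECT category, SUM(amount)
    FROM record
    GROUP BY category
    ORDER BY category
""").fetchall()
print(report)
```

Parameterized `?` placeholders are used for the load so that values from the source are never spliced into the SQL text.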
  60. 60. 60 REFERENCES
      Draw.io. (2020). Diagrams.net - free flowchart maker and diagrams online. Retrieved November 23, 2020, from https://app.diagrams.net/
      FileMaker. (2020). File Maker Pro 16: Many-to-many relationships. Retrieved December 10, 2020, from https://fmhelp.filemaker.com/help/16/fmp/en/index.html
      SQLite Browser. (2020, November 09). DB Browser for SQLite. Retrieved November 23, 2020, from https://sqlitebrowser.org/
      SQLite. (2020). SQLite Main Website. Retrieved November 23, 2020, from https://sqlite.org/index.html
      McFarland, R. (2020). Published Articles: Ron McFarland. Retrieved December 03, 2020, from https://medium.com/@highervista
      Tutorialspoint. (2020). SQLite Tutorial. Retrieved November 23, 2020, from https://www.tutorialspoint.com/sqlite/index.htm
      West, M., & Fowler, J. (1999). Developing High Quality Data Models. The European Process Industries STEP Technical Liaison Executive (EPISTLE).
  61. 61. 61 INTRODUCTION Ron McFarland Technologist, Educator Source: Microsoft Images
  62. 62. 62 ABOUT THIS COURSE This course is distributed free. I use several sources. But importantly, I use the book noted on the next slide. If you are using these PowerPoints, please attribute Highervista, LLC and me (Ron McFarland). IN ADDITION, please attribute the author noted on the next slide, as the author’s textbook provides essential information for this course. Source: Microsoft Images
  63. 63. 63 INTRODUCTION This course is offered to you free. HOWEVER, please purchase the following book, as it is a primary resource for this course. I do not make any $ from this course or this book. So, since a handful of good content is derived from the following text, please support this author! Title: SQL Quickstart Guide Author: Walter Shields Available: Amazon, B&N, and through ClydeBank media website at: https://www.clydebankmedia.com/books/programming-tech/sql-quickstart-guide
