2. The Data Dictionary
• Is a reference work of data about data (metadata), one that is compiled by
the systems analyst to guide them through the analysis and design.
• It is where the systems analyst goes to define or look up information about
entities, attributes and relationships on the ERD (Entity Relationship
Diagram).
• Metadata is the information you see in the data dictionary.
3. Importance of a Data Dictionary
• Avoid duplication
• Allows better communication between organizations that share the same
database.
• Makes maintenance straightforward
• It is valuable for its capacity to cross-reference data items.
• Enables one description of a data item to be stored and accessed by all
individuals, so that a single definition for a data item is established and used.
4. Uses of Data Dictionary
• Validates the data flow diagram for completeness and accuracy.
• Provides a starting point for developing screens and reports.
• Determines the contents of data stored in files.
• Develop the logic for data flow diagram processes.
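As a rough illustration of the first use above, the sketch below checks whether every data flow named on a data flow diagram has an entry in the data dictionary. The flow names and the `find_undefined_flows` helper are hypothetical and exist only for this example.

```python
def find_undefined_flows(dfd_flows, dictionary_entries):
    """Return DFD data flow names that have no data dictionary entry."""
    defined = set(dictionary_entries)
    return sorted(name for name in set(dfd_flows) if name not in defined)


# Hypothetical data flow names taken from a diagram.
dfd_flows = ["Customer Order", "Order Confirmation", "Shipping Notice"]

# Hypothetical dictionary, keyed by data flow name.
dictionary_entries = {
    "Customer Order": "Order details sent by the customer",
    "Order Confirmation": "Acknowledgement returned to the customer",
}

missing = find_undefined_flows(dfd_flows, dictionary_entries)
print(missing)  # ['Shipping Notice']
```

A flow that shows up in `missing` is on the diagram but not yet described, so the diagram and the dictionary are out of step.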
5. The Data Repository
• Repository – a larger collection of project information.
It contains the following:
• Information about the data maintained by the system.
• Procedural logic
• Screen and report design
• Data relationships
• Project requirements and the final system deliverables.
• Project management information.
9. Four Categories of Data Dictionary
• Data Flows
• Data Structures
• Data Elements
• Data Stores
10. Defining the Data Flow
• Data flow is a collection of data elements
• It is the first component to be defined.
• Elements / Fields – used to describe details of each data flow.
• Data Structure – group of elements.
11. Fields of a Data Flow Description
• ID
• Name
• Description
• Source of the Data Flow
• Destination of the Data Flow
• Type of Data Flow
• Name of Data Structure
• Volume per unit of time
• Comments / Notations
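The fields above could be held in a single record. The following is a minimal sketch, assuming a Python dataclass is an acceptable stand-in for the paper or CASE-tool form. The class name `DataFlowEntry` and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataFlowEntry:
    """One data dictionary entry describing a data flow."""
    name: str
    description: str
    source: str
    destination: str
    flow_type: str              # e.g. "File", "Screen", "Report" or "Internal"
    data_structure: str         # name of the data structure carried by the flow
    volume_per_unit_time: str   # e.g. records per day
    comments: str = ""
    flow_id: Optional[str] = None   # the ID is optional


# Hypothetical sample entry.
entry = DataFlowEntry(
    name="Customer Order",
    description="Order details entered by a customer",
    source="Customer",
    destination="Process 1",
    flow_type="Screen",
    data_structure="Order Information",
    volume_per_unit_time="10/hour",
)
print(entry.name, "->", entry.destination)
```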
12. Describing Data Structures
• Data structures are usually described using algebraic notation.
• An equal sign = means “is composed of”.
• A plus sign + means “and”.
• Braces { } indicate repetitive elements, also called repeating groups or tables.
• Brackets [ ] represent an either/or situation.
• Parentheses ( ) represent an optional element.
• Each structural record must be further defined until the entire set is broken
down into its component elements.
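The notation above can be tried out with a few string helpers. This is only a sketch: the helper names are hypothetical, the structures are invented for illustration, and the `|` separator between either/or alternatives is an assumption, since the slide does not say how alternatives are separated.

```python
def optional(item: str) -> str:
    """Parentheses mark an optional element."""
    return f"({item})"


def repeating(item: str) -> str:
    """Braces mark a repeating group."""
    return "{" + item + "}"


def either(*items: str) -> str:
    """Brackets mark an either/or situation ('|' separator is assumed)."""
    return "[" + " | ".join(items) + "]"


def compose(name: str, *parts: str) -> str:
    """'=' means 'is composed of' and '+' means 'and'."""
    return f"{name} = " + " + ".join(parts)


customer_name = compose(
    "Customer Name", "First Name", optional("Middle Initial"), "Last Name"
)
customer_order = compose(
    "Customer Order",
    "Customer Name",
    repeating("Order Line"),
    either("Credit Card", "Check"),
)

print(customer_name)
# Customer Name = First Name + (Middle Initial) + Last Name
print(customer_order)
```

Here `Customer Name` appears inside `Customer Order` and is then defined on its own line, which mirrors the rule that each structural record is broken down until only elements remain.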
13. How are the symbols used?
• = “is composed of”
• + “and”
• { } Repeating items
• [ ] “either/or” situation
• ( ) Optional element
15. Logical and Physical Data Structures
• Logical Data Structure – shows what data the business needs for its day-
to-day operations. Ex. Name, Address, Orders.
• Physical Data Structure – includes additional elements necessary for
implementing the system.
Examples of physical design elements:
• Key fields used to locate records.
• Codes to identify the status of master records.
• Transaction codes
• Repeating group entries containing a count of how many items are in the
group.
• Limits on the number of items in a repeated group.
• A password
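To make the distinction concrete, here is a small sketch that compares a logical structure with a physical one and lists the elements added purely for implementation. The element names `Customer Number`, `Status Code` and `Order Count` are hypothetical examples of a key field, a status code and a repeating-group count.

```python
# Logical structure: what the business needs day to day.
LOGICAL_CUSTOMER = ["Name", "Address", "Orders"]

# Physical structure: the same data plus implementation elements.
PHYSICAL_CUSTOMER = [
    "Customer Number",  # key field used to locate the record (hypothetical)
    "Status Code",      # status of the master record (hypothetical)
    "Name",
    "Address",
    "Order Count",      # how many items are in the repeating group (hypothetical)
    "Orders",
]


def implementation_elements(logical, physical):
    """Return elements found only in the physical structure, in order."""
    business = set(logical)
    return [name for name in physical if name not in business]


print(implementation_elements(LOGICAL_CUSTOMER, PHYSICAL_CUSTOMER))
# ['Customer Number', 'Status Code', 'Order Count']
```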
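16. Other examples:
• {Monthly Sales} with a limit of 12 – indicates the 12 months in a year.
• Customer Master File = {Customer Records} – no limit is given, which means
the group repeats indefinitely.
• {Order Line} with a lower limit of 1 and an upper limit of 5 – Order Line
serves both as a structural record and as a repeating item, based on the figure.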
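Limits on a repeating group can be checked with a simple count. The sketch below assumes the three examples above are read as exactly 12 occurrences, no limit, and between 1 and 5 occurrences. The function name `within_limits` and the sample data are hypothetical.

```python
from typing import Optional, Sequence


def within_limits(items: Sequence, lower: int = 0,
                  upper: Optional[int] = None) -> bool:
    """Check that a repeating group has an allowed number of occurrences.

    upper=None means the group may repeat indefinitely.
    """
    count = len(items)
    if count < lower:
        return False
    return upper is None or count <= upper


monthly_sales = [100.0] * 12            # one value per month
customer_records = ["record"] * 2500    # no upper limit given
order_lines = ["line 1", "line 2"]      # between 1 and 5 lines allowed

print(within_limits(monthly_sales, lower=12, upper=12))  # True
print(within_limits(customer_records))                   # True
print(within_limits(order_lines, lower=1, upper=5))      # True
print(within_limits([], lower=1, upper=5))               # False
```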
17. Data Elements
• Data element definitions describe a data type.
• Each element should also be defined to indicate specifically what it
represents.
26. Analyzing Input and Output
An important step in creating the data dictionary is to identify and
categorize system input and output data flows.
27. Different fields for Input and
Output Analysis:
1. Descriptive Name
2. User Contact
3. File Type (is it an Input or Output?)
4. File Format
5. Sequencing Elements
6. List of Elements
7. Comments
28. Developing Data Stores
Data flows represent data in motion; data stores represent data at rest.
Data stores contain information of a permanent or semi-permanent nature.
When data stores are created for only one report or screen, we refer to them as
“user views”.
30. Conclusions
The ideal data dictionary is automated, interactive, online and
evolutionary.
The data dictionary should be tied into a number of systems programs so
that when an item is updated or deleted from the data dictionary, it is
automatically updated or deleted from the database.
The data dictionary may also be used to create screens, reports and
forms.
-Fin.-
Editor's Notes
Information in the Data Flow.
ID – optional identification number. Sometimes the ID is coded using a scheme to identify the system and the application within the system.
Name – the text that appears on the DFD and is referenced in all descriptions using the data flow.
Description – what the data flow is all about.
Source of the data flow – where the data flow comes from. It could be an entity, a process or a data flow coming from a data store.
Destination of the data flow – where the data flow goes to. It could also be an entity, a process or a data flow going to a data store.
Type of data flow
- a record entering or leaving a file (File).
- a record containing a report or flow (Report, Flow).
- a GUI or a Web page (Screen).
- data used between processes (Internal).
Name of the Data Structure – describing the elements found in this data flow.
Volume per unit of time – the data could be records per day or any other unit of time.
Comments/Notations – for further clarification of the data flow.
Optional elements and other elements such as (MIDDLE INITIAL), (APARTMENT) and (ZIP EXPANSION) do not reflect the functional area in which they are used. Elements like these allow the analyst to define these records once and use them in many different applications.
Information in the Logical and Physical Data Structures.
Key fields used to locate records – like item number, which is not required for a business to function but is necessary for identifying and locating computer records.
Codes to identify the status of master records – like whether an employee is active or inactive. Such codes can be maintained on files that produce tax information.
Transaction codes - used to identify types of records when a file contains different records.
Repeating group entries containing a count of how many items are in the group.
Limits on the number of items in a repeated group.
A password – used by a customer accessing a secure Web site.
Information in the Data Elements.
ID – optional identification number. Sometimes the ID is coded using a scheme to identify the system and the application within the system.
Name – the text that appears on the DFD and is referenced in all descriptions using the data element.
Aliases – synonyms or other names for the element.
Description – what the data element is all about.
Length of an element – stored length of an element.
*Numeric amount lengths – should be large enough to hold the summation of numbers (totals).
*Names and address fields
*If the element is too small, the data that need to be entered will be truncated.
Type of data – Numeric, date, alphabetic or character, which is sometimes called alphanumeric or text data.
Zoned Decimal – used for printing and displaying data.
Packed Decimal – commonly used to save space on file layouts and for elements that require a high level of arithmetic to be performed on them.
Binary Format – suitable for the same purposes as packed decimal format but is less commonly used.
Base / Derive
Base - initially keyed into the system. It must be stored in files.
Derived – created by processes as the result of calculations or logic.
Input and Output Formats – should be included, using special coding symbols to indicate how the data should be presented
Example:
XXXXXXXX = X(8) – If the same character repeats several times, the group is replaced by the character followed by a number in parentheses indicating how many times the character repeats.
Default Value – used to reduce the amount of keying that the operator may have to do.
Comments/Notations – for further clarification of the data element.
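The repeat-count shorthand for formats can be expanded and compressed mechanically. The sketch below assumes a shorthand of the form character-followed-by-count, so that `X(8)` stands for eight `X` characters. The function names and the `min_run` threshold are hypothetical choices made for this example.

```python
import re
from itertools import groupby


def expand_picture(picture: str) -> str:
    """Expand shorthand such as 'X(8)' into 'XXXXXXXX'."""
    return re.sub(
        r"(.)\((\d+)\)",
        lambda m: m.group(1) * int(m.group(2)),
        picture,
    )


def compress_picture(picture: str, min_run: int = 4) -> str:
    """Replace runs of at least min_run identical characters with char(count)."""
    parts = []
    for char, run in groupby(picture):
        length = len(list(run))
        parts.append(f"{char}({length})" if length >= min_run else char * length)
    return "".join(parts)


print(expand_picture("X(8)"))        # XXXXXXXX
print(expand_picture("9(3).99"))     # 999.99
print(compress_picture("XXXXXXXX"))  # X(8)
```

Short runs are left as they are because `99` is easier to read than `9(2)`. That threshold is a readability choice, not a rule taken from the notes.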
Information in the Data Elements.
Validation Criteria – for ensuring that accurate data are captured by the system.
Two Classifications
Discrete – certain fixed values, taken from a list of values.
Continuous – a smooth range of values.
Comments/Notations – for further clarification of the data element.
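The two classes of validation criteria could be checked as follows. This is a minimal sketch: the status codes and the quantity range are hypothetical values chosen for illustration, not values taken from the notes.

```python
def is_valid_discrete(value, allowed_values) -> bool:
    """Discrete criterion: the value must appear in a fixed list of values."""
    return value in allowed_values


def is_valid_continuous(value, minimum, maximum) -> bool:
    """Continuous criterion: the value must fall within a range (inclusive)."""
    return minimum <= value <= maximum


# Hypothetical discrete element: status of a master record.
STATUS_CODES = {"A": "Active", "I": "Inactive"}

# Hypothetical continuous element: quantity ordered, from 1 to 999.
QUANTITY_RANGE = (1, 999)

print(is_valid_discrete("A", STATUS_CODES))        # True
print(is_valid_discrete("X", STATUS_CODES))        # False
print(is_valid_continuous(25, *QUANTITY_RANGE))    # True
print(is_valid_continuous(1000, *QUANTITY_RANGE))  # False
```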
Information in the Data Store.
ID – optional identification number. Sometimes the ID is coded using a scheme to identify the system and the application within the system.
Name – the text that appears on the DFD and is referenced in all descriptions using the data store.
Aliases – synonyms or other names for the data store.
Description – what the data store is all about.
File Type – computerized or manual
File Format – whether a database (table) or the format of a traditional file.
Maximum and Average Number of Records – information that helps the analyst predict the amount of disk space required for the application and is necessary for hardware acquisition planning.
Data Set Name – specifies the file name.
Data Structure – should use a name found in the data dictionary, providing a link to the elements for this data store.
Primary and Secondary Keys – used to locate records directly.
Comments/Notations – for further clarification of the data store.
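The record counts in a data store entry can feed a rough disk space estimate. The sketch below assumes fixed-length records whose size is the sum of the stored element lengths, and it ignores indexes and file system overhead. The element names, lengths and record counts are hypothetical.

```python
def record_length(element_lengths: dict) -> int:
    """Stored length of one record: the sum of its element lengths."""
    return sum(element_lengths.values())


def estimate_storage_bytes(record_count: int, element_lengths: dict) -> int:
    """Rough size of a data store, ignoring indexes and other overhead."""
    return record_count * record_length(element_lengths)


# Hypothetical element lengths, as they might be recorded for each data element.
customer_elements = {"Customer Number": 6, "Name": 30, "Address": 60}

maximum_records = 50_000
average_records = 20_000

print(record_length(customer_elements))                            # 96
print(estimate_storage_bytes(maximum_records, customer_elements))  # 4800000
print(estimate_storage_bytes(average_records, customer_elements))  # 1920000
```

The maximum figure is the one to use for hardware planning. The average gives a sense of typical usage.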