DBMS Intro - Collection of Info Organized for Easy Access

Introduction to DBMS
Definition Of Database –
A database is a collection of information that is organized so that it can easily be
accessed, managed, and updated. In one view, databases can be classified
according to types of content: bibliographic, full-text, numeric, and images.
In computing, databases are sometimes classified according to their
organizational approach. The most prevalent approach is the relational database,
a tabular database in which data is defined so that it can be reorganized and
accessed in a number of different ways.
 A distributed database is one that can be dispersed or replicated among
different points in a network.
 An object-oriented programming database is one that is congruent with the
data defined in object classes and subclasses.
 Computer databases typically contain aggregations of data records or files,
such as sales transactions, product catalogs and inventories, and customer
profiles. Typically, a database manager provides users the capabilities of
controlling read/write access, specifying report generation, and analyzing
usage

Database –
Often abbreviated DB. A collection of information organized in such a way that a
computer program can quickly select desired pieces of data. You can think of a
database as an electronic filing system.
To access information from a database, you need a database management system
(DBMS). This is a collection of programs that enables you to enter, organize, and
select data in a database.
Database Management System (DBMS) -
A database management system (DBMS) is a software package designed to define,
manipulate, retrieve and manage data in a database. A DBMS generally
manipulates the data itself, the data format, field names, record structure and file
structure. It also defines rules to validate and manipulate this data. A DBMS
relieves users of framing programs for data maintenance. Fourth-generation query
languages, such as SQL, are used along with the DBMS package to interact with a
database.

Data Base Management System
DBMS is a collection of programs that enables users to create and maintain a
data base. DBMS is a general purpose software system that provides the
process of (i) Defining (ii) Constructing & (iii) manipulating database for
applications.
• Defining : It involves specifying the data type structures, and constraints
for the data to be stored in the data base.
• Constructing : It is the process of storing the data itself on some storage
medium that is controlled by the DBMS.
• Manipulating : It is a data base includes function as querying the database
to retrieve specific data, updating the database to reflect changes.

Types Of Database organization
There are four main types of database organization:
Relational Database: Data is organized as logically independent tables.
Relationships among tables are shown through shared data. The data in one table
may reference similar data in other tables, which maintains the integrity of the
links among them. This feature is referred to as referential integrity - an important
concept in a relational database system. Operations such as select and join can be
performed on these tables. This is the most widely used system of database
organization.
Flat Database: Data is organized in a single kind of record with a fixed number of
fields. This database type encounters more errors due to the repetitive nature of
data.
Object Oriented Database: Data is organized with similarity to object oriented
programming concepts. An object consists of data and methods, while classes
group objects having similar data and methods.
Hierarchical Database: Data is organized with hierarchical relationships. It
becomes a complex network if the one-to-many relationship is violated.

Database Environment
CASE
Tools
User
Interface
DBMS
Application
Programs
Repository Database

Database Components
DBMS
===============
Design tools
Table Creation
Form Creation
Query Creation
Report Creation
Procedural
language
compiler (4GL)
=============
Run time
Form processor
Query processor
Report Writer
Language Run time
Application
Programs
User
Interface
Applications
Database
Database contains:
User’s Data
Metadata
Indexes
Application Metadata

Types of Database Systems
PC Databases Centralized Databases
Central
Computer

Types of Database Systems
• Client Server Databases
Network
Client
Client
Client
Database
Server

From File Systems to DBMS
Problems with file processing systems
• Inconsistent data
• Inflexibility
• Limited data sharing
• Poor enforcement of standards
• Excessive program maintenance
DBMS Benefits
• Minimal data redundancy
• Consistency of data
• Integration of data
• Sharing of data
• Ease of application development
• Uniform security, privacy, and integrity controls
• Data accessibility and responsiveness
• Data independence
• Reduced program maintenance

Database Management System Vs File
Management System
• File Management Systems
Advantages Disadvantages
Simpler to use Typically does not support multi-user access
Less expensive· Limited to smaller databases
Fits the needs of many small businesses
and home users
Limited functionality (i.e. no support for
complicated transactions, recovery, etc.)
Popular FMS’s are packaged along with the
operating systems of personal computers
(i.e. Microsoft Cardfile and Microsoft
Works)
Decentralization of data
Good for database solutions for hand held
devices such as Palm Pilot
Redundancy and Integrity issues

Database Management System Vs File
Management System
Database Management Systems
Advantages Disadvantages
Simpler to use Typically does not support multi-user access
Less expensive· Limited to smaller databases
Fits the needs of many small businesses and
home users
Limited functionality (i.e. no support for
complicated transactions, recovery, etc.)
Popular FMS’s are packaged along with the
operating systems of personal computers
(i.e. Microsoft Cardfile and Microsoft
Works)
Decentralization of data
Good for database solutions for hand held
devices such as Palm Pilot
Redundancy and Integrity issues

File Oriented Approach
The file processing system or File Management System is used to store data in a computerized
data base. Before the advent of DBMS, the application programs are defined & maintained in a
master file and other supporting transaction files. Hence one master file with one or more
transaction files are used. Here each user defines & implements the files needed for specific
applications as part of programming application. The draw backs of file oriented approach are
following:
(1) It leads to Data redundancy : The same data may be present in more
than one file.
(2) Wasted Memory Space : Redundant or duplicate copies of the same data
results in wastage of storage space.
(3) Loss of data Integrity : Data redundancy leads to inconsistency problems.
It results that the same data item has a different value in different files.
(4) Difficulty in accessing data : To access data from the files, programming
language programs must be written.
(5) Information is available in reports : Information is available only through
reports, in between queries are not supported.
(6) Data Isolation : Data are accessed through reports. This results in data
isolation.
(7) Inadequate Security : Only operating system level security is provided. In
depth security is not provided.
(8) High Maintenance Cost : Software maintenance cost will be very expense.
(9) Each data file of an application is a separate entity.

Traditional Approach for Data Storage and the Need of DBMS
• Traditional Data Storage Model
In traditional approach, information is stored in flat files which are
maintained by the file system under the operating system’s control.
Application programs go through the file system in order to access these flat
files

Traditional Approach for Data Storage and the Need of DBMS
How data is stored in flat files
• Data is stored in flat files as records.
• Records consist of various fields which are delimited by a space, comma,
pipe, any special character etc.
• End of records and end of files will be marked using any predetermined
character set or special characters in order to identify them
• Example: Storing employee data in flat files

Problems with traditional approach for storing data
• Data Security
The data stored in the flat file(s) can be easily accessible and hence it is not secure.
Example: Consider an online banking application where we store the account
related information of all customers in flat files. A customer will have access only to
his account related details. However from a flat file, it is difficult to put such
constraints. It is a big security issue.
• Data Redundancy
In this storage model, the same information may get duplicated in two or more files.
This may lead to higher storage and access cost. it also may lead to data
inconsistency.
For Example, assume the same data is repeated in two or more files. If a change
is made to data stored in one file, other files also needs to be change accordingly.
Example: Assume employee details such as first name, last name, emailed are
stored in employee details file and employee salary file. If a change needs to be
made to emailed, both employee details file and emplyee_salary file need to be
updated otherwise it will lead to inconsistent data.

• Data Isolation
Data Isolation means that all the related data is not available in one file. Usually the
data is scattered in various files having different formats. Hence writing new
application programs to retrieve the appropriate data is difficult.
• Lack of Flexibility
The traditional systems are able to retrieve information for predetermined requests for
data. If we need unanticipated data, huge programming effort is needed to make the
information available, provided the information is there in the files. By the time the
information is made available, it may no longer be required or useful.
Example :
Consider a software application which is able to generate employee salary report.
Assume that all the data is stored in flat files. Suppose we now have a requirement to
retrieve all the employee details whose salary is greater than Rs.10000. It is not easy to
generate such on-demand reports and lot of time is needed for application developers to
modify the application to meet such requirements.

• Program/Data Dependence
In traditional file approach, application programs are closely dependent on the files in
which data is stored. If we make any changes in the physical format of the file(s), like
addition of a data field , etc, all application programs needs to be changed accordingly.
Consequently, for each of the application programs that a programmer writes or maintains,
the programmer must be concerned with data management. There is no centralized
execution of the data management functions. Data management is scattered among all the
application programs.
Example: Consider the banking system. An employee salary file exists which has details
about the salary of employees. An employee salary record is described by
• employee_id
• firstname
• lastname
• salary_amount
• An application program is available to display all the details about the salary of all
employees. Assume a new data field, the date_of_joining is added to the employee
salary file. Since the application program depends on the file, it also needs to be altered.
• If the physical format of the employee salary file for example the field delimiter, record
delimiter, etc. are changed, it necessitates that the application program which depends
on it, also be altered.

• Concurrent Access Anomalies
Many traditional systems allow multiple users to access and update the same
piece of data simultaneously. However this concurrent updates may result in
inconsistent data. To guard against this possibility, the system must maintain
some form of supervision. But supervision is difficult because data may be
accessed by many different application programs and these application
programs may not have been coordinated previously.
• Example :
Consider a personal information system which has the data of all employees.
Now there may be an employee updating his address details in the system
and at the same time, an administrator may be taking a report containing the
data of all employees. This is called concurrent access. Since the employee's
address is being updated at the same time, there is a possibility of the
administrator reading an incorrect address.
These difficulties lead to the development of database systems.

Master and Transaction Files
• Master File
 The data stored in these files are permanent by nature
 This file is empty while nature
 This files are updated only through recent transactions
 This file stores large amount of data
Example : customer ledgers, student database
• Transaction File
 The data stored in these files are temporary by nature
 This file contains data only for period of time and send to the master file
 Any data to be modified is done in this file
 In this file the data to be modified is stored .
Example : price of the products, customers order for the products, inserting new
data to the database etc.

Components of DBMS Environment
A database management system (DBMS) consists of several components.
Each component plays very important role in the database management
system environment. The major components of database management
system are:
 Software
 Hardware
 Data
 Procedures
 Database Access Language

Software
• The main component of a DBMS is the software. It is the set of programs
used to handle the database and to control and manage the overall
computerized database
• DBMS software itself, is the most important software component in the
overall system
• Operating system including network software being used in network, to
share the data of database among multiple users.
• Application programs developed in programming languages such as
C++, Visual Basic that are used to access database in database
management system. Each program contains statements that request
the DBMS to perform operation on database. The operations may
include retrieving, updating, deleting data etc. . The application
program may be conventional or online workstations or terminals.

Hardware
Hardware consists of a set of physical electronic devices such as computers
(together with associated I/O devices like disk drives), storage devices, I/O
channels, electromechanical devices that make interface between
computers and the real world systems etc. and so on. It is impossible to
implement the DBMS without the hardware devices, In a network, a
powerful computer with high data processing speed and a storage device
with large storage capacity is required as database server.
Data
Data is the most important component of the DBMS. The main purpose of
DBMS is to process the data. In DBMS, databases are defined, constructed
and then data is stored, updated and retrieved to and from the databases.
The database contains both the actual (or operational) data and the
metadata (data about data or description about data).

Procedures
• Procedures refer to the instructions and rules that help to design the
database and to use the DBMS. The users that operate and manage
the DBMS require documented procedures on hot use or run the
database management system. These may include.
• Procedure to install the new DBMS.
• To log on to the DBMS.
• To use the DBMS or application program.
• To make backup copies of database.
• To change the structure of database.
• To generate the reports of data retrieved from database.

Database Access Language
• The database access language is used to access the data to and from the database. The
users use the database access language to enter new data, change the existing data in
database and to retrieve required data from databases. The user write a set of
appropriate commands in a database access language and submits these to the DBMS.
The DBMS translates the user commands and sends it to a specific part of the DBMS called
the Database Jet Engine. The database engine generates a set of results according to the
commands submitted by user, converts these into a user readable form called an Inquiry
Report and then displays them on the screen. The administrators may also use the
database access language to create and maintain the databases.
• The most popular database access language is SQL (Structured Query Language).
Relational databases are required to have a database query language.

• Users
The users are the people who manage the databases and perform different operations
on the databases in the database system. There are three kinds of people who play
different roles in database system
 Application Programmers
 Database Administrators
 End-Users

• Application Programmers
The people who write application programs in programming languages (such as
Visual Basic, Java, or C++) to interact with databases are called Application
Programmer.
• Database Administrators
A person who is responsible for managing the overall database management system
is called database administrator or simply DBA.
• End-Users
The end-users are the people who interact with database management system to
perform different operations on database such as retrieving, updating, inserting,
deleting data etc.

Database Schema versus Database Instance
While working with any data model, it is necessary to distinguish between the overall
design or description of the database (database schema) and the database itself. As
discussed earlier, the overall design or description of the database is known
as database schema or simply schema. The database schema is also known
as intension of the database, and is specified while designing the
database.
Figure 1.6 shows the database schema for Online Book database.

The data in the database at a particular point of time is known as database instance or
database state or snapshot. The database state is also called an extension of the
schema.
The various states of the database are given here.
• Empty state: When a new database is defined, only its schema is specified. At this
point, the database is said to be in empty state as it contains no data.
• Initial state: When the database is loaded with data for the first time, it is said to
be in initial state.
• Current state: The data in the database is updated frequently. Thus, at any point
of time, the database is said to be in the current state

The DBMS is responsible to check whether the database state is valid state. Thus, each time the database is
updated, DBMS ensures that the database remains in the valid state. The DBMS refers to DBMS catalog
where the metadata is stored in order to check whether the database state satisfies the structure and
constraints specified in the schema. Figure 1.7 shows an example of an instance for PUBLISHER schema.
Fig. 1.7. An example of instance - PUBLISHER instance
P_ID Pname Address State Phone Email_id
P001 Hills Publications 12, Park street, Atlanta Georgia 7134019 h_pub@hills.com
P002 Sunshine
Publishers Ltd.
45, Second street, Newark New Jersey 6548909 Null
P003 Bright Publications 123, Main street, Honolulu Hawai 7678985 bright@bp.com
P004 Paramount
Publishing House
789, Oak street, New York New York 9254834 param_house@para
m.com
P005 Wesley
Publications
456, First street, Las Vegas Nevada 5683452 Null
The schema and instance can be compared with a program written in a programming language. The
database schema is similar to a variable declared along with the type description in a program. The
variable contains a value at a given point of time. This value of a variable corresponds to an instance of a
database schema

DBMS Architecture and Data Independence
The DBMS architecture describes how data in the database is viewed by the users. It is
not concerned with how the data is handled and processed by the DBMS. The database
users are provided with an abstract view of the data by hiding certain details of how
data is physically stored. This enables the users to manipulate the data without
worrying about where it is located or how it is actually stored.
In this architecture, the overall database description can be defined at three levels,
namely, internal, conceptual, and external levels and thus, named three-level DBMS
architecture.
The three levels are discussed here.
Internal level:
It is the lowest level of data abstraction that deals with the physical representation of
the database on the computer and thus, is also known as physical level. It
describes how the data is physically stored and organized on the storage medium. At
this level, various aspects are considered to achieve optimal runtime performance and
storage space utilization. These aspects include storage space allocation techniques for
data and indexes, access paths such as indexes, data compression and encryption
techniques, and record placement

Conceptual level:
This level of abstraction deals with the logical structure of the entire database and thus, is
also known as logical level. It describes what data is stored in the database, the
relationships among the data and complete view of the user’s requirements without any
concern for the physical implementation. That is, it hides the complexity of physical
storage structures. The conceptual view is the overall view of the database and it includes
all the information that is going to be represented in the database.
External level:
It is the highest level of abstraction that deals with the user’s view of the database and
thus, is also known as view level. In general, most of the users and application programs
do not require the entire data stored in the database. The external level describes a part of
the database for a particular group of users. It permits users to access data in a way that is
customized according to their needs, so that the same data can be seen by different users
in different ways, at the same time. In this way, it provides a powerful and flexible
security mechanism by hiding the parts of the database from certain users, as the user is
not aware of existence of any attributes that are missing from the view.

These three levels are used to describe the schema of the database at various levels.
Thus, the three-level architecture is also known as three-schema architecture. The
internal level has an internal schema, which describes the physical storage structure
of the database. The conceptual level has a conceptual schema, which describes the
structure of entire database. The external level has external schemas or user
views, The three-schema architecture is shown in Figure 1.1.

DBMS Architecture Example
Online Book database as shown in Figure 1.2. In this figure, two views (view 1 and view 2) of
the BOOK file have been defined at the external level. Different database users can see these views. The
details of the data types are hidden from the users. At the conceptual level, the BOOK records are
described by a type definition. The application programmers and the DBA generally work at this level of
abstraction. At the internal level, the BOOK records are described as a block of consecutive storage
locations such as words or bytes. The database users and the application programmers are not aware of
these details; however, the DBA may be aware of certain details of the physical organization of the data.

Advantages of Database System
Database approach came into existence due to the bottlenecks of file processing system. In the
database approach, the data is stored at a central location and is shared among multiple users.
Thus, the main advantage of DBMS is centralized data management. The centralized nature of
database system provides several advantages, which overcome the limitations of the
conventional file processing system. These advantages are listed here.
• Controlled data redundancy:
During database design, various files are integrated and each logical data item is stored at
central location. This eliminates replicating the data item in different files, and ensures
consistency and saves the storage space. Note that the redundancy in the database systems
cannot be eliminated completely as there could be some performance and technical reasons
for having some amount of redundancy. However, the DBMS should be capable of controlling
this redundancy in order to avoid data inconsistencies.
• Enforcing data integrity:
In database approach, enforcing data integrity is much easier. Various integrity constraints are
identified by database designer during database design. Some of these data integrity
constraints can be enforced automatically by the DBMS, and others may have to be checked by
the application programs.
• Data sharing:
The data stored in the database can be shared among multiple users or application programs.
Moreover, new applications can be developed to use the same stored data. Due to shared data,
it is possible to satisfy the data requirements of the new applications without having to create
any additional data or with minimal modification.

• Ease of application development
The application programmer needs to develop the application programs according to the users’
needs. The other issues like concurrent access, security, data integrity, etc., are handled by the
DBMS itself. This makes the application development an easier task.
• Data security
Since the data is stored centrally, enforcing security constraints is much easier. The DBMS
ensures that the only means of access to the database is through an authorized channel. Hence,
data security checks can be carried out whenever access is attempted to sensitive data. To
ensure security, a DBMS provides security tools such as user codes and passwords. Different
checks can be established for each type of access (addition, modification, deletion, etc.) to each
piece of information in the database.
• Multiple user interfaces
In order to meet the needs of various users having different technical knowledge, DBMS
provides different types of interfaces such as query languages, application program interfaces,
and graphical user interfaces (GUI) that include forms-style and menu-driven interfaces. A
form-style interface displays a form to each user and user interacts using these forms. In menu-driven
interface, the user interaction is through lists of options known as menus.

• Backup and recovery:
The DBMS provides backup and recovery subsystem that is responsible for recovery from hardware and
software failures. For example, if the failure occurs in between the transaction, the DBMS recovery
subsystem either reverts back the database to the state which existed prior to the start of the transaction or
resumes the transaction from the point it was interrupted so that its complete effect can be recorded in the
database.
• Data abstraction:
The property of DBMS that allows program-data independence is known as data abstraction. Data
abstraction allows the database system to provide an abstract view of the data to its users without giving
the physical storage and implementation details.
• Supports multiple views of the data:
A database can be accessed by many users and each of them may have a different perspective or view of the
data. A database system provides a facility to define different views of the data for different users. A view is
a subset of the database that contains virtual data derived from the database files but it does not exist in
physical form. That is, no physical file is created for storing the data values of the view; rather, only the
definition of the view is stored.

Disadvantage of DBMS
Although there are many advantages of DBMS, the DBMS may also have some minor
disadvantages. These are:
• Cost of Hardware and Software
A processor with high speed of data processing and memory of large size is required to run the DBMS software. It means that you have
to up grade the hardware used for file-based system. Similarly, DBMS software is also very costly,.
• Cost of Data Conversion
When a computer file-based system is replaced with database system, the data stored into data file must be converted to database file. It
is very difficult and costly method to convert data of data file into database. You have to hire database system designers along with
application programmers. Alternatively, you have to take the services of some software house. So a lot of money has to be paid for
developing software.
• Cost of Staff Training
Most database management system are often complex systems so the training for users to use the DBMS is required. Training is required
at all levels, including programming, application development, and database administration. The organization has to be paid a lot of
amount for the training of staff to run the DBMS.
• Appointing Technical Staff
The trained technical persons such as database administrator, application programmers, data entry operations etc. are required to handle
the DBMS. You have to pay handsome salaries to these persons. Therefore, the system cost increases.
• Database Damage
In most of the organization, all data is integrated into a single database. If database is damaged due to electric failure or database is
corrupted on the storage media, the your valuable data may be lost forever.

Difference between Centralized and Distributed databases
Centralized database is a database in which data is stored and maintained in a single location.
This is the traditional approach for storing data in large enterprises. Distributed database is a
database in which data is stored in storage devices that are not located in the same physical
location but the database is controlled using a central Database Management System (DBMS).
In a centralized database, all the data of an organization is stored in a single place such as a
mainframe computer or a server. Users in remote locations access the data through the Wide
Area Network (WAN) using the application programs provided to access the data. The
centralized database (the mainframe or the server) should be able to satisfy all the requests
coming to the system, therefore could easily become a bottleneck. But since all the data reside in
a single place it easier to maintain and back up data. Furthermore, it is easier to maintain data
integrity, because once data is stored in a centralized database, outdated data is no longer
available in other places

Distributed database
• In a distributed database, the data is stored in storage devices that are located in
different physical locations. They are not attached to a common CPU but the database is
controlled by a central DBMS. Users access the data in a distributed database by
accessing the WAN. To keep a distributed database up to date, it uses the replication and
duplication processes. The replication process identifies changes in the distributed
database and applies those changes to make sure that all the distributed databases look
the same. Depending on the number of distributed databases, this process could become
very complex and time consuming. The duplication process identifies one database as a
master database and duplicates that database. This process is not complicated as the
replication process but makes sure that all the distributed databases have the same data
.

What is the difference between Distributed distributed-database-
and-vs-centralized-database
 While a centralized database keeps its data in storage devices that are in a single
location connected to a single CPU, a distributed database system keeps its data in
storage devices that are possibly located in different geographical locations and
managed using a central DBMS.
 A centralized database is easier to maintain and keep updated since all the data are
stored in a single location. Furthermore, it is easier to maintain data integrity and
avoid the requirement for data duplication. But, all the requests coming to access
data are processed by a single entity such as a single mainframe, and therefore it
could easily become a bottleneck. But with distributed databases, this bottleneck can
be avoided since the databases are parallelized making the load balanced between
several servers.
 keeping the data up to date in distributed database system requires additional
work, therefore increases the cost of maintenance and complexity and also requires
additional software for this purpose.
 Furthermore, designing databases for a distributed database is more complex than
centralized data base .

Users are of 4 types:
Types of users in DBMS
1. Application programmers or Ordinary users
2. End users
3. Database Administrator (DBA)
4. System Analyst
Application programmers or Ordinary users:
These users write application programs to interact with the database. Application
programs can be written in some programming language such a COBOL, PL/I, C++,
JAVA or some higher level fourth generation language. Such programs access the
database by issuing the appropriate request, typically a SQL statement to DBMS.

2. End Users:
End users are the users, who use the applications developed. End users need not know about the
working, database design, the access mechanism etc. They just use the system to get their task
done. End users are of two types:
a) Direct users b) Indirect users
a) Direct users:
Direct users are the users who se the computer, database system directly, by following
instructions provided in the user interface. They interact using the application programs already
developed, for getting the desired result. E.g. People at railway reservation counters, who directly
interact with database.
b) Indirect users:
Indirect users are those users, who desire benefit form the work of DBMS indirectly. They use the
outputs generated by the programs, for decision making or any other purpose. They are just
concerned with the output and are not bothered about the programming part.

3. Database Administrator (DBA):
Database Administrator (DBA) is the person which makes the strategic and policy decisions
regarding the data of the enterprise, and who provide the necessary technical support for
implementing these decisions. Therefore, DBA is responsible for overall control of the system
at a technical level. In database environment, the primary resource is the database itself and
the secondary resource is the DBMS and related software administering these resources is the
responsibility of the Database Administrator (DBA).
4. System Analyst:
System Analyst determines the requirement of end users, especially naïve and parametric
end users and develops specifications for transactions that meet these requirements. System
Analyst plays a major role in database design, its properties; the structure prepares the system
requirement statement, which involves the feasibility aspect, economic aspect, technical aspect
etc. of the system.

Introduction to the Data Dictionary
One of the most important parts of an Oracle database is its data dictionary, which is a read-only set
of tables that provides information about the database. A data dictionary contains:
• The definitions of all schema objects in the database (tables, views, indexes, clusters, synonyms,
sequences, procedures, functions, packages, triggers, and so on)
• How much space has been allocated for, and is currently used by, the schema objects
• Default values for columns
• Integrity constraint information
• The names of Oracle users
• Privileges and roles each user has been granted
• Auditing information, such as who has accessed or updated various schema objects
• Other general database information
• The data dictionary is structured in tables and views, just like other database data. All the data
dictionary tables and views for a given database are stored in that
database's SYSTEM tablespace.

Structure of the Data Dictionary
The data dictionary consists of the following:
 Base Tables
The underlying tables that store information about the associated database. Only Oracle
should write to and read these tables. Users rarely access them directly because they are
normalized, and most of the data is stored in a cryptic format.
 User-Accessible Views
The views that summarize and display the information stored in the base tables of the data
dictionary. These views decode the base table data into useful information, such as user or
table names, using joins and WHERE clauses to simplify the information. Most users are
given access to the views rather than the base tables.
 SYS, Owner of the Data Dictionary
The Oracle user SYS owns all base tables and user-accessible views of the data dictionary.
No Oracle user should ever alter (UPDATE, DELETE, or INSERT) any rows or schema
objects contained in the SYS schema, because such activity can compromise data integrity.
The security administrator must keep strict control of this central account.

How the Data Dictionary Is Used
The data dictionary has three primary uses:
• Oracle accesses the data dictionary to find information about users, schema objects,
and storage structures.
• Oracle modifies the data dictionary every time that a data definition language (DDL)
statement is issued.
• Any Oracle user can use the data dictionary as a read-only reference for information
about the database.

Data Model
A database model is a type of data model that determines the logical structure of a database
and fundamentally determines in which manner data can be stored, organized, and
manipulated. The most popular example of a database model is the relational model (or the
SQL approximation of relational), which uses a table-based format.
Common logical data models for databases include:
 Hierarchical database model
 Network model Relational model
 Entity–relationship model
 Enhanced entity–relationship model
 Object model

Data Model
• A hierarchical database model is a data model in which the data is organized into a tree-like structure.
The structure allows representing information using parent/child relationships: each parent can have
many children, but each child has only one parent (also known as a 1-to-many relationship). All
attributes of a specific record are listed under an entity type.
Example of a hierarchical model
• In a database an entity type is the equivalent of a table. Each individual record is represented as a row,
and each attribute as a column. Entity types are related to each other using 1:N mappings, also known
as one-to-many relationships.
• Mother–child relationship: Child may only have one mother but a mother can have multiple children.
Mothers and children are tied together by links called "pointers". A mother will have a list of pointers
to each of her children.
• A hierarchical data model is a data model that the data is organized into a tree like structure. The
structure permits repeating information using parent/child relationships: Every parent can have many
children but each child only has one parent. All attributes of a precise record are listed under an entity
type.

Data Model
Hierarchical Model
Advantages:
a) The representation of records is completed using an ordered tree which is natural
method of implementation of one–to-many relationships.
b) Proper ordering of the tree results in easier as well as faster retrieval of records.
c) Permits the use of virtual records. This result in a steady database especially when
modification of the data base is made.

The E-R Model
• The entity-relationship model (or ER model) is a way of graphically representing the
logical relationships of entities (or objects) in order to create a database
• The entity-relationship model is based on a perception of the world as consisting of a
collection of basic objects (entities) and relationships among these objects.
• An entity is a distinguishable object that exists.
• Each entity has associated with it a set of attributes describing it.
• E.g. number and balance for an account entity.
• A relationship is an association among several entities.
• e.g. A cust_acct relationship associates a customer with each account he or she has.
• The set of all entities or relationships of the same type is called the entity set or
relationship set.
• Another essential element of the E-R diagram is the mapping cardinalities, which
express the number of entities to which another entity can be associated via a
relationship set.

The E-R Model
• The overall logical structure of a database can be expressed graphically by an E-R
diagram:
• Rectangles: represent entity sets.
• Ellipses: represent attributes.
• Diamonds: represent relationships among entity sets.
• Lines: link attributes to entity sets and entity sets to relationships.

Graphical Representation in E-R diagram
 Rectangle -- Entity
 Ellipses -- Attribute (underlined attributes are [part of] the
primary key)
 Double ellipses -- multi-valued attribute
 Dashed ellipses-- derived attribute, e.g. age is derivable from
birthdate and current date.

The E-R Model
Entities and Attributes
Entity: an object that is involved in the enterprise and that be distinguished from other
objects. (not shown in the ER diagram--is an instance)
Can be person, place, event, object, concept in the real world
Can be physical object or abstraction
Ex: "John", "CSE305“
Entity Type: set of similar objects or a category of entities; they are well defined
A rectangle represents an entity set
Ex: students, courses
We often just say "entity" and mean "entity type"

Entities and Attributes
Attribute: describes one aspect of an entity type; usually [and best when] single
valued and indivisible (atomic)
Represented by oval on E-R diagram
Ex: name, maximum enrollment
May be multi-valued – use double oval on E-R diagram
May be composite – attribute has further structure; also use oval for composite
attribute, with ovals for components connected to it by lines
May be derived – a virtual attribute, one that is computable from existing data in
the database, use dashed oval. This helps reduce redundancy

Relationships
Relationship: connects two or more entities into an association/relationship
"John" majors in "Computer Science“
Relationship Type: set of similar relationships
Student (entity type) is related to Department (entity type) by MajorsIn
(relationship type).
Relationship Types may also have attributes in the E-R model. When they are mapped to
the relational model, the attributes become part of the relation. Represented by a diamond
on E-R diagram.
Relationship types can have descriptive attributes like entity sets
Relationships tend to be verbs or verb phrases; attributes of relationships are again nouns

Cardinality of Relationships
• Cardinality is the number of entity instances to which another entity set can map
under the relationship. This does not reflect a requirement that an entity has to
participate in a relationship. Participation is another concept.
• One-to-one: X-Y is 1:1 when each entity in X is associated with at most one entity in Y,
and each entity in Y is associated with at most one entity in X.
• One-to-many: X-Y is 1:M when each entity in X can be associated with many entities
in Y, but each entity in Y is associated with at most one entity in X.
• Many-to-many: X:Y is M:M if each entity in X can be associated with many entities in
Y, and each entity in Y is associated with many entities in X ("many" =>one or more
and sometimes zero)

Generalization, Specialization and Aggregation are the ways to represent
special relationships between entities and attribute
Generalization
• Generalization is a bottom-up approach in which two lower level entities combine to
form a higher level entity. In generalization, the higher level entity can also combine
with other lower level entity to make further higher level entity.

Specialization
Specialization is opposite to Generalization. It is a top-down approach in which one
higher level entity can be broken down into two lower level entity. In specialization,
some higher level entities may not have lower-level entity sets at all.

Aggregation
Aggregation is a process when relation between two entity is treated as a single entity. Here
the relation between Center and Course, is acting as an Entity in relation with Visitor.
Generalization, Specialization and Aggregation Database technology Notes, RDBMS,
SQL Query, E-R diagram, Generalization, Specialization, Aggregation, Database Model,
Normalization, SQL Sequence, SQL Constraints, Database View, table, row, SQL Join
Generalization, Specialization and Aggregation are the ways to represent special
relationships between entities and attributes

Transforming a Logical Data Model to a Physical Database Design (
Refer Till Slide Number -74 )
The following procedures can be successfully used to transform a Logical Analytical Data
Model into a physical database, such as a data warehouse or a data mart. By following these
procedures, one can ensured that:
• The data model has integrity (e.g., that business rules have been honored),
• The process has integrity (i.e., that it accounts for all the necessary steps and feedback
loops), and
• The overall process is pragmatic (e.g., that it addresses issues of query, report and load
performance). This chapter describes a process that is generic enough to be used as a guide
for different projects.

Transforming a Logical Data Model to a Physical Database Design
Overview
• Let us start with an overview of the overall process for creating logical and physical data models. Here
are the major steps in the transformation of the model:
1-The business authorization to proceed is received.
3-Business requirements are gathered and represented in a logical data model, which will completely
represent the business data requirements and will be non-redundant.
5-The logical model is transformed into a first-cut physical model by applying several simple modifications,
such as splitting a large table or combining entities in a 1:1 relationship.
7-The logical model is then transformed into a second-cut physical model by iteratively applying three levels
of optimizations or compromises. The outcome from this is a physical database design.
b. Apply safe compromises to the model, such as splitting a table or combining two tables.
d. Apply aggressive compromises to the model, such as adding redundant data.
e. Apply technical optimizations to the model, such as indices, referential integrity or partitioning.
9- The physical database design is then converted to a physical structure by generating or writing the DDL
and installing the database.
11- Steps 4 and 5 are iteratively performed so that the database can be tested before going into production.
Sometime aggressive compromises are done so that the effect of them can be properly tested. Aggressive
compromising is an iterative process. If views are used up-front, it may even be transparent to the
application teams and business users.

Analysis Phase - Develop a Logical Data Model
This step is obviously the crucial starting point. There is no substitute for business knowledge. Any
team that does not know the business has no business building the logical data model. The key
characteristics of this model are that it:
• Must declare the grain.
• Should be atomic to the appropriate level.
• Should fully satisfy user requirements.
• Should be non-redundant.
• Should be independent of technology, implementation and organizational constraints.
• Should also contain any external data that is essential to the business.

Information Gathering
The most critical task in the development of the data model is requirements
gathering. There are three general ways to approach model expansion:
• Top-down,
• Inside-out and
• Bottom-up.
• Top-down expansion essentially starts with a high level model to represent the breadth of
data in the DW. This model is rough cut and coarse but represents the major subject areas of
data that the DW must support. Top-down expansion starts 3with this rough-cut model and
progressively enlarges it by adding data elements to it.
• Inside out expansion starts with an initial high level model, which expands by adding details.
For example, we can start with Customer and a few attributes, and eventually populate it
with many attributes.
• Bottom-up expansion proceeds by collection of views or use cases. The data from these is
incorporated into the model. A typical view could be as simple as a query. A typical query
could be "How much volume did we sell last month (and this year to date) versus a year ago
last month (and last year to date).“

Information Gathering
There are several vehicles to use for information gathering:
• user interviews,
• facilitated sessions,
• examination of current data and queries and
• direct observation.
Scenarios
Queries and reports can be used to verify that the database will support them. From these it is best to
identify a set of key user scenarios that can be used to pass against the model. A scenario is a sequence of
activities that respond to a major business event. A scenario is actually a test case -- but a business test
case. It consists of two parts: the definition of the business case and a script of values. The purpose of
scenarios is to test the correctness and completeness of the model. A major business event is something of
importance that happens to the business. Usually, but not always, the event is something that is initiated
by some agent who is outside of the business area. A scenario should be defined for each major business
event. Some people call scenarios Use Case

How to Use Scenarios
• The simplest way to use scenarios is to manually walk them through the model. It is
easiest to do in a facilitated session. The facilitator takes the model and the scenario.
Each activity, data element and sample value in a scenario are compared to the model
to ensure that the model is complete and correct. One wants to ensure the data is
there and that the model will allow one to navigate properly. Usually, the first 3 or 4
scenarios disrupt the model. This trauma is essential for the health of the model.
• Using scenarios and queries with actual sample values validates the data model,
ensures that the model satisfies the user needs, guarantees model completeness and
simplifies database changes.
• It is important to collect whatever volumetric information you can. Volumetric
information includes the number of occurrences of data, the frequency of query
execution, the ratio between tables, and such things. These factors substantially
influence the physical design and should be a part of the process for gathering
business requirements.

Finalization of the Logical Model
• Two important additions need to be made to the logical model before it is
complete:
• Completion of definitions of all entities and attributes. In a data warehouse
environment, this is especially critical. Users will need to understand the data in
order to use it. Do not underestimate the magnitude of this task. Do yourself a
favor and compile the definitions as the model grows.
• Addition of domain information. A domain is the set of physical characteristics
that one or more columns can have in common, such as length and data type.
• In the next step, do not neglect to forward engineer the definitions to the physical
model.

Design Phase - First Cut Physical Model
Before doing extensive usage analysis, it is possible and even advisable to construct a first-cut
physical model. This can be done quickly by applying the following modifications to the data
model. These seven steps represent what is considered the first cut physical model:
• Ensure physical columns characteristics
Use domains or assign data type, length, null ability, optionality, etc.
• Ensure proper physical database object names are used
Follow standard (enforced by CASE tool) for physical names.
• Add necessary constraints and business rules
Take example of an Order Item that must belong to Order and that an Order must have Order Item. The first
is enforced by referential integrity (RI); the second requires a trigger or stored procedure
• Resolve implementation of subtypes
Decide how the entire hierarchy will be implemented.
• Perform any obvious denormalization or other safe compromises
Safe compromises are those trade-offs that do not compromise the integrity and non-redundancy of the data,
e.g., splitting or partitioning a table based on usage. To store mailing address both in both Customer and in
Customer Address table is an example.
• Adjust keys
Ensure keys truly provide uniqueness and stability (even considering anomalies over time, such as Order
Numbers getting reused). Add surrogate keys as justified.
• Add indexes on primary keys, major foreign keys only
Introduce indexing sparingly at first to determine raw capabilities of the design.
• Add any production objects
• Example of these are transient tables, processing tables/columns (e.g. batch control tables).

• Forward Engineer Logical Model to a Physical Database Design
Using these scenarios, the entire data model should be reviewed table by table
and column by column to confirm what is really required, what could be
eliminated, which data elements should be combined and which should be
broken into smaller tables to improve ease of use and reduce query and report
response time.
For example-it
is conceivable to reduce the number of tables from 100 to 60. It is also
possible to eliminate unnecessary columns when many of the remaining tables
are code tables, used only to look up descriptions. Consequently, the number of
actual fact tables and dimensions could significantly be reduced. Also,
information gathered during this activity is useful in determining index
deployment, load balancing across devices and optimal data clustering. All
optimization changes based on discoveries made during this process should be
implemented in the physical model. Any business rule changes or additional
columns should be incorporated into the logical model and then forward
engineered again.

Remember to forward engineer all relevant data, including definitions. A data
warehouse is useless without these.
• Design Data Model Optimization
There are three general ways to optimize a database:
 get better hardware (or take better advantage of it),
 get better software (or take better advantage of it), or
 optimize the data structure itself.
 Before the database can be generated a variety of optimizations need to be
evaluated for their applicability to the logical model. These optimizations
are actually trade-offs. A trade-off is the emphasis of one feature, which is
considered to be an advantage, while compromising another, which is
considered to be less important.
 These trade-offs are of two types:
safe trade-offs and aggressive trade-offs.

Safe Trade-Offs
• Safe trade-offs do not compromise the integrity of the data. They are exemplified by
the following model changes:
• Collapse some or all subtypes into a supertype
• Collapse a supertype into one or more subtypes
• Split one entity into multiple tables,
• Merge multiple entities into a single table,
• Collapsing certain code tables,
• Converting sub-types to tables (there are various ways)
• Violate first normal form, among others.
Aggressive Trade-Offs
• Aggressive trade-offs compromise the integrity of the model. They are exemplified by
several main modifications:
• Storing derived data and
• Adding redundant data or relationships and
• Adding surrogate keys.

Volume Determination and Database Sizing
Volume determination and database sizing is facilitated by determining the best source for
data knowledge and getting a best estimate for each table. The numbers can be plugged into a
tool, such as Platinum Desktop DBA, that can generate space requirements, or into a
spreadsheet designed to perform the same estimations, or manually. It is important to pad
your estimates in case actual row counts exceed what was anticipated. An additional 20
percent when sizing each table is reasonable. Determine Hardware Requirements
• Subsequent Steps
While the completion of the above step signifies the end of the logical model to physical
database transformation, the task of delivering a production database includes several
additional activities that could result in modifications. Issues like load file content, new query
requirements, additional data requirements or new user requirements can result in
necessitating database modifications. It is recommended that all database changes are
implemented through a physical design tool, as this will provide a central authoritative source
for the physical database environment and allow the generation of scripts that can recreate the
database server if necessary.

Merits and Demerits of E-R Modeling.
Advantages and Disadvantages of E-R Data Model
Following are advantages of an E-R Model:
• Straightforward relation representation: Having designed an E-R diagram
for a database application, the relational representation of the database model
becomes relatively straightforward.
• Easy conversion for E-R to other data model: Conversion from E-R diagram
to a network or hierarchical data model can· easily be accomplished.
• Graphical representation for better understanding: An E-R model gives
graphical and diagrammatical representation of various entities, its attributes
and relationships between entities. This is turn helps in the clear
understanding of the data structure and in minimizing redundancy and other
problems.
• Disadvantages of E-R Data Model
Following are disadvantages of an E-R Model:
• No industry standard for notation: There is no industry standard notation for
developing an E-R diagram.
• Popular for high-level design: The E-R data model is especially popular for high
level.

Record Based Logical Models
Record based logical models are used in describing data at the logical and view
levels. In contrast to object based data models, they are used to specify the
overall logical structure of the database and to provide a higher-level
description of the implementation. Record based models are so named because
the database is structured in fixed format records of several types. Each record
type defines a fixed number of fields, or attributes, and each field is usually of
a fixed length.
The three most widely accepted record based data models are:
• Hierarchical Model
• Network Model
• Relational Model

Hierarchical Database Model
• Hierarchical Database model is one of the oldest database models, dating from late
1950s. One of the first hierarchical databases Information Management System (IMS)
was developed jointly by North American Rockwell Company and IBM. This model is
like a structure of a tree with the records forming the nodes and fields forming the
branches of the tree.

Hierarchical Database Model
• In the hierarchical data model, records are linked with other superior records on which
they are dependent and also on the records, which are dependent on them. A tree structure
may establish one-to-many relationship. Figure illustrates the structure of a family. Great
grandparent is the root of the structure. Parents can have many children exhibiting one to
many relationships. The great grandparent record is known as the root of the tree. The
grandparents and children are the nodes or dependents of the root. In general, a root may
have any number of dependents. Each of these dependent may have any number of lower
level dependents, and so on, with no restriction of levels.
The different elements
(e.g. records) present in
the hierarchical tree
structure have Parent-
Child relationship. A
Parent element can have
many children elements
but a Child element
cannot have many parent
elements. That is,
hierarchical model
cannot represent many to
many relationships
among records.

Operations on Hierarchical Model
There are four basic operations Insert, Update, Delete and Retrieve that can
be performed on each model. Now, we consider in detail that how these
basic operations are performed in hierarchical database model.
• Insert Operation:
It is not possible to insert the information of the supplier e.g. S4 who does not
supply any part. This is because a node cannot exist without a root. Since, a part P5
that is not supplied by any supplier can be inserted without any problem, because a
parent can exist without any child. So, we can say that insert anomaly exists only
for those children, which has no corresponding parents.
• Update Operation:
Suppose we wish to change the city of supplier S1 from Qadian to Jalandhar, then
we will have to carry out two operations such as searching S1 for each part and then
multiple updations for different occurrences of S1. But, if we wish to change the city
of part P1 from Qadian to Jalandhar, then these problems will not occur because
there is only a single entry for part P I and the problem of inconsistency will not
arise. So, we can say that update anomalies only exist for children not for parent
because children may have multiple entries in the database.

Operations on Hierarchical Model
• Delete Operation:
In hierarchical model, quantity information is incorporated into supplier record. Hence, the
only way to delete a shipment (or supplied quantity) is to delete the corresponding supplier
record. But such an action will lead to loss of information of the supplier, which is not desired.
For example: Supplier S2 stops supplying 250 quantity of part PI, then the whole record of S2
has to be deleted under part PI which may lead to loss the information of supplier. Another
problem will arise if we wish to delete a part information and that part happens to be only
part supplied by some supplier. In hierarchical model, deletion of parent causes the deletion of
child records also and if the child occurrence is the only occurrence in the whole database,
then the information of child records will also lost with the deletion of parent. For example: if
we wish to delete the information of part P2 then we also lost the information of S3, S2 and S1
supplier. The information of S2 and Sl can be obtained from PI, but the information about
supplier S3 is lost with the deletion of record for P2.
• Record Retrieval:
Record retrieval methods for hierarchical model are complex and asymmetric which can be
clarified with the following queries:

Advantages Of Hierarchical Model
1. Simplicity
Data naturally have hierarchical relationship in most of the practical situations. Therefore, it is
easier to view data arranged in manner. This makes this type of database more suitable for the
purpose.
2. Security
These database system can enforce varying degree of security feature unlike flat-file system.
3. Database Integrity
Because of its inherent parent-child structure, database integrity is highly promoted in these
systems.
4. Efficiency:
The hierarchical database model is a very efficient, one when the database contains a large
number of I: N relationships (one-to-many relationships) and when the users require large number
of transactions, using data whose relationships are fixed.

Disadvantages Of Hierarchical Model
1. Complexity of Implementation:
The actual implementation of a hierarchical database depends on the
physical storage of data. This makes the implementation complicated.
2. Difficulty in Management:
The movement of a data segment from one location to another cause all the
accessing programs to be modified making database management a
complex affair.
3. Complexity of Programming:
Programming a hierarchical database is relatively complex because the
programmers must know the physical path of the data items.
4. Poor Portability:
The database is not easily portable mainly because there is little or no standard
existing for these types of database.
5. Database Management Problems:
If you make any changes in the database structure of a hierarchical
database, then you need to make the necessary changes in all the
application programs that access the database. Thus, maintaining the
database and the applications can become very difficult.

Network Model
The Network model replaces the hierarchical tree with a graph thus allowing more general
connections among the nodes. The main difference of the network model from the hierarchical
model, is its ability to handle many to many (N:N) relations. In other words, it allows a record to
have more than one parent. Suppose an employee works for two departments. The strict
hierarchical arrangement is not possible here and the tree becomes a more generalized graph - a
network. The network model was evolved to specifically handle non-hierarchical relationships. As
shown below data can belong to more than one parent. Note that there are lateral connections as
well as top-down connections. A network structure thus allows 1:1 (one: one), l: M (one: many), M:
M (many: many) relationships among entities.
• In network database terminology, a relationship is a set. Each set is made up of at least two
types of records: an owner record (equivalent to parent in the hierarchical model) and a
member record (similar to the child record in the hierarchical model).
• The database of Customer-Loan, which we discussed earlier for hierarchical model, is now
represented for Network model as shown.
• In can easily depict that now the information about the joint loan L1 appears single time, but in
case of hierarchical model it appears for two times. Thus, it reduces the redundancy and is
better as compared to hierarchical model.

Operations on Network Model
• Detailed description of all basic operations in Network Model is as under:
To insert a new record containing the details of a new supplier, we simply create a
new record occurrence. Initially, there will be no connector. The new supplier's chain
will simply consist of a single pointer starting from the supplier to itself.
• For example, supplier S4 can be inserted in network model that does not supply
any part as a new record occurrence with a single pointer from S4 to itself. This is
not possible in case of hierarchical model. Similarly a new part can be inserted
who does not supplied by any supplier.
• Consider another case if supplier S 1 now starts supplying P3 part with quantity
100, then a new connector containing the 100 as supplied quantity is added in to
the model and the pointer of S1 and P3 are modified as shown in the below.

Unlike hierarchical model, where updation was carried out by search and had many
inconsistency problems, in a network model updating a record is a much easier process. We
can change the city of S I from Qadian to Jalandhar without search or inconsistency
problems because the city for S1 appears at just one place in the network model. Similarly,
same operation is performed to change the any attribute of part.

• Delete operation:
If we wish to delete the information of any part say PI, then that record occurrence can be
deleted by removing the corresponding pointers and connectors, without affecting the
supplier who supplies that part i.e. P1, the model is modified as shown. Similarly, same
operation is performed to delete the information of supplier.

• Retrieval Operation:
Record retrieval methods for network model are symmetric but complex. In order to
understand this considers the following example queries:
Advantages and Disadvantages of Network Model
The Network model retains almost all the advantages of the hierarchical model
while eliminating some of its shortcomings.
The main advantages of the network model are:
Conceptual simplicity:
Just like the hierarchical model, the network model IS also conceptually simple
and easy to design.
Capability to handle more relationship types:
The network model can handle the one to- many (l:N) and many to many (N:N)
relationships, which is a real help in modeling the real life situations.

• Ease of data access: The data access is easier and flexible than the
hierarchical model.
• Data Integrity: The network model does not allow a member to exist
without an owner. Thus, a user must first define the owner record and
then the member record. This ensures the data integrity.
• Data independence: The network model is better than the hierarchical
model in isolating the programs from the complex physical storage
details.
• Database Standards: One of the major drawbacks of the hierarchical
model was the non-availability of universal standards for database
design and modeling. The network model is based on the standards
formulated by the DBTG and augmented by ANSI/SP ARC (American
National Standards Institute/Standards Planning and Requirements
Committee) in the 1970s. All the network database management systems
conformed to these standards. These standards included a Data
Definition Language (DDL) and the Data Manipulation Language
(DML), thus greatly enhancing database administration and portability.

Disadvantages of Network Model
• Even though the network database model was significantly better than the
hierarchical database model, it also had many drawbacks. Some of them are
• System complexity:
All the records are maintained using pointers and hence the whole database
structure becomes very complex.
• Operational Anomalies:
As discussed earlier, network model's insertion, deletion and updating
operations of any record require large number of pointer adjustments, which
makes its implementation very complex and complicated.
• Absence of structural independence:
Since the data access method in the network database model is a navigational
system, making structural changes to the database is very difficult in most
cases and impossible in some cases. If changes are made to the database
structure then all the application programs need to be modified before they can
access data. Thus, even though the network database model succeeds in
achieving data independence, it still fails to achieve structural independence.

Relational Model
• Relational model stores data in the form of tables. This concept purposed
by Dr. E.F. Codd, a researcher of IBM in the year 1960s. The relational
model consists of three major components:
1. The set of relations and set of domains that defines the way data can be
represented (data structure).
2. Integrity rules that define the procedure to protect the data (data integrity).
3. The operations that can be performed on data (data manipulation).
A rational model database is defined as a database that allows you to group its
data items into one or more independent tables that can be related to one
another by using fields common to each related table.

• Basic Terminology used in Relational Model
The figure shows a relation with the. Formal names of the basic components
marked the entire structure is, as we have said, a relation.
Tuples of a Relation
Each row of data is a tuple. Actually, each row is an n-tuple, but the "n-" is usually
dropped.
Cardinality of a relation: The number of tuples in a relation determines its
cardinality. In this case, the relation has a cardinality of 4.
Degree of a relation: Each column in the tuple is called an attribute. The number
of attributes in a relation determines its degree. The relation in figure has a degree
of 3.
Domains: A domain definition specifies the kind of data represented by the
attribute.

Operations in Relational Model
The four basic operations Insert, Update, Delete and Retrieve operations are shown below on
the sample database in relational model:
Suppose we wish to insert the information of supplier who does not supply any part, can be
inserted in S table without any anomaly e.g. S4 can be inserted in Stable. Similarly, if we wish
to insert information of a new part that is not supplied by any supplier can be inserted into a P
table. If a supplier starts supplying any new part, then this information can be stored in
shipment table SP with the supplier number, part number and supplied quantity. So, we can
say that insert operations can be performed in all the cases without any anomaly.
Suppose supplier S1 has moved from Qadian to Jalandhar. In that case we need to make
changes in the record, so that the supplier table is up-to-date. Since supplier number is the
primary key in the S (supplier) table, so there is only a single entry of S 1, which needs a single
update and problem of data inconsistencies would not arise. Similarly, part and shipment
information can be updated by a single modification in the tables P and SP respectively
without the problem of inconsistency. Update operation in relational model is very simple and
without any anomaly in case of relational model.

• Delete Operation:
Suppose if supplier S3 stops the supply of part P2, then we have to delete the shipment connecting part P2
and supplier S3 from shipment table SP. This information can be deleted from SP table without affecting the
details of supplier of S3 in supplier table and part P2 information in part table. Similarly, we can delete the
information of parts in P table and their shipments in SP table and we can delete the information suppliers in
S table and their shipments in SP table.
• Record Retrieval:
Record retrieval methods for relational model are simple and symmetric which can be clarified with the
following queries:
• Query1: Find the supplier numbers for suppliers who supply part P2.
Solution
In order to get this information we have to search the information of part P2 in the SP table (shipment table).
For this a loop is constructed to find the records of P2 and on getting the records, corresponding supplier
numbers are printed.
Algorithm
do until no more shipments;
get next shipment where PNO=P2;
print SNO;
end;

Advantages and Disadvantages of Relational Model
The major advantages of the relational model are:
Structural independence:
In relational model, changes in the database structure do not affect the data access. When it is
possible to make change to the database structure without affecting the DBMS's capability to
access data, we can say that structural independence has been achieved. So, relational
database model has structural independence.
• Conceptual simplicity:
We have seen that both the hierarchical and the network database model were conceptually
simple. But the relational database model is even simpler at the conceptual level. Since the
relational data model frees the designer from the physical data storage details, the designers
can concentrate on the logical view of the database.
• Design, implementation, maintenance and usage ease:
The relational database model achieves both data independence and structure independence
making the database design, maintenance, administration and usage much easier than the
other models.

Disadvantages of Relational Model
The relational model's disadvantages are very minor as compared to the advantages and their capabilities
far outweigh the shortcomings Also, the drawbacks of the relational database systems could be avoided if
proper corrective measures are taken. The drawbacks are not because of the shortcomings in the database
model, but the way it is being implemented.
Some of the disadvantages are:
• Hardware overheads:
Relational database system hides the implementation complexities and the physical data storage details
from the users. For doing this, i.e. for making things easier for the users, the relational database systems
need more powerful hardware computers and data storage devices. So, the RDBMS needs powerful
machines to run smoothly. But, as the processing power of modem computers is increasing at an
exponential rate and in today's scenario, the need for more processing power is no longer a very big issue.
• Ease of design can lead to bad design: .
The relational database is an easy to design and use. The users need not know the complex details of
physical data storage. They need not know how the data is actually stored to access it. This ease of design
and use can lead to the development and implementation of very poorly designed database management
systems. Since the database is efficient, these design inefficiencies will not come to light when the database
is designed and when there is only a small amount of data. As the database grows, the poorly designed
databases will slow the system down and will result in performance degradation and data corruption.
•

Comparison between Hierarchical, Network and Relational Model
Characteristic Hierarchical model Network model Relational model
Data structure
One to many or one to one
relationships
Allowed the network model to
support many to many relationships
One to One,One to many, Many to many
relationships
In hierarchical model only one-to-many or one-to-one relationships can be exist. But in network data model makes it possible to map many to
many relationships In relational each record can have multiple parents and multiple child records. In effect, it supports many to many
relationships
Data structure
Based on parent child
relationship
A record can have many parents as
well as many children.
Based on relational data structures
In hierarchical model relationship based in terms of parent child. So a child may only have one parent but a parent can have multiple
children. But in network data model a record can have many parents as well as many children. Relational data model is based on relational
data structures

Comparison between Hierarchical, Network and Relational Model
Characteristic Hierarchical model Network model Relational model
Data manipulation
Does not provide an independent
stand alone query interface
CODASYL (Conference on Data Systems Languages)
Relational databases are what brings many
sources into a common query (such as SQL)
In relational database it use powerful operations such as SQL languages or query by example are used to manipulate data stored in the database But in hierarchical data
model it does not
provide an independent stand alone query interface while network model uses CODASYL
Data manipulation
retrieve algorithms are complex and
asymmetric
Retrieve algorithms are complex and symmetric
Retrieve algorithms are
simple and symmetric
In hierarchical data model and network model retrieve algorithms are complex and symmetric But in relational data model retrieve algorithms are simple and symmetric
Data integrity
Cannot insert the information of a
child who does not have any parent.
Does not suffer form any insertion anomaly.
Does not suffer from any
insert anomaly.
In hierarchical data model we cannot insert the information of a child who does not have any parent. But in network model does not suffer form any insertion anomaly.
relational model does not suffer from any insert anomaly
Data integrity
Multiple occurrences of child
records which lead to problems of
inconsistency during the update
operation
Free from update anomalies.
Free form update
anomalies
In network model it is free from update anomalies because there is only a single occurrence for each record set.In relational model it also free form update anomalies because
it removes the redundancy of data by proper designing through normalization process. But in hierarchical model there are multiple occurrences of child records. which lead
to problems of inconsistency during the update operation
Data intergirty
Deletion of parent results in deletion
of child records
Free from delete anomalies
Free from delete
anomalies
In hierarchical model it is based on parent child relationship and deletion of parent results in deletion of child records .But in network model and in relational model it is free
from deletion anomalies. Because information is stored in different tables.

DBMS Intro - Collection of Info Organized for Easy Access

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to DBMS Intro - Collection of Info Organized for Easy Access

Similar to DBMS Intro - Collection of Info Organized for Easy Access (20)

More from Vaibhav Kathuria

More from Vaibhav Kathuria (6)

Recently uploaded

Recently uploaded (20)

DBMS Intro - Collection of Info Organized for Easy Access