SlideShare a Scribd company logo
1 of 33
Course: MCA
Subject: Data and File Structure
Unit-5
Sorting and Searching
What is a Hash Table ?
 The simplest kind of hash table is an array of
records.
 This example has 701 records.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ]
An array of records
. . .
[ 700]
What is a Hash Table ?[1]
 Each record has a special field,
called its key.
 In this example, the key is a long
integer field called Number.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ]
. . .
[ 700]
[ 4 ]
Number 506643548
What is a Hash Table ?[1]
 The number might be a
person's identification
number, and the rest of the
record has information
about the person.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ]
. . .
[ 700]
[ 4 ]
Number 506643548
Inserting a New Record[2]
 In order to insert a new record, the key must
somehow be converted to an array index.
 The index is called the hash value of the key.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685
Inserting a New Record[2]
 Typical way to create a hash value:
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685
(Number mod 701)
What is (580625685 mod 701) ?
3
Inserting a New Record[2]
 The hash value is used for the
location of the new record.
Number 580625685
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
[3]
Collisions[3]
 Here is another new record to insert,
with a hash value of 2.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685
Number 701466868
My hash
value is
[2].
Collisions
 This is called a collision, because
there is already another valid record
at .
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685
Number 701466868
Collisions
 This is called a collision, because there is
already another valid record at [2].
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685 Number 701466868
The new record
goes
in the empty spot.
Searching for a Key
 The data that's attached to a key can be found fairly
quickly.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685 Number 701466868
Number 701466868
Searching for a Key
 Calculate the hash value.
 Check that location of the array
for the key.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685 Number 701466868
Number 701466868
My hash
value is
[2].
Not me.
Searching for a Key
 Keep moving forward until you
find the key, or you reach an
empty spot.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685 Number 701466868
Number 701466868
My hash
value is
[2].
Yes!
Searching for a Key
 When the item is found, the
information can be copied to the
necessary location.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 506643548Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685 Number 701466868
Number 701466868
My hash
value is
[2].
Yes!
Deleting a Record
 Records may also be deleted from a hash table.
 But the location must not be left as an ordinary "empty
spot" since that could interfere with searches.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685 Number 701466868
Deleting a Record
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700]
Number 233667136Number 281942902
Number 155778322
. . .
Number 580625685 Number 701466868
 Records may also be deleted from a hash table.
 But the location must not be left as an ordinary "empty
spot" since that could interfere with searches.
 The location must be marked in some special way so
that a search can tell that the spot used to have
something in it.
Handling collisions
If the number of possible keys greatly exceeds the
numbers of records, and of computed storage locations,
hash collisions become inevitable and so have to be
handled without loss of data.
3 approaches are used to handle collisions:
open hashing
quadratic hashing
chained hashing
Files
A file can be seen as
1. A stream of bytes (no structure), or
2. A collection of records with fields
A Stream File[4]
 File is viewed as a sequence of bytes:
 Data semantics is lost: there is no way to get it
apart again.
87359CarrollAlice in wonderland38180FolkFile Structures ...
Field and Record Organization
Definitions
Record: a collection of related fields.
Field: the smallest logically meaningful unit of
information in a file.
Key: a subset of the fields in a record used to identify
(uniquely) the record.
e.g. In the example file of books:
 Each line corresponds to a record.
 Fields in each record: ISBN, Author, Title
Record Keys
 Primary key: a key that uniquely identifies a
record.
 Secondary key: other keys that may be used for
search
 Author name
 Book title
 Author name + book title
 Note that in general not every field is a key (keys
correspond to fields, or a combination of fields,
that may be used in a search).
Field Structures
 Fixed-length fields
87359Carroll Alice in wonderland
38180Folk File Structures
 Begin each field with a length indicator
058735907Carroll19Alice in wonderland
053818004Folk15File Structures
 Place a delimiter at the end of each field
87359|Carroll|Alice in wonderland|
38180|Folk|File Structures|
 Store field as keyword = value
ISBN=87359|AU=Carroll|TI=Alice in wonderland|
ISBN=38180|AU=Folk|TI=File Structures
Record Structures
1. Fixed-length records.
2. Fixed number of fields.
3. Begin each record with a length indicator.
4. Use an index to keep track of addresses.
5. Place a delimiter at the end of the record.
Fixed-length records[4]
Two ways of making fixed-length records:
1. Fixed-length records with fixed-length fields.
2. Fixed-length records with variable-length fields.
87359 Carroll Alice in wonderland
03818 Folk File Structures
87359|Carroll|Alice in wonderland| unused
38180|Folk|File Structures| unused
Variable-length records[5]
 Fixed number of fields:
 Record beginning with length indicator:
 Use an index file to keep track of record addresses:
 The index file keeps the byte offset for each record; this
allows us to search the index (which have fixed length
records) in order to discover the beginning of the record.
 Placing a delimiter: e.g. end-of-line char
87359|Carroll|Alice in wonderland|38180|Folk|File Structures| ...
3387359|Carroll|Alice in wonderland|2638180|Folk|File Structures| ..
File Operations
 Typical Operations:
 Retrieve a record
 Insert a record
 Delete a record
 Modify a field of a record
 In direct files:
 Get a record with a given field value
 In sequential files:
 Get the next record
Taxonomy of file structures[6]
 The access method determines how records can be
retrieved: sequentially or randomly.
• One record after another,
from beginning to end
• Access one specific record
without having to retrieve all records before it
Sequential file
 Sequential file –Records can only be accessed
sequentially, one after another, from beginning to end.
 Records are arranged in sequential manner.
Mapping in an indexed file
 To access a record in a file Randomly, you need to know
the address of the record.
 An index file can relate the key to the record address.
Indexed files
 An index file is made of a data file, which is a sequential file,
and an index.
 Index – a small file with only two fields:
 The key of the sequential file
 The address of the corresponding record on the disk.
 To access a record in the file :
 Load the entire index file into main memory.
 Search the index file to find the desired key.
 Retrieve the address the record.
 Retrieve the data record. (using the address)
 Inverted file –
you can have more than one index, each with a different key.
Mapping in a hashed file
 A hashed file uses a hash function to map the key to the
address.
 Eliminates the need for an extra file (index).
 There is no need for an index and all of the overhead
associated with it.
Direct Hashing
 The file must contain a record for every possible key.
 Adv. – no collision.
 Disadv. – space is wasted.
 Hashing techniques –
map a large population of possible keys into
a small address space.
References
1. An introduction to Datastructure with application by jean Trembley and sorrenson
2. Data structures by schaums and series –seymour lipschutz
3. http://en.wikipedia.org/wiki/Book:Data_structures
4. http://www.amazon.com/Data-Structures-Algorithms
5. http://www.amazon.in/Data-Structures-Algorithms-Made-Easy/dp/0615459811/
6. http://www.amazon.in/Data-Structures-SIE-Seymour-Lipschutz/dp
List of images
1. http://www.expertsmind.com/questions/depth-first-search-30153061.aspx
2. http://burtleburtle.net/bob/hash/index.html
3.http://herakles.zcu.cz/~skala/PUBL/PUBL_2010/2010_WSEASCorfu_Hash-final.pdf
4. http://www.scribd.com/doc/199819816/Fundamental-Data-Structures
5. http://searchsqlserver.techtarget.com/definition/hashing
6. http://searchsqlserver.techtarget.com/definition/hashing

More Related Content

What's hot

Indexing and-hashing
Indexing and-hashingIndexing and-hashing
Indexing and-hashing
Ami Ranjit
 
Indexing and hashing
Indexing and hashingIndexing and hashing
Indexing and hashing
Jeet Poria
 

What's hot (8)

Indexing and-hashing
Indexing and-hashingIndexing and-hashing
Indexing and-hashing
 
What is merkle tree
What is merkle treeWhat is merkle tree
What is merkle tree
 
Indexing and Hashing
Indexing and HashingIndexing and Hashing
Indexing and Hashing
 
Merkle Trees and Fusion Trees
Merkle Trees and Fusion TreesMerkle Trees and Fusion Trees
Merkle Trees and Fusion Trees
 
Isam
IsamIsam
Isam
 
Indexing and hashing
Indexing and hashingIndexing and hashing
Indexing and hashing
 
Data storage and indexing
Data storage and indexingData storage and indexing
Data storage and indexing
 
Data indexing presentation
Data indexing presentationData indexing presentation
Data indexing presentation
 

Similar to Mca ii dfs u-5 sorting and searching structure

Searching Algorithms with Binary Search and Hashing Concept with Time and Spa...
Searching Algorithms with Binary Search and Hashing Concept with Time and Spa...Searching Algorithms with Binary Search and Hashing Concept with Time and Spa...
Searching Algorithms with Binary Search and Hashing Concept with Time and Spa...
mrhabib10
 

Similar to Mca ii dfs u-5 sorting and searching structure (20)

Searching.ppt
Searching.pptSearching.ppt
Searching.ppt
 
CS-102 Data Structures HashFunction CS102.pdf
CS-102 Data Structures HashFunction CS102.pdfCS-102 Data Structures HashFunction CS102.pdf
CS-102 Data Structures HashFunction CS102.pdf
 
Searching.ppt
Searching.pptSearching.ppt
Searching.ppt
 
Searching Algorithms with Binary Search and Hashing Concept with Time and Spa...
Searching Algorithms with Binary Search and Hashing Concept with Time and Spa...Searching Algorithms with Binary Search and Hashing Concept with Time and Spa...
Searching Algorithms with Binary Search and Hashing Concept with Time and Spa...
 
Hashing in DS
Hashing in DSHashing in DS
Hashing in DS
 
Hash table
Hash tableHash table
Hash table
 
dsa pdf.pdf
dsa pdf.pdfdsa pdf.pdf
dsa pdf.pdf
 
Linear and Binary search
Linear and Binary searchLinear and Binary search
Linear and Binary search
 
Searching algorithm
Searching algorithmSearching algorithm
Searching algorithm
 
Unit 8 searching and hashing
Unit   8 searching and hashingUnit   8 searching and hashing
Unit 8 searching and hashing
 
Hash tables
Hash tablesHash tables
Hash tables
 
DS Unit 1.pptx
DS Unit 1.pptxDS Unit 1.pptx
DS Unit 1.pptx
 
Indexing
IndexingIndexing
Indexing
 
Hashing Technique In Data Structures
Hashing Technique In Data StructuresHashing Technique In Data Structures
Hashing Technique In Data Structures
 
Hashing 1
Hashing 1Hashing 1
Hashing 1
 
searching techniques.pptx
searching techniques.pptxsearching techniques.pptx
searching techniques.pptx
 
arrays.pptx
arrays.pptxarrays.pptx
arrays.pptx
 
Searching algorithms
Searching algorithmsSearching algorithms
Searching algorithms
 
SISAP17
SISAP17SISAP17
SISAP17
 
Intro to Lists
Intro to ListsIntro to Lists
Intro to Lists
 

More from Rai University

Bsc agri 2 pae u-4.4 publicrevenue-presentation-130208082149-phpapp02
Bsc agri  2 pae  u-4.4 publicrevenue-presentation-130208082149-phpapp02Bsc agri  2 pae  u-4.4 publicrevenue-presentation-130208082149-phpapp02
Bsc agri 2 pae u-4.4 publicrevenue-presentation-130208082149-phpapp02
Rai University
 

More from Rai University (20)

Brochure Rai University
Brochure Rai University Brochure Rai University
Brochure Rai University
 
Mm unit 4point2
Mm unit 4point2Mm unit 4point2
Mm unit 4point2
 
Mm unit 4point1
Mm unit 4point1Mm unit 4point1
Mm unit 4point1
 
Mm unit 4point3
Mm unit 4point3Mm unit 4point3
Mm unit 4point3
 
Mm unit 3point2
Mm unit 3point2Mm unit 3point2
Mm unit 3point2
 
Mm unit 3point1
Mm unit 3point1Mm unit 3point1
Mm unit 3point1
 
Mm unit 2point2
Mm unit 2point2Mm unit 2point2
Mm unit 2point2
 
Mm unit 2 point 1
Mm unit 2 point 1Mm unit 2 point 1
Mm unit 2 point 1
 
Mm unit 1point3
Mm unit 1point3Mm unit 1point3
Mm unit 1point3
 
Mm unit 1point2
Mm unit 1point2Mm unit 1point2
Mm unit 1point2
 
Mm unit 1point1
Mm unit 1point1Mm unit 1point1
Mm unit 1point1
 
Bdft ii, tmt, unit-iii, dyeing & types of dyeing,
Bdft ii, tmt, unit-iii,  dyeing & types of dyeing,Bdft ii, tmt, unit-iii,  dyeing & types of dyeing,
Bdft ii, tmt, unit-iii, dyeing & types of dyeing,
 
Bsc agri 2 pae u-4.4 publicrevenue-presentation-130208082149-phpapp02
Bsc agri  2 pae  u-4.4 publicrevenue-presentation-130208082149-phpapp02Bsc agri  2 pae  u-4.4 publicrevenue-presentation-130208082149-phpapp02
Bsc agri 2 pae u-4.4 publicrevenue-presentation-130208082149-phpapp02
 
Bsc agri 2 pae u-4.3 public expenditure
Bsc agri  2 pae  u-4.3 public expenditureBsc agri  2 pae  u-4.3 public expenditure
Bsc agri 2 pae u-4.3 public expenditure
 
Bsc agri 2 pae u-4.2 public finance
Bsc agri  2 pae  u-4.2 public financeBsc agri  2 pae  u-4.2 public finance
Bsc agri 2 pae u-4.2 public finance
 
Bsc agri 2 pae u-4.1 introduction
Bsc agri  2 pae  u-4.1 introductionBsc agri  2 pae  u-4.1 introduction
Bsc agri 2 pae u-4.1 introduction
 
Bsc agri 2 pae u-3.3 inflation
Bsc agri  2 pae  u-3.3  inflationBsc agri  2 pae  u-3.3  inflation
Bsc agri 2 pae u-3.3 inflation
 
Bsc agri 2 pae u-3.2 introduction to macro economics
Bsc agri  2 pae  u-3.2 introduction to macro economicsBsc agri  2 pae  u-3.2 introduction to macro economics
Bsc agri 2 pae u-3.2 introduction to macro economics
 
Bsc agri 2 pae u-3.1 marketstructure
Bsc agri  2 pae  u-3.1 marketstructureBsc agri  2 pae  u-3.1 marketstructure
Bsc agri 2 pae u-3.1 marketstructure
 
Bsc agri 2 pae u-3 perfect-competition
Bsc agri  2 pae  u-3 perfect-competitionBsc agri  2 pae  u-3 perfect-competition
Bsc agri 2 pae u-3 perfect-competition
 

Recently uploaded

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
negromaestrong
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 

Recently uploaded (20)

Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Third Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptxThird Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 

Mca ii dfs u-5 sorting and searching structure

  • 1. Course: MCA Subject: Data and File Structure Unit-5 Sorting and Searching
  • 2. What is a Hash Table ?  The simplest kind of hash table is an array of records.  This example has 701 records. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] An array of records . . . [ 700]
  • 3. What is a Hash Table ?[1]  Each record has a special field, called its key.  In this example, the key is a long integer field called Number. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700] [ 4 ] Number 506643548
  • 4. What is a Hash Table ?[1]  The number might be a person's identification number, and the rest of the record has information about the person. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700] [ 4 ] Number 506643548
  • 5. Inserting a New Record[2]  In order to insert a new record, the key must somehow be converted to an array index.  The index is called the hash value of the key. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . Number 580625685
  • 6. Inserting a New Record[2]  Typical way to create a hash value: [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 (Number mod 701) What is (580625685 mod 701) ? 3
  • 7. Inserting a New Record[2]  The hash value is used for the location of the new record. Number 580625685 [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . [3]
  • 8. Collisions[3]  Here is another new record to insert, with a hash value of 2. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 Number 701466868 My hash value is [2].
  • 9. Collisions  This is called a collision, because there is already another valid record at . [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 Number 701466868
  • 10. Collisions  This is called a collision, because there is already another valid record at [2]. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 Number 701466868 The new record goes in the empty spot.
  • 11. Searching for a Key  The data that's attached to a key can be found fairly quickly. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 Number 701466868 Number 701466868
  • 12. Searching for a Key  Calculate the hash value.  Check that location of the array for the key. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 Number 701466868 Number 701466868 My hash value is [2]. Not me.
  • 13. Searching for a Key  Keep moving forward until you find the key, or you reach an empty spot. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 Number 701466868 Number 701466868 My hash value is [2]. Yes!
  • 14. Searching for a Key  When the item is found, the information can be copied to the necessary location. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 506643548Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 Number 701466868 Number 701466868 My hash value is [2]. Yes!
  • 15. Deleting a Record  Records may also be deleted from a hash table.  But the location must not be left as an ordinary "empty spot" since that could interfere with searches. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 Number 701466868
  • 16. Deleting a Record [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 233667136Number 281942902 Number 155778322 . . . Number 580625685 Number 701466868  Records may also be deleted from a hash table.  But the location must not be left as an ordinary "empty spot" since that could interfere with searches.  The location must be marked in some special way so that a search can tell that the spot used to have something in it.
  • 17. Handling collisions If the number of possible keys greatly exceeds the numbers of records, and of computed storage locations, hash collisions become inevitable and so have to be handled without loss of data. 3 approaches are used to handle collisions: open hashing quadratic hashing chained hashing
  • 18. Files A file can be seen as 1. A stream of bytes (no structure), or 2. A collection of records with fields
  • 19. A Stream File[4]  File is viewed as a sequence of bytes:  Data semantics is lost: there is no way to get it apart again. 87359CarrollAlice in wonderland38180FolkFile Structures ...
  • 20. Field and Record Organization Definitions Record: a collection of related fields. Field: the smallest logically meaningful unit of information in a file. Key: a subset of the fields in a record used to identify (uniquely) the record. e.g. In the example file of books:  Each line corresponds to a record.  Fields in each record: ISBN, Author, Title
  • 21. Record Keys  Primary key: a key that uniquely identifies a record.  Secondary key: other keys that may be used for search  Author name  Book title  Author name + book title  Note that in general not every field is a key (keys correspond to fields, or a combination of fields, that may be used in a search).
  • 22. Field Structures  Fixed-length fields 87359Carroll Alice in wonderland 38180Folk File Structures  Begin each field with a length indicator 058735907Carroll19Alice in wonderland 053818004Folk15File Structures  Place a delimiter at the end of each field 87359|Carroll|Alice in wonderland| 38180|Folk|File Structures|  Store field as keyword = value ISBN=87359|AU=Carroll|TI=Alice in wonderland| ISBN=38180|AU=Folk|TI=File Structures
  • 23. Record Structures 1. Fixed-length records. 2. Fixed number of fields. 3. Begin each record with a length indicator. 4. Use an index to keep track of addresses. 5. Place a delimiter at the end of the record.
  • 24. Fixed-length records[4] Two ways of making fixed-length records: 1. Fixed-length records with fixed-length fields. 2. Fixed-length records with variable-length fields. 87359 Carroll Alice in wonderland 03818 Folk File Structures 87359|Carroll|Alice in wonderland| unused 38180|Folk|File Structures| unused
  • 25. Variable-length records[5]  Fixed number of fields:  Record beginning with length indicator:  Use an index file to keep track of record addresses:  The index file keeps the byte offset for each record; this allows us to search the index (which have fixed length records) in order to discover the beginning of the record.  Placing a delimiter: e.g. end-of-line char 87359|Carroll|Alice in wonderland|38180|Folk|File Structures| ... 3387359|Carroll|Alice in wonderland|2638180|Folk|File Structures| ..
  • 26. File Operations  Typical Operations:  Retrieve a record  Insert a record  Delete a record  Modify a field of a record  In direct files:  Get a record with a given field value  In sequential files:  Get the next record
  • 27. Taxonomy of file structures[6]  The access method determines how records can be retrieved: sequentially or randomly. • One record after another, from beginning to end • Access one specific record without having to retrieve all records before it
  • 28. Sequential file  Sequential file –Records can only be accessed sequentially, one after another, from beginning to end.  Records are arranged in sequential manner.
  • 29. Mapping in an indexed file  To access a record in a file Randomly, you need to know the address of the record.  An index file can relate the key to the record address.
  • 30. Indexed files  An index file is made of a data file, which is a sequential file, and an index.  Index – a small file with only two fields:  The key of the sequential file  The address of the corresponding record on the disk.  To access a record in the file :  Load the entire index file into main memory.  Search the index file to find the desired key.  Retrieve the address the record.  Retrieve the data record. (using the address)  Inverted file – you can have more than one index, each with a different key.
  • 31. Mapping in a hashed file  A hashed file uses a hash function to map the key to the address.  Eliminates the need for an extra file (index).  There is no need for an index and all of the overhead associated with it.
  • 32. Direct Hashing  The file must contain a record for every possible key.  Adv. – no collision.  Disadv. – space is wasted.  Hashing techniques – map a large population of possible keys into a small address space.
  • 33. References 1. An introduction to Datastructure with application by jean Trembley and sorrenson 2. Data structures by schaums and series –seymour lipschutz 3. http://en.wikipedia.org/wiki/Book:Data_structures 4. http://www.amazon.com/Data-Structures-Algorithms 5. http://www.amazon.in/Data-Structures-Algorithms-Made-Easy/dp/0615459811/ 6. http://www.amazon.in/Data-Structures-SIE-Seymour-Lipschutz/dp List of images 1. http://www.expertsmind.com/questions/depth-first-search-30153061.aspx 2. http://burtleburtle.net/bob/hash/index.html 3.http://herakles.zcu.cz/~skala/PUBL/PUBL_2010/2010_WSEASCorfu_Hash-final.pdf 4. http://www.scribd.com/doc/199819816/Fundamental-Data-Structures 5. http://searchsqlserver.techtarget.com/definition/hashing 6. http://searchsqlserver.techtarget.com/definition/hashing