1. Data Structure
and
Algorithms
Lecturer: CHHAY Nuppakun
E-mail: nuppakunc@yahoo.com
Department of Computer Studies
Norton University - 2013
2. Chapter 1
Fundamental ideas
of
data structure
and
algorithm
3. Read Ahead
You are expected to read the lecture notes
before the lecture.
This will facilitate more productive discussion
during class.
Also, please proofread your work,
as you would for English class
assignments and tests.
4. Programs and programming
What is a program?
A set of instructions working with data
designed to accomplish a specific task
The “recipe” analogy
Ingredients are the Data
Directions are the Program Statements
What is programming?
The art and craft of writing programs
The art of controlling these “idiot servants” and
“naïve children”
5. Introduction to Programming
Programming is to solve problems using computers
How to do it at all ?
How to do it robustly ?
How to do it effectively ?
Programming consists of two steps:
Algorithmic design (the architects)
Coding (the construction workers)
Programming requires:
A programming language (C/C++/C#) to express your ideas
A set of tools to design, edit, and debug your code
A compiler to translate your programs into machine code
A machine to run the executable code
6. Crafting Programs Effectively
Program design
design process
stepwise refinement & top-down design
bottom-up design
modularization, interfaces
use of abstractions
Programming style
structured programming
readable code
effective use of language constructs
“formatting”
software organization
Documentation and comments
7. Good Programs
There are a number of facets to good
programs: they must
run correctly
run efficiently
be easy to read and understand
be easy to debug and
be easy to modify
Better running times will generally be
obtained from use of the most appropriate
data structures and algorithms
8. Why Data Structure and Algorithms
Computers are becoming ubiquitous …
programming gets you more out of computer
learn how to solve problems
dealing with abstractions
be more precise
Unfortunately, most people
know little about Computer Science
know little about Programming
write bad or buggy programs
become lost when writing large programs
9. Algorithms and Data Structures
Algorithm: a strategy for computing something, e.g.,
sorting: putting data in order by key
searching: finding data in some kind of index
finding primes and generating random numbers
string processing
graphics: drawing lines, arcs, and other geometric
objects
Data structure: a way to store data, e.g.,
arrays and vectors
linked lists
The two are related:
data structures organize data
algorithms use that organization
10. What are computers?
“idiot servants” that can do simple operations
incredibly fast if you tell them every step to do
like little children in their need for specific and
detailed instruction
computers are not “brains” & are not “smart” -
they are only as good as the program they are
running
11. Computer Environment: Hardware
Hardware
the physical, tangible parts of a computer
E.g., CPU, storage, keyboard, monitor
Central Processing Unit (CPU)
chip that executes program commands
e.g., Intel Pentium IV, Sun Sparc, Transmeta
Main Memory (also called RAM)
primary storage area for programs and data
Secondary storage
e.g., hard disk, CD-ROM
Input/output devices
e.g., keyboard, monitor
12. Computer Environment: Software
Operating System
E.g., Linux, Mac OS X, Windows 2000, Windows XP
manages resources such as CPU, memory, and disk
controls all machine activities
Application programs
generic term for any other kind of software
compiler, word processors, missile control systems,
games
13. Operating System
What does an OS do?
hides low level details of bare machine
arbitrates competing resource demands
Useful attributes
multi-user
multi-tasking
(Diagram: user programs reach the CPU, disk, and network
through the operating system.)
15. Main Program and Library Files
<preprocessor directives>
<global data and function declarations>
int main( )
{
<local data declarations>
<statements>
return 0;
}
<main program function implementation>
20. Expressions and Assignment
Operators: +, -, *, /, %, =, <, <=, >=, ==, !=,
&&, ||, !, ( )
Examples:
a = b = c = 5;
((a = b) = c) = 5; //?
a == 0; // comparison
a = 0; // assignment
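A small sketch of how these operators behave (the variable and function names here are illustrative, not from the slides):

```cpp
// '=' is right-associative: a = b = c = 5 parses as a = (b = (c = 5)),
// so all three variables end up holding 5.
int chain_demo()
{
    int a, b, c;
    a = b = c = 5;
    return a + b + c;   // 15
}

// A classic pitfall: '=' assigns, '==' compares.  "a = 0" stores 0
// in a, while "a == 0" merely tests a and changes nothing.
bool is_zero(int a)
{
    return a == 0;      // comparison, not assignment
}
```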
21. Type Conversion
<type name> (<expression>) or
(<type name>) <expression>
Example:
int (3.14) returns 3
(float) 3 returns 3.0
22. Interactive I/O
cout << "Enter an int, a float, and a string, "
     << "separated by spaces";
cin >> int_value >> float_value >> string;
23. Functions
double pow(double base, double exponent);
cout << pow(5,2) << endl; // function call
*********************************
void hello_world( )
{
cout << "Hello World" << endl;
}
hello_world( ); //call to a void function
24. Selection
if
if … else
switch
Iteration
for (<initialization>; <termination>; <update>)
while (<condition>) { <statements> }
do { <statements> } while (<condition>);
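A minimal sketch putting these constructs together (the grading cutoffs and function names are made up for illustration):

```cpp
// if / else if chooses among ranges.
char grade(int score)
{
    if (score >= 80)      return 'A';
    else if (score >= 60) return 'B';
    else                  return 'C';
}

// switch selects on a discrete value; cases can share a branch.
bool is_weekend(int day)            // 0 = Sunday ... 6 = Saturday
{
    switch (day) {
        case 0: case 6: return true;
        default:        return false;
    }
}

// The three iteration forms compute the same sum 1 + 2 + ... + n.
int sum_for(int n)
{
    int s = 0;
    for (int i = 1; i <= n; i++) s += i;
    return s;
}

int sum_while(int n)
{
    int s = 0, i = 1;
    while (i <= n) { s += i; i++; }
    return s;
}

int sum_do(int n)                   // body executes at least once
{
    int s = 0, i = 1;
    do { s += i; i++; } while (i <= n);
    return s;
}
```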
25. User Defined Type
Using typedef
typedef int boolean;
Using enum
enum weekday {MON, TUE, WED, THUR, FRI};
enum primary_color {RED, YELLOW, BLUE};
weekday day = MON;
primary_color color = RED;
27. Why do we need an array?
#include <iostream.h>
int value0;
int value1;
int value2;
…
int value999;

cin >> value0;
cin >> value1;
…
cin >> value999;
cout << value0;
cout << value1;
cout << value2;
…
cout << value999;
28. Array Declaration
<type> <ArrayName>[Size];
Example int value[1000];
Multidimensional Array
Declaration
<type> <ArrayName> [index0][...][indexN]
Example
int hiTemp[52][7];
int ThreeD[10][10][5];
29. Accessing an Array
Array initialization
for (i = 0; i <= 999; i++)
value[i] = 2*i - 1;
Each of an array’s elements can be accessed
in sequence by varying an array index
variable within a loop
Multidimensional arrays can be accessed with
nested loops.
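For instance, a nested pair of loops can initialize and then total a two-dimensional array (a small made-up variant of the hiTemp declaration from the previous slide):

```cpp
const int WEEKS = 4, DAYS = 7;

// Fill a 2-D array row by row with nested loops, then sum it the
// same way; each pass visits every element exactly once.
int fill_and_total()
{
    int hiTemp[WEEKS][DAYS];
    for (int w = 0; w < WEEKS; w++)        // outer loop: rows
        for (int d = 0; d < DAYS; d++)     // inner loop: columns
            hiTemp[w][d] = 1;              // placeholder reading
    int total = 0;
    for (int w = 0; w < WEEKS; w++)
        for (int d = 0; d < DAYS; d++)
            total += hiTemp[w][d];
    return total;                          // WEEKS * DAYS
}
```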
31. Algorithm
Definition
A step-by-step procedure for solving a
problem in a finite amount of time
Pseudo-code
is a compact and informal high-level
description of a computer programming
algorithm that uses the structural conventions
of a programming language
32. Algorithms (Continue)
Algorithm is used in computer science to describe a
problem-solving method suitable for implementation as a
computer program:
1. Most algorithms of interest involve methods of organizing the data
involved in the computation. Objects created in this way are called
data structures => algorithms and data structures go hand in hand
2. Whether we use a computer to solve small or huge problems, we
quickly become motivated to devise methods that use time or space as
efficiently as possible.
3. Careful algorithm design is an extremely effective part of the
process of solving a huge problem, whatever the application area
33. Algorithms (Continue)
4. When a huge or complex computer program is to be developed, a great
deal of effort must go into understanding and defining the problem to be
solved. In most cases, however, there are a few algorithms whose choice
is critical because most of the system resources will be spent running
those algorithms.
5. The sharing of programs in computer systems is becoming more
widespread, so the need to reimplement basic algorithms arises
frequently: we are often faced with completely new computing
environments (hardware and software) whose new features old
implementations may not use to best advantage. Careful design makes our
solutions more portable and longer lasting.
6. The choice of the best algorithm for a particular task can be a
complicated process, perhaps involving sophisticated mathematical
analysis. The branch of computer science that comprises the study of
such questions is called analysis of algorithms.
34. Analysis of Algorithms
Analysis is the key to understanding algorithms well enough to
use them effectively
Analysis plays a role at every point in the process of designing
and implementing algorithms
Mathematical analysis can play a role in comparing the
performance of algorithms
The following are among the reasons that we perform
mathematical analysis of algorithms:
To compare different algorithms for the same task
To predict performance in a new environment
To set values of algorithm parameters
35. Growth of Functions
Most algorithms have a primary parameter N that affects the
running time most significantly:
The parameter N might be the degree of a polynomial
the size of a file to be sorted or searched
the number of characters in a text string
or some other abstract measure of the size of the
problem being considered
We analyze running time by using mathematical formulas that are
as simple as possible and that are accurate for large values of
the parameters
36. Growth of Functions (Continue)
The algorithms we consider typically have running times
proportional to one of the following functions:
1 Most instructions of most programs are executed once or at most
only a few times, so the program's running time is constant
log N When the running time of a program is logarithmic, the
program gets slightly slower as N grows. This running time commonly
occurs in programs that solve a big problem by transformation into a
series of smaller problems
N When the running time of a program is linear, it is generally the
case that a small amount of processing is done on each input
element
N log N The N log N running time arises when algorithms solve a
problem by breaking it up into smaller subproblems, solving them
independently, and then combining the solutions
37. Growth of Functions (Continue)
N² When the running time of an algorithm is quadratic, that
algorithm is practical for use on only relatively small problems
N³ Similarly, an algorithm that processes triples of data items
(perhaps in a triply nested loop) has a cubic running time and is
practical for use on only small problems
2^N Few algorithms with exponential running time are likely to be
appropriate for practical use, even though such algorithms arise
naturally as brute-force solutions to problems.
The running time of a particular program is likely to be
some constant multiplied by one of these terms (the
leading term) plus some smaller terms.
39. Running Time
Most algorithms transform input objects into
output objects.
The running time of an algorithm typically
grows with the input size.
Average case time is often difficult to
determine.
We focus on the worst case running time.
Easier to analyze
Crucial to applications such as games, finance and
robotics
40. Experimental Studies
Write a program implementing the algorithm
Run the program with inputs of varying size and
composition
Use a function, like the built-in clock() function, to
get an accurate measure of the actual running time
Plot the results
41. Limitations of Experiments
It is necessary to implement the algorithm, which
may be difficult
Results may not be indicative of the running time on
other inputs not included in the experiment.
In order to compare two algorithms, the same
hardware and software environments must be used
42. Algorithm Analysis
c = a + b;
Operands: c, a, b
Operators: +, =
Simple model of computation steps:
- load operands (fetch time for c, a, b)
- perform operations (operation time for + and =)
- so the above instruction needs 3Tfetch + 1T+ + 1Tstore
43. Algorithm Analysis
int num= 25;
Operands: num, constant: 25, operator: =
Time needed: 1Tfetch + 1Tstore
n >= i;
Operands: n, i; operator: >=
Time needed: 2Tfetch + 1T>=
++i; // same as i = i + 1
Time needed: 2Tfetch + 1T+ + 1Tstore
45. Computing running time
Arithmetic series summation (e.g.)

1  unsigned int Sum (unsigned int n)
2  {
3    unsigned int result = 0;
4    for (int i = 0; i <= n; i++)
5      result += i;
6    return result;
7  }

Statement | Time                          | Code
3         | Tfetch + Tstore               | result = 0;
4a        | Tfetch + Tstore               | i = 0;
4b        | (2Tfetch + T<) * (n+1)        | i <= n;
4c        | (2Tfetch + T+ + Tstore) * n   | i++;
5         | (3Tfetch + T+ + Tstore) * n   | result += i;
6         | Tfetch + Treturn              | return result;
Total     | (7Tfetch + 2T+ + 2Tstore + T<) * n
          | + (5Tfetch + 2Tstore + T< + Treturn)

Computing the running time of the program
46. Big-Oh Notation
The mathematical artifact that allows us to suppress detail when we
are analyzing algorithms is called the O-notation, or "big-Oh
notation,"
Definition 1. A function g(N) is said to be O(f(N)) if there
exist constants c0 and N0 such that g(N) < c0 f(N) for all N > N0.
We use the O-notation for three distinct purposes:
To bound the error that we make when we ignore small
terms in mathematical formulas
To bound the error that we make when we ignore parts of a
program that contribute a small amount to the total being
analyzed
To allow us to classify algorithms according to upper bounds
on their total running times
47. Big-Oh Notation (Continue)
Often, the results of a mathematical analysis are not exact, but rather
are approximate in a precise technical sense
The O-notation allows us to keep track of the leading terms while
ignoring smaller terms when manipulating approximate mathematical
expressions
For example, if we expand the expression
(N + O(1)) (N + O(log N) + O(1)),
we get six terms:
N² + O(N) + O(N log N) + O(log N) + O(N) + O(1),
but can drop all but the largest O-term, leaving the approximation
N² + O(N log N).
That is, N² is a good approximation to this expression when N is
large.
48. Another Example
What if the input size is 10,000
Algorithm 1: 1,000,000
Algorithm 2: 100,000,000
Conclusion
Algorithm 1 is better!
Question:
Which one is REALLY better?
Confused!
Reason
Too precise!
Solution
Big-O notation – the order of the algorithm
A rougher measurement
Measures the rate of growth, ignoring constants and smaller terms
Better algorithms have a lower rate of growth
Remember
The order of an algorithm is generally more important than the
speed of the processor (CPU)
Why?
51. Data Structure
Definition
A data structure is a collection of data, generally
organized so that items can be stored and retrieved
by some fixed techniques
Example
An array
Stored and retrieved based on an index assigned to
each item
52. Data Structures vs. Software
How They Are Related
Software is designed to help people solve problems in
reality
To solve the problems, there are some THINGS, or
INFOs in reality to be processed
Those THINGS or INFOs are called DATA
DATA and their RELATIONS can be complicated
53. Data Structures vs. Software
How They Are Related
Reasonable organization of DATA helps improve software
efficiency and decrease software design difficulty
Experience accumulated in the past will be presented in this
course as certain DATA STRUCTURES, such as the linked list
and the binary tree
A DATA STRUCTURE is a smart way to organize DATA; it
depends on the features of the DATA and on how the DATA are
processed
54. Phases of Software Development
Phases
Specification of the task
Design of a solution
Implementation of the solution
Analysis of the solution
Testing and debugging
Maintenance and evolution of the system
Obsolescence
55. Phases of Software Development
Features of the Phases
NOT a fixed sequence
For example, in a widely used OO DESIGN method, the
Unified Process (UP), there are many iterations, and each
iteration involves specification, design, implementation,
and testing. Feedback from the previous iteration helps
improve the next iteration
You can find other examples in the textbook
Most phases are independent of programming languages
We will use Java for IMPLEMENTATION
However, most of what we learn in this course applies to
other languages
56. Arrays
The most fundamental data structure is the array
An array is a fixed number of data items that are stored
contiguously and that are accessible by an index
A simple example of the use of an array, which prints out all the
prime numbers less than 1000.
const int N = 1000;
main( )
{
  int i, j, a[N+1];
  for (a[1] = 0, i = 2; i <= N; i++) a[i] = 1;
  for (i = 2; i <= N/2; i++)
    for (j = 2; j <= N/i; j++) a[i*j] = 0;
  for (i = 1; i <= N; i++)
    if (a[i]) cout << i << ' ';
  cout << '\n';
}
57. Arrays (Continue)
The primary feature of arrays is that if the index is
known, any item can be accessed in constant time
The size of the array must be known beforehand, though it is
possible to declare the size of an array at execution time
Arrays are fundamental data structures in that they have a
direct correspondence with memory systems on virtually all
computers
We can view the entire computer memory as an array, with
memory addresses corresponding to array indices
58. Linked Lists
The second elementary data structure to
consider is the linked list
The primary advantage of linked lists
over arrays is that:
linked lists can grow and shrink in size during their
lifetime
their maximum size need not be known in advance
they make it possible to have several data structures share
the same space
59. Linked Lists (Continue)
A second advantage of linked lists is that:
they provide flexibility in allowing the items to be
rearranged efficiently
This flexibility is gained at the expense of quick
access to any arbitrary item in the list
A linked list is a set of items organized
sequentially, just like an array
(Figure: a linked list holding A, L, I, S, T.)
60. Linked Lists (Continue)
Flexible space use
Dynamically allocate space for each element as
needed
Include a pointer to the next item
Each node of the list contains
the data item (an object pointer in our ADT)
a pointer to the next node object
(Figure: a linked-list node with Data and Next fields.)
61. Linked Lists (Continue)
Collection structure has a pointer to the list head
Initially NULL
Add first item
Allocate space for node
Set its data pointer to object
Set Next to NULL
Set Head to point to new node
(Figure: Collection → Head → node [Data | Next] → object.)
62. Linked Lists (Continue)
Add second item
Allocate space for node
Set its data pointer to object
Set Next to current Head
Set Head to point to new node
(Figure: Collection → Head → node [object2] → node [object].)
63. Linked Lists (Continue)
A linked list with its dummy nodes
(figure: head → A, L, I, S, T → z).
Rearranging a linked list
(figure: head → T, A, L, I, S → z).
64. Linked Lists (Continue)
Insertion into and deletion from a linked list
(figure: X inserted into, then removed from, the list
head → A, L, I, S, T → z).
65. Linked Lists - LIFO and FIFO
Single Linked List
One-way cursor
Only can move forward
Simplest implementation
Add to head
Last-In-First-Out (LIFO) semantics
Modifications
First-In-First-Out (FIFO)
Keep a tail pointer
(Figure: singly linked list with head and tail pointers.)
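The two insertion policies above can be sketched as follows (the node and list layouts here are my own, not from the slides):

```cpp
// A node holding an int and a link to the next node.
struct Node {
    int   data;
    Node* next;
};

// List keeps both a head and a tail pointer, as described above.
struct List {
    Node* head;
    Node* tail;
};

void init(List& L) { L.head = L.tail = 0; }

// LIFO: the new node becomes the head (add-to-head).
void push_front(List& L, int x)
{
    Node* n = new Node;
    n->data = x;
    n->next = L.head;
    L.head = n;
    if (L.tail == 0) L.tail = n;   // first node is also the tail
}

// FIFO: with a tail pointer, appending is O(1).
void push_back(List& L, int x)
{
    Node* n = new Node;
    n->data = x;
    n->next = 0;
    if (L.tail == 0) L.head = L.tail = n;
    else { L.tail->next = n; L.tail = n; }
}
```

With push_back, items come off the head in arrival order (FIFO); with push_front, the most recently added item is at the head (LIFO).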
66. Linked Lists - Doubly linked
Doubly linked lists
Can be scanned in both directions
Two-way cursor
Can move forward and backward
(Figure: doubly linked list with head and tail; each node has
next and prev links.)
67. Linked List vs. Array
Arrays are better at random access
What is the 4th element in the list?
Arrays need O(1) time
Linked lists need O(n) time in the worst case
Linked lists are better at additions and
removals at a cursor
Operations at the cursor need O(1) time
Arrays have no cursor, so addition and removal
operations need O(n) time in the worst case
68. Linked List vs. Array
Resizing can be inefficient for an array
For arrays, capacity must be maintained in an inefficient
way
For linked lists, no problem
Summary
Array
Frequent random access operations
Linked lists
Operations occur at a cursor
Frequent capacity changes
Operations occur at a two-way cursor (DLL)
69. Storage Allocation
arrays are a rather direct representation of the
memory of the computer
a direct-array representation of linked lists
is to use "parallel arrays"
The advantage of using parallel arrays is that the structure
can be built on top of the data: the array key contains data,
and only data; all the structure is in the parallel array next
more data can be added with more parallel arrays
70. Pushdown Stacks
The most important restricted-access data structure is the
pushdown stack. Items are added and removed in a Last-In-
First-Out (LIFO) manner
two basic operations are involved: one can push an item
onto the stack (insert it at the beginning) and pop an item
(remove it from the beginning)
pushdown stacks appear as the fundamental data structure
for many algorithms
The stack is represented with an array stack and a pointer p
to the top of the stack; the functions push, pop, and empty
are straightforward implementations of the basic stack
operations
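A minimal sketch of that array representation (fixed capacity, no overflow checking):

```cpp
// The stack is an array plus a pointer p one past the top item,
// as described above.  push/pop/empty are the basic operations.
const int MAXSTACK = 100;
int stack_[MAXSTACK];   // named stack_ to avoid clashing with std::stack
int p = 0;

void push(int v) { stack_[p++] = v; }   // insert at the top
int  pop()       { return stack_[--p]; } // remove from the top
int  empty()     { return p == 0; }
```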
71. Stack Example – Math Parser
Define Parser
9 * ( 3 + 5 ) * (4 + 2) = ?
Why not 10 ?
In INFIX notation
Convert to Postfix using a STACK
9 3 5 + * 4 2 + *
Then compute using a STACK
Answer:
72. Infix -> Postfix Algorithm
9 * ( 3 + 5 ) * (4 + 2) = ?
Only worrying about +, *, and ( )
Initialize an empty stack
If you get a number, output it
If you get an operator, pop and output entries until
you reach one of lower priority, then push the new
operator
If you get a ‘)’, pop and output operators until
you clear a ‘(‘
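The steps above can be sketched directly (handling only single-digit numbers, +, *, and parentheses; the function and variable names are mine, not from the lecture):

```cpp
#include <string>

// Priority: '*' binds tighter than '+'; '(' has the lowest
// priority so it never gets popped by an operator.
static int prio(char op) { return op == '*' ? 2 : op == '+' ? 1 : 0; }

std::string to_postfix(const std::string& infix)
{
    std::string out;
    char stk[100];                  // operator stack
    int  top = 0;
    for (std::string::size_type i = 0; i < infix.size(); i++) {
        char c = infix[i];
        if (c >= '0' && c <= '9') out += c;   // number: output it
        else if (c == '(') stk[top++] = c;
        else if (c == ')') {                  // pop until '('
            while (top > 0 && stk[top-1] != '(') out += stk[--top];
            if (top > 0) --top;               // discard the '('
        } else if (c == '+' || c == '*') {    // operator
            while (top > 0 && prio(stk[top-1]) >= prio(c))
                out += stk[--top];            // pop higher/equal priority
            stk[top++] = c;
        }                                     // anything else is ignored
    }
    while (top > 0) out += stk[--top];        // flush remaining operators
    return out;
}
```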
90. Queues
Another fundamental restricted-access data structure
is called the queue
two basic operations are involved: one can insert (put) an
item into the queue at one end and remove (get) an item
from the other end
queues obey a "first in, first out" (FIFO) discipline
There are three class variables: the size of the queue
and two indices, one to the beginning of the queue
(head) and one to the end (tail)
If head and tail are equal, then the queue is defined to
be empty; but if a put would make them equal, then it is
defined to be full
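A sketch of that convention with a circular array (the head/tail names follow the slide; the capacity and wrap-around details are my own):

```cpp
// head == tail means empty; a put that would make them equal
// reports full, so the array holds at most MAXQ - 1 items.
const int MAXQ = 4;
int q[MAXQ];
int head = 0, tail = 0;

bool qempty() { return head == tail; }

bool put(int v)                       // insert at the tail
{
    int next = (tail + 1) % MAXQ;     // index after tail, wrapping
    if (next == head) return false;   // queue is full
    q[tail] = v;
    tail = next;
    return true;
}

int get()                             // remove from the head
{
    int v = q[head];
    head = (head + 1) % MAXQ;
    return v;
}
```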
91. Applications of Queues
Direct applications
Waiting lines
Access to shared resources (e.g., printer)
Multiprogramming
Indirect applications
Auxiliary data structure for algorithms
Component of other data structures
92. Queue Example
You: Bank of America employee
Boss: How many tellers do I need?
How do you go about solving this
problem?
Simulations!
What are the parameters?
93. Bank Teller Example
Classes
Data structures
Input
Time step = 5 sec
Transaction = 2 minutes
Customer Frequency = 50% chance every 15
seconds
What questions do we want to know?
Average wait time
Average line length
How a simulation would work
94. More Queue examples
Networking: Router
Computer Architecture: Execution Units
Printer queues
File systems
Wal-Mart checkout lines
Disney entrance
95. Recursion
Two Necessary Parts
Recursive calls
Stopping or base cases
Infinite recursion
Every recursive call produces another recursive call
Stopping case not well defined, or not reached
Very useful technique
Definition of mathematical functions
Definition of data structures
Recursive structures are naturally processed by
recursive functions!
Recursively defined functions
factorial
Fibonacci
GCD by Euclid’s algorithm
Games
Towers of Hanoi
96. Recurrences
The factorial function is defined by the formula
N! = N * (N-1)!, for N >= 1, with 0! = 1.
This corresponds directly to the following simple recursive program:
int factorial(int N)
{ if (N == 0) return 1;
return N * factorial(N-1);
}
This program illustrates the basic features of a recursive
program: it calls itself and it has a termination condition in which it
directly computes its result
97. Recurrences (Continue)
Well-known recurrence relation is the one that defines the Fibonacci
numbers:
F(N) = F(N-1) + F(N-2), for N >= 2, with F(0) = F(1) = 1
The recurrence corresponds directly to the simple recursive program:
int fibonacci(int N)
{ if (N <= 1) return 1;
return fibonacci(N-1) + fibonacci(N-2);
}
This is a less convincing example of the "power" of recursion: the
recursive calls compute F(N-1) and F(N-2) independently, so much of
the work is repeated.
98. Recurrences (Continue)
The relationship between recursive programs and
recursively defined functions is often more
philosophical than practical
The factorial function really could be implemented with a loop,
and the Fibonacci function is better handled by storing all
precomputed values in an array
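The array-based Fibonacci the slide suggests might look like this (a sketch; each value is computed once, so the running time is linear rather than exponential):

```cpp
// Fibonacci with F(0) = F(1) = 1, storing precomputed values in
// an array.  Valid for 0 <= N < MAXN; F(45) still fits in a
// 32-bit int.
int fibonacci_table(int N)
{
    const int MAXN = 46;
    int F[MAXN];
    F[0] = F[1] = 1;
    for (int i = 2; i <= N; i++)
        F[i] = F[i-1] + F[i-2];   // each term computed exactly once
    return F[N];
}
```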
99. Divide-and-Conquer
Most of the recursive programs use two recursive
calls, each operating on about half the input -
called "divide and conquer " paradigm for
algorithm design
Divide-and conquer is a general algorithm design
paradigm:
Divide: divide the input data S into two or more disjoint
subsets S1, S2, …
Recur: solve the subproblems recursively
Conquer: combine the solutions for S1, S2, …, into a
solution for S
100. Divide-and-Conquer (Continue)
A divide-and-conquer recursive program is a straightforward
way to accomplish our objective:
void rule(int l, int r, int h)
{ int m = (l+r)/2;
  if (h > 0)
  { rule(l, m, h-1);
    mark(m, h);
    rule(m, r, h-1);
  }
}
The idea behind the method is the following: to make the marks
in an interval, first make the long mark in the middle
101. Drawing a ruler (Preorder)
In detail, the list of procedure calls and marks resulting from
the call rule(0, 8, 3). We mark the middle and call rule for the
left half, then do the same for the left half, and so forth,
until a mark of length 0 is called for. Eventually we return from
rule and mark right halves in the same way.

rule (0,8,3)
  mark (4,3)
  rule (0,4,2)
    mark (2,2)
    rule (0,2,1)
      mark (1,1)
      rule (0,1,0)
      rule (1,2,0)
    rule (2,4,1)
      mark (3,1)
      rule (2,3,0)
      rule (3,4,0)
  rule (4,8,2)
    mark (6,2)
    rule (4,6,1)
      mark (5,1)
      rule (4,5,0)
      rule (5,6,0)
    rule (6,8,1)
      mark (7,1)
      rule (6,7,0)
      rule (7,8,0)
102. Drawing a ruler (Inorder version)
In general, divide-and-conquer algorithms involve doing some work
to split the input into two pieces, or to merge the results of
processing two independent "solved" portions of the input, or to
help things along after half of the input has been processed.

rule (0,8,3)
  rule (0,4,2)
    rule (0,2,1)
      rule (0,1,0)
      mark (1,1)
      rule (1,2,0)
    mark (2,2)
    rule (2,4,1)
      rule (2,3,0)
      mark (3,1)
      rule (3,4,0)
  mark (4,3)
  rule (4,8,2)
    rule (4,6,1)
      rule (4,5,0)
      mark (5,1)
      rule (5,6,0)
    mark (6,2)
    rule (6,8,1)
      rule (6,7,0)
      mark (7,1)
103. Divide-and-Conquer (Continue)
A nonrecursive algorithm, which does not correspond to any
recursive implementation, is to draw the shortest marks first,
then the next shortest, and so on:
void rule(int l, int r, int h)
{ int i, j, t;
  for (i = 1, j = 1; i <= h; i++, j += j)
    for (t = 0; l+j+t*(j+j) <= r; t++)
      mark(l+j+t*(j+j), i);
}
Combine and conquer: a method of algorithm design where we
solve a problem by first solving trivial subproblems, then
combining those solutions to solve slightly bigger subproblems,
and so on, until the whole problem is solved.
105. TREES GLOSSARY
So far we have considered linear structures, in which one item
follows another; we now consider two-dimensional linked
structures called trees
Trees are encountered frequently in everyday life
A tree is a nonempty collection of vertices and edges:
A vertex is a simple object (also referred to as a node)
An edge is a connection between two vertices
A path in a tree is a list of distinct vertices in which
successive vertices are connected by edges in the tree
One node in the tree is designated as the root; the defining
property of a tree is that there is exactly one path between
the root and each of the other nodes
If there is more than one path between the root and some node,
or if there is no path between the root and some node, then
what we have is a graph, not a tree
106. TREES
In computer science, a tree is an abstract model of a
hierarchical structure
Nodes with no children are sometimes called leaves, or
terminal nodes
Nodes with at least one child are sometimes called
nonterminal nodes
Nonterminal nodes are also referred to as internal nodes,
and terminal nodes as external nodes
Applications:
Organization charts
File systems
Programming environments
(Figure: a sample tree.)
107. TREES (Continue)
The nodes in a tree divide themselves into levels - the
level of a node is the number of nodes on the path from
the node to the root
The height of a tree is the maximum level among all
nodes in the tree (or the maximum distance to the root
from any node)
The path length of a tree is the sum of the levels of all
the nodes in the tree (or the sum of the lengths of the
paths from each node to the root)
The sample tree shown earlier has height 3 and path length 21
108. Binary Trees
A binary tree has nodes , similar to nodes in a
linked list structure.
Data of one sort or another may be stored at each
node.
But it is the connections between the nodes
which characterize a binary tree.
111. A Binary Tree of States
In this example, the data contained at each node is one of
the 50 states.
Each tree has a special node called its root, usually drawn
at the top.
112. A Binary Tree of States
Each node is permitted to have two links to other nodes,
called the left child and the right child.
Arkansas has a left child, but no right child.
Some nodes have only one child.
113. A Binary Tree of States
Washington is the parent of Arkansas and Colorado.
Each node is called the parent of its children.
A node with no children is called a leaf.
114. A Binary Tree of States
Two rules about parents:
The root has no
parent.
Every other node
has exactly one
parent.
115. A Binary Tree of States
Two nodes with the same parent are called siblings.
Arkansas and Colorado are siblings.
116. Complete Binary Trees
A complete binary tree is a special kind of binary tree
which will be useful to us.
When a complete binary tree is built, its first node must
be the root.
The second node of a complete binary tree is always the
left child of the root...
117. Complete Binary Trees
The second node of a complete binary tree is always the
left child of the root...
... and the third node is always the right child of the root.
The next nodes must always fill the next level from left
to right.
118. Binary Tree
Consists of
a node
left and right sub-trees
Each sub-tree is itself a binary tree
119. Trees - Performance
Find
Complete Tree
Height, h
Nodes traversed in a path from the root to a leaf
Number of nodes, n:
n = 1 + 2^1 + 2^2 + … + 2^h = 2^(h+1) - 1
h = floor( log2 n )
120. Trees - Performance
Find
Complete Tree
Since we need at most h+1 comparisons,
find in O(h+1) or O(log n)
Same as binary search
121. Summary
Binary trees contain nodes.
Each node may have a left child and a right child.
If you start from any node and move upward, you will
eventually reach the root.
Every node except the root has one parent. The root has no
parent.
Complete binary trees require the nodes to fill in each level
from left-to-right before starting the next level.
122. PROPERTIES
Property 1 - There is exactly one path connecting
any two nodes in a tree:
Any two nodes have a least common ancestor
It follows that any node can be the root: each node in a tree
has the property that there is exactly one path connecting that
node with every other node in the tree
Property 2 - A tree with N nodes has N - 1 edges
each node, except the root, has a unique parent, and every edge
connects a node to its parent
Property 3 - A binary tree with N internal nodes has N
+ 1 external nodes
A binary tree with no internal nodes has one external node
if the left subtree has k internal nodes, it has k + 1 external
nodes and the right subtree has N - k external nodes, for a
total of N + 1
123. PROPERTIES ( Continue )
Property 4 - The external path length of any binary
tree with N internal nodes is 2N greater than the
internal path length
start with the binary tree consisting of one external node
The process starts with a tree with internal and external
path length both 0 and, for each of N steps, increases the
external path length by 2 more than the internal path length
Property 5 - The height of a full binary tree with N
internal nodes is about log2 N
if the height is n, then we must have 2^(n-1) < N+1 <= 2^n,
since there are N + 1 external nodes
124. Representing Binary Trees
The most prevalent representation of binary trees is a
straightforward use of records with two links per node
For some representations it is appropriate to have two different
types of records, one for internal nodes and one for external
nodes; for others, it may be appropriate to use just one type
of node and to use the links in external nodes for some other
purpose
The parse tree for an expression is defined by the simple
recursive rule: "put the operator at the root and then put
the tree for the expression corresponding to the first
operand on the left and the tree corresponding to the
expression for the second operand on the right"
125. Representing Binary Trees ( Continue )
The parse tree for A B C + D E * * F + * (the same expression
in postfix); infix and postfix are two ways to represent
arithmetic expressions, and parse trees are a third
[Figure: Parse tree for A * ( ( ( B + C ) * ( D * E ) ) + F ), with * at the
root, A as its left child, and the subtree for ( ( B + C ) * ( D * E ) ) + F
on the right]
125
126. Representing Binary Trees ( Continue )
There are two other commonly used solutions. One option is to use a
different type of node for external nodes, one with no links. Another
option is to mark the links in some way (to distinguish them from
other links in the tree), then have them point elsewhere in the tree.
[Figure: Building the parse tree for A B C + D E * * F + * — the tree is
assembled bottom-up, one subtree per operator]
126
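The bottom-up construction in the figure can be sketched with a stack, following the postfix rule quoted earlier: push a leaf for each operand, and for each operator pop the right and left subtrees and make them children of a new root. A minimal C++ sketch, not the lecture's own code (pnode is a hypothetical node type):

```cpp
#include <stack>
#include <string>

// Hypothetical parse-tree node; 'info' holds one character of the expression.
struct pnode { char info; pnode *l; pnode *r; };

// Build a parse tree from a postfix expression such as "A B C + D E * * F + *":
// push a leaf for each operand; for each operator, pop the right and left
// subtrees and push a new node with the operator at the root.
pnode *build(const std::string &postfix)
{
    std::stack<pnode*> s;
    for (char c : postfix)
    {
        if (c == ' ') continue;                 // skip spacing
        pnode *t = new pnode{c, nullptr, nullptr};
        if (c == '+' || c == '-' || c == '*' || c == '/')
        {
            t->r = s.top(); s.pop();            // second operand
            t->l = s.top(); s.pop();            // first operand
        }
        s.push(t);
    }
    return s.top();
}
```

Running build("A B C + D E * * F + *") produces exactly the tree in the figure: * at the root, A on the left, and the + subtree on the right.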
127. TRAVERSING TREES
How to traverse a tree and systematically visit every node
- there are a number of different ways to proceed
The first method to consider is preorder traversal - the method
is defined by the simple recursive rule: "Visit the root, then
visit the left subtree, then visit the right subtree."
traverse(struct node *t)
{ stack.push(t);
while ( !stack.empty ( ) )
{ t = stack.pop(); visit(t);
if (t->r != z) stack.push(t->r);
if (t->l != z) stack.push(t->l);
}
}
127
129. TRAVERSING TREES (Continue)
The second method to consider is inorder traversal - it is
defined by the recursive rule "visit the left subtree,
then visit the root, then visit the right subtree,"
sometimes called symmetric order
The implementation of a stack-based program for inorder
is almost identical to the above program.
This method of traversal is probably the most widely
used
129
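The slides say the stack-based inorder program is close to the preorder one; one common way to write it pushes the left spine of the tree before visiting. A minimal C++ sketch, not the lecture's own code (the node layout mirrors the slides' struct node, and collecting items into a vector stands in for visit):

```cpp
#include <stack>
#include <vector>

// Node layout assumed from the slides' struct node.
struct node { int item; node *l; node *r; };

// Iterative inorder: descend the left spine pushing nodes, then visit
// the top of the stack and turn to its right subtree.
std::vector<int> inorder(node *t)
{
    std::vector<int> out;
    std::stack<node*> s;
    while (t != nullptr || !s.empty())
    {
        while (t != nullptr) { s.push(t); t = t->l; }   // left subtree first
        t = s.top(); s.pop();
        out.push_back(t->item);                         // visit the root
        t = t->r;                                       // then the right subtree
    }
    return out;
}
```

On a binary search tree this visits the keys in sorted order, which is one reason inorder is the most widely used traversal.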
131. TRAVERSING TREES (Continue)
The third method to consider is postorder traversal - it is
defined by the recursive rule "visit the left subtree,
then visit the right subtree, then visit the root."
Implementation of a stack-based program for postorder
is more complicated than for the other two because one
must arrange for the root and the right subtree to be
saved while the left subtree is visited and for the root to
be saved while the right subtree is visited.
131
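One way to arrange the bookkeeping the slide describes is the two-stack trick: the first stack emits nodes in root-right-left order, and the second stack reverses that into left-right-root. This is a hedged sketch, not the lecture's own code:

```cpp
#include <stack>
#include <vector>

// Node layout assumed from the slides' struct node.
struct node { int item; node *l; node *r; };

// Two-stack postorder: s1 produces root-right-left order; pushing each
// popped node onto s2 and then emptying s2 yields left-right-root.
std::vector<int> postorder(node *t)
{
    std::vector<int> out;
    if (t == nullptr) return out;
    std::stack<node*> s1, s2;
    s1.push(t);
    while (!s1.empty())
    {
        node *n = s1.top(); s1.pop();
        s2.push(n);                         // n is saved until its subtrees are done
        if (n->l) s1.push(n->l);
        if (n->r) s1.push(n->r);
    }
    while (!s2.empty()) { out.push_back(s2.top()->item); s2.pop(); }
    return out;
}
```

The second stack is exactly the "save the root while the subtrees are visited" arrangement the slide calls for.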
133. TRAVERSING TREES (Continue)
The fourth method to consider is level-order traversal - it is
not defined recursively at all - simply visit the nodes as they
appear on the page, reading down from top to bottom
and from left to right, because all the nodes on each level
appear together.
level-order traversal can be achieved by using the program above for
preorder, with a queue instead of a stack:
traverse(struct node *t)
{ queue.put(t);
while ( !queue.empty( ) )
{ t = queue.get( ); visit(t);
if (t->l != z) queue.put(t->l);
if (t->r != z) queue.put(t->r);
}
} 133
135. Heaps
A heap is a certain kind of complete binary tree.
When a complete binary tree is built, its first node must be the root.
135
136. Heaps
The second node is always the left child of the root.
The third node is always the right child of the root.
The next nodes always fill the next level from left-to-right.
136
137. Heaps
A heap is a certain kind of complete binary tree.
Each node in a heap contains a key that can be compared to other nodes' keys.
The "heap property" requires that each node's key is >= the keys of its children.
[Figure: a heap with 45 at the root, children 35 and 23, then 27, 21, 22, 4,
and 19 on the bottom level]
137
138. Adding a Node to a Heap
Put the new node in the next available spot.
Push the new node upward, swapping with its parent
until the new node reaches an acceptable location.
[Figure: 42 is added in the next available spot, below 27,
and then pushed upward]
138
139. Adding a Node to a Heap
The pushing upward stops when:
The parent has a key that is >= the new node, or
The node reaches the root.
The process of pushing the new node upward is
called reheapification upward.
[Figure: the heap after 42 has been pushed up to become a child of the root 45]
139
140. Removing the Top of a Heap
Move the last node onto the root.
Push the out-of-place node downward, swapping with its
larger child until the new node reaches an acceptable location.
[Figure: the last node, 27, is moved onto the root]
140
141. Removing the Top of a Heap
The pushing downward stops when:
The children all have keys <= the out-of-place node, or
The node reaches a leaf.
The process of pushing the new node downward is called
reheapification downward.
[Figure: the heap after 27 has been pushed down, with 42 at the root]
141
142. Implementing a Heap
Data from the root goes in the first location of the array.
Data from the next row goes in the next two array locations.
[Figure: the heap 42 / 35 23 / 27 21 stored as the array 42 35 23]
142
143. Implementing a Heap
Data from the next row goes in the next two array locations.
We don't care what's in the unused part of the array.
[Figure: the array 42 35 23 27 21]
143
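The array layout above can be sketched as a small max-heap class. The index arithmetic — children of the node at index i at 2*i+1 and 2*i+2, parent at (i-1)/2 — assumes the root is stored at index 0, a convention not spelled out on the slides; add performs reheapification upward and removeTop performs reheapification downward:

```cpp
#include <utility>
#include <vector>

// Array-based max-heap: the root lives at index 0, the children of the
// node at index i live at 2*i+1 and 2*i+2, and its parent at (i-1)/2.
struct Heap
{
    std::vector<int> a;

    // "Reheapification upward": place the key in the next available
    // spot, then swap it with its parent until the heap property holds.
    void add(int key)
    {
        a.push_back(key);
        int i = (int)a.size() - 1;
        while (i > 0 && a[(i - 1) / 2] < a[i])
        {
            std::swap(a[(i - 1) / 2], a[i]);
            i = (i - 1) / 2;
        }
    }

    // "Reheapification downward": move the last node onto the root,
    // then swap it with its larger child until it fits.
    int removeTop()
    {
        int top = a[0];
        a[0] = a.back();
        a.pop_back();
        int i = 0, n = (int)a.size();
        while (2 * i + 1 < n)
        {
            int c = 2 * i + 1;
            if (c + 1 < n && a[c] < a[c + 1]) c++;   // pick the larger child
            if (a[i] >= a[c]) break;
            std::swap(a[i], a[c]);
            i = c;
        }
        return top;
    }
};
```

Because the tree is complete, the array has no gaps, which is what makes this representation work.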
144. Summary
A heap is a complete binary tree, where the entry at
each node is greater than or equal to the entries in
its children.
To add an entry to a heap, place the new entry at the
next available spot, and perform a reheapification
upward.
To remove the biggest entry, move the last node
onto the root, and perform a reheapification
downward.
144
146. Sorting
In numerous sorting applications, a simple algorithm may
be the method of choice
often use a sorting program only once, or just a few
times
elementary methods are always suitable for small
files
As a rule, the elementary methods take time proportional
to N^2 to sort N randomly arranged items. If N is small, this
running time may be perfectly adequate
146
147. SELECTION SORT
find the smallest element in the array, and exchange it with
the element in the first position
find the second smallest element and exchange it with the
element in the second position
Continue in this way until the entire array is sorted
- It works by repeatedly selecting the smallest remaining
element
- A disadvantage of selection sort is that its running time
depends only slightly on the amount of order already in the
file.
147
148. Selection sort
For each i from l to r-1, exchange a[i]
with the minimum element in a[i], ..., a[r].
As the index i travels from left to right,
the elements to its left are in their final
position in the array (and will not be
touched again), so the array is fully sorted
when i reaches the right end.
template <class Item>
void selection(Item a[], int l, int r)
{ for (int i = l; i < r; i++)
{ int min = i;
for (int j = i+1; j <= r; j++)
if (a[j] < a[min]) min = j;
exch(a[i], a[min]);
}
}
148
149. INSERTION SORT
The method often used to sort bridge hands is to
consider the elements one at a time,
inserting each into its proper place
need to make space for the element being
inserted by moving larger elements one
position to the right
then inserting the element into the vacated
position
149
150. Insertion sort example
During the first pass of insertion
sort, the S in the second position is
larger than the A, so it does not
have to be moved. On the second
pass, when the O in the third
position is encountered, it is exchanged
with the S to put A O S in sorted order,
and so forth. Unshaded elements that are
not circled are those that were moved
one position to the right.
The running time of insertion sort primarily depends on the initial order
of the keys in the input. For example, if the file is large and the keys are
already in order (or even are nearly in order), then insertion sort is quick
and selection sort is slow.
150
151. Insertion sort
First puts the smallest element in the array into the
first position, so that that element can serve as a
sentinel;
For each i, it sorts the elements a[1], ..., a[i] by
moving one position to the right the elements in the sorted
list a[1], ..., a[i-1] that are larger than a[i],
then putting a[i] into its proper position.
template <class Item>
void insertion(Item a[], int l, int r)
{ int i;
for (i = r; i > l; i--) compexch(a[i-1], a[i]);
for (i = l+2; i <= r; i++)
{ int j = i; Item v = a[i];
while (v < a[j-1])
{ a[j] = a[j-1]; j--; }
a[j] = v;
}
}
151
152. BUBBLE SORT
Keep passing through the file
exchanging adjacent elements that are out of order
continuing until the file is sorted
whether it is actually easier to implement than insertion or
selection sort is arguable
Bubble sort generally will be slower than the other two
methods
152
153. Bubble Sort (Continue)
/* Bubble sort for integers */
#define SWAP(a,b) { int t; t=a; a=b; b=t; }
void bubble( int a[], int n )
{ int i, j;
for(i=0;i<n;i++)
{ /* n passes thru the array */
/* From start to the end of unsorted part */
for(j=1;j<(n-i);j++)
{/* If adjacent items out of order, swap */
if( a[j-1]>a[j] ) SWAP(a[j-1],a[j]);
}
}
}
153
154. Bubble sort example
Small keys percolate over to the left in
bubble sort. As the sort moves from right
to left, each key is exchanged with the one
on its left until a smaller one is
encountered. On the first pass, the E is
exchanged with the L, the P, and the M
before stopping at the A on the right; then
the A moves to the beginning of the file,
stopping at the other A, which is already
in position. The ith smallest key reaches its
final position after the ith pass, just as in
selection sort, but other keys are moved
closer to their final position, as well.
Bubble sort: O(n^2) - very simple code
Insertion sort: slightly better than bubble sort;
fewer comparisons - also O(n^2)
154
159. Searching
The goal of the search is to find all records with keys
matching a given search key
Applications of searching are widespread, and involve a
variety of different operations
Two common terms often used to describe data structures
for searching are dictionaries and symbol tables
Searching programs are in widespread and
frequent use; we study a variety of methods that store
records in arrays that are either searched with key
comparisons or indexed by key value.
159
160. Searching (Continue)
We regard search algorithms as belonging to packages
implementing a variety of generic operations that can
be separated from particular implementations, so
that alternate implementations can be substituted
easily. The operations of interest include:
Initialize the data structure.
Search for a record (or records) having a given key.
Insert a new record.
Delete a specified record.
Join two dictionaries to make a large one.
Sort the dictionary; output all the records in sorted order.
160
161. Searching (Continue)
A combined search-and-insert operation is often included for efficiency in
situations where records with duplicate keys are not to be kept within
the data structure
Records with duplicate keys can be handled in several ways:
to have the primary searching data structure contain only
records with distinct keys
to leave records with equal keys in the primary searching data
structure and return any record with the given key for a search
to assume that each record has a unique identifier (apart from
the key) and require that a search find the record with a given
identifier, given the key
to arrange for the search program to call a specified function
for each record with the given key
161
162. Sequential Searching
The simplest method for searching is to store
the records in an array:
When a new record is to be inserted, we put it
at the end of the array
When a search is to be performed, we look through
the array sequentially
162
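The array scheme just described can be sketched as a tiny symbol table; this is a hedged sketch with integer keys assumed, not the lecture's own code:

```cpp
#include <vector>

// Minimal array-based table: insert appends at the end of the array;
// search scans the array sequentially.
struct SeqTable
{
    std::vector<int> a;

    void insert(int key) { a.push_back(key); }

    // Returns the index of the key, or -1 for an unsuccessful search.
    int search(int key) const
    {
        for (int i = 0; i < (int)a.size(); i++)
            if (a[i] == key) return i;
        return -1;
    }
};
```

An unsuccessful search examines every record, which is exactly the N-comparison behavior that Property 1 below counts.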
163. Sequential Searching (Continue)
Property 1 - Sequential search (array implementation) uses
N + 1 comparisons for an unsuccessful search (always) and
about N/2 comparisons for a successful search (on the
average)
For unsuccessful search, this property follows directly
from the code: each record must be examined to decide
that a record with any particular key is absent. For
successful search, if we assume that each record is
equally likely to be sought, then the average number of
comparisons is (1 + 2 +…+ N)/N = (N + 1)/2, exactly
half the cost of unsuccessful search
163
164. Sequential Searching (Continue)
Property 2 - Sequential search (sorted list implementation)
uses about N/2 comparisons for both successful and
unsuccessful search (on the average)
For successful search, the situation is the same as
before. For unsuccessful search, if we assume that the
search is equally likely to be terminated by the tail node
z or by each of the elements in the list (which is the
case for a number of "random" search models), then the
average number of comparisons is the same as for
successful search in a table of size N + 1, or (N + 2)/2
164
165. Binary Search
Binary Search is an incredibly powerful technique for
searching an ordered list
The basic algorithm is to find the middle element of
the list
compare it against the key
decide which half of the list must contain the key
and repeat with that half
Two requirements to support binary search:
Random access of the list elements, so we need arrays
instead of linked lists.
The array must contain elements in sorted order by the
search key
165
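The halving scheme above can be sketched directly; a minimal C++ version on a sorted array of integers (returning an index, or -1 for an unsuccessful search — a convention assumed here, not stated on the slides):

```cpp
#include <vector>

// Binary search on a sorted array: compare the key against the middle
// element, then repeat on the half that can still contain the key.
int binsearch(const std::vector<int> &a, int key)
{
    int lo = 0, hi = (int)a.size() - 1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;       // avoids overflow of lo + hi
        if (a[mid] == key) return mid;
        if (a[mid] < key) lo = mid + 1;     // key can only be in the right half
        else              hi = mid - 1;     // key can only be in the left half
    }
    return -1;
}
```

Each iteration at least halves the interval, which gives the lg N + 1 bound of Property 3 below.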
166. Binary Search (Continue)
Property 3 - Binary search never uses more than lg N + 1
comparisons for either successful or unsuccessful search
This follows from the fact that the subfile size is at least halved at
each step: an upper bound on the number of comparisons satisfies
the recurrence C_N = C_(N/2) + 1 with C_1 = 1, which implies the stated
result.
It is important to note that the time required to insert new records is
high for binary search
Property 4 - Interpolation search uses fewer than lg lgN + 1
comparisons for both successful and unsuccessful search, in files of
random keys
This function is a very slowly growing one, which can be thought of
as a constant for practical purposes: if N is one billion, lg lgN < 5.
Thus, any record can be found using only a few accesses (on the
average), a substantial improvement over binary search
166
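Interpolation search, as in Property 4, replaces the midpoint probe with a position estimated from the key values at the ends of the interval. A hedged sketch with integer keys assumed, not the lecture's own code:

```cpp
#include <vector>

// Interpolation search: estimate where the key should fall between
// a[lo] and a[hi], probe there, and shrink the interval as in binary search.
int interpolation(const std::vector<int> &a, int key)
{
    int lo = 0, hi = (int)a.size() - 1;
    while (lo <= hi && key >= a[lo] && key <= a[hi])
    {
        if (a[hi] == a[lo])                  // avoid division by zero
            return (a[lo] == key) ? lo : -1;
        int mid = lo + (int)((long long)(key - a[lo]) * (hi - lo)
                             / (a[hi] - a[lo]));
        if (a[mid] == key) return mid;
        if (a[mid] < key) lo = mid + 1;
        else              hi = mid - 1;
    }
    return -1;
}
```

On uniformly distributed keys the estimate lands very close to the target, which is where the lg lg N behavior comes from; on skewed keys it can degrade.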
167. Binary Tree Search
Binary tree search is a simple, efficient
dynamic searching method that qualifies as one of
the most fundamental algorithms in computer
science
The defining property of a binary search tree is that each
node has left and right links, with keys smaller than the
node's key in its left subtree and larger keys in its right
[Figure: a binary search tree]
167
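A minimal sketch of binary-search-tree insertion and search in C++, assuming the usual convention that smaller keys go left and larger keys go right; this is an illustration, not the lecture's own code:

```cpp
// Binary search tree node: smaller keys live in the left subtree,
// larger keys in the right subtree.
struct bnode { int key; bnode *l; bnode *r; };

// Insert a key by walking down the tree and attaching a new leaf.
bnode *insert(bnode *t, int key)
{
    if (t == nullptr) return new bnode{key, nullptr, nullptr};
    if (key < t->key) t->l = insert(t->l, key);
    else              t->r = insert(t->r, key);
    return t;
}

// Search by comparing the key at each node and branching left or right.
bool search(bnode *t, int key)
{
    while (t != nullptr)
    {
        if (key == t->key) return true;
        t = (key < t->key) ? t->l : t->r;
    }
    return false;
}
```

Both operations follow one root-to-node path, so their cost is the node's depth — which is what Properties 5 and 6 below average and bound.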
168. Binary Tree Search (Continue)
Property 5 - A search or insertion in a binary search tree requires
about 2 ln N comparisons, on the average, in a tree built from N
random keys.
For each node in the tree, the number of comparisons used for
a successful search to that node is the distance to the root. The
sum of these distances for all nodes is called the internal path
length of the tree. Dividing the internal path length by N, we get
the average number of comparisons for successful search. But
if CN denotes the average internal path length of a binary
search tree of N nodes, we have the recurrence
Property 6 - In the worst case, a search in a binary search tree with
N keys can require N comparisons.
For example, when the keys are inserted in order (or in reverse
order), the binary-tree search method is no better than the
sequential search method that we saw at the beginning of this
chapter
168
Editor's Notes
Give an example, such as database of a company. Or more examples, sound, video games, …..
But, unlike a linked list, the connections between the nodes are more than a simple one-to-another progression. An example can illustrate the connections in a binary tree.
This is an example of a binary tree with nine nodes. Presumably each node contains information about one of the 50 states. In this example, the states are not arranged in any particular order, except insofar as I need to illustrate the different special kinds of nodes and connections in a binary tree.
Each node in a binary tree is permitted to have two links downward to other nodes, called the left child and the right child .
Some nodes have no children, and those nodes are called leaves . In this example, there are four leaves: Massachusetts, Oklahoma, New Hampshire (or is that Vermont?) and Nebraska. (Yes, that really is Nebraska. Either the author ran out of room on the slide and had to shrink it, or the author is from rival state Colorado.)
There are two rules about parents in any tree: 1. The root never has a parent. 2. Every other node has exactly one parent. There is also a related rule which is not written here, but is part of the definition of a tree: If you start at the root, there is always one way to get from the root to any particular node by following a sequence of downward links (from a parent to its child).
Two nodes that have the same parent are called siblings , as shown here. In a binary tree, a node has at most one sibling.
When a complete binary tree is built, its nodes are generally added one at a time. As with any tree, the first node must be the root.
The next node must be the right child of the root.
A quick summary . . .
The first node of a complete binary tree is always the root...
...then the right child of the root...
So, a heap is a complete binary tree. Each node in a heap contains a key, and these keys must be organized in a particular manner. Notice that this is not a binary search tree, but the keys do follow some semblance of order. Can you see what rule is being enforced here?
We can add new elements to a heap whenever we like. Because the heap is a complete binary tree, we must add the new element at the next available location, filling in the levels from left-to-right. In this example, I have just added the new element with a key of 42. Of course, we now have a problem: The heap property is no longer valid. The 42 is bigger than its parent 27. To fix the problem, we will push the new node upwards until it reaches an acceptable location.
In general, there are two conditions that can stop the pushing upward: 1. We reach a spot where the parent is >= the new node, or 2. We reach the root. This process is called reheapification upward (I didn't just make up that name, really).
We'll fix the problem by pushing the out-of-place node downward. Perhaps you can guess what the downward pushing is called.... reheapification downward .
Reheapification downward can stop under two circumstances: 1. The children all have keys that are <= the out-of-place node. 2. The out-of-place node reaches a leaf.
Following the usual technique for implementing a complete binary tree, the data from the root is stored in the first entry of the array.
As with any partially-filled array, we are only concerned with the front part of the array. If the tree has five nodes, then we are only concerned with the entries in the first five components of the array.