The document discusses curve fitting and regression analysis techniques. It explains the difference between interpolation and regression, and how polynomials and lines can be fitted to a set of data points. The objectives are to understand interpolation vs regression, how to fit polynomials to data, and how to use polynomials for interpolation. Examples are provided on finding the best fit line and parabola to given data sets. The key steps of determining constants that minimize the error in the fit are shown.
ENEM602 Spring 2007
Dr. Eng. Mohammad Tawfik
Regression/Curve Fitting
Objectives
• Understanding the difference between
regression and interpolation
• Knowing how to “best fit” a polynomial to
a set of data
• Knowing how to use a polynomial to
interpolate data
Measured Data
Polynomial Fit!
Line Fit!
Which is better?
Curve Fitting
• If the measured data are of high accuracy
and the values of the function between the
given points must be estimated, polynomial
interpolation is the best choice.
• If the measurements are expected to be of
low accuracy, or the number of measured
points is too large, regression would be
the best choice.
Regression
Why Regression?
• Measurements that we get from real
situations are not usually consistent!
• The number of “pieces” of information that
we can get about a certain project is
HUGE
• You can NEVER measure exact values!
Measured Data
But how do we get the equation of a
line that is “good” for all the data
we have?
Equation of a Line: Revision
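$$ y = a_0 + a_1 x $$
If you have two points
$$ y_1 = a_0 + a_1 x_1 $$
$$ y_2 = a_0 + a_1 x_2 $$
$$ \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix} $$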
Solving for the constants!
$$ a_0 = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1} \quad \text{and} \quad a_1 = \frac{y_2 - y_1}{x_2 - x_1} $$
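The two-point formulas for the constants can be tried out directly. The following is a minimal sketch; the function name `line_through_two_points` and the sample points are illustrative assumptions, not part of the slides.

```python
def line_through_two_points(x1, y1, x2, y2):
    """Return (a0, a1) of the line y = a0 + a1*x through two points."""
    if x1 == x2:
        # The denominator x2 - x1 vanishes: no line of the form y = a0 + a1*x
        raise ValueError("x1 and x2 must be different")
    a0 = (x2 * y1 - x1 * y2) / (x2 - x1)
    a1 = (y2 - y1) / (x2 - x1)
    return a0, a1


# Illustrative points (assumed for this sketch): (1, 3) and (3, 7)
a0, a1 = line_through_two_points(1.0, 3.0, 3.0, 7.0)
print(a0, a1)  # 1.0 2.0, i.e. y = 1 + 2x
```

With exactly two points the line passes through both of them, so there is no error to minimize.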
What if I have more than two
points?
For every point
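$$ y_1 \neq a_0 + a_1 x_1 $$
$$ y_2 \neq a_0 + a_1 x_2 $$
$$ \vdots $$
$$ y_n \neq a_0 + a_1 x_n $$
$$ \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Bmatrix} \neq \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} $$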
So, we may write the error vector
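$$ \begin{Bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{Bmatrix} = \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Bmatrix} - \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} $$
$$ \{e\}_{n \times 1} = \{y\}_{n \times 1} - [A]_{n \times 2} \{a\}_{2 \times 1} $$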
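To make the error vector concrete, the sketch below evaluates it for a trial line. The function name `error_vector`, the three data points, and the trial constants are assumptions chosen only for illustration.

```python
def error_vector(x, y, a0, a1):
    """Return the list e with e_i = y_i - (a0 + a1*x_i)."""
    return [yi - (a0 + a1 * xi) for xi, yi in zip(x, y)]


# Illustrative data (assumed): three points that do not lie on one line
x = [1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0]

# Trial line y = x, i.e. a0 = 0 and a1 = 1
e = error_vector(x, y, a0=0.0, a1=1.0)
squared_error = sum(ei * ei for ei in e)

print(e)              # [0.0, 1.0, -1.0]
print(squared_error)  # 2.0
```

A different choice of a0 and a1 gives a different squared error; regression looks for the pair that makes it smallest.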
$$ e^2 = \{e\}^T\{e\} = \{y\}^T\{y\} - 2\{a\}^T[A]^T\{y\} + \{a\}^T[A]^T[A]\{a\} $$
Note: this is a quadratic equation in {a}!
To minimize the error in the above equation, we need to
differentiate with respect to the parameters
$$ \frac{d e^2}{d\{a\}} = -2[A]^T\{y\} + 2[A]^T[A]\{a\} = 0 $$
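For the line fit, the same condition can be written out in scalar form as a check on the matrix derivative. This is a sketch; the summation index i is introduced here and does not appear on the slides.

$$ e^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2 $$
$$ \frac{\partial e^2}{\partial a_0} = -2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right) = 0 $$
$$ \frac{\partial e^2}{\partial a_1} = -2 \sum_{i=1}^{n} x_i \left( y_i - a_0 - a_1 x_i \right) = 0 $$

These two scalar equations are the first and second rows of the matrix equation, since the rows of [A]^T are a row of ones and the row of x values.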
Solving the equation
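$$ \frac{d e^2}{d\{a\}} = -2[A]^T\{y\} + 2[A]^T[A]\{a\} = 0 $$
We get:
$$ [A]^T_{2 \times n} [A]_{n \times 2} \{a\}_{2 \times 1} = [A]^T_{2 \times n} \{y\}_{n \times 1} $$
The product [A]^T[A] is a 2×2 matrix and [A]^T{y} is a 2×1 vector, so this is a 2×2 system for {a}:
$$ \left( [A]^T[A] \right)_{2 \times 2} \{a\}_{2 \times 1} = \left( [A]^T\{y\} \right)_{2 \times 1} $$
$$ \{a\} = \left( [A]^T[A] \right)^{-1} [A]^T \{y\} $$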
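For a line, the 2×2 system can be solved in closed form. The sketch below is one possible implementation, assuming plain Python lists; the function name `fit_line` and the test data are illustrative assumptions. The entries of [A]^T[A] are n, the sum of x, and the sum of x squared; the entries of [A]^T{y} are the sum of y and the sum of x times y.

```python
def fit_line(x, y):
    """Least-squares line y = a0 + a1*x via the 2x2 normal equations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two points and equal-length lists")

    # Entries of [A]^T [A] and [A]^T {y}
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))

    # Determinant of [A]^T [A]; it is zero when all x values coincide
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not unique")

    # Solve the 2x2 system with Cramer's rule
    a0 = (sxx * sy - sx * sxy) / det
    a1 = (n * sxy - sx * sy) / det
    return a0, a1


# Illustrative data (assumed) lying exactly on y = 1 + 2x
a0, a1 = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a0, a1)  # 1.0 2.0
```

When the data lie exactly on a line, the fit recovers that line and the error vector is zero.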
Example
• Given the data below, find the equation of
the “best-fit” line.
$y = a_0 + a_1 x$

x    y
1    0.5
2    2.5
3    2
4    4
5    3.5
6    6
7    5.5
Solution
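$$ \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \\ 1 & 6 \\ 1 & 7 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} $$
$$ [A] = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \\ 1 & 6 \\ 1 & 7 \end{bmatrix} \quad \text{and} \quad \{y\} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} $$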
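The remaining arithmetic can be carried out as follows. This is a sketch using the example data and the normal equations [A]^T[A]{a} = [A]^T{y}; the variable names are assumptions, and the printed numbers are produced by this sketch rather than copied from the slides.

```python
# Example data from the table
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]

# [A]: a column of ones and a column of x values
A = [[1.0, xi] for xi in x]

# [A]^T [A] (2x2) and [A]^T {y} (2x1)
AtA = [[sum(row[i] * row[j] for row in A) for j in range(2)] for i in range(2)]
Aty = [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(2)]

# Solve the 2x2 system with Cramer's rule
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
a0 = (AtA[1][1] * Aty[0] - AtA[0][1] * Aty[1]) / det
a1 = (AtA[0][0] * Aty[1] - AtA[1][0] * Aty[0]) / det

print(AtA)  # [[7.0, 28.0], [28.0, 140.0]]
print(Aty)  # [24.0, 119.5]
print(round(a0, 4), round(a1, 4))  # 0.0714 0.8393
```

Under these assumptions the best-fit line is approximately y = 0.0714 + 0.8393x.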
Homework #4
• Chapter 17, pp. 471-472, numbers:
17.4, 17.5.
• Use the data and regression to get the
equation of the line that best fits the data
• Number 17.7
• Use the data and regression to get the
equation of the line and the parabola that
best fit the data