Financial Econometric Models I

ESGF 5IFM Q1 2012
Financial Econometric Models
Vincent JEANNIN – ESGF 5IFM
Q1 2012

vinzjeannin@hotmail.com

Summary of the session (est 3h)

• Introduction & Objectives
• Bibliography
• OLS & Exploration

Introduction & Objectives
• What is a model? $Y = f(X) + \varepsilon$, with $\varepsilon$ being white noise
• What is the point of writing models?
   ◦ Describe data behaviour
   ◦ Model data behaviour
   ◦ Forecast data behaviour
• Acquire theoretical knowledge of Econometrics & Statistics
• Step by step from OLS to ANOVA on residuals
• Use of R and Excel

Bibliography

OLS & Exploration
OLS: Ordinary Least Squares

Linear regression model
Minimize the sum of the squared vertical distances between the observations and the linear approximation

$$\hat{y} = f(x) = ax + b$$

Residual $\varepsilon = y - \hat{y}$

Two parameters to estimate:
• Intercept α (written b in the derivation)
• Slope β (written a in the derivation)

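Minimising residuals

$$E = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left(y_i - (a x_i + b)\right)^2$$

When is E minimal?

When the partial derivatives with respect to a and b are 0

$$E = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left(y_i - (a x_i + b)\right)^2 = \sum_{i=1}^{n} (y_i - a x_i - b)^2$$

Quick high school reminder if necessary…

$$(y - ax - b)^2 = y^2 - 2axy - 2by + a^2x^2 + 2abx + b^2$$

$$\frac{\partial E}{\partial a} = \sum_{i=1}^{n} \left(-2 x_i y_i + 2 a x_i^2 + 2 b x_i\right) = 0 \qquad \frac{\partial E}{\partial b} = \sum_{i=1}^{n} \left(-2 y_i + 2 a x_i + 2 b\right) = 0$$

$$\sum_{i=1}^{n} \left(-x_i y_i + a x_i^2 + b x_i\right) = 0 \qquad \sum_{i=1}^{n} \left(-y_i + a x_i + b\right) = 0$$

$$a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i \qquad a \sum_{i=1}^{n} x_i + n b = \sum_{i=1}^{n} y_i$$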
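Leads easily to the intercept

$$a \sum_{i=1}^{n} x_i + n b = \sum_{i=1}^{n} y_i$$

$$a n \bar{x} + n b = n \bar{y}$$

$$a \bar{x} + b = \bar{y}$$

$$b = \bar{y} - a \bar{x}$$

The regression line goes through $(\bar{x}, \bar{y})$

The distance from this point to the line is indeed 0

$$b = \bar{y} - a \bar{x} \qquad y = ax + \bar{y} - a \bar{x}$$

$$y - \bar{y} = a (x - \bar{x})$$

$$\frac{\partial E}{\partial a} = \sum_{i=1}^{n} \left(-2 x_i y_i + 2 a x_i^2 + 2 b x_i\right) = 0 \qquad \frac{\partial E}{\partial b} = \sum_{i=1}^{n} \left(-2 y_i + 2 a x_i + 2 b\right) = 0$$

$$\sum_{i=1}^{n} x_i (y_i - a x_i - b) = 0 \qquad \sum_{i=1}^{n} (y_i - a x_i - b) = 0$$

Substituting $b = \bar{y} - a \bar{x}$:

$$\sum_{i=1}^{n} x_i (y_i - a x_i - \bar{y} + a \bar{x}) = 0 \qquad \sum_{i=1}^{n} (y_i - \bar{y} + a \bar{x} - a x_i) = 0$$

$$\sum_{i=1}^{n} x_i \left((y_i - \bar{y}) - a (x_i - \bar{x})\right) = 0 \qquad \sum_{i=1}^{n} \left((y_i - \bar{y}) - a (x_i - \bar{x})\right) = 0$$

Multiplying the second equation by $\bar{x}$:

$$\sum_{i=1}^{n} \bar{x} \left((y_i - \bar{y}) - a (x_i - \bar{x})\right) = 0$$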

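We have

$$\sum_{i=1}^{n} x_i \left((y_i - \bar{y}) - a (x_i - \bar{x})\right) = 0 \quad \text{and} \quad \sum_{i=1}^{n} \bar{x} \left((y_i - \bar{y}) - a (x_i - \bar{x})\right) = 0$$

$$\sum_{i=1}^{n} x_i \left((y_i - \bar{y}) - a (x_i - \bar{x})\right) = \sum_{i=1}^{n} \bar{x} \left((y_i - \bar{y}) - a (x_i - \bar{x})\right)$$

$$\sum_{i=1}^{n} x_i \left((y_i - \bar{y}) - a (x_i - \bar{x})\right) - \sum_{i=1}^{n} \bar{x} \left((y_i - \bar{y}) - a (x_i - \bar{x})\right) = 0$$

$$\sum_{i=1}^{n} (x_i - \bar{x}) \left((y_i - \bar{y}) - a (x_i - \bar{x})\right) = 0$$

Finally…

$$a = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Dividing the numerator and the denominator by n, the numerator is the covariance of x and y and the denominator is the variance of x:

$$a = \frac{\sigma_{xy}}{\sigma_x^2}$$

$$b = \bar{y} - a \bar{x}$$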

You can use the Excel functions INTERCEPT and SLOPE
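As a minimal sketch in R, the closed-form estimates can be checked against the built-in OLS fit. The data are made up for illustration and do not come from the slides.

```r
# Toy data, assumed for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

# Slope = covariance / variance; the (n - 1) divisors of cov() and var() cancel
a <- cov(x, y) / var(x)
# Intercept: the regression line goes through (mean(x), mean(y))
b <- mean(y) - a * mean(x)

# Same estimates from R's built-in OLS fit
fit <- lm(y ~ x)
print(c(slope = a, intercept = b))
print(coef(fit))
```

coef(fit) returns the intercept first and the slope second, so the two printed lines should show the same pair of numbers in reverse order.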

Calculate the variances and covariance of X = {1,2,3,3,1,2} and Y = {2,3,1,1,3,2}

You can use the Excel functions VAR.P, COVARIANCE.P and STDEV.P
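The same exercise can be sketched in R. Note that R's var() and cov() divide by n - 1, while the Excel .P functions divide by n, so the population versions are written out explicitly here.

```r
X <- c(1, 2, 3, 3, 1, 2)
Y <- c(2, 3, 1, 1, 3, 2)
n <- length(X)

# Population (divide by n) versions, matching Excel's VAR.P / COVARIANCE.P
var_x  <- mean((X - mean(X))^2)
var_y  <- mean((Y - mean(Y))^2)
cov_xy <- mean((X - mean(X)) * (Y - mean(Y)))

print(c(var_x = var_x, var_y = var_y, cov_xy = cov_xy))

# R's var() and cov() use n - 1, so rescale them to compare
print(c(var(X), var(Y), cov(X, Y)) * (n - 1) / n)
```

Both series have a mean of 2, each population variance is 2/3 and the covariance is -0.5.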

Let’s assess the quality of the regression

Let’s calculate the correlation coefficient (aka Pearson Product-Moment Correlation Coefficient, PPMCC):

$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

Value between -1 and 1

$|r| = 1$: perfect linear dependence

$|r| \approx 0$: no linear dependence

It gives an idea of the dispersion of the scatterplot

You can use the Excel function CORREL
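A short R sketch of the same coefficient, reusing the X and Y series of the variance exercise. The n versus n - 1 convention does not matter here because it cancels in the ratio.

```r
X <- c(1, 2, 3, 3, 1, 2)
Y <- c(2, 3, 1, 1, 3, 2)

# r = covariance / (sd_x * sd_y); the divisor convention cancels out
r_manual  <- cov(X, Y) / (sd(X) * sd(Y))
r_builtin <- cor(X, Y)

print(c(r_manual = r_manual, r_builtin = r_builtin))
```

Both values come out at -0.75: a fairly strong negative linear dependence.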

[Figure: two scatterplots. Poor quality: R = 0.62. High quality: R = 0.96.]

What is good quality?

Slightly discretionary…

$|r| \ge \frac{\sqrt{3}}{2} \approx 0.866$ (equivalently $r^2 \ge 0.75$) is widely accepted as the threshold between acceptable and poor

The regression itself introduces a bias

Let’s introduce the coefficient of determination R-Squared

Total Dispersion = Dispersion Regression + Dispersion Residual

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

$$R^2 = \frac{\text{Dispersion Regression}}{\text{Total Dispersion}}$$

In other words, the part of the total dispersion explained by the regression

You can use the Excel function RSQ

In a simple linear regression with intercept, $R^2 = r^2$
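The decomposition can be checked numerically. This is a sketch on made-up data; only the identities matter, not the numbers.

```r
# Toy data, assumed for illustration only
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(1.2, 2.9, 3.1, 4.8, 5.1, 7.2, 6.9, 9.1)

fit   <- lm(y ~ x)
y_hat <- fitted(fit)

sst <- sum((y - mean(y))^2)       # total dispersion
ssr <- sum((y_hat - mean(y))^2)   # dispersion explained by the regression
sse <- sum((y - y_hat)^2)         # residual dispersion

r2 <- ssr / sst
print(c(sst = sst, ssr_plus_sse = ssr + sse))
print(c(R2 = r2, r_squared = cor(x, y)^2, lm_R2 = summary(fit)$r.squared))
```

The two entries of the first line should agree, and the three entries of the second line should be identical, which is the $R^2 = r^2$ property of a simple regression with intercept.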

Is a good correlation coefficient and a good coefficient of determination enough to accept the regression?

Not necessarily!

Residuals must show no structure; in other words, they must be white noise!

Don’t get fooled by numbers!

For every dataset of the quartet (Anscombe's quartet)

$\bar{x} = 9$
$\bar{y} = 7.5$

$y = 3 + 0.5x$
$r = 0.82$
$R^2 = 0.67$

Can you say at this stage which regression is the best?

Certainly not those on the right: you need a LINEAR dependence
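The four data sets share the summary statistics of Anscombe's quartet, which ships with base R as the anscombe data frame. Assuming the slide indeed shows that quartet, the following sketch reproduces the numbers and makes it easy to plot each set.

```r
# anscombe is part of base R (package datasets): columns x1..x4 and y1..y4
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)
  c(mean_x    = mean(x),
    mean_y    = mean(y),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    r         = cor(x, y),
    R2        = summary(fit)$r.squared)
})
colnames(stats) <- paste0("set", 1:4)
print(round(stats, 2))
```

Calling plot(anscombe$x2, anscombe$y2), for instance, shows a clearly curved relationship despite identical regression output, which is the point of the slide.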

Is linear regression useless in that case?

Think about what you could do to the series

Polynomial transformation, log transformation,…

Otherwise, non-linear regressions, but that’s another story

First application to financial markets

S&P / AmEx in 2011

$$r = \frac{\sigma_{AmEx,S\&P}}{\sigma_{AmEx} \, \sigma_{S\&P}} = 0.8501$$

$$R^2 = r^2 = 0.7227$$

Oops :-o
Is Excel wrong?

R-Squared has different calculation methods

Let’s accept the following regression then, as the quality seems pretty good

$$AmEx = 0.06\% + 1.1046 \times S\&P$$

How to use this?

• Forecasting? Not really…
   Both are random variables

• Hedging? Yes, but with basis risk
   Yes, but be careful with the residuals…

In theory, what is the daily result of the hedge? $AmEx - 1.1046 \times S\&P = \alpha + \varepsilon$: the intercept plus the residual

Let’s have a try!

Hedging $1.0M of AmEx Stocks with $1.1046M of S&P
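The daily result of such a hedge can be sketched in R. Everything here is an assumption made for illustration: the returns are simulated rather than taken from the 2011 market data, the residual volatility is arbitrary, and the hedge is taken to be long AmEx and short S&P.

```r
# Simulated daily returns, assumed for illustration (NOT the 2011 market data)
set.seed(1)
n     <- 250
alpha <- 0.0006    # intercept of the regression (0.06%)
beta  <- 1.1046    # slope of the regression
spx   <- rnorm(n, mean = 0, sd = 0.015)
eps   <- rnorm(n, mean = 0, sd = 0.010)   # residual: the source of basis risk
amex  <- alpha + beta * spx + eps

# Long $1.0M of AmEx, short $1.1046M of S&P
pnl <- 1e6 * amex - beta * 1e6 * spx

# By construction the daily result is $1M * (alpha + residual)
print(summary(pnl))
print(sd(pnl))
```

The hedge removes the S&P exposure but leaves the whole residual in the daily result, so its volatility is that of the residuals, not zero.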

It would have been too easy… Large differences… Why?

Sensitivity to the size of the sample
Heteroscedasticity

Let’s take a similar approach using proper statistics and econometrics software: R

• Free
• Open Source
• Developments shared by developers

Let’s begin with a statistical exploration to get familiar with the series and the software

> Val<-read.csv(file="C:/Users/Vinz/Desktop/Val.csv",header=TRUE,sep=",")
> summary(Val)

      SPX                  AMEX
 Min.   :-0.0666344   Min.   :-0.0883287
 1st Qu.:-0.0069082   1st Qu.:-0.0094580
 Median : 0.0010016   Median : 0.0013007
 Mean   : 0.0001249   Mean   : 0.0005891
 3rd Qu.: 0.0075235   3rd Qu.: 0.0102923
 Max.   : 0.0474068   Max.   : 0.0710967

> hist(Val$AMEX, breaks=20, main="Distribution AMEX Returns")
> sd(Val$AMEX)
[1] 0.01915489

> hist(Val$SPX, breaks=20, main="Distribution SPX Returns")
> sd(Val$SPX)
[1] 0.01468776

These are obviously negatively skewed distributions

Reminders

$$S = E\left[\left(\frac{X - \mu}{\sigma}\right)^3\right] = \frac{E\left[(X - \mu)^3\right]}{\left(E\left[(X - \mu)^2\right]\right)^{3/2}}$$

• Negative skew: long left tail, mass on the right, skewed to the left
• Positive skew: long right tail, mass on the left, skewed to the right

> library(moments)
> skewness(Val$AMEX)
[1] -0.2453693
> skewness(Val$SPX)
[1] -0.4178701

These are obviously leptokurtic distributions

Reminders

$$K = E\left[\left(\frac{X - \mu}{\sigma}\right)^4\right] = \frac{E\left[(X - \mu)^4\right]}{\left(E\left[(X - \mu)^2\right]\right)^{2}}$$

> library(moments)
> kurtosis(Val$AMEX)
[1] 5.770583
> kurtosis(Val$SPX)
[1] 5.671254

What is their excess kurtosis? Subtract 3 to make it relative to the normal distribution: about 2.77 for AMEX and 2.67 for SPX

Quick check: what are the Skewness and Kurtosis of {1,2,-3,0,-2,1,1}?

Skewness: Excel function SKEW, R function skewness (package moments)
Kurtosis: Excel function KURT, R function kurtosis (package moments)
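The quick check can be done in base R straight from the definitions, without any package. This sketch uses the 1/n moments, the same convention as the skewness and kurtosis functions of the moments package. Excel's SKEW and KURT apply small-sample corrections, and KURT reports the excess kurtosis, so their values differ.

```r
x <- c(1, 2, -3, 0, -2, 1, 1)

# Central moments with a 1/n divisor
m2 <- mean((x - mean(x))^2)
m3 <- mean((x - mean(x))^3)
m4 <- mean((x - mean(x))^4)

skew <- m3 / m2^(3/2)
kurt <- m4 / m2^2     # subtract 3 for the excess kurtosis

print(c(skewness = skew, kurtosis = kurt, excess = kurt - 3))
```

The series has a mean of 0, a skewness of about -0.71 and a kurtosis of 2.03 (excess kurtosis -0.97): slightly left-skewed and platykurtic.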

By the way, what is the most platykurtic distribution in nature?

Toss it!

Heads = Success = 1 / Tails = Failure = 0

> library(moments)
> toss<-rbinom(10000000,1,0.5)
> mean(toss)
[1] 0.5001777
> kurtosis(toss)
[1] 1.000001
> kurtosis(toss)-3
[1] -1.999999
> hist(toss, breaks=10,main="Tossing a coin 10 million times",xlab="Result of the trial",ylab="Occurrence")
> sum(toss)
[1] 5001777

50.01777% rate of success: fair or not fair? Trick coin?

Can be tested later with a Bayesian approach

On a perfect 50/50, Kurtosis would be 1, Excess Kurtosis -2: the minimum!
This is a Bernoulli trial

$B(n, p)$ with $n \ge 1$ and $0 < p < 1$, $p \in \mathbb{R}$ and $n$ integer; a Bernoulli trial is the case $n = 1$, for which:

Mean        $p$
SD          $\sqrt{p(1 - p)}$
Skewness    $\frac{1 - 2p}{\sqrt{p(1 - p)}}$
Kurtosis    $\frac{1}{p(1 - p)} - 3$

Easy to demonstrate that the kurtosis is lowest when p = 0.5
A bit more complicated to demonstrate that it is the lowest of any distribution
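For the Bernoulli case the argument fits on one line, using only the kurtosis formula of the trial:

```latex
p(1-p) = \frac{1}{4} - \left(p - \frac{1}{2}\right)^2 \le \frac{1}{4}
\quad\Longrightarrow\quad
K(p) = \frac{1}{p(1-p)} - 3 \;\ge\; 4 - 3 = 1,
\qquad \text{with equality if and only if } p = \frac{1}{2}.
```

The excess kurtosis is therefore at least -2, reached only by the fair coin.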

Back to our series, a good tool is the BoxPlot

boxplot(Val$AMEX,Val$SPX, main="AMEX & S&P BoxPlots", names=c("AMEX","SPX"),col="blue")

[Figure: box plots of AMEX and SPX returns]

• Too many outliers! There should be 2 at most for the series to be normal
• Fatter tails than the normal distribution
• Leptokurtic distributions
• Negatively skewed distributions
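The "2 at most" rule of thumb can be recovered with a short calculation. This sketch assumes the default box plot convention (whiskers at most 1.5 times the interquartile range beyond the quartiles) and a sample of 251 daily returns, the sample size used in the Kolmogorov-Smirnov test later in the session.

```r
n <- 251   # number of daily returns in the sample

# Box plot whiskers reach at most 1.5 * IQR beyond the quartiles
q1 <- qnorm(0.25)
q3 <- qnorm(0.75)
upper_fence <- q3 + 1.5 * (q3 - q1)

# Probability that a normal observation falls outside either fence
p_out <- 2 * pnorm(upper_fence, lower.tail = FALSE)
expected_outliers <- n * p_out

print(c(p_out = p_out, expected_outliers = expected_outliers))
```

Under normality roughly 0.7% of the points fall outside the whiskers, which is fewer than 2 points for a sample of this size.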

Are they normal distributions?

Let’s compare them to normal distributions with the same standard deviation and mean, and make the QQ plots

# freq=FALSE plots densities, so the histogram is on the same scale as dnorm()
x=seq(-0.2,0.2,length.out=200)
y1=dnorm(x,mean=mean(Val$AMEX),sd=sd(Val$AMEX))
hist(Val$AMEX, breaks=100,freq=FALSE,main="AmEx Returns / Normal Distribution",xlab="Return",ylab="Density")
lines(x,y1,type="l",lwd=3,col="red")

x=seq(-0.2,0.2,length.out=200)
y1=dnorm(x,mean=mean(Val$SPX),sd=sd(Val$SPX))
hist(Val$SPX, breaks=20,freq=FALSE,main="S&P Returns / Normal Distribution",xlab="Return",ylab="Density")
lines(x,y1,type="l",lwd=3,col="red")

Excess kurtosis obvious

Fatter and longer tails

Let’s have a look at their quantiles (inverse CDF) through QQ plots

> qqnorm(Val$AMEX)
> qqline(Val$AMEX)

> qqnorm(Val$SPX)
> qqline(Val$SPX)

Fatter tails

Let’s properly test the normality

Many tests can be used…

• Kolmogorov-Smirnov
• Jarque-Bera
• Chi-Square
• Shapiro-Wilk

Let’s try Kolmogorov-Smirnov

It measures the maximum distance between the empirical CDF and the CDF of the reference distribution

# x covers the range of the daily returns
x=seq(-0.1,0.1,length.out=1000)
plot(ecdf(Val$AMEX),do.points=FALSE, col="red", lwd=3, main="Normal Distribution against AMEX - CDFs", xlab="x", ylab="P(X<=x)")
lines(x,pnorm(x,mean=mean(Val$AMEX),sd=sd(Val$AMEX)),col="blue",type="l",lwd=3)

x=seq(-0.1,0.1,length.out=1000)
plot(ecdf(Val$SPX),do.points=FALSE, col="red", lwd=3, main="Normal Distribution against S&P - CDFs", xlab="x", ylab="P(X<=x)")
lines(x,pnorm(x,mean=mean(Val$SPX),sd=sd(Val$SPX)),col="blue",type="l",lwd=3)

> ks.test(Val$SPX, "pnorm")

        One-sample Kolmogorov-Smirnov test

data:  Val$SPX
D = 0.4811, p-value < 2.2e-16
alternative hypothesis: two-sided

> ks.test(Val$AMEX, "pnorm")

        One-sample Kolmogorov-Smirnov test

data:  Val$AMEX
D = 0.4742, p-value < 2.2e-16
alternative hypothesis: two-sided

Note: called this way, ks.test compares each series with the standard normal N(0, 1), so the D statistics mostly reflect the difference in scale. To compare with a normal distribution of the same mean and standard deviation, pass them explicitly, e.g. ks.test(Val$SPX, "pnorm", mean=mean(Val$SPX), sd=sd(Val$SPX)).

The null hypothesis (H0) is that the distribution is normal

Do we accept or reject H0 at the 95% confidence level?

The hypothesis regarding the distributional form is rejected if the test statistic, D, is greater than the critical value obtained from a table

Sample size: 251

$$D_{crit} = \frac{1.36}{\sqrt{251}} \approx 0.086$$

Rejected or not?

Rejected! The series do not fit a normal distribution
The p-value was already giving the answer
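A minimal sketch of the test mechanics, on simulated returns rather than on the Val.csv series. It computes the critical value and shows how much the choice of reference distribution matters: ks.test(x, "pnorm") alone tests against the standard normal N(0, 1), while passing mean and sd tests against the fitted normal.

```r
# Hypothetical data: simulated normal daily returns, NOT the Val.csv series
set.seed(42)
n   <- 251
ret <- rnorm(n, mean = 0, sd = 0.015)

# Asymptotic 5% critical value of the one-sample KS statistic
d_crit <- 1.36 / sqrt(n)

# Reference = standard normal N(0, 1): the scale mismatch dominates D
ks_std <- ks.test(ret, "pnorm")

# Reference = normal with the sample mean and standard deviation
ks_fit <- ks.test(ret, "pnorm", mean = mean(ret), sd = sd(ret))

print(c(d_crit = d_crit,
        D_std  = unname(ks_std$statistic),
        D_fit  = unname(ks_fit$statistic)))
```

Even though the simulated series is normal by construction, the first call rejects it with a D close to 0.5, of the same order as the values in the output above. One caveat: when the mean and standard deviation are estimated from the same sample, the tabulated critical values are only approximate and tend to be conservative (the Lilliefors correction addresses this).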

OK, we now know a bit more about the 2 series we want to regress

> lm(Val$AMEX~Val$SPX)

Call:
lm(formula = Val$AMEX ~ Val$SPX)

Coefficients:
(Intercept)      Val$SPX
  0.0004505    1.1096287

plot(Val$SPX,Val$AMEX, main="S&P / AmEx", xlab="S&P", ylab="AmEx", col="red")
abline(lm(Val$AMEX~Val$SPX), col="blue")

$$AmEx = 110.96\% \times S\&P + 0.045\%$$

The next important step is to analyse the residuals

> Reg<-lm(Val$AMEX~Val$SPX)
> summary(Reg)

Call:
lm(formula = Val$AMEX ~ Val$SPX)

Residuals:
      Min        1Q    Median        3Q       Max
-0.030387 -0.006072 -0.000114  0.006624  0.027824

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0004505  0.0006365   0.708     0.48
Val$SPX     1.1096287  0.0434231  25.554   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.01008 on 249 degrees of freedom
Multiple R-squared:  0.7239,    Adjusted R-squared:  0.7228
F-statistic:   653 on 1 and 249 DF,  p-value: < 2.2e-16

They need to be white noise; you can get a first assessment from the quartiles

layout(matrix(1:4,2,2))   # set the 2x2 layout before plotting
plot(Reg)

The QQ plot compares the quantiles of the residuals with those of the normal distribution

A perfect fit is a straight line

Left tail noticeably different

Residuals should be randomly distributed around the 0 horizontal line

You don’t want to see a trend or a dependence

To accept the regression you need the residuals to be white noise

Their mean should be 0
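A zero mean on its own proves little: with an intercept, OLS residuals have a zero mean and are uncorrelated with the regressor by construction, since these are exactly the two first-order conditions of the minimisation. The sketch below illustrates this on simulated series (assumed for illustration, not the Val.csv data) and adds one rough check for what the fit does not guarantee.

```r
# Simulated series, assumed for illustration (NOT the Val.csv data)
set.seed(7)
spx  <- rnorm(251, mean = 0, sd = 0.015)
amex <- 0.0005 + 1.1 * spx + rnorm(251, mean = 0, sd = 0.010)

fit <- lm(amex ~ spx)
res <- resid(fit)

# First-order conditions: sum(res) = 0 and sum(spx * res) = 0 by construction
print(c(mean_res = mean(res), sum_x_res = sum(spx * res)))

# Not guaranteed by the fit: a constant spread of the residuals.
# Rough check: correlation between |residual| and the fitted values
print(cor(abs(res), fitted(fit)))
```

The first two numbers are zero up to rounding error whatever the data, so the informative diagnostics are the ones about the shape and spread of the residuals.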

Nothing suggesting white noise

• Square root of the absolute standardized residuals as a function of the fitted values
• There should be no obvious trend in this plot

Now showing leverage

Marginal importance of a point in the regression

Far points suggest outliers or a poor model

So do we accept the regression?

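Probably not… But let’s check…
Kolmogorov-Smirnov on residuals

$$D_{crit} = \frac{1.36}{\sqrt{251}} \approx 0.086$$

Upper bound of D for H0 to be accepted

Resid<-resid(Reg)
ks.test(Resid, "pnorm")

        One-sample Kolmogorov-Smirnov test

data:  Resid
D = 0.4889, p-value < 2.2e-16
alternative hypothesis: two-sided

Note: called this way, ks.test compares the residuals with the standard normal N(0, 1), so D mostly reflects the difference in scale. To test against a normal distribution with the residuals' own mean and standard deviation, use ks.test(Resid, "pnorm", mean=mean(Resid), sd=sd(Resid)).

Rejected! Regressions between 2 different assets are very often poor

Heteroscedasticity

Basis risk if you hedge anyway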

Conclusion

• OLS
• Residuals
• Normality
• Heteroscedasticity
