A Multiplicative Time Series Model For
Air Transport Demand Forecasting
Presented By
Mohammed Salem Awad
Consultant
YEMEN
Jairam Singh^1, A. A. Bashaswan^2, Mohd Salem Awad^3
Abstract:
Air travel demand is influenced by a variety of factors, which causes the number of
passengers to vary as a seasonally fluctuating, non-stationary time series. For
non-seasonal series it is possible to obtain a parsimonious representation in the form
of an ARMA model, but seasonality makes such a model cumbersome. In the present paper a
multiplicative model of the type (p,d,q)x(P,D,Q)s is used to represent a non-stationary
time series displaying seasonality at an interval of s observations. International
passenger data from the records of Yemenia (Yemen Airways) are used to estimate the
parameters of this model. The model gives fairly good forecast values.
1. Introduction:
There are several modes of transportation for passengers and freight, and air
transport is part of a larger product. Air travel is not an end in itself: air travel
demand depends on the demand for other products, namely leisure and business.
The airline product is characterized by the passenger and freight mix strategy. A seat
in the aircraft is very much like another. Similarly, the freight product is also
homogeneous, but the role of airfreight tends to be underestimated although it amounts
to one quarter of the RTK (revenue tonne-kilometre) output and one tenth of the total
revenue. The determinants of air travel demand are many, such as personal income, cost
of air travel, convenience and speed, level of trade, population distribution and
changes, and the level of economic activity. Institutional factors such as festivals,
working practices and school holidays cause the seasonal fluctuation in demand.
Figure 1 shows the time series of total international airline passengers recorded by
Yemenia / Aden Center [10]; the data are given in Appendix A.
1 Prof. of Industrial Management, Department of Mechanical Engineering, Faculty of Engineering, Aden University.
2 Associate Prof. of Mechanical Engineering, Faculty of Engineering, Aden University.
3 Mechanical Engineer, YEMENIA Yemen Airways.
This series exhibits periodic behavior with period s = 12 months: travel peaks in the
late summer months and a secondary peak occurs in the spring.
One of the deficiencies in the analysis of time series has been the confusion between
fitting a series and forecasting it. A common method of analyzing a time series is to
decompose it arbitrarily into three components: a "trend", a "seasonal component" and
a "random component". The trend can be fitted by a polynomial and the seasonal
component by a Fourier series. A forecast can then be made by projecting these fitted
functions. Such methods can give extremely misleading results: part of a time series
may at times look quadratic merely because of random deviations, and if a quadratic is
fitted to it, this chance behavior is taken to be a characteristic of the demand data.
Another model that may be used to represent seasonal variations is the general linear
model, which may be written as
$$Z_t = \sum_{j=1}^{\infty} \pi_j Z_{t-j} + a_t = \sum_{j=1}^{\infty} \psi_j a_{t-j} + a_t \tag{1}$$
With suitable values of the coefficients $\pi_j$ and $\psi_j$, this model is entirely
adequate to describe many seasonal time series. The problem is to choose a suitably
parsimonious parameterization for such models, involving a mixture of sines and
cosines and polynomial terms to allow for changes in the level of the series.
For non-seasonal series, it is usually possible to obtain a useful and
parsimonious representation in the form of an ARMA model,
$$\varphi(B) Z_t = \theta(B) a_t \tag{2}$$
where $B$ is the backward shift operator.
Moreover, the generalized autoregressive operator $\varphi(B)$ determines the eventual
forecast function, which is the solution of the difference equation
$\varphi(B)\hat{Z}_t(l) = 0$, that is,
$$(1 - \varphi_1 B - \varphi_2 B^2 - \varphi_3 B^3 - \dots - \varphi_p B^p)\,\hat{Z}_t(l) = 0 \tag{3}$$
where $B$ is understood to operate on $l$. In representing seasonal behavior we want
the forecast function to trace out a periodic pattern. For example, a forecast function
which is a sine wave with a twelve-month period, adaptive in phase and amplitude, will
satisfy the difference equation
$$(1 - \sqrt{3}\,B + B^2)\,\hat{Z}_t(l) = 0$$
However, it is not certain that the periodic behavior is truly represented in this way;
it might need many sine and cosine components.
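The claim that a twelve-month sine wave satisfies this difference equation can be checked numerically. The sketch below is only an illustration: the function name `apply_seasonal_ar` and the amplitude and phase values are assumptions chosen for the example, not quantities from the paper.

```python
import math


def apply_seasonal_ar(values):
    """Apply (1 - sqrt(3) B + B^2) to a sequence, returning one value per index >= 2."""
    root3 = math.sqrt(3.0)
    return [values[l] - root3 * values[l - 1] + values[l - 2]
            for l in range(2, len(values))]


# Illustrative amplitude and phase (hypothetical values).
amplitude, phase = 2.5, 0.7
wave = [amplitude * math.sin(2.0 * math.pi * l / 12.0 + phase) for l in range(36)]

# Every entry should vanish up to floating-point rounding.
max_abs_residual = max(abs(r) for r in apply_seasonal_ar(wave))
```

The result does not depend on the amplitude or phase chosen, which is what "adaptive in phase and amplitude" means here: the operator fixes only the period.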
Consider what happens when we try to induce stationarity by differencing $d$ times
and write
$$\varphi(B) = \phi(B)(1 - B)^d$$
which is equivalent to setting $d$ roots of the equation $\varphi(B) = 0$ equal to
unity. When such a representation proved adequate, $\nabla = 1 - B$ was used as a
simplifying operator to convert a non-stationary series into a stationary series. The
fundamental fact about a seasonal time series is that observations a certain interval
apart, called the period $s$, are similar to each other. The operation $B^s Z_t$
therefore represents the observation $s$ intervals earlier, i.e.
$$B^s Z_t = Z_{t-s}$$
and the observations which exhibit the seasonality are
$Z_t, Z_{t-s}, Z_{t-2s}, Z_{t-3s}, \dots$
This series may also be expected to be non-stationary; therefore the simplifying
operator $\nabla_s Z_t = (1 - B^s) Z_t = Z_t - Z_{t-s}$ might be useful to make it
stationary [2,3,4].
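The two simplifying operators can be sketched in a few lines. This is a minimal illustration on a made-up series with period 4, not the Yemenia data; the helper name `difference` is an assumption of the example.

```python
def difference(series, lag=1):
    """Return z[t] - z[t - lag] for t = lag, ..., len(series) - 1."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Illustrative series: linear trend plus a repeating pattern of period 4.
pattern = [5, 1, 3, 7]
series = [2 * t + pattern[t % 4] for t in range(12)]

seasonal_diff = difference(series, lag=4)      # applies (1 - B^4)
both_diffs = difference(seasonal_diff, lag=1)  # then applies (1 - B)
```

Seasonal differencing removes the repeating pattern and leaves a constant (the trend accumulated over one period); the additional ordinary difference removes that constant as well.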
2. The Multiplicative Model
The seasonal effect implies that an observation for a particular month, say April, is
related to the observations for previous Aprils. Suppose the $t$-th observation $Z_t$
is for the month of April. We might be able to link this observation $Z_t$ with the
observations in previous Aprils by a model of the form
$$\Phi(B^s)\,\nabla_s^D Z_t = \Theta(B^s)\,\alpha_t \tag{4}$$
where $s = 12$, $\nabla_s = 1 - B^s$, and $\Phi(B^s)$, $\Theta(B^s)$ are polynomials in
$B^s$ of degree $P$ and $Q$ respectively, satisfying the stationarity and invertibility
conditions. Similarly, a model of the form [5,6,7,8]
$$\Phi(B^s)\,\nabla_s^D Z_{t-1} = \Theta(B^s)\,\alpha_{t-1} \tag{5}$$
might be used to link the current behavior for March with the previous March
observations, and so on, for each of the twelve months.
Now the error components $\alpha_t, \alpha_{t-1}, \alpha_{t-2}, \dots$ would not in
general be uncorrelated. For example, the total airline passengers in April 1990, while
related to previous April totals, would also be related to the totals in March 1990,
February 1990, January 1990, etc. Thus we would expect $\alpha_t$ in Eq. (4) to be
related to $\alpha_{t-1}$ in Eq. (5), and so on.
A second model may be introduced to take care of such relationships,
$$\phi_p(B)\,\nabla^d \alpha_t = \theta_q(B)\,a_t \tag{6}$$
where $a_t$ is a white noise process, $\phi_p(B)$ and $\theta_q(B)$ are polynomials in
$B$ of degree $p$ and $q$ respectively, and $\nabla = \nabla_1 = 1 - B$.
Substituting Eq. (6) in Eq. (4), we get
$$\phi_p(B)\,\Phi_P(B^s)\,\nabla^d \nabla_s^D Z_t = \theta_q(B)\,\Theta_Q(B^s)\,a_t \tag{7}$$
where, for this particular example, $s = 12$. The resulting multiplicative process is
said to be of order (p,d,q)x(P,D,Q)s.
3. Choice of Transformation of the Data from Yemen Airways
(YEMENIA Aden Center)
It is particularly true for seasonal models that the weighted averages of previous
data values, which comprise the forecasts, may extend far back into the series; care is
therefore needed in choosing a transformation in terms of which a parsimonious linear
model will closely apply over a sufficient stretch of the series. A data-based
transformation may help determine in what metric the amplitude of the seasonal
component is roughly independent of the level of the series. Let us assume that some
power transformation, $z = x^{\lambda}$ for $\lambda \neq 0$ and $z = \ln x$ for
$\lambda = 0$, may be needed to make model (7) appropriate. The approach of Box and
Cox [11] may be followed: the model is fitted to
$$x^{(\lambda)} = \frac{x^{\lambda} - 1}{\lambda\,\dot{x}^{\lambda - 1}}$$
for various values of $\lambda$, and the maximum likelihood value is the one which
results in the smallest residual sum of squares $S_{\lambda}$. In this expression
$\dot{x}$ is the geometric mean of the series. It was shown by Box and Jenkins [9] for
their airline data that the maximum likelihood value is close to $\lambda = 0$,
confirming, for this particular example, the appropriateness of the logarithmic
transformation.
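A minimal sketch of the normalized power transformation follows. The function names and the passenger counts are hypothetical and serve only as an illustration; the $\lambda = 0$ branch uses the limiting form $\dot{x}\,\ln x$ of the normalized expression, which is a standard property of the Box and Cox family rather than something stated in this paper.

```python
import math


def geometric_mean(xs):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def normalized_box_cox(xs, lam):
    """Normalized power transform (x^lam - 1) / (lam * g^(lam - 1)), g = geometric mean.

    For lam == 0 the limiting form g * ln(x) is used.
    """
    g = geometric_mean(xs)
    if lam == 0:
        return [g * math.log(x) for x in xs]
    return [(x ** lam - 1.0) / (lam * g ** (lam - 1.0)) for x in xs]


# Hypothetical monthly passenger totals, for illustration only.
passengers = [3500.0, 2900.0, 4100.0, 18300.0]
transformed_log = normalized_box_cox(passengers, 0.0)
transformed_sqrt = normalized_box_cox(passengers, 0.5)
```

Because of the normalization by the geometric mean, residual sums of squares obtained after fitting the model to the transformed series are comparable across values of $\lambda$, which is what allows the smallest $S_{\lambda}$ to be picked directly.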
The monthly totals of passengers in international travel show seasonal behavior with
period s = 12. The data are shown in Table 1, which gives the natural logarithms of
the airline data.
Table 1. Natural Logarithms of Monthly Passenger Totals in International Air Travel
by Yemenia, Aden Center (Using Aden as a hub).
4. Representation of the Airline Data by a
Multiplicative (p,d,q)x(P,D,Q)12 Model
The arrangement of Table 1 emphasizes the fact that in periodic data there are two
intervals of importance. We expect relationships to occur:
(a) between the observations for the same month in successive years;
(b) between the observations in successive months of a particular year.
Identification of Multiplicative Model
A tentative identification of a time series model is made by analysis of historical
data. Usually at least 50 observations are required to achieve satisfactory results. The
primary tool used in this analysis is the autocorrelation function.
Consider the time series $Z_1, Z_2, \dots, Z_N$.
The theoretical autocorrelation function is
$$\rho_k = \frac{E\left[(Z_t - \mu)(Z_{t-k} - \mu)\right]}{\sigma_z^2}, \qquad k = 0, 1, 2, \dots, K \tag{8}$$
where $\sigma_z^2$ is the variance of the series. The quantity $\rho_k$ is called the
autocorrelation at lag $k$. Obviously $\rho_0 = 1$. The theoretical autocorrelation
function is never known with certainty and must be estimated. A satisfactory estimate
of $\rho_k$ is the sample autocorrelation function
$$\hat{\rho}_k = \frac{\dfrac{1}{N-k}\displaystyle\sum_{t=1}^{N-k}(Z_t - \bar{Z})(Z_{t+k} - \bar{Z})}{\dfrac{1}{N}\displaystyle\sum_{t=1}^{N}(Z_t - \bar{Z})^2} \tag{9}$$
For useful results, we would usually compute the first $K \le N/4$ autocorrelations.
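Eq. (9) translates directly into code. The sketch below is illustrative; the function name `sample_autocorrelation` is an assumption of the example, and the code uses zero-based list indices, pairing each value with the one $k$ steps later.

```python
def sample_autocorrelation(z, k):
    """Sample autocorrelation at lag k, with 1/(N-k) scaling of the lagged products."""
    n = len(z)
    mean = sum(z) / n
    variance = sum((v - mean) ** 2 for v in z) / n
    lagged = sum((z[t] - mean) * (z[t + k] - mean) for t in range(n - k)) / (n - k)
    return lagged / variance


def sample_acf(z, max_lag):
    """Autocorrelations at lags 0, 1, ..., max_lag."""
    return [sample_autocorrelation(z, k) for k in range(max_lag + 1)]


# Illustrative alternating series: strong negative correlation at lag 1.
example = [1.0, -1.0] * 4
acf_values = sample_acf(example, max_lag=len(example) // 4)
```

The cut-off `len(example) // 4` mirrors the rule of thumb that only about the first $N/4$ autocorrelations are worth computing.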
As a supplemental aid, the partial autocorrelation function often proves useful. We
define the partial autocorrelation coefficient $\phi_{kk}$ as the $k$-th coefficient in
an autoregressive process of order $k$. It can be shown (Ref. 9, chapter 3) that the
partial autocorrelation coefficients satisfy the following Yule-Walker equations:
$$\rho_j = \phi_{k1}\rho_{j-1} + \phi_{k2}\rho_{j-2} + \phi_{k3}\rho_{j-3} + \dots + \phi_{kk}\rho_{j-k}, \qquad j = 1, 2, \dots, k \tag{10}$$
The partial autocorrelation coefficients may be estimated by substituting
$\hat{\rho}_j$ for $\rho_j$ in Eq. (10), yielding
$$\hat{\rho}_j = \hat{\phi}_{k1}\hat{\rho}_{j-1} + \hat{\phi}_{k2}\hat{\rho}_{j-2} + \hat{\phi}_{k3}\hat{\rho}_{j-3} + \dots + \hat{\phi}_{kk}\hat{\rho}_{j-k}, \qquad j = 1, 2, \dots, k \tag{11}$$
and solving the resulting equations for $k = 1, 2, \dots, K$ to obtain
$\hat{\phi}_{11}, \hat{\phi}_{22}, \dots, \hat{\phi}_{KK}$, the sample partial
autocorrelation function.
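One convenient way to solve the Yule-Walker equations for successive orders is the Durbin-Levinson recursion. The paper does not say which solution method was used, so the sketch below is an assumption of this example; the function name is hypothetical.

```python
def partial_autocorrelations(rho, max_lag):
    """Solve the Yule-Walker equations for orders 1..max_lag by Durbin-Levinson.

    rho[j] is the autocorrelation at lag j, with rho[0] == 1.
    Returns [phi_11, phi_22, ..., phi_KK].
    """
    pacf = []
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, max_lag + 1):
        numerator = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        denominator = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = numerator / denominator
        # Update the lower-order coefficients: phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}
        current = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf


# Illustrative check: an AR(1) process with coefficient 0.6 has rho_j = 0.6**j,
# so its partial autocorrelations cut off after lag 1.
ar1_rho = [0.6 ** j for j in range(5)]
ar1_pacf = partial_autocorrelations(ar1_rho, max_lag=4)
```

In practice the sample autocorrelations $\hat{\rho}_j$ would be passed in place of the theoretical values used in this check.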
From the estimated autocorrelation function, which can be conveniently exhibited as a
graph, a tentative model is identified by comparison with theoretical autocorrelation
function patterns. These patterns are discussed in reference [9].
The sample autocorrelation and partial autocorrelation functions of a non-stationary
time series die down extremely slowly from a value of one. If this type of behavior is
exhibited, the usual approach is to compute the autocorrelation and partial
autocorrelation functions for the first difference of the series. If these functions
behave according to the characteristics of a stationary series, then one degree of
differencing is needed to achieve stationarity.
In the case of the Yemenia (Aden Center) data, after first differencing, the
autocorrelations for all lags beyond the first are effectively zero (see Table 2).
Therefore an IMA(0,1,1) model is appropriate. This contains no seasonal component.
Suppose we want to use this model to link data 12 months apart; then the model
would be
$$\nabla_{12} Z_t = (1 - \Theta B^{12})\,\alpha_t \tag{12}$$
Further, we want to employ a similar model, using a linear filter, to link the data
only one month apart. This gives the model
$$\nabla \alpha_t = (1 - \theta B)\,a_t \tag{13}$$
where $\theta$ and $\Theta$ will in general have different values.
On combining expressions (12) and (13), we obtain the seasonal multiplicative model
$$\nabla \nabla_{12} Z_t = (1 - \theta B)(1 - \Theta B^{12})\,a_t \tag{14}$$
of order (0,1,1)x(0,1,1)12. Written explicitly, the model is
$$Z_t - Z_{t-1} - Z_{t-12} + Z_{t-13} = a_t - \theta a_{t-1} - \Theta a_{t-12} + \theta\Theta a_{t-13} \tag{15}$$
The invertibility region of this model is defined by
$$-1 < \theta < 1 \quad \text{and} \quad -1 < \Theta < 1$$
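The explicit form of the model follows from multiplying out the two operators on each side. The sketch below does this with plain coefficient lists (index $i$ holds the coefficient of $B^i$); the function names and the parameter values 0.4 and 0.6 are illustrative assumptions.

```python
def multiply_polynomials(a, b):
    """Multiply two polynomials in B given as coefficient lists (index = power of B)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def multiplicative_operator(theta, big_theta, s=12):
    """Coefficients of (1 - theta B)(1 - Theta B^s)."""
    nonseasonal = [1.0, -theta]
    seasonal = [1.0] + [0.0] * (s - 1) + [-big_theta]
    return multiply_polynomials(nonseasonal, seasonal)


# Moving average side with illustrative parameter values.
ma_coefficients = multiplicative_operator(0.4, 0.6)

# Differencing side: (1 - B)(1 - B^12) = 1 - B - B^12 + B^13.
differencing_coefficients = multiplicative_operator(1.0, 1.0)
```

Only the coefficients at powers 0, 1, 12 and 13 are non-zero, and the coefficient at power 13 is the product of the two parameters; this is why the multiplicative model needs only two parameters to describe a lag-13 dependence.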
From Table 2 we see that the autocorrelation function of the logged series does not
die down rapidly, from which it can be concluded that the logged time series is
non-stationary. Some degree of differencing is therefore necessary to produce
stationarity. The first difference of the time series is taken and its autocorrelation
function calculated, as shown in Table 2. Simple differencing reduces the
autocorrelations in general, but a heavy periodic component remains; this is
particularly evident at large lags. Simple differencing with respect to period 12
results in correlations which are first persistently positive and then persistently
negative. By contrast, the differencing $\nabla \nabla_{12} Z$ markedly reduces the
correlations throughout.
Table 2. Estimated Autocorrelations of Various
Differences of the Logged Airline Data.
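Estimation: Iterative Calculation of the Least Squares Estimates $\hat{\theta}$ and $\hat{\Theta}$.
An iterative linearization technique may be used in straightforward situations to
supply the least squares estimates and their approximate standard errors. For the
present example we can write, approximately,
$$a_{t,0} = (\theta - \theta_0)\,\chi_{1,t} + (\Theta - \Theta_0)\,\chi_{2,t} + a_t$$
where
$$\chi_{1,t} = -\left.\frac{\partial a_t}{\partial \theta}\right|_{\theta_0,\Theta_0}, \qquad \chi_{2,t} = -\left.\frac{\partial a_t}{\partial \Theta}\right|_{\theta_0,\Theta_0}$$
and where $\theta_0$ and $\Theta_0$ are guessed values and
$a_{t,0} = [a_t \mid \theta_0, \Theta_0]$. The derivatives are most easily computed
numerically [9].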
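For the (0,1,1)x(0,1,1)12 model, preliminary estimates are obtained from the
autocorrelations of the differenced series at lags 1 and 12, which are
$$\rho_1 = \frac{-\theta}{1 + \theta^2} \quad \text{and} \quad \rho_{12} = \frac{-\Theta}{1 + \Theta^2} \tag{16}$$
On substituting the sample estimates $r_1 = -0.337$ and $r_{12} = -0.189$ in (16) [9],
we obtain
$$0.337\,(1 + \theta^2) = \theta \quad \text{and} \quad 0.189\,(1 + \Theta^2) = \Theta$$
or
$$\theta^2 - 2.967\,\theta + 1 = 0 \quad \text{and} \quad \Theta^2 - 5.291\,\Theta + 1 = 0$$
Taking the root of each quadratic that lies inside the invertibility region gives the
rough estimates $\hat{\theta} = 0.3876$ and $\hat{\Theta} = 0.1962$.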
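The rough estimates can be reproduced by solving each quadratic and keeping the invertible root. The sketch below assumes that the sample autocorrelations are negative ($r_1 = -0.337$, $r_{12} = -0.189$), as the positive parameter estimates imply; the function name is hypothetical.

```python
import math


def invertible_ma_root(r):
    """Solve r = -x / (1 + x^2) for the root with |x| < 1.

    Rearranged: r x^2 + x + r = 0. The two roots multiply to 1, so exactly
    one lies inside the invertibility region when |r| < 0.5.
    """
    if r == 0:
        return 0.0
    discriminant = 1.0 - 4.0 * r * r
    if discriminant < 0:
        raise ValueError("no real solution: |r| must not exceed 0.5")
    return (-1.0 + math.sqrt(discriminant)) / (2.0 * r)


# Assumed signs of the sample autocorrelations (see the lead-in).
theta_initial = invertible_ma_root(-0.337)
big_theta_initial = invertible_ma_root(-0.189)
```

These values serve only as starting guesses $\theta_0$ and $\Theta_0$ for the iterative least squares calculation.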
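4.2.1. Iterative Estimation of $\theta$ and $\Theta$
With the parameters denoted generally by $\beta_1, \dots, \beta_k$ and a small
increment $\delta_1$, the derivative $\chi_{1,t}$ can be approximated numerically as
$$\chi_{1,t} = \frac{[a_t \mid \omega, \beta_{1,0}, \dots, \beta_{k,0}] - [a_t \mid \omega, \beta_{1,0} + \delta_1, \dots, \beta_{k,0}]}{\delta_1} \tag{17}$$
Consider the fitting of the airline data to the (0,1,1)x(0,1,1)12 process
$$\omega_t = \nabla \nabla_{12} Z_t = (1 - \theta B)(1 - \Theta B^{12})\,a_t$$
The beginning of the calculation is shown in Table 3 for the estimated values
$\hat{\theta} = 0.3876$ and $\hat{\Theta} = 0.1962$. The back-forecasted values
$[a_t]$ were obtained using the expressions worked out in Appendix B,
$$e_t = \nabla \nabla_{12} Z_t + \theta e_{t+1} + \Theta e_{t+12} - \theta\Theta e_{t+13}$$
and the values of $[a_t]$ are then obtained from
$$a_t = \nabla \nabla_{12} Z_t + \theta a_{t-1} + \Theta a_{t-12} - \theta\Theta a_{t-13}$$
$\chi_{1,t}$ can be evaluated by Eq. (17), giving increments in $\hat{\theta}$ and
$\hat{\Theta}$, until $S(\theta, \Theta)$ reaches its minimum. After many iterations
$\hat{\theta}$ improves to 0.3225 and $\hat{\Theta}$ to 0.3712.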
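The recursion for $a_t$ and the numerical derivative of Eq. (17) can be sketched as follows. This is a simplified illustration, not the paper's procedure: it sets all pre-sample $a$'s to zero (a conditional calculation) instead of back-forecasting them as in Appendix B, and the function names and the increment `delta = 0.01` are assumptions.

```python
def residuals(w, theta, big_theta, s=12):
    """a_t = w_t + theta a_{t-1} + Theta a_{t-s} - theta Theta a_{t-s-1}.

    w is the differenced series; a's before the start of w are taken as zero.
    """
    a = []
    for t, w_t in enumerate(w):
        prev_1 = a[t - 1] if t >= 1 else 0.0
        prev_s = a[t - s] if t >= s else 0.0
        prev_s1 = a[t - s - 1] if t >= s + 1 else 0.0
        a.append(w_t + theta * prev_1 + big_theta * prev_s
                 - theta * big_theta * prev_s1)
    return a


def chi_theta(w, theta0, big_theta0, delta=0.01, s=12):
    """Numerical derivative of Eq. (17) with respect to theta."""
    base = residuals(w, theta0, big_theta0, s)
    shifted = residuals(w, theta0 + delta, big_theta0, s)
    return [(b - sh) / delta for b, sh in zip(base, shifted)]


def sum_of_squares(w, theta, big_theta, s=12):
    """S(theta, Theta) under the zero pre-sample assumption."""
    return sum(v * v for v in residuals(w, theta, big_theta, s))
```

Evaluating `sum_of_squares` over a grid of parameter values gives a picture of the sum of squares surface whose minimum the iteration seeks.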
5. Forecasting:
Forecasts are best computed directly from the difference equation itself. Using the
seasonal model (15), an observation at lead time $l$ from origin $t$ is given by
$$Z_{t+l} = Z_{t+l-1} + Z_{t+l-12} - Z_{t+l-13} + a_{t+l} - \theta a_{t+l-1} - \Theta a_{t+l-12} + \theta\Theta a_{t+l-13} \tag{18}$$
After setting $\theta = 0.3225$ and $\Theta = 0.3712$, the minimum mean squared error
forecast at lead time $l$ and origin $t$ is given by
$$\hat{Z}_t(l) = [Z_{t+l-1} + Z_{t+l-12} - Z_{t+l-13} + a_{t+l} - \theta a_{t+l-1} - \Theta a_{t+l-12} + \theta\Theta a_{t+l-13}] \tag{19}$$
Here we refer to
$$[Z_{t+l}] = E[Z_{t+l} \mid \theta, \Theta, Z_t, Z_{t-1}, \dots] \tag{20}$$
as the conditional expectation of $Z_{t+l}$ taken at origin $t$. In this expression the
parameters are supposed exactly known, and knowledge of the series $Z_t, Z_{t-1}, \dots$
is supposed to extend into the remote past. Practical application depends upon the
facts that:
a) Invertible models fitted to actual data usually yield forecasts which depend
appreciably only on recent values of the series.
b) The forecasts are insensitive to small changes in parameter values such as are
introduced by estimation errors.
Now
$$[Z_{t+j}] = \begin{cases} Z_{t+j}, & j \le 0 \\ \hat{Z}_t(j), & j > 0 \end{cases} \tag{21}$$
$$[a_{t+j}] = \begin{cases} a_{t+j}, & j \le 0 \\ 0, & j > 0 \end{cases} \tag{22}$$
Thus, to obtain the forecasts, we simply replace unknown $Z$'s by forecasts and
unknown $a$'s by zeros.
The known $a$'s are, of course, the one-step-ahead forecast errors already computed,
that is, $a_t = Z_t - \hat{Z}_{t-1}(1)$.
For example, to obtain the three-months-ahead forecast, we have
$$Z_{t+3} = Z_{t+2} + Z_{t-9} - Z_{t-10} + a_{t+3} - 0.3225\,a_{t+2} - 0.3712\,a_{t-9} + 0.1197\,a_{t-10}$$
Taking conditional expectations at origin $t$,
$$\hat{Z}_t(3) = \hat{Z}_t(2) + Z_{t-9} - Z_{t-10} - 0.3712\,a_{t-9} + 0.1197\,a_{t-10}$$
That is,
$$\hat{Z}_t(3) = \hat{Z}_t(2) + Z_{t-9} - Z_{t-10} - 0.3712\,[Z_{t-9} - \hat{Z}_{t-10}(1)] + 0.1197\,[Z_{t-10} - \hat{Z}_{t-11}(1)]$$
Hence
$$\hat{Z}_t(3) = \hat{Z}_t(2) + 0.6288\,Z_{t-9} - 0.8803\,Z_{t-10} + 0.3712\,\hat{Z}_{t-10}(1) - 0.1197\,\hat{Z}_{t-11}(1)$$
This expresses the forecast in terms of previous $Z$'s and previous forecasts of $Z$'s.
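The forecasting rule (replace unknown $Z$'s by forecasts and unknown $a$'s by zeros) can be sketched as a simple loop over lead times. The function name, its argument layout and the test series are assumptions made for illustration; the logged observations and the one-step-ahead errors are taken to be available as two lists of equal length.

```python
def forecast(z, a, theta, big_theta, horizon, s=12):
    """Forecasts at lead times 1..horizon from the difference equation (18).

    z: observed (logged) series, needing at least s + 1 values.
    a: one-step-ahead forecast errors aligned with z.
    """
    z_ext = list(z)
    a_ext = list(a)
    for _ in range(horizon):
        t = len(z_ext)
        value = (z_ext[t - 1] + z_ext[t - s] - z_ext[t - s - 1]
                 - theta * a_ext[t - 1]
                 - big_theta * a_ext[t - s]
                 + theta * big_theta * a_ext[t - s - 1])
        z_ext.append(value)   # unknown Z replaced by its forecast
        a_ext.append(0.0)     # unknown a replaced by zero
    return z_ext[len(z):]


# Illustrative series (not the Yemenia data): trend plus a fixed monthly pattern.
monthly_pattern = [0.2, 0.1, 0.4, 0.3, 0.5, 0.9, 1.3, 1.4, 0.8, 0.4, 0.2, 0.6]
observed = [0.1 * t + monthly_pattern[t % 12] for t in range(26)]
errors = [0.0] * len(observed)
three_ahead = forecast(observed, errors, 0.3225, 0.3712, horizon=3)
```

With all past errors equal to zero, as in this artificial example, the forecasts simply continue the trend and the monthly pattern; with real data the two error terms adjust the projection according to recent forecast errors.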
Conclusions:
It is evident from the forecast values shown in Fig. 1 that the simple model
containing only two parameters faithfully reproduces the seasonal pattern and
supplies excellent forecasts. It is to be remembered that, like all predictions obtained
from the general linear stochastic model, the forecast function is adaptive.
When the seasonal pattern changes, the changes will be appropriately projected into
the forecast. Of course, a forecast for a lead time of 36 months will necessarily
contain a fairly large error. However, in practice an initially remote forecast will be
continually updated, and as the lead time shortens greater accuracy becomes possible.
The model presented here is robust to moderate changes in the values of the
parameters. Thus, if $\theta = 0.35$ and $\Theta = 0.4$ were used instead of 0.3225 and
0.3712, the forecasts would not be greatly affected. This is true even for forecasts
made several steps ahead, e.g. 12 months. The approximate effect on the
one-step-ahead forecasts of modifying the parameter values can be seen by studying
the sum of squares surface.
References:
1- Brown, R.G., "Smoothing, Forecasting and Prediction of Discrete Time Series",
Prentice Hall, New Jersey, 1962.
2- Box, G.E.P. and Jenkins, G.M., "Some Statistical Aspects of Adaptive
Optimization and Control", Jour. Royal Stat. Soc. B24, 297, 1962.
3- Box, G.E.P. and Jenkins, G.M., "Further Contributions to Adaptive Quality
Control: Simultaneous Estimation of Dynamics: Non-Zero Costs", Bull. Int. Stat.
Inst., 34th Session, 1963.
4- Box, G.E.P. and Jenkins, G.M., "Mathematical Models for Adaptive Control and
Optimization", A.I.Ch.E.-I.Chem.E. Symposium Series, 4, 61, 1965.
5- Box, G.E.P. and Jenkins, G.M., "Models for Forecasting Seasonal and Non-Seasonal
Time Series", Advanced Seminar on Spectral Analysis of Time Series,
ed. B. Harris, 271, John Wiley, New York, 1967.
6- Box, G.E.P. and Jenkins, G.M., "Some Recent Advances in Forecasting and
Control, I", Applied Statistics, 17, 91, 1968.
7- Box, G.E.P. and Jenkins, G.M., "Time Series Analysis: Forecasting and
Control", Holden-Day, Singapore, 1976.
8- Yule, G.U., "On a Method of Investigating Periodicities in Disturbed Series,
with Special Reference to Wolfer's Sunspot Numbers", Phil. Trans. A226, 267,
1927.
9- Daniels, H.E., "The Approximate Distribution of Serial Correlation Coefficients",
Biometrika, 43, 169, 1956.
10- Yemenia: Records of International Passengers, 1996.
11- Box, G.E.P. and Cox, D.R., "An Analysis of Transformations", Jour. Royal
Stat. Soc. B26, 211, 1964.
APPENDIX A:
Data: International Airline Passengers quoted by Yemenia / Aden Centre
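APPENDIX B:
CALCULATION OF THE UNCONDITIONAL SUM OF SQUARES FOR
THE MODEL $\omega_t = \nabla \nabla_{12} Z_t = (1 - \theta B)(1 - \Theta B^{12})\,a_t$
With $\omega_t = \nabla \nabla_{12} Z_t$, the (0,1,1)x(0,1,1)12 model may be written in
either the forward or the backward form,
$$\omega_t = (1 - \theta B)(1 - \Theta B^{12})\,a_t$$
or
$$\omega_t = (1 - \theta F)(1 - \Theta F^{12})\,e_t$$
where $F = B^{-1}$ is the forward shift operator and $\mu = E[\omega_t]$ is assumed to
be zero. Hence we can write
$$[e_t] = [\omega_t] + \theta\,[e_{t+1}] + \Theta\,[e_{t+12}] - \theta\Theta\,[e_{t+13}] \tag{B1}$$
$$[a_t] = [\omega_t] + \theta\,[a_{t-1}] + \Theta\,[a_{t-12}] - \theta\Theta\,[a_{t-13}] \tag{B2}$$
where $[\omega_t] = \omega_t$ for $t = 1, 2, \dots, n$ and is the back-forecast of
$\omega_t$ for $t \le 0$.
There are N = 48 observations in the airline series. Accordingly, in the table these
are designated $Z_{-12}, Z_{-11}, Z_{-10}, Z_{-9}, \dots, Z_{35}$. The $\omega$'s
obtained by differencing form the series $\omega_1, \omega_2, \omega_3, \dots, \omega_n$,
where n = 35. We start the calculation at $e_{35}$ by setting the unknown $e$'s equal
to zero.
Using (B1) we get
$$[e_{35}] = \omega_{35} + \theta \cdot 0 + \Theta \cdot 0 - \theta\Theta \cdot 0$$
$$[e_{34}] = \omega_{34} + \theta\,[e_{35}] + \Theta \cdot 0 - \theta\Theta \cdot 0$$
$$[e_{33}] = \omega_{33} + \theta\,[e_{34}] + \Theta \cdot 0 - \theta\Theta \cdot 0$$
$$\vdots$$
$$[e_{1}] = \omega_{1} + \theta\,[e_{2}] + \Theta\,[e_{13}] - \theta\Theta\,[e_{14}]$$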
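The backward recursion (B1) can be sketched as a loop running from the last value of the differenced series to the first. The function name is hypothetical, and the code uses zero-based list indices, so `w[0]` corresponds to $\omega_1$ and `w[-1]` to $\omega_n$.

```python
def backward_residuals(w, theta, big_theta, s=12):
    """e_t = w_t + theta e_{t+1} + Theta e_{t+s} - theta Theta e_{t+s+1}.

    The e's beyond the end of the series are unknown and set to zero,
    which is how the recursion is started.
    """
    n = len(w)
    e = [0.0] * (n + s + 1)  # entries at index >= n stay zero
    for t in range(n - 1, -1, -1):
        e[t] = (w[t] + theta * e[t + 1] + big_theta * e[t + s]
                - theta * big_theta * e[t + s + 1])
    return e[:n]


# Tiny illustrative input: with only three values the seasonal terms never enter.
example_e = backward_residuals([1.0, 2.0, 3.0], theta=0.5, big_theta=0.2)
```

Once the $e$'s are available, the same relation with the $e$'s before the start of the series set to zero yields the back-forecasts of $\omega_t$ for $t \le 0$.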
Table B1. Computation of $[a_t]$
We can then calculate the back-forecasts $[\omega_0], [\omega_{-1}], \dots, [\omega_{-12}]$;
(B2) is then used to calculate $[a_t]$, since each $[a_t]$ is a function of previously
occurring $\omega$'s, with $[a_{-j}] = 0$ for $j > 12$.
Now
$$S(\theta, \Theta) = \sum_{t=-12}^{24} [a_t]^2$$
The next iteration would start by recomputing the $e$'s, using forecast values
$[\omega_{n+1}], [\omega_{n+2}], [\omega_{n+3}], \dots$ obtained from the $a$'s already
calculated.
Table 3. Numerical Calculation of Derivatives from the Airline Data

  t     Z_t      a_{t,0}   [a_t | θ_0]   [a_t | θ_0 + δ]   ([a_t | θ_0] - [a_t | θ_0 + δ]) / δ
 -12    8.1715   -0.0545   -0.0545       -0.0545           0
 -11    7.9596    0.0995    0.0783        0.0778           0.0545
 ...
  -1    ...
   0    ...
   1    ...
 ...
  35    9.8182    0.7071    0.6377        0.6361           0.1560
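Using the whole series, the iteration can then proceed:
$$\theta - \theta_0 = \frac{\displaystyle\sum_{t=0}^{n} a_{t,0}\,\chi_{1,t}}{\displaystyle\sum_{t=0}^{n} \chi_{1,t}^2}$$
Then, replacing $\theta_0$ by the new value of $\theta$ and repeating the same
procedure, the values of $\theta$ and $\Theta$ can be found which minimize
$$S(\theta, \Theta) = \sum_{t=0}^{n} a_{t,0}^2$$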
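The update for a single parameter is an ordinary least squares regression of $a_{t,0}$ on $\chi_{1,t}$ through the origin. The sketch below shows only that one step; the function name and the numbers are illustrative assumptions, and in a full calculation the two input lists would come from the residual recursion and the numerical derivatives.

```python
def parameter_increment(a0, chi):
    """Least squares estimate of (theta - theta_0): sum(a0 * chi) / sum(chi^2)."""
    numerator = sum(a * x for a, x in zip(a0, chi))
    denominator = sum(x * x for x in chi)
    return numerator / denominator


# Illustrative values: if a_{t,0} were exactly 0.1 * chi_t, the increment would be 0.1.
chi_values = [0.5, -1.0, 2.0]
a0_values = [0.1 * x for x in chi_values]
increment = parameter_increment(a0_values, chi_values)
```

The new guess is $\theta_0$ plus this increment; the step is repeated, alternating or combining with the corresponding step for $\Theta$, until the sum of squares stops decreasing.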