Rob J Hyndman
Automatic algorithms for time series forecasting
Outline
1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
Motivation
1 It is common in business to have over 1000 products that need forecasting at least monthly.
2 Forecasts are often required by people who are untrained in time series analysis.

Specifications
Automatic forecasting algorithms must:
- determine an appropriate time series model;
- estimate the parameters;
- compute the forecasts with prediction intervals.
Example: Asian sheep
[Figure: Numbers of sheep in Asia, 1960–2010 (millions of sheep)]
[Figure: Automatic ETS forecasts of the same series]
Example: Corticosteroid sales
[Figure: Monthly corticosteroid drug sales in Australia, 1995–2010 (total scripts, millions)]
[Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12] for the same series]
Forecasting competitions
Makridakis and Hibon (1979)
This was the first large-scale empirical evaluation of time series forecasting methods. Highly controversial at the time.

Difficulties:
- How to measure forecast accuracy?
- How to apply methods consistently and objectively?
- How to explain unexpected results?

Common thinking was that the more sophisticated mathematical models (ARIMA models at the time) were necessarily better. If the results showed ARIMA models were not best, it must be because the analyst was unskilled.
Makridakis and Hibon (1979)
"It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors." — W.G. Gilchrist

"I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods ... these authors are more at home with simple procedures than with Box-Jenkins." — C. Chatfield
Consequences of MH (1979)
As a result of this paper, researchers started to:
- consider how to automate forecasting methods;
- study which methods give the best forecasts;
- be aware of the dangers of over-fitting;
- treat forecasting as a different problem from time series analysis.

Makridakis & Hibon followed up with a new competition in 1982:
- 1001 series
- Anyone could submit forecasts (avoiding the charge of incompetence)
- Multiple forecast accuracy measures used.
M-competition
Main findings (taken from Makridakis & Hibon, 2000):
1 Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones.
2 The relative ranking of the performance of the various methods varies according to the accuracy measure being used.
3 The accuracy of combinations of methods outperforms, on average, the individual methods being combined, and does very well in comparison to other methods.
4 The accuracy of the various methods depends upon the length of the forecasting horizon involved.
M3 competition
Makridakis and Hibon (2000)
"The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods... The extension involves the inclusion of more methods/researchers (in particular in the areas of neural networks and expert systems) and more series."
- 3003 series
- All data from business, demography, finance and economics.
- Series length between 14 and 126.
- Either non-seasonal, monthly or quarterly.
- All time series positive.

M&H claimed that the M3-competition supported the findings of their earlier work. However, the best performing methods were far from "simple".
Makridakis and Hibon (2000)
Best methods:

Theta
- A very confusing explanation.
- Shown by Hyndman and Billah (2003) to be an average of linear regression and simple exponential smoothing with drift, applied to seasonally adjusted data.
- Later, the original authors claimed that their explanation was incorrect.

Forecast Pro
- A commercial software package with an unknown algorithm.
- Known to fit either exponential smoothing or ARIMA models using BIC.
M3 results (recalculated)

Method          MAPE   sMAPE   MASE
Theta           17.42  12.76   1.39
ForecastPro     18.00  13.06   1.47
ForecastX       17.35  13.09   1.42
Automatic ANN   17.18  13.98   1.53
B-J automatic   19.13  13.72   1.54

Notes:
- Calculations do not match the published paper.
- Some contestants apparently submitted multiple entries, but only the best ones were published.
Exponential smoothing
Exponential smoothing methods

                            Seasonal Component
Trend Component             N (None)   A (Additive)   M (Multiplicative)
N (None)                    N,N        N,A            N,M
A (Additive)                A,N        A,A            A,M
Ad (Additive damped)        Ad,N       Ad,A           Ad,M
M (Multiplicative)          M,N        M,A            M,M
Md (Multiplicative damped)  Md,N       Md,A           Md,M

N,N:  Simple exponential smoothing
A,N:  Holt's linear method
Ad,N: Additive damped trend method
M,N:  Exponential trend method
Md,N: Multiplicative damped trend method
A,A:  Additive Holt-Winters' method
A,M:  Multiplicative Holt-Winters' method

- There are 15 separate exponential smoothing methods.
- Each can have an additive or multiplicative error, giving 30 separate models.
- Only 19 models are numerically stable.
- Multiplicative trend models give poor forecasts, leaving 15 models.
Exponential smoothing methods

General notation: E,T,S (ExponenTial Smoothing) with the three letters denoting Error, Trend and Seasonal components.

Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt's linear method with additive errors
M,A,M: Multiplicative Holt-Winters' method with multiplicative errors
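In the R forecast package this three-letter code is passed directly to ets() via its model argument; a minimal sketch (AirPassengers is just a convenient built-in series, not one used in this talk):

fit <- ets(AirPassengers, model = "MAM")  # multiplicative Holt-Winters' with multiplicative errors
fit <- ets(AirPassengers, model = "ZZZ")  # "Z" lets each component be selected automatically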
Innovations state space models
- All ETS models can be written in innovations state space form (IJF, 2002).
- Additive and multiplicative versions give the same point forecasts but different prediction intervals.

[Diagram: ETS state space model. The state $x_t$ = (level, slope, seasonal) evolves as $x_{t-1} \to x_t \to x_{t+1} \to \dots$, and each observation $y_t$ is generated from the previous state $x_{t-1}$ together with the innovation $\varepsilon_t$.]

Estimation
- Compute the likelihood L from $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_T$.
- Optimize L with respect to the model parameters.
Innovations state space models

Let $x_t = (\ell_t, b_t, s_t, s_{t-1}, \dots, s_{t-m+1})$ and $\varepsilon_t \overset{iid}{\sim} N(0, \sigma^2)$.

$y_t = h(x_{t-1}) + k(x_{t-1})\varepsilon_t$   (observation equation)
$x_t = f(x_{t-1}) + g(x_{t-1})\varepsilon_t$   (state equation)

where $\mu_t = h(x_{t-1})$ and $e_t = k(x_{t-1})\varepsilon_t$.

Additive errors: $k(x_{t-1}) = 1$, so $y_t = \mu_t + \varepsilon_t$.
Multiplicative errors: $k(x_{t-1}) = \mu_t$, so $y_t = \mu_t(1 + \varepsilon_t)$, and $\varepsilon_t = (y_t - \mu_t)/\mu_t$ is a relative error.
Innovations state space models

Estimation

$L^*(\theta, x_0) = n \log\Big(\sum_{t=1}^{n} \varepsilon_t^2 / k^2(x_{t-1})\Big) + 2 \sum_{t=1}^{n} \log |k(x_{t-1})| = -2\log(\text{Likelihood}) + \text{constant}$

Minimize with respect to $\theta = (\alpha, \beta, \gamma, \phi)$ and the initial states $x_0 = (\ell_0, b_0, s_0, s_{-1}, \dots, s_{-m+1})$.
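To make the criterion concrete, here is a minimal sketch for ETS(A,N,N), where $k(x_{t-1}) = 1$ so the second sum vanishes; ann_lstar is a hypothetical helper written for illustration, not the internal code of ets():

# ETS(A,N,N): mu_t = l_{t-1}; level update l_t = l_{t-1} + alpha * e_t
ann_lstar <- function(y, alpha, l0) {
  l <- l0
  e <- numeric(length(y))
  for (t in seq_along(y)) {
    e[t] <- y[t] - l         # one-step innovation epsilon_t
    l <- l + alpha * e[t]    # state (level) update
  }
  length(y) * log(sum(e^2))  # L*(theta, x0); additive errors, so log|k| term is zero
}
# Minimize over theta = alpha and x0 = l0, e.g.:
# optim(c(0.5, mean(y)), function(p) ann_lstar(y, p[1], p[2]))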
Q: How to choose between the 15 useful ETS models?
Cross-validation
- Traditional evaluation: split the series into training data and test data.
- Standard cross-validation: randomly chosen hold-out observations.
- Time series cross-validation: a sequence of training sets, each containing one more observation than the last, with forecasts evaluated on the observations following each training set. Also known as "evaluation on a rolling forecast origin".

[Diagram: training/test splits for the three evaluation schemes.]
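A rolling forecast origin is easy to code directly; a sketch, assuming the livestock series from the earlier examples is loaded and using an arbitrary minimum training length of 20 observations:

k <- 20                                  # minimum training set size (arbitrary choice)
e <- numeric(length(livestock) - k)
for (i in seq_along(e)) {
  train <- window(livestock, end = time(livestock)[k + i - 1])
  fit <- ets(train)                      # re-fit on each training set
  e[i] <- livestock[k + i] - forecast(fit, h = 1)$mean[1]
}
mean(e^2)                                # one-step CV-MSE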
Akaike's Information Criterion

AIC = −2 log(L) + 2k
where L is the likelihood and k is the number of estimated parameters in the model.

- This is a penalized likelihood approach.
- If L is Gaussian, then AIC ≈ c + T log(MSE) + 2k, where c is a constant, MSE is computed from one-step forecasts on the training set, and T is the length of the series.
- Minimizing the Gaussian AIC is asymptotically equivalent (as T → ∞) to minimizing the MSE of one-step forecasts on the test set via time series cross-validation.
Automatic algorithms for time series forecasting Exponential smoothing 25
Akaike’s Information Criterion
AIC = −2 log(L) + 2k
Corrected AIC
For small T, AIC tends to over-fit. Bias-corrected
version:
AICC = AIC + 2(k+1)(k+2)
T−k
Bayesian Information Criterion
BIC = AIC + k[log(T) − 2]
BIC penalizes terms more heavily than AIC
Minimizing BIC is consistent if there is a true
model.
Automatic algorithms for time series forecasting Exponential smoothing 26
Akaike’s Information Criterion
AIC = −2 log(L) + 2k
Corrected AIC
For small T, AIC tends to over-fit. Bias-corrected
version:
AICC = AIC + 2(k+1)(k+2)
T−k
Bayesian Information Criterion
BIC = AIC + k[log(T) − 2]
BIC penalizes terms more heavily than AIC
Minimizing BIC is consistent if there is a true
model.
Automatic algorithms for time series forecasting Exponential smoothing 26
Akaike’s Information Criterion
AIC = −2 log(L) + 2k
Corrected AIC
For small T, AIC tends to over-fit. Bias-corrected
version:
AICC = AIC + 2(k+1)(k+2)
T−k
Bayesian Information Criterion
BIC = AIC + k[log(T) − 2]
BIC penalizes terms more heavily than AIC
Minimizing BIC is consistent if there is a true
model.
Automatic algorithms for time series forecasting Exponential smoothing 26
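As a worked example of these formulas, all three criteria can be computed by hand from any likelihood-based fit; the parameter count below (coefficients plus the innovation variance) is one common convention, not the only one:

fit <- arima(lh, order = c(1, 0, 0))  # lh: a small built-in ts, used purely for illustration
k <- length(coef(fit)) + 1            # estimated parameters, counting sigma^2
n <- length(residuals(fit))           # T in the slide notation
aic  <- -2 * as.numeric(logLik(fit)) + 2 * k
aicc <- aic + 2 * (k + 1) * (k + 2) / (n - k)
bic  <- aic + k * (log(n) - 2)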
What to use?

Choice: AIC, AICc, BIC, CV-MSE.
- CV-MSE is too time-consuming for most automatic forecasting purposes, and it requires large T.
- As T → ∞, BIC selects the true model if there is one. But that is never true!
- AICc focuses on forecasting performance, can be used on small samples, and is very fast to compute.
- Empirical studies in forecasting show AIC is better than BIC for forecast accuracy.
ets algorithm in R

Based on Hyndman, Koehler, Snyder & Grose (IJF 2002):
- Apply each of the 15 models that are appropriate to the data. Optimize parameters and initial values using MLE.
- Select the best model using AICc.
- Produce forecasts using the best model.
- Obtain prediction intervals using the underlying state space model.
Exponential smoothing

fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ETS(M,A,N); millions of sheep, 1960–2010]
Exponential smoothing

fit <- ets(h02)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ETS(M,N,M); total scripts (millions), 1995–2010]
Exponential smoothing

> fit
ETS(M,N,M)

Smoothing parameters:
  alpha = 0.4597
  gamma = 1e-04

Initial states:
  l = 0.4501
  s = 0.8628 0.8193 0.7648 0.7675 0.6946 1.2921
      1.3327 1.1833 1.1617 1.0899 1.0377 0.9937

sigma: 0.0675

       AIC       AICc        BIC
-115.69960 -113.47738  -69.24592
M3 comparisons

Method          MAPE   sMAPE   MASE
Theta           17.42  12.76   1.39
ForecastPro     18.00  13.06   1.47
ForecastX       17.35  13.09   1.42
Automatic ANN   17.18  13.98   1.53
B-J automatic   19.13  13.72   1.54
ETS             17.38  13.13   1.43
Exponential smoothing
Further reading: www.OTexts.org/fpp
ARIMA modelling
ARIMA models

[Diagram: lagged values $y_{t-1}, y_{t-2}, y_{t-3}$ feed into the output $y_t$, giving an autoregression (AR) model; adding the lagged errors $\varepsilon_{t-1}, \varepsilon_{t-2}$ as inputs gives an autoregression moving average (ARMA) model.]

Estimation
- Compute the likelihood L from $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_T$.
- Use an optimization algorithm to maximize L.

ARIMA model
An autoregression moving average (ARMA) model applied to differences.
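To illustrate the last point with simulated data (orders chosen arbitrarily), fitting an ARMA(1,1) to the first differences amounts to fitting an ARIMA(1,1,1) to the level:

set.seed(123)
x <- cumsum(arima.sim(model = list(ar = 0.7, ma = -0.4), n = 200))  # integrated ARMA(1,1)
fit1 <- arima(diff(x), order = c(1, 0, 1), include.mean = FALSE)    # ARMA on the differences
fit2 <- arima(x, order = c(1, 1, 1))                                # equivalent ARIMA fit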
Auto ARIMA

fit <- auto.arima(livestock)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ARIMA(0,1,0) with drift; millions of sheep, 1960–2010]
Auto ARIMA

fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12]; total scripts (millions), 1995–2010]
Auto ARIMA

> fit
Series: h02
ARIMA(3,1,3)(0,1,1)[12]

Coefficients:
          ar1      ar2     ar3      ma1     ma2     ma3     sma1
      -0.3648  -0.0636  0.3568  -0.4850  0.0479  -0.353  -0.5931
s.e.   0.2198   0.3293  0.1268   0.2227  0.2755   0.212   0.0651

sigma^2 estimated as 0.002706: log likelihood = 290.25
AIC = -564.5   AICc = -563.71   BIC = -538.48
How does auto.arima() work?

A non-seasonal ARIMA process:
$\phi(B)(1 - B)^d y_t = c + \theta(B)\varepsilon_t$

Need to select appropriate orders p, q, d, and whether to include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select p, q, c by minimising the AICc.
- Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.

Algorithm choices are driven by forecast accuracy.
How does auto.arima() work?

A seasonal ARIMA process:
$\Phi(B^m)\phi(B)(1 - B)^d (1 - B^m)^D y_t = c + \Theta(B^m)\theta(B)\varepsilon_t$

Need to select appropriate orders p, q, d, P, Q, D, and whether to include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select D using the OCSB unit root test.
- Select p, q, P, Q, c by minimising the AICc.
- Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
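These choices correspond to arguments of auto.arima(), so the defaults can be written out explicitly; a sketch (argument values shown are the documented defaults of the forecast package at the time):

fit <- auto.arima(h02,
  d = NA, D = NA,            # numbers of differences chosen by unit root tests
  test = "kpss",             # KPSS test selects d
  seasonal.test = "ocsb",    # OCSB test selects D
  ic = "aicc",               # AICc chooses p, q, P, Q and the constant
  stepwise = TRUE)           # stepwise traversal of the model space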
M3 comparisons

Method          MAPE   sMAPE   MASE
Theta           17.42  12.76   1.39
ForecastPro     18.00  13.06   1.47
B-J automatic   19.13  13.72   1.54
ETS             17.38  13.13   1.43
AutoARIMA       19.12  13.85   1.47
Automatic nonlinear forecasting?
Automatic nonlinear forecasting
- Automatic ANN in the M3 competition did poorly.
- Linear methods did best in the NN3 competition!
- Very few machine learning methods get published in the IJF because authors cannot demonstrate that their methods give better forecasts than linear benchmark methods, even on supposedly nonlinear data.
- Some good recent work by Kourentzes and Crone on automated ANNs for time series.
- Watch this space!
Time series with complex seasonality
Examples
[Figure: US finished motor gasoline products, weekly, 1992–2004 (thousands of barrels per day)]
[Figure: Number of calls to a large American bank (7am–9pm), 5-minute intervals, 3 March to 12 May (number of call arrivals)]
[Figure: Turkish electricity demand, daily, 2000–2008 (GW)]
TBATS model

TBATS
- Trigonometric terms for seasonality
- Box-Cox transformations for heterogeneity
- ARMA errors for short-term dynamics
- Trend (possibly damped)
- Seasonal (including multiple and non-integer periods)

Automatic algorithm described in A.M. De Livera, R.J. Hyndman & R.D. Snyder (2011). "Forecasting time series with complex seasonal patterns using exponential smoothing". Journal of the American Statistical Association 106(496), 1513–1527.
TBATS model

$y_t$ : observation at time $t$

$y_t^{(\omega)} = \begin{cases} (y_t^{\omega} - 1)/\omega & \text{if } \omega \neq 0; \\ \log y_t & \text{if } \omega = 0. \end{cases}$   [Box-Cox transformation]

$y_t^{(\omega)} = \ell_{t-1} + \phi b_{t-1} + \sum_{i=1}^{M} s_{t-m_i}^{(i)} + d_t$   [M seasonal periods]

$\ell_t = \ell_{t-1} + \phi b_{t-1} + \alpha d_t$
$b_t = (1 - \phi) b + \phi b_{t-1} + \beta d_t$   [global and local trend]

$d_t = \sum_{i=1}^{p} \phi_i d_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t$   [ARMA error]

$s_t^{(i)} = \sum_{j=1}^{k_i} s_{j,t}^{(i)}$   [Fourier-like seasonal terms]
$s_{j,t}^{(i)} = s_{j,t-1}^{(i)} \cos \lambda_j^{(i)} + s_{j,t-1}^{*(i)} \sin \lambda_j^{(i)} + \gamma_1^{(i)} d_t$
$s_{j,t}^{*(i)} = -s_{j,t-1}^{(i)} \sin \lambda_j^{(i)} + s_{j,t-1}^{*(i)} \cos \lambda_j^{(i)} + \gamma_2^{(i)} d_t$

TBATS: Trigonometric, Box-Cox, ARMA, Trend, Seasonal.
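In R, the multiple seasonal periods $m_i$ are declared with msts() before calling tbats(); a minimal sketch using the Turkish electricity periods from the example below (assumes the turk series from this talk is loaded):

y <- msts(turk, seasonal.periods = c(7, 354.37, 365.25))  # weekly plus two annual periods
fit <- tbats(y)  # omega, p, q and the k_i are then selected automatically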
Examples

fit <- tbats(gasoline)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from TBATS(0.999, {2,2}, 1, {52.1785714285714,8}); thousands of barrels per day]
Examples

fit <- tbats(callcentre)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from TBATS(1, {3,1}, 0.987, {169,5, 845,3}); number of call arrivals, 5-minute intervals]
Examples

fit <- tbats(turk)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from TBATS(0, {5,3}, 0.997, {7,3, 354.37,12, 365.25,4}); electricity demand (GW)]
Hierarchical and grouped time series
Hierarchical time series

A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure.

Total
  A: AA AB AC
  B: BA BB BC
  C: CA CB CC

Examples
- Net labour turnover
- Tourism by state and region
Hierarchical time series

Total
  A  B  C

- $Y_t$ : observed aggregate of all series at time $t$.
- $Y_{X,t}$ : observation on series X at time $t$.
- $b_t$ : vector of all series at the bottom level at time $t$.

$y_t = (Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t})' = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \end{pmatrix} = S\, b_t$
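Numerically, for this three-series hierarchy (base R only, made-up values):

S  <- rbind(c(1, 1, 1), diag(3))  # summing matrix: Total row on top of I_3
bt <- c(10, 20, 30)               # bottom-level values Y_A, Y_B, Y_C
yt <- S %*% bt                    # (60, 10, 20, 30)': Total, A, B, C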
Hierarchical time series

Total
  A: AX AY AZ
  B: BX BY BZ
  C: CX CY CZ

$y_t = (Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t}, Y_{AX,t}, Y_{AY,t}, Y_{AZ,t}, Y_{BX,t}, Y_{BY,t}, Y_{BZ,t}, Y_{CX,t}, Y_{CY,t}, Y_{CZ,t})' = S\, b_t$

where $b_t = (Y_{AX,t}, Y_{AY,t}, Y_{AZ,t}, Y_{BX,t}, Y_{BY,t}, Y_{BZ,t}, Y_{CX,t}, Y_{CY,t}, Y_{CZ,t})'$ and

$S = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ & & & & I_9 & & & & \end{pmatrix}$

with the bottom block $I_9$ the $9 \times 9$ identity matrix.

$y_t = S\, b_t$
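The hts package (Hyndman et al.) builds such structures and reconciles forecasts automatically; a sketch, where bts stands for a hypothetical ts matrix holding the nine bottom-level series:

library(hts)
y  <- hts(bts, nodes = list(3, c(3, 3, 3)))  # 3 middle-level series, each with 3 children
fc <- forecast(y, h = 12, method = "comb", fmethod = "ets")  # reconciled forecasts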
Forecasting notation

Let $\hat{y}_n(h)$ be the vector of initial h-step forecasts, made at time $n$, stacked in the same order as $y_t$. (They may not add up.)

Reconciled forecasts are of the form:
$\tilde{y}_n(h) = S P \hat{y}_n(h)$
for some matrix P.

- P extracts and combines the base forecasts $\hat{y}_n(h)$ to get bottom-level forecasts.
- S adds them up.
General properties
ỹn(h) = S P ŷn(h)
Forecast bias
If the base forecasts ŷn(h) are unbiased, then the reconciled forecasts are unbiased iff SPS = S.
Forecast variance
For any given P satisfying SPS = S, the covariance matrix of the h-step-ahead reconciled forecast errors is
Var[yn+h − ỹn(h)] = S P Wh P′ S′,
where Wh is the covariance matrix of the h-step-ahead base forecast errors.
Automatic algorithms for time series forecasting Hierarchical and grouped time series 61
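The SPS = S condition is easy to verify numerically. A sketch in base R (assumptions: the bottom-up choice P = [0 | I], and an invented diagonal Wh) that checks the unbiasedness condition and evaluates the reconciled error covariance formula above:

S    <- rbind(c(1, 1, 1), diag(3))   # 4 x 3 summing matrix for Total -> A, B, C
P_bu <- cbind(0, diag(3))            # bottom-up: P picks out the bottom-level forecasts
# Unbiasedness condition: S P S must equal S
all.equal(S %*% P_bu %*% S, S)       # TRUE
# Reconciled h-step error covariance under an assumed base covariance W_h
W_h <- diag(c(4, 1, 1, 1))           # hypothetical base forecast error variances
S %*% P_bu %*% W_h %*% t(P_bu) %*% t(S)   # Var[y_{n+h} - ytilde_n(h)]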
BLUF via trace minimization
Theorem
For any P satisfying SPS = S, the problem
minP trace[S P Wh P′ S′]
has solution P = (S′ W†h S)−1 S′ W†h, where W†h is a generalized inverse of Wh.
ỹn(h) = S(S′ W†h S)−1 S′ W†h ŷn(h)
(reconciled forecasts on the left, base forecasts on the right)
Equivalent to the GLS estimate of the regression ŷn(h) = S βn(h) + εh, where εh ∼ N(0, Wh).
Problem: Wh is hard to estimate.
Automatic algorithms for time series forecasting Hierarchical and grouped time series 62
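Given any estimate of Wh, the optimal P can be computed directly. A minimal R sketch (assuming the MASS package is available for the Moore-Penrose generalized inverse; the Wh and base forecasts are invented):

library(MASS)                         # provides ginv()
S   <- rbind(c(1, 1, 1), diag(3))     # summing matrix for Total -> A, B, C
W_h <- diag(c(2.0, 0.8, 1.1, 0.9))    # hypothetical base error covariance
Wi  <- ginv(W_h)                      # generalized inverse of W_h
# P = (S' Wi S)^{-1} S' Wi, so that ytilde_n(h) = S P yhat_n(h)
P <- solve(t(S) %*% Wi %*% S, t(S) %*% Wi)
yhat   <- c(45, 12, 24, 6)            # incoherent base forecasts (45 != 12 + 24 + 6)
ytilde <- S %*% P %*% yhat            # reconciled forecasts add up exactly
sum(ytilde[2:4]) - ytilde[1]          # zero, up to numerical precision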
Optimal combination forecasts
ỹn(h) = S(S′ W†h S)−1 S′ W†h ŷn(h)
(reconciled forecasts from base forecasts)
Solution 1: OLS
ỹn(h) = S(S′S)−1 S′ ŷn(h)
Solution 2: WLS
Approximate W1 by its diagonal, and assume Wh = kh W1.
Easy to estimate, and places weight where we have the best one-step forecasts.
ỹn(h) = S(S′ΛS)−1 S′Λ ŷn(h)
Automatic algorithms for time series forecasting Hierarchical and grouped time series 63
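Both practical solutions are one-line computations once S is known. A sketch in base R (the base forecasts and the one-step error variances behind Λ are invented; Λ is taken here as the inverse of the diagonal of W1, one common weighting choice):

S    <- rbind(c(1, 1, 1), diag(3))
yhat <- c(45, 12, 24, 6)                  # base forecasts (do not add up)
# Solution 1: OLS reconciliation
y_ols <- S %*% solve(t(S) %*% S, t(S) %*% yhat)
# Solution 2: WLS with Lambda = diag(1 / sigma^2), sigma^2 from one-step errors
sigma2 <- c(2.0, 0.8, 1.1, 0.9)           # hypothetical one-step error variances
Lambda <- diag(1 / sigma2)
y_wls  <- S %*% solve(t(S) %*% Lambda %*% S, t(S) %*% Lambda %*% yhat)
cbind(base = yhat, OLS = y_ols, WLS = y_wls)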
Challenges
ỹn(h) = S(S′ΛS)−1 S′Λ ŷn(h)
Computational difficulties in big hierarchies, due to the size of the S matrix and the singular behavior of (S′ΛS).
Loss of information from ignoring the covariance matrix when computing point forecasts.
We still need to estimate the covariance matrix to produce prediction intervals.
Automatic algorithms for time series forecasting Hierarchical and grouped time series 64
Australian tourism
Hierarchy:
States (7)
Zones (27)
Regions (82)
Base forecasts: ETS (exponential smoothing) models
Automatic algorithms for time series forecasting Hierarchical and grouped time series 65
Base forecasts
[Figures: domestic tourism base forecasts, visitor nights by year (1998–2008), for Total, NSW, VIC, Nth.Coast.NSW, Metro.QLD, Sth.WA, X201.Melbourne, X402.Murraylands and X809.Daly]
Automatic algorithms for time series forecasting Hierarchical and grouped time series 66
Reconciled forecasts
[Figures: reconciled forecasts, 2000–2010, for Total; NSW, VIC, QLD, Other; Sydney, Other NSW, Melbourne, Other VIC, GC and Brisbane, Other QLD, Capital cities, Other]
Automatic algorithms for time series forecasting Hierarchical and grouped time series 67
Forecast evaluation
Select models using all observations.
Re-estimate models using the first 12 observations and generate 1- to 8-step-ahead forecasts.
Increase the sample size one observation at a time, re-estimating the models and generating forecasts, until the end of the sample.
In total: 24 one-step-ahead forecasts, 23 two-steps-ahead, and so on down to 17 eight-steps-ahead, available for evaluation.
Automatic algorithms for time series forecasting Hierarchical and grouped time series 68
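This expanding-window scheme is straightforward to code with the forecast package. A sketch on a stand-in quarterly series (assumptions: the series is simulated, and for simplicity an ETS model is re-selected at each origin rather than fixed from the full sample as on the slide):

library(forecast)
set.seed(1)
y <- ts(100 + cumsum(rnorm(36)), frequency = 4)   # stand-in quarterly series, n = 36
h_max <- 8
err <- matrix(NA_real_, nrow = length(y), ncol = h_max)
# Expanding window: fit on the first 12 observations, then add one at a time
for (n in 12:(length(y) - 1)) {
  fit <- ets(window(y, end = time(y)[n]))         # re-estimate on first n observations
  fc  <- forecast(fit, h = min(h_max, length(y) - n))
  for (h in seq_along(fc$mean)) {
    err[n + h, h] <- y[n + h] - fc$mean[h]        # h-step-ahead forecast error
  }
}
round(colMeans(abs(err), na.rm = TRUE), 2)        # MAE by horizon, h = 1, ..., 8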
Hierarchy: states, zones, regions

MAPE                  h=1     h=2     h=4     h=6     h=8   Average
Top level: Australia
  Bottom-up          3.79    3.58    4.01    4.55    4.24      4.06
  OLS                3.83    3.66    3.88    4.19    4.25      3.94
  WLS                3.68    3.56    3.97    4.57    4.25      4.04
Level: States
  Bottom-up         10.70   10.52   10.85   11.46   11.27     11.03
  OLS               11.07   10.58   11.13   11.62   12.21     11.35
  WLS               10.44   10.17   10.47   10.97   10.98     10.67
Level: Zones
  Bottom-up         14.99   14.97   14.98   15.69   15.65     15.32
  OLS               15.16   15.06   15.27   15.74   16.15     15.48
  WLS               14.63   14.62   14.68   15.17   15.25     14.94
Bottom level: Regions
  Bottom-up         33.12   32.54   32.26   33.74   33.96     33.18
  OLS               35.89   33.86   34.26   36.06   37.49     35.43
  WLS               31.68   31.22   31.08   32.41   32.77     31.89

Automatic algorithms for time series forecasting Hierarchical and grouped time series 69
hts package for R
Automatic algorithms for time series forecasting Hierarchical and grouped time series 70
hts: Hierarchical and grouped time series
Methods for analysing and forecasting hierarchical and grouped
time series
Version: 4.5
Depends: forecast (≥ 5.0), SparseM
Imports: parallel, utils
Published: 2014-12-09
Author: Rob J Hyndman, Earo Wang and Alan Lee
Maintainer: Rob J Hyndman Rob.Hyndman at monash.edu
BugReports: https://github.com/robjhyndman/hts/issues
License: GPL (≥ 2)
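Typical usage (a sketch against the hts 4.x interface; the bottom-level matrix and the two-level hierarchy shape are invented stand-ins, not the tourism data):

library(hts)
# Nine made-up quarterly bottom-level series: 3 groups with 3 children each
bts <- ts(matrix(rnorm(36 * 9, mean = 50), ncol = 9), frequency = 4)
y <- hts(bts, nodes = list(3, rep(3, 3)))    # build the hierarchy
# Reconciled forecasts: ETS base forecasts, combined via least squares
fc <- forecast(y, h = 8, method = "comb", fmethod = "ets")
plot(fc)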
Outline
1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
Automatic algorithms for time series forecasting Recent developments 71
Further competitions
1 2011 tourism forecasting competition.
2 Kaggle and other forecasting platforms.
3 GEFCom 2012: Point forecasting of
electricity load and wind power.
4 GEFCom 2014: Probabilistic forecasting
of electricity load, electricity price,
wind energy and solar energy.
Automatic algorithms for time series forecasting Recent developments 72
Forecasts about forecasting
1 Automatic algorithms will become more
general — handling a wide variety of time
series.
2 Model selection methods will take account
of multi-step forecast accuracy as well as
one-step forecast accuracy.
3 Automatic forecasting algorithms for
multivariate time series will be developed.
4 Automatic forecasting algorithms that
include covariate information will be
developed.
Automatic algorithms for time series forecasting Recent developments 73
For further information
robjhyndman.com
Slides and references for this talk.
Links to all papers and books.
Links to R packages.
A blog about forecasting research.
Automatic algorithms for time series forecasting Recent developments 74

More Related Content

What's hot

What Is Prescriptive Analytics? Your 5-Minute Overview
What Is Prescriptive Analytics? Your 5-Minute OverviewWhat Is Prescriptive Analytics? Your 5-Minute Overview
What Is Prescriptive Analytics? Your 5-Minute OverviewShannon Kearns
 
Machine learning ~ Forecasting
Machine learning ~ ForecastingMachine learning ~ Forecasting
Machine learning ~ ForecastingShaswat Mandhanya
 
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic ArithmeticZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmeticharmonylab
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature EngineeringHJ van Veen
 
Anomaly detection (Unsupervised Learning) in Machine Learning
Anomaly detection (Unsupervised Learning) in Machine LearningAnomaly detection (Unsupervised Learning) in Machine Learning
Anomaly detection (Unsupervised Learning) in Machine LearningKuppusamy P
 
R言語による簡便な有意差の検出と信頼区間の構成
R言語による簡便な有意差の検出と信頼区間の構成R言語による簡便な有意差の検出と信頼区間の構成
R言語による簡便な有意差の検出と信頼区間の構成Toshiyuki Shimono
 
Scikit Learn intro
Scikit Learn introScikit Learn intro
Scikit Learn intro9xdot
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment AnalysisAditya Nag
 
Dynamic asset allocation strategy
Dynamic asset allocation strategyDynamic asset allocation strategy
Dynamic asset allocation strategyRifat Ahsan
 
Demand forecasting case study
Demand forecasting case studyDemand forecasting case study
Demand forecasting case studyRupam Devnath
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data AnalyticsOsman Ali
 
Data Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series ForecastingData Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series ForecastingDerek Kane
 
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0Mohsen Sadok
 
Prophet入門【Python編】Facebookの時系列予測ツール
Prophet入門【Python編】Facebookの時系列予測ツールProphet入門【Python編】Facebookの時系列予測ツール
Prophet入門【Python編】Facebookの時系列予測ツールhoxo_m
 
行政サービスにデータ資産を活かす: 公共交通データから考える行政の現場でのデータ活用のありかた
行政サービスにデータ資産を活かす: 公共交通データから考える行政の現場でのデータ活用のありかた行政サービスにデータ資産を活かす: 公共交通データから考える行政の現場でのデータ活用のありかた
行政サービスにデータ資産を活かす: 公共交通データから考える行政の現場でのデータ活用のありかたMasaki Ito
 
言語表現モデルBERTで文章生成してみた
言語表現モデルBERTで文章生成してみた言語表現モデルBERTで文章生成してみた
言語表現モデルBERTで文章生成してみたTakuya Koumura
 
Importance of Data Analytics
 Importance of Data Analytics Importance of Data Analytics
Importance of Data AnalyticsProduct School
 

What's hot (20)

What Is Prescriptive Analytics? Your 5-Minute Overview
What Is Prescriptive Analytics? Your 5-Minute OverviewWhat Is Prescriptive Analytics? Your 5-Minute Overview
What Is Prescriptive Analytics? Your 5-Minute Overview
 
Machine learning ~ Forecasting
Machine learning ~ ForecastingMachine learning ~ Forecasting
Machine learning ~ Forecasting
 
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic ArithmeticZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
Anomaly detection (Unsupervised Learning) in Machine Learning
Anomaly detection (Unsupervised Learning) in Machine LearningAnomaly detection (Unsupervised Learning) in Machine Learning
Anomaly detection (Unsupervised Learning) in Machine Learning
 
R言語による簡便な有意差の検出と信頼区間の構成
R言語による簡便な有意差の検出と信頼区間の構成R言語による簡便な有意差の検出と信頼区間の構成
R言語による簡便な有意差の検出と信頼区間の構成
 
Scikit Learn intro
Scikit Learn introScikit Learn intro
Scikit Learn intro
 
NLP
NLPNLP
NLP
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Time Series Analysis/ Forecasting
Time Series Analysis/ Forecasting  Time Series Analysis/ Forecasting
Time Series Analysis/ Forecasting
 
Dynamic asset allocation strategy
Dynamic asset allocation strategyDynamic asset allocation strategy
Dynamic asset allocation strategy
 
Demand forecasting case study
Demand forecasting case studyDemand forecasting case study
Demand forecasting case study
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Predictive analytics
Predictive analytics Predictive analytics
Predictive analytics
 
Data Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series ForecastingData Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series Forecasting
 
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
 
Prophet入門【Python編】Facebookの時系列予測ツール
Prophet入門【Python編】Facebookの時系列予測ツールProphet入門【Python編】Facebookの時系列予測ツール
Prophet入門【Python編】Facebookの時系列予測ツール
 
行政サービスにデータ資産を活かす: 公共交通データから考える行政の現場でのデータ活用のありかた
行政サービスにデータ資産を活かす: 公共交通データから考える行政の現場でのデータ活用のありかた行政サービスにデータ資産を活かす: 公共交通データから考える行政の現場でのデータ活用のありかた
行政サービスにデータ資産を活かす: 公共交通データから考える行政の現場でのデータ活用のありかた
 
言語表現モデルBERTで文章生成してみた
言語表現モデルBERTで文章生成してみた言語表現モデルBERTで文章生成してみた
言語表現モデルBERTで文章生成してみた
 
Importance of Data Analytics
 Importance of Data Analytics Importance of Data Analytics
Importance of Data Analytics
 

Viewers also liked

Advances in automatic time series forecasting
Advances in automatic time series forecastingAdvances in automatic time series forecasting
Advances in automatic time series forecastingRob Hyndman
 
Automatic time series forecasting
Automatic time series forecastingAutomatic time series forecasting
Automatic time series forecastingRob Hyndman
 
Visualization and forecasting of big time series data
Visualization and forecasting of big time series dataVisualization and forecasting of big time series data
Visualization and forecasting of big time series dataRob Hyndman
 
MEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demandMEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demandRob Hyndman
 
SimpleR: tips, tricks & tools
SimpleR: tips, tricks & toolsSimpleR: tips, tricks & tools
SimpleR: tips, tricks & toolsRob Hyndman
 
Exploring the feature space of large collections of time series
Exploring the feature space of large collections of time seriesExploring the feature space of large collections of time series
Exploring the feature space of large collections of time seriesRob Hyndman
 
Visualization of big time series data
Visualization of big time series dataVisualization of big time series data
Visualization of big time series dataRob Hyndman
 
Exploring the boundaries of predictability
Exploring the boundaries of predictabilityExploring the boundaries of predictability
Exploring the boundaries of predictabilityRob Hyndman
 
Forecasting Hierarchical Time Series
Forecasting Hierarchical Time SeriesForecasting Hierarchical Time Series
Forecasting Hierarchical Time SeriesRob Hyndman
 
Academia sinica jan-2015
Academia sinica jan-2015Academia sinica jan-2015
Academia sinica jan-2015Rob Hyndman
 
Forecasting without forecasters
Forecasting without forecastersForecasting without forecasters
Forecasting without forecastersRob Hyndman
 
Analyzing and forecasting time series data ppt @ bec doms
Analyzing and forecasting time series data ppt @ bec domsAnalyzing and forecasting time series data ppt @ bec doms
Analyzing and forecasting time series data ppt @ bec domsBabasab Patil
 
Chap15 time series forecasting & index number
Chap15 time series forecasting & index numberChap15 time series forecasting & index number
Chap15 time series forecasting & index numberUni Azza Aunillah
 
Time Series
Time SeriesTime Series
Time Seriesyush313
 
Probabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demandProbabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demandRob Hyndman
 
Supply Chain Index Rankings for 2006-2013 and 2009-2013
Supply Chain Index Rankings for 2006-2013 and 2009-2013Supply Chain Index Rankings for 2006-2013 and 2009-2013
Supply Chain Index Rankings for 2006-2013 and 2009-2013Lora Cecere
 
Demand Planning Leadership Exchange: Demand Sensing - Are You Ready?
Demand Planning Leadership Exchange: Demand Sensing - Are You Ready? Demand Planning Leadership Exchange: Demand Sensing - Are You Ready?
Demand Planning Leadership Exchange: Demand Sensing - Are You Ready? Plan4Demand
 
Supply Chains to Admire - An Analysis of Supply Chain Excellence for 2006-2013
Supply Chains to Admire -   An Analysis of Supply Chain Excellence for 2006-2013Supply Chains to Admire -   An Analysis of Supply Chain Excellence for 2006-2013
Supply Chains to Admire - An Analysis of Supply Chain Excellence for 2006-2013Lora Cecere
 
R tools for hierarchical time series
R tools for hierarchical time seriesR tools for hierarchical time series
R tools for hierarchical time seriesRob Hyndman
 

Viewers also liked (20)

Advances in automatic time series forecasting
Advances in automatic time series forecastingAdvances in automatic time series forecasting
Advances in automatic time series forecasting
 
Automatic time series forecasting
Automatic time series forecastingAutomatic time series forecasting
Automatic time series forecasting
 
Visualization and forecasting of big time series data
Visualization and forecasting of big time series dataVisualization and forecasting of big time series data
Visualization and forecasting of big time series data
 
MEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demandMEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demand
 
SimpleR: tips, tricks & tools
SimpleR: tips, tricks & toolsSimpleR: tips, tricks & tools
SimpleR: tips, tricks & tools
 
Exploring the feature space of large collections of time series
Exploring the feature space of large collections of time seriesExploring the feature space of large collections of time series
Exploring the feature space of large collections of time series
 
Visualization of big time series data
Visualization of big time series dataVisualization of big time series data
Visualization of big time series data
 
Exploring the boundaries of predictability
Exploring the boundaries of predictabilityExploring the boundaries of predictability
Exploring the boundaries of predictability
 
Forecasting Hierarchical Time Series
Forecasting Hierarchical Time SeriesForecasting Hierarchical Time Series
Forecasting Hierarchical Time Series
 
Academia sinica jan-2015
Academia sinica jan-2015Academia sinica jan-2015
Academia sinica jan-2015
 
Forecasting without forecasters
Forecasting without forecastersForecasting without forecasters
Forecasting without forecasters
 
Analyzing and forecasting time series data ppt @ bec doms
Analyzing and forecasting time series data ppt @ bec domsAnalyzing and forecasting time series data ppt @ bec doms
Analyzing and forecasting time series data ppt @ bec doms
 
Time series Forecasting
Time series ForecastingTime series Forecasting
Time series Forecasting
 
Chap15 time series forecasting & index number
Chap15 time series forecasting & index numberChap15 time series forecasting & index number
Chap15 time series forecasting & index number
 
Time Series
Time SeriesTime Series
Time Series
 
Probabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demandProbabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demand
 
Supply Chain Index Rankings for 2006-2013 and 2009-2013
Supply Chain Index Rankings for 2006-2013 and 2009-2013Supply Chain Index Rankings for 2006-2013 and 2009-2013
Supply Chain Index Rankings for 2006-2013 and 2009-2013
 
Demand Planning Leadership Exchange: Demand Sensing - Are You Ready?
Demand Planning Leadership Exchange: Demand Sensing - Are You Ready? Demand Planning Leadership Exchange: Demand Sensing - Are You Ready?
Demand Planning Leadership Exchange: Demand Sensing - Are You Ready?
 
Supply Chains to Admire - An Analysis of Supply Chain Excellence for 2006-2013
Supply Chains to Admire -   An Analysis of Supply Chain Excellence for 2006-2013Supply Chains to Admire -   An Analysis of Supply Chain Excellence for 2006-2013
Supply Chains to Admire - An Analysis of Supply Chain Excellence for 2006-2013
 
R tools for hierarchical time series
R tools for hierarchical time seriesR tools for hierarchical time series
R tools for hierarchical time series
 

Similar to Automatic algorithms for time series forecasting

Automatic time series forecasting
Automatic time series forecastingAutomatic time series forecasting
Automatic time series forecastingRob Hyndman
 
Agile analytics : An exploratory study of technical complexity management
Agile analytics : An exploratory study of technical complexity managementAgile analytics : An exploratory study of technical complexity management
Agile analytics : An exploratory study of technical complexity managementAgnirudra Sikdar
 
FPP 1. Getting started
FPP 1. Getting startedFPP 1. Getting started
FPP 1. Getting startedRob Hyndman
 
Demand Forecasting of a Perishable Dairy Drink: An ARIMA Approach
Demand Forecasting of a Perishable Dairy Drink: An ARIMA ApproachDemand Forecasting of a Perishable Dairy Drink: An ARIMA Approach
Demand Forecasting of a Perishable Dairy Drink: An ARIMA ApproachIJDKP
 
Presentation for lama.pptx
Presentation for lama.pptxPresentation for lama.pptx
Presentation for lama.pptxAdityaNath38
 
A data science observatory based on RAMP - rapid analytics and model prototyping
A data science observatory based on RAMP - rapid analytics and model prototypingA data science observatory based on RAMP - rapid analytics and model prototyping
A data science observatory based on RAMP - rapid analytics and model prototypingAkin Osman Kazakci
 
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...Carlos Pena
 
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...Carlos Pena
 
STOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESSTOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESIRJET Journal
 
STOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESSTOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESIRJET Journal
 
Notion of an algorithm
Notion of an algorithmNotion of an algorithm
Notion of an algorithmNisha Soms
 
Forecasting_Quantitative Forecasting.pptx
Forecasting_Quantitative Forecasting.pptxForecasting_Quantitative Forecasting.pptx
Forecasting_Quantitative Forecasting.pptxRituparnaDas584083
 
AnnualAutomobileSalesPredictionusingARIMAModel (2).pdf
AnnualAutomobileSalesPredictionusingARIMAModel (2).pdfAnnualAutomobileSalesPredictionusingARIMAModel (2).pdf
AnnualAutomobileSalesPredictionusingARIMAModel (2).pdfFarhad Sagor
 
NAA Maximize 2015 - Presentation on In-depth Analytics of Pricing Discovery
NAA Maximize 2015 - Presentation on In-depth Analytics of Pricing DiscoveryNAA Maximize 2015 - Presentation on In-depth Analytics of Pricing Discovery
NAA Maximize 2015 - Presentation on In-depth Analytics of Pricing DiscoveryThe Rainmaker Group
 
06-00-ACA-Evaluation.pdf
06-00-ACA-Evaluation.pdf06-00-ACA-Evaluation.pdf
06-00-ACA-Evaluation.pdfAlexanderLerch4
 

Similar to Automatic algorithms for time series forecasting (20)

Automatic time series forecasting
Automatic time series forecastingAutomatic time series forecasting
Automatic time series forecasting
 
Agile analytics : An exploratory study of technical complexity management
Agile analytics : An exploratory study of technical complexity managementAgile analytics : An exploratory study of technical complexity management
Agile analytics : An exploratory study of technical complexity management
 
FPP 1. Getting started
FPP 1. Getting startedFPP 1. Getting started
FPP 1. Getting started
 
Undergraduate Modeling Workshop - Air Quality Working Group Final Presentatio...
Undergraduate Modeling Workshop - Air Quality Working Group Final Presentatio...Undergraduate Modeling Workshop - Air Quality Working Group Final Presentatio...
Undergraduate Modeling Workshop - Air Quality Working Group Final Presentatio...
 
Demand Forecasting of a Perishable Dairy Drink: An ARIMA Approach
Demand Forecasting of a Perishable Dairy Drink: An ARIMA ApproachDemand Forecasting of a Perishable Dairy Drink: An ARIMA Approach
Demand Forecasting of a Perishable Dairy Drink: An ARIMA Approach
 
Presentation for lama.pptx
Presentation for lama.pptxPresentation for lama.pptx
Presentation for lama.pptx
 
I045046066
I045046066I045046066
I045046066
 
A data science observatory based on RAMP - rapid analytics and model prototyping
A data science observatory based on RAMP - rapid analytics and model prototypingA data science observatory based on RAMP - rapid analytics and model prototyping
A data science observatory based on RAMP - rapid analytics and model prototyping
 
Stock analysis report
Stock analysis reportStock analysis report
Stock analysis report
 
Dj4201737746
Dj4201737746Dj4201737746
Dj4201737746
 
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
 
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
REVOLUTIONIZING PREDICTIVE REAL-TIME REMOTE MONITORING OF NATURAL GAS-FIRED R...
 
STOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESSTOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIES
 
STOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESSTOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIES
 
Notion of an algorithm
Notion of an algorithmNotion of an algorithm
Notion of an algorithm
 
Power ai t-imeseries
Power ai t-imeseriesPower ai t-imeseries
Power ai t-imeseries
 
Forecasting_Quantitative Forecasting.pptx
Forecasting_Quantitative Forecasting.pptxForecasting_Quantitative Forecasting.pptx
Forecasting_Quantitative Forecasting.pptx
 
AnnualAutomobileSalesPredictionusingARIMAModel (2).pdf
AnnualAutomobileSalesPredictionusingARIMAModel (2).pdfAnnualAutomobileSalesPredictionusingARIMAModel (2).pdf
AnnualAutomobileSalesPredictionusingARIMAModel (2).pdf
 
NAA Maximize 2015 - Presentation on In-depth Analytics of Pricing Discovery
NAA Maximize 2015 - Presentation on In-depth Analytics of Pricing DiscoveryNAA Maximize 2015 - Presentation on In-depth Analytics of Pricing Discovery
NAA Maximize 2015 - Presentation on In-depth Analytics of Pricing Discovery
 
06-00-ACA-Evaluation.pdf
06-00-ACA-Evaluation.pdf06-00-ACA-Evaluation.pdf
06-00-ACA-Evaluation.pdf
 

Recently uploaded

Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.pptamreenkhanum0307
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in collegessuser7a7cd61
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 

Recently uploaded (20)

Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.ppt
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in college
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 

Automatic algorithms for time series forecasting

  • 1. Rob J Hyndman Automatic algorithms for time series forecasting
  • 2. Outline 1 Motivation 2 Forecasting competitions 3 Exponential smoothing 4 ARIMA modelling 5 Automatic nonlinear forecasting? 6 Time series with complex seasonality 7 Hierarchical and grouped time series 8 Recent developments Automatic algorithms for time series forecasting Motivation 2
  • 3. Motivation Automatic algorithms for time series forecasting Motivation 3
  • 4. Motivation Automatic algorithms for time series forecasting Motivation 3
  • 5. Motivation Automatic algorithms for time series forecasting Motivation 3
  • 6. Motivation Automatic algorithms for time series forecasting Motivation 3
  • 7. Motivation Automatic algorithms for time series forecasting Motivation 3
  • 8. Motivation 1 Common in business to have over 1000 products that need forecasting at least monthly. 2 Forecasts are often required by people who are untrained in time series analysis. Specifications Automatic forecasting algorithms must: ¯ determine an appropriate time series model; ¯ estimate the parameters; ¯ compute the forecasts with prediction intervals. Automatic algorithms for time series forecasting Motivation 4
  • 9. Motivation 1 Common in business to have over 1000 products that need forecasting at least monthly. 2 Forecasts are often required by people who are untrained in time series analysis. Specifications Automatic forecasting algorithms must: ¯ determine an appropriate time series model; ¯ estimate the parameters; ¯ compute the forecasts with prediction intervals. Automatic algorithms for time series forecasting Motivation 4
  • 10. Example: Asian sheep Automatic algorithms for time series forecasting Motivation 5 Numbers of sheep in Asia Year millionsofsheep 1960 1970 1980 1990 2000 2010 250300350400450500550
  • 11. Example: Asian sheep Automatic algorithms for time series forecasting Motivation 5 Automatic ETS forecasts Year millionsofsheep 1960 1970 1980 1990 2000 2010 250300350400450500550
  • 12. Example: Cortecosteroid sales Automatic algorithms for time series forecasting Motivation 6 Monthly cortecosteroid drug sales in Australia Year Totalscripts(millions) 1995 2000 2005 2010 0.40.60.81.01.21.4
  • 13. Example: Cortecosteroid sales Automatic algorithms for time series forecasting Motivation 6 Forecasts from ARIMA(3,1,3)(0,1,1)[12] Year Totalscripts(millions) 1995 2000 2005 2010 0.40.60.81.01.21.41.6
  • 14. Outline 1 Motivation 2 Forecasting competitions 3 Exponential smoothing 4 ARIMA modelling 5 Automatic nonlinear forecasting? 6 Time series with complex seasonality 7 Hierarchical and grouped time series 8 Recent developments Automatic algorithms for time series forecasting Forecasting competitions 7
  • 15. Makridakis and Hibon (1979) Automatic algorithms for time series forecasting Forecasting competitions 8
  • 16. Makridakis and Hibon (1979) Automatic algorithms for time series forecasting Forecasting competitions 8
  • 17. Makridakis and Hibon (1979) This was the first large-scale empirical evaluation of time series forecasting methods. Highly controversial at the time. Difficulties: How to measure forecast accuracy? How to apply methods consistently and objectively? How to explain unexpected results? Common thinking was that the more sophisticated mathematical models (ARIMA models at the time) were necessarily better. If results showed ARIMA models not best, it must be because analyst was unskilled. Automatic algorithms for time series forecasting Forecasting competitions 9
  • 18. Makridakis and Hibon (1979) It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors. — W.G. Gilchrist I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods . . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield Automatic algorithms for time series forecasting Forecasting competitions 10
  • 19. Makridakis and Hibon (1979) It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors. — W.G. Gilchrist I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods . . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield Automatic algorithms for time series forecasting Forecasting competitions 10
  • 20. Consequences of MH (1979) As a result of this paper, researchers started to: ¯ consider how to automate forecasting methods; ¯ study what methods give the best forecasts; ¯ be aware of the dangers of over-fitting; ¯ treat forecasting as a different problem from time series analysis. Makridakis Hibon followed up with a new competition in 1982: 1001 series Anyone could submit forecasts (avoiding the charge of incompetence) Multiple forecast measures used. Automatic algorithms for time series forecasting Forecasting competitions 11
  • 21. Consequences of MH (1979) As a result of this paper, researchers started to: ¯ consider how to automate forecasting methods; ¯ study what methods give the best forecasts; ¯ be aware of the dangers of over-fitting; ¯ treat forecasting as a different problem from time series analysis. Makridakis Hibon followed up with a new competition in 1982: 1001 series Anyone could submit forecasts (avoiding the charge of incompetence) Multiple forecast measures used. Automatic algorithms for time series forecasting Forecasting competitions 11
  • 22. M-competition Automatic algorithms for time series forecasting Forecasting competitions 12
  • 23. M-competition Main findings (taken from Makridakis Hibon, 2000) 1 Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones. 2 The relative ranking of the performance of the various methods varies according to the accuracy measure being used. 3 The accuracy when various methods are being combined outperforms, on average, the individual methods being combined and does very well in comparison to other methods. 4 The accuracy of the various methods depends upon the length of the forecasting horizon involved. Automatic algorithms for time series forecasting Forecasting competitions 13
  • 24. M3 competition Automatic algorithms for time series forecasting Forecasting competitions 14
  • 25. Makridakis and Hibon (2000) “The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods. . . The extension involves the inclusion of more methods/ researchers (in particular in the areas of neural networks and expert systems) and more series.” 3003 series All data from business, demography, finance and economics. Series length between 14 and 126. Either non-seasonal, monthly or quarterly. All time series positive. MH claimed that the M3-competition supported the findings of their earlier work. However, best performing methods far from “simple”. Automatic algorithms for time series forecasting Forecasting competitions 15
  • 26. Makridakis and Hibon (2000)
Best methods:
Theta: A very confusing explanation. Shown by Hyndman and Billah (2003) to be an average of linear regression and simple exponential smoothing with drift, applied to seasonally adjusted data. Later, the original authors claimed that their explanation was incorrect.
Forecast Pro: A commercial software package with an unknown algorithm. Known to fit either exponential smoothing or ARIMA models using BIC.
  • 28. M3 results (recalculated)
Method          MAPE    sMAPE   MASE
Theta           17.42   12.76   1.39
ForecastPro     18.00   13.06   1.47
ForecastX       17.35   13.09   1.42
Automatic ANN   17.18   13.98   1.53
B-J automatic   19.13   13.72   1.54
Notes: these calculations do not match the published paper; some contestants apparently submitted multiple entries but only the best ones were published.
  • 29. Outline
1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
  • 37. Exponential smoothing methods
                            Seasonal Component
Trend Component             N (None)   A (Additive)   M (Multiplicative)
N  (None)                   N,N        N,A            N,M
A  (Additive)               A,N        A,A            A,M
Ad (Additive damped)        Ad,N       Ad,A           Ad,M
M  (Multiplicative)         M,N        M,A            M,M
Md (Multiplicative damped)  Md,N       Md,A           Md,M
N,N: Simple exponential smoothing
A,N: Holt’s linear method
Ad,N: Additive damped trend method
M,N: Exponential trend method
Md,N: Multiplicative damped trend method
A,A: Additive Holt-Winters’ method
A,M: Multiplicative Holt-Winters’ method
  • 41. Exponential smoothing methods
¯ There are 15 separate exponential smoothing methods.
¯ Each can have an additive or multiplicative error, giving 30 separate models.
¯ Only 19 models are numerically stable.
¯ Multiplicative trend models give poor forecasts, leaving 15 usable models.
  • 48. Exponential smoothing methods
General notation E,T,S: ExponenTial Smoothing (E = Error, T = Trend, S = Seasonal).
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors
Innovations state space models
¯ All ETS models can be written in innovations state space form (IJF, 2002).
¯ Additive and multiplicative versions give the same point forecasts but different prediction intervals.
  • 58. ETS state space model
[Diagram: each observation y_t is generated from the previous state x_{t−1} and an innovation ε_t; the state is then updated to x_t, and the chain continues through y_{t+1}, ε_{t+1}, x_{t+1}, . . .]
State space model: x_t = (level, slope, seasonal).
Estimation: compute the likelihood L from ε_1, ε_2, . . . , ε_T, then optimize L with respect to the model parameters.
  • 59. Innovations state space models
Let x_t = (ℓ_t, b_t, s_t, s_{t−1}, . . . , s_{t−m+1}) and ε_t ~ iid N(0, σ²).
Observation equation: y_t = h(x_{t−1}) + k(x_{t−1}) ε_t, where μ_t = h(x_{t−1}) and e_t = k(x_{t−1}) ε_t.
State equation: x_t = f(x_{t−1}) + g(x_{t−1}) ε_t.
Additive errors: k(x_{t−1}) = 1, so y_t = μ_t + ε_t.
Multiplicative errors: k(x_{t−1}) = μ_t, so y_t = μ_t(1 + ε_t), and ε_t = (y_t − μ_t)/μ_t is a relative error.
  • 65. Innovations state space models
¯ All models can be written in state space form.
¯ Additive and multiplicative versions give the same point forecasts but different prediction intervals.
Estimation
L*(θ, x_0) = n log( Σ_{t=1}^{n} ε_t² / k²(x_{t−1}) ) + 2 Σ_{t=1}^{n} log |k(x_{t−1})|
           = −2 log(Likelihood) + constant.
Minimize with respect to θ = (α, β, γ, φ) and the initial states x_0 = (ℓ_0, b_0, s_0, s_{−1}, . . . , s_{−m+1}).
Q: How to choose between the 15 useful ETS models?
  • 70. Cross-validation
[Diagram comparing three schemes, marking training data and test data along the time axis:]
¯ Traditional evaluation: one training set followed by one test set.
¯ Standard cross-validation: repeated random partitions of the data into training and test sets.
¯ Time series cross-validation: a sequence of training sets, each extended by one observation, with the following observation(s) used for testing.
Also known as “evaluation on a rolling forecast origin”.
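As an aside, rolling-origin cross-validation is easy to implement directly. The following is a minimal sketch in R (not from the talk), assuming the livestock series from the earlier slides is loaded (it ships with the fpp package); the minimum training size k is an arbitrary illustrative choice.
# Rolling-origin (time series) cross-validation: refit at each origin,
# record the one-step-ahead forecast error.
library(forecast)
k <- 20                                    # minimum training size (illustrative)
e <- rep(NA, length(livestock) - k)        # one-step forecast errors
for (i in seq_along(e)) {
  train <- window(livestock, end = time(livestock)[k + i - 1])
  fc    <- forecast(ets(train), h = 1)     # refit and forecast one step ahead
  e[i]  <- livestock[k + i] - fc$mean[1]
}
cv_mse <- mean(e^2)                        # CV-MSE, comparable across models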
  • 75. Akaike’s Information Criterion
AIC = −2 log(L) + 2k
where L is the likelihood and k is the number of estimated parameters in the model.
¯ This is a penalized likelihood approach.
¯ If L is Gaussian, then AIC ≈ c + T log(MSE) + 2k, where c is a constant, MSE is from one-step forecasts on the training set, and T is the length of the series.
¯ Minimizing the Gaussian AIC is asymptotically equivalent (as T → ∞) to minimizing the MSE of one-step forecasts on the test set via time series cross-validation.
  • 78. Akaike’s Information Criterion
AIC = −2 log(L) + 2k
Corrected AIC
For small T, the AIC tends to over-fit. A bias-corrected version is
AICc = AIC + 2(k+1)(k+2)/(T−k).
Bayesian Information Criterion
BIC = AIC + k[log(T) − 2]
¯ BIC penalizes terms more heavily than AIC.
¯ Minimizing BIC is consistent if there is a true model.
  • 83. What to use?
Choice: AIC, AICc, BIC, CV-MSE.
¯ CV-MSE is too time-consuming for most automatic forecasting purposes, and requires large T.
¯ As T → ∞, BIC selects the true model if there is one. But that is never true!
¯ AICc focuses on forecasting performance, can be used on small samples, and is very fast to compute.
¯ Empirical studies in forecasting show AIC is better than BIC for forecast accuracy.
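In the forecast package this choice is exposed directly via the ic argument to ets(); a small hedged example (defaults may vary by package version):
library(forecast)
fit_aicc <- ets(livestock, ic = "aicc")  # selection by AICc (the default)
fit_bic  <- ets(livestock, ic = "bic")   # selection by BIC, for comparison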
  • 87. ets algorithm in R
Based on Hyndman, Koehler, Snyder & Grose (IJF, 2002):
¯ Apply each of the 15 models that are appropriate to the data.
¯ Optimize parameters and initial values using MLE.
¯ Select the best model using AICc.
¯ Produce forecasts using the best model.
¯ Obtain prediction intervals using the underlying state space model.
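The selection step can be mimicked by hand. This sketch is illustrative only, not the package's internal code; the candidate list is an arbitrary non-seasonal subset, and it assumes ets objects expose an aicc component (as their printed output suggests):
library(forecast)
candidates <- c("ANN", "AAN", "MNN", "MAN")                # illustrative subset of the 15 models
fits  <- lapply(candidates, function(m) ets(livestock, model = m))
aiccs <- sapply(fits, function(f) f$aicc)                  # assumed component name
best  <- fits[[which.min(aiccs)]]                          # keep the model with lowest AICc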
  • 89. Exponential smoothing
fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ETS(M,A,N); x-axis Year (1960–2010), y-axis millions of sheep.]
  • 91. Exponential smoothing
fit <- ets(h02)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ETS(M,N,M); x-axis Year (1995–2010), y-axis total scripts (millions).]
  • 92. Exponential smoothing
fit
ETS(M,N,M)
Smoothing parameters:
  alpha = 0.4597
  gamma = 1e-04
Initial states:
  l = 0.4501
  s = 0.8628 0.8193 0.7648 0.7675 0.6946 1.2921
      1.3327 1.1833 1.1617 1.0899 1.0377 0.9937
sigma: 0.0675
       AIC       AICc        BIC
-115.69960 -113.47738  -69.24592
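Hedged note: the fitted object can be queried further. accuracy() is part of the forecast package; the component names for the criteria are assumptions based on the printed output above:
accuracy(fit)                    # training-set accuracy measures (ME, RMSE, MAPE, MASE, ...)
c(fit$aic, fit$aicc, fit$bic)    # assumed component names for the criteria shown above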
  • 93. M3 comparisons
Method          MAPE    sMAPE   MASE
Theta           17.42   12.76   1.39
ForecastPro     18.00   13.06   1.47
ForecastX       17.35   13.09   1.42
Automatic ANN   17.18   13.98   1.53
B-J automatic   19.13   13.72   1.54
ETS             17.38   13.13   1.43
  • 95. Exponential smoothing
[Image] www.OTexts.org/fpp
  • 97. Outline
1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
  • 102. ARIMA models
[Diagram: lagged values y_{t−1}, y_{t−2}, y_{t−3} and lagged errors ε_t, ε_{t−1}, ε_{t−2} as inputs; y_t as output.]
Autoregression moving average (ARMA) model.
Estimation: compute the likelihood L from ε_1, ε_2, . . . , ε_T; use an optimization algorithm to maximize L.
ARIMA model: an ARMA model applied to differences of the data.
  • 103. ARIMA modelling
  • 107. Auto ARIMA
fit <- auto.arima(livestock)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ARIMA(0,1,0) with drift; x-axis Year (1960–2010), y-axis millions of sheep.]
  • 109. Auto ARIMA
fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12]; x-axis Year (1995–2010), y-axis total scripts (millions).]
  • 110. Auto ARIMA
fit
Series: h02
ARIMA(3,1,3)(0,1,1)[12]
Coefficients:
          ar1      ar2     ar3      ma1     ma2     ma3     sma1
      -0.3648  -0.0636  0.3568  -0.4850  0.0479  -0.353  -0.5931
s.e.   0.2198   0.3293  0.1268   0.2227  0.2755   0.212   0.0651
sigma^2 estimated as 0.002706: log likelihood = 290.25
AIC = -564.5   AICc = -563.71   BIC = -538.48
  • 113. How does auto.arima() work?
A non-seasonal ARIMA process:
φ(B)(1 − B)^d y_t = c + θ(B) ε_t
Need to select appropriate orders p, q, d, and whether to include c.
Hyndman & Khandakar (JSS, 2008) algorithm:
¯ Select the number of differences d via KPSS unit root tests.
¯ Select p, q, c by minimising the AICc.
¯ Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
Algorithm choices driven by forecast accuracy.
  • 114. How does auto.arima() work?
A seasonal ARIMA process:
Φ(B^m) φ(B) (1 − B)^d (1 − B^m)^D y_t = c + Θ(B^m) θ(B) ε_t
Need to select appropriate orders p, q, d, P, Q, D, and whether to include c.
Hyndman & Khandakar (JSS, 2008) algorithm:
¯ Select the number of differences d via KPSS unit root tests.
¯ Select D using the OCSB unit root test.
¯ Select p, q, P, Q, c by minimising the AICc.
¯ Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
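A hedged usage note: the stepwise search and its shortcuts can be turned off in forecast::auto.arima at the cost of speed (argument names from the forecast package; behaviour may differ across versions):
library(forecast)
fit <- auto.arima(h02,
                  stepwise = FALSE,       # search the full model space instead of stepwise
                  approximation = FALSE,  # use exact likelihood when comparing models
                  ic = "aicc")            # selection criterion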
  • 115. M3 comparisons
Method          MAPE    sMAPE   MASE
Theta           17.42   12.76   1.39
ForecastPro     18.00   13.06   1.47
B-J automatic   19.13   13.72   1.54
ETS             17.38   13.13   1.43
AutoARIMA       19.12   13.85   1.47
  • 116. Outline
1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
  • 121. Automatic nonlinear forecasting
¯ Automatic ANN in the M3 competition did poorly.
¯ Linear methods did best in the NN3 competition!
¯ Very few machine learning methods get published in the IJF because authors cannot demonstrate that their methods give better forecasts than linear benchmark methods, even on supposedly nonlinear data.
¯ Some good recent work by Kourentzes and Crone on automated ANNs for time series.
¯ Watch this space!
  • 122. Outline
1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
  • 123. Examples
[Figure: US finished motor gasoline products; weekly, 1992–2004; y-axis thousands of barrels per day.]
  • 124. Examples
[Figure: Number of calls to a large American bank (7am–9pm); 5-minute intervals, 3 March to 12 May; y-axis number of call arrivals.]
  • 125. Examples
[Figure: Turkish electricity demand; daily, 2000–2008; y-axis electricity demand (GW).]
  • 126. TBATS model
TBATS:
¯ Trigonometric terms for seasonality
¯ Box-Cox transformations for heterogeneity
¯ ARMA errors for short-term dynamics
¯ Trend (possibly damped)
¯ Seasonal (including multiple and non-integer periods)
Automatic algorithm described in AM De Livera, RJ Hyndman, and RD Snyder (2011). “Forecasting time series with complex seasonal patterns using exponential smoothing”. Journal of the American Statistical Association 106(496), 1513–1527.
  • 133. TBATS model
y_t = observation at time t.
Box-Cox transformation:
y_t^{(ω)} = (y_t^ω − 1)/ω if ω ≠ 0;  log y_t if ω = 0.
Measurement equation (M seasonal periods):
y_t^{(ω)} = ℓ_{t−1} + φ b_{t−1} + Σ_{i=1}^{M} s^{(i)}_{t−m_i} + d_t
Global and local trend:
ℓ_t = ℓ_{t−1} + φ b_{t−1} + α d_t
b_t = (1 − φ) b + φ b_{t−1} + β d_t
ARMA error:
d_t = Σ_{i=1}^{p} φ_i d_{t−i} + Σ_{j=1}^{q} θ_j ε_{t−j} + ε_t
Fourier-like seasonal terms:
s^{(i)}_t = Σ_{j=1}^{k_i} s^{(i)}_{j,t}
s^{(i)}_{j,t}  =  s^{(i)}_{j,t−1} cos λ^{(i)}_j + s^{*(i)}_{j,t−1} sin λ^{(i)}_j + γ^{(i)}_1 d_t
s^{*(i)}_{j,t} = −s^{(i)}_{j,t−1} sin λ^{(i)}_j + s^{*(i)}_{j,t−1} cos λ^{(i)}_j + γ^{(i)}_2 d_t
TBATS: Trigonometric, Box-Cox, ARMA, Trend, Seasonal.
  • 134. Examples
fit <- tbats(gasoline)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from TBATS(0.999, {2,2}, 1, {52.1785714285714,8}); x-axis weeks (1995–2005), y-axis thousands of barrels per day.]
  • 135. Examples
fit <- tbats(callcentre)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from TBATS(1, {3,1}, 0.987, {169,5, 845,3}); 5-minute intervals, 3 March to 9 June; y-axis number of call arrivals.]
  • 136. Examples
fit <- tbats(turk)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from TBATS(0, {5,3}, 0.997, {7,3, 354.37,12, 365.25,4}); x-axis days (2000–2010), y-axis electricity demand (GW).]
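For series with more than one seasonal period, tbats() is typically paired with the msts class from the same package. A minimal sketch, reusing the call-centre periods from the example above (169 five-minute intervals per day, 845 per week):
library(forecast)
y   <- msts(callcentre, seasonal.periods = c(169, 845))  # daily and weekly cycles
fit <- tbats(y)
plot(forecast(fit, h = 845))                             # forecast one further week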
  • 137. Outline
1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
  • 140. Hierarchical time series
A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure.
Total → A (AA, AB, AC), B (BA, BB, BC), C (CA, CB, CC)
Examples: net labour turnover; tourism by state and region.
  • 146. Hierarchical time series
Two-level hierarchy: Total → A, B, C.
Y_t: observed aggregate of all series at time t.
Y_{X,t}: observation on series X at time t.
b_t: vector of all series at the bottom level at time t.
y_t = (Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t})′ = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \end{pmatrix} = S b_t
  • 149. Hierarchical time series
Total → A (AX, AY, AZ), B (BX, BY, BZ), C (CX, CY, CZ).
y_t = (Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t}, Y_{AX,t}, Y_{AY,t}, Y_{AZ,t}, Y_{BX,t}, Y_{BY,t}, Y_{BZ,t}, Y_{CX,t}, Y_{CY,t}, Y_{CZ,t})′ = S b_t,
where b_t = (Y_{AX,t}, . . . , Y_{CZ,t})′ and S is the 13 × 9 summing matrix whose first row is all ones (the total), whose next three rows aggregate the three children of A, B and C respectively, and whose last nine rows form the identity matrix I_9.
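A minimal base-R sketch (not from the talk) of constructing S for this two-level hierarchy and verifying y_t = S b_t on simulated bottom-level data:
agg <- rbind(rep(1, 9),                          # Total: sum of all nine bottom series
             kronecker(diag(3), t(rep(1, 3))))   # A, B, C: sums of their three children
S   <- rbind(agg, diag(9))                       # 13 x 9 summing matrix
b   <- matrix(rnorm(9 * 24), nrow = 9)           # nine bottom-level series, 24 periods
y   <- S %*% b                                   # all 13 series; rows 5-13 equal b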
  • 154. Forecasting notation
Let ŷ_n(h) be the vector of initial h-step forecasts, made at time n, stacked in the same order as y_t. (They may not add up.)
Reconciled forecasts are of the form ỹ_n(h) = S P ŷ_n(h) for some matrix P:
¯ P extracts and combines the base forecasts ŷ_n(h) to get bottom-level forecasts;
¯ S adds them up.
  • 157. General properties
ỹ_n(h) = S P ŷ_n(h)
Forecast bias: assuming the base forecasts ŷ_n(h) are unbiased, the revised forecasts are unbiased iff S P S = S.
Forecast variance: for any P satisfying S P S = S, the covariance matrix of the h-step-ahead reconciled forecast errors is
Var[y_{n+h} − ỹ_n(h)] = S P W_h P′ S′,
where W_h is the covariance matrix of the h-step-ahead base forecast errors.
  • 163. BLUF via trace minimization
Theorem: among all P satisfying S P S = S, the minimizer of trace[S P W_h P′ S′] is
P = (S′ W_h† S)^{−1} S′ W_h†,
where W_h† is a generalized inverse of W_h. Hence the revised forecasts are
ỹ_n(h) = S (S′ W_h† S)^{−1} S′ W_h† ŷ_n(h)   (revised ← base).
¯ Equivalent to the GLS estimate of the regression ŷ_n(h) = S β_n(h) + ε_h, where ε_h ∼ N(0, W_h).
¯ Problem: W_h is hard to estimate.
  • 164. Optimal combination forecasts Revised forecasts Base forecasts Solution 1: OLS ˜yn(h) = S(S S)−1 S ˆyn(h) Solution 2: WLS Approximate W1 by its diagonal. Assume Wh = khW1. Easy to estimate, and places weight where we have best one-step forecasts. ˜yn(h) = S(S ΛS)−1 S Λˆyn(h) Automatic algorithms for time series forecasting Hierarchical and grouped time series 63 ˜yn(h) = S(S W † hS)−1 S W † h ˆyn(h)
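Both special cases follow the same pattern as the GLS sketch above. In this illustrative version, sigma2 stands for the one-step base-forecast error variances, so diag(1/sigma2) plays the role of $\Lambda$; the function names are again hypothetical.

```r
# Solution 1: OLS reconciliation (ignores the error covariances entirely).
reconcile_ols <- function(yhat, S) {
  drop(S %*% solve(t(S) %*% S, t(S) %*% yhat))
}

# Solution 2: WLS reconciliation, weighting by inverse one-step variances.
reconcile_wls <- function(yhat, S, sigma2) {
  Lambda <- diag(1 / sigma2)
  drop(S %*% solve(t(S) %*% Lambda %*% S, t(S) %*% Lambda %*% yhat))
}
```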
• 171. Challenges

Computational difficulties in big hierarchies due to the size of the $S$ matrix and the singular behaviour of $(S' \Lambda S)$.

Loss of information from ignoring the covariance matrix when computing point forecasts.

Still need to estimate the covariance matrix to produce prediction intervals.
• 176. Australian tourism

Hierarchy: States (7), Zones (27), Regions (82).
Base forecasts: ETS (exponential smoothing) models.
• 177–185. Base forecasts

[Figures: ETS base forecasts of domestic tourism (visitor nights by year, 1998–2008) for Total, NSW, VIC, Nth.Coast.NSW, Metro.QLD, Sth.WA, X201.Melbourne, X402.Murraylands and X809.Daly.]
• 186–188. Reconciled forecasts

[Figures: reconciled domestic tourism forecasts, 2000–2010, at successive levels: Total; NSW, VIC, QLD, Other; Sydney, Other NSW, Melbourne, Other VIC, GC and Brisbane, Other QLD, Capital cities, Other.]
• 189. Forecast evaluation

Select models using all observations.
Re-estimate models using the first 12 observations and generate 1- to 8-step-ahead forecasts.
Increase the sample size one observation at a time, re-estimating the models and generating forecasts until the end of the sample.
In total: 24 one-step-ahead forecasts, 23 two-step-ahead, and so on down to 17 eight-step-ahead forecasts for evaluation. (A sketch of this rolling-origin scheme is given below.)
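A minimal sketch of this rolling-origin evaluation, assuming a univariate numeric series y of length 36 (which yields 24 one-step-ahead and 17 eight-step-ahead forecasts) and ETS base models from the forecast package; the variable names are illustrative.

```r
library(forecast)

h <- 8; first <- 12; n <- length(y)
e <- matrix(NA, nrow = n - first, ncol = h)   # rows = origins, cols = horizons
for (i in seq_len(n - first)) {
  train <- ts(y[seq_len(first + i - 1)])      # expand the training set by one
  fc <- forecast(ets(train), h = h)           # re-estimate and forecast
  future <- y[(first + i):min(first + i - 1 + h, n)]
  e[i, seq_along(future)] <- future - fc$mean[seq_along(future)]
}
colMeans(abs(e), na.rm = TRUE)                # MAE by forecast horizon
```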
• 193. Hierarchy: states, zones, regions

MAPE                  h=1     h=2     h=4     h=6     h=8     Average
Top level: Australia
  Bottom-up           3.79    3.58    4.01    4.55    4.24    4.06
  OLS                 3.83    3.66    3.88    4.19    4.25    3.94
  WLS                 3.68    3.56    3.97    4.57    4.25    4.04
Level: States
  Bottom-up          10.70   10.52   10.85   11.46   11.27   11.03
  OLS                11.07   10.58   11.13   11.62   12.21   11.35
  WLS                10.44   10.17   10.47   10.97   10.98   10.67
Level: Zones
  Bottom-up          14.99   14.97   14.98   15.69   15.65   15.32
  OLS                15.16   15.06   15.27   15.74   16.15   15.48
  WLS                14.63   14.62   14.68   15.17   15.25   14.94
Bottom level: Regions
  Bottom-up          33.12   32.54   32.26   33.74   33.96   33.18
  OLS                35.89   33.86   34.26   36.06   37.49   35.43
  WLS                31.68   31.22   31.08   32.41   32.77   31.89
• 194. hts package for R

hts: Hierarchical and grouped time series
Methods for analysing and forecasting hierarchical and grouped time series.
Version: 4.5
Depends: forecast (≥ 5.0), SparseM
Imports: parallel, utils
Published: 2014-12-09
Authors: Rob J Hyndman, Earo Wang and Alan Lee
Maintainer: Rob J Hyndman <Rob.Hyndman at monash.edu>
BugReports: https://github.com/robjhyndman/hts/issues
License: GPL (≥ 2)
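A sketch of typical hts usage (version 4.5 as listed above); the bottom-level matrix bts and the node structure below are hypothetical, not the tourism hierarchy from the talk.

```r
library(hts)

# bts: a multivariate ts of bottom-level series. nodes describes a toy
# hierarchy: 2 nodes at level 1, split into 3 and 2 series at level 2.
y <- hts(bts, nodes = list(2, c(3, 2)))

# Optimal combination reconciliation with ETS base forecasts.
fc <- forecast(y, h = 8, method = "comb", fmethod = "ets")
aggts(fc)  # reconciled forecasts at every level of the hierarchy
```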
• 195. Outline

1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
• 196. Further competitions

1 2011 tourism forecasting competition.
2 Kaggle and other forecasting platforms.
3 GEFCom 2012: point forecasting of electricity load and wind power.
4 GEFCom 2014: probabilistic forecasting of electricity load, electricity price, wind energy and solar energy.
• 200. Forecasts about forecasting

1 Automatic algorithms will become more general, handling a wide variety of time series.
2 Model selection methods will take account of multi-step forecast accuracy as well as one-step forecast accuracy.
3 Automatic forecasting algorithms for multivariate time series will be developed.
4 Automatic forecasting algorithms that include covariate information will be developed.
• 204. For further information

robjhyndman.com
  Slides and references for this talk.
  Links to all papers and books.
  Links to R packages.
  A blog about forecasting research.