Many applications require a large number of time series to be forecast completely automatically. For example, manufacturing companies often require weekly forecasts of demand for thousands of products at dozens of locations in order to plan distribution and maintain suitable inventory stocks. In these circumstances, it is not feasible for time series models to be developed for each series by an experienced analyst. Instead, an automatic forecasting algorithm is required.
In addition to providing automatic forecasts when required, these algorithms also provide high quality benchmarks that can be used when developing more specific and specialized forecasting models.
I will describe some algorithms for automatically forecasting univariate time series that have been developed over the last 20 years. The role of forecasting competitions in comparing the forecast accuracy of these algorithms will also be discussed.
2. Outline
1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
8. Motivation
1 It is common in business to have over 1000 products that need forecasting at least monthly.
2 Forecasts are often required by people who are untrained in time series analysis.
Specifications
Automatic forecasting algorithms must:
- determine an appropriate time series model;
- estimate the parameters;
- compute the forecasts with prediction intervals.
10. Example: Asian sheep
[Figure: Numbers of sheep in Asia, 1960–2010, millions of sheep]
11. Example: Asian sheep
[Figure: Automatic ETS forecasts of the same series]
12. Example: Corticosteroid sales
[Figure: Monthly corticosteroid drug sales in Australia, 1995–2010, total scripts (millions)]
13. Example: Corticosteroid sales
[Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12] for the same series]
17. Makridakis and Hibon (1979)
This was the first large-scale empirical evaluation of time series forecasting methods. It was highly controversial at the time.
Difficulties:
- How to measure forecast accuracy?
- How to apply methods consistently and objectively?
- How to explain unexpected results?
Common thinking was that the more sophisticated mathematical models (ARIMA models at the time) were necessarily better. If the results showed ARIMA models were not best, it must be because the analyst was unskilled.
18. Makridakis and Hibon (1979)
"It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors." — W.G. Gilchrist
"I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods ... these authors are more at home with simple procedures than with Box-Jenkins." — C. Chatfield
20. Consequences of MH (1979)
As a result of this paper, researchers started to:
- consider how to automate forecasting methods;
- study what methods give the best forecasts;
- be aware of the dangers of over-fitting;
- treat forecasting as a different problem from time series analysis.
Makridakis & Hibon followed up with a new competition in 1982:
- 1001 series
- Anyone could submit forecasts (avoiding the charge of incompetence)
- Multiple forecast measures used.
23. M-competition
Main findings (taken from Makridakis & Hibon, 2000):
1 Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones.
2 The relative ranking of the performance of the various methods varies according to the accuracy measure being used.
3 Combinations of methods are, on average, more accurate than the individual methods being combined, and do very well in comparison to other methods.
4 The accuracy of the various methods depends upon the length of the forecasting horizon involved.
25. Makridakis and Hibon (2000)
"The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods... The extension involves the inclusion of more methods/researchers (in particular in the areas of neural networks and expert systems) and more series."
- 3003 series
- All data from business, demography, finance and economics.
- Series length between 14 and 126.
- Either non-seasonal, monthly or quarterly.
- All time series positive.
MH claimed that the M3-competition supported the findings of their earlier work. However, the best performing methods were far from "simple".
26. Makridakis and Hibon (2000)
Best methods:
Theta
- A very confusing explanation.
- Shown by Hyndman and Billah (2003) to be the average of a linear regression and simple exponential smoothing with drift, applied to seasonally adjusted data.
- Later, the original authors claimed that their explanation was incorrect.
Forecast Pro
- A commercial software package with an unknown algorithm.
- Known to fit either exponential smoothing or ARIMA models using BIC.
27. M3 results (recalculated)
Method          MAPE    sMAPE   MASE
Theta           17.42   12.76   1.39
ForecastPro     18.00   13.06   1.47
ForecastX       17.35   13.09   1.42
Automatic ANN   17.18   13.98   1.53
B-J automatic   19.13   13.72   1.54
- Calculations do not match the published paper.
- Some contestants apparently submitted multiple entries, but only the best ones were published.
30. Exponential smoothing methods

                              Seasonal Component
Trend Component               N (None)    A (Additive)    M (Multiplicative)
N (None)                      N,N         N,A             N,M
A (Additive)                  A,N         A,A             A,M
Ad (Additive damped)          Ad,N        Ad,A            Ad,M
M (Multiplicative)            M,N         M,A             M,M
Md (Multiplicative damped)    Md,N        Md,A            Md,M

N,N: Simple exponential smoothing
A,N: Holt's linear method
Ad,N: Additive damped trend method
M,N: Exponential trend method
Md,N: Multiplicative damped trend method
A,A: Additive Holt-Winters' method
A,M: Multiplicative Holt-Winters' method

There are 15 separate exp. smoothing methods. Each can have an additive or multiplicative error, giving 30 separate models. Only 19 models are numerically stable, and multiplicative trend models give poor forecasts, leaving 15 models.
42. Exponential smoothing methods
General notation E T S : ExponenTial Smoothing
(E = Error, T = Trend, S = Seasonal)
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt's linear method with additive errors
M,A,M: Multiplicative Holt-Winters' method with multiplicative errors
Innovations state space models
- All ETS models can be written in innovations state space form (IJF, 2002).
- Additive and multiplicative versions give the same point forecasts but different prediction intervals.
49. ETS state space model
[Diagram: a chain in which each state x_{t−1} generates the observation y_t and, together with the innovation ε_t, the next state x_t; the pattern repeats for y_{t+1}, x_{t+1}, y_{t+2}, ...]
State space model
x_t = (level, slope, seasonal)
Estimation
- Compute the likelihood L from ε_1, ε_2, ..., ε_T.
- Optimize L with respect to the model parameters.
59. Innovations state space models
Let x_t = (ℓ_t, b_t, s_t, s_{t−1}, ..., s_{t−m+1}) and ε_t iid ∼ N(0, σ²).
y_t = h(x_{t−1}) + k(x_{t−1}) ε_t      (observation equation)
x_t = f(x_{t−1}) + g(x_{t−1}) ε_t      (state equation)
where µ_t = h(x_{t−1}) and e_t = k(x_{t−1}) ε_t.
Additive errors:
k(x_{t−1}) = 1, so y_t = µ_t + ε_t.
Multiplicative errors:
k(x_{t−1}) = µ_t, so y_t = µ_t(1 + ε_t); ε_t = (y_t − µ_t)/µ_t is a relative error.
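Aside (not from the slides): the simplest case, ETS(A,N,N), has x_t = ℓ_t, h(x_{t−1}) = f(x_{t−1}) = ℓ_{t−1}, k(x_{t−1}) = 1 and g(x_{t−1}) = α, which is simple exponential smoothing. A minimal R sketch simulating from this model; alpha, ell0 and sigma are illustrative values.

# Simulate ETS(A,N,N): y_t = l_{t-1} + e_t,  l_t = l_{t-1} + alpha * e_t
simulate_ann <- function(n, alpha = 0.3, ell0 = 100, sigma = 2) {
  y <- numeric(n)
  ell <- ell0
  for (t in seq_len(n)) {
    e <- rnorm(1, 0, sigma)   # innovation eps_t ~ N(0, sigma^2)
    y[t] <- ell + e           # observation equation
    ell <- ell + alpha * e    # state (level) update
  }
  y
}
set.seed(1)
y <- simulate_ann(200)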
60. Innovations state space models
- All models can be written in state space form.
- Additive and multiplicative versions give the same point forecasts but different prediction intervals.
Estimation
L*(θ, x_0) = n log( Σ_{t=1}^{n} ε_t² / k²(x_{t−1}) ) + 2 Σ_{t=1}^{n} log |k(x_{t−1})|
           = −2 log(Likelihood) + constant
Minimize with respect to θ = (α, β, γ, φ) and the initial states x_0 = (ℓ_0, b_0, s_0, s_{−1}, ..., s_{−m+1}).
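Aside (a sketch, not the ets() internals): for additive errors k(x_{t−1}) = 1, so the criterion reduces to L*(θ, x_0) = n log Σ ε_t². For ETS(A,N,N), θ = α and x_0 = ℓ_0, and the whole estimation fits in a few lines.

# L*(theta, x0) for ETS(A,N,N): e_t = y_t - l_{t-1},  l_t = l_{t-1} + alpha * e_t
ets_ann_lstar <- function(par, y) {
  alpha <- par[1]; ell <- par[2]
  e <- numeric(length(y))
  for (t in seq_along(y)) {
    e[t] <- y[t] - ell          # one-step innovation
    ell <- ell + alpha * e[t]   # level update
  }
  length(y) * log(sum(e^2))     # -2 log(Likelihood) + constant
}
# Minimize over (alpha, l0) with a general-purpose optimizer:
y <- as.numeric(USAccDeaths)    # any numeric series will do for illustration
fit <- optim(c(alpha = 0.3, l0 = y[1]), ets_ann_lstar, y = y)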
Q: How to choose between the 15 useful ETS models?
71. Akaike's Information Criterion
AIC = −2 log(L) + 2k
where L is the likelihood and k is the number of estimated parameters in the model.
- This is a penalized likelihood approach.
- If L is Gaussian, then AIC ≈ c + T log MSE + 2k, where c is a constant, MSE is from one-step forecasts on the training set, and T is the length of the series.
- Minimizing the Gaussian AIC is asymptotically equivalent (as T → ∞) to minimizing the MSE from one-step forecasts on the test set via time series cross-validation.
76. Akaike's Information Criterion
AIC = −2 log(L) + 2k
Corrected AIC
For small T, the AIC tends to over-fit. A bias-corrected version:
AICc = AIC + 2(k+1)(k+2)/(T−k)
Bayesian Information Criterion
BIC = AIC + k[log(T) − 2]
- BIC penalizes terms more heavily than AIC.
- Minimizing BIC is consistent if there is a true model.
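Aside (not from the slides): these criteria are trivial to compute once the likelihood is available. A small R sketch mirroring the formulas above, with loglik, k and T as defined on this slide.

# Information criteria; k = no. of estimated parameters, T = series length
aic  <- function(loglik, k)    -2 * loglik + 2 * k
aicc <- function(loglik, k, T) aic(loglik, k) + 2 * (k + 1) * (k + 2) / (T - k)
bic  <- function(loglik, k, T) aic(loglik, k) + k * (log(T) - 2)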
79. What to use?
Choice: AIC, AICc, BIC, CV-MSE
- CV-MSE is too time-consuming for most automatic forecasting purposes, and it requires large T.
- As T → ∞, BIC selects the true model if there is one. But that is never true!
- AICc focuses on forecasting performance, can be used on small samples, and is very fast to compute.
- Empirical studies in forecasting show AIC is better than BIC for forecast accuracy.
84. ets algorithm in R
Based on Hyndman, Koehler, Snyder & Grose (IJF 2002):
- Apply each of the 15 models that are appropriate to the data. Optimize parameters and initial values using MLE.
- Select the best model using AICc.
- Produce forecasts using the best model.
- Obtain prediction intervals using the underlying state space model.
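Aside (a usage sketch, not a slide): the search can also be steered by hand. model and ic are arguments of forecast::ets(); the livestock series used on the next slide is assumed to be loaded (e.g. from the fpp package).

library(forecast)
fit1 <- ets(livestock)                 # automatic: minimize AICc over admissible models
fit2 <- ets(livestock, model = "MAN")  # force ETS(M,A,N): multiplicative error, additive trend
fit3 <- ets(livestock, ic = "bic")     # select by BIC instead of AICc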
88. Exponential smoothing
fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ETS(M,A,N) for the Asian sheep series, 1960–2010, millions of sheep]
90. Exponential smoothing
fit <- ets(h02)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ETS(M,N,M) for the corticosteroid sales series, 1995–2010, total scripts (millions)]
92. Exponential smoothing
fit
ETS(M,N,M)
Smoothing parameters:
  alpha = 0.4597
  gamma = 1e-04
Initial states:
  l = 0.4501
  s = 0.8628 0.8193 0.7648 0.7675 0.6946 1.2921
      1.3327 1.1833 1.1617 1.0899 1.0377 0.9937
sigma: 0.0675
       AIC       AICc        BIC
-115.69960 -113.47738  -69.24592
93. M3 comparisons
Method          MAPE    sMAPE   MASE
Theta           17.42   12.76   1.39
ForecastPro     18.00   13.06   1.47
ForecastX       17.35   13.09   1.42
Automatic ANN   17.18   13.98   1.53
B-J automatic   19.13   13.72   1.54
ETS             17.38   13.13   1.43
101. ARIMA models
[Diagram: inputs y_{t−1}, y_{t−2}, y_{t−3}, ε_t, ε_{t−1}, ε_{t−2} feeding into the output y_t]
Autoregressive moving average (ARMA) model: y_t is a linear function of its own lagged values and of lagged errors.
Estimation
- Compute the likelihood L from ε_1, ε_2, ..., ε_T.
- Use an optimization algorithm to maximize L.
ARIMA model
An ARMA model applied to differences.
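Aside (not from the slides): the "lagged inputs" structure is easy to see by simulating an ARMA process with base R; the coefficients below are illustrative, chosen to be stationary and invertible.

# Simulate ARMA(2,1): y_t = 0.5 y_{t-1} - 0.3 y_{t-2} + e_t + 0.4 e_{t-1}
set.seed(1)
y <- arima.sim(model = list(ar = c(0.5, -0.3), ma = 0.4), n = 200)
fit <- arima(y, order = c(2, 0, 1))   # estimate by maximizing the likelihood
fit$loglik                            # the maximized log-likelihood L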
106. Auto ARIMA
fit <- auto.arima(livestock)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ARIMA(0,1,0) with drift for the Asian sheep series, 1960–2010, millions of sheep]
108. Auto ARIMA
fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12] for the corticosteroid sales series, 1995–2010, total scripts (millions)]
110. Auto ARIMA
fit
Series: h02
ARIMA(3,1,3)(0,1,1)[12]
Coefficients:
          ar1      ar2     ar3      ma1     ma2     ma3     sma1
      -0.3648  -0.0636  0.3568  -0.4850  0.0479  -0.353  -0.5931
s.e.   0.2198   0.3293  0.1268   0.2227  0.2755   0.212   0.0651
sigma^2 estimated as 0.002706: log likelihood = 290.25
AIC = -564.5   AICc = -563.71   BIC = -538.48
111. How does auto.arima() work?
A non-seasonal ARIMA process:
φ(B)(1 − B)^d y_t = c + θ(B)ε_t
We need to select appropriate orders p, q, d, and whether to include c.
Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select p, q, c by minimising the AICc.
- Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
Algorithm choices are driven by forecast accuracy.
114. How does auto.arima() work?
A seasonal ARIMA process:
Φ(B^m) φ(B) (1 − B)^d (1 − B^m)^D y_t = c + Θ(B^m) θ(B) ε_t
We need to select appropriate orders p, q, d, P, Q, D, and whether to include c.
Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select D using the OCSB unit root test.
- Select p, q, P, Q, c by minimising the AICc.
- Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
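Aside (a usage sketch, not a slide): the unit-root steps can be reproduced directly with the forecast package's helpers ndiffs() and nsdiffs(); h02 is the series from the earlier slides, assumed loaded.

library(forecast)
d <- ndiffs(h02, test = "kpss")    # number of first differences, KPSS test
D <- nsdiffs(h02, test = "ocsb")   # number of seasonal differences, OCSB test
fit <- auto.arima(h02, ic = "aicc", stepwise = TRUE)  # stepwise AICc search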
115. M3 comparisons
Method          MAPE    sMAPE   MASE
Theta           17.42   12.76   1.39
ForecastPro     18.00   13.06   1.47
B-J automatic   19.13   13.72   1.54
ETS             17.38   13.13   1.43
AutoARIMA       19.12   13.85   1.47
117. Automatic nonlinear forecasting
- Automatic ANN in the M3 competition did poorly.
- Linear methods did best in the NN3 competition!
- Very few machine learning methods get published in the IJF because authors cannot demonstrate that their methods give better forecasts than linear benchmark methods, even on supposedly nonlinear data.
- Some good recent work by Kourentzes and Crone on automated ANN for time series.
- Watch this space!
123. Examples
[Figure: US finished motor gasoline products, weekly, 1992–2004, thousands of barrels per day]
124. Examples
[Figure: Number of calls to a large American bank (7am–9pm), 5-minute intervals, 3 March – 12 May]
125. Examples
[Figure: Turkish electricity demand, daily, 2000–2008, GW]
126. TBATS model
TBATS
- Trigonometric terms for seasonality
- Box-Cox transformations for heterogeneity
- ARMA errors for short-term dynamics
- Trend (possibly damped)
- Seasonal (including multiple and non-integer periods)
Automatic algorithm described in A.M. De Livera, R.J. Hyndman, and R.D. Snyder (2011). "Forecasting time series with complex seasonal patterns using exponential smoothing". Journal of the American Statistical Association 106(496), 1513–1527.
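Aside (a usage sketch, not a slide): multiple seasonal periods are usually attached to the data with forecast::msts() before fitting. Here calls is a hypothetical vector of 5-minute call counts with daily period 169 and weekly period 845, matching the call-centre example below.

library(forecast)
x <- msts(calls, seasonal.periods = c(169, 845))  # daily + weekly seasonality
fit <- tbats(x)
fc <- forecast(fit, h = 169 * 7)                  # forecast one week ahead
plot(fc)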
127. TBATS model
y_t = observation at time t

y_t^(ω) = (y_t^ω − 1)/ω   if ω ≠ 0;      [Box-Cox transformation]
          log y_t          if ω = 0.

y_t^(ω) = ℓ_{t−1} + φ b_{t−1} + Σ_{i=1}^{M} s^{(i)}_{t−m_i} + d_t      [M seasonal periods]
ℓ_t = ℓ_{t−1} + φ b_{t−1} + α d_t
b_t = (1 − φ) b + φ b_{t−1} + β d_t      [global and local trend]
d_t = Σ_{i=1}^{p} φ_i d_{t−i} + Σ_{j=1}^{q} θ_j ε_{t−j} + ε_t      [ARMA error]
s^{(i)}_t = Σ_{j=1}^{k_i} s^{(i)}_{j,t}      [Fourier-like seasonal terms]

s^{(i)}_{j,t} = s^{(i)}_{j,t−1} cos λ^{(i)}_j + s^{*(i)}_{j,t−1} sin λ^{(i)}_j + γ^{(i)}_1 d_t
s^{*(i)}_{j,t} = −s^{(i)}_{j,t−1} sin λ^{(i)}_j + s^{*(i)}_{j,t−1} cos λ^{(i)}_j + γ^{(i)}_2 d_t

TBATS: Trigonometric, Box-Cox, ARMA, Trend, Seasonal
134. Examples
fit <- tbats(gasoline)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from TBATS(0.999, {2,2}, 1, {52.1785714285714,8}) for the US gasoline series, thousands of barrels per day]
135. Examples
fit <- tbats(callcentre)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from TBATS(1, {3,1}, 0.987, {169,5, 845,3}) for the call-centre series, number of call arrivals per 5-minute interval]
136. Examples
fit <- tbats(turk)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from TBATS(0, {5,3}, 0.997, {7,3, 354.37,12, 365.25,4}) for Turkish electricity demand, GW]
138. Hierarchical time series
A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure.
[Diagram: two-level hierarchy. Total splits into A, B, C; A splits into AA, AB, AC; B into BA, BB, BC; C into CA, CB, CC.]
Examples
- Net labour turnover
- Tourism by state and region
141. Hierarchical time series
[Diagram: one-level hierarchy with Total at the top and children A, B, C]
Notation:
- Y_t : observed aggregate of all series at time t.
- Y_{X,t} : observation on series X at time t.
- b_t : vector of all series at the bottom level at time t.
Stacking all series,
y_t = [Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t}]′ = S b_t,
where b_t = (Y_{A,t}, Y_{B,t}, Y_{C,t})′ and

S = [ 1 1 1
      1 0 0
      0 1 0
      0 0 1 ]
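Aside (not from the slides): a quick numeric check of y_t = S b_t for this small hierarchy, with illustrative values for b_t.

S <- rbind(c(1, 1, 1),   # Total = A + B + C
           c(1, 0, 0),   # A
           c(0, 1, 0),   # B
           c(0, 0, 1))   # C
b <- c(3, 5, 2)          # bottom-level observations (Y_A, Y_B, Y_C) at time t
y <- S %*% b             # y_t = (Y_t, Y_A, Y_B, Y_C) = (10, 3, 5, 2)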
150. Forecasting notation
Let ŷ_n(h) be the vector of initial h-step forecasts, made at time n, stacked in the same order as y_t. (They may not add up.)
Reconciled forecasts are of the form
ỹ_n(h) = S P ŷ_n(h)
for some matrix P.
- P extracts and combines the base forecasts ŷ_n(h) to get bottom-level forecasts.
- S adds them up.
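Aside (not from the slides): bottom-up forecasting is the special case where P simply picks out the bottom-level base forecasts, and S P S = S then holds automatically. A sketch for the small hierarchy above, with illustrative base forecasts.

S <- rbind(c(1, 1, 1), diag(3))   # summing matrix (rows: Total, A, B, C)
P <- cbind(0, diag(3))            # bottom-up: drop the Total base forecast
all(S %*% P %*% S == S)           # TRUE, so reconciled forecasts stay unbiased
yhat <- c(100, 30, 40, 20)        # base forecasts: the Total does not add up
ytilde <- S %*% P %*% yhat        # reconciled: (90, 30, 40, 20)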
155. General properties
ỹ_n(h) = S P ŷ_n(h)
Forecast bias
Assuming the base forecasts ŷ_n(h) are unbiased, the revised forecasts are unbiased iff S P S = S.
Forecast variance
For any given P satisfying S P S = S, the covariance matrix of the h-step-ahead reconciled forecast errors is given by
Var[y_{n+h} − ỹ_n(h)] = S P W_h P′ S′
where W_h is the covariance matrix of the h-step-ahead base forecast errors.
158. BLUF via trace minimization
Theorem
For any P satisfying S P S = S, the problem
min_P trace[S P W_h P′ S′]
has solution P = (S′ W_h† S)^{−1} S′ W_h†, where W_h† is the generalized inverse of W_h.
The revised forecasts are then obtained from the base forecasts as
ỹ_n(h) = S (S′ W_h† S)^{−1} S′ W_h† ŷ_n(h).
- This is equivalent to the GLS estimate of the regression ŷ_n(h) = S β_n(h) + ε_h, where ε_h ∼ N(0, W_h).
- Problem: W_h is hard to estimate.
164. Optimal combination forecasts
ỹ_n(h) = S (S′ W_h† S)^{−1} S′ W_h† ŷ_n(h)      (revised forecasts from base forecasts)
Solution 1: OLS
ỹ_n(h) = S (S′ S)^{−1} S′ ŷ_n(h)
Solution 2: WLS
- Approximate W_1 by its diagonal.
- Assume W_h = k_h W_1.
- Easy to estimate, and places weight where we have the best one-step forecasts.
ỹ_n(h) = S (S′ Λ S)^{−1} S′ Λ ŷ_n(h)
171. Challenges
Computational difficulties in big hierarchies due
to size of the S matrix and singular behavior of
(S ΛS).
Loss of information in ignoring covariance
matrix in computing point forecasts.
Still need to estimate covariance matrix to
produce prediction intervals.
Automatic algorithms for time series forecasting Hierarchical and grouped time series 64
˜yn(h) = S(S ΛS)−1
S Λˆyn(h)
176. Australian tourism
Hierarchy: States (7), Zones (27), Regions (82).
Base forecasts: ETS (exponential smoothing) models.
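A minimal sketch of setting up such a hierarchy with the hts package; the data and node structure below are illustrative (two states with two regions each), not the actual 7/27/82 Australian geography.

```r
# Build a small geographic hierarchy with hts (illustrative data).
library(hts)

set.seed(1)
bts <- ts(matrix(rnorm(44 * 4, mean = 100, sd = 10), ncol = 4),
          start = 1998, frequency = 4)  # 4 bottom-level regional series

# nodes: 2 states at level 1, each splitting into 2 regions at level 2.
tourism.hts <- hts(bts, nodes = list(2, c(2, 2)))
smatrix(tourism.hts)                    # the summing matrix S
```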
177–185. Base forecasts
[Figures: ETS base forecasts of domestic tourism (visitor nights, 1998–2008) at successive levels of the hierarchy: Total, NSW, VIC, Nth.Coast.NSW, Metro.QLD, Sth.WA, X201.Melbourne, X402.Murraylands, X809.Daly]
187–188. Reconciled forecasts
[Figures: reconciled forecasts of visitor nights (extended to 2010) for the states NSW, VIC, QLD and Other, and below them for Sydney, Other NSW, Melbourne, Other VIC, GC and Brisbane, Other QLD, Capital cities, Other]
189. Forecast evaluation
Select models using all observations.
Re-estimate the models using the first 12 observations and generate 1- to 8-step-ahead forecasts.
Increase the sample size one observation at a time, re-estimating the models and generating forecasts, until the end of the sample.
This yields 24 one-step-ahead forecasts, 23 two-step-ahead forecasts, and so on down to 17 eight-step-ahead forecasts for evaluation.
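A minimal sketch of this rolling-origin scheme with the forecast package; the series is simulated, and the window lengths simply mirror the numbers quoted above.

```r
# Rolling-origin evaluation: 12 initial observations, horizons 1 to 8.
library(forecast)

set.seed(1)
y <- ts(100 + cumsum(rnorm(36)), frequency = 1)  # hypothetical annual series
n <- length(y); first <- 12; h <- 8
ape <- matrix(NA, nrow = n - first, ncol = h)    # absolute percentage errors

for (i in first:(n - 1)) {
  hh  <- min(h, n - i)
  fit <- ets(window(y, end = time(y)[i]))        # re-estimate on first i obs
  fc  <- forecast(fit, h = hh)
  act <- window(y, start = time(y)[i + 1], end = time(y)[i + hh])
  ape[i - first + 1, 1:hh] <- 100 * abs((act - fc$mean) / act)
}

colMeans(ape, na.rm = TRUE)  # MAPE by horizon, from 24 down to 17 forecasts
```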
193. Hierarchy: states, zones, regions

MAPE (%)        h=1     h=2     h=4     h=6     h=8    Average
Top level: Australia
  Bottom-up     3.79    3.58    4.01    4.55    4.24     4.06
  OLS           3.83    3.66    3.88    4.19    4.25     3.94
  WLS           3.68    3.56    3.97    4.57    4.25     4.04
Level: States
  Bottom-up    10.70   10.52   10.85   11.46   11.27    11.03
  OLS          11.07   10.58   11.13   11.62   12.21    11.35
  WLS          10.44   10.17   10.47   10.97   10.98    10.67
Level: Zones
  Bottom-up    14.99   14.97   14.98   15.69   15.65    15.32
  OLS          15.16   15.06   15.27   15.74   16.15    15.48
  WLS          14.63   14.62   14.68   15.17   15.25    14.94
Bottom level: Regions
  Bottom-up    33.12   32.54   32.26   33.74   33.96    33.18
  OLS          35.89   33.86   34.26   36.06   37.49    35.43
  WLS          31.68   31.22   31.08   32.41   32.77    31.89
194. hts package for R

hts: Hierarchical and grouped time series
Methods for analysing and forecasting hierarchical and grouped time series.
Version: 4.5
Depends: forecast (≥ 5.0), SparseM
Imports: parallel, utils
Published: 2014-12-09
Author: Rob J Hyndman, Earo Wang and Alan Lee
Maintainer: Rob J Hyndman <Rob.Hyndman at monash.edu>
BugReports: https://github.com/robjhyndman/hts/issues
License: GPL (≥ 2)
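Continuing the illustrative hierarchy built earlier, a hedged sketch of the package's reconciliation interface; the argument names follow hts 4.x, and later versions add explicit weights options for the OLS/WLS variants.

```r
# Optimal-combination forecasts for the toy hierarchy (hts 4.x interface).
library(hts)

fc <- forecast(tourism.hts, h = 8,
               method  = "comb",   # optimal combination (reconciliation)
               fmethod = "ets")    # ETS base forecasts for every series

head(aggts(fc))  # coherent forecasts at every level of the hierarchy
```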
195. Outline
1 Motivation
2 Forecasting competitions
3 Exponential smoothing
4 ARIMA modelling
5 Automatic nonlinear forecasting?
6 Time series with complex seasonality
7 Hierarchical and grouped time series
8 Recent developments
196. Further competitions
1 2011 tourism forecasting competition.
2 Kaggle and other forecasting platforms.
3 GEFCom 2012: Point forecasting of
electricity load and wind power.
4 GEFCom 2014: Probabilistic forecasting
of electricity load, electricity price,
wind energy and solar energy.
200. Forecasts about forecasting
1 Automatic algorithms will become more
general — handling a wide variety of time
series.
2 Model selection methods will take account
of multi-step forecast accuracy as well as
one-step forecast accuracy.
3 Automatic forecasting algorithms for
multivariate time series will be developed.
4 Automatic forecasting algorithms that
include covariate information will be
developed.
204. For further information
robjhyndman.com
Slides and references for this talk.
Links to all papers and books.
Links to R packages.
A blog about forecasting research.