Given the resurgence of neural network-based techniques in recent years, it is important for data science practitioners to understand how to apply these techniques and the tradeoffs between neural network-based and traditional statistical methods.
This lecture discusses two specific techniques: Vector Autoregressive (VAR) Models and Recurrent Neural Networks (RNNs). The former is one of the most important classes of multivariate time series statistical models applied in finance, while the latter is a neural network architecture suitable for time series forecasting. I'll demonstrate how they are implemented in practice and compare their advantages and disadvantages. Real-world applications, demonstrated using Python and Spark, are used to illustrate these techniques. While not the focus of this lecture, exploratory time series data analysis using time-series plots, autocorrelation plots (i.e. correlograms), partial autocorrelation plots, cross-correlation plots, histograms, and kernel density plots will also be included in the demo.
The attendees will learn:
– the formulation of a time series forecasting problem statement in the context of VAR and RNN
– the application of Recurrent Neural Network-based techniques in time series forecasting
– the application of Vector Autoregressive Models in multivariate time series forecasting
– the pros and cons of using VAR and RNN-based techniques in the context of financial time series forecasting
– when to use VAR and when to use RNN-based techniques
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregressive Model: When and How with Jeffrey Yau
1. Jeffrey Yau
Chief Data Scientist, AllianceBernstein, L.P.
Lecturer, UC Berkeley Master of Information and Data Science
Time Series Forecasting Using Neural Network-Based and Time Series Statistical Models
2. Learning Objectives
For data scientists and practitioners conducting time series forecasting
After this introductory lecture, you will have learned:
1. The characteristics of a time series forecasting problem
2. How to set up a time series forecasting problem
3. The basic intuition behind two popular statistical time series models and the LSTM network
4. How to implement these methods in Python
5. The advantages and potential drawbacks of these methods
3. Agenda
Section I: Time series forecasting problem formulation
Section II: Univariate & multivariate time series forecasting
Section III: Selected approaches to this problem:
• Autoregressive Integrated Moving Average (ARIMA) Model
• Vector Autoregressive (VAR) Model
• Recurrent Neural Network
  ○ Formulation
  ○ Python implementation
Section IV: Comparison of the statistical and neural network approaches
Section V: Spark implementation considerations
5. Section I
Section I: Time series forecasting problem formulation
8. Forecasting: Problem Formulation
• Forecasting: predicting the future values of the series using the current information set
• The current information set consists of current and past values of the series and other "exogenous" series
11. Time Series Forecasting Requires Models
[Diagram: the information set is fed into a statistical model or machine learning algorithm, which produces forecasts over the horizon H.]
12. A Naïve, Rule-based Model
A model, f(·), could be as simple as "a rule", the naive model: the forecast for tomorrow is the observed value today:
\hat{y}_{t+1|t} = y_t
Forecast horizon: h = 1. Information set: \Omega_t = \{y_t, y_{t-1}, y_{t-2}, \ldots\}
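As a minimal illustration (not from the slides; the DataFrame and column name y are hypothetical), the naive h = 1 forecast is a one-liner in pandas:

import pandas as pd

# Toy series; the naive forecast for each t is simply the value observed at t-1.
df = pd.DataFrame({"y": [102.3, 101.9, 103.4, 104.0, 103.1]})
df["y_hat_naive"] = df["y"].shift(1)
print(df)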
13. However …
f(·) could be a slightly more complicated "model" that utilizes more information from the current information set
14. "Rolling" Average Model
The forecast for time t+1 is an average of the observed values from a predefined, past k time periods:
\hat{y}_{t+1|t} = \frac{1}{k} \sum_{i=0}^{k-1} y_{t-i}
Forecast horizon: h = 1. Information set: \Omega_t = \{y_t, y_{t-1}, \ldots, y_{t-k+1}\}
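A corresponding pandas sketch (again with hypothetical names; k is a modeling choice):

import pandas as pd

df = pd.DataFrame({"y": [102.3, 101.9, 103.4, 104.0, 103.1, 105.2]})
k = 3  # average over the past k periods
# The forecast for t+1 uses only values observed up to t, hence the shift.
df["y_hat_roll"] = df["y"].rolling(window=k).mean().shift(1)
print(df)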
15. More Sophisticated Models
… there are other, much more sophisticated models that
can utilize the information set much more efficiently
16. Section II
Section II: Univariate & multivariate time series forecasting
17. Univariate Statistical Time Series Models
The statistical relationship is unidirectional: the dynamics of series y are modeled using lagged values from its own series and, optionally, exogenous series.
18. Autoregressive Integrated Moving Average (ARIMA) Model: 1-Minute Recap
It models the dynamics of the series y using lagged values from its own series, shocks / "error" terms, and exogenous series; schematically, the ARMA(p, q) core (after differencing d times) is
y_t = c + \sum_{i=1}^{p} \phi_i y_{t-i} + \epsilon_t + \sum_{j=1}^{q} \theta_j \epsilon_{t-j}
20. Multivariate Time Series Modeling
A system of K equations captures both the dynamics of each of the series and the interdependence among the series, using lag-1 through lag-p of all K series plus exogenous series; schematically,
\mathbf{y}_t = \mathbf{c} + \Phi_1 \mathbf{y}_{t-1} + \cdots + \Phi_p \mathbf{y}_{t-p} + \mathbf{B} \mathbf{x}_t + \mathbf{u}_t
21. Section III.A
Section III: Selected approaches — Vector Autoregressive (VAR) Model: formulation and Python implementation
23. Vector Autoregressive (VAR) Models
● a system of linear equations of the K series being modeled
● only applies to stationary series
● non-stationary series can be transformed into stationary ones using simple differencing (note: if the series are not co-integrated, then we can still apply VAR, "VAR in differences")
24. Vector Autoregressive (VAR) Model of Order 1
A system of K equations: each series is modeled by its own lag as well as the other series' lags. For example, with K = 2:
y_{1,t} = c_1 + \phi_{11} y_{1,t-1} + \phi_{12} y_{2,t-1} + u_{1,t}
y_{2,t} = c_2 + \phi_{21} y_{1,t-1} + \phi_{22} y_{2,t-1} + u_{2,t}
25. General Steps to Build a VAR Model
1. Ingest the series
2. Train/validation/test split the series
3. Conduct exploratory time series data analysis on the training set
4. Determine if the series are stationary
5. Transform the series
6. Build a model on the transformed series
7. Run model diagnostics
8. Select a model (based on some pre-defined criterion)
9. Conduct the forecast using the final, chosen model
10. Inverse-transform the forecast
11. Conduct forecast evaluation
(The model-building steps are iterative.)
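A minimal statsmodels sketch of steps 1 through 6 (the file name and columns are hypothetical; this is an illustration, not the speaker's exact code):

import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR
from statsmodels.tsa.stattools import adfuller

# 1. Ingest: one column per series, indexed by date (hypothetical file).
df = pd.read_csv("series.csv", index_col=0, parse_dates=True)

# 2. Chronological train/test split (never shuffle a time series).
n_train = int(len(df) * 0.8)
train, test = df.iloc[:n_train], df.iloc[n_train:]

# 4. Augmented Dickey-Fuller test for stationarity of each series.
for col in train.columns:
    print(col, "ADF p-value:", adfuller(train[col].dropna())[1])

# 5. Transform: natural log, then first difference (drops one observation).
train_t = np.log(train).diff().dropna()

# 6. Fit a VAR on the transformed series, with lag order chosen by AIC.
results = VAR(train_t).fit(maxlags=8, ic="aic")
print(results.summary())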
26. Index of Consumer Sentiment
[Plots: the autocorrelation function (ACF) graph and partial autocorrelation function (PACF) graph of the series.]
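These plots can be reproduced with statsmodels (a sketch; the random-walk series below is only a stand-in for the actual sentiment data):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Stand-in for the consumer sentiment index.
sentiment = pd.Series(np.random.randn(200).cumsum())

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(sentiment, lags=36, ax=axes[0])    # correlogram
plot_pacf(sentiment, lags=36, ax=axes[1])   # partial autocorrelations
plt.tight_layout()
plt.show()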
29. Transforming the Series
Take the simple difference of the natural-logarithm transformation of the series (note: the difference transformation generates missing values).
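In pandas/NumPy (a sketch with a toy positive-valued series):

import numpy as np
import pandas as pd

y = pd.Series([100.0, 102.0, 101.5, 103.2, 104.8])
# Simple difference of the natural log; the first value becomes NaN.
y_transformed = np.log(y).diff()
print(y_transformed)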
31. VAR Model Proposed
Is the method we propose capable of answering the following questions?
● What are the dynamic properties of these series? (own lagged coefficients)
● How do these series interact, if at all? (cross-series lagged coefficients)
36. VAR Model Selection
Model selection, in the case of VAR(p), is the choice of the order p and the specification of each equation.
Information criteria (e.g. AIC, BIC, HQIC) can be used for model selection.
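With statsmodels, the information criteria across candidate orders can be tabulated directly (continuing the earlier sketch, where train_t is the transformed training set):

from statsmodels.tsa.api import VAR

# AIC, BIC, FPE, and HQIC for each candidate lag order p up to maxlags.
selection = VAR(train_t).select_order(maxlags=12)
print(selection.summary())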
37. VAR Model - Inverse Transform
Don’t forget to inverse-transform the forecasted series!
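Since the model was fit on log-differences, the forecasts must be cumulatively summed and exponentiated to return to levels. A sketch with hypothetical inputs:

import numpy as np

# Hypothetical inputs: `fc` is an (H x K) array of forecasts on the
# log-differenced scale; `last_log_level` is np.log of the last observed levels.
log_levels = last_log_level + fc.cumsum(axis=0)  # undo the differencing
levels = np.exp(log_levels)                      # undo the log transform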
38. VAR Model - Forecast Using the Model
The forecast equation (iterating one step at a time, with estimated coefficients):
\hat{\mathbf{y}}_{T+1|T} = \hat{\mathbf{c}} + \hat{\Phi}_1 \mathbf{y}_T + \cdots + \hat{\Phi}_p \mathbf{y}_{T-p+1}
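In statsmodels this iteration is handled by the fitted results object (continuing the earlier sketch):

# Forecast H steps ahead; the model needs the last k_ar observations as input.
H = 12
fc = results.forecast(train_t.values[-results.k_ar:], steps=H)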
40. What do the results mean in this context?
Don't forget to put the results in the context of the problem!
41. Section III.B
Section III: Selected approaches — Recurrent Neural Network: formulation and Python implementation
42. Feed-Forward Network with a Single Output
[Diagram: inputs feed through hidden layers to a single output.]
● information does not account for time ordering
● inputs are processed independently
● no "device" to keep the past information
The network architecture does not have "memory" built in.
43. Recurrent Neural Network (RNN)
A network architecture that can
• retain past information,
• track the state of the world, and
• update the state of the world as the network moves forward.
It handles variable-length sequences by having a recurrent hidden state whose activation at each time step depends on that of the previous time step.
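In the standard formulation (not shown explicitly on the slide), the recurrent hidden state evolves as

h_t = \phi(W_x x_t + W_h h_{t-1} + b), \qquad \hat{y}_t = g(W_y h_t + c)

where \phi is an activation function (e.g. tanh) and the same weights are reused at every time step.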
47. LSTM: Hochreiter and Schmidhuber (1997)
The architecture of memory cells and gate units from the original
Hochreiter and Schmidhuber (1997) paper
48. Long Short-Term Memory (LSTM) Network
Another representation of the architecture of memory cells and gate units: Greff, Srivastava, Koutník, Steunebrink, Schmidhuber (2016)
49. LSTM: An Overview
LSTM uses memory cells and gated units to control information flow.
[Diagram: an LSTM memory cell maps the hidden state h_{t-1} (the value from the activation function at time step t-1) to the hidden state h_t at time step t.]
50. LSTM: An Overview
[Diagram: the LSTM cell carries both a hidden state and a memory cell (state), regulated by forget, input, and output gates.]
Training uses backpropagation through time (BPTT).
51. LSTM: An Overview
[Diagram: the forget, input, and output gates combine the candidate memory cell with the previous memory cell state to produce the memory cell c_t and hidden state h_t at time step t.]
Training uses backpropagation through time (BPTT).
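Written out, the standard LSTM updates (notation varies slightly across papers) are:

f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)   (forget gate)
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)   (input gate)
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)   (output gate)
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)   (candidate memory cell)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t   (memory cell state)
h_t = o_t \odot \tanh(c_t)   (hidden state)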
52. Implementation in Keras
Some steps to highlight:
• Formulate the series as an RNN supervised-learning regression problem (i.e. define the target and input tensors)
• Scale all the series
• Split the series into training/development/testing sets
• Reshape the series for the (Keras) RNN implementation
• Define the (initial) architecture of the LSTM model
  ○ Define a network of layers that maps your inputs to your targets, and the complexity of each layer (i.e. number of memory cells)
  ○ Configure the learning process by picking a loss function, an optimizer, and metrics to monitor
• Produce the forecasts and then reverse-scale the forecasted series
• Calculate loss metrics (e.g. RMSE, MAE)
Note that stationarity, as defined previously, is not a requirement
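A minimal Keras sketch of these steps, assuming a univariate series and hypothetical hyperparameters (a lookback window of 12, 32 memory cells); an illustration, not the speaker's exact code:

import numpy as np
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

# Stand-in univariate series.
series = np.sin(np.linspace(0.0, 20.0, 300))

# Scale, then frame as supervised learning: predict y[t] from y[t-12:t].
scaler = MinMaxScaler()
scaled = scaler.fit_transform(series.reshape(-1, 1)).ravel()
lookback = 12
X = np.array([scaled[i : i + lookback] for i in range(len(scaled) - lookback)])
y = scaled[lookback:]
X = X.reshape(-1, lookback, 1)  # Keras RNNs expect (samples, timesteps, features)

# Define the architecture, loss, and optimizer, then train.
model = Sequential([LSTM(32, input_shape=(lookback, 1)), Dense(1)])
model.compile(loss="mse", optimizer="adam")
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2, verbose=0)

# Forecast, then reverse-scale before computing RMSE/MAE on the original scale.
preds = scaler.inverse_transform(model.predict(X))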
56. Section IV
Section IV: Comparison of the statistical and neural network approaches
57. VAR vs. LSTM: Data Type
VAR: macroeconomic time series, financial time series, business time series, and other numeric series
LSTM: DNA sequences, images, voice sequences, texts, and all the numeric time series (that can be modeled by VAR)
58. VAR vs. LSTM: Parametric Form
VAR: a linear system of equations, highly parameterized (can be formulated in the general state space model)
LSTM: layer(s) of many non-linear transformations
59. VAR vs. LSTM: Stationarity Requirement
VAR: applies to stationary time series only; its variants (e.g. the Vector Error Correction Model) can be applied to co-integrated series
LSTM: stationarity is not a requirement
60. VAR vs. LSTM: Model Implementation
VAR: data preprocessing is straightforward; model specification is relatively straightforward; model training time is fast
LSTM: data preprocessing is a lot more involved; network architecture design, model training, and hyperparameter tuning require much more effort
61. Section V
Section V: Spark implementation considerations
62. Implementation using Spark
One may think it is straightforward to use Spark for all the steps (outlined above) up to model training …
Wait … not so fast! Ordering in the series matters.
63. Flint: A Time Series Library for Spark
We need time-series aware libraries whose operations preserve the natural order of the series, such as Flint, open-sourced by Two Sigma (https://github.com/twosigma/flint). Flint can handle a variety of datafile types, manipulate and aggregate time series, and provide time-series-specific functions, such as (time-series) join, (time-series) filtering, time-based windowing, cycles, and intervalizing.
Example: defining a moving average of the beer consumption series (see the sketch below)
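A sketch of what that example might look like in Flint's Python API (ts.flint), based on the usage shown in the project's README; the Spark/SQL context setup and the column name beer are assumptions, and exact signatures may differ across Flint versions:

from ts.flint import FlintContext, summarizers, windows

flint_context = FlintContext(sqlContext)  # sqlContext from an existing Spark session

# Read an already time-ordered Spark DataFrame into a time-series aware DataFrame.
ts_df = flint_context.read.dataframe(spark_df)

# Trailing 30-day moving average of the beer consumption series.
result = ts_df.summarizeWindows(
    window=windows.past_absolute_time("30day"),
    summarizer=summarizers.mean("beer"),
)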
64. A Few Words on the TSA Workflow in Spark
Load, pre-process, analyze, and transform the series using time-series-aware libraries. Conduct hyperparameter tuning in distributed mode and use the Spark ML Pipeline.