SlideShare a Scribd company logo
1 of 65
Download to read offline
Jeffrey Yau
Chief Data Scientist, AllianceBernstein, L.P.
Lecturer, UC Berkeley Masters of Information Data
Science
Time Series Forecasting Using
Neural Network-Based and
Time Series Statistic Models
Learning Objectives
2
For data scientists and practitioners conducting time series forecasting
After this introductory lecture today, you will have learned
1. The characteristics of time series forecasting problem
2. How to set up a time-series forecasting problem
3. The basic intuition of two popular statistical time series models and
LSTM network
4. How to implement these methods in Python
5. The advantages and potential drawbacks of these methods
Agenda
Section I: Time series forecasting problem formulation
Section II: Univariate & Multivariate time series forecasting
Section III: Selected approaches to this problem:
v Autoregressive Integrated Moving Average (ARIMA) Model
v Vector Autoregressive (VAR) Model
v Recurrent Neural Network
Ø Formulation
Ø Python Implementation
Section IV: Comparison of the statistical and neural
network approaches
Section V: Spark Implementation Consideration
3
Motivation - Why MTSA?
4
Daily Interest Rates
Days
InterestRates
Section I
Section I: Time series forecasting problem formulation
Section II: Univariate & Multivariate time series forecasting
Section III: Selected approaches to this problem:
v Autoregressive Integrated Moving Average (ARIMA) Model
v Vector Autoregressive (VAR) Model
v Recurrent Neural Network
Ø Formulation
Ø Python Implementation
Section IV: Comparison of the statistical and neural
network approaches
Section V: Spark Implementation Consideration
5
Forecasting: Problem Formulation
• Forecasting: predicting the future values of the series
using current information set
6
Forecasting: Problem Formulation
• Forecasting: predicting the future values of the series
using current information set
7
Forecasting: Problem Formulation
• Forecasting: predicting the future values of the series
using current information set
• Current information set consists of current and past
values of the series and other “exogenous” series
8
Time Series Forecasting Requires a Model
9
Time Series Forecasting Requires Models
10
Time Series Forecasting Requires Models
11
A statistical model or a
machine learning
algorithm
Forecast
horizon: H
Information Set:
A Naïve, Rule-based Model:
12
A model, f(), could be as simple as “a rule” - naive model:
The forecast for tomorrow is the observed value today
Forecast
horizon: h=1
Information Set:
However …
!()could be a slightly more complicated
“model” that utilizes more information
from the current information set
13
“Rolling” Average Model
14
The forecast for time t+1 is an average of the observed
values from a predefined, past k time periods
Forecast horizon: h=1 Information Set:
More Sophisticated Models
15
… there are other, much more sophisticated models that
can utilize the information set much more efficiently
Section II
Section I: Time series forecasting problem formulation
Section II: Univariate & Multivariate time series forecasting
Section III: Selected approaches to this problem:
v Autoregressive Integrated Moving Average (ARIMA) Model
v Vector Autoregressive (VAR) Model
v Recurrent Neural Network
Ø Formulation
Ø Python Implementation
Section IV: Comparison of the statistical and neural
network approaches
Section V: Spark Implementation Consideration
16
Univariate Statistical Time Series Models
Statistical relationship is unidirectional
17
values from its own series
exogenous series
Dynamics of series y
Autoregressive Moving Average (ARIMA)
Model: 1-Minute Recap
18
values from own series shocks / “error” terms exogenous series
It models the dynamics of the series
y
Autoregressive Integrated Moving Average
(ARIMA) Model: 1-Minute Recap
19
My tutorial at PyData San Francisco
2016
Multivariate Time Series Modeling
20
Kequations
lag-1 of the K series
lag-p of the K series
exogenous series
Dynamics of each of the series Interdependence among the series
Section III.A
Section I: Time series forecasting problem formulation
Section II: Univariate & Multivariate time series forecasting
Section III: Selected approaches to this problem:
v Autoregressive Integrated Moving Average (ARIMA) Model
v Vector Autoregressive (VAR) Model
v Recurrent Neural Network
Ø Formulation
Ø Python Implementation
Section IV: Comparison of the statistical and neural
network approaches
Section V: Spark Implementation Consideration
21
Joint Modeling of Multiple Time Series
22
● a system of linear equations of the K series being
modeled
● only applies to stationary series
● non-stationary series can be transformed into
stationary ones using simple differencing (note: if the
series are not co-integrated, then we can still apply
VAR ("VAR in differences"))
Vector Autoregressive (VAR) Models
23
Vector Autoregressive (VAR) Model of
Order 1
24
AsystemofKequations
Each series is modelled by its own lag as well as other
series’ lags
General Steps to Build VAR Model
25
1. Ingest the series
2. Train/validation/test split the series
3. Conduct exploratory time series data analysis on the training set
4. Determine if the series are stationary
5. Transform the series
6. Build a model on the transformed series
7. Model diagnostic
8. Model selection (based on some pre-defined criterion)
9. Conduct forecast using the final, chosen model
10.Inverse-transform the forecast
11. Conduct forecast evaluation
Iterative
Index of Consumer Sentiment
26
autocorrelation function
(ACF) graph
Partial autocorrelation
function (PACF) graph
Series Transformation
27
Beer Consumption Series
28
Autocorrelation function
Transforming the Series
29
Take the simple-difference of the natural logarithmic transformation of
the series
note: difference-transformation
generates missing values
Transformed Series
30
Consumer Sentiment Beer Consumption
Is the method we propose capable of answering the following questions?
● What are the dynamic properties of these series? Own lagged
coefficients
● How are these series interact, if at all? Cross-series lagged
coefficients
VAR Model Proposed
31
VAR Model Estimation and Output
32
VAR Model Output - Estimated Coefficients
33put your #assignedhashtag here by setting the footer in view-header/footer
VAR Model Output - Var-Covar Matrix
34
VAR Model Diagnostic
35
UMCSENT Beer
VAR Model Selection
36
Model selection, in the case of VAR(p), is the choice of the order and
the specification of each equation
Information criterion can be used for model selection:
VAR Model - Inverse Transform
37
Don’t forget to inverse-transform the forecasted series!
VAR Model - Forecast Using the Model
38
The Forecast Equation:
VAR Model Forecast
39
where T is the last observation period and l is the lag
What do the result mean in this context?
40
Don’t forget to put the result in the existing context!
Section III.B
Section I: Time series forecasting problem formulation
Section II: Univariate & Multivariate time series forecasting
Section III: Selected approaches to this problem:
v Autoregressive Integrated Moving Average (ARIMA) Model
v Vector Autoregressive (VAR) Model
v Recurrent Neural Network
Ø Formulation
Ø Python Implementation
Section IV: Comparison of the statistical and neural
network approaches
Section V: Spark Implementation Consideration
41
Feed-Forward Network with a Single Output
42
inputs output
Ø Information does not account
for time ordering
Ø inputs are processed
independently
Ø no “device” to keep the past
information
Network architecture does not have "memory" built in
Hidden layers
Recurrent Neural Network (RNN)
43
A network architecture that can
• retain past information
• track the state of the world, and
• update the state of the world as the network moves forward
Handles variable-length sequence by having a recurrent hidden state
whose activation at each time is dependent on that of the previous time.
Standard Recurrent Neural Network (RNN)
44
Limitation of Vanilla RNN Architecture
45
Exploding (and vanishing) gradient problems
(Sepp Hochreiter, 1991 Diploma Thesis)
Long Short Term Memory (LSTM) Network
46
LSTM: Hochreiter and Schmidhuber (1997)
47
The architecture of memory cells and gate units from the original
Hochreiter and Schmidhuber (1997) paper
Long Short Term Memory (LSTM) Network
48
Another representation of the architecture of memory cells and gate
units: Greff, Srivastava, Koutnık, Steunebrink, Schmidhuber (2016)
LSTM: An Overview
49
LSTM Memory
Cell
ht-1 ht
Use memory cells and gated units for information flow
hidden state
(value from
activation function)
in time step t-1
hidden state
(value from
activation function)
in time step t
LSTM: An Overview
50
LSTM Memory
Cell
ht-1 ht
hidden state memory cell (state)
Output gate
Forget gate Input gate
Training uses Backward Propagation Through Time (BPTT)
LSTM: An Overview
51
LSTM Memory
Cell
ht-1 ht
hidden state(t)
memory cell (t)
Training uses Backward Propagation Through Time (BPTT)
Candidate
memory cell (t)
Output gate
Input gate
Forget gate
Implementation in Keras
Some steps to highlight:
• Formulate the series for a RNN supervised learning regression problem
(i.e. (Define target and input tensors))
• Scale all the series
• Split the series for training/development/testing
• Reshape the series for (Keras) RNN implementation
• Define the (initial) architecture of the LSTM Model
○ Define a network of layers that maps your inputs to your targets and
the complexity of each layer (i.e. number of memory cells)
○ Configure the learning process by picking a loss function, an
optimizer, and metrics to monitor
• Produce the forecasts and then reverse-scale the forecasted series
• Calculate loss metrics (e.g. RMSE, MAE)
52
Note that stationarity, as defined previously, is not a requirement
53
Continue with the Same Example
LSTM Architecture Design, Training, Evaluation
54
LSTM: Forecast Results
55
UMSCENT
Beer
Section IV
Section I: Time series forecasting problem formulation
Section II: Univariate & Multivariate time series forecasting
Section III: Selected approaches to this problem:
v Autoregressive Integrated Moving Average (ARIMA) Model
v Vector Autoregressive (VAR) Model
v Recurrent Neural Network
Ø Formulation
Ø Python Implementation
Section IV: Comparison of the statistical and neural
network approaches
Section V: Spark Implementation Consideration
56
VAR vs. LSTM: Data Type
macroeconomic time
series, financial time
series, business time
series, and other numeric
series
57
DNA sequences, images,
voice sequences, texts, all
the numeric time series
(that can be modeled by
VAR)
VAR LSTM
Layer(s) of many non-
linear transformations
VAR vs. LSTM: Parametric form
A linear system of
equations - highly
parameterized (can be
formulated in the general
state space model)
58
VAR LSTM
• stationarity not a
requirement
VAR vs. LSTM: Stationarity Requirement
• applied to stationary
time series only
• its variant (e.g. Vector
Error Correction
Model) can be applied
to co-integrated series
59
VAR LSTM
• data preprocessing is a
lot more involved
• network architecture
design, model training
and hyperparameter
tuning requires much
more efforts
VAR vs. LSTM: Model Implementation
• data preprocessing is
straight-forward
• model specification is
relative straight-
forward, model
training time is fast
60
VAR LSTM
Section V
Section I: Time series forecasting problem formulation
Section II: Univariate & Multivariate time series forecasting
Section III: Selected approaches to this problem:
v Autoregressive Integrated Moving Average (ARIMA) Model
v Vector Autoregressive (VAR) Model
v Recurrent Neural Network
Ø Formulation
Ø Python Implementation
Section IV: Comparison of the statistical and neural
network approaches
Section V: Spark Implementation Consideration
61
Implementation using Spark
62
One may think it is straightforward to use Spark for all the steps (outlined
above) up to model training, such as ...
Ordering in the series matterWait … Not So Fast!
Flint: A Time Series Library for Spark
63
Need time-series aware libraries whose operations preserve the natural
order of the series, such as Flint, open sourced by TwoSigma
(https://github.com/twosigma/flint)
- can handle a variety of datafile types, manipulate and aggregate time
series, and provide time-series-specific functions, such as (time-series)
join, (time-series) filtering, time-based windowing, cycles, and
intervalizing)
Example: Defining a moving average of the beer consumption series
A Few Words on TSA Workflow in Spark
64
Load, pre-process, analyze,
transform series using time-
series aware libraries
Conduct Hyperparameter
Tuning in a distributed
model and use Spark ML
Pipeline
Thank You!
65jyau@berkeley.edu

More Related Content

What's hot

Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Simplilearn
 
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Simplilearn
 
Sequence Modelling with Deep Learning
Sequence Modelling with Deep LearningSequence Modelling with Deep Learning
Sequence Modelling with Deep Learning
Natasha Latysheva
 

What's hot (20)

Long Short Term Memory
Long Short Term MemoryLong Short Term Memory
Long Short Term Memory
 
Supervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | EdurekaSupervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | Edureka
 
Hidden Markov Model & It's Application in Python
Hidden Markov Model & It's Application in PythonHidden Markov Model & It's Application in Python
Hidden Markov Model & It's Application in Python
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
 
Autoencoders
AutoencodersAutoencoders
Autoencoders
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
 
Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Time series forecasting
Time series forecastingTime series forecasting
Time series forecasting
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Linear models for classification
Linear models for classificationLinear models for classification
Linear models for classification
 
K Nearest Neighbors
K Nearest NeighborsK Nearest Neighbors
K Nearest Neighbors
 
GMM
GMMGMM
GMM
 
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
 
Sequence Modelling with Deep Learning
Sequence Modelling with Deep LearningSequence Modelling with Deep Learning
Sequence Modelling with Deep Learning
 
Optimization for Deep Learning
Optimization for Deep LearningOptimization for Deep Learning
Optimization for Deep Learning
 
LSTM Basics
LSTM BasicsLSTM Basics
LSTM Basics
 
Feedforward neural network
Feedforward neural networkFeedforward neural network
Feedforward neural network
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
 

Similar to Time Series Forecasting Using Recurrent Neural Network and Vector Autoregressive Model: When and How with Jeffrey Yau

Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Databricks
 
Machine Learning @NECST
Machine Learning @NECSTMachine Learning @NECST
Machine Learning @NECST
NECST Lab @ Politecnico di Milano
 
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Luigi Vanfretti
 
Study on Some Key Issues of Synergetic Neural Network
Study on Some Key Issues of Synergetic Neural Network  Study on Some Key Issues of Synergetic Neural Network
Study on Some Key Issues of Synergetic Neural Network
Jie Bao
 
Mumbai University M.E computer engg syllabus
Mumbai University M.E computer engg syllabusMumbai University M.E computer engg syllabus
Mumbai University M.E computer engg syllabus
Shini Saji
 
Programacion multiobjetivo
Programacion multiobjetivoProgramacion multiobjetivo
Programacion multiobjetivo
Diego Bass
 

Similar to Time Series Forecasting Using Recurrent Neural Network and Vector Autoregressive Model: When and How with Jeffrey Yau (20)

Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
 
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
 
StackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkStackNet Meta-Modelling framework
StackNet Meta-Modelling framework
 
Machine Learning @NECST
Machine Learning @NECSTMachine Learning @NECST
Machine Learning @NECST
 
lecture_16.pptx
lecture_16.pptxlecture_16.pptx
lecture_16.pptx
 
Modeling & Simulation Lecture Notes
Modeling & Simulation Lecture NotesModeling & Simulation Lecture Notes
Modeling & Simulation Lecture Notes
 
Scala for Machine Learning
Scala for Machine LearningScala for Machine Learning
Scala for Machine Learning
 
IRJET - Stock Market Prediction using Machine Learning Algorithm
IRJET - Stock Market Prediction using Machine Learning AlgorithmIRJET - Stock Market Prediction using Machine Learning Algorithm
IRJET - Stock Market Prediction using Machine Learning Algorithm
 
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
 
2012 05-10 kaiser
2012 05-10 kaiser2012 05-10 kaiser
2012 05-10 kaiser
 
Wait-free data structures on embedded multi-core systems
Wait-free data structures on embedded multi-core systemsWait-free data structures on embedded multi-core systems
Wait-free data structures on embedded multi-core systems
 
CTF: Anomaly Detection in High-Dimensional Time Series with Coarse-to-Fine Mo...
CTF: Anomaly Detection in High-Dimensional Time Series with Coarse-to-Fine Mo...CTF: Anomaly Detection in High-Dimensional Time Series with Coarse-to-Fine Mo...
CTF: Anomaly Detection in High-Dimensional Time Series with Coarse-to-Fine Mo...
 
Study on Some Key Issues of Synergetic Neural Network
Study on Some Key Issues of Synergetic Neural Network  Study on Some Key Issues of Synergetic Neural Network
Study on Some Key Issues of Synergetic Neural Network
 
Time series analysis : Refresher and Innovations
Time series analysis : Refresher and InnovationsTime series analysis : Refresher and Innovations
Time series analysis : Refresher and Innovations
 
Variation response method CAE simulation suite
Variation response method CAE simulation suiteVariation response method CAE simulation suite
Variation response method CAE simulation suite
 
Mumbai University M.E computer engg syllabus
Mumbai University M.E computer engg syllabusMumbai University M.E computer engg syllabus
Mumbai University M.E computer engg syllabus
 
2nd sem
2nd sem2nd sem
2nd sem
 
2nd sem
2nd sem2nd sem
2nd sem
 
Basic use of xcms
Basic use of xcmsBasic use of xcms
Basic use of xcms
 
Programacion multiobjetivo
Programacion multiobjetivoProgramacion multiobjetivo
Programacion multiobjetivo
 

More from Databricks

Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
ranjankumarbehera14
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
ptikerjasaptiker
 
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
vexqp
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
gajnagarg
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
cnajjemba
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
vexqp
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
vexqp
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
wsppdmt
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
gajnagarg
 

Recently uploaded (20)

Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdf
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Harnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptxHarnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptx
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 

Time Series Forecasting Using Recurrent Neural Network and Vector Autoregressive Model: When and How with Jeffrey Yau

  • 1. Jeffrey Yau Chief Data Scientist, AllianceBernstein, L.P. Lecturer, UC Berkeley Masters of Information Data Science Time Series Forecasting Using Neural Network-Based and Time Series Statistic Models
  • 2. Learning Objectives 2 For data scientists and practitioners conducting time series forecasting After this introductory lecture today, you will have learned 1. The characteristics of time series forecasting problem 2. How to set up a time-series forecasting problem 3. The basic intuition of two popular statistical time series models and LSTM network 4. How to implement these methods in Python 5. The advantages and potential drawbacks of these methods
  • 3. Agenda Section I: Time series forecasting problem formulation Section II: Univariate & Multivariate time series forecasting Section III: Selected approaches to this problem: v Autoregressive Integrated Moving Average (ARIMA) Model v Vector Autoregressive (VAR) Model v Recurrent Neural Network Ø Formulation Ø Python Implementation Section IV: Comparison of the statistical and neural network approaches Section V: Spark Implementation Consideration 3
  • 4. Motivation - Why MTSA? 4 Daily Interest Rates Days InterestRates
  • 5. Section I Section I: Time series forecasting problem formulation Section II: Univariate & Multivariate time series forecasting Section III: Selected approaches to this problem: v Autoregressive Integrated Moving Average (ARIMA) Model v Vector Autoregressive (VAR) Model v Recurrent Neural Network Ø Formulation Ø Python Implementation Section IV: Comparison of the statistical and neural network approaches Section V: Spark Implementation Consideration 5
  • 6. Forecasting: Problem Formulation • Forecasting: predicting the future values of the series using current information set 6
  • 7. Forecasting: Problem Formulation • Forecasting: predicting the future values of the series using current information set 7
  • 8. Forecasting: Problem Formulation • Forecasting: predicting the future values of the series using current information set • Current information set consists of current and past values of the series and other “exogenous” series 8
  • 9. Time Series Forecasting Requires a Model 9
  • 10. Time Series Forecasting Requires Models 10
  • 11. Time Series Forecasting Requires Models 11 A statistical model or a machine learning algorithm Forecast horizon: H Information Set:
  • 12. A Naïve, Rule-based Model: 12 A model, f(), could be as simple as “a rule” - naive model: The forecast for tomorrow is the observed value today Forecast horizon: h=1 Information Set:
  • 13. However … !()could be a slightly more complicated “model” that utilizes more information from the current information set 13
  • 14. “Rolling” Average Model 14 The forecast for time t+1 is an average of the observed values from a predefined, past k time periods Forecast horizon: h=1 Information Set:
  • 15. More Sophisticated Models 15 … there are other, much more sophisticated models that can utilize the information set much more efficiently
  • 16. Section II Section I: Time series forecasting problem formulation Section II: Univariate & Multivariate time series forecasting Section III: Selected approaches to this problem: v Autoregressive Integrated Moving Average (ARIMA) Model v Vector Autoregressive (VAR) Model v Recurrent Neural Network Ø Formulation Ø Python Implementation Section IV: Comparison of the statistical and neural network approaches Section V: Spark Implementation Consideration 16
  • 17. Univariate Statistical Time Series Models Statistical relationship is unidirectional 17 values from its own series exogenous series Dynamics of series y
  • 18. Autoregressive Moving Average (ARIMA) Model: 1-Minute Recap 18 values from own series shocks / “error” terms exogenous series It models the dynamics of the series y
  • 19. Autoregressive Integrated Moving Average (ARIMA) Model: 1-Minute Recap 19 My tutorial at PyData San Francisco 2016
  • 20. Multivariate Time Series Modeling 20 Kequations lag-1 of the K series lag-p of the K series exogenous series Dynamics of each of the series Interdependence among the series
  • 21. Section III.A Section I: Time series forecasting problem formulation Section II: Univariate & Multivariate time series forecasting Section III: Selected approaches to this problem: v Autoregressive Integrated Moving Average (ARIMA) Model v Vector Autoregressive (VAR) Model v Recurrent Neural Network Ø Formulation Ø Python Implementation Section IV: Comparison of the statistical and neural network approaches Section V: Spark Implementation Consideration 21
  • 22. Joint Modeling of Multiple Time Series 22
  • 23. ● a system of linear equations of the K series being modeled ● only applies to stationary series ● non-stationary series can be transformed into stationary ones using simple differencing (note: if the series are not co-integrated, then we can still apply VAR ("VAR in differences")) Vector Autoregressive (VAR) Models 23
  • 24. Vector Autoregressive (VAR) Model of Order 1 24 AsystemofKequations Each series is modelled by its own lag as well as other series’ lags
  • 25. General Steps to Build VAR Model 25 1. Ingest the series 2. Train/validation/test split the series 3. Conduct exploratory time series data analysis on the training set 4. Determine if the series are stationary 5. Transform the series 6. Build a model on the transformed series 7. Model diagnostic 8. Model selection (based on some pre-defined criterion) 9. Conduct forecast using the final, chosen model 10.Inverse-transform the forecast 11. Conduct forecast evaluation Iterative
  • 26. Index of Consumer Sentiment 26 autocorrelation function (ACF) graph Partial autocorrelation function (PACF) graph
  • 29. Transforming the Series 29 Take the simple-difference of the natural logarithmic transformation of the series note: difference-transformation generates missing values
  • 31. Is the method we propose capable of answering the following questions? ● What are the dynamic properties of these series? Own lagged coefficients ● How are these series interact, if at all? Cross-series lagged coefficients VAR Model Proposed 31
  • 32. VAR Model Estimation and Output 32
  • 33. VAR Model Output - Estimated Coefficients 33put your #assignedhashtag here by setting the footer in view-header/footer
  • 34. VAR Model Output - Var-Covar Matrix 34
  • 36. VAR Model Selection 36 Model selection, in the case of VAR(p), is the choice of the order and the specification of each equation Information criterion can be used for model selection:
  • 37. VAR Model - Inverse Transform 37 Don’t forget to inverse-transform the forecasted series!
  • 38. VAR Model - Forecast Using the Model 38 The Forecast Equation:
  • 39. VAR Model Forecast 39 where T is the last observation period and l is the lag
  • 40. What do the result mean in this context? 40 Don’t forget to put the result in the existing context!
  • 41. Section III.B Section I: Time series forecasting problem formulation Section II: Univariate & Multivariate time series forecasting Section III: Selected approaches to this problem: v Autoregressive Integrated Moving Average (ARIMA) Model v Vector Autoregressive (VAR) Model v Recurrent Neural Network Ø Formulation Ø Python Implementation Section IV: Comparison of the statistical and neural network approaches Section V: Spark Implementation Consideration 41
  • 42. Feed-Forward Network with a Single Output 42 inputs output Ø Information does not account for time ordering Ø inputs are processed independently Ø no “device” to keep the past information Network architecture does not have "memory" built in Hidden layers
  • 43. Recurrent Neural Network (RNN) 43 A network architecture that can • retain past information • track the state of the world, and • update the state of the world as the network moves forward Handles variable-length sequence by having a recurrent hidden state whose activation at each time is dependent on that of the previous time.
  • 44. Standard Recurrent Neural Network (RNN) 44
  • 45. Limitation of Vanilla RNN Architecture 45 Exploding (and vanishing) gradient problems (Sepp Hochreiter, 1991 Diploma Thesis)
  • 46. Long Short Term Memory (LSTM) Network 46
  • 47. LSTM: Hochreiter and Schmidhuber (1997) 47 The architecture of memory cells and gate units from the original Hochreiter and Schmidhuber (1997) paper
  • 48. Long Short Term Memory (LSTM) Network 48 Another representation of the architecture of memory cells and gate units: Greff, Srivastava, Koutnık, Steunebrink, Schmidhuber (2016)
  • 49. LSTM: An Overview 49 LSTM Memory Cell ht-1 ht Use memory cells and gated units for information flow hidden state (value from activation function) in time step t-1 hidden state (value from activation function) in time step t
  • 50. LSTM: An Overview 50 LSTM Memory Cell ht-1 ht hidden state memory cell (state) Output gate Forget gate Input gate Training uses Backward Propagation Through Time (BPTT)
  • 51. LSTM: An Overview 51 LSTM Memory Cell ht-1 ht hidden state(t) memory cell (t) Training uses Backward Propagation Through Time (BPTT) Candidate memory cell (t) Output gate Input gate Forget gate
  • 52. Implementation in Keras Some steps to highlight: • Formulate the series for a RNN supervised learning regression problem (i.e. (Define target and input tensors)) • Scale all the series • Split the series for training/development/testing • Reshape the series for (Keras) RNN implementation • Define the (initial) architecture of the LSTM Model ○ Define a network of layers that maps your inputs to your targets and the complexity of each layer (i.e. number of memory cells) ○ Configure the learning process by picking a loss function, an optimizer, and metrics to monitor • Produce the forecasts and then reverse-scale the forecasted series • Calculate loss metrics (e.g. RMSE, MAE) 52 Note that stationarity, as defined previously, is not a requirement
  • 53. 53 Continue with the Same Example
  • 54. LSTM Architecture Design, Training, Evaluation 54
  • 56. Section IV Section I: Time series forecasting problem formulation Section II: Univariate & Multivariate time series forecasting Section III: Selected approaches to this problem: v Autoregressive Integrated Moving Average (ARIMA) Model v Vector Autoregressive (VAR) Model v Recurrent Neural Network Ø Formulation Ø Python Implementation Section IV: Comparison of the statistical and neural network approaches Section V: Spark Implementation Consideration 56
  • 57. VAR vs. LSTM: Data Type macroeconomic time series, financial time series, business time series, and other numeric series 57 DNA sequences, images, voice sequences, texts, all the numeric time series (that can be modeled by VAR) VAR LSTM
  • 58. Layer(s) of many non- linear transformations VAR vs. LSTM: Parametric form A linear system of equations - highly parameterized (can be formulated in the general state space model) 58 VAR LSTM
  • 59. • stationarity not a requirement VAR vs. LSTM: Stationarity Requirement • applied to stationary time series only • its variant (e.g. Vector Error Correction Model) can be applied to co-integrated series 59 VAR LSTM
  • 60. • data preprocessing is a lot more involved • network architecture design, model training and hyperparameter tuning requires much more efforts VAR vs. LSTM: Model Implementation • data preprocessing is straight-forward • model specification is relative straight- forward, model training time is fast 60 VAR LSTM
  • 61. Section V Section I: Time series forecasting problem formulation Section II: Univariate & Multivariate time series forecasting Section III: Selected approaches to this problem: v Autoregressive Integrated Moving Average (ARIMA) Model v Vector Autoregressive (VAR) Model v Recurrent Neural Network Ø Formulation Ø Python Implementation Section IV: Comparison of the statistical and neural network approaches Section V: Spark Implementation Consideration 61
  • 62. Implementation using Spark 62 One may think it is straightforward to use Spark for all the steps (outlined above) up to model training, such as ... Ordering in the series matterWait … Not So Fast!
  • 63. Flint: A Time Series Library for Spark 63 Need time-series aware libraries whose operations preserve the natural order of the series, such as Flint, open sourced by TwoSigma (https://github.com/twosigma/flint) - can handle a variety of datafile types, manipulate and aggregate time series, and provide time-series-specific functions, such as (time-series) join, (time-series) filtering, time-based windowing, cycles, and intervalizing) Example: Defining a moving average of the beer consumption series
  • 64. A Few Words on TSA Workflow in Spark 64 Load, pre-process, analyze, transform series using time- series aware libraries Conduct Hyperparameter Tuning in a distributed model and use Spark ML Pipeline