SlideShare a Scribd company logo
1 of 55
Download to read offline
Jeffrey Yau
Chief Data Scientist, AllianceBernstein, L.P.
Lecturer, UC Berkeley Masters of Information Data
Science
Time Series Forecasting Using
Recurrent Neural Network and
Vector Autoregressive Model
Motivation - Why MTSA?
2
Daily Interest Rates
Days
InterestRates
Agenda
Section I: Time series forecasting problem formulation
Section II: Multivariate (vs. univariate) time series
forecasting
Section III: Two (of the many) approaches to this problem:
– Vector Autoregressive (VAR) Models
– Long Short Term Memory (LSTM) Network
• Formulation
• Implementation
Section IV: Comparison of the two approaches
Section V: A few words on Spark Implementation
3
Section I
• Time series forecasting problem formulation
• Multivariate (vs. univariate) time series forecasting
• Two (of the many) approaches to this problem:
– Vector Autoregressive (VAR) Models
– Long Short Term Memory (LSTM) Network
• Formulation
• Implementation
• Comparison of the two approaches
• A few words on Spark Implementation
4
Forecasting: Problem Formulation
• Forecasting: predicting the future values of the series
using current information set
• Current information set consists of current and past
values of the series and other exogenous series
5
Time Series Forecasting Requires Models
7
A statistical model or a
machine learning
algorithm
Information Set:
Forecast
horizon: h
What models?
9
A model, f(), could be as simple as “a rule” - naive model:
The forecast for tomorrow is the observed value today
Forecast
horizon: h=1
Information Set:
“Rolling” Average Model
10
It could be a slightly more complicated model
The forecast for time t+1 is an average of the observed
values from a predefined, past k time periods
Forecast horizon: h=1 Information Set:
More Sophisticated Model
11
… but there are other, more sophisticated model that
utilize the information set much more efficiently and aim
at optimizing some metrics
Section II
• Time series forecasting problem formulation
• Multivariate (vs. univariate) time series forecasting
• Two (of the many) approaches to this problem:
– Vector Autoregressive (VAR) Models
– Long Short Term Memory (LSTM) Network
• Formulation
• Implementation
• Comparison of the two approaches
• A few words on Spark Implementation
12
Univariate Time Series Models
Statistical relationship is unidirectional
13
values from its own series
exogenous seriesDynamics of series y
Autoregressive Integrated Moving Average
(ARIMA) Model
14
values from the series shocks / “error” terms exogenous series
Limitation: cannot model cross equation dynamics (or
cross series dependency)
Multivariate Time Series Modeling
15
Kequations
lag-1 of the K series
lag-p of the K series
exogenous series
Dynamics of each of the series Interdependence among the series
Section III.A
• Time series forecasting problem formulation
• Multivariate (vs. univariate) time series forecasting
• Two (of the many) approaches to this problem:
– Vector Autoregressive (VAR) Models
– Long Short Term Memory (LSTM) Network
• Formulation
• Implementation
• Comparison of the two approaches
• A few words on Spark Implementation
16
Joint Modeling of Multiple Time Series
17
● a system of linear equations of the K series being
modeled
● only applies to stationary series
● non-stationary series can be transformed into
stationary ones using simple differencing (note: if the
series are not co-integrated, then we can still apply
VAR ("VAR in differences"))
Vector Autoregressive (VAR) Models
18
Vector Autoregressive (VAR) Model of
Order 1
19
K series 1-lag each
AsystemofKequations
General Steps to Build VAR Model
20
1. Load the series
2. Train/validation/test split the series
3. Conduct exploratory time series data analysis on the training set
4. Determine if the series are stationary
5. Transform the series
6. Build a model on the transformed series
7. Model diagnostic
8. Model selection (based on some pre-defined criterion)
9. Conduct forecast using the final, chosen model
10. Inverse-transform the forecast
11. Conduct forecast evaluation
Iterative
ETSDA - Index of Consumer Sentiment
21
autocorrelation function
(ACF) graph
Partial autocorrelation
function (PACF) graph
ETSDA - Beer Consumption Series
22
Autocorrelation function
Transforming the Series
23
Take the simple-difference of the natural logarithmic transformation of
the series
note: difference-transformation
generates missing values
ETSDA on the Transformed Series
24
Is the method we propose capable of answering the following questions?
● What are the dynamic properties of these series? Own lagged
coefficients
● How are these series interact, if at all? Cross-series lagged coefficients
VAR Model Proposed
25
VAR Model Estimation and Output
26
VAR Model Output - Estimated Coefficients
27put your #assignedhashtag here by setting the footer in view-header/footer
VAR Model Output - Var-Covar Matrix
28
VAR Model Diagnostic
29
UMCSENT Beer
VAR Model Selection
30
Model selection, in the case of VAR(p), is the choice of the order and
the specification of each equation
Information criterion can be used for model selection:
VAR Model - Inverse Transform
31
Don’t forget to inverse-transform the forecasted series!
VAR Model - Forecast Using the Model
32
The Forecast Equation:
VAR Model Forecast
33
where T is the last observation period and l is the lag
What do the result mean in this context?
34
Don’t forget to put the result in the existing context!
Section III.B
• Time series forecasting problem formulation
• Multivariate (vs. univariate) time series forecasting
• Two (of the many) approaches to this problem:
– Vector Autoregressive (VAR) Models
– Long Short Term Memory (LSTM) Network
• Formulation
• Implementation
• Comparison of the two approaches
• A few words on Spark Implementation
35
Standard feed-forward network
36
• network architecture does
not have "memory" built in
• inputs are processed
independently
• no “device” to keep the past
information
Vanilla Recurrent Neural Network (RNN)
37
A structure that can retain past information, track the state of the
world, and update the state of the world as the network moves forward
Limitation of Vanilla RNN Architecture
38
Exploding (and vanishing) gradient problems
(Sepp Hochreiter, 1991 Diploma Thesis)
Long Short Term Memory (LSTM) Network
39
LSTM: Hochreiter and Schmidhuber (1997)
40
The architecture of memory cells and gate units from the original
Hochreiter and Schmidhuber (1997) paper
LSTM
41
Another representation of the architecture of memory cells and gate
units: Greff, Srivastava, Koutnık, Steunebrink, Schmidhuber (2016)
(Vanilla) LSTM: 1-minute Overview
42
Chris Colah’s blog gives an excellent introduction to vanilla RNN and LSTM:
http://colah.github.io/posts/2015-08-Understanding-LSTMs/#fnref1
•
Use memory cells and gated units to information flow
LSTM Memory
Cell
ht-1 h
t
Training uses Backward Propagation Through Time (BPTT)
memory cell (state)
hidden state
current input
output gate
forget gate
input gate
Implementation in Keras
Some steps to highlight:
• Formulate the series for a RNN supervised learning regression problem
(i.e. (Define target and input tensors)
• Scale all the series
• Define the (initial) architecture of the LSTM Model
○ Define a network of layers that maps your inputs to your targets and
the complexity of each layer (i.e. number of neurons)
○ Configure the learning process by picking a loss function, an
optimizer, and metrics to monitor
• Produce the forecasts and then reverse-scale the forecasted series
• Calculate loss metrics (e.g. RMSE, MAE)
43
Note that stationarity, as defined previously, is not a requirement
Consumer Sentiment and Housing Starts
44
LSTM Architecture Design and Training -
A Simple Illustration
45
LSTM: Forecast Results (1)
46
Monthly series:
● Training set: 1981 Jan - 2016 Dec
● Test set: 2017 Jan to 2018 Apr
LSTM: Forecast vs Observed Series
47
Housing Stars Consumer Sentiment
Section IV
• Time series forecasting problem formulation
• Multivariate (vs. univariate) time series forecasting
• Two (of the many) approaches to this problem:
– Vector Autoregressive (VAR) Models
– Long Short Term Memory (LSTM) Network
• Formulation
• Implementation
• Comparison of the two approaches
• A few words on Spark Implementation
48put your #assignedhashtag here by setting the footer in view-header/footer
VAR vs. LSTM: Data Type
macroeconomic time
series, financial time
series, business time
series, and other numeric
series
49
images, texts, all the
numeric time series (that
can be modeled by VAR)
VAR LSTM
layers of non-linear
transformations of input
series
VAR vs. LSTM: Parametric form
a linear system of
equations - highly
parameterized
50
VAR LSTM
• not require
stationarity
VAR vs. LSTM: Stationarity Requirement
• applied to stationary
time series only
• its variant (e.g. Vector
Error Correction
Model) can be applied
to co-integrated series
51
VAR LSTM
• data preprocessing is a
lot more involved
• model training and
hyper-parameter
tuning are time
consuming
VAR vs. LSTM: Model Implementation
• data preprocessing is
straight-forward
• model training time is
fast
52
VAR LSTM
Section IV
• Time series forecasting problem formulation
• Multivariate (vs. univariate) time series forecasting
• Two (of the many) approaches to this problem:
– Vector Autoregressive (VAR) Models
– Long Short Term Memory (LSTM) Network
• Formulation
• Implementation
• Comparison of the two approaches
• A few words on Spark Implementation
53put your #assignedhashtag here by setting the footer in view-header/footer
Implementation using Spark
54
Straightforward to use Spark for all the steps (outlined above) up to model
training, such as ...
Ordering in the series matterWait … Not So Fast!
Flint: A Time Series Library for Spark
55
Need time-series aware libraries whose operations preserve the natural
order of the series, such as Flint, open sourced by TwoSigma
(https://github.com/twosigma/flint)
- can handle a variety of datafile types, manipulate and aggregate time
series, and provide time-series-specific functions, such as (time-series)
join, (time-series) filtering, time-based windowing, cycles, and
intervalizing)
Example: Defining a moving average of the beer consumption series
A Few Words on TSA Workflow in Spark
56
Load, pre-process, analyze,
transform series using
time-series aware libraries
Conduct Hyperparameter
Tuning in a distributed
model and use Spark ML
Pipeline
Thank You!
57

More Related Content

What's hot

Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Databricks
 
Introduction to Simulation
Introduction to SimulationIntroduction to Simulation
Introduction to Simulation
chimco.net
 
Predictive Modelling
Predictive ModellingPredictive Modelling
Predictive Modelling
Rajiv Advani
 
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Simplilearn
 
Simulation theory
Simulation theorySimulation theory
Simulation theory
Abu Bashar
 
Time Series Classification with Deep Learning | Marco Del Pra
Time Series Classification with Deep Learning | Marco Del PraTime Series Classification with Deep Learning | Marco Del Pra
Time Series Classification with Deep Learning | Marco Del Pra
Data Science Milan
 

What's hot (20)

Model selection and cross validation techniques
Model selection and cross validation techniquesModel selection and cross validation techniques
Model selection and cross validation techniques
 
Machine learning & Time Series Analysis
Machine learning & Time Series AnalysisMachine learning & Time Series Analysis
Machine learning & Time Series Analysis
 
Lesson 5 arima
Lesson 5 arimaLesson 5 arima
Lesson 5 arima
 
Time series analysis
Time series analysisTime series analysis
Time series analysis
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
 
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
 
Introduction to Simulation
Introduction to SimulationIntroduction to Simulation
Introduction to Simulation
 
Markov chain
Markov chainMarkov chain
Markov chain
 
Support Vector Machines ( SVM )
Support Vector Machines ( SVM ) Support Vector Machines ( SVM )
Support Vector Machines ( SVM )
 
Predictive Modelling
Predictive ModellingPredictive Modelling
Predictive Modelling
 
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
 
Time-series Analysis in Minutes
Time-series Analysis in MinutesTime-series Analysis in Minutes
Time-series Analysis in Minutes
 
Simulation theory
Simulation theorySimulation theory
Simulation theory
 
Data Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series ForecastingData Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series Forecasting
 
Time Series Classification with Deep Learning | Marco Del Pra
Time Series Classification with Deep Learning | Marco Del PraTime Series Classification with Deep Learning | Marco Del Pra
Time Series Classification with Deep Learning | Marco Del Pra
 
Time Series In R | Time Series Forecasting | Time Series Analysis | Data Scie...
Time Series In R | Time Series Forecasting | Time Series Analysis | Data Scie...Time Series In R | Time Series Forecasting | Time Series Analysis | Data Scie...
Time Series In R | Time Series Forecasting | Time Series Analysis | Data Scie...
 
Demand forecasting by time series analysis
Demand forecasting by time series analysisDemand forecasting by time series analysis
Demand forecasting by time series analysis
 
Seasonal ARIMA
Seasonal ARIMASeasonal ARIMA
Seasonal ARIMA
 
Box jenkins method of forecasting
Box jenkins method of forecastingBox jenkins method of forecasting
Box jenkins method of forecasting
 
Time series predictions using LSTMs
Time series predictions using LSTMsTime series predictions using LSTMs
Time series predictions using LSTMs
 

Similar to Time Series Forecasting Using Recurrent Neural Network and Vector Autoregressive Model: When and How with Jeffrey Yau

Power System Simulation: History, State of the Art, and Challenges
Power System Simulation: History, State of the Art, and ChallengesPower System Simulation: History, State of the Art, and Challenges
Power System Simulation: History, State of the Art, and Challenges
Luigi Vanfretti
 
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Luigi Vanfretti
 
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
inside-BigData.com
 
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
Luigi Vanfretti
 
Machine Learning @NECST
Machine Learning @NECSTMachine Learning @NECST
Machine Learning @NECST
NECST Lab @ Politecnico di Milano
 
Nafems15 systeme
Nafems15 systemeNafems15 systeme
Nafems15 systeme
SDTools
 
Seminar_New -CESG
Seminar_New -CESGSeminar_New -CESG
Seminar_New -CESG
Qian Wang
 

Similar to Time Series Forecasting Using Recurrent Neural Network and Vector Autoregressive Model: When and How with Jeffrey Yau (20)

Power System Simulation: History, State of the Art, and Challenges
Power System Simulation: History, State of the Art, and ChallengesPower System Simulation: History, State of the Art, and Challenges
Power System Simulation: History, State of the Art, and Challenges
 
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
Wanted!: Open M&S Standards and Technologies for the Smart Grid - Introducing...
 
Co-Simulation Interfacing Capabilities in Device-Level Power Electronic Circu...
Co-Simulation Interfacing Capabilities in Device-Level Power Electronic Circu...Co-Simulation Interfacing Capabilities in Device-Level Power Electronic Circu...
Co-Simulation Interfacing Capabilities in Device-Level Power Electronic Circu...
 
Presentation vision transformersppt.pptx
Presentation vision transformersppt.pptxPresentation vision transformersppt.pptx
Presentation vision transformersppt.pptx
 
CFD_Lecture_1.pdf
CFD_Lecture_1.pdfCFD_Lecture_1.pdf
CFD_Lecture_1.pdf
 
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
 
The RaPId Toolbox for Parameter Identification and Model Validation: How Mode...
The RaPId Toolbox for Parameter Identification and Model Validation: How Mode...The RaPId Toolbox for Parameter Identification and Model Validation: How Mode...
The RaPId Toolbox for Parameter Identification and Model Validation: How Mode...
 
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
 
Machine Learning @NECST
Machine Learning @NECSTMachine Learning @NECST
Machine Learning @NECST
 
Scala for Machine Learning
Scala for Machine LearningScala for Machine Learning
Scala for Machine Learning
 
Nafems15 systeme
Nafems15 systemeNafems15 systeme
Nafems15 systeme
 
Eclipse VIATRA Overview 2017
Eclipse VIATRA Overview 2017Eclipse VIATRA Overview 2017
Eclipse VIATRA Overview 2017
 
OPAL-RT real-time simulation at RTE
OPAL-RT real-time simulation at RTEOPAL-RT real-time simulation at RTE
OPAL-RT real-time simulation at RTE
 
Simulation Software Performances And Examples
Simulation Software Performances And ExamplesSimulation Software Performances And Examples
Simulation Software Performances And Examples
 
Seminar_New -CESG
Seminar_New -CESGSeminar_New -CESG
Seminar_New -CESG
 
Nafems15 Technical meeting on system modeling
Nafems15 Technical meeting on system modelingNafems15 Technical meeting on system modeling
Nafems15 Technical meeting on system modeling
 
F017423643
F017423643F017423643
F017423643
 
Hardware Implementation of Cascade SVM
Hardware Implementation of Cascade SVMHardware Implementation of Cascade SVM
Hardware Implementation of Cascade SVM
 
Threads and Concurrency Identifying Performance Deviations in Thread Pools
Threads and Concurrency Identifying Performance Deviations in Thread PoolsThreads and Concurrency Identifying Performance Deviations in Thread Pools
Threads and Concurrency Identifying Performance Deviations in Thread Pools
 
Architecture innovations in POWER ISA v3.01 and POWER10
Architecture innovations in POWER ISA v3.01 and POWER10Architecture innovations in POWER ISA v3.01 and POWER10
Architecture innovations in POWER ISA v3.01 and POWER10
 

More from Databricks

Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
amitlee9823
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
amitlee9823
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
amitlee9823
 

Recently uploaded (20)

Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 

Time Series Forecasting Using Recurrent Neural Network and Vector Autoregressive Model: When and How with Jeffrey Yau

  • 1. Jeffrey Yau Chief Data Scientist, AllianceBernstein, L.P. Lecturer, UC Berkeley Masters of Information Data Science Time Series Forecasting Using Recurrent Neural Network and Vector Autoregressive Model
  • 2. Motivation - Why MTSA? 2 Daily Interest Rates Days InterestRates
  • 3. Agenda Section I: Time series forecasting problem formulation Section II: Multivariate (vs. univariate) time series forecasting Section III: Two (of the many) approaches to this problem: – Vector Autoregressive (VAR) Models – Long Short Term Memory (LSTM) Network • Formulation • Implementation Section IV: Comparison of the two approaches Section V: A few words on Spark Implementation 3
  • 4. Section I • Time series forecasting problem formulation • Multivariate (vs. univariate) time series forecasting • Two (of the many) approaches to this problem: – Vector Autoregressive (VAR) Models – Long Short Term Memory (LSTM) Network • Formulation • Implementation • Comparison of the two approaches • A few words on Spark Implementation 4
  • 5. Forecasting: Problem Formulation • Forecasting: predicting the future values of the series using current information set • Current information set consists of current and past values of the series and other exogenous series 5
  • 6. Time Series Forecasting Requires Models 7 A statistical model or a machine learning algorithm Information Set: Forecast horizon: h
  • 7. What models? 9 A model, f(), could be as simple as “a rule” - naive model: The forecast for tomorrow is the observed value today Forecast horizon: h=1 Information Set:
  • 8. “Rolling” Average Model 10 It could be a slightly more complicated model The forecast for time t+1 is an average of the observed values from a predefined, past k time periods Forecast horizon: h=1 Information Set:
  • 9. More Sophisticated Model 11 … but there are other, more sophisticated model that utilize the information set much more efficiently and aim at optimizing some metrics
  • 10. Section II • Time series forecasting problem formulation • Multivariate (vs. univariate) time series forecasting • Two (of the many) approaches to this problem: – Vector Autoregressive (VAR) Models – Long Short Term Memory (LSTM) Network • Formulation • Implementation • Comparison of the two approaches • A few words on Spark Implementation 12
  • 11. Univariate Time Series Models Statistical relationship is unidirectional 13 values from its own series exogenous seriesDynamics of series y
  • 12. Autoregressive Integrated Moving Average (ARIMA) Model 14 values from the series shocks / “error” terms exogenous series Limitation: cannot model cross equation dynamics (or cross series dependency)
  • 13. Multivariate Time Series Modeling 15 Kequations lag-1 of the K series lag-p of the K series exogenous series Dynamics of each of the series Interdependence among the series
  • 14. Section III.A • Time series forecasting problem formulation • Multivariate (vs. univariate) time series forecasting • Two (of the many) approaches to this problem: – Vector Autoregressive (VAR) Models – Long Short Term Memory (LSTM) Network • Formulation • Implementation • Comparison of the two approaches • A few words on Spark Implementation 16
  • 15. Joint Modeling of Multiple Time Series 17
  • 16. ● a system of linear equations of the K series being modeled ● only applies to stationary series ● non-stationary series can be transformed into stationary ones using simple differencing (note: if the series are not co-integrated, then we can still apply VAR ("VAR in differences")) Vector Autoregressive (VAR) Models 18
  • 17. Vector Autoregressive (VAR) Model of Order 1 19 K series 1-lag each AsystemofKequations
  • 18. General Steps to Build VAR Model 20 1. Load the series 2. Train/validation/test split the series 3. Conduct exploratory time series data analysis on the training set 4. Determine if the series are stationary 5. Transform the series 6. Build a model on the transformed series 7. Model diagnostic 8. Model selection (based on some pre-defined criterion) 9. Conduct forecast using the final, chosen model 10. Inverse-transform the forecast 11. Conduct forecast evaluation Iterative
  • 19. ETSDA - Index of Consumer Sentiment 21 autocorrelation function (ACF) graph Partial autocorrelation function (PACF) graph
  • 20. ETSDA - Beer Consumption Series 22 Autocorrelation function
  • 21. Transforming the Series 23 Take the simple-difference of the natural logarithmic transformation of the series note: difference-transformation generates missing values
  • 22. ETSDA on the Transformed Series 24
  • 23. Is the method we propose capable of answering the following questions? ● What are the dynamic properties of these series? Own lagged coefficients ● How are these series interact, if at all? Cross-series lagged coefficients VAR Model Proposed 25
  • 24. VAR Model Estimation and Output 26
  • 25. VAR Model Output - Estimated Coefficients 27put your #assignedhashtag here by setting the footer in view-header/footer
  • 26. VAR Model Output - Var-Covar Matrix 28
  • 28. VAR Model Selection 30 Model selection, in the case of VAR(p), is the choice of the order and the specification of each equation Information criterion can be used for model selection:
  • 29. VAR Model - Inverse Transform 31 Don’t forget to inverse-transform the forecasted series!
  • 30. VAR Model - Forecast Using the Model 32 The Forecast Equation:
  • 31. VAR Model Forecast 33 where T is the last observation period and l is the lag
  • 32. What do the result mean in this context? 34 Don’t forget to put the result in the existing context!
  • 33. Section III.B • Time series forecasting problem formulation • Multivariate (vs. univariate) time series forecasting • Two (of the many) approaches to this problem: – Vector Autoregressive (VAR) Models – Long Short Term Memory (LSTM) Network • Formulation • Implementation • Comparison of the two approaches • A few words on Spark Implementation 35
  • 34. Standard feed-forward network 36 • network architecture does not have "memory" built in • inputs are processed independently • no “device” to keep the past information
  • 35. Vanilla Recurrent Neural Network (RNN) 37 A structure that can retain past information, track the state of the world, and update the state of the world as the network moves forward
  • 36. Limitation of Vanilla RNN Architecture 38 Exploding (and vanishing) gradient problems (Sepp Hochreiter, 1991 Diploma Thesis)
  • 37. Long Short Term Memory (LSTM) Network 39
  • 38. LSTM: Hochreiter and Schmidhuber (1997) 40 The architecture of memory cells and gate units from the original Hochreiter and Schmidhuber (1997) paper
  • 39. LSTM 41 Another representation of the architecture of memory cells and gate units: Greff, Srivastava, Koutnık, Steunebrink, Schmidhuber (2016)
  • 40. (Vanilla) LSTM: 1-minute Overview 42 Chris Colah’s blog gives an excellent introduction to vanilla RNN and LSTM: http://colah.github.io/posts/2015-08-Understanding-LSTMs/#fnref1 • Use memory cells and gated units to information flow LSTM Memory Cell ht-1 h t Training uses Backward Propagation Through Time (BPTT) memory cell (state) hidden state current input output gate forget gate input gate
  • 41. Implementation in Keras Some steps to highlight: • Formulate the series for a RNN supervised learning regression problem (i.e. (Define target and input tensors) • Scale all the series • Define the (initial) architecture of the LSTM Model ○ Define a network of layers that maps your inputs to your targets and the complexity of each layer (i.e. number of neurons) ○ Configure the learning process by picking a loss function, an optimizer, and metrics to monitor • Produce the forecasts and then reverse-scale the forecasted series • Calculate loss metrics (e.g. RMSE, MAE) 43 Note that stationarity, as defined previously, is not a requirement
  • 42. Consumer Sentiment and Housing Starts 44
  • 43. LSTM Architecture Design and Training - A Simple Illustration 45
  • 44. LSTM: Forecast Results (1) 46 Monthly series: ● Training set: 1981 Jan - 2016 Dec ● Test set: 2017 Jan to 2018 Apr
  • 45. LSTM: Forecast vs Observed Series 47 Housing Stars Consumer Sentiment
  • 46. Section IV • Time series forecasting problem formulation • Multivariate (vs. univariate) time series forecasting • Two (of the many) approaches to this problem: – Vector Autoregressive (VAR) Models – Long Short Term Memory (LSTM) Network • Formulation • Implementation • Comparison of the two approaches • A few words on Spark Implementation 48put your #assignedhashtag here by setting the footer in view-header/footer
  • 47. VAR vs. LSTM: Data Type macroeconomic time series, financial time series, business time series, and other numeric series 49 images, texts, all the numeric time series (that can be modeled by VAR) VAR LSTM
  • 48. layers of non-linear transformations of input series VAR vs. LSTM: Parametric form a linear system of equations - highly parameterized 50 VAR LSTM
  • 49. • not require stationarity VAR vs. LSTM: Stationarity Requirement • applied to stationary time series only • its variant (e.g. Vector Error Correction Model) can be applied to co-integrated series 51 VAR LSTM
  • 50. • data preprocessing is a lot more involved • model training and hyper-parameter tuning are time consuming VAR vs. LSTM: Model Implementation • data preprocessing is straight-forward • model training time is fast 52 VAR LSTM
  • 51. Section IV • Time series forecasting problem formulation • Multivariate (vs. univariate) time series forecasting • Two (of the many) approaches to this problem: – Vector Autoregressive (VAR) Models – Long Short Term Memory (LSTM) Network • Formulation • Implementation • Comparison of the two approaches • A few words on Spark Implementation 53put your #assignedhashtag here by setting the footer in view-header/footer
  • 52. Implementation using Spark 54 Straightforward to use Spark for all the steps (outlined above) up to model training, such as ... Ordering in the series matterWait … Not So Fast!
  • 53. Flint: A Time Series Library for Spark 55 Need time-series aware libraries whose operations preserve the natural order of the series, such as Flint, open sourced by TwoSigma (https://github.com/twosigma/flint) - can handle a variety of datafile types, manipulate and aggregate time series, and provide time-series-specific functions, such as (time-series) join, (time-series) filtering, time-based windowing, cycles, and intervalizing) Example: Defining a moving average of the beer consumption series
  • 54. A Few Words on TSA Workflow in Spark 56 Load, pre-process, analyze, transform series using time-series aware libraries Conduct Hyperparameter Tuning in a distributed model and use Spark ML Pipeline