Hybrid Linear and Non-Linear models for time series prediction - MarketPlace case study
Mohamed Baddar
Senior Data Scientist at Careem Networks GmbH
mohamed.baddar@careem.com
Agenda
• Problem Statement
• Notation
• Motivation
• Background
• Hybrid Model
• Marketplace Case Study: Supply Prediction for P2P Ride Sharing
• Model Pitfalls and Possible Improvements
• Questions
Problem Statement (Marketplace Objective)
• Customer-side objective: a reliable ride-sharing service
  • Reliability means that whenever a customer asks for a ride, a captain is available
• Captain-side objective: high utilization
  • A captain receives requests immediately after declaring himself “free”
• The core task for achieving these objectives is to predict supply (number of free captains) and demand (number of bookings) at each location and time instance
• If a significant gap between supply and demand is found, we can close it by increasing supply
  • One way is to apply surge to incentivize captains to move to the areas and hours where the gap is expected
• We need to be proactive: predict the problem and act before it happens
• Supply prediction can help significantly with this problem
Notation
Symbol   Meaning
Y        Supply
X        Surge (peak) types
T        Time patterns (trend, seasonality)
TS       Time series
ARIMA    Autoregressive Integrated Moving Average, a model for time series forecasting
NN       Neural network
NL       Non-linear model
L        Linear model
E        White-noise error
Motivation
Currently implemented algorithms
• TS forecasting (ES, i.e. exponential smoothing, and ARIMA): focuses on TS patterns; ARIMA with covariates assumes a linear relationship between X and Y
• Machine learning models (CART, NN): capture non-linearity between external factors and the predicted quantity, but do not focus on TS patterns
One possible solution? Hybridization
• A hybrid model that captures both the non-linearities between Y and X and the time series patterns T
• Inspired by how ARIMA with covariates is designed
Neural Networks
• Non-linear model: the output of each layer is a combination of the functions in the previous layers. Functions are categorized into propagation, activation, and output functions
• Parameters
  • Number of hidden layers
  • Number of neurons in each layer
• (+) Captures complex non-linear patterns
• (-) Slow training, not interpretable
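As a rough illustration of the propagation, activation, and output functions named above, here is a minimal forward pass through a network with one hidden layer. The weights, the sigmoid activation, and the identity output function are assumptions made for this sketch; the slides do not specify them.

```python
import math

def sigmoid(z):
    """Activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def propagate(weights, bias, inputs):
    """Propagation function: weighted sum of the inputs plus a bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def forward(x, hidden_layer, output_neuron):
    """One forward pass through a network with a single hidden layer.

    hidden_layer is a list of (weights, bias) pairs, one per hidden neuron;
    output_neuron is a single (weights, bias) pair.
    """
    hidden = [sigmoid(propagate(w, b, x)) for w, b in hidden_layer]
    w_out, b_out = output_neuron
    # Identity output function (an assumption, common for regression targets).
    return propagate(w_out, b_out, hidden)

# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output.
hidden_layer = [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)]
output_neuron = ([2.0, -1.0], 0.5)
y = forward([1.0, 1.0], hidden_layer, output_neuron)
```

Training (for example by backpropagation) would adjust the weights and biases; the number of hidden layers and neurons per layer are fixed beforehand.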
ARIMA
ARMA(p, q)
• The quantity is modeled as a linear function of its previous values and previous fit errors
• Data must be stationary (mean and variance do not change over time)
• More complex models are used to capture seasonality
• Can be combined with regression to capture the effect of external factors on the time series
• (+) Captures time series patterns: ARMA structure and seasonality
• (-) Assumes linear relationships
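To make "a linear function of previous values and fit errors" concrete, the sketch below computes a one-step-ahead ARMA(p, q) prediction from given coefficients. The function name, the coefficients, and the history values are hypothetical; estimating the coefficients is a separate step not shown here.

```python
def arma_predict(y, e, phi, theta, c=0.0):
    """One-step-ahead ARMA(p, q) prediction.

    y     : past observations, most recent last (at least p values)
    e     : past fit errors, most recent last (at least q values)
    phi   : AR coefficients [phi_1, ..., phi_p]
    theta : MA coefficients [theta_1, ..., theta_q]
    c     : constant term
    """
    ar = sum(phi[i] * y[-1 - i] for i in range(len(phi)))
    ma = sum(theta[j] * e[-1 - j] for j in range(len(theta)))
    return c + ar + ma

# Hypothetical ARMA(2, 1) coefficients and history.
y_hist = [10.0, 12.0, 11.0]
e_hist = [0.5, -1.0, 0.2]
y_next = arma_predict(y_hist, e_hist, phi=[0.6, 0.3], theta=[0.5], c=1.0)
```

Every term is a fixed coefficient times a past value or past error, which is why the model cannot represent non-linear effects on its own.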
Stationarity and differencing
• Data is not (weakly) stationary if its mean or variance varies over time
• Remedy: differencing
[Figure: three panels. 1 - Input data; 2 - Log transformation; 3 - Seasonal and lag differencing]
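The three steps in the figure (input data, log transformation, seasonal and lag differencing) can be sketched as follows. The function names and the toy series are made up for illustration, and the log step assumes strictly positive values.

```python
import math

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def stationarize(series, season=None):
    """Log-transform, then optional seasonal differencing, then lag-1 differencing."""
    out = [math.log(v) for v in series]   # requires strictly positive values
    if season:
        out = difference(out, lag=season)
    return difference(out, lag=1)

# Toy series: multiplicative growth with a period-2 pattern (made-up numbers).
raw = [10.0, 20.0, 20.0, 40.0, 40.0, 80.0, 80.0, 160.0]
z = stationarize(raw, season=2)
```

Each differencing pass shortens the series by its lag, so the toy series of 8 points ends up with 5 values, all close to zero because the growth and the seasonal pattern have been removed.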
Hybrid Model (PoC)
Y = NL(X) + L(X) + E
[Diagram labels: Y, X; T-D; NL (NN); L (ARIMA); NL.Fitted; L.Fitted; E]
* T-D: transformation and differencing
* Applied to both Y and X to preserve interpretability
Hybrid Model Building (PoC)
Transform and stationarize (Y, X) via log transformation and differencing, if necessary
NL_M = NULL   // non-linear model
L_M = NULL    // linear model
RMSE = Inf
L = 0's       // assume the initial linear component is all zeros
while (iterations < max iterations AND delta(RMSE) > threshold)
    NL = Y - L                            // target for the non-linear model
    NL_M (NL ~ X) <- build NN from NL data
    K = Y - NL_M_fitted                   // remainder from NL_M
    L_M = build ARMA(p, q) given K
    // If L_M is not the null model (i.e. not p = q = 0), hybridization was actually needed
    Y.fitted = NL_M_fitted + L_M_fitted   // assuming E is zero-mean white noise
    E = Y - Y.fitted
    sanity check => E is white noise
    // updating phase
    RMSE = RMSE_calc(E)
    calculate delta(RMSE)
    L = L_M_fitted
* RMSE convergence means the NL and L models have become stable
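The alternating loop above can be sketched in runnable form. This is only an illustration of the control flow under loud assumptions: the NN is replaced by a per-level mean of a categorical X, and ARMA(p, q) is replaced by a least-squares AR(1) fit without intercept. Neither stand-in is the model used in the PoC, and the toy data is synthetic.

```python
import math
import random

def fit_level_means(x, target):
    """Stand-in for the NN (assumption): predict the mean target per level of x."""
    groups = {}
    for xi, ti in zip(x, target):
        groups.setdefault(xi, []).append(ti)
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    return [means[xi] for xi in x]

def fit_ar1(k):
    """Stand-in for ARMA(p, q) (assumption): least-squares AR(1), no intercept."""
    num = sum(k[t] * k[t - 1] for t in range(1, len(k)))
    den = sum(v * v for v in k[:-1]) or 1.0
    phi = num / den
    return phi, [0.0] + [phi * k[t - 1] for t in range(1, len(k))]

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def fit_hybrid(y, x, max_iter=3, threshold=1e-6):
    lin = [0.0] * len(y)                                   # L = 0's
    prev, history = math.inf, []
    for _ in range(max_iter):
        nl_target = [yi - li for yi, li in zip(y, lin)]    # NL = Y - L
        nl_fit = fit_level_means(x, nl_target)             # build NL model
        k = [yi - ni for yi, ni in zip(y, nl_fit)]         # K = Y - NL fitted
        _, lin_fit = fit_ar1(k)                            # build L model on K
        e = [yi - ni - li for yi, ni, li in zip(y, nl_fit, lin_fit)]
        history.append(rmse(e))
        if abs(prev - history[-1]) <= threshold:           # delta(RMSE) small
            break
        prev, lin = history[-1], lin_fit                   # L = L fitted
    return nl_fit, lin_fit, history

# Synthetic data: a level effect per value of x plus AR(1) noise.
random.seed(0)
x = [t % 4 for t in range(200)]
noise, y = 0.0, []
for xi in x:
    noise = 0.8 * noise + random.gauss(0.0, 1.0)
    y.append([5.0, 9.0, 2.0, 7.0][xi] + noise)
nl_fit, lin_fit, history = fit_hybrid(y, x)
```

`history` holds the RMSE after each iteration, so the stopping rule on delta(RMSE) can be inspected directly. With real models, the stand-in fitters would be swapped for an NN and an ARMA routine with the same input and output shapes.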
Marketplace Case Study (1)
Marketplace Case Study (3)
Data Description
• Data is partitioned by zone (for example, Berlin Mitte or Dubai Al Barsha)
• For each zone, data is aggregated at a time granularity level (hour, 15 min, or 30 min)
• A time series is created for the supply level of each zone over these time windows
• The hybrid model is applied to model the relationship of supply with time and surge, and to predict future values under different surge values. It works like a what-if analysis tool
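A minimal sketch of the aggregation step described above: raw observations are bucketed per zone into fixed time windows and averaged. The record layout `(zone, timestamp, free_captains)`, the function names, and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def window_start(ts, minutes=15):
    """Floor a timestamp to the start of its window (minutes must divide 60)."""
    floored = ts.minute - ts.minute % minutes
    return ts.replace(minute=floored, second=0, microsecond=0)

def aggregate_supply(observations, minutes=15):
    """Average the observed free-captain counts per (zone, window).

    observations: iterable of (zone, timestamp, free_captains) tuples,
    a hypothetical record layout chosen for this sketch.
    """
    buckets = defaultdict(list)
    for zone, ts, free_captains in observations:
        buckets[(zone, window_start(ts, minutes))].append(free_captains)
    return {key: sum(v) / len(v) for key, v in sorted(buckets.items())}

# Made-up observations for two zones.
obs = [
    ("Berlin Mitte", datetime(2017, 5, 1, 8, 3), 12),
    ("Berlin Mitte", datetime(2017, 5, 1, 8, 11), 16),
    ("Berlin Mitte", datetime(2017, 5, 1, 8, 17), 9),
    ("Dubai Al Barsha", datetime(2017, 5, 1, 8, 5), 30),
]
series = aggregate_supply(obs)
```

The result is keyed by zone and window start, so each zone's values, read in key order, form the supply time series for that zone.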
Model implementation
• If needed, seasonal differencing (frequency = 4*24) and then lag-1 differencing
• A neural network trained with backpropagation is used as the non-linear model
Y <- NN(Dow, Hour, Minute, Surge_1, Surge_2)
Y => (average) number of captains in the zone and time window
Dow => day of week
Hour => factor with 24 levels
Minute => factor with 4 levels: 0, 15, 30, 45
Surge_1, Surge_2 => different types of surge (peak)
• Number of hidden layers = 1, with 10 neurons
• ARMA is used as the linear model, with max p, q = 5
• The model is applied to each zone in each city
• Maximum number of hybrid model iterations = 3
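One way to turn the factors listed above into numeric NN inputs is indicator (one-hot) encoding, sketched below. Treating Dow as a 7-level factor and passing the surge values through as plain numbers are assumptions; the slides do not say how these inputs were encoded.

```python
from datetime import datetime

def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at `index`."""
    return [1 if i == index else 0 for i in range(size)]

def encode_features(ts, surge_1, surge_2):
    """Build the NN input vector for one time window.

    Dow    -> 7 indicator columns (Monday = 0), an assumed encoding
    Hour   -> 24 indicator columns
    Minute -> 4 indicator columns for 0, 15, 30, 45
    Surge_1, Surge_2 are passed through as numeric values (assumption).
    """
    return (one_hot(ts.weekday(), 7)
            + one_hot(ts.hour, 24)
            + one_hot(ts.minute // 15, 4)
            + [surge_1, surge_2])

# 4 windows per hour times 24 hours: the seasonal frequency for 15-minute data.
SEASONAL_PERIOD = 4 * 24

features = encode_features(datetime(2017, 5, 1, 8, 30), surge_1=1.2, surge_2=0.0)
```

The resulting vector has 7 + 24 + 4 + 2 = 37 entries per time window.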
Model Accuracy and diagnostics
In-sample data performance
• E = Y - Y.fitted
Y.fitted = NN.fitted + L.fitted
• NN built with 1 hidden layer of 10 neurons
• Sample ARMA model for one of the datasets: ARMA(4,2)
• Accuracy on 5 zones:
  • Average RMSE for NN only = 39
  • Average RMSE for NN + ARIMA = 32
  • Improvement = approx. 18%
[Figure: diagnostic plot labeled "White noise"]
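The reported improvement follows from the two average RMSE values: (39 - 32) / 39 is roughly 0.18. The sketch below shows that arithmetic, together with an RMSE helper and a sample autocorrelation that could serve as a simple white-noise check on E. The function names are hypothetical, and the autocorrelation check is one common choice rather than the diagnostic used in the slides.

```python
import math

def rmse(actual, fitted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual))

def relative_improvement(baseline, improved):
    """Fractional reduction of an error metric relative to the baseline."""
    return (baseline - improved) / baseline

def autocorrelation(e, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    mean = sum(e) / len(e)
    var = sum((v - mean) ** 2 for v in e)
    cov = sum((e[t] - mean) * (e[t - lag] - mean) for t in range(lag, len(e)))
    return cov / var

# Averages reported on the slide: NN only = 39, NN + ARIMA = 32.
gain = relative_improvement(39, 32)   # about 0.18
```

For white-noise residuals of length n, sample autocorrelations at non-zero lags are expected to stay mostly within about ±1.96/sqrt(n); values well outside that band suggest remaining AR or MA structure.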
Remaining Work after first PoC
• On the algorithmic side
  • Formal verification of the hybridization method
  • Experimenting with other NL models (CART, GBM, RF) and other L models for TS (ARIMAX, transfer functions)
  • Further analysis of algorithm convergence
  • Exploring modifications of the core NN optimization to adapt to AR and MA patterns in the error
• On implementation
  • With the R neuralnet package, the NN sometimes fails to build because the training algorithm does not converge
  • Scaling the method to more zones and experimenting on more datasets
• On accuracy measures
  • Cross-validation for the NN and rolling origin for ARIMA (as a kind of unit testing)
  • Cross-validation and rolling origin for the hybrid model
This is still WIP!
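Rolling-origin evaluation, mentioned under accuracy measures, can be sketched as a generator of train and test index ranges. The function name and its parameters are hypothetical; this variant uses an expanding training window, which is one of several common choices.

```python
def rolling_origin_splits(n, initial, horizon=1, step=1):
    """Yield (train_indices, test_indices) pairs for rolling-origin evaluation.

    The training window always starts at 0 and grows by `step` each round;
    the test window is the `horizon` observations right after it.
    """
    origin = initial
    while origin + horizon <= n:
        yield range(0, origin), range(origin, origin + horizon)
        origin += step

splits = [(list(tr), list(te))
          for tr, te in rolling_origin_splits(n=6, initial=3, horizon=2)]
```

Unlike shuffled cross-validation, every test window lies strictly after its training window, so the model is never evaluated on data older than what it was fitted on.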
Feedback from audience
• Apply LSTM RNN for time series prediction
• More complex NN
• More complex TS models
• Estimator variance and stability; multiple initial models to avoid local optima
• Revisit algorithm convergence
Questions
We are hiring!
Shukran! Thank you! Danke Schön
