National Taipei University of Nursing and Health Sciences (NTUNHS)
Forecasting Model
Orozco Hsu
2021-05-20
1
About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data/ ML/ AIOT/ AI Columnist
2
"How can you not get romantic about baseball?"
Tutorial
Content
3
What is forecasting?
Random Walk for Time-Series
Hidden Markov Model for stock prediction
Multi-variables or Periodic attributes Time-Series prediction (WEKA)
Homework
Code
• Download code
• https://github.com/orozcohsu/ntunhs_2020/tree/master/alg_20210520
4
What is forecasting?
5
What is forecasting?
• Key Terms and differences
Forecasting
Predictive Modeling
6
What is forecasting?
• Forecasting is a process of predicting or estimating the future based
on past and present data.
• Examples:
• How many passengers can we expect in a given flight?
• How many customer calls can we expect in next hour? (Poisson distribution)
• Weather forecasting
• Stock market forecasting
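The customer-call example above mentions the Poisson distribution. As a minimal sketch, assuming a hypothetical average of 5 calls per hour (a value picked only for illustration, not taken from the slides), the probability of seeing a given number of calls in the next hour could be computed like this:

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k events when the average count per interval is `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


# Hypothetical average number of customer calls per hour (an assumption).
RATE_PER_HOUR = 5.0

p_exactly_3 = poisson_pmf(3, RATE_PER_HOUR)
p_at_most_3 = sum(poisson_pmf(k, RATE_PER_HOUR) for k in range(4))

print(f"P(exactly 3 calls in the next hour) = {p_exactly_3:.4f}")
print(f"P(at most 3 calls in the next hour) = {p_at_most_3:.4f}")
```

In practice the rate would be estimated from past call logs, which is what makes this a forecast based on past and present data.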
7
What is forecasting?
• Passenger count forecasting for the next year
Time series
8
What is forecasting?
• Predictive modeling is used to make more granular predictions.
• Complex variable test
• Data sampling
• Feature selection
• Data visualization
• Examples:
• Which customers are likely to buy a product next month?
• Then take action accordingly.
9
What is forecasting?
• Some useful tools
• Moving average forecasting (MA)
• Exponential smoothing forecasting (Simple Exponential Smoothing, SES; Holt-Winters seasonal method)
• Stepwise Autoregressive forecasting
• Autoregressive Integrated Moving Average model forecasting (ARIMA)
• Autoregressive conditional heteroskedasticity forecasting (ARCH)
• Value at Risk
• Hidden Markov Model forecasting (HMM)
• Multi-variables or Periodic attributes Time-Series forecasting
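The first two tools in the list are simple enough to sketch by hand. The following is a minimal illustration, not a reference implementation: the function names, the window of 3, the smoothing factor of 0.5, and the passenger counts are all assumptions made for the example.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


def ses_forecast(series, alpha):
    """Simple exponential smoothing: level = alpha * x + (1 - alpha) * level."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # start the level at the first observation
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level  # the last level is the one-step-ahead forecast


# Illustrative monthly passenger counts (hypothetical values).
passengers = [112, 118, 132, 129, 121, 135]

ma_next = moving_average_forecast(passengers, window=3)
ses_next = ses_forecast(passengers, alpha=0.5)

print(f"Moving average forecast: {ma_next:.2f}")
print(f"SES forecast:            {ses_next:.2f}")
```

A larger window or a smaller alpha gives a smoother forecast that reacts more slowly to recent changes.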
10
What is forecasting?
• Random or non-random series
• Yes, it’s a random series, so we cannot forecast anything from it.
• No, it’s a non-random series, but it is a chaotic time series. The purpose of
chaos theory is to reveal the simple laws that may be hidden behind
seemingly random series.
11
It’s a real mess. It looks nothing like a regular time series.
It is also NOT a random walk series.
Chaotic time series: it looks like a random series, but it is
generated by a specific underlying function. Refer to chaotic
time series analysis.
Question: If a series of random numbers is generated by a computer (from a seed), is it RANDOM or NON-RANDOM?
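One way to explore the question above is to draw numbers twice from the same seed. This short sketch uses Python's standard `random` module; the seed 2021 and the helper name `draw` are arbitrary choices for the illustration.

```python
import random


def draw(seed, n=5):
    """Draw n dice rolls from a generator initialised with `seed`."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n)]


first = draw(2021)
second = draw(2021)

# The same seed reproduces the same "random" sequence.
print(first == second)  # True
```

The numbers look random, yet they are fully determined by the seed, which is the same situation as a series generated by a specific function behind the scenes.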
What is forecasting?
• Stationary or non-stationary series
• We must assume that our observations all come from the same
distribution function; a series with this property is called stationary.
• This means that the mean and the variance (standard deviation) of the time
series do NOT change over time, so the series has a certain regularity.
• Time series data usually become stationary once seasonality and trend are removed.
• From this point of view, we discuss correlation.
• From this point of view, we discuss correlation.
12
What is forecasting?
• Correlation
• Time series data can be analyzed by comparing each value with the few values
before it, that is, by measuring the relationship between Xt and Xt-1, Xt-2, …, Xt-n.
• The autocorrelation function (ACF) indicates the degree of correlation between
the series and its own past values. It can be used to determine whether the
series is stationary, seasonal, or periodic.
• ACF values lie in [-1, 1]. 1: perfect positive correlation, -1: perfect negative
correlation, 0: no correlation.
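To make the ACF concrete, here is a minimal sketch of the usual sample autocorrelation at a given lag, written in plain Python. The function name and the small seasonal series (period 4) are assumptions for the illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation between the series and itself shifted by `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


# Hypothetical series that repeats every 4 steps.
seasonal = [1, 2, 3, 4] * 6

acf = [autocorrelation(seasonal, k) for k in range(9)]
for lag, value in enumerate(acf):
    print(f"lag {lag}: {value:+.3f}")
```

For this series the ACF peaks again at lags 4 and 8, which is how a periodic pattern shows up in an ACF plot.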
13
The ACF values for lag 0 to lag 24 all stay above the statistical significance
threshold (the "fixed value" line in the plot), which means the series is non-stationary.
Random Walk for Time-Series
14
Random Walk for Time-Series
• If the autocorrelation of the step-to-step changes (the differences) is close to zero, the series is a random walk.
• A random walk is unpredictable; it cannot reasonably be predicted.
• Why?
• Because the next time step is a function of the prior time step plus a random move,
the best we can do is predict the previous observation.
• This is called a naive forecast, or a persistence model.
• Many time series are random walks, particularly those of security prices
over time.
• The random walk hypothesis is a theory that stock market prices are a
random walk and cannot be predicted.
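The naive forecast (persistence model) mentioned above simply predicts that the next value equals the current one. A minimal sketch, with hypothetical prices and helper names chosen for the example:

```python
def persistence_forecast(series):
    """Predict each value as the observation one step before it."""
    return series[:-1]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical closing prices that move by one unit per step.
prices = [100, 101, 100, 99, 100, 101, 102]

predicted = persistence_forecast(prices)  # forecasts for prices[1:]
actual = prices[1:]
mae = mean_absolute_error(actual, predicted)

print(f"Persistence model MAE: {mae}")
```

For a true random walk no model is expected to beat this baseline, so it is a useful reference when judging any other forecasting model.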
15
Random Walk for Time-Series
• A random walk is different from a list of random numbers because the
next value in the sequence is a modification of the previous value in
the sequence.
• Start with a random number of either -1 or 1.
• Randomly select a -1 or 1 and add it to the observation from the previous
time step.
• Repeat the previous step for as long as you like.
• The series moves in small changes from step to step, rather than the large
jumps that a series of independent, random numbers provides.
y(t) = B0 + B1*X(t-1) + e(t)
• y(t): the next value in the series
• B0: coefficient
• B1: coefficient to weight the observation at the previous time step
• X(t-1): observation at the previous time step
• e(t): white noise/ random function
16
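The construction steps for a random walk can be sketched in a few lines. In terms of the equation above, this corresponds to B0 = 0 and B1 = 1, with e(t) drawn from {-1, 1}. The function name, the length of 1000, and the seed are assumptions for the example.

```python
import random


def random_walk(n_steps, seed=None):
    """Build a random walk of n_steps values from +1/-1 moves."""
    rng = random.Random(seed)
    walk = [rng.choice([-1, 1])]  # start with either -1 or 1
    for _ in range(n_steps - 1):
        # add a random -1 or 1 to the observation from the previous time step
        walk.append(walk[-1] + rng.choice([-1, 1]))
    return walk


walk = random_walk(1000, seed=1)
steps = [b - a for a, b in zip(walk, walk[1:])]

print(walk[:10])
print("every move is one unit:", all(abs(s) == 1 for s in steps))
```

Because each value is the previous value plus a small move, the series drifts smoothly instead of jumping around like independent random numbers.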
Random Walk for Time-Series
• Random Walk and Autocorrelation
• Calculate the correlation between each observation and the observations at
previous time steps.
• Because of the way a random walk is constructed, we expect a strong autocorrelation
with the previous observation and a linear fall-off from there with earlier lag values.
17
Lag
Because of the way a random walk is constructed, the autocorrelation
is high at the first lags. (The current observation is a
random step from the previous observation.)
Random Walk for Time-Series
• Random Walk and Stationarity (ADF)
• A stationary time series is one where the values are NOT a function of time.
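• Use the Augmented Dickey-Fuller (ADF) test to check for stationarity: its null hypothesis is that
the series is non-stationary (has a unit root), and a small p-value rejects that hypothesis.
• Autocorrelation ranges over [-1, 1]: values near 1 or -1 mean the strongest positive or negative
autocorrelation; 0 means no autocorrelation.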
• Predicting a Random Walk
• A random walk is unpredictable; it cannot reasonably be predicted.
• Making a random walk time series stationary
• Take the first difference (lag 1).
• After differencing, all correlations are small, close to zero.
random_walk_demo.ipynb
predicting_a_random_walk.ipynb
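The effect of differencing can be checked without any library. This is a minimal sketch and not the content of the notebooks listed above: it builds a random walk, takes the first difference, and compares the lag-1 autocorrelation before and after. The seed and the series length are arbitrary assumptions.

```python
import random


def lag1_autocorrelation(series):
    """Sample autocorrelation between the series and itself shifted by one step."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    return numer / denom


rng = random.Random(7)
walk = [0]
for _ in range(999):
    walk.append(walk[-1] + rng.choice([-1, 1]))

# First difference: the change from one time step to the next.
diffs = [b - a for a, b in zip(walk, walk[1:])]

r_walk = lag1_autocorrelation(walk)
r_diff = lag1_autocorrelation(diffs)

print(f"lag-1 autocorrelation of the walk:        {r_walk:.3f}")
print(f"lag-1 autocorrelation of the differences: {r_diff:.3f}")
```

The walk itself is strongly autocorrelated, while the differenced series behaves like white noise, so there is no structure left for a model to learn.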
18
Hidden Markov Model for stock prediction
19
Hidden Markov Model for stock prediction
• An HMM describes a hidden process with unknown parameters.
• The most difficult part is determining the unknown parameters from the observations.
20
Hidden Markov Model for stock prediction
• Definition
• Roll the dice 10 times; the resulting series is [1 6 3 5 2 7 3 5 2 4] => "Visible status chain"
• The invisible status chain is the sequence of dice behind it, for example [D6 D8 D8 D6 D4 D8 D6 D6 D4 D8]
• Visible status
• Status emission probability
• Invisible status
• Status transition probability; together these form an HMM
21
Human: Status chain
Computer: Transition probability
Hidden Markov Model for stock prediction
• Question
• Known: the types of dice and the transition probabilities
• From the visible results, find which die produced each roll.
• Known: the types of dice and the transition probabilities
• From the visible results, find the probability of the dice producing them.
• Known: only the types of dice
• From the visible results, find the transition probabilities.
22
Hidden Markov Model for stock prediction
• Find the probability
• P = P(D6)*P(D6 to 1)*P(D6 to D8)*P(D8 to 6)*P(D8 to D8)*P(D8 to 3)
• => 1/3 * 1/6 * 1/3 * 1/8 * 1/3 * 1/8
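The product above can be reproduced with exact fractions. This sketch assumes what the numbers on the slide imply: fair dice, each die equally likely to be picked first (1/3), and each die equally likely to follow any other (1/3). The function names are made up for the example.

```python
from fractions import Fraction

FACES = {"D4": 4, "D6": 6, "D8": 8}
START = Fraction(1, 3)       # assumed: each die equally likely to be picked first
TRANSITION = Fraction(1, 3)  # assumed: each die equally likely to come next


def emission(die, roll):
    """Probability that a fair `die` shows `roll`."""
    faces = FACES[die]
    return Fraction(1, faces) if 1 <= roll <= faces else Fraction(0)


def joint_probability(dice, rolls):
    """Probability of one specific hidden dice chain together with its visible rolls."""
    prob = START * emission(dice[0], rolls[0])
    for die, roll in zip(dice[1:], rolls[1:]):
        prob *= TRANSITION * emission(die, roll)
    return prob


p = joint_probability(["D6", "D8", "D8"], [1, 6, 3])
print(p)  # 1/10368
```

Using `Fraction` keeps the result exact, which makes it easy to compare with the hand calculation.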
23
Hidden Markov Model for stock prediction
• Find the probability
• Visible state are in series [1 6 3 5 2 7 3 5 2 4]
24
P1(D4) = P(D4)*P(D4 to 1) => 1/3 * 1/4
P2(D6) = P1(D4)*P(D4 to D6)*P(D6 to 6) => 1/3 * 1/4 * 1/3 * 1/6
P3(D4) = P2(D6)*P(D6 to D4)*P(D4 to 3) => 1/216 * 1/3 * 1/4
A roll of 1 is most likely to come from die D4 (probability 1/4)
We have three types of dice
• D4
• D6
• D8
Hidden Markov Model for stock prediction
• Sum all of probabilities
25
We have 3 types of dice. If we randomly pick one die and roll it,
the probability of getting 1 point is about 18%.
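The 18% figure, and the "sum all of probabilities" idea more generally, can be sketched with the forward algorithm. As before, this assumes fair dice with uniform start and transition probabilities of 1/3; the names are illustrative.

```python
from fractions import Fraction

FACES = {"D4": 4, "D6": 6, "D8": 8}
DICE = list(FACES)
START = {d: Fraction(1, 3) for d in DICE}                             # assumed uniform
TRANSITION = {a: {b: Fraction(1, 3) for b in DICE} for a in DICE}     # assumed uniform


def emission(die, roll):
    """Probability that a fair `die` shows `roll`."""
    faces = FACES[die]
    return Fraction(1, faces) if 1 <= roll <= faces else Fraction(0)


def forward_probability(rolls):
    """Probability of the visible rolls, summed over every possible hidden dice chain."""
    alpha = {d: START[d] * emission(d, rolls[0]) for d in DICE}
    for roll in rolls[1:]:
        alpha = {
            d: sum(alpha[prev] * TRANSITION[prev][d] for prev in DICE) * emission(d, roll)
            for d in DICE
        }
    return sum(alpha.values())


p_one = forward_probability([1])
print(p_one, float(p_one))  # 13/72, about 0.18
```

The same function handles longer sequences, for example `forward_probability([1, 6])`, without enumerating every dice combination.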
Hidden Markov Model for stock prediction
• Sum all of probabilities
26
Hidden Markov Model for stock prediction
• Sum all of probabilities
27
Hidden Markov Model for stock prediction
• How to find the most probable invisible state chain
• Brute force (compute the probability of every combination, as on p21~p23)
• Viterbi algorithm (build the hidden status chain from the visible states)
28
x: hidden status
y: visible status
a: transition probabilities
b: emission probabilities
Dice: X2, X1, X3
Homework1: Continue with the visible series of dice [1 6 3 5 2 7 3 5 2 4] and compute P4 ~ P10
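The Viterbi algorithm named above can be sketched as follows. This is an illustrative version under the same assumptions as the worked example (fair dice, uniform start and transition probabilities of 1/3), not the code from the course repository. It is run here only on the first three rolls, which matches the P1 to P3 calculation.

```python
from fractions import Fraction

FACES = {"D4": 4, "D6": 6, "D8": 8}
DICE = list(FACES)
START = {d: Fraction(1, 3) for d in DICE}                             # assumed uniform
TRANSITION = {a: {b: Fraction(1, 3) for b in DICE} for a in DICE}     # assumed uniform


def emission(die, roll):
    """Probability that a fair `die` shows `roll`."""
    faces = FACES[die]
    return Fraction(1, faces) if 1 <= roll <= faces else Fraction(0)


def viterbi(rolls):
    """Return the most probable hidden dice chain and its probability."""
    # best[d]: probability of the most likely chain so far that ends in die d
    best = {d: START[d] * emission(d, rolls[0]) for d in DICE}
    back = []  # one dict per later step: best predecessor for each die
    for roll in rolls[1:]:
        pointers, new_best = {}, {}
        for d in DICE:
            prev = max(DICE, key=lambda p: best[p] * TRANSITION[p][d])
            pointers[d] = prev
            new_best[d] = best[prev] * TRANSITION[prev][d] * emission(d, roll)
        back.append(pointers)
        best = new_best
    # Trace the best chain backwards from the most probable final die.
    last = max(DICE, key=lambda d: best[d])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


path, prob = viterbi([1, 6, 3])
print(path, prob)  # ['D4', 'D6', 'D4'] 1/2592
```

Unlike brute force, which grows exponentially with the number of rolls, this keeps only one best chain per die at each step.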
Hidden Markov Model for stock prediction
• FB stock market prediction (using the Markov property, not a pure time series)
• Download historical data from
• https://www.nasdaq.com/market-activity/stocks/fb/historical
29
hmm_makrtket_stock_prediction.ipynb
The stock data we use is not treated as random walk time series data; check the code!
Hidden Markov Model for stock prediction
• Extract feature
30
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
31
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• WEKA
• The workbench for machine learning.
• It is widely used for teaching, research, and industrial applications, contains a
plethora of built-in tools for standard machine learning tasks, and additionally
gives transparent access to well-known toolboxes such as scikit-learn and R.
• WEKA Installation and import packages
• Demo
https://www.cs.waikato.ac.nz/ml/weka/
https://sourceforge.net/projects/weka/
32
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
33
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
34
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• Sales order prediction
Observation
Overlay data
Row data remove or not
• WEKA requires
• Input data format must be ARFF
• You need to convert your CSV
• Convert to WEKA ARFF format
• https://pulipulichen.github.io/jieba-js/weka/spreadsheet2arff/index.html
• Download the ARFF files
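As an alternative to the online converter, a small CSV can be turned into ARFF with a few lines of Python. This is a minimal sketch under stated assumptions: the first column holds dates in `yyyy-MM-dd` form, every other column is numeric, and the column names and values below are hypothetical rather than the course data.

```python
import csv
import io


def csv_to_arff(csv_text, relation, date_format="yyyy-MM-dd"):
    """Convert CSV text to ARFF text.

    Assumes the first column is a date and all remaining columns are numeric.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]

    lines = [f"@relation {relation}", ""]
    lines.append(f'@attribute {header[0]} date "{date_format}"')
    for name in header[1:]:
        lines.append(f"@attribute {name} numeric")
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Hypothetical two-row sales file.
sample_csv = "order_date,orders\n2021-01-01,120\n2021-01-02,135\n"
arff_text = csv_to_arff(sample_csv, "sales_order")
print(arff_text)
```

Values containing spaces or commas would need quoting, which this sketch does not handle.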
35
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• Load Data
• Find the ARFF file
36
4 columns
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• Basic configuration
• Number of time units to forecast = 25
• Periodicity = Daily
• Copy the content from skip_list-sales_order_1_8_p9.txt
and paste it into the Skip list field
• Check Perform evaluation
37
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• Base learner
• LinearRegression
• MultilayerPerceptron
• SMOreg (SVM)
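The base learners above are ordinary regression algorithms. One common way to apply a regression algorithm to a time series is to use lagged values as input features. The sketch below illustrates that idea with a single lag and a hand-written least-squares fit; it is not WEKA's implementation, and the order counts are hypothetical.

```python
def fit_lag1_regression(series):
    """Least-squares fit of y(t) = b0 + b1 * y(t-1)."""
    x = series[:-1]  # lagged values (inputs)
    y = series[1:]   # next values (targets)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def forecast(series, b0, b1, steps):
    """Iterate the fitted model forward, feeding each prediction back in."""
    last = series[-1]
    predictions = []
    for _ in range(steps):
        last = b0 + b1 * last
        predictions.append(last)
    return predictions


# Hypothetical daily order counts that grow by 2 each day.
orders = [10, 12, 14, 16, 18, 20]

b0, b1 = fit_lag1_regression(orders)
future = forecast(orders, b0, b1, steps=3)
print(b0, b1, future)
```

Swapping the base learner changes only how the relationship between lagged inputs and the next value is fitted; the lagging step stays the same.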
38
Homework2: Continue to improve
performance in WEKA sales order
prediction
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• Click "periodic attributes"
• Check Customize
• Load the file periodics_set-sales_order_1_8_p9.periodics
39
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• Output
• Check Graph target at steps
• Press Start button
40
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• Output
Predictions are marked with an asterisk
41
Using a Linear Regression model with those variables
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• Output/ Prediction at steps
42
NOT so good. Is it a good idea to use a prediction model to
forecast future values?
Multi-variables or Periodic attributes Time-
Series prediction (WEKA)
• Output/ Future prediction
43
Homework
44
Homework
• Calculate the invisible states from visible series of dice [1 6 3 5 2 7 3 5
2 4]
• Continue to improve performance in WEKA sales order prediction
• Change algorithms
• Add more observation data
• See what happens if we don’t use Periodic attributes or Overlay data
45

Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

國立臺北護理健康大學 NTUHS Forecasting Model Analysis

  • 2. About me • Education • NCU (MIS), NCCU (CS) • Work Experience • Telecom big data innovation • AI projects • Retail marketing technology • User Group • TW Spark User Group • TW Hadoop User Group • Taiwan Data Engineer Association Director • Research • Big Data / ML / AIoT / AI columnist • "How can you not get romantic about baseball?"
  • 3. Tutorial Content • What is the forecasting • Random Walk for Time-Series • Hidden Markov Model for stock prediction • Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Homework
  • 4. Code • Download code • https://github.com/orozcohsu/ntunhs_2020/tree/master/alg_20210520
  • 5. What is the forecasting
  • 6. What is the forecasting • Key terms and differences: Forecasting vs. Predictive Modeling
  • 7. What is the forecasting • Forecasting is the process of predicting or estimating the future based on past and present data. • Examples: • How many passengers can we expect on a given flight? • How many customer calls can we expect in the next hour? (Poisson distribution) • Weather forecasting • Stock market forecasting
  • 8. What is the forecasting • Passenger count forecasting for the next year (time series)
  • 9. What is the forecasting • Predictive modeling is used to make more granular predictions. • Complex variable test • Data sampling • Feature selection • Data visualization • Examples: • Which customers are likely to buy a product next month? • Then take action accordingly.
  • 10. What is the forecasting • Some useful tools • Moving average forecasting (MA) • Exponential smoothing forecasting (Simple Exponential Smoothing, SES; Holt-Winters seasonal method) • Stepwise Autoregressive forecasting • Autoregressive Integrated Moving Average model forecasting (ARIMA) • Autoregressive conditional heteroskedasticity forecasting (ARCH) • Value at Risk • Hidden Markov Model forecasting (HMM) • Multi-variables or Periodic attributes Time-Series forecasting
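The first two tools in the list can be sketched in a few lines. This is a minimal illustration, not the course code: the function names, the `window` and `alpha` parameters, and the passenger counts are all made up for the example.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


def ses_forecast(series, alpha):
    """Simple exponential smoothing: level = alpha * x + (1 - alpha) * level.

    The final level is used as the one-step-ahead forecast.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


# Hypothetical monthly passenger counts, for illustration only.
passengers = [112, 118, 132, 129, 121, 135]
print(moving_average_forecast(passengers, window=3))
print(ses_forecast(passengers, alpha=0.5))
```

A larger `alpha` makes the smoothed level react faster to recent observations; a larger `window` makes the moving average smoother but slower to follow changes.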
  • 11. What is the forecasting • Random or non-random series? (The plotted series is a real mess and looks nothing like a typical time series; it is also NOT a random walk series.) • Yes, it is a random series, so we cannot forecast anything from it. • No, it is a non-random series, but it is a chaotic time series. A chaotic time series looks like a random series, but it is generated by a specific function behind it. The purpose of chaos theory is to reveal the simple laws that may be hidden behind seemingly random series (refer to chaotic time series analysis). • Question: if a series of random numbers is generated by a computer from a seed, is it RANDOM or NON-RANDOM?
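Both ideas on this slide can be tried out directly. The sketch below uses the logistic map as a stand-in chaotic function (an assumption; the slide does not name one) and Python's seeded `random.Random` generator for the closing question. The helper names are hypothetical.

```python
import random


def logistic_map(x0, r=4.0, n=10):
    """Generate n values of x[k+1] = r * x[k] * (1 - x[k]).

    With r = 4 the output looks irregular, yet every value is fully
    determined by the starting point x0.
    """
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def seeded_numbers(seed, n=5):
    """Draw n pseudo-random numbers from a generator with a fixed seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


print(logistic_map(0.2))
# The same seed always reproduces the same "random" series.
print(seeded_numbers(42) == seeded_numbers(42))
```

The seeded series passes for random when plotted, but it is produced by a deterministic function, which is the point of the question on the slide.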
  • 12. What is the forecasting • Stationary or non-stationary series • We must assume that all of our observations come from the same distribution function; such a series is called stationary. • This means that the mean and the standard deviation of the time series do NOT change over time, so the series has a certain regularity. • A time series often becomes stationary once its seasonality and trend are removed. • From this point of view, we discuss correlation.
  • 13. What is the forecasting • Correlation • A time series can be evaluated against its own past values, for example by measuring the relationship between Xt, Xt-1, Xt-2, …, Xt-n. • The autocorrelation function (ACF) gives the degree of correlation between the series and its previous values, and can be used to judge whether the series is stationary, seasonal, or periodic. • ACF values lie in [-1, 1]: 1 is perfect positive correlation, -1 is perfect negative correlation, and 0 is no correlation. • In the example plot, the ACF values for lag 0 to lag 24 all stay above the statistical significance threshold (a fixed value), which means the series is non-stationary.
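The ACF described above can be computed with the usual sample estimator. This is a minimal standard-library sketch; the function name and the two toy series are assumptions made for the example.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (value in [-1, 1])."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    if variance_sum == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance_sum = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance_sum / variance_sum


trend = list(range(1, 21))   # steadily increasing values
alternating = [1, -1] * 5    # flips sign at every step

print(autocorrelation(trend, 1))        # strongly positive
print(autocorrelation(alternating, 1))  # strongly negative
```

A trending series keeps a high ACF over many lags, which is the pattern the slide reads as non-stationary.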
  • 14. Random Walk for Time-Series
  • 15. Random Walk for Time-Series • When the autocorrelation of the step-to-step changes is close to zero, the series is a random walk. • A random walk is unpredictable; it cannot reasonably be predicted. • Why? • Because the next time step is a function of the prior time step plus a random step, so each value is highly related to the one before it. • The best available forecast is simply the previous observation; we call this a naive forecast, or a persistence model. • Many time series behave like random walks, particularly security prices over time. • The random walk hypothesis is the theory that stock market prices follow a random walk and cannot be predicted.
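A persistence model is short enough to write out in full. The sketch below is an illustration with made-up prices; `persistence_forecast` and `mean_squared_error` are hypothetical helper names.

```python
def persistence_forecast(series):
    """Predict every value as the observation one step before it.

    Returns predictions aligned with series[1:].
    """
    return series[:-1]


def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("inputs must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up price series that moves by exactly one unit per step.
prices = [10, 11, 10, 9, 10, 11, 12]
predictions = persistence_forecast(prices)
print(mean_squared_error(prices[1:], predictions))
```

For a walk with unit steps the error of the persistence model equals the size of one step, which is the baseline any other model would have to beat.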
  • 16. Random Walk for Time-Series • A random walk is different from a list of random numbers because the next value in the sequence is a modification of the previous value in the sequence. • Step 1: Start with a random number of either -1 or 1. • Step 2: Randomly select -1 or 1 and add it to the observation from the previous time step. • Step 3: Repeat step 2 for as long as you like. • The series therefore changes gradually from step to step, rather than making the large jumps that a series of independent random numbers produces. • Model: y(t) = B0 + B1*X(t-1) + e(t), where y(t) is the next value in the series, B0 is a coefficient, B1 is the coefficient that weights the previous time step, X(t-1) is the observation at the previous time step, and e(t) is white noise (a random function).
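The three construction steps translate directly into code. This is a minimal sketch using only the standard library; the function name and the `seed` parameter are assumptions added so the result can be reproduced.

```python
import random


def random_walk(n_steps, seed=None):
    """Build a random walk following the three steps on the slide."""
    rng = random.Random(seed)
    # Step 1: start with a random number of either -1 or 1.
    walk = [rng.choice((-1, 1))]
    # Steps 2 and 3: repeatedly add -1 or 1 to the previous observation.
    for _ in range(n_steps - 1):
        walk.append(walk[-1] + rng.choice((-1, 1)))
    return walk


walk = random_walk(1000, seed=1)
print(walk[:10])
```

In terms of the model y(t) = B0 + B1*X(t-1) + e(t), this corresponds to B0 = 0 and B1 = 1, with e(t) drawn from {-1, 1}.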
  • 17. Random Walk for Time-Series • Random Walk and Autocorrelation • Calculate the correlation between each observation and the observations at previous time steps. • Because of how a random walk is constructed (the current observation is a random step from the previous observation), we expect a strong autocorrelation with the previous observation and a roughly linear fall-off over the larger lag values. • In the lag plot, the autocorrelation value is therefore high at the first lags.
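  • 18. Random Walk for Time-Series • Random Walk and Stationarity (ADF) • A stationary time series is one whose values are NOT a function of time. • Use the Augmented Dickey-Fuller (ADF) test to check stationarity: its null hypothesis is that the series has a unit root (is non-stationary), and a strongly negative test statistic or a small p-value rejects it. • By contrast, autocorrelation values range over [-1, 1]: values near 1 or -1 indicate the strongest positive or negative autocorrelation, and 0 indicates no autocorrelation. • Predicting a Random Walk • A random walk is unpredictable; it cannot reasonably be predicted. • Make a random walk time series stationary • Take the difference between each value and the previous one (lag 1). • After differencing, all correlations are small, close to zero. • Notebooks: random_walk_demo.ipynb, predicting_a_random_walk.ipynb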
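The effect of differencing can be checked without any external library. The sketch below is not taken from the notebooks named on the slide; the helper names are hypothetical, and it compares lag-1 autocorrelation before and after differencing instead of running a full ADF test (libraries such as statsmodels provide one as `adfuller`).

```python
import random


def random_walk(n_steps, seed=None):
    """Random walk with steps of -1 or 1."""
    rng = random.Random(seed)
    walk = [rng.choice((-1, 1))]
    for _ in range(n_steps - 1):
        walk.append(walk[-1] + rng.choice((-1, 1)))
    return walk


def difference(series, lag=1):
    """Subtract from each value the value `lag` steps earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance_sum / variance_sum


walk = random_walk(2000, seed=1)
steps = difference(walk)

print(round(autocorrelation(walk, 1), 3))   # close to 1 for the raw walk
print(round(autocorrelation(steps, 1), 3))  # close to 0 after differencing
```

The differenced series is just the sequence of random steps, which is why it carries no structure left to learn from.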
  • 19. Hidden Markov Model for stock prediction
  • 20. Hidden Markov Model for stock prediction • A hidden process with unknown parameters. • The most difficult part is determining the unknown parameters from the observations.
  • 21. Hidden Markov Model for stock prediction • Definition • Roll dice 10 times; the observed series is [1 6 3 5 2 7 3 5 2 4], the "visible state chain". • The invisible state chain is the sequence of dice that produced it, for example [D6 D8 D8 D6 D4 D8 D6 D6 D4 D8]. • Visible states: governed by the state emission probability. • Invisible states: governed by the state transition probability; together this is an HMM. • Human view: state chain. Computer view: transition probability.
  • 22. Hidden Markov Model for stock prediction • Questions • Known: the types of dice and the transition probabilities. From the visible states, find which dice were used. • Known: the types of dice and the transition probabilities. From the visible states, find the probability of producing that result. • Known: only the types of dice. From the visible states, find the transition probabilities.
  • 23. Hidden Markov Model for stock prediction • Find the probability of one hidden chain (D6, D8, D8 producing the rolls 1, 6, 3) • P = P(D6) * P(D6 to 1) * P(D6 to D8) * P(D8 to 6) * P(D8 to D8) * P(D8 to 3) = 1/3 * 1/6 * 1/3 * 1/8 * 1/3 * 1/8 = 1/10368
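The product on this slide can be reproduced with exact fractions. The sketch assumes what the slide's numbers imply: three fair dice (D4, D6, D8), each picked first with probability 1/3, and a uniform 1/3 transition between any two dice. The names `DICE_SIDES`, `emission`, and `path_probability` are hypothetical.

```python
from fractions import Fraction

DICE_SIDES = {"D4": 4, "D6": 6, "D8": 8}


def emission(die, roll):
    """Probability that a fair `die` shows `roll` (zero if it has too few sides)."""
    sides = DICE_SIDES[die]
    return Fraction(1, sides) if 1 <= roll <= sides else Fraction(0)


def path_probability(hidden, visible):
    """Joint probability of a hidden dice chain and the rolls it produced."""
    start = Fraction(1, len(DICE_SIDES))       # assumption: uniform first pick
    transition = Fraction(1, len(DICE_SIDES))  # assumption: uniform transitions
    prob = start * emission(hidden[0], visible[0])
    for die, roll in zip(hidden[1:], visible[1:]):
        prob *= transition * emission(die, roll)
    return prob


print(path_probability(["D6", "D8", "D8"], [1, 6, 3]))  # 1/10368
```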
  • 24. Hidden Markov Model for stock prediction • Find the probability • We have three types of dice: D4, D6, D8. • The visible states form the series [1 6 3 5 2 7 3 5 2 4]. • P1(D4): a roll of 1 has its highest emission probability, 1/4, from dice D4. • P2(D6) = P(D4) * P(D4 to 1) * P(D4 to D6) * P(D6 to 6) = 1/3 * 1/4 * 1/3 * 1/6 = 1/216 • P3(D4) = P2(D6) * P(D6 to D4) * P(D4 to 3) = 1/216 * 1/3 * 1/4 = 1/2592
  • 25. Hidden Markov Model for stock prediction • Sum all of the probabilities • We have 3 types of dice; we randomly pick one and roll a 1. The probability of that result is about 18%: 1/3 * (1/4 + 1/6 + 1/8) = 13/72.
  • 26. Hidden Markov Model for stock prediction • Sum all of the probabilities
  • 27. Hidden Markov Model for stock prediction • Sum all of the probabilities
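Summing over all hidden chains is what the forward algorithm does step by step. The sketch below assumes fair dice, a uniform 1/3 first pick, and uniform 1/3 transitions; the function names are hypothetical and exact fractions are used so the results can be checked by hand.

```python
from fractions import Fraction

DICE_SIDES = {"D4": 4, "D6": 6, "D8": 8}


def emission(die, roll):
    """Probability that a fair `die` shows `roll`."""
    sides = DICE_SIDES[die]
    return Fraction(1, sides) if 1 <= roll <= sides else Fraction(0)


def forward_probability(visible):
    """Probability of the visible rolls, summed over every hidden dice chain."""
    dice = list(DICE_SIDES)
    start = Fraction(1, len(dice))       # assumption: uniform first pick
    transition = Fraction(1, len(dice))  # assumption: uniform transitions
    # alpha[d] = probability of the rolls so far with the chain ending in die d
    alpha = {d: start * emission(d, visible[0]) for d in dice}
    for roll in visible[1:]:
        alpha = {
            d: sum(alpha[p] * transition for p in dice) * emission(d, roll)
            for d in dice
        }
    return sum(alpha.values())


print(forward_probability([1]))        # 13/72, about 18%
print(forward_probability([1, 6, 3]))  # 1183/373248
```

The first result matches the 18% on the slide; each further roll multiplies in the summed probability of all dice that could have produced it.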
  • 28. Hidden Markov Model for stock prediction • How to calculate the most probable invisible state chain • Brute force (compute the result for every combination, p21~p23) • Viterbi algorithm (build the hidden state chain from the visible states) • Notation: x: hidden states; y: visible states; a: transition probabilities; b: emission probabilities • Dice: X2, X1, X3 • Homework1: Continue with the visible series of dice [1 6 3 5 2 7 3 5 2 4], P4 ~ P10
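A compact Viterbi sketch for the dice example follows. It assumes fair dice with a uniform 1/3 first pick and uniform 1/3 transitions, and it is run only on the first three rolls so the homework series is left open. The function names are hypothetical.

```python
from fractions import Fraction

DICE_SIDES = {"D4": 4, "D6": 6, "D8": 8}


def emission(die, roll):
    """Probability that a fair `die` shows `roll`."""
    sides = DICE_SIDES[die]
    return Fraction(1, sides) if 1 <= roll <= sides else Fraction(0)


def viterbi(visible):
    """Return (probability, path) of the most probable hidden dice chain."""
    dice = list(DICE_SIDES)
    start = Fraction(1, len(dice))       # assumption: uniform first pick
    transition = Fraction(1, len(dice))  # assumption: uniform transitions
    # best[d] = (probability, path) of the best chain so far ending in die d
    best = {d: (start * emission(d, visible[0]), [d]) for d in dice}
    for roll in visible[1:]:
        step = {}
        for d in dice:
            # Keep only the most probable way of arriving at die d.
            prob, path = max(
                ((best[p][0] * transition, best[p][1]) for p in dice),
                key=lambda candidate: candidate[0],
            )
            step[d] = (prob * emission(d, roll), path + [d])
        best = step
    return max(best.values(), key=lambda candidate: candidate[0])


probability, path = viterbi([1, 6, 3])
print(path, probability)  # ['D4', 'D6', 'D4'] 1/2592
```

Unlike brute force, which scores every one of the 3^n chains, this keeps one best candidate per die at each step, so the work grows linearly with the number of rolls.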
  • 29. Hidden Markov Model for stock prediction • FB stock market prediction (using the Markov property, not a pure time series) • Download historical data from • https://www.nasdaq.com/market-activity/stocks/fb/historical • Notebook: hmm_makrtket_stock_prediction.ipynb • The stock data we use is not treated as random walk time series data; check the code!
  • 30. Hidden Markov Model for stock prediction • Extract features
  • 31. Multi-variables or Periodic attributes Time-Series prediction (WEKA)
  • 32. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • WEKA • The workbench for machine learning. • It is widely used for teaching, research, and industrial applications, contains a plethora of built-in tools for standard machine learning tasks, and additionally gives transparent access to well-known toolboxes such as scikit-learn and R. • WEKA installation and importing packages • Demo • https://www.cs.waikato.ac.nz/ml/weka/ • https://sourceforge.net/projects/weka/
  • 33. Multi-variables or Periodic attributes Time-Series prediction (WEKA)
  • 34. Multi-variables or Periodic attributes Time-Series prediction (WEKA)
  • 35. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Sales order prediction • Observation • Overlay data • Row data remove or not • WEKA requires • The input data format must be ARFF. • You need to convert your CSV. • Convert to WEKA ARFF format • https://pulipulichen.github.io/jieba-js/weka/spreadsheet2arff/index.html • Download the ARFF files
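For readers who prefer a script to the web converter, a CSV file can be turned into a basic ARFF file in a few lines. This is a rough sketch, not the converter linked above: the column names `day` and `sales`, the sample rows, and the function names are made up, and it assumes simple attribute names without spaces and dates written as yyyy-MM-dd.

```python
import csv
import io


def _is_number(text):
    """True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation, date_columns=()):
    """Convert CSV text with a header row into a minimal ARFF document."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    lines = [f"@relation {relation}", ""]
    for index, name in enumerate(header):
        if name in date_columns:
            kind = 'date "yyyy-MM-dd"'  # assumed date layout
        elif all(_is_number(row[index]) for row in data):
            kind = "numeric"
        else:
            kind = "string"
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Hypothetical sales data, for illustration only.
sample_csv = "day,sales\n2021-05-01,120\n2021-05-02,135\n"
arff_text = csv_to_arff(sample_csv, "sales_order", date_columns=("day",))
print(arff_text)
```

Values that contain commas, quotes, or spaces would need ARFF quoting, which this sketch leaves out.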
  • 36. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Load Data • Find the ARFF file (4 columns)
  • 37. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Basic configuration • Number of time units to forecast = 25 • Periodicity = Daily • Copy the content of skip_list-sales_order_1_8_p9.txt and paste it • Check "Perform evaluation"
  • 38. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Base learner • LinearRegression • MultilayerPerceptron • SMOreg (SVM) • Homework2: Continue to improve performance in WEKA sales order prediction
  • 39. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Click "periodic attributes" • Check Customize • Load the file from periodics_set-sales_order_1_8_p9.periodics
  • 40. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Output • Check "Graph target at steps" • Press the Start button
  • 41. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Output • Predictions are marked with an asterisk • Using a Linear Regression model with those variables
  • 42. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Output / Prediction at steps • NOT so good. Is it a good idea to use a prediction model to forecast future values?
  • 43. Multi-variables or Periodic attributes Time-Series prediction (WEKA) • Output / Future prediction
  • 45. Homework • Calculate the invisible states from the visible series of dice [1 6 3 5 2 7 3 5 2 4] • Continue to improve performance in WEKA sales order prediction • Change algorithms • Add more observation data • What happens if we do not use Periodic attributes or Overlay data?