Prepared By:
Ishani Desai
Karan Shah
Ketu Shah
Shubham Gupta
Flight Delay
Prediction Model
• Flight delays have emerged as a major source of economic loss for airlines.
• Flight delays also have a negative impact on the business reputation of airlines and on demand for their services.
Business Problem Overview
• Develop a business model to predict flight delays.
• Optimize flight operations.
• Reduce further economic loss for airlines.
• Reduce the inconvenience caused to passengers.
Goal and Objective
• As of 2007, the airline industry incurred an average cost of around $11,300 per delayed flight, based on an average of 61,000 delayed flights per month.
• According to the latest estimates, the cost of aircraft block time for U.S. passenger airlines was $81.18 per minute.
Literature Review
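The figures above imply a monthly industry-wide total, which can be checked with simple arithmetic. This is a minimal sketch: the 30-minute delay is a hypothetical example, and pricing delay minutes at the block-time rate is an assumption, not something the slide states.

```python
# Figures quoted on the slide.
COST_PER_DELAYED_FLIGHT = 11_300    # USD, 2007 average
DELAYED_FLIGHTS_PER_MONTH = 61_000  # monthly average
BLOCK_TIME_COST_PER_MIN = 81.18     # USD per minute of block time


def monthly_delay_cost(cost_per_flight, flights_per_month):
    """Total monthly cost implied by the per-flight average."""
    return cost_per_flight * flights_per_month


def delay_cost(minutes, cost_per_minute=BLOCK_TIME_COST_PER_MIN):
    """Cost of one delay, ASSUMING delay minutes are priced at the block-time rate."""
    return round(minutes * cost_per_minute, 2)


print(monthly_delay_cost(COST_PER_DELAYED_FLIGHT, DELAYED_FLIGHTS_PER_MONTH))
print(delay_cost(30))  # hypothetical 30-minute delay
```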
• The data is taken from the United States Bureau of Transportation Statistics.
• The dataset represents 4 years of flight delay information for the state of Washington.
• The dataset contains over 2,500 records and 48 attributes for the month of February.
• Attributes:
• Origin Airport
• Destination Airport
• Flight Number
• Airline
• Date of Flight
• Delay Information
Data Source
Original Dataset
Selected Attributes
Derived attributes for Model
Data Preparation
Training Data: selected and derived attributes from years 2013, 2014, and 2015
Testing Data: selected and derived attributes from year 2016
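The year-based split described above can be sketched as follows. The record layout and field names (`year`, `dest`, `delayed`) are hypothetical and are not the actual Bureau of Transportation Statistics column names.

```python
def split_by_year(records, train_years=(2013, 2014, 2015), test_years=(2016,)):
    """Split flight records into training and testing sets by year of flight."""
    train = [r for r in records if r["year"] in train_years]
    test = [r for r in records if r["year"] in test_years]
    return train, test


# Hypothetical records, for illustration only.
flights = [
    {"year": 2013, "dest": "SFO", "delayed": 0},
    {"year": 2014, "dest": "LAX", "delayed": 1},
    {"year": 2015, "dest": "JFK", "delayed": 0},
    {"year": 2016, "dest": "SFO", "delayed": 1},
]

train, test = split_by_year(flights)
print(len(train), "training records,", len(test), "testing records")
```

Splitting by year rather than at random keeps every 2016 flight out of training, so the test measures how well the model carries over to a later year.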
• K-Nearest Neighbors (K-NN)
• Weighted K-Nearest Neighbors (KK-NN)
• Decision Tree: CART
• Decision Tree: C4.5
Algorithms Used
• In pattern recognition and statistical estimation, the k-nearest neighbors algorithm is used for classification and regression.
• For classification, k-nearest neighbors is a simple algorithm that stores all available cases and classifies new cases based on a distance metric.
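Euclidean Distance Function
$$d_{\text{Euclidean}}(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}$$
where $\mathbf{x} = (x_1, x_2, \ldots, x_m)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_m)$ represent the $m$ attributes.
K-Nearest Neighbors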
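A minimal sketch of k-nearest-neighbor classification with the Euclidean distance: store all cases, then classify a new case by majority vote among its k closest stored cases. The toy points and labels are made up for illustration and are not taken from the flight dataset.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest stored cases."""
    # Sort the stored cases by distance to the query and keep the k closest.
    neighbors = sorted(zip(train_X, train_y),
                       key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy data: two attributes per case, binary class label.
train_X = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_y = [0, 0, 1, 1]

print(knn_predict(train_X, train_y, (1.2, 1.4), k=3))
```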
K-NN Graph: accuracy (%) for each value of K
K Value  Accuracy (%)
K=2 86.05
K=5 82.68
K=10 81.44
K=15 81.34
K=20 81.36
K=25 81.18
• This extension is based on the idea that observations in the learning set that are particularly close to the new observation (y, x) should get a higher weight in the decision than neighbors that are far away from (y, x).
• This is not the case with kNN: only the k nearest neighbors influence the prediction, but this influence is the same for each of these neighbors, although their individual similarity to (y, x) might differ widely.
• To achieve this, the distances on which the search for the nearest neighbors is based in the first step have to be transformed into similarity measures, which can be used as weights.
Weighted K-Nearest Neighbors
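The weighting idea can be sketched by turning each neighbor's distance into a similarity weight and summing the weights per class. The inverse-distance weight used here is an assumption chosen for simplicity; the slides do not say which transformation or kernel was actually used.

```python
import math
from collections import defaultdict


def euclidean(x, y):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def weighted_knn_predict(train_X, train_y, query, k=3, eps=1e-9):
    """Classify `query` using the k nearest neighbors, weighted by closeness."""
    neighbors = sorted(
        ((euclidean(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    scores = defaultdict(float)
    for dist, label in neighbors:
        # Assumed similarity measure: inverse distance (eps avoids division by zero).
        scores[label] += 1.0 / (dist + eps)
    return max(scores, key=scores.get)


# Toy data: one very close neighbor of class 1, two farther neighbors of class 0.
train_X = [(0.0,), (2.0,), (2.1,)]
train_y = [1, 0, 0]

print(weighted_knn_predict(train_X, train_y, (0.5,), k=3))
```

With these toy points, an unweighted vote over the same three neighbors would pick class 0 (two votes to one), while the weighted vote picks class 1 because the single close neighbor outweighs the two distant ones.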
KK-NN Graph: accuracy (%) for each value of K (chart data labels, in order: 85.35, 84.82, 84.87, 84.52, 83.11, 83.07, 82.59)
K Value  Accuracy (%)
K=1 85.35
K=2 84.52
K=5 84.87
K=10 84.52
K=15 83.11
K=20 83.07
K=25 82.59
• The purpose of analysis via a tree-building algorithm is to determine a set of if-then logical split conditions that permit accurate prediction or classification of cases.
• A classification tree determines a set of logical if-then conditions, instead of linear equations, for predicting or classifying cases.
• The general approach of deriving predictions from if-then conditions can also be applied to regression trees.
• Advantages of CART:
• Simplicity
• Nonparametric and Nonlinear
Classification and Regression Tree
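How a tree arrives at one if-then split condition can be sketched with the Gini impurity, the criterion usually associated with CART classification trees. The attribute (scheduled departure hour) and the labels are hypothetical, and the sketch handles only a single numeric attribute with binary labels.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best rule `value <= threshold`."""
    best = (None, float("inf"))
    for t in sorted(set(values))[:-1]:  # the largest value cannot split the data
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical data: scheduled departure hour and whether the flight was delayed.
hours = [6, 8, 9, 17, 18, 20]
delayed = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(hours, delayed)
print(f"if hour <= {threshold} then class 0 else class 1 (weighted Gini {impurity:.2f})")
```

A full tree repeats this search over every attribute and then recurses into each resulting subset.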
CART Implementation
Predictions 0 1
0 1742 397
1 100 41
Accuracy: 78.20%
• The C4.5 algorithm generates a decision tree that can be used for classification, and is therefore referred to as a statistical classifier.
• C4.5 permits numeric attributes and deals sensibly with missing values.
• C4.5 handles attributes with continuous data and different weights.
• C4.5 follows a post-pruning approach to deal with noisy data, removing sub-trees from the fully developed decision tree.
C4.5 Algorithm
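C4.5 is generally described as choosing splits by information gain normalized by the split information (the gain ratio). The sketch below computes both quantities for one categorical attribute. The airline codes and labels are toy values, not results from the dataset.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_and_ratio(attribute, labels):
    """Return (information gain, gain ratio) for splitting `labels` on `attribute`."""
    n = len(labels)
    groups = {}
    for a, l in zip(attribute, labels):
        groups.setdefault(a, []).append(l)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(attribute)  # penalizes attributes with many values
    ratio = info_gain / split_info if split_info > 0 else 0.0
    return info_gain, ratio


# Toy example: the attribute separates the two classes perfectly.
airline = ["AS", "AS", "DL", "DL"]
delayed = [1, 1, 0, 0]

gain, ratio = gain_and_ratio(airline, delayed)
print(f"information gain = {gain:.2f}, gain ratio = {ratio:.2f}")
```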
C4.5 Decision Tree
Predictions 0 1
0 1815 431
1 27 7
Accuracy: 79.91%
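The reported accuracies can be re-derived from the two confusion matrices. This sketch assumes rows are predicted classes and columns are actual classes. The slides do not label the axes, but both matrices have the same column totals (1842 and 438), which fits that reading. The recall function is an addition for illustration.

```python
def accuracy(matrix):
    """Share of all cases that lie on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall(matrix, cls):
    """Share of actual `cls` cases predicted as `cls` (ASSUMES rows = predicted)."""
    actual_total = sum(row[cls] for row in matrix)
    return matrix[cls][cls] / actual_total


# Confusion matrices as shown on the CART and C4.5 slides.
cart = [[1742, 397], [100, 41]]
c45 = [[1815, 431], [27, 7]]

for name, m in (("CART", cart), ("C4.5", c45)):
    print(f"{name}: accuracy {accuracy(m):.2%}, recall for class 1 {recall(m, 1):.2%}")
```

Under that reading, recall for class 1 is low for both trees, so accuracy alone may overstate how well either model identifies that class.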
Analysis and
Statistics of Flight
Delays
Flight Delay [ Airlines ]
Pie chart, Total Delayed for 2016: 1.89%, 67.23%, 5.37%, 7.45%, 1.07%, 2.65%, 8.21%, 2.46%, 3.66%
Pie chart, Total Delayed for 2015: 1.58%, 65.26%, 7.97%, 6.92%, 0.60%, 2.41%, 9.55%, 2.03%, 3.68%
Legend: AA, AS, B6, DL, F9, HA, OO, UA, VX
Popular Route Delays Information
Column chart 1: No. of delays in popular routes (No. of delay 2015 vs. No. of delay 2016)
Column chart 2: % share of popular routes in total delay (% share in total delay 2015 vs. 2016)
Routes: SEA-LAX, SEA-SFO, SEA-JFK, SEA-ORD, SEA-DFW, SEA-Boston, SEA-ATL
Line chart: Total no. of delays for each day, 1-Feb to 28-Feb (No. of delay 2015 vs. No. of delay 2016)
Outcome:
• After studying different models on the dataset, we observed that KNN provides the best results, with an accuracy of about 86%.
Future Scope:
• Model accuracy can be increased by taking into account variables such as weather conditions and airline employee efficiency.
Application:
• Airlines can determine efficient routes with the lowest likelihood of delay.
• Airlines can opt for secondary airports on particular routes between cities, e.g. SEA-LGA instead of SEA-JFK, since SEA-JFK flights are more likely to be delayed.
• This model can help passengers plan layovers at particular airports.
Outcome-Application-Future Scope
Thank You!


Editor's Notes

  1. Classification can be performed with the KNN and KKNN algorithms.
  2. - The distance metric for continuous variables is the Euclidean distance; for text classification, the Hamming distance is frequently used. - There is also the Extended K-NN algorithm, known as the ENN method, which makes classification a two-way process: it considers not only which cases are nearest to the test sample, but also which cases count the test sample among their own nearest neighbors.
  3. Simplicity: In most cases, the interpretation of results summarized in a tree is very simple. Tree methods are nonparametric and nonlinear. The final results of using tree methods for classification or regression can be summarized in a series of (usually few) logical if-then conditions (tree nodes).
  4. - C4.5 is the successor of ID3. ID3 and C4.5 adopt a greedy approach: there is no backtracking, and the trees are constructed in a top-down, recursive, divide-and-conquer manner. Statistical Classifier: It chooses the attribute with the highest gain to split the sample set into subsets of one or more classes. Information gain is used as the splitting criterion.
  5. Splits are made on categorical variables. The RWeka library is used. SFO: three possible values; 0 means no flight, 1 means flights.
  6. Various charts and graphs of flight delays are shown in the next few slides.
  7. As the pie charts show, Alaska Airlines has the highest share of delays, almost two-thirds in both 2015 and 2016. JetBlue Airways, on the other hand, has the lowest share of flight delays over the past two years.
  8. The column charts above focus on the Seattle, WA airport. Chart 1 shows the total number of delayed flights (due to arrival delay, technical problems, or passenger waiting delay) on popular routes; chart 2 shows each popular route's percentage share of delays, considering delay data from all airports.
  9. - The line chart above shows the number of flight delays for each day of February.