SlideShare a Scribd company logo
1 of 67
Introduction to Linear Regression and Correlation Analysis
Chapter Goals ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Chapter Goals ,[object Object],[object Object],[object Object],[object Object],[object Object],(continued)
Scatter Plots and Correlation ,[object Object],[object Object],[object Object],[object Object]
Scatter Plot Examples y x y x y y x x Linear relationships Curvilinear relationships
Scatter Plot Examples y x y x y y x x Strong relationships Weak relationships (continued)
Scatter Plot Examples y x y x No relationship (continued)
Correlation Coefficient ,[object Object],[object Object],(continued)
Features of  ρ  and  r ,[object Object],[object Object],[object Object],[object Object],[object Object]
Examples of Approximate  r  Values r = +.3 r = +1 y x y x y x y x y x r = -1 r = -.6 r = 0
Calculating the  Correlation Coefficient where: r = Sample correlation coefficient n = Sample size x = Value of the independent variable y = Value of the dependent variable Sample correlation coefficient: or the algebraic equivalent:
Calculation Example Tree Height Trunk Diameter y x xy y 2 x 2 35 8 280 1225 64 49 9 441 2401 81 27 7 189 729 49 33 6 198 1089 36 60 13 780 3600 169 21 7 147 441 49 45 11 495 2025 121 51 12 612 2601 144  =321  =73  =3142  =14111  =713
Calculation Example Trunk Diameter, x Tree Height,  y (continued) r = 0.886  -> relatively strong positive  linear association between x and y
Excel Output Excel Correlation Output Tools / data analysis / correlation… Correlation between  Tree Height and Trunk Diameter
Significance Test for Correlation ,[object Object],[object Object],[object Object],[object Object],[object Object]
Example: Produce Stores Is there evidence of a linear relationship between tree height and trunk diameter at the .05 level of significance? H 0 : ρ   = 0  (No correlation) H 1 : ρ ≠ 0  (correlation exists)    =.05 ,  df   =   8 - 2  = 6
Example: Test Solution Conclusion: There  is evidence  of a linear relationship at the 5% level of significance Decision: Reject H 0 Reject H 0 Reject H 0  /2=.025 -t α/2 Do not reject H 0 0 t α/2  /2=.025 -2.4469 2.4469 4.68 d.f. = 8-2 = 6
Introduction to Regression Analysis ,[object Object],[object Object],[object Object],[object Object],[object Object]
Simple Linear Regression Model ,[object Object],[object Object],[object Object]
Types of Regression Models Positive Linear Relationship Negative Linear Relationship Relationship NOT Linear No Relationship
Population Linear Regression Linear component The population regression model: Population  y  intercept  Population Slope Coefficient  Random Error term, or residual Dependent Variable Independent Variable Random Error component
Linear Regression Assumptions ,[object Object],[object Object],[object Object],[object Object],[object Object]
Population Linear Regression (continued) Random Error for this x value y x Observed Value of y for x i Predicted Value of y for x i   x i Slope = β 1 Intercept = β 0   ε i
Estimated Regression Model The sample regression line provides an  estimate  of the population regression line Estimate of the regression  intercept Estimate of the regression slope Estimated  (or predicted) y value Independent variable The individual random error terms  e i   have a mean of zero
Least Squares Criterion ,[object Object]
The Least Squares Equation ,[object Object],algebraic equivalent: and
[object Object],[object Object],Interpretation of the  Slope and the Intercept
Finding the Least Squares Equation ,[object Object],[object Object]
Simple Linear Regression Example ,[object Object],[object Object],[object Object],[object Object]
Sample Data for House Price Model House Price in $1000s (y) Square Feet  (x) 245 1400 312 1600 279 1700 308 1875 199 1100 219 1550 405 2350 324 2450 319 1425 255 1700
Regression Using Excel ,[object Object]
Excel Output The regression equation is: Regression Statistics Multiple R 0.76211 R Square 0.58082 Adjusted R Square 0.52842 Standard Error 41.33032 Observations 10 ANOVA   df SS MS F Significance F Regression 1 18934.9348 18934.9348 11.0848 0.01039 Residual 8 13665.5652 1708.1957 Total 9 32600.5000         Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
Graphical Presentation ,[object Object],Slope  = 0.10977 Intercept  = 98.248
Interpretation of the  Intercept,  b 0 ,[object Object],[object Object]
Interpretation of the  Slope Coefficient,  b 1 ,[object Object],[object Object]
Least Squares Regression Properties ,[object Object],[object Object],[object Object],[object Object]
Explained and Unexplained Variation ,[object Object],Total sum of Squares Sum of Squares Regression Sum of Squares Error where:   = Average value of the dependent variable y  = Observed values of the dependent variable   = Estimated value of y for the given x value
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Explained and Unexplained Variation (continued)
Explained and Unexplained Variation (continued) X i y x y i SST   =    ( y i   -   y ) 2 SSE   =   ( y i   -   y i  ) 2    SSR =   ( y i  -   y ) 2    _ _ _ y  y y _ y 
[object Object],[object Object],Coefficient of Determination, R 2 where
[object Object],Coefficient of Determination, R 2 (continued) Note:   In the single independent variable case, the coefficient of determination is where: R 2  = Coefficient of determination   r = Simple correlation coefficient
Examples of Approximate  R 2   Values R 2  = +1 y x y x R 2  = 1 R 2  = 1 Perfect linear relationship between x and y:  100% of the variation in y is explained by variation in x
Examples of Approximate  R 2   Values y x y x 0 < R 2  < 1 Weaker linear relationship between x and y:  Some but not all of the variation in y is explained by variation in x
Examples of Approximate  R 2   Values R 2  = 0 No linear relationship between x and y:  The value of Y does not depend on x.  (None of the variation in y is explained by variation in x) y x R 2  = 0
Excel Output 58.08% of the variation in house prices is explained by variation in square feet Regression Statistics Multiple R 0.76211 R Square 0.58082 Adjusted R Square 0.52842 Standard Error 41.33032 Observations 10 ANOVA   df SS MS F Significance F Regression 1 18934.9348 18934.9348 11.0848 0.01039 Residual 8 13665.5652 1708.1957 Total 9 32600.5000         Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
Standard Error of Estimate ,[object Object],Where SSE  = Sum of squares error   n = Sample size   k = number of independent variables in the model
The Standard Deviation of the Regression Slope ,[object Object],where: = Estimate of the standard error of the least squares slope = Sample standard error of the estimate
Excel Output Regression Statistics Multiple R 0.76211 R Square 0.58082 Adjusted R Square 0.52842 Standard Error 41.33032 Observations 10 ANOVA   df SS MS F Significance F Regression 1 18934.9348 18934.9348 11.0848 0.01039 Residual 8 13665.5652 1708.1957 Total 9 32600.5000         Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
Comparing Standard Errors y y y x x x y x Variation of observed y values from the regression line Variation in the slope of regression lines from different possible samples
Inference about the Slope:  t Test ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],where: b 1  = Sample regression slope coefficient β 1  = Hypothesized slope s b1  = Estimator of the standard error of the slope
Inference about the Slope:  t Test Estimated Regression Equation: The slope of this model is 0.1098  Does square footage of the house affect its sales price? (continued) House Price in $1000s (y) Square Feet  (x) 245 1400 312 1600 279 1700 308 1875 199 1100 219 1550 405 2350 324 2450 319 1425 255 1700
Inferences about the Slope:  t   Test Example ,[object Object],[object Object],Test Statistic:  t = 3.329 There is sufficient evidence that square footage affects house price From Excel output:  Reject H 0 t b 1 Decision: Conclusion: Reject H 0 Reject H 0  /2=.025 -t α/2 Do not reject H 0 0 t α/2  /2=.025 -2.3060 2.3060 3.329 d.f. = 10-2 = 8   Coefficients Standard Error t Stat P-value Intercept 98.24833 58.03348 1.69296 0.12892 Square Feet 0.10977 0.03297 3.32938 0.01039
Regression Analysis for Description Confidence Interval Estimate of the Slope: Excel Printout for House Prices: At 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858) d.f. = n - 2   Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
Regression Analysis for Description Since the units of the house price variable is $1000s, we are 95% confident that the average impact on sales price is between $33.70 and $185.80 per square foot of house size This 95% confidence interval  does not include 0 . Conclusion:  There is a significant relationship between house price and square feet at the .05 level of significance    Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
Confidence Interval for  the Average y, Given x Confidence interval estimate for the  mean of y   given a particular x p Size of interval varies according to distance away from mean, x
Confidence Interval for  an Individual y, Given x Confidence interval estimate for an  Individual value of y   given a particular x p This extra term adds to the interval width to reflect the added uncertainty for an individual case
Interval Estimates  for Different Values of x y x Prediction Interval for an individual y, given x p x p y = b 0  + b 1 x  x Confidence Interval for the mean of y, given x p
Example: House Prices Estimated Regression Equation: Predict the price for a house with 2000 square feet House Price in $1000s (y) Square Feet  (x) 245 1400 312 1600 279 1700 308 1875 199 1100 219 1550 405 2350 324 2450 319 1425 255 1700
Example: House Prices Predict the price for a house with 2000 square feet: The predicted price for a house with 2000 square feet is 317.85($1,000s) = $317,850 (continued)
Estimation of Mean Values: Example Find the 95% confidence interval for the average price of 2,000 square-foot houses Predicted Price Y i  = 317.85 ($1,000s)  Confidence Interval Estimate for E(y)|x p The confidence interval endpoints are 280.66 -- 354.90, or from $280,660 -- $354,900
Estimation of Individual Values: Example Find the 95% confidence interval for an individual house with 2,000 square feet Predicted Price Y i  = 317.85 ($1,000s)  Prediction Interval Estimate for y|x p The prediction interval endpoints are 215.50 -- 420.07, or from $215,500 -- $420,070
Finding Confidence and Prediction Intervals PHStat ,[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],Finding Confidence and Prediction Intervals PHStat (continued) Confidence Interval Estimate for E(y)|x p Prediction Interval Estimate for y|x p
Residual Analysis ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Residual Analysis for Linearity Not Linear Linear  x residuals x y x y x residuals
Residual Analysis for  Constant Variance  Non-constant variance  Constant variance x x y x x y residuals residuals
Excel Output RESIDUAL OUTPUT Predicted House Price  Residuals 1 251.92316 -6.923162 2 273.87671 38.12329 3 284.85348 -5.853484 4 304.06284 3.937162 5 218.99284 -19.99284 6 268.38832 -49.38832 7 356.20251 48.79749 8 367.17929 -43.17929 9 254.6674 64.33264 10 284.85348 -29.85348

More Related Content

What's hot

Spearman’s Rank Correlation Coefficient
Spearman’s Rank Correlation CoefficientSpearman’s Rank Correlation Coefficient
Spearman’s Rank Correlation Coefficient
Sharlaine Ruth
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regression
dessybudiyanti
 

What's hot (20)

z-test
z-testz-test
z-test
 
Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
 
Spearman’s Rank Correlation Coefficient
Spearman’s Rank Correlation CoefficientSpearman’s Rank Correlation Coefficient
Spearman’s Rank Correlation Coefficient
 
Correlation
CorrelationCorrelation
Correlation
 
Introduction to t-tests (statistics)
Introduction to t-tests (statistics)Introduction to t-tests (statistics)
Introduction to t-tests (statistics)
 
Normal Distribution
Normal DistributionNormal Distribution
Normal Distribution
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regression
 
Hypothesis testing
Hypothesis testingHypothesis testing
Hypothesis testing
 
Correlation and Regression Analysis using SPSS and Microsoft Excel
Correlation and Regression Analysis using SPSS and Microsoft ExcelCorrelation and Regression Analysis using SPSS and Microsoft Excel
Correlation and Regression Analysis using SPSS and Microsoft Excel
 
multiple regression
multiple regressionmultiple regression
multiple regression
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Simple Linear Regression: Step-By-Step
Simple Linear Regression: Step-By-StepSimple Linear Regression: Step-By-Step
Simple Linear Regression: Step-By-Step
 
Binary Logistic Regression
Binary Logistic RegressionBinary Logistic Regression
Binary Logistic Regression
 
INFERENTIAL STATISTICS: AN INTRODUCTION
INFERENTIAL STATISTICS: AN INTRODUCTIONINFERENTIAL STATISTICS: AN INTRODUCTION
INFERENTIAL STATISTICS: AN INTRODUCTION
 
Correlation and Regression ppt
Correlation and Regression pptCorrelation and Regression ppt
Correlation and Regression ppt
 
regression and correlation
regression and correlationregression and correlation
regression and correlation
 
Hypothesis and Hypothesis Testing
Hypothesis and Hypothesis TestingHypothesis and Hypothesis Testing
Hypothesis and Hypothesis Testing
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Two-Way ANOVA
Two-Way ANOVATwo-Way ANOVA
Two-Way ANOVA
 
Introduction to ANOVAs
Introduction to ANOVAsIntroduction to ANOVAs
Introduction to ANOVAs
 

Similar to Linear regression and correlation analysis ppt @ bec doms

Regression analysis
Regression analysisRegression analysis
Regression analysis
Ravi shankar
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inference
Kemal İnciroğlu
 
Presentasi Tentang Regresi Linear
Presentasi Tentang Regresi LinearPresentasi Tentang Regresi Linear
Presentasi Tentang Regresi Linear
dessybudiyanti
 
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Neeraj Bhandari
 

Similar to Linear regression and correlation analysis ppt @ bec doms (20)

Regression analysis
Regression analysisRegression analysis
Regression analysis
 
linear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annovalinear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annova
 
Regression
RegressionRegression
Regression
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inference
 
Presentasi Tentang Regresi Linear
Presentasi Tentang Regresi LinearPresentasi Tentang Regresi Linear
Presentasi Tentang Regresi Linear
 
Corr And Regress
Corr And RegressCorr And Regress
Corr And Regress
 
Regression
RegressionRegression
Regression
 
Corr-and-Regress (1).ppt
Corr-and-Regress (1).pptCorr-and-Regress (1).ppt
Corr-and-Regress (1).ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Cr-and-Regress.ppt
Cr-and-Regress.pptCr-and-Regress.ppt
Cr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Correlation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social ScienceCorrelation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social Science
 
Regression.ppt basic introduction of regression with example
Regression.ppt basic introduction of regression with exampleRegression.ppt basic introduction of regression with example
Regression.ppt basic introduction of regression with example
 
Chapter05
Chapter05Chapter05
Chapter05
 
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
 
Chapter13
Chapter13Chapter13
Chapter13
 
lecture No. 3a.ppt
lecture No. 3a.pptlecture No. 3a.ppt
lecture No. 3a.ppt
 
Correlation and Regression Analysis.pptx
Correlation and Regression Analysis.pptxCorrelation and Regression Analysis.pptx
Correlation and Regression Analysis.pptx
 

More from Babasab Patil

Marketing management module 2 marketing environment mba 1st sem by babasab pa...
Marketing management module 2 marketing environment mba 1st sem by babasab pa...Marketing management module 2 marketing environment mba 1st sem by babasab pa...
Marketing management module 2 marketing environment mba 1st sem by babasab pa...
Babasab Patil
 
Marketing management module 4 measuring andforecasting demand mba 1st sem by...
Marketing management module 4  measuring andforecasting demand mba 1st sem by...Marketing management module 4  measuring andforecasting demand mba 1st sem by...
Marketing management module 4 measuring andforecasting demand mba 1st sem by...
Babasab Patil
 
Measuring and forecasting demand module 4 mba 1st sem by babasab patil (karri...
Measuring and forecasting demand module 4 mba 1st sem by babasab patil (karri...Measuring and forecasting demand module 4 mba 1st sem by babasab patil (karri...
Measuring and forecasting demand module 4 mba 1st sem by babasab patil (karri...
Babasab Patil
 
Notes managerial communication 3 business correspondence and report writing ...
Notes managerial communication  3 business correspondence and report writing ...Notes managerial communication  3 business correspondence and report writing ...
Notes managerial communication 3 business correspondence and report writing ...
Babasab Patil
 
Notes managerial communication mod 2 basic communication skills mba 1st sem ...
Notes managerial communication mod 2  basic communication skills mba 1st sem ...Notes managerial communication mod 2  basic communication skills mba 1st sem ...
Notes managerial communication mod 2 basic communication skills mba 1st sem ...
Babasab Patil
 
Notes managerial communication mod 4 the job application process mba 1st sem ...
Notes managerial communication mod 4 the job application process mba 1st sem ...Notes managerial communication mod 4 the job application process mba 1st sem ...
Notes managerial communication mod 4 the job application process mba 1st sem ...
Babasab Patil
 
Notes managerial communication mod 5 interviews mba 1st sem by babasab patil...
Notes managerial communication mod 5 interviews  mba 1st sem by babasab patil...Notes managerial communication mod 5 interviews  mba 1st sem by babasab patil...
Notes managerial communication mod 5 interviews mba 1st sem by babasab patil...
Babasab Patil
 
Notes managerial communication part 1 mba 1st sem by babasab patil (karrisatte)
Notes managerial communication part 1  mba 1st sem by babasab patil (karrisatte)Notes managerial communication part 1  mba 1st sem by babasab patil (karrisatte)
Notes managerial communication part 1 mba 1st sem by babasab patil (karrisatte)
Babasab Patil
 
Principles of marketing mba 1st sem by babasab patil (karrisatte)
Principles of marketing mba 1st sem by babasab patil (karrisatte)Principles of marketing mba 1st sem by babasab patil (karrisatte)
Principles of marketing mba 1st sem by babasab patil (karrisatte)
Babasab Patil
 
Marketing management module 1 important questions of marketing mba 1st sem...
Marketing management module 1  important questions of marketing   mba 1st sem...Marketing management module 1  important questions of marketing   mba 1st sem...
Marketing management module 1 important questions of marketing mba 1st sem...
Babasab Patil
 

More from Babasab Patil (20)

Segmentation module 4 mba 1st sem by babasab patil (karrisatte)
Segmentation module 4  mba 1st sem by babasab patil (karrisatte)Segmentation module 4  mba 1st sem by babasab patil (karrisatte)
Segmentation module 4 mba 1st sem by babasab patil (karrisatte)
 
Marketing management module 1 core concepts of marketing mba 1st sem by baba...
Marketing management module 1 core concepts of marketing  mba 1st sem by baba...Marketing management module 1 core concepts of marketing  mba 1st sem by baba...
Marketing management module 1 core concepts of marketing mba 1st sem by baba...
 
Marketing management module 2 marketing environment mba 1st sem by babasab pa...
Marketing management module 2 marketing environment mba 1st sem by babasab pa...Marketing management module 2 marketing environment mba 1st sem by babasab pa...
Marketing management module 2 marketing environment mba 1st sem by babasab pa...
 
Marketing management module 4 measuring andforecasting demand mba 1st sem by...
Marketing management module 4  measuring andforecasting demand mba 1st sem by...Marketing management module 4  measuring andforecasting demand mba 1st sem by...
Marketing management module 4 measuring andforecasting demand mba 1st sem by...
 
Measuring and forecasting demand module 4 mba 1st sem by babasab patil (karri...
Measuring and forecasting demand module 4 mba 1st sem by babasab patil (karri...Measuring and forecasting demand module 4 mba 1st sem by babasab patil (karri...
Measuring and forecasting demand module 4 mba 1st sem by babasab patil (karri...
 
Notes managerial communication 3 business correspondence and report writing ...
Notes managerial communication  3 business correspondence and report writing ...Notes managerial communication  3 business correspondence and report writing ...
Notes managerial communication 3 business correspondence and report writing ...
 
Notes managerial communication mod 2 basic communication skills mba 1st sem ...
Notes managerial communication mod 2  basic communication skills mba 1st sem ...Notes managerial communication mod 2  basic communication skills mba 1st sem ...
Notes managerial communication mod 2 basic communication skills mba 1st sem ...
 
Notes managerial communication mod 4 the job application process mba 1st sem ...
Notes managerial communication mod 4 the job application process mba 1st sem ...Notes managerial communication mod 4 the job application process mba 1st sem ...
Notes managerial communication mod 4 the job application process mba 1st sem ...
 
Notes managerial communication mod 5 interviews mba 1st sem by babasab patil...
Notes managerial communication mod 5 interviews  mba 1st sem by babasab patil...Notes managerial communication mod 5 interviews  mba 1st sem by babasab patil...
Notes managerial communication mod 5 interviews mba 1st sem by babasab patil...
 
Notes managerial communication part 1 mba 1st sem by babasab patil (karrisatte)
Notes managerial communication part 1  mba 1st sem by babasab patil (karrisatte)Notes managerial communication part 1  mba 1st sem by babasab patil (karrisatte)
Notes managerial communication part 1 mba 1st sem by babasab patil (karrisatte)
 
Principles of marketing mba 1st sem by babasab patil (karrisatte)
Principles of marketing mba 1st sem by babasab patil (karrisatte)Principles of marketing mba 1st sem by babasab patil (karrisatte)
Principles of marketing mba 1st sem by babasab patil (karrisatte)
 
Segmentation module 4 mba 1st sem by babasab patil (karrisatte)
Segmentation module 4  mba 1st sem by babasab patil (karrisatte)Segmentation module 4  mba 1st sem by babasab patil (karrisatte)
Segmentation module 4 mba 1st sem by babasab patil (karrisatte)
 
Marketing management module 1 important questions of marketing mba 1st sem...
Marketing management module 1  important questions of marketing   mba 1st sem...Marketing management module 1  important questions of marketing   mba 1st sem...
Marketing management module 1 important questions of marketing mba 1st sem...
 
Discovery shuttle processing NASA before launching the rocket by babasab ...
Discovery shuttle processing  NASA   before  launching the rocket by babasab ...Discovery shuttle processing  NASA   before  launching the rocket by babasab ...
Discovery shuttle processing NASA before launching the rocket by babasab ...
 
Corporate lessons from__iim__calcutta by babasab patil
Corporate lessons from__iim__calcutta by babasab patil Corporate lessons from__iim__calcutta by babasab patil
Corporate lessons from__iim__calcutta by babasab patil
 
Communication problems between men and women by babasab patil
Communication problems between men and women by babasab patil Communication problems between men and women by babasab patil
Communication problems between men and women by babasab patil
 
Brasil waterfall byy babasab patil
Brasil waterfall  byy babasab patil Brasil waterfall  byy babasab patil
Brasil waterfall byy babasab patil
 
Best aviation photography_ever__bar_none by babasab patil
Best aviation photography_ever__bar_none by babasab patil Best aviation photography_ever__bar_none by babasab patil
Best aviation photography_ever__bar_none by babasab patil
 
Attitude stone cutter
Attitude stone cutterAttitude stone cutter
Attitude stone cutter
 
Attitude stone cutter
Attitude stone cutterAttitude stone cutter
Attitude stone cutter
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 

Linear regression and correlation analysis ppt @ bec doms

  • 1. Introduction to Linear Regression and Correlation Analysis
  • 2.
  • 3.
  • 4.
  • 5. Scatter Plot Examples y x y x y y x x Linear relationships Curvilinear relationships
  • 6. Scatter Plot Examples y x y x y y x x Strong relationships Weak relationships (continued)
  • 7. Scatter Plot Examples y x y x No relationship (continued)
  • 8.
  • 9.
  • 10. Examples of Approximate r Values r = +.3 r = +1 y x y x y x y x y x r = -1 r = -.6 r = 0
  • 11. Calculating the Correlation Coefficient where: r = Sample correlation coefficient n = Sample size x = Value of the independent variable y = Value of the dependent variable Sample correlation coefficient: or the algebraic equivalent:
  • 12. Calculation Example Tree Height Trunk Diameter y x xy y 2 x 2 35 8 280 1225 64 49 9 441 2401 81 27 7 189 729 49 33 6 198 1089 36 60 13 780 3600 169 21 7 147 441 49 45 11 495 2025 121 51 12 612 2601 144  =321  =73  =3142  =14111  =713
  • 13. Calculation Example Trunk Diameter, x Tree Height, y (continued) r = 0.886 -> relatively strong positive linear association between x and y
  • 14. Excel Output Excel Correlation Output Tools / data analysis / correlation… Correlation between Tree Height and Trunk Diameter
  • 15.
  • 16. Example: Produce Stores Is there evidence of a linear relationship between tree height and trunk diameter at the .05 level of significance? H 0 : ρ = 0 (No correlation) H 1 : ρ ≠ 0 (correlation exists)  =.05 , df = 8 - 2 = 6
  • 17. Example: Test Solution Conclusion: There is evidence of a linear relationship at the 5% level of significance Decision: Reject H 0 Reject H 0 Reject H 0  /2=.025 -t α/2 Do not reject H 0 0 t α/2  /2=.025 -2.4469 2.4469 4.68 d.f. = 8-2 = 6
  • 18.
  • 19.
  • 20. Types of Regression Models Positive Linear Relationship Negative Linear Relationship Relationship NOT Linear No Relationship
  • 21. Population Linear Regression Linear component The population regression model: Population y intercept Population Slope Coefficient Random Error term, or residual Dependent Variable Independent Variable Random Error component
  • 22.
  • 23. Population Linear Regression (continued) Random Error for this x value y x Observed Value of y for x i Predicted Value of y for x i x i Slope = β 1 Intercept = β 0 ε i
  • 24. Estimated Regression Model The sample regression line provides an estimate of the population regression line Estimate of the regression intercept Estimate of the regression slope Estimated (or predicted) y value Independent variable The individual random error terms e i have a mean of zero
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30. Sample Data for House Price Model House Price in $1000s (y) Square Feet (x) 245 1400 312 1600 279 1700 308 1875 199 1100 219 1550 405 2350 324 2450 319 1425 255 1700
  • 31.
  • 32. Excel Output The regression equation is: Regression Statistics Multiple R 0.76211 R Square 0.58082 Adjusted R Square 0.52842 Standard Error 41.33032 Observations 10 ANOVA   df SS MS F Significance F Regression 1 18934.9348 18934.9348 11.0848 0.01039 Residual 8 13665.5652 1708.1957 Total 9 32600.5000         Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
  • 38.
  • 39. Explained and Unexplained Variation (continued) X i y x y i SST =  ( y i - y ) 2 SSE =  ( y i - y i ) 2  SSR =  ( y i - y ) 2  _ _ _ y  y y _ y 
  • 40.
  • 41.
  • 42. Examples of Approximate R 2 Values R 2 = +1 y x y x R 2 = 1 R 2 = 1 Perfect linear relationship between x and y: 100% of the variation in y is explained by variation in x
  • 43. Examples of Approximate R 2 Values y x y x 0 < R 2 < 1 Weaker linear relationship between x and y: Some but not all of the variation in y is explained by variation in x
  • 44. Examples of Approximate R 2 Values R 2 = 0 No linear relationship between x and y: The value of Y does not depend on x. (None of the variation in y is explained by variation in x) y x R 2 = 0
  • 45. Excel Output 58.08% of the variation in house prices is explained by variation in square feet Regression Statistics Multiple R 0.76211 R Square 0.58082 Adjusted R Square 0.52842 Standard Error 41.33032 Observations 10 ANOVA   df SS MS F Significance F Regression 1 18934.9348 18934.9348 11.0848 0.01039 Residual 8 13665.5652 1708.1957 Total 9 32600.5000         Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
  • 46.
  • 47.
  • 48. Excel Output Regression Statistics Multiple R 0.76211 R Square 0.58082 Adjusted R Square 0.52842 Standard Error 41.33032 Observations 10 ANOVA   df SS MS F Significance F Regression 1 18934.9348 18934.9348 11.0848 0.01039 Residual 8 13665.5652 1708.1957 Total 9 32600.5000         Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
  • 49. Comparing Standard Errors y y y x x x y x Variation of observed y values from the regression line Variation in the slope of regression lines from different possible samples
  • 50.
  • 51. Inference about the Slope: t Test Estimated Regression Equation: The slope of this model is 0.1098 Does square footage of the house affect its sales price? (continued) House Price in $1000s (y) Square Feet (x) 245 1400 312 1600 279 1700 308 1875 199 1100 219 1550 405 2350 324 2450 319 1425 255 1700
  • 52.
  • 53. Regression Analysis for Description Confidence Interval Estimate of the Slope: Excel Printout for House Prices: At 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858) d.f. = n - 2   Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
  • 54. Regression Analysis for Description Since the units of the house price variable is $1000s, we are 95% confident that the average impact on sales price is between $33.70 and $185.80 per square foot of house size This 95% confidence interval does not include 0 . Conclusion: There is a significant relationship between house price and square feet at the .05 level of significance   Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
  • 55. Confidence Interval for the Average y, Given x Confidence interval estimate for the mean of y given a particular x p Size of interval varies according to distance away from mean, x
  • 56. Confidence Interval for an Individual y, Given x Confidence interval estimate for an Individual value of y given a particular x p This extra term adds to the interval width to reflect the added uncertainty for an individual case
  • 57. Interval Estimates for Different Values of x y x Prediction Interval for an individual y, given x p x p y = b 0 + b 1 x  x Confidence Interval for the mean of y, given x p
  • 58. Example: House Prices Estimated Regression Equation: Predict the price for a house with 2000 square feet House Price in $1000s (y) Square Feet (x) 245 1400 312 1600 279 1700 308 1875 199 1100 219 1550 405 2350 324 2450 319 1425 255 1700
  • 59. Example: House Prices Predict the price for a house with 2000 square feet: The predicted price for a house with 2000 square feet is 317.85($1,000s) = $317,850 (continued)
  • 60. Estimation of Mean Values: Example Find the 95% confidence interval for the average price of 2,000 square-foot houses Predicted Price Y i = 317.85 ($1,000s)  Confidence Interval Estimate for E(y)|x p The confidence interval endpoints are 280.66 -- 354.90, or from $280,660 -- $354,900
  • 61. Estimation of Individual Values: Example Find the 95% confidence interval for an individual house with 2,000 square feet Predicted Price Y i = 317.85 ($1,000s)  Prediction Interval Estimate for y|x p The prediction interval endpoints are 215.50 -- 420.07, or from $215,500 -- $420,070
  • 62.
  • 63.
  • 64.
  • 65. Residual Analysis for Linearity Not Linear Linear  x residuals x y x y x residuals
  • 66. Residual Analysis for Constant Variance Non-constant variance  Constant variance x x y x x y residuals residuals
  • 67. Excel Output RESIDUAL OUTPUT Predicted House Price Residuals 1 251.92316 -6.923162 2 273.87671 38.12329 3 284.85348 -5.853484 4 304.06284 3.937162 5 218.99284 -19.99284 6 268.38832 -49.38832 7 356.20251 48.79749 8 367.17929 -43.17929 9 254.6674 64.33264 10 284.85348 -29.85348