Sampling Bias
Dr.K.Prabhakar
Bias
• Once the data are collected, we represent them by way of a model.
Let us assume a linear model.
• This may be written as y (outcome) = a1x1 + a2x2 + a3x3 + … + anxn + error
• The outcome is expressed as a set of predictor variables (the x's),
each multiplied by a coefficient. The coefficients (the a's in the
equation) are the parameters, and they tell us about the relationship
between each predictor and the outcome variable.
• The prediction will not be perfect: there will be an error because we
are using sample data to predict the outcome variable.
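
As a minimal sketch of the linear model on this slide, the following fits y = a1x1 + a2x2 + error by least squares and shows the leftover error for each observation. The data values and the helper `dot` are made up for illustration; they are not from the slides.

```python
# Hypothetical sample: 6 observations, 2 predictors (values invented for illustration).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [8.1, 6.9, 18.2, 16.8, 28.1, 27.0]


def dot(u, v):
    """Sum of element-wise products of two equal-length lists."""
    return sum(p * q for p, q in zip(u, v))


# Normal equations for y = a1*x1 + a2*x2 + error
# (no intercept, matching the equation on the slide).
s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
s1y, s2y = dot(x1, y), dot(x2, y)
det = s11 * s22 - s12 * s12

# Solve the 2x2 system with Cramer's rule.
a1 = (s1y * s22 - s2y * s12) / det
a2 = (s11 * s2y - s12 * s1y) / det

predicted = [a1 * u + a2 * v for u, v in zip(x1, x2)]
error = [obs - p for obs, p in zip(y, predicted)]  # part of y the model misses

print("a1 =", a1, "a2 =", a2)
print("errors:", error)
```

The errors are not all zero even though the coefficients are the best available fit, which is the point of the last bullet: a model estimated from a sample predicts imperfectly.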
The contexts for bias
• Things that bias the parameter estimates
• Things that bias standard errors and confidence intervals
• Things that bias test statistics and p-values. These biases are related,
because confidence intervals and test statistics are both computed from
the standard error. If the test statistics are biased, the confidence
intervals will be biased, and vice versa.
• If the test statistic is biased, the conclusions drawn from it will be
biased, so we need to identify and eliminate the biases as far as possible.
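
To illustrate why these biases travel together, here is a small sketch that builds a confidence interval and a test statistic from the same estimate and standard error. It assumes a normal approximation; the function name `interval_and_test` and the numbers are hypothetical.

```python
from statistics import NormalDist


def interval_and_test(estimate, standard_error, confidence=0.95):
    """Return (lower, upper, z) under a normal approximation."""
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower = estimate - z_crit * standard_error
    upper = estimate + z_crit * standard_error
    z = estimate / standard_error  # test statistic for H0: parameter = 0
    return lower, upper, z


# Same hypothetical coefficient estimate, once with a smaller standard
# error and once with an inflated one.
for se in (0.5, 1.0):
    lower, upper, z = interval_and_test(2.0, se)
    print(f"SE={se}: CI=({lower:.2f}, {upper:.2f}), z={z:.2f}")
```

Because both quantities depend on the standard error, inflating it widens the interval and shrinks the test statistic in the same step.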
Assumptions that lead to bias
1. Presence of outliers
2. Additivity and linearity
3. Normality
4. Homoscedasticity or homogeneity of variance
5. Independence
Outliers
• The presence of outliers will bias estimates computed from the data.
• For example, if the class average is 60 marks and the standard deviation
is 10 marks, then a few students scoring zero or 100 marks may bias
the results.
• Outliers need to be identified and removed or replaced to have a
better representation of the data. They generally affect the mean of the
data as well as the sum of squared errors. The sum of squares is
used to compute the standard deviation, which in turn is used to
estimate the standard error. The standard error is used for confidence
intervals around the parameter estimates. Thus an outlier has a domino
effect on the results.
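
The domino effect can be sketched with a made-up class of ten students whose marks average 60. The marks and the helper `summarize` are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(marks):
    """Return the mean, sample standard deviation, and standard error of the mean."""
    m = mean(marks)
    sd = stdev(marks)
    se = sd / sqrt(len(marks))
    return m, sd, se


# Hypothetical class of 10 students with marks centred on 60.
marks = [45, 50, 55, 58, 60, 60, 62, 65, 70, 75]
with_outlier = marks + [0]  # one extra student scores zero

m0, sd0, se0 = summarize(marks)
m1, sd1, se1 = summarize(with_outlier)

print(f"without outlier: mean={m0:.2f} sd={sd0:.2f} se={se0:.2f}")
print(f"with outlier:    mean={m1:.2f} sd={sd1:.2f} se={se1:.2f}")
```

A single zero pulls the mean down and inflates the standard deviation, and the larger standard deviation carries through to the standard error and hence to any confidence interval built on it.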
Additivity and Linearity
• The assumption is that the outcome variable is linearly related to each
predictor. That means the relationship may be summed up as a
straight line.
• If there are several predictors, as we have seen in the equation
y (outcome) = a1x1 + a2x2 + a3x3 + … + anxn + error
their combined effect is described by adding their effects together.
If the assumption holds, the model can be described accurately by the
equation given here.
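
A short sketch of what additivity means in practice, using two predictors with hypothetical coefficients (the values 2.0 and 3.0 are made up):

```python
# Hypothetical coefficients for a model with two predictors.
a1, a2 = 2.0, 3.0


def predict(x1, x2):
    """Additive linear model: each predictor contributes a_i * x_i."""
    return a1 * x1 + a2 * x2


base = predict(1.0, 1.0)
effect_x1 = predict(2.0, 1.0) - base  # raise x1 by one unit
effect_x2 = predict(1.0, 2.0) - base  # raise x2 by one unit
combined = predict(2.0, 2.0) - base   # raise both by one unit

print(effect_x1, effect_x2, combined)
```

Under the additive model, the change from raising both predictors equals the sum of the two separate changes. If the real relationship included an interaction between the predictors, this equality would fail and the model would be a poor description of the data.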
Assumption of Normality
• There is a mistaken belief that the assumption of normality means the data
themselves need to be normally distributed. This misconception stems from the
fact that if the data are normally distributed, then the errors in the model
and the sampling distribution are also normally distributed.
• The central limit theorem means that there are situations in which we
can assume normality regardless of the shape of the sample data.
• Normality matters when you construct confidence intervals around the parameters
of the model or compute significance tests relating to those parameters, and it
matters mainly in small samples.
• As long as the sample size is fairly large and outliers are taken into account,
the assumption of normality will not be a pressing concern.
• Lumley, T., Diehr, P., Emerson, S., & Chen, L. (2002). The importance of
the normality assumption in large public health data sets. Annual Review of
Public Health, 23(1), 151-169.
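
The central limit theorem point on this slide can be checked with a small simulation. This is a sketch under stated assumptions: the exponential "population", the sample size of 50, the number of repetitions, and the helper `skewness` are all choices made here for illustration.

```python
import random
from statistics import mean, pstdev


def skewness(values):
    """Moment-based skewness (close to 0 for a symmetric distribution)."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


rng = random.Random(0)  # fixed seed so the run is repeatable

# Strongly skewed raw data: exponential distribution with mean 1.
raw = [rng.expovariate(1.0) for _ in range(5000)]

# Sampling distribution of the mean: 2000 samples, each of size 50.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(50)]) for _ in range(2000)
]

skew_raw = skewness(raw)
skew_means = skewness(sample_means)

print("skewness of raw data:    ", round(skew_raw, 2))
print("skewness of sample means:", round(skew_means, 2))
```

The raw data are far from normal, yet the sample means are much closer to symmetric. This is why, with a fairly large sample, inference about the parameters can lean on normality even when the data themselves are not normal.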
Homoscedasticity or homogeneity of variance
