SlideShare a Scribd company logo
1 of 63
UNIVARIATE ANALYSIS
Dr. Soumya Swaroop Sahoo
JR, Community Medicine
PGIMS Rohtak
Contents
• Introduction
• Variables
• Types of variables
• Scales of measurement
• Types of analysis
• Components of univariate analysis
• Advantages and limitations
Introduction
• The word “statistics” has several meanings: data or numbers,
the process of analyzing the data, and the description of a field
of study.
• It is derived from the Latin word status, meaning “manner of
standing” or “position.”
• Statistics were first used by tax assessors to collect
information for determining assets and assessing taxes.
Introduction
• Variable -any character, characteristic or quality that varies is
termed a variable.
• E.g. - to collect basic clinical and demographic information on
patients with a particular illness. The variables of interest may
include the sex, age and height of the patients.
Nominal
Categories are
mutually
exclusive and
unordered
e. g- Sex (M/F)
Blood group
(A/B/AB/O)
Ordinal
Categories are
mutually
exclusive and
ordered
e.g- Disease
severity
(mild/moderate/
severe)
Discrete
Integer values,
typically counts
No notion of
magnitude
e.g- no. of
children
vaccinated, days
sick per year
VARIABLE
Categorical
(Qualitative)
Numerical
(Quantitative)
Continuous
Takes any value
in a range of
values
Have a
magnitude
e. g- weight in kg,
Height in cm
Scales of measurement
• Nominal or categorical
• Ordinal
• Interval
• Ratio
Nominal scale
• Simplest level of measurement when data values fit into
categories.
• Observations are dichotomous or binary in that the outcome
can take on only one of two values: yes or no.
• Mutually exclusive.
• E.g sex of patient(M/F), nationality
Ordinal scale
• When an inherent order occurs among the categories, the
observations are said to be measured on an ordinal scale.
• Clinicians often use ordinal scales to determine a patient's
amount of risk or the appropriate type of therapy.
• E.g socio-economic class, rank order of a class(1st,2nd, 3rd)
VAS( visual analog scale) for pain
Scales of measurement
• Interval Scale
• Data classified by ranking.
• Quantitative classification .
• Zero point of scale is arbitrary (differences are
meaningful).
• Fahrenheit temp. scale , Time
• Ratio Scale
• Data classified as the ratio of two numbers.
• Quantitative classification .
• Zero point of scale is absolute (data can be added,
subtracted, multiplied, and divided).
• E. g- Kelvin temp. scale, Weight, Height
Statistics
InferentialDescriptive
Collecting
Organising
Summarising
Presenting Data
Making inference
Hypothesis testing
Determining
relationships
Making predictions
Descriptive statistics
• Descriptive statistics is the term given to the analysis of data
that helps describe, show or summarize data in a meaningful
way such that patterns might emerge from the data.
• Does not allow us to make conclusions beyond the data we
have analysed or reach conclusions regarding any hypotheses
we might have made.
• Simply a way to describe the data.
Inferential statistics
• Inferential statistics is concerned with making predictions or
inferences about a population from observations and analysis
of a sample.
• We can take the results of an analysis using a sample and can
generalize it to the larger population that the sample
represents.
Three types of analysis
• Univariate analysis
– the examination of the distribution of cases on only one
variable at a time (e.g., weight of college students)
• Bivariate analysis
– the examination of two variables simultaneously (e.g., the
relation between gender and weight of college students )
• Multivariate analysis
– the examination of more than two variables simultaneously
(e.g., the relationship between gender, race and weight of
college students)
Purpose of diff. types of analysis
• Univariate analysis
– Purpose: mainly description
• Bivariate analysis
– Purpose: determining the empirical relationship between the
two variables
• Multivariate analysis
– Purpose: determining the empirical relationship among
multiple variables
Choosing the Statistical Technique
Specific research question or hypothesis
Determine # of variables in question
Univariate analysis Bivariate analysis Multivariate analysis
Determine level of
measurement of variables
Choose univariate
method of analysis
Choose relevant
descriptive statistics
Choose relevant
inferential statistics
UNIVARIATE ANALYSIS
INFERENTIAL STATISTICSDESCRIPTIVE STATISTICS
1)Measures of central tendency
Mean
Median
Mode
2)Measures of dispersion
Range
Variance
Standard deviation
1)Z Test
2)T test
3)Chi-square test
Mean
• Arithmetic, or simple, mean is used most frequently in
statistics.
• Arithmetic average of the observations.
Median
• The median is the middle observation, that is,
the point at which half the observations are
smaller and half are larger.
• Used in place of mean when the data is
skewed (not sensitive to outliers)
• E.g –median price of housing in real estate,
median household income
Mode
• Value that occurs most frequently.
• Commonly used for a large
number of observations when the
researcher wants to designate the
value that occurs most often.
• Bimodal: When a set of data has
two modes
• For frequency tables mode is
estimated by the modal class.
Example of mean median mode
related to incomes
Shapes of distribution
Normal distribution:
symmetrical Bell-shaped
curve
Positively skewed:
tail on the right, cluster towards low end
of the variable
Negatively skewed:
tail on the left, cluster towards high-end
of the variable
symmetrical
Bimodality: A double peak
asymmetrical
Position of mean median mode
The mean is used for numerical data and for symmetric (not
skewed) distributions.
• The median is used for ordinal data or for numerical data if the
distribution is skewed.
• The mode is used primarily for bimodal distributions.
• The geometric mean is
Mean>median>modeMode>median>mean
Mode>median>mean
Averages: advantages and disadvantages
Types of average Advantage Disadvantage
Mean 1.Uses all data values
2.Algebraically defined
1.Distorted by outliers
2.Distorted by skewness
Median 1.Not distorted by outliers
2.Not distorted by skewness
1.Ignores most of the
information
2.Not Algebraically defined
Mode 1.Easily determined for
categorical data
1.Not Algebraically defined
Geometric mean 1.Appropriate for skewed data 1.Only appropriate if log
transformation produces a
symmetrical distribution
Range
• Difference between minimum and maximum value in a data
set
• Larger range usually (but not always) indicates a large spread
or deviation in the values of the data set.
• (73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74,
50, 85, 45, 63, 100)
• Range- 100-45 =55
• Range defines the normal limits of a biological
characteristic.
• e. g- systolic BP – 100-140 mm of Hg
diastolic BP - 80-90 mm of Hg
Urea - 15-40 mg
Variance
• A measure of how data points differ from the mean
– Measure of dispersion for a given score, The larger the variance is,
the more the scores deviate, on average, away from the mean
• Average of the squared differences from the mean
– difficult to interpret because it is in the units of square of variables
For population variance,
 
2
2
x X
N

 

For sample variance,
 
2
2
1
x X
s
n
 


Standard Deviation
• A measure of variation that gives spread of data about the
mean
• Square root of the variance
• higher standard deviation indicates higher spread, less
consistency, and less clustering.
• sample standard deviation:
• population standard deviation:
• measured in the same units as the data, making it easy to
interpret.
 
2
1
x X
s
n
 


 
2
x
N


 

Relative or Standard Normal Deviate or Variate (Z)
• Deviation from the mean in a
normal distribution or curve
• Given by symbol “Z”
• Z =
𝒐𝒃𝒔𝒆𝒓𝒗𝒂𝒕𝒊𝒐𝒏−𝑴𝒆𝒂𝒏
𝑺𝑫
• Indicates how much an
observation is bigger or smaller
than mean in units of SD
• 95% confidence level: Z = 1.96
• 99% confidence level: Z = 2.575
• 99.9 confidence level: Z = 3.27
•
Normal distribution curve with SD
Inferential univariate analysis
components
Test Description Test Statistic
Compare an Observed
Mean with Some
Predetermined Value
Z or t-test
Compare an Observed
Frequency with a
Predetermined Value
X 2
Compare an Observed
Proportion with Some
Predetermined Value
Z or t-test for Proportions
Type of variable
Nominal
proportions
t test of
proportion
NominalOrdinalInterval or ratio
Kolmogorov-
Smirnov test
Chi square testZ test or t test
Drawing sample from population
Interval estimation
Confidence interval (CI)
Provide us with a range of values that we believe, with a
given level of confidence, containes a true value
CI for the population means
n
SD
SEM
SEMxCI
SEMxCI



58.2%99
96.1%95
Testing of hypotheses
learning objectives:
» to understand the role of significance
» to distinguish the null and alternative
hypotheses
» to interpret p-value, type I and II errors
Hypothesis testing
• Hypotheses are defined as formal statements of explanations
stated in a testable form.
• To test statistical hypotheses two presumptions are made to
draw the inference from sample value.
• Logic- designed to detect significant differences: differences
that did not occur by random chance.
Formulate
hypotheses
Collect data to
test hypotheses
Accept hypothesis Reject hypothesis
Null and alternate hypothesis
1. Null Hypothesis (H0)
– The difference is caused by random chance.
– The H0 always states there is “no significant difference.” it
means that there is no significant difference between the
population mean and the sample mean.
2. Alternate hypothesis (H1)
– “The difference is real”.
– (H1) always contradicts the H0.
• One (and only one) of these explanations must be true.
Testing Hypotheses:
The Five Step Model
1. Make Assumptions and meet test requirements.
2. State the null hypothesis.
3. Select the sampling distribution and establish the critical
region.
4. Compute the test statistic.
5. Make a decision and interpret results.
Two-tailed vs. One-tailed Tests
• In a two-tailed test, the direction of the difference is not
predicted.
• A two-tailed test splits the critical region equally on both sides
of the curve.
• In a one-tailed test, the researcher predicts the direction (i.e.
greater or less than) of the difference.
• All of the critical region is placed on the side of the curve in
the direction of the prediction.
The Curve for Two- vs. One-tailed Tests at α = .05:
Two-tailed test:
“is there a significant
difference?”
One-tailed tests:
“is the sample mean
greater than µ or Pu?”
“is the sample mean
less than µ or Pu?”
Two tailed test
One tailed test
Type I and Type II Errors
• Type I, or alpha error:
– Rejecting a true null hypothesis
• Type II, or beta error:
– Failing to reject a false null hypothesis
Testing of hypotheses
Type I and Type II Errors. Example
Suppose there is a test for a particular disease.
If the disease really exists and is diagnosed early, it can be
successfully treated
If it is not diagnosed and not treated, the person will
become severely disabled
If a person is erroneously diagnosed as having the disease
and treated, no physical damage is done.
To which type of error we are willing to risk ?
Testing of hypotheses
Type I and Type II Errors. Example.
Decision No disease Disease
Not diagnosed OK Type II error
Diagnosed Type I error OK
treated but not harmed
by the treatment
irreparable damage
would be done
Inference: to avoid Type error II, have high level of
significance
Z-test
• For testing the significance of differences between two means
of samples
• It calculates the probability that the two samples could have
drawn from populations of the same mean, differences arising
merely from sampling variability.
• Assumptions for z-test
• Population approximately Normal or large sample. Sample
size must be larger than 30
• Data must be quantitative
• The variable is assumed to follow normal distribution in
the population
Example (z test of proportion)
• In a recent survey, 55% of the population were
found to be aware of modes of HIV
transmission. A random sample of 150 urban
persons showed that 49% of them were aware
of the same. Is the difference significant?
• Using the formula for proportions and 5 step
method to solve…
Solution:
Step 1:
Random sample
The sample is large (>30)
Step 2:
H0: Pu = .55 (converting % to proportion)
( H0: Ps = Pu )
H1: Pu≠ .55
Step 3:
The sample is large, so we use Z distribution
Alpha (α) = .05
Critical Z = ±1.96
Solution (cont.)
Step 4
Step 5
• Z (obtained) < Z (critical)
• Fail to reject Ho. There is no significant difference
between the state population and the urban sample.
48.1
150/)55.1(55.
55.49.
/)1(







nPP
PP
Z
uu
us
t test
• When the sample size is small (approximately < 30)
then the Student’s t distribution should be used
• The test statistic is known as “t”.
• The curve of the t distribution is flatter than that of
the Z distribution but as the sample size increases, the
t-curve starts to resemble the Z-curve
• t Z as n increases,
If n>100, t approaches Z.
Example of t test
• A random sample of 26 sociology graduates
scored 458 on the advanced sociology test with
a standard deviation of 20. Is this significantly
different from the population average (µ =
440)?
Solution (using five step model)
• Step 1: Make Assumptions and Meet Test
Requirements:
• 1. Random sample
• 2. Level of measurement is interval-ratio
• 3. The sample is small (<30)
Solution (cont.)
Step 2: State the null and alternate hypotheses.
H0: null hypothesis µ = 440 (or H0: = μ)
H1:alternate hypothesis µ ≠ 440
Solution (cont.)
• Step 3: Select Sampling Distribution and
Establish the Critical Region
1. Small sample, I-R scale of measurement, so t
test to be used.
2. Alpha (α) = .05
3. Degrees of Freedom = n-1 = 26-1 = 25
4. Critical t = ±2.060
Solution (cont.)
• Step 4: Using Formula to Compute the Test
Statistic
5.4
126
20
440458
1







n
S
t

Looking at the curve for the t distribution
Alpha (α) = .05
t= -2.060
c
t = +2.060
c t= +4.50
I
Step 5 Make a Decision and Interpret
Results
• The obtained t score fell in the Critical Region, so we
reject the H0 { t (obtained) > t (critical) }
– If the H0 were true, a sample outcome of 458 would
be unlikely.
– Therefore, the H0 is false and must be rejected.
• Sociology graduates have a score that is significantly
different from the general student body at (t = 4.5, df =
25, α = .05).
Main Considerations in Hypothesis
Testing:
• Sample size
– Use Z for large samples, t for small (<100)
• There are two other choices to be made:
– One-tailed or two-tailed test
• “Is there a difference?” = 2-tailed test
• “Is the difference less than or greater than?” = 1-tailed test
• Alpha (α) level
– .05, .01, or .001? (α=.05 is most common)
Selected nonparametric tests
Chi-Square goodness of fit test.
To determine whether a variable has a frequency distribution
comparable to the one expected
expected frequency can be based on
• theory
• previous experience
• comparison groups
2
)(
1
eioi
ei
ff
f



Selected nonparametric tests
Chi-Square goodness of fit test. Example
The average prognosis of total hip replacement in relation to pain
reduction in hip joint is
excellent - 80%
good - 10%
medium - 5%
bad - 5%
In our study of we had got a different outcome
excellent - 95%
good - 2%
medium - 2%
bad - 1%
expected
observed
Do observed frequencies differ from expected ?
Selected nonparametric tests
Chi-Square goodness of fit test. Example
fe1= 80, fe2= 10, fe3=5, fe4= 5;
fo1= 95, fo2= 2, fo3=2, fo4= 1;
2= 14.2, df=3 (4-1)
0.001 < p < 0.01
Null hypothesis is rejected
2 > 9.84 p < 0.05
2 > 11.34 p < 0.01
2 > 16.27 p < 0.001
Advantages
• Simpler model
– Easier to build, test and understand than other models
• More reliable
– More reliable as only one variable is used.
• Descriptive method
– A univariate model is a strong descriptive method. Analysts
can change one variable each time the model is run to
obtain results that show “what if” scenarios. For example,
changing the variables from age to income can show
different results which describe what happens when one
factor changes within the model.
Disadvantages
• Not comprehensive
– A univariate model is less comprehensive compared to
multivariate models. In the real world, there is often more
than just one factor at play and a univariate model is unable
to take this into account due to its inherent limitations.
• Does not establish relationships
– As only one variable can be changed at a time, univariate
models are unable to show relationships between different
factors.
References
• Methods in Biostatistics by BK Mahajan
• Statistical Methods by SP Gupta
• Basic & Clinical Biostatistics by Dawson and Beth
• Park’s textbook of preventive and social medicine 22nd
edition
 Univariate Analysis

More Related Content

What's hot

Hypothesis testing; z test, t-test. f-test
Hypothesis testing; z test, t-test. f-testHypothesis testing; z test, t-test. f-test
Hypothesis testing; z test, t-test. f-testShakehand with Life
 
Testing of hypothesis
Testing of hypothesisTesting of hypothesis
Testing of hypothesisJags Jagdish
 
Bivariate analysis
Bivariate analysisBivariate analysis
Bivariate analysisariassam
 
Ppt for 1.1 introduction to statistical inference
Ppt for 1.1 introduction to statistical inferencePpt for 1.1 introduction to statistical inference
Ppt for 1.1 introduction to statistical inferencevasu Chemistry
 
Testing of hypothesis - large sample test
Testing of hypothesis - large sample testTesting of hypothesis - large sample test
Testing of hypothesis - large sample testParag Shah
 
Sampling distribution
Sampling distributionSampling distribution
Sampling distributionswarna dey
 
Data Collection-Primary & Secondary
Data Collection-Primary & SecondaryData Collection-Primary & Secondary
Data Collection-Primary & SecondaryPrathamesh Parab
 
PRIMARY & SECONDARY DATA COLLECTION
PRIMARY & SECONDARY DATA  COLLECTION PRIMARY & SECONDARY DATA  COLLECTION
PRIMARY & SECONDARY DATA COLLECTION Suneal Saini
 
Linear regression
Linear regressionLinear regression
Linear regressionTech_MX
 
Factor analysis
Factor analysisFactor analysis
Factor analysissaba khan
 
SAMPLING AND SAMPLING ERRORS
SAMPLING AND SAMPLING ERRORSSAMPLING AND SAMPLING ERRORS
SAMPLING AND SAMPLING ERRORSrambhu21
 
Testing of hypotheses
Testing of hypothesesTesting of hypotheses
Testing of hypothesesRajThakuri
 
Inferential statistics.ppt
Inferential statistics.pptInferential statistics.ppt
Inferential statistics.pptNursing Path
 

What's hot (20)

Hypothesis testing; z test, t-test. f-test
Hypothesis testing; z test, t-test. f-testHypothesis testing; z test, t-test. f-test
Hypothesis testing; z test, t-test. f-test
 
Chi squared test
Chi squared testChi squared test
Chi squared test
 
Testing of hypothesis
Testing of hypothesisTesting of hypothesis
Testing of hypothesis
 
Bivariate analysis
Bivariate analysisBivariate analysis
Bivariate analysis
 
Summarizing data
Summarizing dataSummarizing data
Summarizing data
 
Chi square test
Chi square testChi square test
Chi square test
 
Ppt for 1.1 introduction to statistical inference
Ppt for 1.1 introduction to statistical inferencePpt for 1.1 introduction to statistical inference
Ppt for 1.1 introduction to statistical inference
 
Testing of hypothesis - large sample test
Testing of hypothesis - large sample testTesting of hypothesis - large sample test
Testing of hypothesis - large sample test
 
sampling ppt
sampling pptsampling ppt
sampling ppt
 
Sampling distribution
Sampling distributionSampling distribution
Sampling distribution
 
Multivariate Analysis
Multivariate AnalysisMultivariate Analysis
Multivariate Analysis
 
Data Collection-Primary & Secondary
Data Collection-Primary & SecondaryData Collection-Primary & Secondary
Data Collection-Primary & Secondary
 
PRIMARY & SECONDARY DATA COLLECTION
PRIMARY & SECONDARY DATA  COLLECTION PRIMARY & SECONDARY DATA  COLLECTION
PRIMARY & SECONDARY DATA COLLECTION
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Factor analysis
Factor analysisFactor analysis
Factor analysis
 
Type of data
Type of dataType of data
Type of data
 
SAMPLING AND SAMPLING ERRORS
SAMPLING AND SAMPLING ERRORSSAMPLING AND SAMPLING ERRORS
SAMPLING AND SAMPLING ERRORS
 
Testing of hypotheses
Testing of hypothesesTesting of hypotheses
Testing of hypotheses
 
Sample design
Sample designSample design
Sample design
 
Inferential statistics.ppt
Inferential statistics.pptInferential statistics.ppt
Inferential statistics.ppt
 

Viewers also liked

Multivariate Analysis Techniques
Multivariate Analysis TechniquesMultivariate Analysis Techniques
Multivariate Analysis TechniquesMehul Gondaliya
 
ppt on data collection , processing , analysis of data & report writing
ppt on data collection , processing , analysis of data & report writingppt on data collection , processing , analysis of data & report writing
ppt on data collection , processing , analysis of data & report writingIVRI
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Dmitry Grapov
 
Univariate, bivariate analysis, hypothesis testing, chi square
Univariate, bivariate analysis, hypothesis testing, chi squareUnivariate, bivariate analysis, hypothesis testing, chi square
Univariate, bivariate analysis, hypothesis testing, chi squarekongara
 
Statistical Methods
Statistical MethodsStatistical Methods
Statistical Methodsguest9fa52
 
Questionnaire Construction
Questionnaire ConstructionQuestionnaire Construction
Questionnaire ConstructionPrakash Aryal
 
Data Preparation and Processing
Data Preparation and ProcessingData Preparation and Processing
Data Preparation and ProcessingMehul Gondaliya
 
Multivariate Analysis An Overview
Multivariate Analysis An OverviewMultivariate Analysis An Overview
Multivariate Analysis An Overviewguest3311ed
 
DIstinguish between Parametric vs nonparametric test
 DIstinguish between Parametric vs nonparametric test DIstinguish between Parametric vs nonparametric test
DIstinguish between Parametric vs nonparametric testsai prakash
 
Research report ppt
Research report pptResearch report ppt
Research report pptVikas Dalmia
 
Data analysis powerpoint
Data analysis powerpointData analysis powerpoint
Data analysis powerpointjamiebrandon
 
steps in Questionnaire design
steps in Questionnaire designsteps in Questionnaire design
steps in Questionnaire designheena pathan
 
My research proposal.ppt
My research proposal.pptMy research proposal.ppt
My research proposal.pptnanimamat
 
The Research Proposal
The Research ProposalThe Research Proposal
The Research Proposalguest349908
 

Viewers also liked (20)

Bivariate
BivariateBivariate
Bivariate
 
Univariate Analysis
Univariate AnalysisUnivariate Analysis
Univariate Analysis
 
Multivariate Analysis Techniques
Multivariate Analysis TechniquesMultivariate Analysis Techniques
Multivariate Analysis Techniques
 
ppt on data collection , processing , analysis of data & report writing
ppt on data collection , processing , analysis of data & report writingppt on data collection , processing , analysis of data & report writing
ppt on data collection , processing , analysis of data & report writing
 
Multivariate analysis
Multivariate analysisMultivariate analysis
Multivariate analysis
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)
 
Bivariate
BivariateBivariate
Bivariate
 
Manova
ManovaManova
Manova
 
Univariate, bivariate analysis, hypothesis testing, chi square
Univariate, bivariate analysis, hypothesis testing, chi squareUnivariate, bivariate analysis, hypothesis testing, chi square
Univariate, bivariate analysis, hypothesis testing, chi square
 
Statistical Methods
Statistical MethodsStatistical Methods
Statistical Methods
 
Questionnaire Construction
Questionnaire ConstructionQuestionnaire Construction
Questionnaire Construction
 
Data Preparation and Processing
Data Preparation and ProcessingData Preparation and Processing
Data Preparation and Processing
 
Multivariate Analysis An Overview
Multivariate Analysis An OverviewMultivariate Analysis An Overview
Multivariate Analysis An Overview
 
DIstinguish between Parametric vs nonparametric test
 DIstinguish between Parametric vs nonparametric test DIstinguish between Parametric vs nonparametric test
DIstinguish between Parametric vs nonparametric test
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Research report ppt
Research report pptResearch report ppt
Research report ppt
 
Data analysis powerpoint
Data analysis powerpointData analysis powerpoint
Data analysis powerpoint
 
steps in Questionnaire design
steps in Questionnaire designsteps in Questionnaire design
steps in Questionnaire design
 
My research proposal.ppt
My research proposal.pptMy research proposal.ppt
My research proposal.ppt
 
The Research Proposal
The Research ProposalThe Research Proposal
The Research Proposal
 

Similar to Univariate Analysis

Basic stat analysis using excel
Basic stat analysis using excelBasic stat analysis using excel
Basic stat analysis using excelParag Shah
 
Statistical Methods in Research
Statistical Methods in ResearchStatistical Methods in Research
Statistical Methods in ResearchManoj Sharma
 
Medical Statistics.ppt
Medical Statistics.pptMedical Statistics.ppt
Medical Statistics.pptssuserf0d95a
 
Stats-Review-Maie-St-John-5-20-2009.ppt
Stats-Review-Maie-St-John-5-20-2009.pptStats-Review-Maie-St-John-5-20-2009.ppt
Stats-Review-Maie-St-John-5-20-2009.pptDiptoKumerSarker1
 
Review of Chapters 1-5.ppt
Review of Chapters 1-5.pptReview of Chapters 1-5.ppt
Review of Chapters 1-5.pptNobelFFarrar
 
Statistics for Medical students
Statistics for Medical studentsStatistics for Medical students
Statistics for Medical studentsANUSWARUM
 
PARAMETRIC TESTS.pptx
PARAMETRIC TESTS.pptxPARAMETRIC TESTS.pptx
PARAMETRIC TESTS.pptxDrLasya
 
Introduction to statistics.pptx
Introduction to statistics.pptxIntroduction to statistics.pptx
Introduction to statistics.pptxMuddaAbdo1
 
BIOSTATISTICS.pptx
BIOSTATISTICS.pptxBIOSTATISTICS.pptx
BIOSTATISTICS.pptxkajolbhavsar
 
Statistics in research by dr. sudhir sahu
Statistics in research by dr. sudhir sahuStatistics in research by dr. sudhir sahu
Statistics in research by dr. sudhir sahuSudhir INDIA
 
Basic for DoE ruchir
Basic for DoE ruchirBasic for DoE ruchir
Basic for DoE ruchirRuchir Shah
 
Basics of statistics
Basics of statisticsBasics of statistics
Basics of statisticsdonthuraj
 
7- Quantitative Research- Part 3.pdf
7- Quantitative Research- Part 3.pdf7- Quantitative Research- Part 3.pdf
7- Quantitative Research- Part 3.pdfezaldeen2013
 
Parametric tests seminar
Parametric tests seminarParametric tests seminar
Parametric tests seminardrdeepika87
 
Estimation and hypothesis
Estimation and hypothesisEstimation and hypothesis
Estimation and hypothesisJunaid Ijaz
 
Introduction to statistics RSS6 2014
Introduction to statistics RSS6 2014Introduction to statistics RSS6 2014
Introduction to statistics RSS6 2014RSS6
 

Similar to Univariate Analysis (20)

Basic stat analysis using excel
Basic stat analysis using excelBasic stat analysis using excel
Basic stat analysis using excel
 
Statistical Methods in Research
Statistical Methods in ResearchStatistical Methods in Research
Statistical Methods in Research
 
Medical Statistics.ppt
Medical Statistics.pptMedical Statistics.ppt
Medical Statistics.ppt
 
Stats-Review-Maie-St-John-5-20-2009.ppt
Stats-Review-Maie-St-John-5-20-2009.pptStats-Review-Maie-St-John-5-20-2009.ppt
Stats-Review-Maie-St-John-5-20-2009.ppt
 
Review of Chapters 1-5.ppt
Review of Chapters 1-5.pptReview of Chapters 1-5.ppt
Review of Chapters 1-5.ppt
 
Statistics for Medical students
Statistics for Medical studentsStatistics for Medical students
Statistics for Medical students
 
PARAMETRIC TESTS.pptx
PARAMETRIC TESTS.pptxPARAMETRIC TESTS.pptx
PARAMETRIC TESTS.pptx
 
Data Analysis
Data AnalysisData Analysis
Data Analysis
 
Introduction to statistics.pptx
Introduction to statistics.pptxIntroduction to statistics.pptx
Introduction to statistics.pptx
 
BIOSTATISTICS.pptx
BIOSTATISTICS.pptxBIOSTATISTICS.pptx
BIOSTATISTICS.pptx
 
Statistics in research by dr. sudhir sahu
Statistics in research by dr. sudhir sahuStatistics in research by dr. sudhir sahu
Statistics in research by dr. sudhir sahu
 
Basic for DoE ruchir
Basic for DoE ruchirBasic for DoE ruchir
Basic for DoE ruchir
 
1 introduction to psychological statistics
1 introduction to psychological statistics1 introduction to psychological statistics
1 introduction to psychological statistics
 
Basics of statistics
Basics of statisticsBasics of statistics
Basics of statistics
 
7- Quantitative Research- Part 3.pdf
7- Quantitative Research- Part 3.pdf7- Quantitative Research- Part 3.pdf
7- Quantitative Research- Part 3.pdf
 
Statistics
StatisticsStatistics
Statistics
 
Parametric tests seminar
Parametric tests seminarParametric tests seminar
Parametric tests seminar
 
Estimation and hypothesis
Estimation and hypothesisEstimation and hypothesis
Estimation and hypothesis
 
Introduction to statistics RSS6 2014
Introduction to statistics RSS6 2014Introduction to statistics RSS6 2014
Introduction to statistics RSS6 2014
 
Descriptive Analysis.pptx
Descriptive Analysis.pptxDescriptive Analysis.pptx
Descriptive Analysis.pptx
 

More from Soumya Sahoo

Public health model
Public health modelPublic health model
Public health modelSoumya Sahoo
 
Prebiotics and probiotics
Prebiotics and probioticsPrebiotics and probiotics
Prebiotics and probioticsSoumya Sahoo
 
Problem based learning
Problem based learningProblem based learning
Problem based learningSoumya Sahoo
 
HIV/AIDS RECENT ADVANCES
HIV/AIDS RECENT ADVANCESHIV/AIDS RECENT ADVANCES
HIV/AIDS RECENT ADVANCESSoumya Sahoo
 
EPIDEMIOLOGY OF HEPATITIS B AND C
EPIDEMIOLOGY OF HEPATITIS B AND CEPIDEMIOLOGY OF HEPATITIS B AND C
EPIDEMIOLOGY OF HEPATITIS B AND CSoumya Sahoo
 
NATURAL HISTORY OF DISEASE
NATURAL HISTORY OF DISEASENATURAL HISTORY OF DISEASE
NATURAL HISTORY OF DISEASESoumya Sahoo
 

More from Soumya Sahoo (7)

Public health model
Public health modelPublic health model
Public health model
 
Prebiotics and probiotics
Prebiotics and probioticsPrebiotics and probiotics
Prebiotics and probiotics
 
Problem based learning
Problem based learningProblem based learning
Problem based learning
 
HIV/AIDS RECENT ADVANCES
HIV/AIDS RECENT ADVANCESHIV/AIDS RECENT ADVANCES
HIV/AIDS RECENT ADVANCES
 
EPIDEMIOLOGY OF HEPATITIS B AND C
EPIDEMIOLOGY OF HEPATITIS B AND CEPIDEMIOLOGY OF HEPATITIS B AND C
EPIDEMIOLOGY OF HEPATITIS B AND C
 
women empowerment
women empowermentwomen empowerment
women empowerment
 
NATURAL HISTORY OF DISEASE
NATURAL HISTORY OF DISEASENATURAL HISTORY OF DISEASE
NATURAL HISTORY OF DISEASE
 

Recently uploaded

Call Girl Nagpur Sia 7001305949 Independent Escort Service Nagpur
Call Girl Nagpur Sia 7001305949 Independent Escort Service NagpurCall Girl Nagpur Sia 7001305949 Independent Escort Service Nagpur
Call Girl Nagpur Sia 7001305949 Independent Escort Service NagpurRiya Pathan
 
Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024Gabriel Guevara MD
 
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service BangaloreCall Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalorenarwatsonia7
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...narwatsonia7
 
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original PhotosBook Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photosnarwatsonia7
 
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...saminamagar
 
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service LucknowVIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknownarwatsonia7
 
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowKolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowNehru place Escorts
 
See the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformSee the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformKweku Zurek
 
Call Girls Viman Nagar 7001305949 All Area Service COD available Any Time
Call Girls Viman Nagar 7001305949 All Area Service COD available Any TimeCall Girls Viman Nagar 7001305949 All Area Service COD available Any Time
Call Girls Viman Nagar 7001305949 All Area Service COD available Any Timevijaych2041
 
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Glomerular Filtration and determinants of glomerular filtration .pptx
Glomerular Filtration and  determinants of glomerular filtration .pptxGlomerular Filtration and  determinants of glomerular filtration .pptx
Glomerular Filtration and determinants of glomerular filtration .pptxDr.Nusrat Tariq
 
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...narwatsonia7
 
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Call Girls Service Noida Maya 9711199012 Independent Escort Service Noida
Call Girls Service Noida Maya 9711199012 Independent Escort Service NoidaCall Girls Service Noida Maya 9711199012 Independent Escort Service Noida
Call Girls Service Noida Maya 9711199012 Independent Escort Service NoidaPooja Gupta
 
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersBook Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersnarwatsonia7
 
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls ServiceCall Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Servicesonalikaur4
 
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original PhotosCall Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photosnarwatsonia7
 
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...narwatsonia7
 
Low Rate Call Girls Mumbai Suman 9910780858 Independent Escort Service Mumbai
Low Rate Call Girls Mumbai Suman 9910780858 Independent Escort Service MumbaiLow Rate Call Girls Mumbai Suman 9910780858 Independent Escort Service Mumbai
Low Rate Call Girls Mumbai Suman 9910780858 Independent Escort Service Mumbaisonalikaur4
 

Recently uploaded (20)

Call Girl Nagpur Sia 7001305949 Independent Escort Service Nagpur
Call Girl Nagpur Sia 7001305949 Independent Escort Service NagpurCall Girl Nagpur Sia 7001305949 Independent Escort Service Nagpur
Call Girl Nagpur Sia 7001305949 Independent Escort Service Nagpur
 
Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024
 
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service BangaloreCall Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
 
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original PhotosBook Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
 
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
 
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service LucknowVIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
 
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowKolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
 
See the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformSee the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy Platform
 
Call Girls Viman Nagar 7001305949 All Area Service COD available Any Time
Call Girls Viman Nagar 7001305949 All Area Service COD available Any TimeCall Girls Viman Nagar 7001305949 All Area Service COD available Any Time
Call Girls Viman Nagar 7001305949 All Area Service COD available Any Time
 
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
 
Glomerular Filtration and determinants of glomerular filtration .pptx
Glomerular Filtration and  determinants of glomerular filtration .pptxGlomerular Filtration and  determinants of glomerular filtration .pptx
Glomerular Filtration and determinants of glomerular filtration .pptx
 
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
 
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girls Service Noida Maya 9711199012 Independent Escort Service Noida
Call Girls Service Noida Maya 9711199012 Independent Escort Service NoidaCall Girls Service Noida Maya 9711199012 Independent Escort Service Noida
Call Girls Service Noida Maya 9711199012 Independent Escort Service Noida
 
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersBook Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
 
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls ServiceCall Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
 
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original PhotosCall Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
 
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
 
Low Rate Call Girls Mumbai Suman 9910780858 Independent Escort Service Mumbai
Low Rate Call Girls Mumbai Suman 9910780858 Independent Escort Service MumbaiLow Rate Call Girls Mumbai Suman 9910780858 Independent Escort Service Mumbai
Low Rate Call Girls Mumbai Suman 9910780858 Independent Escort Service Mumbai
 

Univariate Analysis

  • 1. UNIVARIATE ANALYSIS Dr. Soumya Swaroop Sahoo JR, Community Medicine PGIMS Rohtak
  • 2. Contents • Introduction • Variables • Types of variables • Scales of measurement • Types of analysis • Components of univariate analysis • Advantages and limitations
  • 3. Introduction • The word “statistics” has several meanings: data or numbers, the process of analyzing the data, and the description of a field of study. • It is derived from the Latin word status, meaning “manner of standing” or “position.” • Statistics were first used by tax assessors to collect information for determining assets and assessing taxes.
  • 4. Introduction • Variable -any character, characteristic or quality that varies is termed a variable. • E.g. - to collect basic clinical and demographic information on patients with a particular illness. The variables of interest may include the sex, age and height of the patients.
  • 5. Nominal Categories are mutually exclusive and unordered e. g- Sex (M/F) Blood group (A/B/AB/O) Ordinal Categories are mutually exclusive and ordered e.g- Disease severity (mild/moderate/ severe) Discrete Integer values, typically counts No notion of magnitude e.g- no. of children vaccinated, days sick per year VARIABLE Categorical (Qualitative) Numerical (Quantitative) Continuous Takes any value in a range of values Have a magnitude e. g- weight in kg, Height in cm
  • 6. Scales of measurement • Nominal or categorical • Ordinal • Interval • Ratio
  • 7. Nominal scale • Simplest level of measurement when data values fit into categories. • Observations are dichotomous or binary in that the outcome can take on only one of two values: yes or no. • Mutually exclusive. • E.g sex of patient(M/F), nationality
  • 8. Ordinal scale • When an inherent order occurs among the categories, the observations are said to be measured on an ordinal scale. • Clinicians often use ordinal scales to determine a patient's amount of risk or the appropriate type of therapy. • E.g socio-economic class, rank order of a class(1st,2nd, 3rd) VAS( visual analog scale) for pain
  • 9. Scales of measurement • Interval Scale • Data classified by ranking. • Quantitative classification . • Zero point of scale is arbitrary (differences are meaningful). • Fahrenheit temp. scale , Time • Ratio Scale • Data classified as the ratio of two numbers. • Quantitative classification . • Zero point of scale is absolute (data can be added, subtracted, multiplied, and divided). • E. g- Kelvin temp. scale, Weight, Height
  • 11. Descriptive statistics • Descriptive statistics is the term given to the analysis of data that helps describe, show or summarize data in a meaningful way such that patterns might emerge from the data. • Does not allow us to make conclusions beyond the data we have analysed or reach conclusions regarding any hypotheses we might have made. • Simply a way to describe the data.
  • 12. Inferential statistics • Inferential statistics is concerned with making predictions or inferences about a population from observations and analysis of a sample. • We can take the results of an analysis using a sample and can generalize it to the larger population that the sample represents.
  • 13. Three types of analysis • Univariate analysis – the examination of the distribution of cases on only one variable at a time (e.g., weight of college students) • Bivariate analysis – the examination of two variables simultaneously (e.g., the relation between gender and weight of college students ) • Multivariate analysis – the examination of more than two variables simultaneously (e.g., the relationship between gender, race and weight of college students)
  • 14. Purpose of diff. types of analysis • Univariate analysis – Purpose: mainly description • Bivariate analysis – Purpose: determining the empirical relationship between the two variables • Multivariate analysis – Purpose: determining the empirical relationship among multiple variables
  • 15. Choosing the Statistical Technique Specific research question or hypothesis Determine # of variables in question Univariate analysis Bivariate analysis Multivariate analysis Determine level of measurement of variables Choose univariate method of analysis Choose relevant descriptive statistics Choose relevant inferential statistics
  • 16. UNIVARIATE ANALYSIS INFERENTIAL STATISTICSDESCRIPTIVE STATISTICS 1)Measures of central tendency Mean Median Mode 2)Measures of dispersion Range Variance Standard deviation 1)Z Test 2)T test 3)Chi-square test
  • 17. Mean • Arithmetic, or simple, mean is used most frequently in statistics. • Arithmetic average of the observations.
  • 18. Median • The median is the middle observation, that is, the point at which half the observations are smaller and half are larger. • Used in place of mean when the data is skewed (not sensitive to outliers) • E.g –median price of housing in real estate, median household income
  • 19. Mode • Value that occurs most frequently. • Commonly used for a large number of observations when the researcher wants to designate the value that occurs most often. • Bimodal: When a set of data has two modes • For frequency tables mode is estimated by the modal class.
  • 20. Example of mean median mode related to incomes
  • 21. Shapes of distribution Normal distribution: symmetrical Bell-shaped curve Positively skewed: tail on the right, cluster towards low end of the variable Negatively skewed: tail on the left, cluster towards high-end of the variable symmetrical Bimodality: A double peak asymmetrical
  • 22. Position of mean median mode The mean is used for numerical data and for symmetric (not skewed) distributions. • The median is used for ordinal data or for numerical data if the distribution is skewed. • The mode is used primarily for bimodal distributions. • The geometric mean is Mean>median>modeMode>median>mean Mode>median>mean
  • 23. Averages: advantages and disadvantages Types of average Advantage Disadvantage Mean 1.Uses all data values 2.Algebraically defined 1.Distorted by outliers 2.Distorted by skewness Median 1.Not distorted by outliers 2.Not distorted by skewness 1.Ignores most of the information 2.Not Algebraically defined Mode 1.Easily determined for categorical data 1.Not Algebraically defined Geometric mean 1.Appropriate for skewed data 1.Only appropriate if log transformation produces a symmetrical distribution
  • 24. Range • Difference between minimum and maximum value in a data set • Larger range usually (but not always) indicates a large spread or deviation in the values of the data set. • (73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100) • Range- 100-45 =55 • Range defines the normal limits of a biological characteristic. • e. g- systolic BP – 100-140 mm of Hg diastolic BP - 80-90 mm of Hg Urea - 15-40 mg
  • 25. Variance • A measure of how data points differ from the mean – Measure of dispersion for a given score, The larger the variance is, the more the scores deviate, on average, away from the mean • Average of the squared differences from the mean – difficult to interpret because it is in the units of square of variables For population variance,   2 2 x X N     For sample variance,   2 2 1 x X s n    
  • 26. Standard Deviation • A measure of variation that gives spread of data about the mean • Square root of the variance • higher standard deviation indicates higher spread, less consistency, and less clustering. • sample standard deviation: • population standard deviation: • measured in the same units as the data, making it easy to interpret.   2 1 x X s n       2 x N     
  • 27. Relative or Standard Normal Deviate or Variate (Z) • Deviation from the mean in a normal distribution or curve • Given by symbol “Z” • Z = 𝒐𝒃𝒔𝒆𝒓𝒗𝒂𝒕𝒊𝒐𝒏−𝑴𝒆𝒂𝒏 𝑺𝑫 • Indicates how much an observation is bigger or smaller than mean in units of SD • 95% confidence level: Z = 1.96 • 99% confidence level: Z = 2.575 • 99.9 confidence level: Z = 3.27 •
  • 29. Inferential univariate analysis components Test Description Test Statistic Compare an Observed Mean with Some Predetermined Value Z or t-test Compare an Observed Frequency with a Predetermined Value X 2 Compare an Observed Proportion with Some Predetermined Value Z or t-test for Proportions
  • 30. Type of variable Nominal proportions t test of proportion NominalOrdinalInterval or ratio Kolmogorov- Smirnov test Chi square testZ test or t test
  • 31. Drawing sample from population
  • 32. Interval estimation Confidence interval (CI) Provide us with a range of values that we believe, with a given level of confidence, containes a true value CI for the population means n SD SEM SEMxCI SEMxCI    58.2%99 96.1%95
  • 33. Testing of hypotheses learning objectives: » to understand the role of significance » to distinguish the null and alternative hypotheses » to interpret p-value, type I and II errors
  • 34. Hypothesis testing • Hypotheses are defined as formal statements of explanations stated in a testable form. • To test statistical hypotheses two presumptions are made to draw the inference from sample value. • Logic- designed to detect significant differences: differences that did not occur by random chance. Formulate hypotheses Collect data to test hypotheses Accept hypothesis Reject hypothesis
  • 35. Null and alternate hypothesis 1. Null Hypothesis (H0) – The difference is caused by random chance. – The H0 always states there is “no significant difference.” it means that there is no significant difference between the population mean and the sample mean. 2. Alternate hypothesis (H1) – “The difference is real”. – (H1) always contradicts the H0. • One (and only one) of these explanations must be true.
  • 36. Testing Hypotheses: The Five Step Model 1. Make Assumptions and meet test requirements. 2. State the null hypothesis. 3. Select the sampling distribution and establish the critical region. 4. Compute the test statistic. 5. Make a decision and interpret results.
  • 37. Two-tailed vs. One-tailed Tests • In a two-tailed test, the direction of the difference is not predicted. • A two-tailed test splits the critical region equally on both sides of the curve. • In a one-tailed test, the researcher predicts the direction (i.e. greater or less than) of the difference. • All of the critical region is placed on the side of the curve in the direction of the prediction.
  • 38. The Curve for Two- vs. One-tailed Tests at α = .05: Two-tailed test: “is there a significant difference?” One-tailed tests: “is the sample mean greater than µ or Pu?” “is the sample mean less than µ or Pu?”
  • 41. Type I and Type II Errors • Type I, or alpha error: – Rejecting a true null hypothesis • Type II, or beta error: – Failing to reject a false null hypothesis
  • 42. Testing of hypotheses Type I and Type II Errors. Example Suppose there is a test for a particular disease. If the disease really exists and is diagnosed early, it can be successfully treated If it is not diagnosed and not treated, the person will become severely disabled If a person is erroneously diagnosed as having the disease and treated, no physical damage is done. To which type of error we are willing to risk ?
  • 43. Testing of hypotheses Type I and Type II Errors. Example. Decision No disease Disease Not diagnosed OK Type II error Diagnosed Type I error OK treated but not harmed by the treatment irreparable damage would be done Inference: to avoid Type error II, have high level of significance
  • 44. Z-test • For testing the significance of differences between two means of samples • It calculates the probability that the two samples could have drawn from populations of the same mean, differences arising merely from sampling variability. • Assumptions for z-test • Population approximately Normal or large sample. Sample size must be larger than 30 • Data must be quantitative • The variable is assumed to follow normal distribution in the population
  • 45. Example (z test of proportion) • In a recent survey, 55% of the population were found to be aware of modes of HIV transmission. A random sample of 150 urban persons showed that 49% of them were aware of the same. Is the difference significant? • Using the formula for proportions and 5 step method to solve…
  • 46. Solution: Step 1: Random sample The sample is large (>30) Step 2: H0: Pu = .55 (converting % to proportion) ( H0: Ps = Pu ) H1: Pu≠ .55 Step 3: The sample is large, so we use Z distribution Alpha (α) = .05 Critical Z = ±1.96
  • 47. Solution (cont.) Step 4 Step 5 • Z (obtained) < Z (critical) • Fail to reject Ho. There is no significant difference between the state population and the urban sample. 48.1 150/)55.1(55. 55.49. /)1(        nPP PP Z uu us
  • 48. t test • When the sample size is small (approximately < 30) then the Student’s t distribution should be used • The test statistic is known as “t”. • The curve of the t distribution is flatter than that of the Z distribution but as the sample size increases, the t-curve starts to resemble the Z-curve • t Z as n increases, If n>100, t approaches Z.
  • 49. Example of t test • A random sample of 26 sociology graduates scored 458 on the advanced sociology test with a standard deviation of 20. Is this significantly different from the population average (µ = 440)?
  • 50. Solution (using five step model) • Step 1: Make Assumptions and Meet Test Requirements: • 1. Random sample • 2. Level of measurement is interval-ratio • 3. The sample is small (<30)
  • 51. Solution (cont.) Step 2: State the null and alternate hypotheses. H0: null hypothesis µ = 440 (or H0: = μ) H1:alternate hypothesis µ ≠ 440
  • 52. Solution (cont.) • Step 3: Select Sampling Distribution and Establish the Critical Region 1. Small sample, I-R scale of measurement, so t test to be used. 2. Alpha (α) = .05 3. Degrees of Freedom = n-1 = 26-1 = 25 4. Critical t = ±2.060
  • 53. Solution (cont.) • Step 4: Using Formula to Compute the Test Statistic 5.4 126 20 440458 1        n S t 
  • 54. Looking at the curve for the t distribution Alpha (α) = .05 t= -2.060 c t = +2.060 c t= +4.50 I
  • 55. Step 5 Make a Decision and Interpret Results • The obtained t score fell in the Critical Region, so we reject the H0 { t (obtained) > t (critical) } – If the H0 were true, a sample outcome of 458 would be unlikely. – Therefore, the H0 is false and must be rejected. • Sociology graduates have a score that is significantly different from the general student body at (t = 4.5, df = 25, α = .05).
  • 56. Main Considerations in Hypothesis Testing: • Sample size – Use Z for large samples, t for small (<100) • There are two other choices to be made: – One-tailed or two-tailed test • “Is there a difference?” = 2-tailed test • “Is the difference less than or greater than?” = 1-tailed test • Alpha (α) level – .05, .01, or .001? (α=.05 is most common)
  • 57. Selected nonparametric tests Chi-Square goodness of fit test. To determine whether a variable has a frequency distribution comparable to the one expected expected frequency can be based on • theory • previous experience • comparison groups 2 )( 1 eioi ei ff f   
  • 58. Selected nonparametric tests Chi-Square goodness of fit test. Example The average prognosis of total hip replacement in relation to pain reduction in hip joint is excellent - 80% good - 10% medium - 5% bad - 5% In our study of we had got a different outcome excellent - 95% good - 2% medium - 2% bad - 1% expected observed Do observed frequencies differ from expected ?
  • 59. Selected nonparametric tests Chi-Square goodness of fit test. Example fe1= 80, fe2= 10, fe3=5, fe4= 5; fo1= 95, fo2= 2, fo3=2, fo4= 1; 2= 14.2, df=3 (4-1) 0.001 < p < 0.01 Null hypothesis is rejected 2 > 9.84 p < 0.05 2 > 11.34 p < 0.01 2 > 16.27 p < 0.001
  • 60. Advantages • Simpler model – Easier to build, test and understand than other models • More reliable – More reliable as only one variable is used. • Descriptive method – A univariate model is a strong descriptive method. Analysts can change one variable each time the model is run to obtain results that show “what if” scenarios. For example, changing the variables from age to income can show different results which describe what happens when one factor changes within the model.
  • 61. Disadvantages • Not comprehensive – A univariate model is less comprehensive compared to multivariate models. In the real world, there is often more than just one factor at play and a univariate model is unable to take this into account due to its inherent limitations. • Does not establish relationships – As only one variable can be changed at a time, univariate models are unable to show relationships between different factors.
  • 62. References • Methods in Biostatistics by BK Mahajan • Statistical Methods by SP Gupta • Basic & Clinical Biostatistics by Dawson and Beth • Park’s textbook of preventive and social medicine 22nd edition