Point and Interval Estimation
Presented by:
Shubham Mehta 0019
GOALS
1. Methods of Estimation
2. Difference between Point and Interval Estimation
3. Defining Level of Confidence
4. Constructing Confidence Intervals
5. Interpretation of these confidence intervals
6. Determine the sample size for attribute and variable
sampling.
7. Explanations using examples /case study
What are Estimators?
 In statistics, an estimator is a function of the data or
sample that is used to infer the value of an
unknown parameter in population in a statistical
model.
 Thus the estimator, the quantity of interest
(the estimand or parameter) and its result (the
estimate) are different from each other.
Qualities of Estimators…
• Statisticians have already determined the “best” way to estimate a population parameter.
 Qualities desirable in estimators include unbiasedness,
consistency, and relative efficiency:
• An unbiased estimator of a population parameter is an
estimator whose expected value is equal to that
parameter.
• An unbiased estimator is said to be consistent if the
difference between the estimator and the parameter
grows smaller as the sample size grows larger.
• If there are two unbiased estimators of a parameter, the
one whose variance is smaller is said to be relatively
efficient.
Estimation…
• There are two types of inference: estimation and hypothesis testing; estimation is introduced first.
• The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.
• E.g., the sample mean (x̄) is employed to estimate the population mean (μ).
Point & Interval Estimation…
• For example, suppose we want to estimate the mean summer income of a class of business students, using a sample of n = 25 students.
• The sample average is calculated and found to be 400 $/week. This single value is a point estimate.
• An alternative statement is: the mean income is between 380 and 420 $/week. This range is an interval estimate.
Estimation…
 The objective of estimation is to determine the
approximate value of a population parameter on the
basis of a sample statistic.
 There are two types of estimators:
 Point Estimator
 Interval Estimator
Point Estimator…
• A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.
• We saw earlier that point probabilities in continuous distributions are zero. Likewise, a point estimate will almost never equal the parameter exactly. We’d expect the point estimator to get closer to the parameter value as the sample size increases, but a single value does not reflect the effect of a larger sample size. Hence we will employ the interval estimator to estimate population parameters…
Interval Estimator…
 An interval estimator draws inferences about a
population by estimating the value of an unknown
parameter using an interval.
 That is we say (with some ___% certainty) that the
population parameter of interest is between some lower
and upper bounds.
5. Statistical Inference: Estimation
Goal: How can we use sample data to estimate values of
population parameters?
Point estimate: A single statistic value that is the “best
guess” for the parameter value
Interval estimate: An interval of numbers around the
point estimate, that has a fixed “confidence level” of
containing the parameter value. Called a confidence
interval.
(Based on sampling distribution of the point estimate)
Point Estimators – Most common to use sample values
• Sample mean estimates population mean μ:
$\hat{\mu} = \bar{y} = \frac{\sum y_i}{n}$
• Sample std. dev. estimates population std. dev. σ:
$\hat{\sigma} = s = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}}$
• Sample proportion $\hat{\pi}$ estimates population proportion π.
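As a rough illustration of these three point estimators, the sketch below computes them in plain Python. The function names and the income values are hypothetical, chosen only for the example; the count 550 out of 1000 is borrowed from the poll example later in the slides.

```python
import math

def point_estimates(values):
    """Return (sample mean, sample standard deviation) for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations from the sample mean
    ss = sum((y - mean) ** 2 for y in values)
    # Divide by n - 1, as in the formula for s
    std = math.sqrt(ss / (n - 1))
    return mean, std

def sample_proportion(successes, n):
    """Return the sample proportion of successes."""
    return successes / n

# Hypothetical weekly incomes for five students (illustrative data only)
incomes = [380, 410, 395, 420, 405]
mean, std = point_estimates(incomes)
print(f"mean = {mean:.1f}, std dev = {std:.3f}")
print(f"proportion = {sample_proportion(550, 1000):.2f}")
```

Each function returns a single number, which is exactly what makes it a point estimate: nothing in the output says how far the value might be from the parameter.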
Methods of finding estimators…
Point estimators
• Method of moments
• Maximum likelihood estimators
• Bayes estimators
• The EM algorithm
Interval estimators
• Inverting a test statistic
• Pivotal quantities
• Pivoting the CDF
• Bayesian intervals
Confidence Interval
 Confidence Interval of a parameter consists of an interval of
numbers along with a probability that the interval contains the
unknown parameter
 A confidence interval gives a range estimate of values:
 Takes into consideration variation in sample statistics from sample to
sample
 Based on all the observations from one sample
 Gives information about closeness to unknown population parameters
 Stated in terms of level of confidence
 Example: 95% confidence, 99% confidence
 Can never be 100% confident
 A confidence interval estimate of 100% would be so wide as to
be meaningless for practical decision making
Confidence Interval
• The wider the CI, the more confident we can be that the given interval contains the unknown parameter.
• Ideally, we prefer a short interval with a high degree of confidence.
For example: we would prefer (95, 100) with 95% confidence to (0, 100) with 100% confidence.
• The critical value is the value, corresponding to a given significance level, that separates the test statistics that lead to rejection of the null hypothesis from those that lead to a decision not to reject it.
Confidence Interval Estimates
The general formula for all confidence intervals is:
Point Estimate ± (Critical Value) × (Standard Error)
• Point estimate: the sample mean or sample proportion
• Critical value: the “z” or “t” critical value
• Standard error: σ / √n or s / √n
The Level of Significance (α)
 Because we only select one sample, and μ or π
are unknown, we never really know whether the
confidence interval includes the true population
mean or proportion, or not.
 The level of significance, or “α” risk is the chance
we take that the true population parameter is not
contained in the confidence interval.
 Therefore, a 95% confidence interval would have
an “α” of 5%
95% Confidence Interval
“α” is the proportion in the tails of the sampling distribution that is outside the established confidence interval.
If α = .05, then each tail has .025 area.
[Figure: standard normal curve centered on the point estimate, with α/2 = .025 in each tail beyond z = -1.96 and z = +1.96.]
In the z table, row -1.9 and column .06 give a cumulative area of .0250; row +1.9 and column .06 give .9750.
The critical values of “z” that define the “α” areas are -1.96 and +1.96.
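The table lookup for z = ±1.96 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Cumulative area to the left of each critical value
lower_tail = std_normal.cdf(-1.96)
upper_cumulative = std_normal.cdf(1.96)

# Alpha is the total area in both tails
alpha = lower_tail + (1 - upper_cumulative)

print(round(lower_tail, 4), round(upper_cumulative, 4), round(alpha, 3))
```

The two cumulative areas correspond to the .0250 and .9750 entries read from the z table.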
Level of Confidence
• The level of confidence in a confidence interval is a probability that represents the percentage of intervals that will contain the unknown parameter if a large number of repeated samples are obtained. The level of confidence is denoted (1 - α)*100%.
 For example, a 95% level of confidence
would mean that if 100 confidence intervals were
constructed, each based on a different sample from
the same population, we would expect 95 of the
intervals to contain the population mean.
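The repeated-sampling reading of the confidence level can be sketched with a small simulation. Everything here is an assumption made for illustration: the population is taken to be normal with a known σ, and the values μ = 100, σ = 15, n = 30, the seed, and the function name `coverage` are hypothetical.

```python
import math
import random
from statistics import NormalDist

def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Count the interval as a hit if it contains the true mean
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

rate = coverage(mu=100.0, sigma=15.0, n=30, confidence=0.95, trials=2000)
print(f"share of intervals containing the mean: {rate:.3f}")
```

The printed share should land near 0.95, though any single run will differ slightly because the samples are random.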
Constructing Confidence Interval
 The construction of a confidence interval for the
population mean depends upon three factors
1. The point estimate of the population mean
2. The level of confidence
3. The standard deviation of the sample mean
1. Select our desired level of confidence
Let’s suppose we want to construct an interval
using the 95% confidence level
2. Calculate α and α/2
(1-α)*100% = 95% → α = 0.05, α/2 = 0.025
3. Look up the corresponding z-score
α/2 = 0.025 → a z-score of 1.96
4. Multiply the z-score by the standard error to find the margin of error
5. Find the interval by adding and subtracting this product from the mean
$(\bar{x} - Z_{\alpha/2} \cdot \text{std. error},\ \bar{x} + Z_{\alpha/2} \cdot \text{std. error})$
$Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 1.96\,\frac{\sigma}{\sqrt{n}}$
$\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$
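The five construction steps can be sketched as one small function. This is a minimal sketch assuming σ is known; the function name `z_interval` is hypothetical, and σ = 50 is an assumed value paired with the earlier income example (mean 400, n = 25), not a figure given in the slides.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is known."""
    alpha = 1 - confidence                      # step 2: alpha
    z = NormalDist().inv_cdf(1 - alpha / 2)     # step 3: critical z-score
    margin = z * sigma / math.sqrt(n)           # step 4: margin of error
    return xbar - margin, xbar + margin         # step 5: add and subtract

# sigma = 50 is an assumption made only for this illustration
low, high = z_interval(xbar=400, sigma=50, n=25)
print(f"95% CI: ({low:.1f}, {high:.1f})")
```

With these assumed inputs the margin of error is 1.96 × 50 / 5 = 19.6, so the interval comes out close to the "between 380 and 420" statement used earlier.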
Graph will be as follows
[Figure: 95% confidence interval on the standard normal curve, centered on the point estimate, with α/2 = .025 in each tail beyond z = -1.96 and z = +1.96 (cumulative areas .0250 and .9750).]
Common Confidence Levels and α values
Here is a table of commonly used confidence levels, α and α/2 values, and corresponding z-scores which we are using in our examples:

(1 - α)*100%   α      α/2     Zα/2
90%            0.1    0.05    1.645
95%            0.05   0.025   1.96
99%            0.01   0.005   2.58
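The z-scores in this table can be regenerated from the confidence level rather than looked up. The sketch below uses the inverse CDF of the standard normal from Python's standard library; the helper name `critical_value` is hypothetical.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided critical z-score for a given confidence level."""
    alpha = 1 - confidence
    # Leave alpha/2 in the upper tail
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}  z = {critical_value(level):.3f}")
```

The computed values agree with the table after rounding; the 99% entry is 2.58 when rounded to two decimals.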
Ex : Suppose we conduct a poll to try and get a sense of the
outcome of an upcoming election with two candidates.
We poll 1000 people, and 550 of them respond that they
will vote for candidate A
How confident can we be that a given person will cast
their vote for candidate A?
1. Select our desired levels of confidence
We’re going to use the 90%, 95%, and 99% levels
Constructing Confidence Interval
Example
2. Calculate α and α/2
Our α values are 0.1, 0.05, and 0.01 respectively
Our α/2 values are 0.05, 0.025, and 0.005
3. Look up the corresponding z-scores
Our Zα/2 values are 1.645, 1.96, and 2.58
4. Multiply the z-score by the standard error to find
the margin of error
First we need to calculate the standard error
4. Multiply the z-score by the standard error cont.
In this case, we are working with a distribution we have not previously discussed, the normal approximation to the binomial distribution (i.e. a voter can choose Candidate A or B, a binomial outcome).
We have a probability estimate from our sample, where the proportion of individuals in our sample voting for candidate A was found to be 550/1000 or 0.55.
We can use this information in a formula to estimate the standard error for such a distribution:
• For the normal approximation to the binomial distribution, the standard error can be estimated using:
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(0.55)(0.45)}{1000}} = 0.0157$
• We can now multiply this value by the z-scores to calculate the margins of error for each confidence level
5. Find the interval by adding and subtracting this product from the mean
• We calculate the margin of error and add and subtract that value from the mean (0.55 in this case) to find the bounds of our confidence intervals at each level of confidence:

CI     Zα/2    Margin of error   Lower bound   Upper bound
90%    1.645   0.026             0.524         0.576
95%    1.96    0.031             0.519         0.581
99%    2.58    0.041             0.509         0.591
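The poll calculation can be reproduced in a few lines. This is a sketch under the same normal approximation used in the example; the function name `proportion_interval` is hypothetical, and the z-scores are computed rather than read from the table.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Return (lower, upper, margin) for a proportion via the normal approximation."""
    p_hat = successes / n
    # Estimated standard error of the sample proportion
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_error
    return p_hat - margin, p_hat + margin, margin

# 550 of 1000 respondents favor candidate A, as in the example
for level in (0.90, 0.95, 0.99):
    low, high, margin = proportion_interval(550, 1000, level)
    print(f"{level:.0%}: margin {margin:.3f}, bounds ({low:.3f}, {high:.3f})")
```

At all three levels the lower bound stays above 0.50, which is the practical point of the example: higher confidence widens the interval, but here not enough to include an even split.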
Some Myths
• What if we can make a “guess” about the proportion value?
 How can we predict whether it will rain tomorrow?
 Be Careful! The following statement is NOT true:
“The probability that µ lies between 143.22 and 162.78 is .95.”
Once you have inserted your sample results into the confidence
interval formula, the word PROBABILITY can no longer be
used to describe the resulting confidence interval.
Some comments about CI’s
• The effects of n and of the confidence coefficient hold for CIs for other parameters also
• If we repeatedly took random samples of some fixed size n and each time calculated a 95% CI, in the long run about 95% of the CI’s would contain the population proportion π.
• The probability that the CI does not contain π is called the error probability, and is denoted by α.
• α = 1 – confidence coefficient

(1-α)100%   α     α/2    Zα/2
90%         .10   .050   1.645
95%         .05   .025   1.96
99%         .01   .005   2.58
Comments about CI for population mean µ
 The method is robust to violations of the assumption
of a normal population distribution
(Be careful if sample data distribution is very highly
skewed, or if it contains severe outliers)
 Greater confidence requires wider CI
 Greater n produces narrower CI
To determine sample size:
• Know the desired confidence level, which determines the value of Z (the critical value from the standardized normal distribution). Determining the confidence level is subjective.
• Know the acceptable sampling error, e: the amount of error that can be tolerated.
• Know the standard deviation, σ. If unknown, estimate it from past data or make an educated guess.
• To estimate σ: [σ = range/4]. This estimate is derived from the empirical rule stating that approximately 95% of the values in a normal distribution are within +/- 2σ of the mean, giving a range within which most of the values are located.
Selecting the Sample
Size…
We can control the width of the interval by determining the
sample size necessary to produce narrow intervals.
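Suppose we want to estimate the mean demand “to within 5 units”; i.e. we want the interval estimate to be:
$\bar{x} \pm 5$
Since the interval estimate is:
$\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
It follows that
$Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 5$
Solve for n to get requisite sample size!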
Selecting the Sample
Size…
The amount of sampling error you are willing to accept and the level of confidence desired determine the size of your sample.
$e = Z\,\frac{\sigma}{\sqrt{n}}$
$n = \frac{Z^2 \sigma^2}{e^2}$
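The sample-size formula translates directly into code. In the sketch below, the function name `sample_size_for_mean` is hypothetical and σ = 75 is an assumed value chosen only to make the arithmetic concrete; the tolerated error of 5 units follows the "to within 5 units" example. The result is rounded up because n must be a whole number.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, error, confidence=0.95):
    """Smallest n for which Z * sigma / sqrt(n) is at most the tolerated error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = Z^2 * sigma^2 / e^2, rounded up to the next whole observation
    return math.ceil((z * sigma / error) ** 2)

# sigma = 75 is an assumption for illustration only
n = sample_size_for_mean(sigma=75, error=5)
print(n)
```

Halving the tolerated error roughly quadruples the required sample size, since e appears squared in the denominator.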
Choosing the Sample Size
Ex. How large a sample size do we need to estimate
a population proportion (e.g., “very happy”) to
be within 0.03, with probability 0.95?
i.e., what is n so that margin of error of 95%
confidence interval is 0.03?
Set 0.03 = margin of error and solve for n
$0.03 = 1.96\,\sigma_{\hat{\pi}} = 1.96\sqrt{\pi(1-\pi)/n}$
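Solving this equation for n needs a value for π, which is unknown before sampling. The sketch below assumes the conservative choice π = 0.5, which maximizes π(1 − π) and therefore gives the largest n; that choice and the function name `sample_size_for_proportion` are assumptions for illustration, not something stated on the slide.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(error, confidence=0.95, pi=0.5):
    """Smallest n so that z * sqrt(pi * (1 - pi) / n) is at most the tolerated error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Rearranged from error = z * sqrt(pi * (1 - pi) / n)
    return math.ceil(z ** 2 * pi * (1 - pi) / error ** 2)

# pi = 0.5 is the assumed worst case
n_needed = sample_size_for_proportion(error=0.03)
print(n_needed)
```

If a prior guess for π is available, passing it as `pi` gives a smaller required sample whenever the guess is away from 0.5.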
Some comments about CIs and sample size
• We’ve seen that n depends on confidence level (higher confidence requires larger n) and the population variability (more variability requires larger n)
• In practice, determining n is not so easy, because (1) many parameters are to be estimated, and (2) resources may be limited, so we may need to compromise.
• CIs can be formed for any parameter (e.g., for the median, mode, etc.)
How large must a sample be for the Central
Limit theorem to apply?
The sample size varies according to the shape of the
population. However, for our use, a sample size of 30 or
larger will suffice.
Q. Must sample sizes be 30 or larger for populations that are
normally distributed?
Ans: No. If the population is normally distributed, the
sample means are normally distributed for sample sizes as
small as n=1.
Q. How large is large?
Q. Why not just always pick a sample size of 30?
Aim: Finding an optimum size for the sample
How can I tell the shape of the underlying
population?
 CHECK FOR NORMALITY:
 Use descriptive statistics. Construct stem-and-leaf plots for small
or moderate-sized data sets and frequency distributions and
histograms for large data sets.
 Compute measures of central tendency (mean and median) and
compare with the theoretical and practical properties of the
normal distribution. Compute the interquartile range. Does it
approximate 1.33 times the standard deviation?
 How are the observations in the data set distributed? Do
approximately two thirds of the observations lie between the
mean and plus or minus 1 standard deviation? Do approximately
four-fifths of the observations lie between the mean and plus or
minus 1.28 standard deviations? Do approximately 19 out of
every 20 observations lie between the mean and plus or minus 2
standard deviations?
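These rule-of-thumb checks can be gathered into one helper. The sketch below is an illustration, not a formal normality test: the function name `normality_checks` is hypothetical, and the test data are evenly spaced quantiles of a standard normal, generated only so the example has something bell-shaped to inspect.

```python
import statistics
from statistics import NormalDist

def normality_checks(data):
    """Descriptive diagnostics to compare against a normal distribution."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points

    def share_within(k):
        # Fraction of observations within k standard deviations of the mean
        return sum(abs(x - mean) <= k * sd for x in data) / len(data)

    return {
        "mean": mean,
        "median": statistics.median(data),
        "iqr_over_sd": (q3 - q1) / sd,
        "within_1_sd": share_within(1),
        "within_1.28_sd": share_within(1.28),
        "within_2_sd": share_within(2),
    }

# Hypothetical data: 1000 evenly spaced quantiles of a standard normal
data = [NormalDist().inv_cdf((i + 0.5) / 1000) for i in range(1000)]
checks = normality_checks(data)
for name, value in checks.items():
    print(f"{name}: {value:.3f}")
```

For data that are close to normal, the mean and median should nearly coincide, the IQR should be roughly 1.3 standard deviations, and the three shares should be near two thirds, four fifths, and 19 out of 20.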
Interpreting a Confidence Interval
For the previous 95% confidence interval, the following conclusions are
valid:
 I am 95% confident that the average length of a call for the
population µ, lies between 143.22 and 162.78 minutes.
 If I repeatedly obtained samples of size 85, then 95% of the
resulting confidence intervals would contain µ and 5% would
not.
 QUESTION: Does this confidence interval [143.22 to 162.78]
contain µ?
 ANSWER: I don’t know. All I can say is that this procedure leads
to an interval containing µ 95% of the time.
 I am 95% confident that my estimate of µ [namely 153 minutes] is
within 9.78 minutes of the actual value of µ. RECALL: 9.78 is the
margin of error.
Interpretations (contd.)
Therefore, 7.93% of the time, a random sample of 40
customers from this population will yield a mean
expenditure of 8700 or more.
OR
From any random sample of 40 customers, 7.93% of them will
spend on average 8700 or more.
Interpretations (contd.)
The point estimate for this problem is 13.56 hours, with an
error of +/- 3.20 hours.
I am 90% confident that the average amount of time
accumulated by a manager per week in this industry is
between 10.36 and 16.76 hours.
We are 95% confident that the population proportion of
telemarketing firms that use their operation to assist
order processing is somewhere between .29 and .49.
There is a point estimate of .39 with a margin of error of +/-
.10.
Assumptions necessary to use t-
distribution
 Assumes random variable x is normally
distributed
 However, if sample size is large enough ( > 30),
t-distribution can be used when σ is unknown.
 But if sample size is small, evaluate the shape of
the sample data using a histogram or stem-and-
leaf.
 As the sample size increases, the t-distribution
approaches the Z distribution.
Statistical Inference facilitates
decision making.

Point and Interval Estimation

  • 1. Point and Interval Estimation Presented by: Shubham Mehta 0019
  • 2. GOALS 1. Methods of Estimation 2. Difference between Point and Interval Estimation 3. Defining Level of Confidence 4. Constructing Confidence Intervals 5. Interpretation of these confidence intervals 6. Determine the sample size for attribute and variable sampling. 7. Explanations using examples /case study
  • 3. What are Estimators?  In statistics, an estimator is a function of the data or sample that is used to infer the value of an unknown parameter in population in a statistical model.  Thus the estimator, the quantity of interest (the estimand or parameter) and its result (the estimate) are different from each other.
  • 4. Qualities of Estimators…Statisticians have already determined the “best” way to estimate a population parameter.  Qualities desirable in estimators include unbiasedness, consistency, and relative efficiency: • An unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter. • An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size grows larger. • If there are two unbiased estimators of a parameter, the one whose variance is smaller is said to be relatively efficient.
  • 5. Estimation…  There are two types of inference: estimation and hypothesis testing; estimation is introduced first.  The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.  E.g., the sample mean ( ) is employed to estimate the population mean ( ).
  • 6. Point & Interval Estimation…  For example, suppose we want to estimate the mean summer income of a class of business students. For n=25 students.  It is calculated and average is found to be 400 $/week. point estimate interval estimate  An alternative statement is:  The mean income is between 380 and 420 $/week. 10.6
  • 7. Estimation…  The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.  There are two types of estimators:  Point Estimator  Interval Estimator
  • 8. Point Estimator…  A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.  We saw earlier that point probabilities in continuous distributions were virtually zero. Likewise, we’d expect that the point estimator gets closer to the parameter value with an increased sample size, but point estimators don’t reflect the effects of larger sample sizes. Hence we will employ the interval estimator to estimate population parameters…
  • 9. Interval Estimator…  An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.  That is we say (with some ___% certainty) that the population parameter of interest is between some lower and upper bounds.
  • 10. 5. Statistical Inference: Estimation Goal: How can we use sample data to estimate values of population parameters? Point estimate: A single statistic value that is the “best guess” for the parameter value Interval estimate: An interval of numbers around the point estimate, that has a fixed “confidence level” of containing the parameter value. Called a confidence interval. (Based on sampling distribution of the point estimate)
  • 11. Point Estimators – Most common to use sample values  Sample mean estimates population mean m ˆ iy y n m    • Sample std. dev. estimates population std. dev. s 2 ( ) ˆ 1 iy y s n s      • Sample proportion p estimates population proportion p^.
  • 12. Methods of finding estimators… Point estimators  Method of moment  Maximum likelihood estimators  Bayes estimators  The EM algorithm Interval estimators  Inverting a test statistic  Pivotal quantities  Pivoting the CDF  Bayesian intervals
  • 13. Confidence Interval  Confidence Interval of a parameter consists of an interval of numbers along with a probability that the interval contains the unknown parameter  A confidence interval gives a range estimate of values:  Takes into consideration variation in sample statistics from sample to sample  Based on all the observations from one sample  Gives information about closeness to unknown population parameters  Stated in terms of level of confidence  Example: 95% confidence, 99% confidence  Can never be 100% confident  A confidence interval estimate of 100% would be so wide as to be meaningless for practical decision making
  • 14. Confidence Interval  The larger the CI , the more confident we can be that the given interval contains the unknown parameter.  Ideally, we prefer a short interval with high degree of confidence. For Example: We will prefer (95,100) with 95% confidence than (0,100) with 100% confidence.  The value corresponding to a significance level that determines those test statistics that lead to rejection of null hypothesis and those that lead to a decision not to reject null hypothesis is referred to as Critical Value.
  • 15. The general formula for all confidence intervals is: Point Estimate ± (Critical Value) (Standard Error) Sample Mean or Sample Proportion The “z” or “t” Critical Value σ / √ n or s / √ n Confidence Interval Estimates
  • 16. The Level of Significance (α)  Because we only select one sample, and μ or π are unknown, we never really know whether the confidence interval includes the true population mean or proportion, or not.  The level of significance, or “α” risk is the chance we take that the true population parameter is not contained in the confidence interval.  Therefore, a 95% confidence interval would have an “α” of 5%
  • 17. 95% Confidence Interval “ α “ is the proportion in the tails of the sampling distribution that is outside the established confidence interval. If α = .05, then each tail has .025 area a = .025a = .025 .9750.0250 + 1.96 z- 1.96 z Z .06 - 1.9 .0250 Z .06 + 1.9 .9750 The critical values of “z” that define the “α” areas are -1.96 and + 1.96 Point Estimate The Level of Significance (α)
  • 18. Level of Confidence  The Level of Confidence in a confidence interval is a probability that represents the percentage of intervals that will contain μ if a large number of repeated samples are obtained. The level of confidence is denoted (1 − α) × 100%.  For example, a 95% level of confidence would mean that if 100 confidence intervals were constructed, each based on a different sample from the same population, we would expect 95 of the intervals to contain the population mean.
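The repeated-sampling reading of the level of confidence can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 100, standard deviation 15), the sample size, and the number of trials are assumptions, not figures from the slides. It builds a 95% interval from each simulated sample and counts how often the interval contains the true mean.

```python
import math
import random
from statistics import NormalDist, mean


def coverage_rate(mu=100.0, sigma=15.0, n=30, trials=2000,
                  confidence=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain mu.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Critical value z for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)  # sigma is treated as known
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


print(f"Share of intervals containing the mean: {coverage_rate():.3f}")
```

The printed share should land close to 0.95, which is what "95% confident" refers to: the long-run behaviour of the procedure, not of any single interval.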
  • 19. Constructing Confidence Interval  The construction of a confidence interval for the population mean depends upon three factors 1. The point estimate of the population mean 2. The level of confidence 3. The standard deviation of the sample mean (the standard error)
  • 20. Constructing Confidence Interval  1. Select our desired level of confidence: let’s suppose we want to construct an interval using the 95% confidence level  2. Calculate α and α/2: (1 − α) × 100% = 95% ⇒ α = 0.05, α/2 = 0.025  3. Look up the corresponding z-score: α/2 = 0.025 ⇒ a z-score of 1.96
  • 21. Constructing Confidence Interval  4. Multiply the z-score by the standard error to find the margin of error: Zα/2 × σ/√n = 1.96 × σ/√n  5. Find the interval by adding and subtracting this product from the mean: (x̄ − Zα/2 × std. error, x̄ + Zα/2 × std. error) = (x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n)
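The five steps can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `z_interval` is made up here, and the inputs reuse the earlier income example (mean 400 $/week, n = 25) with an assumed population standard deviation of 50, which the slides do not give.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is known."""
    alpha = 1 - confidence                     # step 2: alpha
    z = NormalDist().inv_cdf(1 - alpha / 2)    # step 3: z-score for alpha/2
    std_error = sigma / math.sqrt(n)
    margin = z * std_error                     # step 4: margin of error
    return x_bar - margin, x_bar + margin      # step 5: mean -/+ margin


# sigma = 50 is an assumed value used only for illustration
low, high = z_interval(x_bar=400.0, sigma=50.0, n=25, confidence=0.95)
print(f"95% CI: ({low:.1f}, {high:.1f})")
```

With these assumed inputs the standard error is 10, so the interval comes out at roughly 380 to 420 $/week.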
  • 22. Constructing Confidence Interval  [Figure: 95% confidence interval on the standard normal curve, centered on the point estimate, with critical values z = −1.96 and z = +1.96 and area α/2 = .025 in each tail]
  • 23. Common Confidence Levels and α values  Here is a table of commonly used confidence levels, α and α/2 values, and corresponding z-scores which we are using in our examples:
      (1 − α) × 100%    α       α/2      Zα/2
      90%               0.10    0.050    1.645
      95%               0.05    0.025    1.96
      99%               0.01    0.005    2.58
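The z-scores in this table do not have to be looked up; they can be computed from the inverse of the standard normal cumulative distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_z` is an illustrative choice.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Two-sided critical value: the z with area alpha/2 in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    print(f"{level:.0%}  alpha={alpha:.2f}  alpha/2={alpha / 2:.3f}  "
          f"z={critical_z(level):.3f}")
```

The 99% value computes to about 2.576, which the table rounds to 2.58.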
  • 24. Ex : Suppose we conduct a poll to try and get a sense of the outcome of an upcoming election with two candidates. We poll 1000 people, and 550 of them respond that they will vote for candidate A How confident can we be that a given person will cast their vote for candidate A? 1. Select our desired levels of confidence We’re going to use the 90%, 95%, and 99% levels Constructing Confidence Interval Example
  • 25. Constructing Confidence Interval Example  2. Calculate α and α/2: our α values are 0.1, 0.05, and 0.01 respectively; our α/2 values are 0.05, 0.025, and 0.005  3. Look up the corresponding z-scores: our Zα/2 values are 1.645, 1.96, and 2.58  4. Multiply the z-score by the standard error to find the margin of error: first we need to calculate the standard error
  • 26. Constructing Confidence Interval Example  5. Find the interval by adding and subtracting this product from the mean  In this case, we are working with a distribution we have not previously discussed: a binomial distribution approximated by the normal (i.e., a voter can choose Candidate A or B, a binary outcome)  We have a probability estimate from our sample: the proportion of individuals in our sample voting for candidate A was found to be 550/1000, or 0.55  We can use this information in a formula to estimate the standard error for such a distribution:
  • 27. Constructing Confidence Interval Example  5. Multiply the z-score by the standard error cont. • For the normal approximation to the binomial distribution, the standard error can be estimated using: σX̄ = σ/√n = √( p(1 − p) / n ) = √( (0.55)(0.45) / 1000 ) = 0.0157 • We can now multiply this value by the z-scores to calculate the margins of error for each confidence level
  • 28. Constructing Confidence Interval Example  5. Multiply the z-score by the standard error cont. • We calculate the margin of error and add and subtract that value from the mean (0.55 in this case) to find the bounds of our confidence intervals at each level of confidence:
      CI     Zα/2     Margin of error    Lower bound    Upper bound
      90%    1.645    0.026              0.524          0.576
      95%    1.96     0.031              0.519          0.581
      99%    2.58     0.041              0.509          0.591
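The poll example can be reproduced directly from the counts given above (550 of 1000 respondents) and the z-scores used in the slides. The function name `proportion_interval` is an illustrative choice, not something from the original deck.

```python
import math


def proportion_interval(successes, n, z):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * std_error
    return std_error, margin, p_hat - margin, p_hat + margin


# z-scores taken from the table of common confidence levels
for label, z in (("90%", 1.645), ("95%", 1.96), ("99%", 2.58)):
    se, margin, low, high = proportion_interval(550, 1000, z)
    print(f"{label}  z={z}  SE={se:.4f}  margin={margin:.3f}  "
          f"bounds=({low:.3f}, {high:.3f})")
```

Even at the 99% level the lower bound stays above 0.50, so under these assumptions the poll points to a lead for candidate A.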
  • 29. Some Myths  What if we can make a “guess” about the proportion value?  How can we predict whether it will rain tomorrow?  Be Careful! The following statement is NOT true: “The probability that µ lies between 143.22 and 162.78 is .95.” Once you have inserted your sample results into the confidence interval formula, the word PROBABILITY can no longer be used to describe the resulting confidence interval.
  • 30. Some comments about CI’s  The effects of n and of the confidence coefficient hold for CIs for other parameters also  If we repeatedly took random samples of some fixed size n and each time calculated a 95% CI, in the long run about 95% of the CI’s would contain the population proportion π.  The probability that the CI does not contain π is called the error probability, and is denoted by α.  α = 1 – confidence coefficient
      (1 − α) × 100%    α      α/2     Zα/2
      90%               .10    .050    1.645
      95%               .05    .025    1.96
      99%               .01    .005    2.58
  • 31. Comments about CI for population mean µ  The method is robust to violations of the assumption of a normal population distribution (Be careful if sample data distribution is very highly skewed, or if it contains severe outliers)  Greater confidence requires wider CI  Greater n produces narrower CI
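The last two points follow from the margin of error Z × σ/√n: a larger Z widens the interval, and a larger n shrinks it. A short sketch makes this concrete; the standard deviation of 10 and the sample sizes are assumed values chosen only for illustration.

```python
import math
from statistics import NormalDist


def interval_width(sigma, n, confidence):
    """Full width of a z-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / math.sqrt(n)


sigma = 10.0  # hypothetical population standard deviation
for n in (100, 400):
    for confidence in (0.90, 0.95, 0.99):
        width = interval_width(sigma, n, confidence)
        print(f"n={n:4d}  confidence={confidence:.0%}  width={width:.2f}")
```

Because n enters through a square root, quadrupling the sample size only halves the width.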
  • 32. To determine sample size:  Know the desired confidence level, which determines the value of Z (the critical value from the standardized normal distribution). Determining the confidence level is subjective.  Know the acceptable sampling error, e: the amount of error that can be tolerated.  Know the standard deviation, σ. If unknown, estimate it from past data or make an educated guess: σ ≈ range/4. This estimate is derived from the empirical rule stating that approximately 95% of the values in a normal distribution are within ±2σ of the mean, giving a range of about 4σ within which most of the values are located.
  • 33. Selecting the Sample Size… We can control the width of the interval by determining the sample size necessary to produce narrow intervals. Suppose we want to estimate the mean demand “to within 5 units”; i.e. we want the interval estimate to be: x̄ ± 5. Since the interval estimate is x̄ ± Zα/2 × σ/√n, it follows that Zα/2 × σ/√n = 5. Solve for n to get the requisite sample size!
  • 34. Selecting the Sample Size… The amount of sampling error you are willing to accept and the level of confidence desired determine the size of your sample. e = Z (σ / √n), so n = Z²σ² / e²
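The formula n = Z²σ² / e² is easy to apply once Z, σ and e are fixed. The sketch below uses the "within 5 units" target mentioned earlier with a 95% z-score; the standard deviation of 75 and the range of 0 to 300 are assumptions made up for illustration, since the slides do not state them. The result is rounded up because a sample size must be a whole number.

```python
import math


def sample_size_for_mean(z, sigma, e):
    """Smallest n with margin of error z * sigma / sqrt(n) no larger than e."""
    return math.ceil((z * sigma / e) ** 2)


def sigma_from_range(low, high):
    """Rough guess for sigma from the range, using sigma ~ range / 4."""
    return (high - low) / 4


sigma_guess = sigma_from_range(0, 300)   # hypothetical range of demand values
n_required = sample_size_for_mean(z=1.96, sigma=sigma_guess, e=5)
print(f"sigma guess = {sigma_guess}, required n = {n_required}")
```

Halving the tolerated error e roughly quadruples the required sample size, since e appears squared in the denominator.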
  • 35. Choosing the Sample Size Ex. How large a sample size do we need to estimate a population proportion (e.g., “very happy”) to be within 0.03, with probability 0.95? i.e., what is n so that the margin of error of a 95% confidence interval is 0.03? Set 0.03 = margin of error and solve for n: 0.03 = 1.96 σπ̂ = 1.96 √( π(1 − π) / n )
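Solving 0.03 = 1.96 √( π(1 − π) / n ) for n gives n = 1.96² π(1 − π) / 0.03². The true π is unknown, so some value has to be plugged in. The sketch below assumes π = 0.5, which maximizes π(1 − π) and therefore gives the most cautious (largest) n; that choice is an assumption here, not something stated on the slide.

```python
import math


def sample_size_for_proportion(z, margin, pi=0.5):
    """Smallest n so that z * sqrt(pi * (1 - pi) / n) is at most `margin`.

    pi = 0.5 is a conservative default; a better guess for pi gives a smaller n.
    """
    return math.ceil(z ** 2 * pi * (1 - pi) / margin ** 2)


print(sample_size_for_proportion(z=1.96, margin=0.03))          # pi = 0.5
print(sample_size_for_proportion(z=1.96, margin=0.03, pi=0.3))  # a guessed pi
```

Under the conservative assumption the answer is a little over a thousand respondents, which is why polls of about that size are so common.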
  • 36. Some comments about CIs and sample size  We’ve seen that n depends on the confidence level (higher confidence requires larger n) and the population variability (more variability requires larger n)  In practice, determining n is not so easy, because (1) many parameters are to be estimated, and (2) resources may be limited and we may need to compromise, due to several constraints.  CI’s can be formed for any parameter (e.g., for the median, mode etc.)
  • 37. How large must a sample be for the Central Limit theorem to apply? The sample size varies according to the shape of the population. However, for our use, a sample size of 30 or larger will suffice. Q. Must sample sizes be 30 or larger for populations that are normally distributed? Ans: No. If the population is normally distributed, the sample means are normally distributed for sample sizes as small as n=1. Q. How large is large? Q. Why not just always pick a sample size of 30? Aim: Finding an optimum size for the sample
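The "n of 30 or larger" rule of thumb can be explored by simulation. The sketch below draws repeated samples of size 30 from a strongly skewed population and looks at how the sample means behave. The exponential population with mean 1 and the number of repetitions are assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, stdev


def sample_means(n, draws=5000, seed=0):
    """Means of `draws` samples of size n from an exponential population.

    The exponential distribution with rate 1 (mean 1, sd 1) is used here
    only as an example of a skewed, non-normal population.
    """
    rng = random.Random(seed)
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(draws)]


n = 30
means = sample_means(n)
std_error = 1.0 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1
share = sum(abs(m - 1.0) <= 1.96 * std_error for m in means) / len(means)

print(f"mean of sample means      : {mean(means):.3f}  (population mean 1)")
print(f"sd of sample means        : {stdev(means):.3f}  (theory {std_error:.3f})")
print(f"share within 1.96 std err : {share:.3f}  (normal theory 0.95)")
```

Even though the population is far from normal, the share of sample means within 1.96 standard errors should come out close to 0.95 at n = 30; rerunning with a much smaller n shows the approximation getting worse.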
  • 38. How can I tell the shape of the underlying population?  CHECK FOR NORMALITY:  Use descriptive statistics. Construct stem-and-leaf plots for small or moderate-sized data sets and frequency distributions and histograms for large data sets.  Compute measures of central tendency (mean and median) and compare with the theoretical and practical properties of the normal distribution. Compute the interquartile range. Does it approximate 1.33 times the standard deviation?  How are the observations in the data set distributed? Do approximately two thirds of the observations lie within 1 standard deviation of the mean? Do approximately four-fifths of the observations lie within 1.28 standard deviations of the mean? Do approximately 19 out of every 20 observations lie within 2 standard deviations of the mean?
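These numerical checks can be bundled into one small helper. The sketch below is illustrative: the function name `normality_checks` and the simulated data set (5000 normal values with mean 50 and standard deviation 10) are assumptions, and real data would replace the simulated list. For an exactly normal population the interquartile range is about 1.35 standard deviations, close to the 1.33 rule of thumb quoted above.

```python
import random
from statistics import mean, quantiles, stdev


def normality_checks(data):
    """Summary ratios to compare against their normal-distribution values."""
    m, s = mean(data), stdev(data)
    q1, _, q3 = quantiles(data, n=4)  # quartiles of the data

    def share_within(k):
        # Share of observations within k standard deviations of the mean
        return sum(abs(x - m) <= k * s for x in data) / len(data)

    return {
        "mean_minus_median": m - quantiles(data, n=2)[0],
        "iqr_over_sd": (q3 - q1) / s,            # about 1.33 to 1.35 if normal
        "within_1_sd": share_within(1),          # about two thirds
        "within_1.28_sd": share_within(1.28),    # about four fifths
        "within_2_sd": share_within(2),          # about 19 out of 20
    }


rng = random.Random(42)
data = [rng.gauss(50, 10) for _ in range(5000)]  # simulated, for illustration
checks = normality_checks(data)
for name, value in checks.items():
    print(f"{name:>18}: {value:.3f}")
```

Large departures from these reference values (for example a mean far from the median, or far fewer than two thirds of observations within one standard deviation) suggest the data are skewed or heavy-tailed.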
  • 39. Interpreting a Confidence Interval For the previous 95% confidence interval, the following conclusions are valid:  I am 95% confident that the average length of a call for the population µ, lies between 143.22 and 162.78 minutes.  If I repeatedly obtained samples of size 85, then 95% of the resulting confidence intervals would contain µ and 5% would not.  QUESTION: Does this confidence interval [143.22 to 162.78] contain µ?  ANSWER: I don’t know. All I can say is that this procedure leads to an interval containing µ 95% of the time.  I am 95% confident that my estimate of µ [namely 153 minutes] is within 9.78 minutes of the actual value of µ. RECALL: 9.78 is the margin of error.
  • 40. Interpretations (contd.) Therefore, 7.93% of the time, a random sample of 40 customers from this population will yield a mean expenditure of 8700 or more. OR From any random sample of 40 customers, 7.93% of them will spend on average 8700 or more.
  • 41. Interpretations (contd.) The point estimate for this problem is 13.56 hours, with an error of +/- 3.20 hours. I am 90% confident that the average amount of time accumulated by a manager per week in this industry is between 10.36 and 16.76 hours. We are 95% confident that the population proportion of telemarketing firms that use their operation to assist order processing is somewhere between .29 and .49. There is a point estimate of .39 with a margin of error of +/- .10.
  • 42. Assumptions necessary to use t-distribution  Assumes random variable x is normally distributed  However, if sample size is large enough ( > 30), t-distribution can be used when σ is unknown.  But if sample size is small, evaluate the shape of the sample data using a histogram or stem-and-leaf plot.  As the sample size increases, the t-distribution approaches the Z distribution.