Fractional imputation for handling missing data in
survey sampling
Jae-Kwang Kim
Iowa State University
Department of Cancer Epidemiology & Genetics
National Institutes of Health, National Cancer Institute
June 27, 2017
1 Introduction
2 Fractional Imputation
3 Fractional hot deck imputation
4 R package: FHDI
5 Numerical Illustration
6 Concluding Remarks
Introduction
Basic Setup
U = {1, 2, · · · , N}: Finite population
A ⊂ U: sample (selected by a probability sampling design).
Under complete response, suppose that
\hat{\eta}_{n,g} = \sum_{i \in A} w_i g(y_i)
is an unbiased estimator of \eta_g = N^{-1} \sum_{i=1}^{N} g(y_i). Here, g(\cdot) is a known function.
For example, g(y) = I(y < 3) leads to \eta_g = P(Y < 3).
Basic Setup (Cont'd)
A = A_R \cup A_M, where y_i is observed for i \in A_R and missing for i \in A_M.
R_i = 1 if i \in A_R and R_i = 0 if i \in A_M.
y_i^*: imputed value for y_i, i \in A_M.
Imputed estimator of \eta_g:
\hat{\eta}_{I,g} = \sum_{i \in A_R} w_i g(y_i) + \sum_{i \in A_M} w_i g(y_i^*)
Need E\{g(y_i^*) \mid R_i = 0\} = E\{g(y_i) \mid R_i = 0\}.
ML estimation under missing data setup
Often, find x (always observed) such that missing at random (MAR) holds: f(y \mid x, R = 0) = f(y \mid x).
Imputed values are created from f(y \mid x).
Computing the conditional expectation can be a challenging problem:
1 We do not know the true parameter \theta in f(y \mid x) = f(y \mid x; \theta):
E\{g(y_i) \mid x_i\} = E\{g(y_i) \mid x_i; \theta\}.
2 Even if we know \theta, computing the conditional expectation can be numerically difficult.
Imputation
Imputation: Monte Carlo approximation of the conditional expectation (given the observed data):
E\{g(y_i) \mid x_i\} \cong \frac{1}{M} \sum_{j=1}^{M} g(y_i^{*(j)})
1 Bayesian approach: generate y_i^* from
f(y_i \mid x_i, y_{obs}) = \int f(y_i \mid x_i, \theta) \, p(\theta \mid x_i, y_{obs}) \, d\theta
2 Frequentist approach: generate y_i^* from f(y_i \mid x_i; \hat{\theta}), where \hat{\theta} is a consistent estimator.
Comparison

                     Bayesian                 Frequentist
Model                Posterior distribution   Prediction model
                     f(latent, θ | data)      f(latent | data, θ)
Computation          Data augmentation        EM algorithm
Prediction           I-step                   E-step
Parameter update     P-step                   M-step
Parameter est'n      Posterior mode           ML estimation
Imputation           Multiple imputation      Fractional imputation
Variance estimation  Rubin's formula          Linearization or resampling
Fractional Imputation
Idea (parametric model approach)
Approximate E\{g(y_i) \mid x_i\} by
E\{g(y_i) \mid x_i\} \cong \sum_{j=1}^{M_i} w_{ij}^* g(y_i^{*(j)})
where w_{ij}^* is the fractional weight assigned to the j-th imputed value of y_i.
If y_i is a categorical variable, we can use
y_i^{*(j)} = the j-th possible value of y_i
w_{ij}^* = P(y_i = y_i^{*(j)} \mid x_i; \hat{\theta}),
where \hat{\theta} is the (pseudo) MLE of \theta.
Features
Split each record with a missing item into M (> 1) imputed values.
Assign fractional weights.
The final product is a single data file with size ≤ nM.
For variance estimation, the fractional weights are replicated.
Fractional imputation using parametric model
Assume y \sim f(y \mid x; \theta) for some \theta.
In this case, under MAR, the following fractional imputation method can be used.
1 Parameter estimation: Estimate \theta by solving
\sum_{i \in A} w_i R_i S(\theta; x_i, y_i) = 0   (1)
where S(\theta; x, y) = \partial \log f(y \mid x; \theta) / \partial \theta.
2 Imputation: Generate M imputed values of y_i, denoted by y_i^{*(j)}, j = 1, \cdots, M, from f(y_i \mid x_i; \hat{\theta}), where \hat{\theta} is obtained from (1).
For estimating \mu_g = E\{g(Y)\}, the fractional imputation estimator of \mu_g is
\hat{\mu}_{FI,g} = \sum_{i \in A} w_i \left\{ R_i g(y_i) + (1 - R_i) \sum_{j=1}^{M} w_{ij}^* g(y_i^{*(j)}) \right\},
where w_{ij}^* = 1/M.
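A minimal sketch of this two-step recipe in R, assuming a normal imputation model y | x ~ N(β0 + β1 x, σ²), unit sampling weights, and w*_ij = 1/M; the simulated data and all names here are hypothetical, not FHDI package code.

# Parametric fractional imputation sketch (hypothetical data).
set.seed(1)
n <- 200; M <- 50
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n)
r <- rbinom(n, 1, plogis(0.5 + x))   # MAR: response depends on x only
y[r == 0] <- NA

# Step 1 (parameter estimation): solve the score equation (1) on respondents.
fit   <- lm(y ~ x, subset = r == 1)
beta  <- coef(fit)
sigma <- summary(fit)$sigma

# Step 2 (imputation): M draws from f(y | x; theta-hat) for each missing y.
imp <- sapply(which(r == 0), function(i) rnorm(M, beta[1] + beta[2] * x[i], sigma))

# FI estimator of mu_g = E{g(Y)} with g(y) = I(y < 3) and w*_ij = 1/M.
g <- function(y) 1 * (y < 3)
mu_FI <- (sum(g(y[r == 1])) + sum(colMeans(g(imp)))) / n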
Remark
If we want to use a smaller M, the following two-phase sampling method can be developed.
1 First use a large M_1 (say M_1 = 10,000) to obtain y_i^{*(j)}, j = 1, \cdots, M_1.
2 From the first-phase sample of size M_1 generated in Step 1, select a second-phase sample of size M_2 using an efficient sampling method, such as stratification or systematic sampling. Under optimal stratification, we have
\sum_{j=1}^{M_2} w_{ij}^* g(y_i^{*(j)}) = E\{g(Y) \mid x_i; \hat{\theta}\} + O_p(\max\{M_1^{-1/2}, M_2^{-1}\}).   (2)
Figure: Two-phase sampling for fractional imputation: fractional imputation with size M_2 = 4 from the histogram of M_1 >> M_2 imputed values.
Calibration fractional imputation
In addition to efficient sampling, we can also use calibration weighting to construct fractional weights satisfying \sum_{j=1}^{M} w_{ij}^* = 1 and
\sum_{i \in A} w_i (1 - R_i) \sum_{j=1}^{M} w_{ij}^* g(y_i^{*(j)}) = \sum_{i \in A} w_i (1 - R_i) E\{g(Y) \mid x_i; \hat{\theta}\}
exactly for a prespecified g(\cdot) function. Calibration fractional weighting is discussed in Fuller and Kim (2005).
Variance estimation
For replication variance estimation, we first compute \hat{\theta}^{(k)} from (1) with w_i replaced by w_i^{(k)}. Next, we construct the replicated fractional weights w_{ij}^{*(k)} to satisfy \sum_{j=1}^{M} w_{ij}^{*(k)} = 1 and
\sum_{i \in A} w_i^{(k)} (1 - R_i) \sum_{j=1}^{M} w_{ij}^{*(k)} g(y_i^{*(j)}) \doteq \sum_{i \in A} w_i^{(k)} (1 - R_i) E\{g(Y) \mid x_i; \hat{\theta}^{(k)}\}.   (3)
The replicates of \hat{\mu}_{FI,g} can be computed as
\hat{\mu}_{FI,g}^{(k)} = \sum_{i \in A} w_i^{(k)} \left\{ R_i g(y_i) + (1 - R_i) \sum_{j=1}^{M} w_{ij}^{*(k)} g(y_i^{*(j)}) \right\}.
Note that the imputed values are not changed for each replicate; only the fractional weights are changed.
Variance estimation (Cont’d)
One way to achieve condition (3) is to use the importance weighting
w_{ij}^{*(k)} \propto \frac{f(y_i^{*(j)} \mid x_i; \hat{\theta}^{(k)})}{f(y_i^{*(j)} \mid x_i; \hat{\theta})}
with \sum_{j=1}^{M} w_{ij}^{*(k)} = 1.
If Y is a categorical variable, fractional imputation is much easier. For example, if Y is binary, then we only need two imputed values (y_i^{*(j)} = 0 or 1), and the fractional weight corresponding to y_i^{*(j)} is w_{ij}^* = P(Y = y_i^{*(j)} \mid x_i; \hat{\theta}), for j = 1, 2.
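Continuing the normal-model sketch above, the importance-weighting step for one missing unit might look as follows in R; theta_hat and theta_k are hypothetical lists holding the full-sample and k-th replicate parameter estimates.

# Replicated fractional weights via importance weighting (sketch).
# yimp: the M imputed values for one missing unit with covariate xi.
replicate_fw <- function(yimp, xi, theta_hat, theta_k) {
  f0 <- dnorm(yimp, theta_hat$beta[1] + theta_hat$beta[2] * xi, theta_hat$sigma)
  fk <- dnorm(yimp, theta_k$beta[1] + theta_k$beta[2] * xi, theta_k$sigma)
  w  <- fk / f0   # density ratio f(y* | x; theta^(k)) / f(y* | x; theta-hat)
  w / sum(w)      # normalize so the M replicate fractional weights sum to one
}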
Example 1: Fractional imputation for categorical data
Example (n = 10)
ID Weight y1 y2
1 w1 y1,1 y1,2
2 w2 y2,1 M
3 w3 M y3,2
4 w4 y4,1 y4,2
5 w5 y5,1 y5,2
6 w6 y6,1 y6,2
7 w7 M y7,2
8 w8 M M
9 w9 y9,1 y9,2
10 w10 y10,1 y10,2
M: Missing
Example 1
Example (y1, y2: dichotomous, taking 0 or 1)
ID   Weight       y1     y2
1    w1           y1,1   y1,2
2    w2 w*2,1     y2,1   0
     w2 w*2,2     y2,1   1
3    w3 w*3,1     0      y3,2
     w3 w*3,2     1      y3,2
4    w4           y4,1   y4,2
5    w5           y5,1   y5,2
6    w6           y6,1   y6,2
7    w7 w*7,1     0      y7,2
     w7 w*7,2     1      y7,2
8    w8 w*8,1     0      0
     w8 w*8,2     0      1
     w8 w*8,3     1      0
     w8 w*8,4     1      1
9    w9           y9,1   y9,2
10   w10          y10,1  y10,2
Example 1 (Cont’d)
E-step: The fractional weights are the conditional probabilities of the imputed values given the observations:
w_{ij}^* = \hat{P}(y_{i,mis}^{*(j)} \mid y_{i,obs}) = \frac{\hat{\pi}(y_{i,obs}, y_{i,mis}^{*(j)})}{\sum_{l=1}^{M_i} \hat{\pi}(y_{i,obs}, y_{i,mis}^{*(l)})}
where (y_{i,obs}, y_{i,mis}) is the (observed, missing) part of y_i = (y_{i,1}, y_{i,2}).
M-step: Update the joint probability using the fractional weights:
\hat{\pi}_{ab} = \frac{1}{\hat{N}} \sum_{i=1}^{n} \sum_{j=1}^{M_i} w_i w_{ij}^* I(y_{i,1}^{*(j)} = a, y_{i,2}^{*(j)} = b)
with \hat{N} = \sum_{i=1}^{n} w_i.
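A compact R sketch of this EM-by-weighting iteration for the bivariate binary case; the data vectors below are hypothetical (missing entries coded NA), and the loop count is fixed for simplicity rather than checked for convergence.

# EM by weighting for two dichotomous items (illustrative sketch).
em_weighting <- function(y1, y2, wt, n_iter = 50) {
  pi_hat <- matrix(0.25, 2, 2)        # start from uniform joint cell probabilities
  for (iter in 1:n_iter) {
    num <- matrix(0, 2, 2)
    for (i in seq_along(y1)) {
      # Enumerate all imputed values consistent with the observed part of y_i.
      a <- if (is.na(y1[i])) 0:1 else y1[i]
      b <- if (is.na(y2[i])) 0:1 else y2[i]
      cells <- expand.grid(a = a, b = b)
      # E-step: fractional weights proportional to the current cell probabilities.
      p  <- pi_hat[cbind(cells$a + 1, cells$b + 1)]
      fw <- p / sum(p)
      # M-step contribution: accumulate w_i * w*_ij into each cell (a, b).
      for (j in seq_len(nrow(cells)))
        num[cells$a[j] + 1, cells$b[j] + 1] <-
          num[cells$a[j] + 1, cells$b[j] + 1] + wt[i] * fw[j]
    }
    pi_hat <- num / sum(wt)           # pi-hat_ab = (1 / N-hat) * weighted counts
  }
  pi_hat
}

# Usage with a hypothetical n = 10 sample and equal weights:
y1 <- c(1, 1, NA, 0, 1, 0, NA, NA, 0, 1)
y2 <- c(0, NA, 1, 0, 1, 1, 1, NA, 0, 1)
em_weighting(y1, y2, wt = rep(1, 10))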
Fractional hot deck imputation 1
Hot deck imputation: Imputed values are observed values.
Suppose that we are interested in estimating \theta_1 = E(Y), or even \theta_2 = \Pr(Y < c), where y \sim f(y \mid x), x is always observed, and y is subject to missingness.
Assume MAR in the sense that \Pr(R = 1 \mid x, y) does not depend on y.
Assume that there exists z \in \{1, \cdots, G\} such that
f(y \mid x, z) = f(y \mid z).   (4)
In this case, we can assume that
y \mid (z = g) \overset{i.i.d.}{\sim} (\mu_g, \sigma_g^2),
which is sometimes called the cell mean model (Kim and Fuller, 2004).
Fractional hot deck imputation 2
Under (4), one can express
f(y \mid x) \cong \sum_{g=1}^{G} p_g(x) f_g(y)   (5)
where p_g(x) = P(z = g \mid x) and f_g(y) = f(y \mid z = g). Model (5) can be called a finite mixture model.
Under (5), we can implement two-step imputation (see the sketch below):
1 Step 1 (Parameter estimation): Compute the conditional CDF corresponding to (5) using
\hat{F}(y \mid x_i) = \sum_{g=1}^{G} \hat{p}_g(x_i) \hat{F}_g(y),
where \hat{p}_g(x) is the estimated cell probability and \hat{F}_g(y) is the empirical CDF within group g.
2 Step 2 (Imputation): From the conditional CDF, obtain M imputed values.
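A minimal R sketch of these two steps for one recipient, assuming the cell probabilities have already been estimated; y_resp, z_resp, and phat are hypothetical inputs, not package objects.

# Draw M donors from F-hat(y | x_i) = sum_g p-hat_g(x_i) F-hat_g(y).
# y_resp: respondent y values; z_resp: their cell labels in 1..G;
# phat:   length-G vector of estimated cell probabilities p-hat_g(x_i).
draw_donors <- function(M, y_resp, z_resp, phat) {
  g <- sample.int(length(phat), M, replace = TRUE, prob = phat)  # pick a cell
  sapply(g, function(gg) {
    pool <- y_resp[z_resp == gg]
    pool[sample.int(length(pool), 1)]   # uniform donor within the selected cell
  })
}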
Remark
Variable z is used to define imputation cells (or imputation classes).
If x is categorical and used directly to define cells (i.e., z = x), then p_g(x_i) = 1 if x_i = g and p_g(x_i) = 0 otherwise. In this case,
\hat{F}(y \mid x_i) = \hat{F}_g(y), for x_i = g.
Fractional hot deck imputation for this special case is discussed in Kim and Fuller (2004) and Fuller and Kim (2005).
If x is continuous or high-dimensional, we may use a classification method (or tree method) to define imputation cells.
Multivariate Extension
Idea
In hot deck imputation, we can make a nonparametric approximation of f(\cdot) using a finite mixture model
f(y_{i,mis} \mid y_{i,obs}) = \sum_{g=1}^{G} p_g(y_{i,obs}) f_g(y_{i,mis}),   (6)
where p_g(y_{i,obs}) = P(z_i = g \mid y_{i,obs}), f_g(y_{i,mis}) = f(y_{i,mis} \mid z = g), and z is the latent variable associated with the imputation cell.
To satisfy the above approximation, we need to find z such that
f(y_{i,mis} \mid z_i, y_{i,obs}) = f(y_{i,mis} \mid z_i).
Multivariate Extension
Imputation cell
Assume p-dimensional survey items: Y = (Y_1, \cdots, Y_p).
For each item k, transform Y_k into Z_k, a discrete version of Y_k based on the sample quantiles among respondents (see the sketch below).
If y_{i,k} is missing, then z_{i,k} is also missing.
Imputation cells are created based on the observed part of z_i = (z_{i,1}, \cdots, z_{i,p}).
Expression (6) can be written as
f(y_{i,mis} \mid y_{i,obs}) = \sum_{z_{mis}} P(z_{i,mis} = z_{mis} \mid y_{i,obs}) f(y_{i,mis} \mid z_i)
\cong \sum_{z_{mis}} P(z_{i,mis} = z_{mis} \mid z_{i,obs}) f(y_{i,mis} \mid z_i)
where z_i = (z_{i,obs}, z_{i,mis}) is partitioned similarly to y_i = (y_{i,obs}, y_{i,mis}).
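A sketch of the discretization step in R, assuming k = 3 categories per item (as in the FHDI examples below), distinct quantile break points, and the convention that 0 marks a missing cell value.

# Discretize each column of Y into k quantile-based categories; code missing as 0.
make_cells <- function(Y, k = 3) {
  apply(Y, 2, function(y) {
    br <- quantile(y, probs = seq(0, 1, length.out = k + 1), na.rm = TRUE)
    z  <- cut(y, breaks = br, labels = FALSE, include.lowest = TRUE)
    z[is.na(z)] <- 0L   # missing y -> missing (0) cell value
    z
  })
}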
Fractional hot deck imputation: Two-step approach
Step 1: Parameter estimation step
1 Compute \hat{P}(z_{mis} \mid y_{i,obs}): may require an iterative EM algorithm.
2 Compute the cell CDF from the set of full respondents.
Combine the two estimates to obtain the conditional CDF.
Step 2: Imputation step
Select M donors from the conditional CDF.
FHDI: Introduction 1
Input:
Multivariate missing data
Output (Goal):
Create a single complete data set with imputed values.
Preserve the correlation structure.
Provide a consistent FHDI estimator based on the imputed data.
Provide a variance estimator for the FHDI estimator.
FHDI: Introduction 2
(Recall) Fractional Imputation
E(y_{i,mis} \mid y_{i,obs}) is approximated by
E(y_{i,mis} \mid y_{i,obs}) \cong \sum_{j=1}^{M} w_{i,j}^* y_i^{*(j)}
Draw M (> 1) imputed values for each missing value.
Assign fractional weights to the imputed values.
The final product is a single data set with size ≤ n_R + n_M × M.
FHDI: Introduction 3
How can we generate y_{mis}^* from f(y_{mis} \mid y_{obs}) in the general case?
Apply a two-phase sampling approach:
(Phase I) Imputation cells for hot deck imputation
Determine imputation cells based on z, where z is the discretized version of y (use estimated quantiles to create z).
Estimate the cell probabilities for z using the EM by weighting method (Example 1).
(Phase II) Donor selection
Fractional imputation for missing y within each imputation cell.
Assign all possible values to the missing y_{mis} (FEFI, Fully Efficient Fractional Imputation), with fractional weights proportional to the estimated cell probabilities.
Approximate the FEFI imputation using systematic sampling (FHDI).
FHDI: Introduction 4
Analysis
Mean estimator:
\bar{y} = \frac{\sum_{i \in A} \sum_{j=1}^{M} w_i w_{ij}^* y_{ij}^*}{\sum_{i \in A} w_i}
Regression estimator:
\hat{\beta} = (X'WX)^{-1} X'W y^*
Variance estimator:
\hat{V}(\hat{\theta}_{FHDI}) = \sum_{k=1}^{L} c_k \left( \hat{\theta}_{FHDI}^{(k)} - \hat{\theta}_{FHDI} \right)^2,
where c_k is the replication factor associated with \hat{\theta}_{FHDI}^{(k)}, and \hat{\theta}_{FHDI}^{(k)} is the k-th replicate estimate obtained using the k-th replicate of the fractional weights, w_i^{(k)} \times w_{ij}^{*(k)}.
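Given a fractionally imputed file laid out like FHDI$fimp.data below (columns WT, FWT, and the y variables), the mean and its replication variance take only a few lines of R; rep_wt (a matrix whose k-th column holds the replicate products w_i^(k) * w*_ij^(k)) and ck are assumed inputs here, not package objects.

# Point estimate and replication variance for a mean (illustrative sketch).
fhdi_mean <- function(d, rep_wt, ck) {
  est <- sum(d$WT * d$FWT * d$y1) / sum(d$WT * d$FWT)
  rep_est <- apply(rep_wt, 2, function(w) sum(w * d$y1) / sum(w))
  c(estimate = est, variance = sum(ck * (rep_est - est)^2))
}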
FHDI: Implementation 1
Three scenarios for multivariate missing data
1 All categorical data: SAS procedure SURVEYIMPUTE and R package FHDI.
install.packages("FHDI")
Requires R 3.4.0 or later.
Requires Rtools34 or later.
More details: see https://sites.google.com/view/jaekwangkim/software.
2 All continuous data: R package FHDI.
3 Mixed categorical and continuous items: not applicable with the current version of FHDI.
FHDI: Implementation 2
We have n = 100 sample observations of the multivariate data vector y_i = (y_{1i}, y_{2i}, y_{3i}, y_{4i}), i = 1, \ldots, n, generated from
Y_1 = 1 + e_1
Y_2 = 2 + \rho e_1 + \sqrt{1 - \rho^2} e_2
Y_3 = Y_1 + e_3
Y_4 = -1 + 0.5 Y_3 + e_4.
We set \rho = 0.5; e_1 and e_2 are generated from a standard normal distribution; e_3 is generated from a standard exponential distribution; and e_4 is generated from a normal distribution N(0, 3/2).
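The generating model can be reproduced in a few lines of R; this is a sketch of the stated distributions (reading 3/2 as the variance of e_4), not the package's own example code, and missingness still has to be injected afterwards.

# Generate the n = 100 multivariate sample described above (sketch).
set.seed(123)
n <- 100; rho <- 0.5
e1 <- rnorm(n); e2 <- rnorm(n)
e3 <- rexp(n)                    # standard exponential
e4 <- rnorm(n, 0, sqrt(3 / 2))   # N(0, 3/2), taking 3/2 as the variance
y1 <- 1 + e1
y2 <- 2 + rho * e1 + sqrt(1 - rho^2) * e2
y3 <- y1 + e3
y4 <- -1 + 0.5 * y3 + e4
daty <- cbind(y1, y2, y3, y4)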
FHDI: Implementation 3
> library(FHDI)
> example(FHDI)
> summary(daty)
y1 y2 y3 y4
Min. :-1.6701 Min. :0.02766 Min. :-1.4818 Min. :-2.920292
1st Qu.: 0.4369 1st Qu.:1.03796 1st Qu.: 0.9339 1st Qu.:-0.781067
Median : 0.8550 Median :1.79693 Median : 1.7246 Median :-0.121467
Mean : 0.9821 Mean :1.93066 Mean : 1.7955 Mean :-0.006254
3rd Qu.: 1.6171 3rd Qu.:2.71396 3rd Qu.: 2.5172 3rd Qu.: 0.787863
Max. : 3.1312 Max. :5.07103 Max. : 5.3347 Max. : 4.351372
NA’s :42 NA’s :34 NA’s :18 NA’s :11
FHDI: Implementation 4
Categorization: imputation cell
> cdaty=FHDI_CellMake(daty,k=3)
> head(cdaty$data)
ID WT y1 y2 y3 y4
[1,] 1 1 1.47963286 2.150860 NA 1.894211796
[2,] 2 1 NA 1.141496 1.6025296 -1.036946859
[3,] 3 1 0.70870936 1.885673 1.2506894 NA
[4,] 4 1 NA 2.753840 NA 1.211049509
[5,] 5 1 0.86273572 2.425549 1.8875492 -0.539284732
[6,] 6 1 0.03460025 1.740481 0.4909525 0.007130484
> head(cdaty$cell)
y1 y2 y3 y4
[1,] 3 2 0 3
[2,] 0 1 1 1
[3,] 2 2 2 0
[4,] 0 2 0 3
[5,] 2 3 2 2
[6,] 1 2 1 2
FHDI: Implementation 5
> cdaty$cell.resp
y1 y2 y3 y4
[1,] 1 1 1 1
[2,] 1 1 2 3
[3,] 1 2 1 2
[4,] 1 2 2 1
[5,] 2 2 2 3
[6,] 2 3 2 2
[7,] 3 1 3 3
[8,] 3 2 3 2
[9,] 3 2 3 3
[10,] 3 3 3 1
> head(cdaty$cell.non.resp)
y1 y2 y3 y4
[1,] 0 0 0 2
[2,] 0 0 0 3
[3,] 0 0 1 1
[4,] 0 0 1 2
[5,] 0 0 2 1
[6,] 0 0 2 2
10 unique patterns in AR and 47 patterns in AM.
Ex) Respondents with (1,2,1,2) or (3,2,3,2) can be used as donors for
recipients with (0, 0, 0, 2).
FHDI: Implementation 6
MLE cell probability estimates
> datz=cdaty$cell
> jcp=FHDI_CellProb(datz)
> jcp$cellpr
1111 1123 1212 1221 2223 2322
0.18110421 0.05474648 0.12693514 0.07786676 0.17388579 0.08263912
3133 3232 3233 3331
0.02175015 0.10356376 0.08871434 0.08879425
> sum(jcp$cellpr)
[1] 1
A tailored version of EM by weighting (Ibrahim, 1990), as illustrated in Example 1.
FHDI: Implementation 7
> z
[,1] [,2] [,3]
[1,] 1 2 2
[2,] 2 1 2
[3,] 1 0 2
> FHDI_CellProb(z)
$cellpr
122 212
0.6666667 0.3333333
> z
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 2 2 2 2
[2,] 2 1 2 2 2 2
[3,] 1 0 2 2 2 2
> FHDI_CellProb(z)
$cellpr
122222 212222
0.6666667 0.3333333
Cell probabilities are computed on the unique patterns of Z in AR, incorporating information from the nonrespondents.
FHDI: Implementation 8
FEFI imputation
> FEFI=FHDI_Driver(daty, s_op_imputation="FEFI", i_op_variance=1, k=3)
> dim(FEFI$fimp.data)
[1] 330 8
> FEFI$fimp.data[1:13,]
ID FID WT FWT y1 y2 y3 y4
[1,] 1 1 1 0.5000000 1.47963286 2.150860 2.881646 1.8942118
[2,] 1 2 1 0.5000000 1.47963286 2.150860 2.493438 1.8942118
[3,] 2 1 1 0.2000000 -0.09087472 1.141496 1.602530 -1.0369469
[4,] 2 2 1 0.2000000 -1.67006193 1.141496 1.602530 -1.0369469
[5,] 2 3 1 0.2000000 -0.39302750 1.141496 1.602530 -1.0369469
[6,] 2 4 1 0.2000000 0.97612864 1.141496 1.602530 -1.0369469
[7,] 2 5 1 0.2000000 0.21467221 1.141496 1.602530 -1.0369469
[8,] 3 1 1 0.1666667 0.70870936 1.885673 1.250689 0.7770526
[9,] 3 2 1 0.1666667 0.70870936 1.885673 1.250689 1.2839115
[10,] 3 3 1 0.1666667 0.70870936 1.885673 1.250689 0.6309413
[11,] 3 4 1 0.1666667 0.70870936 1.885673 1.250689 0.3232018
[12,] 3 5 1 0.1666667 0.70870936 1.885673 1.250689 0.5848844
[13,] 3 6 1 0.1666667 0.70870936 1.885673 1.250689 1.0342970
FHDI: Implementation 9
FHDI imputation (M=5)
> FHDI=FHDI_Driver(daty, s_op_imputation="FHDI", M=5, i_op_variance=1, k=3)
> dim(FHDI$fimp.data)
[1] 285 8
> FHDI$fimp.data[1:12,]
ID FID WT FWT y1 y2 y3 y4
[1,] 1 1 1 0.5000000 1.47963286 2.150860 2.881646 1.8942118
[2,] 1 2 1 0.5000000 1.47963286 2.150860 2.493438 1.8942118
[3,] 2 1 1 0.2000000 -0.09087472 1.141496 1.602530 -1.0369469
[4,] 2 2 1 0.2000000 -1.67006193 1.141496 1.602530 -1.0369469
[5,] 2 3 1 0.2000000 -0.39302750 1.141496 1.602530 -1.0369469
[6,] 2 4 1 0.2000000 0.97612864 1.141496 1.602530 -1.0369469
[7,] 2 5 1 0.2000000 0.21467221 1.141496 1.602530 -1.0369469
[8,] 3 1 1 0.2000000 0.70870936 1.885673 1.250689 0.7770526
[9,] 3 2 1 0.2000000 0.70870936 1.885673 1.250689 1.2839115
[10 ,] 3 3 1 0.2000000 0.70870936 1.885673 1.250689 0.6309413
[11 ,] 3 4 1 0.2000000 0.70870936 1.885673 1.250689 0.3232018
[12 ,] 3 5 1 0.2000000 0.70870936 1.885673 1.250689 1.0342970
The FEFI imputed value 0.5848844 is not selected among the FHDI imputed values.
FHDI gives a large reduction in imputed-data size when the original data set is large.
FHDI: Implementation 10
Table: Regression (y1 ∼ y2) coefficient estimates with standard errors.
Estimator Intercept (S.E.) Slope (S.E.)
Naive -0.074 (0.305) 0.588 (0.142)
FEFI 0.035 (0.251) 0.466 (0.094)
FHDI 0.023 (0.252) 0.472 (0.095)
True 0 0.5
The FEFI and FHDI estimators produce smaller standard errors compared to the naive estimator.
The FHDI estimator approximates the FEFI estimator well.
Numerical illustration
A pseudo finite population constructed from a single month of data from the Monthly Retail Trade Survey (MRTS) at the U.S. Census Bureau.
N = 7,260 retail business units in five strata.
Three variables in the data:
h: stratum
x_{hi}: inventory value
y_{hi}: sales
Figure: Box plots of log sales and log inventory values by strata (five strata, log scale).
Imputation model
\log(y_{hi}) = \beta_{0h} + \beta_1 \log(x_{hi}) + e_{hi}
where e_{hi} \sim N(0, \sigma^2).
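On the respondents, this stratum-intercept model can be fit with one line of R; a sketch assuming a hypothetical data frame mrts with columns y, x, h, and response indicator r.

# Fit the imputation model log(y) = b_0h + b_1 log(x) + e on respondents (sketch).
fit <- lm(log(y) ~ factor(h) + log(x), data = mrts, subset = r == 1)
sigma2 <- summary(fit)$sigma^2   # estimate of the residual variance in N(0, sigma^2)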
Figure: Residuals vs. fitted values and normal Q-Q plot of standardized residuals for the regression model of log(y) against log(x) and strata indicators; observations 4, 6658, and 6424 are labeled.
Stratified random sampling
Table: Sample allocation in stratified simple random sampling.
Stratum             1       2       3       4       5
Stratum size Nh     352     566     1963    2181    2198
Sample size nh      28      32      46      46      48
Sampling weight     12.57   17.69   42.67   47.41   45.79
Response mechanism: MAR
Variable xhi is always observed and only yhi is subject to missingness.
R_{hi} \sim Bernoulli(\pi_{hi}), with \pi_{hi} = 1/[1 + \exp\{4 - 0.3 \log(x_{hi})\}].
The overall response rate is about 0.6.
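A two-line R sketch of this response mechanism (x holds the inventory values; names are hypothetical):

# Simulate the MAR response indicator R_hi.
p_resp <- 1 / (1 + exp(4 - 0.3 * log(x)))
R <- rbinom(length(x), 1, p_resp)   # overall response rate is about 0.6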
Simulation Study
Table 1: Monte Carlo bias and variance of the point estimators.
Parameter    Estimator        Bias   Variance   Std Var
             Complete sample  0.00   0.42       100
θ = E(Y)     MI               0.00   0.59       134
             FI               0.00   0.58       133

Table 2: Monte Carlo relative bias of the variance estimator.
Parameter   Imputation   Relative bias (%)
V(θ̂)        MI           18.4
            FI           2.7
Discussion
Rubin's formula is based on the following decomposition:
V(\hat{\theta}_{MI}) = V(\hat{\theta}_n) + V(\hat{\theta}_{MI} - \hat{\theta}_n)
where \hat{\theta}_n is the complete-sample estimator of \theta. Basically, the W_M term estimates V(\hat{\theta}_n) and the (1 + M^{-1}) B_M term estimates V(\hat{\theta}_{MI} - \hat{\theta}_n).
In the general case, we have
V(\hat{\theta}_{MI}) = V(\hat{\theta}_n) + V(\hat{\theta}_{MI} - \hat{\theta}_n) + 2 Cov(\hat{\theta}_{MI} - \hat{\theta}_n, \hat{\theta}_n)
and Rubin's variance estimator ignores the covariance term. Thus, a sufficient condition for the validity of Rubin's variance estimator is
Cov(\hat{\theta}_{MI} - \hat{\theta}_n, \hat{\theta}_n) = 0.
Meng (1994) called this condition congeniality.
Congeniality holds when \hat{\theta}_n is the MLE of \theta (a self-efficient estimator).
For example, there are two estimators of \theta = E(Y) when \log(Y) follows N(\beta_0 + \beta_1 x, \sigma^2):
1 Maximum likelihood method:
\hat{\theta}_{MLE} = n^{-1} \sum_{i=1}^{n} \exp\{\hat{\beta}_0 + \hat{\beta}_1 x_i + 0.5 \hat{\sigma}^2\}
2 Method of moments:
\hat{\theta}_{MME} = n^{-1} \sum_{i=1}^{n} y_i
The MME of \theta = E(Y) does not satisfy congeniality, so Rubin's variance estimator is biased for it (Yang and Kim, 2016).
Rubin's variance estimator is essentially unbiased for the MLE of \theta (relative bias = -1.9%), but the MLE is rarely used in practice.
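The two estimators can be computed side by side; a sketch in R assuming complete-data vectors x and y in the workspace (hypothetical), using the lognormal identity E(Y | x) = exp(mean + variance/2).

# MLE vs. method-of-moments estimators of theta = E(Y) (sketch).
fit <- lm(log(y) ~ x)
b   <- coef(fit)
s2  <- mean(residuals(fit)^2)                       # MLE of the error variance
theta_MLE <- mean(exp(b[1] + b[2] * x + 0.5 * s2))  # plug-in lognormal mean
theta_MME <- mean(y)                                # simple sample mean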
Summary
Fractional imputation was developed as a frequentist imputation method.
Multiple imputation is motivated by a Bayesian framework; the frequentist validity of multiple imputation requires congeniality.
Fractional imputation does not require the congeniality condition and works well for method-of-moments estimators.
Fractional hot deck imputation is now implemented in SAS and R.
Future research
Fractional imputation using Gaussian finite mixture models.
Survey data integration: extension of Kim and Rao (2012) and Kim, Berg, and Park (2016).
Fractional imputation under model uncertainty.
Collaborators
Fractional Hot Deck Imputation
Jongho Im, Inho Cho, Wayne Fuller, Pushpal Mukhopadhyay
Survey Data Integration
Emily Berg, J.N.K. Rao, Seho Park
Review of Fractional Imputation, Congeniality
Shu Yang
The end
User Guide: Orion™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxMurugaveni B
 
Forensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxForensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxkumarsanjai28051
 
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTXALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTXDole Philippines School
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologycaarthichand2003
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...D. B. S. College Kanpur
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRlizamodels9
 
Carbon Dioxide Capture and Storage (CSS)
Carbon Dioxide Capture and Storage (CSS)Carbon Dioxide Capture and Storage (CSS)
Carbon Dioxide Capture and Storage (CSS)Tamer Koksalan, PhD
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》rnrncn29
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024innovationoecd
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringPrajakta Shinde
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxJorenAcuavera1
 

Recently uploaded (20)

User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
Forensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxForensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptx
 
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTXALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technology
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
Carbon Dioxide Capture and Storage (CSS)
Carbon Dioxide Capture and Storage (CSS)Carbon Dioxide Capture and Storage (CSS)
Carbon Dioxide Capture and Storage (CSS)
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptx
 

Fractional imputation using parametric model
Assume y ∼ f (y | x; θ) for some θ. In this case, under MAR, the following
fractional imputation method can be used.
1 Parameter estimation: Estimate θ by solving
      Σ_{i∈A} w_i R_i S(θ; x_i, y_i) = 0,    (1)
  where S(θ; x, y) = ∂ log f (y | x; θ)/∂θ.
2 Imputation: Generate M imputed values of y_i, denoted by y_i^{*(j)},
  j = 1, · · · , M, from f (y_i | x_i; θ̂), where θ̂ is obtained from (1).
For estimating µ_g = E{g(Y)}, the fractional imputation estimator of µ_g is
      µ̂_{FI,g} = Σ_{i∈A} w_i { R_i g(y_i) + (1 − R_i) Σ_{j=1}^{M} w*_{ij} g(y_i^{*(j)}) },
where w*_{ij} = 1/M.
Kim (ISU) Fractional Imputation NIH/NCI 11 / 57
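To make steps 1 and 2 concrete, here is a minimal R sketch of parametric
fractional imputation under an assumed normal outcome model; the data,
model, and all object names are illustrative and not part of the FHDI package.

    # Minimal sketch of parametric FI (steps 1-2 above), assuming the
    # illustrative model y | x ~ N(beta0 + beta1*x, sigma^2).
    set.seed(1)
    n <- 200
    x <- rnorm(n)
    y <- 1 + 0.5 * x + rnorm(n)
    R <- rbinom(n, 1, plogis(0.5 + x))   # MAR: response depends on x only
    y[R == 0] <- NA
    w <- rep(1, n)                       # survey weights (equal here)

    # 1. Parameter estimation from respondents: under the normal model the
    #    weighted score equation (1) reduces to weighted least squares.
    fit   <- lm(y ~ x, weights = w, subset = R == 1)
    sigma <- summary(fit)$sigma

    # 2. Imputation: M draws per missing unit from f(y | x; theta-hat).
    M    <- 10
    mis  <- which(R == 0)
    yimp <- sapply(1:M, function(j)
      rnorm(length(mis), predict(fit, newdata = data.frame(x = x[mis])), sigma))

    # FI estimator of mu = E(Y), with fractional weights w*_ij = 1/M.
    mu_FI <- (sum(w[R == 1] * y[R == 1]) +
              sum(w[mis] * rowMeans(yimp))) / sum(w)
    mu_FI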
Remark
If we want to use a smaller M, the following two-phase sampling method can
be developed.
1 First use a large M1 (say M1 = 10, 000) to obtain y_i^{*(j)}, j = 1, · · · , M1.
2 From the first-phase sample of size M1 generated in Step 1, select a
  second-phase sample of size M2 using an efficient sampling method, such
  as stratification or systematic sampling (see the toy sketch below).
Under optimal stratification, we have
      Σ_{j=1}^{M2} w*_{ij} g(y_i^{*(j)}) = E{g(Y) | x_i; θ̂} + O_p(max{M1^{−1/2}, M2^{−1}}).    (2)
Kim (ISU) Fractional Imputation NIH/NCI 12 / 57
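A toy sketch of this two-phase reduction, using systematic sampling from the
ordered first-phase draws (one simple instance of Step 2); the N(2, 1) target
is an arbitrary stand-in for f (y | x; θ̂).

    # M1 first-phase draws collapsed to M2 = 4 values by systematic
    # sampling from the ordered draws (illustrative only).
    set.seed(8)
    M1 <- 10000; M2 <- 4
    pool  <- sort(rnorm(M1, mean = 2, sd = 1))   # phase 1: large pool
    start <- sample(M1 / M2, 1)
    y2    <- pool[seq(start, M1, by = M1 / M2)]  # phase 2: systematic sample
    mean(y2)                                     # approximates E(Y) = 2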
Figure: Two-phase sampling for fractional imputation: fractional imputation
with size M2 = 4 from the histogram of M1 >> M2 imputed values.
Kim (ISU) Fractional Imputation NIH/NCI 13 / 57
Calibration fractional imputation
In addition to efficient sampling, we can also consider calibration weighting
to construct the fractional weights so that
      Σ_{j=1}^{M} w*_{ij} = 1
and
      Σ_{i∈A} w_i (1 − R_i) Σ_{j=1}^{M} w*_{ij} g(y_i^{*(j)}) = Σ_{i∈A} w_i (1 − R_i) E{g(Y) | x_i; θ̂}
hold exactly for a prespecified g(·). Calibration fractional weighting is
discussed in Fuller and Kim (2005).
Kim (ISU) Fractional Imputation NIH/NCI 14 / 57
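The sketch below builds calibrated fractional weights for the identity g,
enforcing the stronger unit-level version of the constraint
(Σ_j w*_{ij} y_i^{*(j)} = E(Y | x_i; θ̂) for each recipient i); it reuses fit,
x, mis, yimp, and M from the parametric-FI sketch above, and the resulting
weights can be negative when M is small.

    # Regression-calibration form: w*_ij = 1/M + lam_i * (y_ij - ybar_i)
    # keeps each row summing to 1 while hitting the unit-level target.
    mu_i  <- predict(fit, newdata = data.frame(x = x[mis]))
    dev   <- sweep(yimp, 1, rowMeans(yimp))     # center each unit's draws
    lam   <- (mu_i - rowMeans(yimp)) / rowSums(dev^2)
    wstar <- 1 / M + sweep(dev, 1, lam, `*`)    # rows still sum to 1
    range(rowSums(wstar * yimp) - mu_i)         # ~ 0 by construction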
Variance estimation
For replication variance estimation, we first compute θ̂^{(k)} from (1) with
w_i replaced by w_i^{(k)}. Next, we construct the replicated fractional
weights w*_{ij}^{(k)} to satisfy
      Σ_{j=1}^{M} w*_{ij}^{(k)} = 1
and
      Σ_{i∈A} w_i^{(k)} (1 − R_i) Σ_{j=1}^{M} w*_{ij}^{(k)} g(y_i^{*(j)}) ≐ Σ_{i∈A} w_i^{(k)} (1 − R_i) E{g(Y) | x_i; θ̂^{(k)}}.    (3)
The replicates of µ̂_{FI,g} can be computed by
      µ̂_{FI,g}^{(k)} = Σ_{i∈A} w_i^{(k)} { R_i g(y_i) + (1 − R_i) Σ_{j=1}^{M} w*_{ij}^{(k)} g(y_i^{*(j)}) }.
Note that the imputed values are not changed for each replication. Only the
fractional weights are changed.
Kim (ISU) Fractional Imputation NIH/NCI 15 / 57
Variance estimation (Cont'd)
One way to achieve condition (3) is to use importance weighting:
      w*_{ij}^{(k)} ∝ f (y_i^{*(j)} | x_i; θ̂^{(k)}) / f (y_i^{*(j)} | x_i; θ̂),
with Σ_{j=1}^{M} w*_{ij}^{(k)} = 1.
If Y is a categorical variable, fractional imputation is much easier. For
example, if Y is binary then we only need two imputed values (y_i^{*(j)} = 0
or 1), and the fractional weight corresponding to y_i^{*(j)} is
w*_{ij} = P(Y = y_i^{*(j)} | x_i; θ̂), for j = 1, 2.
Kim (ISU) Fractional Imputation NIH/NCI 16 / 57
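A sketch of replicate fractional weights built from this importance ratio,
reusing fit, x, mis, yimp, M, R, w, n, y from the parametric-FI sketch above;
the single delete-one jackknife replicate is illustrative.

    # Replicate weights via the importance ratio f(.;theta^(k)) / f(.;theta).
    drop_unit <- which(R == 1)[1]              # delete one respondent
    fit_k <- lm(y ~ x, weights = w,
                subset = (R == 1) & (seq_len(n) != drop_unit))
    mu0 <- predict(fit,   newdata = data.frame(x = x[mis]))
    muk <- predict(fit_k, newdata = data.frame(x = x[mis]))
    ratio <- dnorm(yimp, muk, summary(fit_k)$sigma) /
             dnorm(yimp, mu0, summary(fit)$sigma)
    wk <- ratio / rowSums(ratio)               # rows sum to 1, as required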
Example 1: Fractional imputation for categorical data
Example (n = 10)
    ID   Weight   y1     y2
    1    w1       y1,1   y1,2
    2    w2       y2,1   M
    3    w3       M      y3,2
    4    w4       y4,1   y4,2
    5    w5       y5,1   y5,2
    6    w6       y6,1   y6,2
    7    w7       M      y7,2
    8    w8       M      M
    9    w9       y9,1   y9,2
    10   w10      y10,1  y10,2
M: Missing
Kim (ISU) Fractional Imputation NIH/NCI 17 / 57
Example 1
Example (y1, y2: dichotomous, taking 0 or 1)
    ID   Weight       y1     y2
    1    w1           y1,1   y1,2
    2    w2 w*_{2,1}  y2,1   0
         w2 w*_{2,2}  y2,1   1
    3    w3 w*_{3,1}  0      y3,2
         w3 w*_{3,2}  1      y3,2
    4    w4           y4,1   y4,2
    5    w5           y5,1   y5,2
Kim (ISU) Fractional Imputation NIH/NCI 18 / 57
Example 1
Example (y1, y2: dichotomous, taking 0 or 1)
    ID   Weight       y1     y2
    6    w6           y6,1   y6,2
    7    w7 w*_{7,1}  0      y7,2
         w7 w*_{7,2}  1      y7,2
    8    w8 w*_{8,1}  0      0
         w8 w*_{8,2}  0      1
         w8 w*_{8,3}  1      0
         w8 w*_{8,4}  1      1
    9    w9           y9,1   y9,2
    10   w10          y10,1  y10,2
Kim (ISU) Fractional Imputation NIH/NCI 19 / 57
Example 1 (Cont'd)
E-step: the fractional weights are the conditional probabilities of the
imputed values given the observations:
      w*_{ij} = P̂(y_{i,mis}^{*(j)} | y_{i,obs}) = π̂(y_{i,obs}, y_{i,mis}^{*(j)}) / Σ_{l=1}^{Mi} π̂(y_{i,obs}, y_{i,mis}^{*(l)}),
where (y_{i,obs}, y_{i,mis}) is the (observed, missing) part of y_i = (y_{i,1}, y_{i,2}).
M-step: update the joint probabilities using the fractional weights:
      π̂_{ab} = (1/N̂) Σ_{i=1}^{n} Σ_{j=1}^{Mi} w_i w*_{ij} I(y_{i,1}^{*(j)} = a, y_{i,2}^{*(j)} = b),
with N̂ = Σ_{i=1}^{n} w_i.
Kim (ISU) Fractional Imputation NIH/NCI 20 / 57
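A self-contained R sketch of this E/M iteration for two binary items (EM by
weighting); the data generation is illustrative, and this is not the FHDI
implementation.

    set.seed(2)
    n  <- 500
    y1 <- rbinom(n, 1, 0.6)
    y2 <- rbinom(n, 1, ifelse(y1 == 1, 0.7, 0.3))
    y1[rbinom(n, 1, 0.2) == 1] <- NA          # item nonresponse on y1
    y2[rbinom(n, 1, 0.2) == 1] <- NA          # item nonresponse on y2
    w  <- rep(1, n)                           # survey weights

    pi_ab <- matrix(0.25, 2, 2)               # joint cell probabilities
    for (iter in 1:50) {
      num <- matrix(0, 2, 2)
      for (i in 1:n) {
        a <- if (is.na(y1[i])) 0:1 else y1[i] # enumerate possible values
        b <- if (is.na(y2[i])) 0:1 else y2[i]
        cells <- expand.grid(a = a, b = b)
        idx   <- cbind(cells$a + 1, cells$b + 1)
        wstar <- pi_ab[idx] / sum(pi_ab[idx]) # E-step: fractional weights
        num[idx] <- num[idx] + w[i] * wstar   # M-step: weighted cell sums
      }
      pi_ab <- num / sum(w)
    }
    round(pi_ab, 3)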
1 Introduction
2 Fractional Imputation
3 Fractional hot deck imputation
4 R package: FHDI
5 Numerical Illustration
6 Concluding Remarks
Kim (ISU) Fractional Imputation NIH/NCI 21 / 57
Fractional hot deck imputation 1
Hot deck imputation: imputed values are observed values.
Suppose that we are interested in estimating θ1 = E(Y) or even
θ2 = Pr(Y < c), where y ∼ f (y | x), x is always observed, and y is subject
to missingness.
Assume MAR in the sense that Pr(R = 1 | x, y) does not depend on y.
Assume that there exists z ∈ {1, · · · , G} such that
      f (y | x, z) = f (y | z).    (4)
In this case, we can assume that
      y | (z = g) ∼ i.i.d. (µ_g, σ²_g),
which is sometimes called the cell mean model (Kim and Fuller, 2004).
Kim (ISU) Fractional Imputation NIH/NCI 22 / 57
Fractional hot deck imputation 2
Under (4), one can express
      f (y | x) ≅ Σ_{g=1}^{G} p_g(x) f_g(y),    (5)
where p_g(x) = P(z = g | x) and f_g(y) = f (y | z = g). Model (5) is a
finite mixture model. Under (5), we can implement two-step imputation:
1 Step 1 (Parameter estimation): compute the conditional CDF corresponding
  to (5) using
      F̂(y | x_i) = Σ_{g=1}^{G} p̂_g(x_i) F̂_g(y),
  where p̂_g(x) is the estimated cell probability and F̂_g(y) is the
  empirical CDF within group g.
2 Step 2 (Imputation): from the conditional CDF, obtain M imputed values.
Kim (ISU) Fractional Imputation NIH/NCI 23 / 57
Remark
Variable z is used to define imputation cells (or imputation classes).
If x is categorical and used directly to define cells (i.e. z = x), then
p_g(x_i) = 1 if x_i = g and p_g(x_i) = 0 otherwise. In this case,
      F̂(y | x_i) = F̂_g(y), for x_i = g.
Fractional hot deck imputation for this special case is discussed in Kim and
Fuller (2004) and Fuller and Kim (2005).
If x is continuous or high-dimensional, we may use a classification (or
tree) method to define imputation cells.
Kim (ISU) Fractional Imputation NIH/NCI 24 / 57
Multivariate Extension
Idea
In hot deck imputation, we can make a nonparametric approximation of f (·)
using a finite mixture model:
      f (y_{i,mis} | y_{i,obs}) = Σ_{g=1}^{G} p_g(y_{i,obs}) f_g(y_{i,mis}),    (6)
where p_g(y_{i,obs}) = P(z_i = g | y_{i,obs}), f_g(y_{i,mis}) = f (y_{i,mis} | z = g),
and z is the latent variable associated with the imputation cell.
For the above approximation to hold, we need to find z such that
      f (y_{i,mis} | z_i, y_{i,obs}) = f (y_{i,mis} | z_i).
Kim (ISU) Fractional Imputation NIH/NCI 25 / 57
Multivariate Extension
Imputation cell
Assume p-dimensional survey items: Y = (Y1, · · · , Yp).
For each item k, transform Yk into Zk, a discrete version of Yk based on the
sample quantiles among respondents. If y_{i,k} is missing, then z_{i,k} is
also missing.
Imputation cells are created based on the observed value of
z_i = (z_{i,1}, · · · , z_{i,p}). Expression (6) can be written as
      f (y_{i,mis} | y_{i,obs}) = Σ_{z_mis} P(z_{i,mis} = z_mis | y_{i,obs}) f (y_{i,mis} | z_i)
                                ≅ Σ_{z_mis} P(z_{i,mis} = z_mis | z_{i,obs}) f (y_{i,mis} | z_i),
where z_i = (z_{i,obs}, z_{i,mis}) is partitioned similarly to
y_i = (y_{i,obs}, y_{i,mis}).
Kim (ISU) Fractional Imputation NIH/NCI 26 / 57
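A sketch of the item-wise discretization: each continuous item is cut at the
respondents' sample quantiles into k = 3 categories. This is illustrative,
not the FHDI_CellMake internals.

    # Cut each item at its respondent quantiles; NA items give NA cells.
    discretize <- function(y, k = 3) {
      qs <- quantile(y, probs = seq(0, 1, length.out = k + 1), na.rm = TRUE)
      cut(y, breaks = qs, include.lowest = TRUE, labels = FALSE)
    }
    toy <- matrix(rnorm(400), ncol = 4)
    toy[sample(length(toy), 60)] <- NA
    z <- apply(toy, 2, discretize)   # cells come from observed z-patterns
    head(z)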
Fractional hot deck imputation: Two-step approach
Step 1: Parameter estimation
1 Compute P̂(z_mis | y_{i,obs}): may require an iterative EM algorithm.
2 Compute the cell CDF from the set of full respondents.
Combine the two estimates to obtain the conditional CDF.
Step 2: Imputation
Select M donors from the conditional CDF.
Kim (ISU) Fractional Imputation NIH/NCI 27 / 57
1 Introduction
2 Fractional Imputation
3 Fractional hot deck imputation
4 R package: FHDI
5 Numerical Illustration
6 Concluding Remarks
Kim (ISU) Fractional Imputation NIH/NCI 28 / 57
FHDI: Introduction 1
Input: multivariate missing data.
Output (goal):
  Create a single complete data set with imputed values.
  Preserve the correlation structure.
  Provide a consistent FHDI estimator based on the imputed data.
  Provide a variance estimator for the FHDI estimator.
Kim (ISU) Fractional Imputation NIH/NCI 29 / 57
FHDI: Introduction 2
(Recall) Fractional imputation: E(y_{i,mis} | y_{i,obs}) is approximated by
      E(y_{i,mis} | y_{i,obs}) ≅ Σ_{j=1}^{M} w*_{i,j} y_i^{*(j)}.
Draw M(> 1) imputed values for each missing value.
Assign fractional weights to the imputed values.
The final product is a single data set with size ≤ n_R + n_M × M.
Kim (ISU) Fractional Imputation NIH/NCI 30 / 57
FHDI: Introduction 3
How can we generate y*_mis from f (y_mis | y_obs) in the general case?
Apply the two-phase sampling approach:
(Phase I) Imputation cells for hot deck imputation
  Determine imputation cells based on z, the discretized version of y
  (estimated quantiles are used to create z).
  Estimate cell probabilities for z using the EM by weighting method
  (Example 1).
(Phase II) Donor selection
  Fractional imputation for missing y within each imputation cell.
  Assign all possible values to the missing y_mis (FEFI: Fully Efficient
  Fractional Imputation), with fractional weights proportional to the
  estimated cell probabilities.
  Approximate the FEFI imputation using systematic sampling (FHDI).
Kim (ISU) Fractional Imputation NIH/NCI 31 / 57
FHDI: Introduction 4
Analysis
Mean estimator:
      ȳ = Σ_{i∈A} Σ_{j=1}^{M} w_i w*_{ij} y*_{ij} / Σ_{i∈A} w_i.
Regression estimator:
      β̂ = (X′WX)^{−1} X′W y*.
Variance estimator:
      V̂(θ̂_FHDI) = Σ_{k=1}^{L} c_k ( θ̂_FHDI^{(k)} − θ̂_FHDI )²,
where c_k is a replication factor associated with θ̂_FHDI^{(k)}, and
θ̂_FHDI^{(k)} is the k-th replicate estimate obtained using the k-th
replicate fractional weights w_i^{(k)} × w*_{ij}^{(k)}.
Kim (ISU) Fractional Imputation NIH/NCI 32 / 57
FHDI: Implementation 1
Three scenarios for multivariate missing data:
1 All categorical data: SAS procedure SURVEYIMPUTE and R package FHDI.
      install.packages("FHDI")
  Requires R 3.4.0 or later and Rtools34 or later. More details: see
  https://sites.google.com/view/jaekwangkim/software.
2 All continuous data: R package FHDI.
3 A mix of categorical and continuous items: not supported by the current
  version of FHDI.
Kim (ISU) Fractional Imputation NIH/NCI 33 / 57
FHDI: Implementation 2
We have n = 100 sample observations of the multivariate data vector
y_i = (y_{1i}, y_{2i}, y_{3i}, y_{4i}), i = 1, . . . , n, generated from
      Y1 = 1 + e1,
      Y2 = 2 + ρ e1 + √(1 − ρ²) e2,
      Y3 = Y1 + e3,
      Y4 = −1 + 0.5 Y3 + e4.
We set ρ = 0.5; e1 and e2 are generated from a standard normal
distribution; e3 is generated from a standard exponential distribution; and
e4 is generated from a normal distribution N(0, 3/2).
Kim (ISU) Fractional Imputation NIH/NCI 34 / 57
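The following R lines generate data from this model. The slide does not
specify the missingness mechanism, so arbitrary item nonresponse is used
here for illustration.

    set.seed(3)
    n   <- 100
    rho <- 0.5
    e1 <- rnorm(n); e2 <- rnorm(n)
    e3 <- rexp(n);  e4 <- rnorm(n, 0, sqrt(3 / 2))
    y1 <- 1 + e1
    y2 <- 2 + rho * e1 + sqrt(1 - rho^2) * e2
    y3 <- y1 + e3
    y4 <- -1 + 0.5 * y3 + e4
    daty <- cbind(y1, y2, y3, y4)
    daty[cbind(sample(n, 50, replace = TRUE),
               sample(4, 50, replace = TRUE))] <- NA   # illustrative NAs
    summary(daty)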
FHDI: Implementation 3
> library(FHDI)
> example(FHDI)
> summary(daty)
       y1                y2                y3                y4
 Min.   :-1.6701   Min.   :0.02766   Min.   :-1.4818   Min.   :-2.920292
 1st Qu.: 0.4369   1st Qu.:1.03796   1st Qu.: 0.9339   1st Qu.:-0.781067
 Median : 0.8550   Median :1.79693   Median : 1.7246   Median :-0.121467
 Mean   : 0.9821   Mean   :1.93066   Mean   : 1.7955   Mean   :-0.006254
 3rd Qu.: 1.6171   3rd Qu.:2.71396   3rd Qu.: 2.5172   3rd Qu.: 0.787863
 Max.   : 3.1312   Max.   :5.07103   Max.   : 5.3347   Max.   : 4.351372
 NA's   :42        NA's   :34        NA's   :18        NA's   :11
Kim (ISU) Fractional Imputation NIH/NCI 35 / 57
FHDI: Implementation 4
Categorization: imputation cells
> cdaty=FHDI_CellMake(daty,k=3)
> head(cdaty$data)
     ID WT         y1       y2        y3           y4
[1,]  1  1 1.47963286 2.150860        NA  1.894211796
[2,]  2  1         NA 1.141496 1.6025296 -1.036946859
[3,]  3  1 0.70870936 1.885673 1.2506894           NA
[4,]  4  1         NA 2.753840        NA  1.211049509
[5,]  5  1 0.86273572 2.425549 1.8875492 -0.539284732
[6,]  6  1 0.03460025 1.740481 0.4909525  0.007130484
> head(cdaty$cell)
     y1 y2 y3 y4
[1,]  3  2  0  3
[2,]  0  1  1  1
[3,]  2  2  2  0
[4,]  0  2  0  3
[5,]  2  3  2  2
[6,]  1  2  1  2
Kim (ISU) Fractional Imputation NIH/NCI 36 / 57
FHDI: Implementation 5
> cdaty$cell.resp
      y1 y2 y3 y4
 [1,]  1  1  1  1
 [2,]  1  1  2  3
 [3,]  1  2  1  2
 [4,]  1  2  2  1
 [5,]  2  2  2  3
 [6,]  2  3  2  2
 [7,]  3  1  3  3
 [8,]  3  2  3  2
 [9,]  3  2  3  3
[10,]  3  3  3  1
> head(cdaty$cell.non.resp)
     y1 y2 y3 y4
[1,]  0  0  0  2
[2,]  0  0  0  3
[3,]  0  0  1  1
[4,]  0  0  1  2
[5,]  0  0  2  1
[6,]  0  0  2  2
There are 10 unique patterns in AR and 47 patterns in AM.
Ex) Respondents with pattern (1,2,1,2) or (3,2,3,2) can be used as donors
for recipients with pattern (0, 0, 0, 2).
Kim (ISU) Fractional Imputation NIH/NCI 37 / 57
FHDI: Implementation 6
MLE cell probability estimates
> datz=cdaty$cell
> jcp=FHDI_CellProb(datz)
> jcp$cellpr
      1111       1123       1212       1221       2223       2322
0.18110421 0.05474648 0.12693514 0.07786676 0.17388579 0.08263912
      3133       3232       3233       3331
0.02175015 0.10356376 0.08871434 0.08879425
> sum(jcp$cellpr)
[1] 1
A tailored version of EM by weighting (Ibrahim, 1990), as illustrated in
Example 1.
Kim (ISU) Fractional Imputation NIH/NCI 38 / 57
FHDI: Implementation 7
> z
     [,1] [,2] [,3]
[1,]    1    2    2
[2,]    2    1    2
[3,]    1    0    2
> FHDI_CellProb(z)
$cellpr
      122       212
0.6666667 0.3333333
> z
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    2    2    2    2
[2,]    2    1    2    2    2    2
[3,]    1    0    2    2    2    2
> FHDI_CellProb(z)
$cellpr
   122222    212222
0.6666667 0.3333333
Cell probabilities are computed on the unique patterns of Z in AR and
incorporate information from the nonrespondents.
Kim (ISU) Fractional Imputation NIH/NCI 39 / 57
FHDI: Implementation 8
FEFI imputation
> FEFI=FHDI_Driver(daty, s_op_imputation="FEFI", i_op_variance=1, k=3)
> dim(FEFI$fimp.data)
[1] 330   8
> FEFI$fimp.data[1:13,]
      ID FID WT       FWT          y1       y2       y3         y4
 [1,]  1   1  1 0.5000000  1.47963286 2.150860 2.881646  1.8942118
 [2,]  1   2  1 0.5000000  1.47963286 2.150860 2.493438  1.8942118
 [3,]  2   1  1 0.2000000 -0.09087472 1.141496 1.602530 -1.0369469
 [4,]  2   2  1 0.2000000 -1.67006193 1.141496 1.602530 -1.0369469
 [5,]  2   3  1 0.2000000 -0.39302750 1.141496 1.602530 -1.0369469
 [6,]  2   4  1 0.2000000  0.97612864 1.141496 1.602530 -1.0369469
 [7,]  2   5  1 0.2000000  0.21467221 1.141496 1.602530 -1.0369469
 [8,]  3   1  1 0.1666667  0.70870936 1.885673 1.250689  0.7770526
 [9,]  3   2  1 0.1666667  0.70870936 1.885673 1.250689  1.2839115
[10,]  3   3  1 0.1666667  0.70870936 1.885673 1.250689  0.6309413
[11,]  3   4  1 0.1666667  0.70870936 1.885673 1.250689  0.3232018
[12,]  3   5  1 0.1666667  0.70870936 1.885673 1.250689  0.5848844
[13,]  3   6  1 0.1666667  0.70870936 1.885673 1.250689  1.0342970
Kim (ISU) Fractional Imputation NIH/NCI 40 / 57
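Given the fimp.data layout printed above (ID, FID, WT, FWT, y1-y4), the mean
estimator from the Analysis slide can be computed directly; a sketch,
assuming fimp.data carries the printed column names.

    # Weighted mean of y1 from the FEFI output above.
    out  <- FEFI$fimp.data
    ybar <- sum(out[, "WT"] * out[, "FWT"] * out[, "y1"]) /
            sum(out[, "WT"] * out[, "FWT"])   # denominator = sum of w_i
    ybar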
FHDI: Implementation 9
FHDI imputation (M = 5)
> FHDI=FHDI_Driver(daty, s_op_imputation="FHDI", M=5, i_op_variance=1, k=3)
> dim(FHDI$fimp.data)
[1] 285   8
> FHDI$fimp.data[1:12,]
      ID FID WT       FWT          y1       y2       y3         y4
 [1,]  1   1  1 0.5000000  1.47963286 2.150860 2.881646  1.8942118
 [2,]  1   2  1 0.5000000  1.47963286 2.150860 2.493438  1.8942118
 [3,]  2   1  1 0.2000000 -0.09087472 1.141496 1.602530 -1.0369469
 [4,]  2   2  1 0.2000000 -1.67006193 1.141496 1.602530 -1.0369469
 [5,]  2   3  1 0.2000000 -0.39302750 1.141496 1.602530 -1.0369469
 [6,]  2   4  1 0.2000000  0.97612864 1.141496 1.602530 -1.0369469
 [7,]  2   5  1 0.2000000  0.21467221 1.141496 1.602530 -1.0369469
 [8,]  3   1  1 0.2000000  0.70870936 1.885673 1.250689  0.7770526
 [9,]  3   2  1 0.2000000  0.70870936 1.885673 1.250689  1.2839115
[10,]  3   3  1 0.2000000  0.70870936 1.885673 1.250689  0.6309413
[11,]  3   4  1 0.2000000  0.70870936 1.885673 1.250689  0.3232018
[12,]  3   5  1 0.2000000  0.70870936 1.885673 1.250689  1.0342970
The FEFI imputed value 0.5848844 is not selected among the FHDI imputed
values. The sample size reduction from FHDI is large when the original data
set is large.
Kim (ISU) Fractional Imputation NIH/NCI 41 / 57
FHDI: Implementation 10
Table: Regression (y1 ∼ y2) coefficient estimates with standard errors.
    Estimator   Intercept (S.E.)   Slope (S.E.)
    Naive       -0.074 (0.305)     0.588 (0.142)
    FEFI         0.035 (0.251)     0.466 (0.094)
    FHDI         0.023 (0.252)     0.472 (0.095)
    True         0                 0.5
The FEFI and FHDI estimators produce smaller standard errors compared with
the naive estimator, and the FHDI estimator approximates the FEFI estimator
well.
Kim (ISU) Fractional Imputation NIH/NCI 42 / 57
1 Introduction
2 Fractional Imputation
3 Fractional hot deck imputation
4 R package: FHDI
5 Numerical Illustration
6 Concluding Remarks
Kim (ISU) Fractional Imputation NIH/NCI 43 / 57
Numerical illustration
A pseudo finite population constructed from a single month of data in the
Monthly Retail Trade Survey (MRTS) at the US Bureau of the Census:
N = 7,260 retail business units in five strata.
Three variables in the data:
  h: stratum
  x_hi: inventory values
  y_hi: sales
Kim (ISU) Fractional Imputation NIH/NCI 44 / 57
Box plot of log sales and log inventory values by strata
[Figure: two box plots, "Box plot of sales data by strata" and "Box plot of
inventory data by strata", log scale on the vertical axis and strata 1-5 on
the horizontal axis.]
Kim (ISU) Fractional Imputation NIH/NCI 45 / 57
Imputation model
      log(y_hi) = β_{0h} + β_1 log(x_hi) + e_hi,
where e_hi ∼ N(0, σ²).
Kim (ISU) Fractional Imputation NIH/NCI 46 / 57
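A toy sketch of fitting this model, where `0 + factor(h)` yields one
intercept β_{0h} per stratum; the simulated data are illustrative stand-ins
for the MRTS variables.

    set.seed(7)
    h <- rep(1:5, each = 40)
    x <- rlnorm(200, meanlog = 14, sdlog = 1)
    y <- exp(0.1 * h + 1.0 * log(x) + rnorm(200, 0, 0.5))
    fit <- lm(log(y) ~ 0 + factor(h) + log(x))
    coef(fit)   # beta_01, ..., beta_05 and the common slope beta_1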
Residual plot and residual QQ plot
[Figure: "Residuals vs Fitted" and "Normal Q-Q" plots of the standardized
residuals for the regression model of log(y) against log(x) and the strata
indicator.]
Kim (ISU) Fractional Imputation NIH/NCI 47 / 57
Stratified random sampling
Table: The sample allocation in stratified simple random sampling.
    Strata                 1      2      3      4      5
    Strata size Nh       352    566   1963   2181   2198
    Sample size nh        28     32     46     46     48
    Sampling weight    12.57  17.69  42.67  47.41  45.79
Kim (ISU) Fractional Imputation NIH/NCI 48 / 57
Response mechanism: MAR
Variable x_hi is always observed and only y_hi is subject to missingness.
The response indicator follows
      R_hi ∼ Bernoulli(π_hi),   π_hi = 1/[1 + exp{4 − 0.3 log(x_hi)}].
The overall response rate is about 0.6.
Kim (ISU) Fractional Imputation NIH/NCI 49 / 57
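A sketch of this response mechanism; the lognormal x values are illustrative
stand-ins for the inventory variable x_hi.

    set.seed(5)
    x  <- rlnorm(1000, meanlog = 14, sdlog = 1)   # inventory-like values
    pi <- 1 / (1 + exp(4 - 0.3 * log(x)))
    R  <- rbinom(length(x), 1, pi)
    mean(R)   # overall response rate, roughly 0.6 for log(x) around 14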
Simulation Study
Table 1: Monte Carlo bias and variance of the point estimators.
    Parameter    Estimator         Bias   Variance   Std Var
    θ = E(Y )    Complete sample   0.00   0.42       100
                 MI                0.00   0.59       134
                 FI                0.00   0.58       133
Table 2: Monte Carlo relative bias of the variance estimator.
    Parameter   Imputation   Relative bias (%)
    V (θ̂)       MI           18.4
                FI            2.7
Kim (ISU) Fractional Imputation NIH/NCI 50 / 57
Discussion
Rubin's formula is based on the decomposition
      V (θ̂_MI) = V (θ̂_n) + V (θ̂_MI − θ̂_n),
where θ̂_n is the complete-sample estimator of θ. Essentially, the W_M term
estimates V (θ̂_n) and the (1 + M^{−1}) B_M term estimates V (θ̂_MI − θ̂_n).
In general, however,
      V (θ̂_MI) = V (θ̂_n) + V (θ̂_MI − θ̂_n) + 2 Cov(θ̂_MI − θ̂_n, θ̂_n),
and Rubin's variance estimator ignores the covariance term. Thus, a
sufficient condition for the variance estimator to be unbiased is
Cov(θ̂_MI − θ̂_n, θ̂_n) = 0. Meng (1994) called this condition congeniality.
Congeniality holds when θ̂_n is the MLE of θ (a self-efficient estimator).
Kim (ISU) Fractional Imputation NIH/NCI 51 / 57
Discussion
For example, there are two estimators of θ = E(Y) when log(Y) follows
N(β0 + β1 x, σ²):
1 Maximum likelihood:
      θ̂_MLE = n^{−1} Σ_{i=1}^{n} exp{β̂0 + β̂1 x_i + 0.5 σ̂²}.
2 Method of moments:
      θ̂_MME = n^{−1} Σ_{i=1}^{n} y_i.
The MME of θ = E(Y) does not satisfy congeniality, and Rubin's variance
estimator is biased (Yang and Kim, 2016). Rubin's variance estimator is
essentially unbiased for the MLE of θ (R.B. = −1.9%), but the MLE is rarely
used in practice.
Kim (ISU) Fractional Imputation NIH/NCI 52 / 57
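A complete-data sketch contrasting the two estimators under the lognormal
model above; the coefficients are illustrative.

    set.seed(6)
    n <- 1000
    x <- runif(n)
    y <- exp(0.2 + 0.7 * x + rnorm(n, 0, 0.5))
    fit <- lm(log(y) ~ x)
    s2  <- summary(fit)$sigma^2
    theta_MLE <- mean(exp(fitted(fit) + 0.5 * s2))  # n^{-1} sum of model means
    theta_MME <- mean(y)                            # n^{-1} sum of y_i
    c(MLE = theta_MLE, MME = theta_MME)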
1 Introduction
2 Fractional Imputation
3 Fractional hot deck imputation
4 R package: FHDI
5 Numerical Illustration
6 Concluding Remarks
Kim (ISU) Fractional Imputation NIH/NCI 53 / 57
Summary
Fractional imputation was developed as a frequentist imputation method;
multiple imputation is motivated by a Bayesian framework.
The frequentist validity of multiple imputation requires congeniality.
Fractional imputation does not require the congeniality condition and works
well for method-of-moments estimators.
Fractional hot deck imputation is now available in SAS and R.
Kim (ISU) Fractional Imputation NIH/NCI 54 / 57
Future research
Fractional imputation using Gaussian finite mixture models.
Survey data integration: extension of Kim and Rao (2012) and Kim, Berg,
and Park (2016).
Fractional imputation under model uncertainty.
Kim (ISU) Fractional Imputation NIH/NCI 55 / 57
Collaborators
Fractional hot deck imputation: Jongho Im, Inho Cho, Wayne Fuller,
Pushpal Mukhopadhyay.
Survey data integration: Emily Berg, J.N.K. Rao, Seho Park.
Review of fractional imputation, congeniality: Shu Yang.
Kim (ISU) Fractional Imputation NIH/NCI 56 / 57
The end
Kim (ISU) Fractional Imputation NIH/NCI 57 / 57