Double Robustness:
Theory and Applications with Missing Data

Lu Mao
Department of Biostatistics
The University of North Carolina at Chapel Hill
Email: lmao@unc.edu
April 17, 2013
Table of Contents

Part I: A Semiparametric Perspective
- A motivating example
- Semiparametric approaches to coarsened data
- Constructing the estimating equation

Part II: Applications in Missing Data Problems
- Data with two levels of missingness
- Monotone coarsened data
Part I: A Semiparametric Perspective 
A Motivating Example

- Given an iid sample \(Y_1, \dots, Y_n\) from an arbitrary distribution, consider estimating the population mean \(\mu = E(Y)\) by \(\bar{Y}\), which solves
\[
\mathbb{P}_n(Y - \mu) = 0, \qquad \text{where } \mathbb{P}_n Z \equiv \frac{1}{n}\sum_{i=1}^n Z_i.
\]
- Suppose some of the \(Y_i\)'s are missing. Let \(R_i = 1\) if \(Y_i\) is observed and \(R_i = 0\) otherwise, and let \(\pi(Y) = P(R = 1 \mid Y)\). Now consider estimating \(\mu\) by solving \(\mathbb{P}_n R(Y - \mu) = 0\), resulting in the complete-case estimator
\[
\hat{\mu}_{CC} = \frac{\sum_i R_i Y_i}{\sum_i R_i} \to_p \frac{E\{\pi(Y)\,Y\}}{E\{\pi(Y)\}} \neq \mu.
\]
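The complete-case bias is easy to see in a quick simulation. The following Python sketch is purely illustrative (the logistic selection probability and all numerical values are assumptions, not from the slides): \(Y\) has mean 1, large values of \(Y\) are more likely to be observed, and the complete-case mean overshoots.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
y = rng.normal(loc=1.0, scale=1.0, size=n)        # true mean mu = 1
pi_y = 1.0 / (1.0 + np.exp(-(0.5 + y)))           # P(R=1|Y): increasing in Y (assumed)
r = rng.binomial(1, pi_y)

mu_cc = y[r == 1].mean()                          # complete-case estimate
print(mu_cc)                                      # noticeably above 1
```

Because observation probability increases with \(Y\), the observed sample over-represents large values, matching the limit \(E\{\pi(Y)Y\}/E\{\pi(Y)\} > \mu\) here.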
- Suppose that, in addition to \(Y_i\), an auxiliary variable \(X_i\) is also collected, and that \(R \perp Y \mid X\). Assume \(P(R = 1 \mid Y, X) = \pi(X; \psi)\). To correct the bias, we apply the estimating equation (\(\hat\psi_n\) is a consistent estimator of \(\psi_0\))
\[
\Psi_n^{IPW}(\mu) = \mathbb{P}_n\left\{ \frac{R}{\pi(X; \hat\psi_n)}\,(Y - \mu) \right\},
\]
resulting in
\[
\hat{\mu}_{IPW} = \frac{\mathbb{P}_n\{R Y / \pi(X; \hat\psi_n)\}}{\mathbb{P}_n\{R / \pi(X; \hat\psi_n)\}}
= \frac{\mathbb{P}_n\{R Y / \pi(X; \psi_0)\}}{\mathbb{P}_n\{R / \pi(X; \psi_0)\}} + o_p(1)
\to_p \frac{E\{R Y / \pi(X; \psi_0)\}}{E\{R / \pi(X; \psi_0)\}} = \mu. \tag{1}
\]
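A hypothetical Python sketch of the IPW correction in equation (1), using the true propensity \(\pi(X;\psi_0)\) for simplicity (the data-generating design below is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = 1.0 + x + rng.normal(size=n)                  # E(Y|X) = 1 + X, so mu = 1
pi_x = 1.0 / (1.0 + np.exp(-(0.5 + x)))           # P(R=1|X): MAR given X (assumed)
r = rng.binomial(1, pi_x)

mu_cc = y[r == 1].mean()                          # biased: ignores selection
mu_ipw = np.sum(r * y / pi_x) / np.sum(r / pi_x)  # IPW estimate, eq. (1)
print(mu_cc, mu_ipw)
```

Reweighting each observed unit by \(1/\pi(X)\) restores the mean of the full population, while the complete-case average remains biased upward.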
- Assume a working outcome model \(m(X; \alpha) = E(Y \mid X; \alpha)\), and consider a new estimating equation as a modification of \(\Psi_n^{IPW}\):
\[
\Psi_n^{DR}(\mu) = \mathbb{P}_n\left\{ \frac{R}{\pi(X; \hat\psi_n)}(Y - \mu) - \frac{R - \pi(X; \hat\psi_n)}{\pi(X; \hat\psi_n)} \bigl(m(X; \hat\alpha_n) - \mu\bigr) \right\}, \tag{2}
\]
resulting in
\[
\hat{\mu}_{DR} = \mathbb{P}_n\left\{ \frac{R Y}{\pi(X; \hat\psi_n)} - \frac{R - \pi(X; \hat\psi_n)}{\pi(X; \hat\psi_n)}\, m(X; \hat\alpha_n) \right\}. \tag{3}
\]
Now let us study the consistency of \(\hat{\mu}_{DR}\) under different assumptions.
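Equation (3) translates directly into code. A hypothetical Python sketch (the data-generating model is an assumption; the outcome regression \(m\) is fit by least squares on the complete cases, which is valid under MAR):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
x = rng.normal(size=n)
y = 1.0 + x + rng.normal(size=n)                  # mu = 1
pi_x = 1.0 / (1.0 + np.exp(-(0.5 + x)))           # true propensity (assumed known)
r = rng.binomial(1, pi_x)

# Outcome model m(X) = E(Y|X): linear least squares on complete cases
X1 = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X1[r == 1], y[r == 1], rcond=None)
m_x = X1 @ coef

# Doubly robust estimate, eq. (3)
mu_dr = np.mean(r * y / pi_x - (r - pi_x) / pi_x * m_x)
print(mu_dr)
```

The augmentation term has mean zero when the propensity is correct, so it costs nothing in bias while (as shown later) reducing variance.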
- Scenario 1: \(\pi(X; \psi)\) correct; \(m(X; \alpha)\) incorrect. So \(\hat\psi_n \to_p \psi_0\), but \(\hat\alpha_n \to \alpha^*\) with \(m(X; \alpha^*) \neq E(Y \mid X)\):
\[
\begin{aligned}
\hat{\mu}_{DR} &= \mathbb{P}_n\left\{ \frac{R}{\pi(X;\psi_0)}\, Y - \frac{R - \pi(X;\psi_0)}{\pi(X;\psi_0)}\, m(X;\alpha^*) \right\} + o_p(1) \\
&\to_p E\left\{ \frac{R}{\pi(X;\psi_0)}\, Y \right\} - E\left\{ \frac{R - \pi(X;\psi_0)}{\pi(X;\psi_0)}\, m(X;\alpha^*) \right\} \\
&= E\left[ Y\, E\left\{ \frac{R}{\pi(X;\psi_0)} \,\Big|\, Y, X \right\} \right] - E\left[ m(X;\alpha^*)\, E\left\{ \frac{R - \pi(X;\psi_0)}{\pi(X;\psi_0)} \,\Big|\, Y, X \right\} \right] \\
&= \mu - 0 = \mu. \tag{4}
\end{aligned}
\]
- Scenario 2: \(m(X; \alpha)\) correct; \(\pi(X; \psi)\) incorrect. So \(\hat\alpha_n \to_p \alpha_0\), but \(\hat\psi_n \to \psi^*\) with \(\pi(X; \psi^*) \neq E(R \mid Y, X)\):
\[
\begin{aligned}
\hat{\mu}_{DR} &= \mathbb{P}_n\left\{ \frac{R}{\pi(X;\psi^*)}\, Y - \frac{R - \pi(X;\psi^*)}{\pi(X;\psi^*)}\, m(X;\alpha_0) \right\} + o_p(1) \\
&\to_p E\left\{ \frac{R}{\pi(X;\psi^*)}\, Y \right\} - E\left\{ \frac{R - \pi(X;\psi^*)}{\pi(X;\psi^*)}\, m(X;\alpha_0) \right\} \\
&= E\left\{ \frac{R}{\pi(X;\psi^*)}\, E(Y \mid R, X) \right\} - E\left\{ \frac{R - \pi(X;\psi^*)}{\pi(X;\psi^*)}\, m(X;\alpha_0) \right\} \\
&= E\left\{ \frac{R}{\pi(X;\psi^*)}\, m(X;\alpha_0) \right\} - E\left\{ \frac{R - \pi(X;\psi^*)}{\pi(X;\psi^*)}\, m(X;\alpha_0) \right\} \\
&= E\{ m(X;\alpha_0) \} = \mu, \tag{5}
\end{aligned}
\]
where the third equality uses MAR: \(E(Y \mid R = 1, X) = E(Y \mid X) = m(X; \alpha_0)\).
Result 1 (Double robustness)
\(\hat{\mu}_{DR}\) is consistent if either the \(\pi\) model or the \(m\) model is correct, that is, under \(\mathcal{M}_1 \cup \mathcal{M}_2\), where \(\mathcal{M}_1 = \{p(r \mid y, x; \psi) : \psi \in \Psi_1\}\) and \(\mathcal{M}_2 = \{p(y \mid x; \alpha) : \alpha \in \Psi_2\}\). In other words, \(\hat{\mu}_{DR}\) is doubly robust.
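Result 1 can be checked numerically. In the hypothetical Python simulation below (the data-generating design and the deliberately wrong working models are assumptions for illustration), the DR estimate of equation (3) stays near the true mean \(\mu = 1\) when either nuisance model is misspecified, as long as the other is correct:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400_000
x = rng.normal(size=n)
y = 1.0 + x + rng.normal(size=n)                  # mu = 1
pi_x = 1.0 / (1.0 + np.exp(-(0.5 + x)))           # true propensity
r = rng.binomial(1, pi_x)

m_true = 1.0 + x                                  # correct E(Y|X)
m_bad = np.full(n, y[r == 1].mean())              # wrong outcome model (biased constant)
pi_bad = np.full(n, r.mean())                     # wrong propensity model (constant)

def dr(pi, m):
    """Doubly robust estimate of the mean, eq. (3)."""
    return np.mean(r * y / pi - (r - pi) / pi * m)

mu_scen1 = dr(pi_x, m_bad)    # Scenario 1: pi correct, m wrong
mu_scen2 = dr(pi_bad, m_true) # Scenario 2: m correct, pi wrong
print(mu_scen1, mu_scen2)     # both near 1
```

Either correct nuisance model kills the bias introduced by the other, exactly as derivations (4) and (5) predict.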
- Now let us consider a somewhat different question: efficiency under \(\mathcal{M}_1 \cap \mathcal{M}_2\). For simplicity, we assume the true values \((\psi_0, \alpha_0)\) are known.
- Denote \(\mathbb{G}_n g(Z) = n^{-1/2} \sum_{i=1}^n \{g(Z_i) - E g(Z)\}\). Algebraic manipulations yield
\[
\sqrt{n}\,(\hat{\mu}_{IPW} - \mu) = \frac{1}{\mathbb{P}_n\{R/\pi(X;\psi_0)\}}\, \mathbb{G}_n\left\{ \frac{R}{\pi(X;\psi_0)}\,(Y - \mu) \right\} \rightsquigarrow N(0, \sigma^2_{IPW}). \tag{6}
\]
- where
\[
\sigma^2_{IPW} = E\left\{ \frac{R}{\pi(X;\psi_0)^2}\,(Y-\mu)^2 \right\} = E\left\{ \frac{(Y-\mu)^2}{\pi(X;\psi_0)} \right\}.
\]
- Similarly,
\[
\sqrt{n}\,(\hat{\mu}_{DR} - \mu) = \mathbb{G}_n\left\{ \frac{R}{\pi(X;\psi_0)}(Y-\mu) - \frac{R-\pi(X;\psi_0)}{\pi(X;\psi_0)} \bigl(m(X;\alpha_0)-\mu\bigr) \right\} \rightsquigarrow N(0, \sigma^2_{DR}), \tag{7}
\]
where
\[
\begin{aligned}
\sigma^2_{DR} ={}& E\left\{ \frac{R}{\pi(X;\psi_0)^2}\,(Y-\mu)^2 \right\}
- 2\, E\left\{ \frac{R}{\pi(X;\psi_0)}\,(Y-\mu)\, \frac{R-\pi(X;\psi_0)}{\pi(X;\psi_0)} \bigl(m(X;\alpha_0)-\mu\bigr) \right\} \\
&+ E\left[ \frac{\{R-\pi(X;\psi_0)\}^2}{\pi(X;\psi_0)^2} \bigl(m(X;\alpha_0)-\mu\bigr)^2 \right].
\end{aligned}
\]
- Taking iterated expectations (conditioning on \(X\) and using \(R \perp Y \mid X\)),
\[
\begin{aligned}
\sigma^2_{DR} &= E\left\{ \frac{(Y-\mu)^2}{\pi(X;\psi_0)} \right\}
- 2\, E\left\{ \frac{1-\pi(X;\psi_0)}{\pi(X;\psi_0)} \bigl(m(X;\alpha_0)-\mu\bigr)^2 \right\}
+ E\left\{ \frac{1-\pi(X;\psi_0)}{\pi(X;\psi_0)} \bigl(m(X;\alpha_0)-\mu\bigr)^2 \right\} \\
&= \sigma^2_{IPW} - E\left\{ \frac{1-\pi(X;\psi_0)}{\pi(X;\psi_0)} \bigl(m(X;\alpha_0)-\mu\bigr)^2 \right\} \le \sigma^2_{IPW}.
\end{aligned}
\]
- Write \(\varphi_{IPW} = \dfrac{R}{\pi(X;\psi_0)}(Y-\mu)\), \(\varphi_A = -\dfrac{R-\pi(X;\psi_0)}{\pi(X;\psi_0)} \bigl(m(X;\alpha_0)-\mu\bigr)\), and \(\varphi_{DR} = \varphi_{IPW} + \varphi_A\). Consider the Hilbert space \(L_2(P)\). Since \(\hat{\mu}_{IPW}\) and \(\hat{\mu}_{DR}\) have influence functions \(\varphi_{IPW}\) and \(\varphi_{DR}\), respectively, their squared lengths \(\|\varphi\|^2 \equiv E(\varphi^2)\) are the asymptotic variances of \(\hat{\mu}_{IPW}\) and \(\hat{\mu}_{DR}\).
- The following figure provides a geometric illustration:

[Figure: A geometric interpretation of the efficiency improvement by the DR estimator.]

Result 2 (Efficiency of DR)
\(\hat{\mu}_{DR}\) is more efficient than \(\hat{\mu}_{IPW}\) under \(\mathcal{M}_1 \cap \mathcal{M}_2\).
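Result 2 can likewise be illustrated by Monte Carlo. This hypothetical Python sketch (the data-generating design and sample sizes are assumptions) gives both estimators the true nuisance values, as in the efficiency discussion above, and compares their sampling variances over repeated draws:

```python
import numpy as np

rng = np.random.default_rng(4)
reps, n = 2000, 1000
ipw_est, dr_est = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    y = 1.0 + x + rng.normal(size=n)              # mu = 1
    pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))         # true pi(X; psi0)
    r = rng.binomial(1, pi)
    m = 1.0 + x                                   # true m(X; alpha0)
    ipw_est.append(np.sum(r * y / pi) / np.sum(r / pi))      # eq. (1)
    dr_est.append(np.mean(r * y / pi - (r - pi) / pi * m))   # eq. (3)

print(np.var(ipw_est), np.var(dr_est))            # DR variance is smaller
```

The empirical variance of the DR estimator falls below that of IPW, consistent with \(\sigma^2_{DR} = \sigma^2_{IPW} - E[\{1-\pi\}/\pi\,(m-\mu)^2]\).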
Remark 1.1
The above example suggests that:
- For a full-data problem, there is a natural extension, via the IPW (inverse probability weighting) method, to a corresponding missing-data problem;
- By positing a working model \(p(z_{mis} \mid z_{obs}; \alpha)\), the IPW estimating equation can be modified by adding a suitable augmentation term, resulting in an estimator that is still consistent even if the working model \(p(z_{mis} \mid z_{obs}; \alpha)\) is not correct;
- If \(p(z_{mis} \mid z_{obs}; \alpha)\) is correct, the new estimator is consistent even if the missingness mechanism is incorrectly modeled. In this sense, the new estimator is doubly robust;
- The doubly robust estimator has improved efficiency if both models are correct.
Semiparametric approaches to coarsened data
- First we introduce the terminology of coarsening, which contains missing data as a special case:

Definition 1.2 (Coarsening)
Suppose the full data consist of iid observations of an \(l\)-dimensional random vector \(Z\). Define a coarsening variable \(C\) such that when \(C = r\), we only observe \(G_r(Z)\), where \(G_r(\cdot)\) is a many-to-one function. Further denote \(C = 1\) if \(Z\) is completely observed (no coarsening), that is, \(G_1(Z) = Z\). Thus, the observed data consist of iid copies of \((C, G_C(Z))\).

Definition 1.3 (Coarsening at random)
The data are said to be coarsened at random (CAR) if \(C \perp Z \mid G_C(Z)\).

Remark 1.4 (Assumption)
All problems considered are under the assumption of CAR.
Terminology
- \(Z\): full data;
- \(G_C(Z)\): observed data;
- \((C, G_C(Z))\): coarsened data.

- Semiparametric models arise naturally in coarsened-data problems.
- Consider a full-data regression model, with \(z = (y, x)'\):
\[
p(z \mid \beta, \eta) = p(y \mid x; \beta)\, p(x; \eta),
\]
where \(\beta\) is the regression parameter and \(\eta\) is infinite-dimensional (e.g., an arbitrary cdf \(F\) for \(x\)).
- Now suppose some components of \(x\) are missing (at random); then the likelihood becomes
\[
q(y, x_{obs}, r \mid \beta, \eta, \psi) = p(r \mid y, x_{obs}; \psi) \int p(y \mid x; \beta)\, p(x; \eta)\, dx_{mis}.
\]
- Now the infinite-dimensional nuisance \(\eta\) cannot be ignored. Hence we have arrived at a semiparametric model.
- Let us review some basic theory of semiparametric inference. We assume, as previously, that \(\beta\) is the \(p\)-dimensional parameter of interest, and \(\eta\) is a possibly infinite-dimensional nuisance parameter.

Definition 1.5 (RAL and influence function)
The estimator \(\hat\beta_n\) is regular asymptotically linear (RAL) if
\[
\sqrt{n}\,(\hat\beta_n - \beta_0) = \mathbb{G}_n \tilde\varphi_{\beta_0, \eta_0} + o_p(1). \tag{8}
\]
The mean-zero function \(\tilde\varphi_{\beta_0, \eta_0}\) is said to be the influence function of \(\hat\beta_n\).

Remark 1.6 (RAL estimator)
If (8) holds, by the CLT we easily have
\[
\sqrt{n}\,(\hat\beta_n - \beta_0) \rightsquigarrow N\bigl(0, E(\tilde\varphi^{\otimes 2})\bigr).
\]
More Related Content

What's hot

Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...Katsuya Ito
 
On the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract meansOn the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract meansFrank Nielsen
 
Code of the multidimensional fractional pseudo-Newton method using recursive ...
Code of the multidimensional fractional pseudo-Newton method using recursive ...Code of the multidimensional fractional pseudo-Newton method using recursive ...
Code of the multidimensional fractional pseudo-Newton method using recursive ...mathsjournal
 
Solving Linear Equations Over p-Adic Integers
Solving Linear Equations Over p-Adic IntegersSolving Linear Equations Over p-Adic Integers
Solving Linear Equations Over p-Adic IntegersJoseph Molina
 
Coordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerCoordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerChristian Robert
 
Variational Inference
Variational InferenceVariational Inference
Variational InferenceTushar Tank
 
Patch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective DivergencesPatch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective DivergencesFrank Nielsen
 
Kolev skalna2018 article-exact_solutiontoa_parametricline
Kolev skalna2018 article-exact_solutiontoa_parametriclineKolev skalna2018 article-exact_solutiontoa_parametricline
Kolev skalna2018 article-exact_solutiontoa_parametriclineAlina Barbulescu
 
Adomian Decomposition Method for Certain Space-Time Fractional Partial Differ...
Adomian Decomposition Method for Certain Space-Time Fractional Partial Differ...Adomian Decomposition Method for Certain Space-Time Fractional Partial Differ...
Adomian Decomposition Method for Certain Space-Time Fractional Partial Differ...IOSR Journals
 
Classification with mixtures of curved Mahalanobis metrics
Classification with mixtures of curved Mahalanobis metricsClassification with mixtures of curved Mahalanobis metrics
Classification with mixtures of curved Mahalanobis metricsFrank Nielsen
 
Fixed point result in menger space with ea property
Fixed point result in menger space with ea propertyFixed point result in menger space with ea property
Fixed point result in menger space with ea propertyAlexander Decker
 
The dual geometry of Shannon information
The dual geometry of Shannon informationThe dual geometry of Shannon information
The dual geometry of Shannon informationFrank Nielsen
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distancesChristian Robert
 

What's hot (20)

Chemistry Assignment Help
Chemistry Assignment Help Chemistry Assignment Help
Chemistry Assignment Help
 
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Asymptotic Analysis
Asymptotic  AnalysisAsymptotic  Analysis
Asymptotic Analysis
 
On the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract meansOn the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract means
 
Code of the multidimensional fractional pseudo-Newton method using recursive ...
Code of the multidimensional fractional pseudo-Newton method using recursive ...Code of the multidimensional fractional pseudo-Newton method using recursive ...
Code of the multidimensional fractional pseudo-Newton method using recursive ...
 
Solving Linear Equations Over p-Adic Integers
Solving Linear Equations Over p-Adic IntegersSolving Linear Equations Over p-Adic Integers
Solving Linear Equations Over p-Adic Integers
 
Coordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerCoordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like sampler
 
CLIM Fall 2017 Course: Statistics for Climate Research, Geostats for Large Da...
CLIM Fall 2017 Course: Statistics for Climate Research, Geostats for Large Da...CLIM Fall 2017 Course: Statistics for Climate Research, Geostats for Large Da...
CLIM Fall 2017 Course: Statistics for Climate Research, Geostats for Large Da...
 
Computer Science Assignment Help
Computer Science Assignment Help Computer Science Assignment Help
Computer Science Assignment Help
 
Asymptotic Notation
Asymptotic NotationAsymptotic Notation
Asymptotic Notation
 
Variational Inference
Variational InferenceVariational Inference
Variational Inference
 
Project Paper
Project PaperProject Paper
Project Paper
 
Patch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective DivergencesPatch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective Divergences
 
Kolev skalna2018 article-exact_solutiontoa_parametricline
Kolev skalna2018 article-exact_solutiontoa_parametriclineKolev skalna2018 article-exact_solutiontoa_parametricline
Kolev skalna2018 article-exact_solutiontoa_parametricline
 
Adomian Decomposition Method for Certain Space-Time Fractional Partial Differ...
Adomian Decomposition Method for Certain Space-Time Fractional Partial Differ...Adomian Decomposition Method for Certain Space-Time Fractional Partial Differ...
Adomian Decomposition Method for Certain Space-Time Fractional Partial Differ...
 
Classification with mixtures of curved Mahalanobis metrics
Classification with mixtures of curved Mahalanobis metricsClassification with mixtures of curved Mahalanobis metrics
Classification with mixtures of curved Mahalanobis metrics
 
Fixed point result in menger space with ea property
Fixed point result in menger space with ea propertyFixed point result in menger space with ea property
Fixed point result in menger space with ea property
 
The dual geometry of Shannon information
The dual geometry of Shannon informationThe dual geometry of Shannon information
The dual geometry of Shannon information
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distances
 

Similar to Double Robustness: Theory and Applications with Missing Data

Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Valentin De Bortoli
 
Reading Seminar (140515) Spectral Learning of L-PCFGs
Reading Seminar (140515) Spectral Learning of L-PCFGsReading Seminar (140515) Spectral Learning of L-PCFGs
Reading Seminar (140515) Spectral Learning of L-PCFGsKeisuke OTAKI
 
Application of Cylindrical and Spherical coordinate system in double-triple i...
Application of Cylindrical and Spherical coordinate system in double-triple i...Application of Cylindrical and Spherical coordinate system in double-triple i...
Application of Cylindrical and Spherical coordinate system in double-triple i...Sonendra Kumar Gupta
 
SOLVING BVPs OF SINGULARLY PERTURBED DISCRETE SYSTEMS
SOLVING BVPs OF SINGULARLY PERTURBED DISCRETE SYSTEMSSOLVING BVPs OF SINGULARLY PERTURBED DISCRETE SYSTEMS
SOLVING BVPs OF SINGULARLY PERTURBED DISCRETE SYSTEMSTahia ZERIZER
 
An Approach For Solving Nonlinear Programming Problems
An Approach For Solving Nonlinear Programming ProblemsAn Approach For Solving Nonlinear Programming Problems
An Approach For Solving Nonlinear Programming ProblemsMary Montoya
 
Hormann.2001.TPI.pdf
Hormann.2001.TPI.pdfHormann.2001.TPI.pdf
Hormann.2001.TPI.pdfssuserbe139c
 
Exact Sum Rules for Vector Channel at Finite Temperature and its Applications...
Exact Sum Rules for Vector Channel at Finite Temperature and its Applications...Exact Sum Rules for Vector Channel at Finite Temperature and its Applications...
Exact Sum Rules for Vector Channel at Finite Temperature and its Applications...Daisuke Satow
 
Fixed point result in probabilistic metric space
Fixed point result in probabilistic metric spaceFixed point result in probabilistic metric space
Fixed point result in probabilistic metric spaceAlexander Decker
 
NONLINEAR DIFFERENCE EQUATIONS WITH SMALL PARAMETERS OF MULTIPLE SCALES
NONLINEAR DIFFERENCE EQUATIONS WITH SMALL PARAMETERS OF MULTIPLE SCALESNONLINEAR DIFFERENCE EQUATIONS WITH SMALL PARAMETERS OF MULTIPLE SCALES
NONLINEAR DIFFERENCE EQUATIONS WITH SMALL PARAMETERS OF MULTIPLE SCALESTahia ZERIZER
 
On a Deterministic Property of the Category of k-almost Primes: A Determinist...
On a Deterministic Property of the Category of k-almost Primes: A Determinist...On a Deterministic Property of the Category of k-almost Primes: A Determinist...
On a Deterministic Property of the Category of k-almost Primes: A Determinist...Ramin (A.) Zahedi
 
Common fixed point theorems of integral type in menger pm spaces
Common fixed point theorems of integral type in menger pm spacesCommon fixed point theorems of integral type in menger pm spaces
Common fixed point theorems of integral type in menger pm spacesAlexander Decker
 
Journey to structure from motion
Journey to structure from motionJourney to structure from motion
Journey to structure from motionJa-Keoung Koo
 

Similar to Double Robustness: Theory and Applications with Missing Data (20)

MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...
MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...
MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...
 
Ichimura 1993: Semiparametric Least Squares (non-technical)
Ichimura 1993: Semiparametric Least Squares (non-technical)Ichimura 1993: Semiparametric Least Squares (non-technical)
Ichimura 1993: Semiparametric Least Squares (non-technical)
 
Zeta Zero-Counting Function
Zeta Zero-Counting FunctionZeta Zero-Counting Function
Zeta Zero-Counting Function
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
 
Reading Seminar (140515) Spectral Learning of L-PCFGs
Reading Seminar (140515) Spectral Learning of L-PCFGsReading Seminar (140515) Spectral Learning of L-PCFGs
Reading Seminar (140515) Spectral Learning of L-PCFGs
 
Application of Cylindrical and Spherical coordinate system in double-triple i...
Application of Cylindrical and Spherical coordinate system in double-triple i...Application of Cylindrical and Spherical coordinate system in double-triple i...
Application of Cylindrical and Spherical coordinate system in double-triple i...
 
SOLVING BVPs OF SINGULARLY PERTURBED DISCRETE SYSTEMS
SOLVING BVPs OF SINGULARLY PERTURBED DISCRETE SYSTEMSSOLVING BVPs OF SINGULARLY PERTURBED DISCRETE SYSTEMS
SOLVING BVPs OF SINGULARLY PERTURBED DISCRETE SYSTEMS
 
stochastic processes assignment help
stochastic processes assignment helpstochastic processes assignment help
stochastic processes assignment help
 
An Approach For Solving Nonlinear Programming Problems
An Approach For Solving Nonlinear Programming ProblemsAn Approach For Solving Nonlinear Programming Problems
An Approach For Solving Nonlinear Programming Problems
 
lecture6.ppt
lecture6.pptlecture6.ppt
lecture6.ppt
 
Hormann.2001.TPI.pdf
Hormann.2001.TPI.pdfHormann.2001.TPI.pdf
Hormann.2001.TPI.pdf
 
Lecture5
Lecture5Lecture5
Lecture5
 
Exact Sum Rules for Vector Channel at Finite Temperature and its Applications...
Exact Sum Rules for Vector Channel at Finite Temperature and its Applications...Exact Sum Rules for Vector Channel at Finite Temperature and its Applications...
Exact Sum Rules for Vector Channel at Finite Temperature and its Applications...
 
Fixed point result in probabilistic metric space
Fixed point result in probabilistic metric spaceFixed point result in probabilistic metric space
Fixed point result in probabilistic metric space
 
NONLINEAR DIFFERENCE EQUATIONS WITH SMALL PARAMETERS OF MULTIPLE SCALES
NONLINEAR DIFFERENCE EQUATIONS WITH SMALL PARAMETERS OF MULTIPLE SCALESNONLINEAR DIFFERENCE EQUATIONS WITH SMALL PARAMETERS OF MULTIPLE SCALES
NONLINEAR DIFFERENCE EQUATIONS WITH SMALL PARAMETERS OF MULTIPLE SCALES
 
Metodo gauss_newton.pdf
Metodo gauss_newton.pdfMetodo gauss_newton.pdf
Metodo gauss_newton.pdf
 
On a Deterministic Property of the Category of k-almost Primes: A Determinist...
On a Deterministic Property of the Category of k-almost Primes: A Determinist...On a Deterministic Property of the Category of k-almost Primes: A Determinist...
On a Deterministic Property of the Category of k-almost Primes: A Determinist...
 
Common fixed point theorems of integral type in menger pm spaces
Common fixed point theorems of integral type in menger pm spacesCommon fixed point theorems of integral type in menger pm spaces
Common fixed point theorems of integral type in menger pm spaces
 
Journey to structure from motion
Journey to structure from motionJourney to structure from motion
Journey to structure from motion
 
PCA on graph/network
PCA on graph/networkPCA on graph/network
PCA on graph/network
 

Recently uploaded

Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsSérgio Sacani
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsOrtegaSyrineMay
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑Damini Dixit
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptxryanrooker
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
chemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdfchemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdfTukamushabaBismark
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Monika Rani
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .Poonam Aher Patil
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learninglevieagacer
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Serviceshivanisharma5244
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusNazaninKarimi6
 

Recently uploaded (20)

Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
chemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdfchemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdf
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 

Double Robustness: Theory and Applications with Missing Data

  • 1. Double Robustness: Theory and Applications with Missing Data Lu Mao Department of Biostatistics The University of North Carolina at Chapel Hill Email: lmao@unc.edu April 17, 2013 1/49
  • 2. Table of Contents Part I: A Semiparametric Perspective A motivating example Semiparametric approachs to coarsened data Constructing the estimating equation Part II: Applications in Missing Data Problems Data with two levels of missingness Monotone coarsened data 2/49
  • 3. Part I: A Semiparametric Perspective 3/49
  • 4. A Motivating Example I Given an iid sample Y1; ; Yn from an arbitrary distribution, consider the estimation of the population mean = EY by Y , which solves Pn(Y ) = 0; where PnZ 1 n Pn i=1 Zi. I Suppose some of the Yi's are missing. Let Ri = 1 if Yi is observed and = 0 if otherwise. Let (Y ) = P(R = 1jY ). Now consider estimating by solving PnR(Y ) = 0 resulting in ^CC = P PRiYi Ri !p E[(Y )Y ] E(Y ) 6= : 4/49
  • 5. I Suppose in addition to Yi, an auxilary variable Xi is also collected, and R ? Y jX. Assume P(R = 1jY;X) = (X; ). To correct the bias, we apply the estimating equation ( ^n is a consistent estimator for 0) IPW n = Pn R (X; ^n) (Y ) resulting in ^IPW = Pn[RY=(X; ^n)] Pn[R=(X; ^n)] = Pn[RY=(X; 0)] Pn[R=(X; 0)] + op(1) !p E[RY=(X; 0)] E[R=(X; 0)] = (1) 5/49
  • 6. I Assume (X) = E(Y jX; ), and consider a new estimating equation as a modi
  • 7. cation of IPW n : DR n = Pn R (X; ^n) (Y ) R (X; ^n) (X; ^n) ! ; ((X; ^n) ) (2) resulting in ^DR = Pn R (X; ^n) Y R (X; ^n) (X; ^n) ! : (3) (X; ^n) Now let's study the consistency of ^DR under dierent assumptions. 6/49
  • 8. I Scenario 1. (X; ) correct; (X; ) incorrect. So, ^n !p 0, but ^n ! , with (X; )6= E(Y jX): ^DR = Pn R (X; 0) Y R (X; 0) (X; 0) (X; ) + op(1) !p E R (X; 0) Y E R (X; 0) (X; 0) (X; ) = E Y E R (X; 0)
  • 9.
  • 10.
  • 11.
  • 12. Y;X E (X; )E R (X; 0) (X; 0)
  • 13.
  • 14.
  • 15.
  • 16. Y;X = 0 = (4) 7/49
  • 17. I Scenario 2.(X; ) correct; (X; ) incorrect; So, ^n !p 0, but ^n ! , with (X; )6= E(RjY;X): ^DR = Pn R (X; ) Y R (X; ) (X; ) (X; 0) + op(1) !p E R (X; ) Y E R (X; ) (X; ) (X; 0) = E R (X; ) E(Y jR;X) E R (X; ) (X; ) (X; 0) = E R (X; ) E (X; 0) R (X; ) (X; ) (X; 0) = E[E((X; 0))] = : (5) 8/49
  • 18. Result 1 (Double robustness) $\hat\mu_{DR}$ is consistent if either the $\pi$ model or the $m$ model is correct, that is, under $\mathcal{M}_1 \cup \mathcal{M}_2$, where $\mathcal{M}_1 = \{p(r \mid y, x; \psi) : \psi \in \Psi\}$ and $\mathcal{M}_2 = \{p(y \mid x; \xi) : \xi \in \Xi\}$. In other words, $\hat\mu_{DR}$ is doubly robust. ▶ Now let's consider a somewhat different question: efficiency under $\mathcal{M}_1 \cap \mathcal{M}_2$. For simplicity, we assume we know the true values $(\psi_0, \xi_0)$. ▶ Denote $\mathbb{G}_n g(Z) = \frac{1}{\sqrt n}\sum_{i=1}^n [g(Z_i) - E g(Z)]$. Algebraic manipulations yield $\sqrt n(\hat\mu_{IPW} - \mu) = \left\{\mathbb{P}_n\left[\frac{R}{\pi(X; \psi_0)}\right]\right\}^{-1} \mathbb{G}_n\left\{\frac{R}{\pi(X; \psi_0)}(Y - \mu)\right\} \rightsquigarrow N(0, \sigma^2_{IPW})$. (6) 9/49
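Result 1 can be checked by simulation: the sketch below evaluates the plug-in version of (3) with one or both of $\pi$ and $m$ deliberately misspecified. The data-generating choices are hypothetical, chosen only to make the bias of the "both wrong" case visible.

```python
import numpy as np

def expit(t):
    return 1.0 / (1.0 + np.exp(-t))

def mu_dr(y, r, pi, m):
    # sample analogue of the AIPW estimator (3)
    return np.mean(r * y / pi - (r - pi) / pi * m)

rng = np.random.default_rng(2)
n = 200_000
x = rng.normal(size=n)
y = 1.0 + x + rng.normal(size=n)     # mu = 1
pi0 = expit(0.5 + x)                 # true missingness model
r = rng.binomial(1, pi0)
m0 = 1.0 + x                         # true outcome regression

pi_bad = np.full(n, r.mean())        # misspecified pi: ignores X
m_bad = np.zeros(n)                  # misspecified m

est_pi_ok = mu_dr(y, r, pi0, m_bad)      # Scenario 1: pi right, m wrong
est_m_ok = mu_dr(y, r, pi_bad, m0)       # Scenario 2: m right, pi wrong
est_both_bad = mu_dr(y, r, pi_bad, m_bad)
print(est_pi_ok, est_m_ok, est_both_bad)
```

The first two estimates sit near $\mu = 1$, as Result 1 predicts; only the doubly misspecified version reproduces the complete-case bias.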
  • 19. ▶ where $\sigma^2_{IPW} = E\left\{\frac{R}{\pi(X; \psi_0)}(Y - \mu)\right\}^2 = E\left\{\frac{(Y - \mu)^2}{\pi(X; \psi_0)}\right\}$. ▶ Similarly, $\sqrt n(\hat\mu_{DR} - \mu) = \mathbb{G}_n\left\{\frac{R}{\pi(X; \psi_0)}(Y - \mu) - \frac{R - \pi(X; \psi_0)}{\pi(X; \psi_0)}\big(m(X; \xi_0) - \mu\big)\right\} \rightsquigarrow N(0, \sigma^2_{DR})$, (7) where $\sigma^2_{DR} = E\left\{\frac{R}{\pi(X; \psi_0)}(Y - \mu)\right\}^2 - 2E\left\{\frac{R}{\pi(X; \psi_0)}(Y - \mu) \cdot \frac{R - \pi(X; \psi_0)}{\pi(X; \psi_0)}\big(m(X; \xi_0) - \mu\big)\right\} + E\left\{\frac{R - \pi(X; \psi_0)}{\pi(X; \psi_0)}\big(m(X; \xi_0) - \mu\big)\right\}^2$. 10/49
  • 20. ▶ $\sigma^2_{DR} = E\left\{\frac{(Y - \mu)^2}{\pi(X; \psi_0)}\right\} - 2E\left\{\frac{1 - \pi(X; \psi_0)}{\pi(X; \psi_0)}\big(m(X; \xi_0) - \mu\big)^2\right\} + E\left\{\frac{1 - \pi(X; \psi_0)}{\pi(X; \psi_0)}\big(m(X; \xi_0) - \mu\big)^2\right\} = \sigma^2_{IPW} - E\left\{\frac{1 - \pi(X; \psi_0)}{\pi(X; \psi_0)}\big(m(X; \xi_0) - \mu\big)^2\right\}$. ▶ Denote $\varphi_{IPW} = \frac{R}{\pi(X; \psi_0)}(Y - \mu)$, $A = -\frac{R - \pi(X; \psi_0)}{\pi(X; \psi_0)}\big(m(X; \xi_0) - \mu\big)$, and $\varphi_{DR} = \varphi_{IPW} + A$. Consider the Hilbert space $L_2(P)$. Since $\hat\mu_{IPW}$ and $\hat\mu_{DR}$ have influence functions $\varphi_{IPW}$ and $\varphi_{DR}$ respectively, their squared lengths ($\|\varphi\|^2 \equiv E(\varphi)^2$) are the asymptotic variances of $\hat\mu_{IPW}$ and $\hat\mu_{DR}$. 11/49
  • 22. The following figure provides a geometric illustration: Figure: A geometric interpretation of the efficiency improvement by the DR estimator. Result 2 (Efficiency of DR) $\hat\mu_{DR}$ is more efficient than $\hat\mu_{IPW}$ under $\mathcal{M}_1 \cap \mathcal{M}_2$. 12/49
  • 23. Remark 1.1 The above example suggests that ▶ for a full data problem, there is a natural extension, via the IPW (inverse probability weighting) method, to a corresponding missing data problem; ▶ by positing a working model $p(z_{mis} \mid z_{obs}; \xi)$, the IPW estimating equation can be modified by adding a suitable augmentation term, resulting in an estimator that is still consistent even if the working model $p(z_{mis} \mid z_{obs}; \xi)$ is not correct; ▶ if $p(z_{mis} \mid z_{obs}; \xi)$ is correct, the new estimator is consistent even if the missingness mechanism is incorrectly modeled — in this sense, the new estimator is doubly robust; ▶ the doubly robust estimator has improved efficiency if both models are correct. 13/49
  • 25. Semiparametric approaches to coarsened data ▶ First we introduce the terminology of coarsening, which contains missing data as a special case. Definition 1.2 (Coarsening) Suppose the full data consist of iid observations of an $l$-dimensional random vector $Z$. Define a coarsening variable $C$ such that when $C = r$, we only observe $G_r(Z)$, where $G_r(\cdot)$ is a many-to-one function. Further denote $C = \infty$ if $Z$ is completely observed (no coarsening), that is, $G_\infty(Z) = Z$. Thus, the observed data consist of iid copies of $(C, G_C(Z))$. Definition 1.3 (Coarsening at random) The data are said to be coarsened at random (CAR) if $C \perp Z \mid G_C(Z)$. Remark 1.4 (Assumption) All problems considered are under the assumption of CAR. 14/49
  • 29. Terminology ▶ $Z$: full data; ▶ $G_C(Z)$: observed data; ▶ $(C, G_C(Z))$: coarsened data. ▶ Semiparametric models arise naturally in coarsened data problems. ▶ Consider a full data regression model, $z = (y, x)'$: $p(z \mid \beta, \eta) = p(y \mid x; \beta)\,\eta(x)$, where $\beta$ is the regression parameter and $\eta$ is infinite dimensional (e.g., an arbitrary cdf $F$ for $x$). ▶ Now suppose some components of $x$ are missing (at random); then the likelihood becomes $q(y, x_{obs}, r \mid \beta, \eta, \psi) = p(r \mid y, x_{obs}; \psi) \int p(y \mid x; \beta)\, d\eta(x_{mis})$. 15/49
  • 36. ▶ Now the infinite dimensional nuisance $\eta$ cannot be ignored. Hence we have arrived at a semiparametric model. ▶ Let's review some basic theory about semiparametric inference. We assume as previously that $\beta$ is $p$-dimensional, the parameter of interest, and $\eta$ is a possibly infinite dimensional nuisance parameter. Definition 1.5 (RAL and influence function) The estimator $\hat\beta_n$ is regular asymptotically linear (RAL) if $\sqrt n(\hat\beta_n - \beta_0) = \mathbb{G}_n \tilde\varphi_{\beta_0, \eta_0} + o_p(1)$. (8) The mean-zero function $\tilde\varphi_{\beta_0, \eta_0}$ is said to be the influence function of $\hat\beta_n$. Remark 1.6 (RAL estimator) If (8) holds, by the CLT we easily have $\sqrt n(\hat\beta_n - \beta_0) \rightsquigarrow N(0, E\tilde\varphi^2)$. 16/49
  • 49. Definition 1.7 (Tangent spaces) Let $\mathcal{H}$ denote the Hilbert space of all mean-zero functions in $L_2(P)$. The tangent space $\Lambda_\beta$ for $\beta$ is defined as the linear span, in $\mathcal{H}$, of the score function $S_\beta = \frac{\partial}{\partial\beta}\log p(z \mid \beta, \eta_0)\big|_{\beta = \beta_0}$. Similarly, the nuisance tangent space $\Lambda_\eta$ is defined as the linear span of the union of the score functions $S_\eta = \frac{\partial}{\partial\gamma}\log p(z \mid \beta_0, \eta_\gamma)$ over all one-dimensional parametric submodels $\{\eta_\gamma\}$. ▶ The following important theorem provides a characterization of all influence functions of semiparametric RAL estimators $\hat\beta_n$ in terms of $\Lambda_\eta^\perp$. 17/49
  • 64. Theorem 1.8 (The space of influence functions for $\beta$) The space of influence functions of RAL estimators for $\beta$ consists of all $\tilde\varphi$ satisfying ▶ $\tilde\varphi$ is orthogonal to $\Lambda_\eta$, i.e. $\tilde\varphi \in \Lambda_\eta^\perp$; ▶ $E[\tilde\varphi S_\beta^T] = I_{p \times p}$. Remark 1.9 (Z-estimation) Consider estimating $\beta$ from the estimating equation $\mathbb{P}_n \varphi_\beta = 0$. Then by standard Z-estimation theory, $\sqrt n(\hat\beta_n - \beta) = -\{E\dot\varphi_{\beta_0}\}^{-1}\mathbb{G}_n \varphi_{\beta_0} + o_p(1)$. 18/49
  • 75. Remark 1.10 (Z-estimation with estimated nuisance) In the presence of a nuisance parameter $\eta$, the estimating equation generally involves $\eta$. A natural strategy is to insert a consistent estimator $\hat\eta_n$: $\mathbb{P}_n \varphi_{\beta, \hat\eta_n} = 0$, where $E\varphi_{\beta_0, \eta_0} = 0$. Now, $\sqrt n(\hat\beta_n - \beta) = -\{E\dot\varphi_{\beta_0}\}^{-1}\left\{\mathbb{G}_n \varphi_{\beta_0, \hat\eta_n} + \sqrt n\, E\varphi_{\beta_0, \hat\eta_n}\right\} + o_p(1) = -\{E\dot\varphi_{\beta_0}\}^{-1}\left\{\mathbb{G}_n \varphi_{\beta_0, \eta_0} + E[\varphi_{\beta_0, \eta_0} S_\eta^T]\,\sqrt n(\hat\eta_n - \eta_0)\right\} + o_p(1)$. If $\varphi$ is constructed such that $\varphi_{\beta_0, \eta_0} \in \Lambda_\eta^\perp$, we have $E[\varphi_{\beta_0, \eta_0} S_\eta^T] = 0$, and so $\sqrt n(\hat\beta_n - \beta) = -\{E\dot\varphi_{\beta_0}\}^{-1}\mathbb{G}_n \varphi_{\beta_0, \eta_0} + o_p(1)$, which is equivalent to the estimator solving $\mathbb{P}_n \varphi_{\beta, \eta_0} = 0$. 19/49
  • 96. In the following development of methods for coarsened data, we start with the assumption that the full data problem $p(z \mid \beta, \eta)$ is well studied. This includes that ▶ the full data tangent spaces $\Lambda_\beta^F$ and $\Lambda_\eta^F$ are completely characterized; ▶ we have a full data estimating function $\varphi^F(Z) \in \Lambda_\eta^{F\perp}$. The likelihood for the coarsened data, consisting of $(C_i, G_{C_i}(Z_i))$, is $q(r, g_r \mid \beta, \eta, \psi) = \pi(r, g_r; \psi)\int_{z: G_r(z) = g_r} p(z \mid \beta, \eta)\, d\nu(z)$. (9) Now the nuisance parameter consists of $(\eta, \psi)$. 20/49
  • 102. ▶ We start by investigating the relationships between the coarsened data tangent spaces and their full data counterparts. ▶ Consider $S_\beta(r, g_r) = \frac{\partial}{\partial\beta}\log q(r, g_r \mid \beta, \eta, \psi) = \frac{\partial}{\partial\beta}\log \int_{z: G_r(z) = g_r} p(z \mid \beta, \eta)\, d\nu(z) = \frac{\int_{z: G_r(z) = g_r} \{\partial p(z \mid \beta, \eta)/\partial\beta\}\, d\nu(z)}{\int_{z: G_r(z) = g_r} p(z \mid \beta, \eta)\, d\nu(z)} = \frac{\int_{z: G_r(z) = g_r} \{\partial \log p(z \mid \beta, \eta)/\partial\beta\}\, p(z \mid \beta, \eta)\, d\nu(z)}{\int_{z: G_r(z) = g_r} p(z \mid \beta, \eta)\, d\nu(z)} = E\{S_\beta^F(Z) \mid G_r(Z) = g_r\} = E\{S_\beta^F(Z) \mid C = r, G_r(Z) = g_r\}$, the last equality holding by CAR. ▶ Similarly, we have the following theorem about $\Lambda_\eta$: 21/49
  • 117. Theorem 1.11 (Characterization of $\Lambda_\eta$) The coarsened data tangent space for $\eta$ is characterized by $\Lambda_\eta = \{E[\alpha^F(Z) \mid C, G_C(Z)] : \alpha^F \in \Lambda_\eta^F\}$. (10) ▶ Remember that the important task is to characterize $\Lambda_\eta^\perp$, which will aid us in constructing coarsened data estimating equations for $\beta$. Theorem 1.12 (Characterization of $\Lambda_\eta^\perp$) The space $\Lambda_\eta^\perp$ consists of all elements $h(C, G_C(Z)) \in \mathcal{H}$ such that $E[h(C, G_C(Z)) \mid Z] \in \Lambda_\eta^{F\perp}$. (11) 22/49
  • 119. Proof. By Theorem 1.11, the space $\Lambda_\eta^\perp$ consists of all elements $h(C, G_C(Z)) \in \mathcal{H}$ such that $E\{h(C, G_C(Z))\, E[\alpha^F(Z) \mid C, G_C(Z)]\} = 0$ for all $\alpha^F(Z) \in \Lambda_\eta^F$. This is equivalent to $E\{h(C, G_C(Z))\, \alpha^F(Z)\} = 0$, which is equivalent to $E\{\alpha^F(Z)\, E[h(C, G_C(Z)) \mid Z]\} = 0$. Remark 1.13 (A linear operator perspective) Define the linear operator $\mathcal{K}: \mathcal{H} \to \mathcal{H}^F$ by $\mathcal{K}(\cdot) = E[\cdot \mid Z]$. Then $\Lambda_\eta^\perp = \mathcal{K}^{-1}(\Lambda_\eta^{F\perp})$. (12) Given $\varphi^F(Z) \in \Lambda_\eta^{F\perp}$, the inverse image $\mathcal{K}^{-1}(\varphi^F(Z))$ will provide us a usable collection of estimating functions. 23/49
  • 123. Constructing the estimating equation Theorem 1.14 (The space $\mathcal{K}^{-1}(\varphi^F(Z))$) If $\phi(C, G_C(Z)) \in \mathcal{H}$ is such that $E[\phi(C, G_C(Z)) \mid Z] = \varphi^F(Z)$, then $\mathcal{K}^{-1}(\varphi^F(Z)) = \phi(C, G_C(Z)) + \mathcal{K}^{-1}(0)$. Definition 1.15 (Augmentation space) We denote $\mathcal{A} = \mathcal{K}^{-1}(0)$, and call it the augmentation space. Corollary 1.16 Assume $\pi(\infty, Z; \psi_0) = P(C = \infty \mid Z; \psi_0) > 0$ a.s. Then $\mathcal{K}^{-1}(\varphi^F(Z)) = \left\{\frac{I(C = \infty)\,\varphi^F(Z)}{\pi(\infty, Z; \psi_0)} + h(C, G_C(Z)) : h \in \mathcal{A}\right\}$. (13) 24/49
  • 131. Suppose $\hat\psi_n$ is an efficient estimator of $\psi_0$. Take $h \equiv 0$, and we obtain the inverse probability weighted (IPW) estimating equation $\Psi_n^{IPW} = \mathbb{P}_n\left\{\frac{I(C = \infty)\,\varphi^F_\beta(Z)}{\pi(\infty, Z; \hat\psi_n)}\right\}$. In practice, the choice of $h \in \mathcal{A}$ will be based on efficiency considerations. We have the following theorem regarding the influence function resulting from the estimating function $\mathbb{P}_n \varphi^h_{\beta, \hat\psi_n}$. Theorem 1.17 The influence function of $\hat\beta^h_n$ solving $\mathbb{P}_n \varphi^h_{\beta, \hat\psi_n} = 0$ is $\tilde\varphi^h = -(E\dot\varphi^h_{\beta_0})^{-1}\left\{\frac{I(C = \infty)\,\varphi^F_{\beta_0}(Z)}{\pi(\infty, Z; \psi_0)} + h(C, G_C(Z)) - \Pi[\,\cdot \mid \Lambda_\psi]\right\}$, (14) where $\Pi[\,\cdot \mid \Lambda_\psi]$ is the projection of the preceding terms onto the tangent space $\Lambda_\psi$ of the coarsening model. Remark 1.18 ($\Lambda_\psi \subset \mathcal{A}$) By calculus we easily obtain $E[S_\psi \mid Z] = 0$, and therefore $\Lambda_\psi \subset \mathcal{A}$. 25/49
  • 138. From a geometric point of view, we easily get the following result: Theorem 1.19 (Efficiency among $\tilde\varphi^h$) $\arg\min_{h \in \mathcal{A}} \|\tilde\varphi^h\|^2$ is attained at $h = -\Pi\left[\frac{I(C = \infty)\,\varphi^F(Z)}{\pi(\infty, Z; \psi_0)} \,\Big|\, \mathcal{A}\right]$, resulting in the estimating function $\varphi^{DR}_{\beta, \psi_0} = \frac{I(C = \infty)\,\varphi^F(Z)}{\pi(\infty, Z; \psi_0)} - \Pi\left[\frac{I(C = \infty)\,\varphi^F(Z)}{\pi(\infty, Z; \psi_0)} \,\Big|\, \mathcal{A}\right]$. (15) ▶ Typically, calculating the projection $\Pi[\,\cdot \mid \mathcal{A}]$ requires us to posit a working parametric model $p(z \mid \lambda)$. But the DR estimating equation will still be valid even if $p(z \mid \lambda)$ does not contain the truth. 26/49
  • 143. We conclude this section with a theorem characterizing the augmentation space $\mathcal{A}$. Theorem 1.20 (Characterization of $\mathcal{A}$) The space $\mathcal{A}$ consists of all elements that can be written as $\sum_{r \ne \infty}\left\{\frac{I(C = \infty)}{\pi(\infty, Z)}\,\pi(r, G_r(Z)) - I(C = r)\right\} h_r(G_r(Z))$, (16) where $h_r(G_r(Z))$ is an arbitrary function of $G_r(Z)$. Proof. See Theorem 7.2 of Tsiatis (2006). 27/49
  • 144. Part II: Applications in Missing Data Problems 28/49
  • 145. Data with two levels of missingness Suppose $Z = (Z_1, Z_2)$ and $Z_2$ is missing on some observations. Denote $R = 1$ if $Z_2$ is observed and $= 0$ otherwise. Let $\pi(Z_1; \psi_0) = P(R = 1 \mid Z; \psi_0)$. The following theorem states explicitly how to calculate $\Pi\left[\frac{R\,\varphi^F(Z)}{\pi(Z_1; \psi_0)} \,\big|\, \mathcal{A}\right]$. Theorem 2.1 $\Pi\left[\frac{R\,\varphi^F(Z)}{\pi(Z_1; \psi_0)} \,\Big|\, \mathcal{A}\right] = \frac{R - \pi(Z_1; \psi_0)}{\pi(Z_1; \psi_0)}\, E[\varphi^F(Z) \mid Z_1]$. A sketch of proof. We first use Theorem 1.20 to find that a typical element in $\mathcal{A}$ is $\left[\frac{R - \pi(Z_1; \psi_0)}{\pi(Z_1; \psi_0)}\right] h(Z_1)$. Then we find that the unique function $h_0(Z_1)$ such that $\left\{\frac{R\,\varphi^F(Z)}{\pi(Z_1; \psi_0)} - \left[\frac{R - \pi(Z_1; \psi_0)}{\pi(Z_1; \psi_0)}\right] h_0(Z_1)\right\} \perp \left[\frac{R - \pi(Z_1; \psi_0)}{\pi(Z_1; \psi_0)}\right] h(Z_1)$ for all $h(Z_1)$ is $h_0(Z_1) = E[\varphi^F(Z) \mid Z_1]$. 29/49
  • 154. Remark 2.2 (DR estimating equation) From Theorem 2.1 we have that $\varphi^{DR}_{\beta, \psi_0} = \frac{R\,\varphi^F_\beta(Z)}{\pi(Z_1; \psi_0)} - \frac{R - \pi(Z_1; \psi_0)}{\pi(Z_1; \psi_0)}\, E[\varphi^F_\beta(Z) \mid Z_1]$. (17) To compute $E[\varphi^F_\beta(Z) \mid Z_1]$, we need to posit a parametric model $p(z \mid \lambda)$, or at least $p(z_2 \mid z_1; \lambda)$, and find a consistent estimator $\hat\lambda_n$ for $\lambda_0$. Then the projection can be computed as $E[\varphi^F_\beta(Z) \mid Z_1; \hat\lambda_n]$. We should note that the parametric model needs to be consistent with the original semiparametric model. Similar to the motivating example, we can show that the resulting estimating equation is doubly robust to $p(r \mid z; \psi)$ and $p(z \mid \lambda)$: $\varphi^{DR}_{\beta, \hat\psi_n, \hat\lambda_n} = \frac{R\,\varphi^F_\beta(Z)}{\pi(Z_1; \hat\psi_n)} - \frac{R - \pi(Z_1; \hat\psi_n)}{\pi(Z_1; \hat\psi_n)}\, E[\varphi^F_\beta(Z) \mid Z_1; \hat\lambda_n]$. 30/49
  • 164. From the theoretical development in Part I, we know that the estimating equation $\mathbb{P}_n \varphi^{DR}_{\beta, \hat\psi_n, \lambda_0}$ is the most efficient among the augmented IPW equations if the working model $p(z \mid \lambda)$ is true. It can be shown that $\varphi^{DR}_{\beta, \hat\psi_n, \hat\lambda_n}$ and $\varphi^{DR}_{\beta, \hat\psi_n, \lambda_0}$ give asymptotically equivalent estimators under the working model. If we are to conduct robust inference based on $\varphi^{DR}_{\beta, \hat\psi_n, \hat\lambda_n}$, we need to derive the variance without relying on the correctness of the working model $p(z \mid \lambda)$. Let $h(Z_1; \beta, \lambda) = E[\varphi^F_\beta(Z) \mid Z_1; \lambda]$ and assume $\hat\lambda_n \to \lambda^*$. 31/49
  • 173. Now, $\sqrt n(\hat\beta_n - \beta_0) = -\left\{E\frac{\partial}{\partial\beta^T}\varphi^{DR}_{\beta_0, \psi_0, \lambda^*}\right\}^{-1}\left\{\mathbb{G}_n \varphi^{DR}_{\beta_0, \psi_0, \lambda^*} + \sqrt n\,[E\varphi^{DR}_{\beta_0, \hat\psi_n, \lambda^*} - E\varphi^{DR}_{\beta_0, \psi_0, \lambda^*}] + \sqrt n\,[E\varphi^{DR}_{\beta_0, \psi_0, \hat\lambda_n} - E\varphi^{DR}_{\beta_0, \psi_0, \lambda^*}]\right\} + o_p(1) = -\{E\dot\varphi_{\beta_0}\}^{-1}\left\{\mathbb{G}_n \varphi^{DR}_{\beta_0, \psi_0, \lambda^*} - E[\varphi^{DR}_{\beta_0, \psi_0, \lambda^*} S_\psi^T]\,\sqrt n(\hat\psi_n - \psi_0)\right\} + o_p(1) = -\{E\dot\varphi_{\beta_0}\}^{-1}\left\{\mathbb{G}_n \varphi^{DR}_{\beta_0, \psi_0, \lambda^*} - E[\varphi^{DR}_{\beta_0, \psi_0, \lambda^*} S_\psi^T][E S_\psi S_\psi^T]^{-1}\,\mathbb{G}_n S_\psi\right\} + o_p(1)$, where the $\hat\lambda_n$ term vanishes because the augmentation has mean zero when $\pi$ is correctly specified. 32/49
  • 189. Denote $\varphi_2 = \varphi^{DR} - E(\varphi^{DR} S_\psi^T)(E S_\psi S_\psi^T)^{-1} S_\psi$. Then we have $\sqrt n(\hat\beta_n - \beta_0) = -\{E\dot\varphi_{\beta_0}\}^{-1}\mathbb{G}_n \varphi_{2, \beta_0, \psi_0, \lambda^*} + o_p(1) \rightsquigarrow N(0, \Sigma)$, (18) where $\Sigma$ can be consistently estimated by $\left\{\mathbb{P}_n \frac{R\,\dot\varphi^F_{\hat\beta_n}}{\pi(Z_1; \hat\psi_n)}\right\}^{-1}\mathbb{P}_n\left[\varphi_{2, \hat\beta_n, \hat\psi_n, \hat\lambda_n}\right]^{\otimes 2}\left\{\mathbb{P}_n \frac{R\,\dot\varphi^F_{\hat\beta_n}}{\pi(Z_1; \hat\psi_n)}\right\}^{-1}$. (19) 33/49
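For the population-mean example of Part I the sandwich (19) is especially simple: $\dot\varphi^F = -1$, so the bread is $-\mathbb{P}_n[R/\pi]$ and the variance estimate collapses to a scaled second moment of the influence-function values. The sketch below (simulated data, $\pi$ known, all settings hypothetical) illustrates this special case only.

```python
import numpy as np

def expit(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(4)
n = 50_000
x = rng.normal(size=n)
y = 1.0 + x + rng.normal(size=n)   # true mean mu = 1
pi = expit(0.5 + x)                # pi(X), treated as known
r = rng.binomial(1, pi)
m = 1.0 + x                        # working outcome regression

mu_hat = np.mean(r * y / pi - (r - pi) / pi * m)
# influence-function values at the estimate; for the mean, phi-dot = -1,
# so the bread reduces to P_n[R/pi] up to sign (which cancels in the sandwich)
phi = r * (y - mu_hat) / pi - (r - pi) / pi * (m - mu_hat)
bread = np.mean(r / pi)
se = np.sqrt(np.mean(phi ** 2) / bread ** 2 / n)
print(mu_hat, se)
```

The resulting standard error matches the asymptotic $\sigma_{DR}/\sqrt n$ of (7) without assuming the working model is right.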
  • 197. Example: Logistic regression with missing covariate Consider a logistic regression $P(Y = 1 \mid X; \beta) = \frac{e^{\beta_0 + \beta_1^T X_1 + \beta_2 X_2}}{1 + e^{\beta_0 + \beta_1^T X_1 + \beta_2 X_2}}$, where $X_2$ is a real-valued continuous covariate and is missing on some subjects; $(Y, X_1^T)^T$ is always observed. ▶ The full data model is $p(y \mid x; \beta)\,\eta(x)$. Let $X = (1, X_1^T, X_2)^T$. The full data estimating equation is $\mathbb{P}_n \varphi_\beta = \mathbb{P}_n X\left(Y - \frac{e^{\beta^T X}}{1 + e^{\beta^T X}}\right) = 0$. 34/49
  • 209. ▶ To use the IPW, we posit a logistic regression for the missingness mechanism: $P(R = 1 \mid Y, X; \psi) = \frac{e^{\psi_0 + \psi_1 Y + \psi_2^T X_1}}{1 + e^{\psi_0 + \psi_1 Y + \psi_2^T X_1}}$. The MLE $\hat\psi_n$ can be computed by solving $\mathbb{P}_n S_\psi = 0$, where $S_\psi = (1, Y, X_1^T)^T\left(R - \frac{e^{\psi_0 + \psi_1 Y + \psi_2^T X_1}}{1 + e^{\psi_0 + \psi_1 Y + \psi_2^T X_1}}\right)$. ▶ To construct the DR estimating equation, we need to compute the conditional expectation $E\left[X\left(Y - \frac{e^{\beta^T X}}{1 + e^{\beta^T X}}\right) \,\Big|\, Y, X_1\right]$. 35/49
  • 217. ▶ Therefore we need to posit a working model for $p(x \mid \lambda)$, or at least for $p(x_2 \mid y, x_1; \lambda)$. If we do the latter, we should be aware that $p(x_2 \mid y, x_1; \lambda)$ must be compatible with the regression model $p(y \mid x; \beta)$. In fact, if the covariate distribution is MVN, we can show that $x \mid y$ is multivariate normal. This motivates the following working model: $X_2 \mid Y, X_1 \sim N(\lambda_0 + \lambda_1 Y + \lambda_2^T X_1,\ \lambda_3)$. The MLE $\hat\lambda_n$ is easily computed by least squares with a complete-case analysis. ▶ Finally we need to compute $E\left[X\left(Y - \frac{e^{\beta^T X}}{1 + e^{\beta^T X}}\right) \,\Big|\, Y, X_1; \hat\lambda_n\right]$. This can be completed using numerical or Monte Carlo integration. 36/49
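Under the normal working model the Monte Carlo integration step is direct: draw $X_2$ from $N(\lambda_0 + \lambda_1 Y + \lambda_2^T X_1, \lambda_3)$ and average the integrand. A sketch for scalar $X_1$ follows; the parameter values in the usage line are hypothetical.

```python
import numpy as np

def expit(t):
    return 1.0 / (1.0 + np.exp(-t))

def cond_expectation(y, x1, beta, lam, n_draws=2000, rng=None):
    """Monte Carlo approximation of E[ X (y - expit(beta'X)) | Y=y, X1=x1 ]
    under the working model X2 | Y, X1 ~ N(lam0 + lam1*y + lam2*x1, lam3)."""
    rng = rng or np.random.default_rng(0)
    lam0, lam1, lam2, lam3 = lam
    x2 = rng.normal(lam0 + lam1 * y + lam2 * x1, np.sqrt(lam3), size=n_draws)
    X = np.column_stack([np.ones(n_draws), np.full(n_draws, x1), x2])  # (1, X1, X2)
    resid = y - expit(X @ beta)
    return (X * resid[:, None]).mean(axis=0)

out = cond_expectation(1.0, 0.0, np.zeros(3), (0.0, 0.0, 0.0, 1.0))
print(out)
```

In practice one would call this once per subject with $R = 0$ inside each Newton-Raphson step.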
  • 226. Hence the DR estimating equation is $\Psi_n^{DR}(\beta) = \mathbb{P}_n\left\{\frac{R}{\pi(Y, X_1^T; \hat\psi_n)}\, X\left(Y - \frac{e^{\beta^T X}}{1 + e^{\beta^T X}}\right) - \frac{R - \pi(Y, X_1^T; \hat\psi_n)}{\pi(Y, X_1^T; \hat\psi_n)}\, E\left[X\left(Y - \frac{e^{\beta^T X}}{1 + e^{\beta^T X}}\right) \,\Big|\, Y, X_1; \hat\lambda_n\right]\right\}$. (20) $\hat\beta_n$ can be obtained using the Newton-Raphson algorithm, and its variance estimated using (19). 37/49
  • 240. Monotone coarsened data Definition 2.3 (Monotone coarsening) If we can order the levels of coarsening in such a way that $G_r(Z)$ is a coarsened version of $G_{r+1}(Z)$, $r = 1, 2, \dots$, that is, $G_r(Z) = f_r(G_{r+1}(Z))$, where $f_r$ is a many-to-one function, then the coarsening is said to be monotone. Example 2.4 (Monotone missingness in longitudinal data) When a subject is followed over time, we observe $(Y_1, \dots, Y_k)$, where $Y_j$ is the measurement at the $j$th time point. Incomplete data arise if a subject is lost to follow-up at a certain point. In this case, if a measurement is missing at the $r$th time point, then all measurements after that will be missing. 38/49
  • 241. $C = r$: $G_r(Z)$ — $1$: $Y_1$; $2$: $(Y_1, Y_2)$; $\dots$; $k - 1$: $(Y_1, \dots, Y_{k-1})$; $\infty$: $(Y_1, \dots, Y_k)$. For monotone coarsened data, it is natural and convenient to model missingness via the discrete hazard function $\lambda_r(G_r) = P(C = r \mid C \ge r, Z)$ for $r \ne \infty$, with $\lambda_\infty \equiv 1$. Define $K_r(G_r) = P(C > r \mid Z) = \prod_{j=1}^r [1 - \lambda_j(G_j)]$. Then the function $\pi$ can be expressed as $\pi(r, G_r(Z)) = K_{r-1}(G_{r-1})\,\lambda_r(G_r)$. 39/49
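The relations among the hazards $\lambda_r$, the survival quantities $K_r$, and the level probabilities $\pi(r, G_r)$ are easy to sanity-check numerically. The sketch below uses made-up hazard values for illustration.

```python
import numpy as np

def coarsening_probs(lam):
    """Given discrete hazards lam[r-1] = lambda_r(G_r), r = 1..k-1,
    return K_r = P(C > r | Z) and pi_r = P(C = r | Z) = lam_r * K_{r-1};
    P(C = infinity | Z) is the last K."""
    lam = np.asarray(lam, dtype=float)
    K = np.cumprod(1.0 - lam)                    # K_r, r = 1..k-1
    K_prev = np.concatenate([[1.0], K[:-1]])     # K_{r-1}, with K_0 = 1
    pi = lam * K_prev                            # P(C = r | Z)
    return K, pi

K, pi = coarsening_probs([0.2, 0.3, 0.1])
# dropout probabilities at r = 1, 2, 3 plus the completion probability sum to one
print(pi, K[-1], pi.sum() + K[-1])
```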
  • 243. As in the case with two levels of missingness, we first need to characterize the augmentation space $\mathcal{A}$ using Theorem 1.20. Then we use the characterization to derive $\Pi(\varphi^{IPW} \mid \mathcal{A})$. We provide the end result in the following theorem. Theorem 2.5 ($\Pi(\varphi^{IPW} \mid \mathcal{A})$ in monotone coarsened data) The projection of $\frac{I(C = \infty)\,\varphi^F(Z)}{\pi(\infty, Z)}$ onto $\mathcal{A}$ is $\sum_{r \ne \infty} \frac{\lambda_r(G_r)\, I(C \ge r) - I(C = r)}{K_r(G_r)}\, E[\varphi^F(Z) \mid G_r(Z)]$. (21) Again, to compute the conditional expectations $E[\varphi^F(Z) \mid G_r(Z)]$ we need to posit a parametric working model $p(z \mid \lambda)$, or at least a series of conditional models $p(g_{r+1} \mid g_r; \lambda_r)$. 40/49
  • 248. Remark 2.6 (Modeling the coarsening hazard) Instead of modeling the coarsening probability, we model the discrete hazard $P(C = r \mid C \ge r, Z; \psi_r) = \lambda_r(G_r; \psi_r)$. With monotone missing longitudinal data, for example, we may apply the logistic model $\lambda_r(G_r; \psi_r) = \frac{e^{\psi_{0r} + \psi_{1r} Y_1 + \dots + \psi_{rr} Y_r}}{1 + e^{\psi_{0r} + \psi_{1r} Y_1 + \dots + \psi_{rr} Y_r}}$. The likelihood for $C$ now has the form $\prod_{r \ne \infty} \prod_{i: C_i \ge r} \lambda_r(G_r(Z_i); \psi_r)^{I(C_i = r)}\,\{1 - \lambda_r(G_r(Z_i); \psi_r)\}^{I(C_i \ge r) - I(C_i = r)}$. Note that the likelihood factorizes over the $\psi_r$'s, so maximization can be done separately for each $r$. 41/49
  • 249. If we use logistic regression for monotone missing longitudinal data, the likelihood is given by $\prod_{r=1}^{k-1} \prod_{i: C_i \ge r} \frac{e^{(\psi_{0r} + \psi_{1r} Y_1 + \dots + \psi_{rr} Y_r)\, I(C_i = r)}}{1 + e^{\psi_{0r} + \psi_{1r} Y_1 + \dots + \psi_{rr} Y_r}}$. Each $\psi_r$ can be estimated using a logistic regression on the data $\{i : C_i \ge r\}$, and $S_\psi = (S_{\psi_1}^T, \dots, S_{\psi_{k-1}}^T)^T$. 42/49
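Each $\psi_r$ is just a logistic regression fitted on the risk set $\{i : C_i \ge r\}$. The self-contained numpy sketch below (Newton-Raphson MLE, simulated hazards) illustrates the factorized fitting; every data-generating choice is a made-up assumption for the demo.

```python
import numpy as np

def expit(t):
    return 1.0 / (1.0 + np.exp(-t))

def logistic_mle(X, d, n_iter=25):
    """Newton-Raphson MLE for a logistic regression of d on (1, X)."""
    X = np.column_stack([np.ones(len(d)), X])
    psi = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ psi)
        W = p * (1.0 - p)
        psi += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (d - p))
    return psi

def fit_hazards(Y, C, k):
    """One logistic fit per dropout level r, each restricted to the risk set
    {i : C_i >= r}; C_i = np.inf marks a complete case."""
    return [logistic_mle(Y[C >= r, :r], (C[C >= r] == r).astype(float))
            for r in range(1, k)]

# illustrative simulation: k = 3 time points, hazards depend on past Y's
rng = np.random.default_rng(3)
n, k = 60_000, 3
Y = rng.normal(size=(n, k))
lam1 = expit(-1.0 + Y[:, 0])
lam2 = expit(-1.0 + 0.5 * Y[:, 0] + 0.5 * Y[:, 1])
C = np.where(rng.random(n) < lam1, 1.0,
             np.where(rng.random(n) < lam2, 2.0, np.inf))
psi1, psi2 = fit_hazards(Y, C, k)
print(psi1, psi2)
```

The fitted coefficients recover the hazard parameters used in the simulation, confirming that the per-level fits can be run independently.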
  • 250. Now we look at the problem of double robustness. Let $\hat\psi_n \to \psi^*$ and $\hat\lambda_n \to \lambda^*$. Theorem 2.7 (Double robustness of DR) $E\left\{\frac{I(C = \infty)\,\varphi^F_{\beta_0}(Z)}{\pi(\infty, Z; \psi^*)} + \sum_{r \ne \infty} \frac{I(C = r) - \lambda_r(G_r; \psi^*)\, I(C \ge r)}{K_r(G_r; \psi^*)}\, E[\varphi^F_{\beta_0}(Z) \mid G_r(Z); \lambda^*]\right\} = 0$, if either the model for $\lambda_r(g_r; \psi)$ or the working model $p(z \mid \lambda)$ is correctly specified. For the proof, one can define the filtration $\mathcal{F}_r \equiv \sigma\{I(C = 1), \dots, I(C = r - 1), Z\}$ and use martingale arguments. 43/49
  • 256. Remark 2.8 (Inference with DR) Denote $\varphi_2 = \varphi^{DR} - E(\varphi^{DR} S_\psi^T)(E S_\psi S_\psi^T)^{-1} S_\psi$. Similar to the case with two levels of missingness, we can show that $\sqrt n(\hat\beta_n - \beta_0) = -\{E\dot\varphi_{\beta_0}\}^{-1}\mathbb{G}_n \varphi_{2, \beta_0, \psi_0, \lambda^*} + o_p(1) \rightsquigarrow N(0, \Sigma)$, (22) where $\Sigma$ can be consistently estimated by $\left\{\mathbb{P}_n \frac{I(C = \infty)\,\dot\varphi^F_{\hat\beta_n}}{\pi(\infty, Z; \hat\psi_n)}\right\}^{-1}\mathbb{P}_n\left[\varphi_{2, \hat\beta_n, \hat\psi_n, \hat\lambda_n}\right]^{\otimes 2}\left\{\mathbb{P}_n \frac{I(C = \infty)\,\dot\varphi^F_{\hat\beta_n}}{\pi(\infty, Z; \hat\psi_n)}\right\}^{-1}$. (23) 44/49
  • 264. Example: A longitudinal RCT with dropout Tsiatis (2006) describes a randomized clinical trial of a new drug for HIV/AIDS. The primary outcome is CD4 count, denoted by $Y$. We also denote by $X$ the indicator variable for treatment. Measurements of $Y$ are taken at baseline $t_1 = 0$ and $l - 1$ subsequent time points, denoted $t_2, \dots, t_l$. We want to model the mean CD4 count as a function of treatment and time through $E[Y_{ij} \mid X_i] = \beta_0 + \beta_1 t_j + \beta_2 X_i t_j$, $j = 1, \dots, l$. Let the design matrix be $D(X)$, that is, $E[Y_i \mid X_i] = D(X_i)\beta$. If there is no dropout, we may use the GEE with an independence working correlation, resulting in the estimating equation $\mathbb{P}_n \varphi_\beta(Y, X) = \mathbb{P}_n D^T(X)(Y - D(X)\beta) = 0$. 45/49
  • 271. Now suppose there is random dropout, and the mechanism is MAR. ▶ First we use a logistic regression for the missingness hazard, $\lambda_r(G_r; \psi_r) = \frac{e^{\psi_{0r} + \psi_{1r} Y_1 + \dots + \psi_{rr} Y_r + \psi_{r+1,r} X}}{1 + e^{\psi_{0r} + \psi_{1r} Y_1 + \dots + \psi_{rr} Y_r + \psi_{r+1,r} X}}$, and obtain the MLE $\hat\psi_n$. ▶ Denote $\bar Y_r = (Y_1, \dots, Y_r)^T$ and $\tilde Y_r = (Y_{r+1}, \dots, Y_l)^T$. From Theorem 2.5, we need to compute the conditional expectation $E[D^T(X)(Y - D(X)\beta) \mid \bar Y_r, X] = D^T(X)\, E[(Y - D(X)\beta) \mid \bar Y_r, X]$. 46/49
  • 274. ▶ If we posit the working model $Y \mid (X = k) \sim N(\mu_k, \Sigma)$, and denote by $\Sigma_{rr}$ the variance of $\bar Y_r$ and by $\Sigma_{\tilde r r}$ the covariance between $\tilde Y_r$ and $\bar Y_r$, then we have $E[(Y - D(X)\beta) \mid \bar Y_r, X; \lambda] = \begin{pmatrix} \bar Y_r - D_{r\times 3}(X)\beta \\ \Sigma_{\tilde r r}(\Sigma_{rr})^{-1}(\bar Y_r - D_{r\times 3}(X)\beta) \end{pmatrix}$, where $D_{r\times 3}(X)$ denotes the first $r$ rows of $D(X)$. ▶ The MLE $\hat\lambda_n$ can be computed using a standard statistical package (e.g. PROC MIXED in SAS). 47/49
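The stacked conditional residual above is a one-line linear-algebra step once $\Sigma$ is estimated. A sketch follows; the design $D$, $\beta$, and $\Sigma$ values are illustrative, and $D_{r\times 3}(X)$ is taken as the first $r$ rows of $D(X)$.

```python
import numpy as np

def cond_residual(y_obs, D, beta, Sigma):
    """E[Y - D beta | Ybar_r, X] when Y - D beta ~ N(0, Sigma):
    the observed block is its own residual, and the unobserved block is
    predicted by Sigma_{~r,r} Sigma_{rr}^{-1} applied to that residual."""
    r = len(y_obs)
    eps_r = y_obs - D[:r] @ beta
    B = Sigma[r:, :r] @ np.linalg.inv(Sigma[:r, :r])
    return np.concatenate([eps_r, B @ eps_r])

# illustrative call: l = 3 visits, first r = 2 observed
t = np.array([0.0, 1.0, 2.0])
X = 1.0
D = np.column_stack([np.ones(3), t, X * t])   # mean model E[Y|X] = D(X) beta
beta = np.array([1.0, 0.5, 0.2])
Sigma = np.array([[1.0, 0.5, 0.25],
                  [0.5, 1.0, 0.5],
                  [0.25, 0.5, 1.0]])
out = cond_residual(np.array([2.0, 3.0]), D, beta, Sigma)
print(out)
```

This is the vector that gets premultiplied by $D^T(X)$ inside the augmentation term of (24).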
  • 279. ▶ $\beta$ can be estimated through the DR estimating equation, solved by the Newton-Raphson algorithm: $\mathbb{P}_n \varphi^{DR}_{\beta, \hat\psi_n, \hat\lambda_n} = \mathbb{P}_n\left\{\frac{I(C = \infty)}{\pi(\infty, Z; \hat\psi_n)}\, D^T(X)(Y - D(X)\beta) + \sum_{r=1}^{l-1} \frac{I(C = r) - \lambda_r(\bar Y_r, X; \hat\psi_n)\, I(C \ge r)}{K_r(\bar Y_r, X; \hat\psi_n)}\, D^T(X)\, E[(Y - D(X)\beta) \mid \bar Y_r, X; \hat\lambda_n]\right\} = 0$. (24) ▶ The asymptotic variance of $\hat\beta_n$ can be estimated using the sandwich-type estimator described in (23). 48/49
  • 284. References Bang H, Robins JM (2005). Doubly robust estimation in missing data and causal inference models. Biometrics 61, 962-973. Bickel PJ, Klaassen CAJ, Ritov Y, Wellner JA (1993). Efficient and Adaptive Estimation for Semiparametric Models. Springer. Kosorok MR (2008). Introduction to Empirical Processes and Semiparametric Inference. Springer. Lipsitz SR, Ibrahim JG, Zhao LP (1999). A weighted estimating equation for missing covariate data with properties similar to maximum likelihood. Journal of the American Statistical Association 94, 1147-1160. Robins JM, Rotnitzky A (2001). Comment on the Bickel and Kwon article, "Inference for semiparametric models: Some questions and an answer". Statistica Sinica 11, 920-936. Scharfstein DO, Rotnitzky A, Robins JM (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association 94, 1135-1146. Tsiatis AA (2006). Semiparametric Theory and Missing Data. Springer Series in Statistics. 49/49