It has long been recognized that covariate adjustment can increase precision in randomized experiments, even when it is not strictly necessary. Adjustment is often straightforward when a discrete covariate partitions the sample into a handful of strata, but becomes more involved with even a single continuous covariate such as age. As randomized experiments remain a gold standard for scientific inquiry, and the information age facilitates the large-scale collection of baseline information, the longstanding problem of whether and how to adjust for covariates is likely to engage investigators for the foreseeable future.
In the locally efficient estimation approach introduced for general coarsened data structures by James Robins and collaborators, one first fits a relatively small working model, often with maximum likelihood, and then plugs the resulting nuisance parameter fit into an estimating equation for the parameter of interest. The usual advertisement is that the estimator will be asymptotically efficient if the working model is correct, but otherwise will still be consistent and asymptotically Gaussian.
However, applying standard likelihood-based fits to misspecified working models in covariate adjustment problems can yield poor estimates of the parameter of interest. We propose a new method, empirical efficiency maximization, which optimizes the working model fit for the resulting parameter estimate.
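To make the contrast concrete, the following is a minimal sketch and not the procedure developed in the paper. It assumes a two-arm randomized trial with known treatment probability `p`, a single continuous covariate `w`, and the average treatment effect as the parameter of interest. A misspecified main-terms working regression is fitted once by ordinary least squares and once by directly minimizing the empirical variance of the augmented estimating function over the same linear class, which is one simple reading of the efficiency-maximization idea. All function names, the simulated data-generating model, and the choice of a linear class are illustrative assumptions.

```python
import numpy as np

def augmented_terms(y, a, p, h):
    """Per-subject terms whose sample mean estimates the average treatment effect.

    h holds h(W_i) = Q(1, W_i)/p + Q(0, W_i)/(1 - p) for a working regression Q.
    With p known, the mean is consistent for any fixed h; h only affects variance.
    """
    u = y * (a / p - (1 - a) / (1 - p))  # unadjusted inverse-probability-weighted term
    return u - (a - p) * h

def h_from_ols(y, a, w, p):
    """Standard fit: least squares for the main-terms model Y ~ 1 + A + W."""
    X = np.column_stack([np.ones_like(w), a, w])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    q1 = b[0] + b[1] + b[2] * w  # fitted Q(1, W)
    q0 = b[0] + b[2] * w         # fitted Q(0, W)
    return q1 / p + q0 / (1 - p)

def h_from_variance_minimization(y, a, w, p):
    """Illustrative alternative: choose linear h(W) = g1 + g2 * W to minimize
    the empirical variance of the augmented terms. The intercept column absorbs
    the mean, so least squares here minimizes a variance, not a second moment."""
    u = y * (a / p - (1 - a) / (1 - p))
    Z = np.column_stack([np.ones_like(w), a - p, (a - p) * w])
    g, *_ = np.linalg.lstsq(Z, u, rcond=None)
    return g[1] + g[2] * w

def simulate(n, p, seed):
    """Hypothetical data: treatment effect varies with W, so the main-terms
    working model is misspecified. The true average treatment effect is 2."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)
    a = rng.binomial(1, p, size=n).astype(float)
    y = 1.0 + 2.0 * a + w + 3.0 * a * w + rng.normal(size=n)
    return y, a, w

p = 0.3
y, a, w = simulate(n=20000, p=p, seed=0)

d_ols = augmented_terms(y, a, p, h_from_ols(y, a, w, p))
d_min = augmented_terms(y, a, p, h_from_variance_minimization(y, a, w, p))

psi_ols, psi_min = d_ols.mean(), d_min.mean()
var_ols, var_min = d_ols.var(), d_min.var()

print(f"OLS working fit:        estimate={psi_ols:.3f}, empirical variance={var_ols:.2f}")
print(f"variance-minimized fit: estimate={psi_min:.3f}, empirical variance={var_min:.2f}")
```

Because the least squares plug-in is itself a member of the linear class being searched, the variance-minimized fit can never have larger in-sample empirical variance than the least squares fit. How much is gained depends on the degree of misspecification and on how far `p` is from one half.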
In addition to the randomized experiment setting, we show how our covariate adjustment procedure can be used in survival analysis applications. Numerical asymptotic efficiency calculations demonstrate gains relative to standard locally efficient estimators.
KEYWORDS: empirical efficiency maximization, covariate adjustment, locally efficient estimation, two-phase designs, clinical trials, survival analysis