The primary goal of randomized trials is to compare the effects of different
interventions on some outcome of interest. In addition to the treatment
assignment and outcome, data on baseline covariates, such as demographic
characteristics or biomarker measurements, are typically collected.
Incorporating such auxiliary covariates in the analysis of randomized trials
can increase power, but questions remain about how to preserve type I error
when incorporating such covariates in a flexible way, particularly when the
number of randomized units is small. Using the Young Citizens study, a
cluster-randomized trial of an educational intervention to promote HIV
awareness, we compare several methods to evaluate intervention effects when
baseline covariates are incorporated adaptively. To assess the validity of
these methods in small samples, we conducted extensive simulation studies.
We demonstrate that randomization inference preserves type I error
under model selection while tests based on asymptotic theory may yield invalid
results. We also demonstrate that covariate adjustment generally increases
power, except when liberal selection procedures are used at extremely small
sample sizes. Although shown within the context of HIV prevention research, our
conclusions have important implications for maximizing efficiency and
robustness in randomized trials with small samples across disciplines.

Comment: Published at http://dx.doi.org/10.1214/13-AOAS679 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
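
The idea of randomization inference under adaptive covariate selection, as described in the abstract, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes simple complete randomization of units (the design of the Young Citizens study may differ), one outcome value per randomized unit, and a hypothetical selection rule that keeps the single covariate giving the largest reduction in residual sum of squares. All function names are illustrative. The key point is that the selection step is repeated inside every re-randomization, so the reference distribution reflects the full adaptive procedure.

```python
import random


def arm_means(values, assign):
    """Return (mean among treated units, mean among control units)."""
    treated = [v for v, a in zip(values, assign) if a == 1]
    control = [v for v, a in zip(values, assign) if a == 0]
    return sum(treated) / len(treated), sum(control) / len(control)


def adjusted_effect(y, covariates, assign):
    """Treatment effect estimate after adaptively choosing one covariate.

    Assumed selection rule (for illustration only): among the candidate
    covariates, keep the one that most reduces the residual sum of squares
    in the linear model y ~ assign + x. With no useful covariate, fall back
    to the unadjusted difference in means.
    """
    y1, y0 = arm_means(y, assign)
    # Center the outcome within each arm.
    y_c = [v - (y1 if a == 1 else y0) for v, a in zip(y, assign)]
    best_gain, best_effect = 0.0, y1 - y0
    for x in covariates:
        x1, x0 = arm_means(x, assign)
        x_c = [v - (x1 if a == 1 else x0) for v, a in zip(x, assign)]
        sxx = sum(v * v for v in x_c)
        if sxx == 0:
            continue  # covariate is constant within arms; cannot adjust
        sxy = sum(u * v for u, v in zip(x_c, y_c))
        gain = sxy * sxy / sxx  # reduction in residual sum of squares
        if gain > best_gain:
            slope = sxy / sxx
            best_gain = gain
            best_effect = (y1 - y0) - slope * (x1 - x0)
    return best_effect


def randomization_p_value(y, covariates, assign, n_draws=2000, seed=0):
    """Two-sided Monte Carlo randomization p-value.

    The whole adaptive procedure (selection plus estimation) is re-run for
    each re-randomized assignment vector.
    """
    rng = random.Random(seed)
    observed = abs(adjusted_effect(y, covariates, assign))
    shuffled = list(assign)
    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(shuffled)  # keeps the number of treated units fixed
        if abs(adjusted_effect(y, covariates, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return (extreme + 1) / (n_draws + 1)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    assign = [1, 0, 1, 0, 1, 0, 1, 0]
    x = [0.2, 0.5, 0.9, 1.1, 1.4, 1.8, 2.0, 2.3]
    y = [1.5 * a + 2.0 * v for a, v in zip(assign, x)]
    print(randomization_p_value(y, [x], assign))
```

Because the test statistic is recomputed with selection for every re-randomized assignment, the resulting p-value does not rely on asymptotic approximations, which is the property the abstract contrasts with tests based on asymptotic theory.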