In microarray technology, a number of critical steps are required to convert
the raw measurements into the data relied upon by biologists and clinicians.
These data manipulations, referred to as preprocessing, influence the quality
of the ultimate measurements and studies that rely upon them. Standard
operating procedure for microarray researchers is to use preprocessed data as
the starting point for the statistical analyses that produce reported results.
This has prevented many researchers from carefully considering their choice of
preprocessing methodology. Furthermore, the fact that the preprocessing step
affects the stochastic properties of the final statistical summaries is often
ignored. In this paper we propose a statistical framework that permits the
integration of preprocessing into the standard statistical analysis flow of
microarray data. This general framework is relevant in many microarray
platforms and motivates targeted analysis methods for specific applications. We
demonstrate its usefulness by applying the idea in three different applications
of the technology.Comment: Published in at http://dx.doi.org/10.1214/07-AOAS116 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org