The vast amount of biological knowledge accumulated over the years has
allowed researchers to identify various biochemical interactions and define
different families of pathways. There is increasing interest in identifying
pathways and pathway elements involved in particular biological processes. Drug
discovery efforts, for example, are focused on identifying biomarkers as well
as pathways related to a disease. We propose a Bayesian model that addresses
this question by incorporating information on pathways and gene networks in the
analysis of DNA microarray data. Such information is used to define pathway
summaries, specify prior distributions, and structure the MCMC moves to fit the
model. We illustrate the method with an application to gene expression data
with censored survival outcomes. In addition to identifying markers that would
have been missed otherwise and improving prediction accuracy, the integration
of existing biological knowledge into the analysis provides a better
understanding of underlying molecular processes.

Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics
(http://www.imstat.org) at http://dx.doi.org/10.1214/11-AOAS463