DNA copy number and mRNA expression are widely used data types in cancer
studies; combined, they provide more insight than either does separately. Whereas in
existing literature the form of the relationship between these two types of
markers is fixed a priori, in this paper we model their association. We employ
piecewise linear regression splines (PLRS), which combine good interpretability
with sufficient flexibility to identify any plausible type of relationship. The
specification of the model leads to estimation and model selection in a
constrained, nonstandard setting. We provide methodology for testing the effect
of DNA on mRNA and choosing the appropriate model. Furthermore, we present a
novel approach to obtain reliable confidence bands for constrained PLRS, which
incorporates model uncertainty. The procedures are applied to colorectal and
breast cancer data. Common assumptions are found to be potentially misleading
for biologically relevant genes. More flexible models may bring more insight into
the interaction between the two markers.

Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics
(http://www.imstat.org), at http://dx.doi.org/10.1214/12-AOAS605.
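
To make the form of a piecewise linear regression spline concrete, the following minimal sketch evaluates a continuous piecewise linear function built from truncated linear basis functions. It is an illustration only and not the paper's implementation: the function names, knot locations, and coefficient values are hypothetical, and the non-negative slope check is merely one assumed example of the kind of constraint a constrained PLRS might impose.

```python
def hinge(x, knot):
    """Truncated linear basis function (x - knot)_+."""
    return max(x - knot, 0.0)


def plrs_predict(x, intercept, slope, knots, slope_changes):
    """Evaluate a continuous piecewise linear spline at x.

    The slope changes by slope_changes[j] once x passes knots[j].
    """
    value = intercept + slope * x
    for knot, change in zip(knots, slope_changes):
        value += change * hinge(x, knot)
    return value


def segment_slopes(slope, slope_changes):
    """Return the slope on each segment, from left to right."""
    slopes = [slope]
    for change in slope_changes:
        slopes.append(slopes[-1] + change)
    return slopes


# Hypothetical values chosen for illustration (not taken from the paper).
knots = [-0.5, 0.5]          # assumed breakpoints on the copy-number axis
slope_changes = [-0.5, 1.5]  # assumed change in slope at each knot
intercept, slope = 1.0, 0.5

slopes = segment_slopes(slope, slope_changes)

# Assumed example constraint: expression never decreases with copy number.
is_monotone = all(s >= 0 for s in slopes)

fitted = [plrs_predict(x, intercept, slope, knots, slope_changes)
          for x in (-1.0, 0.0, 1.0)]
```

Each coefficient in this parameterization has a direct reading as the change in slope at a knot, which is one way such a model can stay interpretable while still allowing different relationships on different segments.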