As an increasing number of genome-wide association studies reveal the
limitations of attempting to explain phenotypic heritability by single genetic
loci, there is growing interest in associating complex phenotypes with sets of
genetic loci. While several methods for multi-locus mapping have been proposed,
it is often unclear how to relate the detected loci to the growing knowledge
about gene pathways and networks. The few methods that take biological pathways
or networks into account are either restricted to investigating a limited
number of predetermined sets of loci, or do not scale to genome-wide settings.
We present SConES, a new efficient method to discover sets of genetic loci
that are maximally associated with a phenotype, while being connected in an
underlying network. Our approach reformulates the problem of selecting features
under sparsity and connectivity constraints as a minimum cut problem, which
can be solved exactly and rapidly.
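To illustrate how such a minimum cut reformulation can work, here is a minimal sketch in Python. It is not the released Matlab code, and the abstract does not state the exact objective. The sketch assumes the selected set S should maximize the sum of per-locus association scores, minus a sparsity penalty `eta` per selected locus, minus `lam` times the weight of network edges that leave S. The names `select_connected_features`, `min_cut_source_side`, `eta`, and `lam` are hypothetical, and a plain Edmonds-Karp max-flow stands in for whatever solver the authors use.

```python
from collections import deque


def min_cut_source_side(cap, s, t):
    """Run Edmonds-Karp max-flow on capacities `cap` and return the set of
    nodes still reachable from `s` in the final residual graph."""
    # Residual capacities; every arc gets a reverse arc (0 if absent).
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0.0)

    def bfs():
        # Breadth-first search along arcs with remaining capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if t not in parent:
            # No augmenting path left: reached nodes form the source side.
            return set(parent)
        # Recover the augmenting path and push its bottleneck capacity.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck


def select_connected_features(scores, edges, eta, lam):
    """Assumed objective: maximize sum_{i in S} (scores[i] - eta)
    minus lam * (total weight of edges with exactly one endpoint in S)."""
    s, t = "source", "sink"
    cap = {s: {}, t: {}}
    for i in range(len(scores)):
        cap[i] = {}
    for i, c in enumerate(scores):
        gain = c - eta
        if gain > 0:
            cap[s][i] = gain      # cost paid if locus i is NOT selected
        elif gain < 0:
            cap[i][t] = -gain     # cost paid if locus i IS selected
    for i, j, w in edges:
        # Separating two linked loci costs lam * w in either direction.
        cap[i][j] = cap[i].get(j, 0.0) + lam * w
        cap[j][i] = cap[j].get(i, 0.0) + lam * w
    side = min_cut_source_side(cap, s, t)
    return sorted(i for i in side if i != s)


# Toy chain network 0-1-2-3-4 with two strongly associated loci (0 and 2).
toy_scores = [3.0, 0.5, 3.0, 0.0, 0.0]
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
selected = select_connected_features(toy_scores, toy_edges, eta=1.0, lam=1.0)
```

In this toy example, locus 1 has a score below the sparsity threshold, yet it is selected together with loci 0 and 2 because leaving it out would cut two network edges. With `lam=0` the connectivity term vanishes and the selection reduces to thresholding the scores at `eta`.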
SConES outperforms state-of-the-art competitors in terms of runtime, scales
to hundreds of thousands of genetic loci, and exhibits higher power in
detecting causal SNPs in simulation studies than existing methods. On flowering
time phenotypes and genotypes from Arabidopsis thaliana, SConES detects loci
that enable accurate phenotype prediction and that are supported by the
literature.
Matlab code for SConES is available at
http://webdav.tuebingen.mpg.de/u/karsten/Forschung/scones/
Comment: 20 pages, 6 figures, accepted at ISMB (International Conference on
Intelligent Systems for Molecular Biology) 201