We introduce a statistical method that can reconstruct nonlinear genetic
models (i.e., including epistasis, or gene-gene interactions) from
phenotype-genotype (GWAS) data. The computational and data resource
requirements are similar to those necessary for reconstruction of linear
genetic models (or identification of gene-trait associations), assuming a
condition of generalized sparsity, which limits the total number of gene-gene
interactions. An example of a sparse nonlinear model is one in which a typical
locus interacts with several or even many others, but only a small subset of
all possible interactions exists. It seems plausible that most genetic
architectures fall in this category. Our method uses a generalization of
compressed sensing (L1-penalized regression) applied to nonlinear functions of
the sensing matrix. We give theoretical arguments suggesting that the method is
nearly optimal in performance, and demonstrate its effectiveness on broad
classes of nonlinear genetic models using both real and simulated human
genomes.

Comment: 20 pages, 8 figures. arXiv admin note: text overlap with
arXiv:1408.342
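
As an illustration of the approach summarized above, the following minimal Python sketch applies L1-penalized regression to a feature matrix augmented with pairwise products of genotype columns. It is not the authors' implementation: the simulated genotypes, the effect sizes, the penalty value `lam`, the selection threshold of 0.2, and all function names are assumptions chosen for illustration, and a plain coordinate-descent Lasso stands in for whatever solver the paper uses.

```python
import itertools
import random


def simulate(n_samples=150, n_loci=20, seed=0):
    """Toy data (assumed): centered allele counts in {-1, 0, 1} and a sparse
    phenotype with two linear effects and one pairwise interaction."""
    rng = random.Random(seed)
    genotypes = [
        [rng.randint(0, 1) + rng.randint(0, 1) - 1 for _ in range(n_loci)]
        for _ in range(n_samples)
    ]
    phenotype = [
        1.0 * g[0] - 0.8 * g[3] + 1.5 * g[1] * g[2] + rng.gauss(0.0, 0.1)
        for g in genotypes
    ]
    return genotypes, phenotype


def build_features(genotypes):
    """Return (names, columns): one column per locus plus one per locus pair."""
    n_loci = len(genotypes[0])
    names, columns = [], []
    for i in range(n_loci):
        names.append((i,))
        columns.append([row[i] for row in genotypes])
    for i, j in itertools.combinations(range(n_loci), 2):
        names.append((i, j))
        columns.append([row[i] * row[j] for row in genotypes])
    return names, columns


def standardize(col):
    """Center a column and scale it to unit variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
    return [(v - mean) / sd for v in col]


def soft_threshold(value, lam):
    """Shrink value toward zero by lam, clipping at zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_cd(columns, y, lam, sweeps=100, tol=1e-6):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1,
    assuming every column has been standardized."""
    n = len(y)
    beta = [0.0] * len(columns)
    residual = list(y)
    for _ in range(sweeps):
        max_change = 0.0
        for j, col in enumerate(columns):
            rho = sum(c * r for c, r in zip(col, residual)) / n + beta[j]
            new = soft_threshold(rho, lam)
            delta = new - beta[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


genotypes, phenotype = simulate()
names, columns = build_features(genotypes)
columns = [standardize(c) for c in columns]
mean_y = sum(phenotype) / len(phenotype)
y = [v - mean_y for v in phenotype]

beta = lasso_cd(columns, y, lam=0.1)

# Report terms whose standardized coefficient exceeds an (assumed) threshold.
selected = sorted(names[j] for j, b in enumerate(beta) if abs(b) > 0.2)
print(len(names), "candidate terms from", len(y), "samples")
print("selected terms:", selected)
```

Here there are more candidate terms (20 linear plus 190 pairwise) than samples, so the fit relies on sparsity rather than on an overdetermined system. Enumerating every pairwise product grows quadratically with the number of loci, so this brute-force expansion is only practical for toy problems of this size.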