169 research outputs found
The Statistics of Gene Mapping
Abstracts not available for BookReview
Reconstructing DNA copy number by joint segmentation of multiple sequences
The variation in DNA copy number carries information on the modalities of
genome evolution and misregulation of DNA replication in cancer cells; its
study can be helpful to localize tumor suppressor genes, distinguish different
populations of cancerous cells, as well as identify genomic variations responsible
for disease phenotypes. A number of different high throughput technologies can
be used to identify copy number variable sites, and the literature documents
multiple effective algorithms. We focus here on the specific problem of
detecting regions where variation in copy number is relatively common in the
sample at hand: this encompasses the cases of copy number polymorphisms,
related samples, technical replicates, and cancerous sub-populations from the
same individual. We present an algorithm based on regularization approaches
with significant computational advantages and competitive accuracy. We
illustrate its applicability with simulated and real data sets.
Comment: 54 pages, 5 figures
Catch me if you can: Signal localization with knockoff e-values
We consider problems where many, somewhat redundant, hypotheses are tested
and we are interested in reporting the most precise rejections, with false
discovery rate (FDR) control. For example, a common goal in genetics is to
identify DNA variants that carry distinct information on a trait of interest.
However, strong local dependencies between nearby variants make it challenging
to distinguish which of the many correlated features most directly influence
the phenotype. A common solution is then to identify sets of variants that
cover the truly important ones. Depending on the signal strengths, it is
possible to resolve the individual variant contributions with more or less
precision. Assuring FDR control on the reported findings with these adaptive
searches is, however, often impossible. To design a multiple comparison
procedure that allows for an adaptive choice of resolution with FDR control, we
leverage e-values and linear programming. We adapt this approach to problems
where knockoffs and group knockoffs have been successfully applied to test
conditional independence hypotheses. We demonstrate its efficacy by analyzing
data from the UK Biobank.
Comment: 43 pages, 34 figures; text edit
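As a rough illustration of how e-values can yield FDR control, the sketch below implements the standard e-BH procedure: with m hypotheses and target level alpha, it rejects the k* hypotheses with the largest e-values, where k* is the largest k such that the k-th largest e-value is at least m / (alpha * k). This is not the procedure of the abstract above, which combines e-values with linear programming to choose the resolution adaptively; the function name `ebh` and the example e-values are assumptions made for illustration only.

```python
def ebh(e_values, alpha=0.1):
    """Return the sorted indices rejected by the e-BH procedure at level alpha.

    Hypothetical helper for illustration; not the method of the abstract.
    """
    m = len(e_values)
    # Indices of the hypotheses, ordered from largest to smallest e-value.
    order = sorted(range(m), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # The k-th largest e-value must clear the threshold m / (alpha * k).
        if e_values[idx] >= m / (alpha * k):
            k_star = k
    # Reject the k_star hypotheses with the largest e-values.
    return sorted(order[:k_star])


# Made-up e-values for five hypotheses (illustrative only).
example_e_values = [50.0, 1.0, 30.0, 0.5, 2.0]
rejected = ebh(example_e_values, alpha=0.1)
print(rejected)
```

With these made-up inputs, the two largest e-values (50 and 30) clear their thresholds of 50 and 25, while the third largest (2) falls below 16.67, so only the first and third hypotheses are rejected.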