Genome-wide inference of ancestral recombination graphs
The complex correlation structure of a collection of orthologous DNA
sequences is uniquely captured by the "ancestral recombination graph" (ARG), a
complete record of coalescence and recombination events in the history of the
sample. However, existing methods for ARG inference are computationally
intensive, highly approximate, or limited to small numbers of sequences, and,
as a consequence, explicit ARG inference is rarely used in applied population
genomics. Here, we introduce a new algorithm for ARG inference that is
efficient enough to apply to dozens of complete mammalian genomes. The key idea
of our approach is to sample an ARG of n chromosomes conditional on an ARG of
n-1 chromosomes, an operation we call "threading." Using techniques based on
hidden Markov models, we can perform this threading operation exactly, up to
the assumptions of the sequentially Markov coalescent and a discretization of
time. An extension allows for threading of subtrees instead of individual
sequences. Repeated application of these threading operations results in highly
efficient Markov chain Monte Carlo samplers for ARGs. We have implemented these
methods in a computer program called ARGweaver. Experiments with simulated data
indicate that ARGweaver converges rapidly to the true posterior distribution
and is effective in recovering various features of the ARG for dozens of
sequences generated under realistic parameters for human populations. In
applications of ARGweaver to 54 human genome sequences from Complete Genomics,
we find clear signatures of natural selection, including regions of unusually
ancient ancestry associated with balancing selection and reductions in allele
age in sites under directional selection. Preliminary results also indicate
that our methods can be used to gain insight into complex features of human
population structure, even with a noninformative prior distribution.
Comment: 88 pages, 7 main figures, 22 supplementary figures. This version
contains a substantially expanded genomic data analysi
Bayesian variable selection and data integration for biological regulatory networks
A substantial focus of research in molecular biology is gene regulatory
networks: the set of transcription factors and target genes which control the
involvement of different biological processes in living cells. Previous
statistical approaches for identifying gene regulatory networks have used gene
expression data, ChIP binding data or promoter sequence data, but each of these
resources provides only partial information. We present a Bayesian hierarchical
model that integrates all three data types in a principled variable selection
framework. The gene expression data are modeled as a function of the unknown
gene regulatory network which has an informed prior distribution based upon
both ChIP binding and promoter sequence data. We also present a variable
weighting methodology for the principled balancing of multiple sources of prior
information. We apply our procedure to the discovery of gene regulatory
relationships in Saccharomyces cerevisiae (Yeast) for which we can use several
external sources of information to validate our results. Our inferred
relationships show greater biological relevance on the external validation
measures than previous data integration methods. Our model also estimates
synergistic and antagonistic interactions between transcription factors, many
of which are validated by previous studies. We also evaluate the results of
our procedure for weighting multiple sources of prior information.
Finally, we discuss our methodology in the context of previous approaches to
data integration and Bayesian variable selection.
Comment: Published at http://dx.doi.org/10.1214/07-AOAS130 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Bayesian testing of many hypotheses × many genes: A study of sleep apnea
Substantial statistical research has recently been devoted to the analysis of
large-scale microarray experiments which provide a measure of the simultaneous
expression of thousands of genes in a particular condition. A typical goal is
the comparison of gene expression between two conditions (e.g., diseased vs.
nondiseased) to detect genes which show differential expression. Classical
hypothesis testing procedures have been applied to this problem and more recent
work has employed sophisticated models that allow for the sharing of
information across genes. However, many recent gene expression studies have an
experimental design with several conditions that requires an even more involved
hypothesis testing approach. In this paper, we use a hierarchical Bayesian
model to address the situation where there are many hypotheses that must be
simultaneously tested for each gene. In addition to having many hypotheses
within each gene, our analysis also addresses the more typical multiple
comparison issue of testing many genes simultaneously. We illustrate our
approach with an application to a study of genes involved in obstructive sleep
apnea in humans.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS241 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Signifying quantum benchmarks for qubit teleportation and secure communication using Einstein-Podolsky-Rosen steering inequalities
The demonstration of quantum teleportation of a photonic qubit from Alice to
Bob usually relies on data conditioned on detection at Bob's location. I show
that Bohm's Einstein-Podolsky-Rosen (EPR) paradox can be used to verify that
the quantum benchmark for qubit teleportation has been reached, without
postselection. This is possible for scenarios insensitive to losses at the
generation station, and with efficiencies of for the
teleportation process. The benchmark is obtained, if it is shown that Bob can
"steer" Alice's record of the qubit as stored by Charlie. EPR steering
inequalities involving measurement settings can also be used to confirm
quantum teleportation, for efficiencies , if one assumes trusted
detectors for Charlie and Alice. Using proofs of monogamy, I show that
two-setting EPR steering inequalities can signify secure teleportation of the
qubit state.
Comment: 10 pages, 1 Figur
Parallel Implementation of Efficient Search Schemes for the Inference of Cancer Progression Models
The emergence and development of cancer is a consequence of the accumulation
over time of genomic mutations involving a specific set of genes, which
provides the cancer clones with a functional selective advantage. In this work,
we model the order of accumulation of such mutations during the progression,
which eventually leads to the disease, by means of probabilistic graphical
models, i.e., Bayesian Networks (BNs). We investigate how to perform the task
of learning the structure of such BNs, according to experimental evidence,
adopting a global optimization meta-heuristic. In particular, in this work we
rely on Genetic Algorithms, and to strongly reduce the execution time of the
inference -- which can also involve multiple repetitions to collect
statistically significant assessments of the data -- we distribute the
calculations using both multi-threading and a multi-node architecture. The
results show that our approach is characterized by good accuracy and
specificity; we also demonstrate its feasibility, thanks to an 84x reduction of
the overall execution time with respect to a traditional sequential
implementation.
A tutorial on cue combination and Signal Detection Theory: Using changes in sensitivity to evaluate how observers integrate sensory information
Many sensory inputs contain multiple sources of information (‘cues’), such as two sounds of different frequencies, or a voice heard in unison with moving lips. Often, each cue provides a separate estimate of the same physical attribute, such as the size or location of an object. An ideal observer can exploit such redundant sensory information to improve the accuracy of their perceptual judgments. For example, if each cue is modeled as an independent, Gaussian, random variable, then combining N cues should provide up to a √N improvement in detection/discrimination sensitivity. Alternatively, a less efficient observer may base their decision on only a subset of the available information, and so gain little or no benefit from having access to multiple sources of information. Here we use Signal Detection Theory to formulate and compare various models of cue-combination, many of which are commonly used to explain empirical data. We alert the reader to the key assumptions inherent in each model, and provide formulas for deriving quantitative predictions. Code is also provided for simulating each model, allowing expected levels of measurement error to be quantified. Based on these results, it is shown that predicted sensitivity often differs surprisingly little between qualitatively distinct models of combination. This means that sensitivity alone is not sufficient for understanding decision efficiency, and the implications of this are discussed.
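The √N claim in the abstract above can be illustrated with a minimal sketch. This is not the code distributed with the tutorial; the function names `combined_dprime` and `simulate_dprime` are hypothetical, and the simulation assumes equal-strength cues with unit-variance Gaussian noise that an ideal observer simply sums.

```python
import math
import random
import statistics


def combined_dprime(dprimes):
    """Ideal-observer sensitivity for independent Gaussian cues.

    Assumption: each cue has its own d' and the cues are statistically
    independent, so the combined d' is the root sum of squares.
    """
    return math.sqrt(sum(d * d for d in dprimes))


def simulate_dprime(d_single, n_cues, n_trials=20000, seed=0):
    """Estimate d' empirically when an observer sums n_cues equal cues.

    Each cue is drawn from N(0, 1) on noise trials and N(d_single, 1)
    on signal trials (an illustrative assumption, not from the source).
    """
    rng = random.Random(seed)

    def decision_variable(signal_present):
        mean = d_single if signal_present else 0.0
        return sum(rng.gauss(mean, 1.0) for _ in range(n_cues))

    noise = [decision_variable(False) for _ in range(n_trials)]
    signal = [decision_variable(True) for _ in range(n_trials)]
    # d' = difference of means divided by the pooled standard deviation.
    pooled_sd = math.sqrt(
        (statistics.pvariance(noise) + statistics.pvariance(signal)) / 2
    )
    return (statistics.mean(signal) - statistics.mean(noise)) / pooled_sd


if __name__ == "__main__":
    for n in (1, 2, 4, 9):
        analytic = combined_dprime([1.0] * n)
        simulated = simulate_dprime(1.0, n)
        print(f"N={n}: analytic d'={analytic:.3f}, simulated d'={simulated:.3f}")
```

With N equal cues of sensitivity d', the analytic value reduces to √N · d', so four cues with d' = 1 give a combined d' of 2. An observer who uses only one of the cues would stay at d' = 1.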
Explicit and Inferred Motives for Nonsuicidal Self-Injurious Acts and Urges in Borderline and Avoidant Personality Disorders
Nonsuicidal self-injury (NSSI) is a perplexing phenomenon that may have differing motives. The present study used experience sampling methods (ESM) which inquired explicitly about the motives for NSSI, but also enabled a temporal examination of the antecedents/consequences of NSSI; these allow us to infer other motives which were not explicitly endorsed. Adults (n = 152, aged 18–65) with borderline personality disorder (BPD), avoidant personality disorder (APD), or no psychopathology participated in a 3-week computerized diary study. We examined 5 classes of explicit motives for engaging in NSSI, finding support primarily for internally directed rather than interpersonally directed ones. We then used multilevel regression to examine changes in affect, cognition, and behavior surrounding moments of NSSI acts/urges compared with control moments (i.e., without NSSI). We examined changes in 5 scales of inferred motives, designed to correspond to the 5 classes of explicit motives. The results highlight differing motives for NSSI among individuals with BPD and APD, with some similarities (mostly in the explicit motives) and some differences (mostly in the inferred motives) between the disorders. Despite their infrequent explicit endorsement, fluctuations in interpersonally oriented scales were found surrounding NSSI acts/urges. This highlights the need to continue attending to interpersonal aspects of NSSI in research and in clinical practice. Additionally, NSSI urges, like acts, were followed by a decline in affective/interpersonal distress (although in a delayed manner). Thus, interventions that build distress tolerance and enhance awareness of affective changes, and of antecedent/consequence patterns in NSSI, could help individuals resist the urge to self-injure.