
    Genome-wide inference of ancestral recombination graphs

    The complex correlation structure of a collection of orthologous DNA sequences is uniquely captured by the "ancestral recombination graph" (ARG), a complete record of coalescence and recombination events in the history of the sample. However, existing methods for ARG inference are computationally intensive, highly approximate, or limited to small numbers of sequences, and, as a consequence, explicit ARG inference is rarely used in applied population genomics. Here, we introduce a new algorithm for ARG inference that is efficient enough to apply to dozens of complete mammalian genomes. The key idea of our approach is to sample an ARG of n chromosomes conditional on an ARG of n-1 chromosomes, an operation we call "threading." Using techniques based on hidden Markov models, we can perform this threading operation exactly, up to the assumptions of the sequentially Markov coalescent and a discretization of time. An extension allows for threading of subtrees instead of individual sequences. Repeated application of these threading operations results in highly efficient Markov chain Monte Carlo samplers for ARGs. We have implemented these methods in a computer program called ARGweaver. Experiments with simulated data indicate that ARGweaver converges rapidly to the true posterior distribution and is effective in recovering various features of the ARG for dozens of sequences generated under realistic parameters for human populations. In applications of ARGweaver to 54 human genome sequences from Complete Genomics, we find clear signatures of natural selection, including regions of unusually ancient ancestry associated with balancing selection and reductions in allele age in sites under directional selection. Preliminary results also indicate that our methods can be used to gain insight into complex features of human population structure, even with a noninformative prior distribution. Comment: 88 pages, 7 main figures, 22 supplementary figures.
    This version contains a substantially expanded genomic data analysis

    Bayesian variable selection and data integration for biological regulatory networks

    A substantial focus of research in molecular biology is gene regulatory networks: the set of transcription factors and target genes which control the involvement of different biological processes in living cells. Previous statistical approaches for identifying gene regulatory networks have used gene expression data, ChIP binding data or promoter sequence data, but each of these resources provides only partial information. We present a Bayesian hierarchical model that integrates all three data types in a principled variable selection framework. The gene expression data are modeled as a function of the unknown gene regulatory network, which has an informed prior distribution based upon both ChIP binding and promoter sequence data. We also present a variable weighting methodology for the principled balancing of multiple sources of prior information. We apply our procedure to the discovery of gene regulatory relationships in Saccharomyces cerevisiae (yeast), for which we can use several external sources of information to validate our results. Our inferred relationships show greater biological relevance on the external validation measures than previous data integration methods. Our model also estimates synergistic and antagonistic interactions between transcription factors, many of which are validated by previous studies. We also evaluate the results from our procedure for weighting multiple sources of prior information. Finally, we discuss our methodology in the context of previous approaches to data integration and Bayesian variable selection. Comment: Published at http://dx.doi.org/10.1214/07-AOAS130 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Bayesian testing of many hypotheses × many genes: A study of sleep apnea

    Substantial statistical research has recently been devoted to the analysis of large-scale microarray experiments, which provide a measure of the simultaneous expression of thousands of genes in a particular condition. A typical goal is the comparison of gene expression between two conditions (e.g., diseased vs. nondiseased) to detect genes which show differential expression. Classical hypothesis testing procedures have been applied to this problem, and more recent work has employed sophisticated models that allow for the sharing of information across genes. However, many recent gene expression studies have an experimental design with several conditions that requires an even more involved hypothesis testing approach. In this paper, we use a hierarchical Bayesian model to address the situation where there are many hypotheses that must be simultaneously tested for each gene. In addition to having many hypotheses within each gene, our analysis also addresses the more typical multiple comparison issue of testing many genes simultaneously. We illustrate our approach with an application to a study of genes involved in obstructive sleep apnea in humans. Comment: Published at http://dx.doi.org/10.1214/09-AOAS241 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Signifying quantum benchmarks for qubit teleportation and secure communication using Einstein-Podolsky-Rosen steering inequalities

    The demonstration of quantum teleportation of a photonic qubit from Alice to Bob usually relies on data conditioned on detection at Bob's location. I show that Bohm's Einstein-Podolsky-Rosen (EPR) paradox can be used to verify that the quantum benchmark for qubit teleportation has been reached, without postselection. This is possible for scenarios insensitive to losses at the generation station, and with efficiencies of $\eta_{B}>1/3$ for the teleportation process. The benchmark is obtained if it is shown that Bob can "steer" Alice's record of the qubit as stored by Charlie. EPR steering inequalities involving $m$ measurement settings can also be used to confirm quantum teleportation, for efficiencies $\eta_{B}>1/m$, if one assumes trusted detectors for Charlie and Alice. Using proofs of monogamy, I show that two-setting EPR steering inequalities can signify secure teleportation of the qubit state. Comment: 10 pages, 1 Figure

    Parallel Implementation of Efficient Search Schemes for the Inference of Cancer Progression Models

    The emergence and development of cancer is a consequence of the accumulation over time of genomic mutations involving a specific set of genes, which provides the cancer clones with a functional selective advantage. In this work, we model the order of accumulation of such mutations during the progression, which eventually leads to the disease, by means of probabilistic graphical models, i.e., Bayesian Networks (BNs). We investigate how to perform the task of learning the structure of such BNs, according to experimental evidence, adopting a global optimization meta-heuristic. In particular, in this work we rely on Genetic Algorithms, and to strongly reduce the execution time of the inference -- which can also involve multiple repetitions to collect statistically significant assessments of the data -- we distribute the calculations using both multi-threading and a multi-node architecture. The results show that our approach is characterized by good accuracy and specificity; we also demonstrate its feasibility, thanks to an 84x reduction of the overall execution time with respect to a traditional sequential implementation.

    Explicit and Inferred Motives for Nonsuicidal Self-Injurious Acts and Urges in Borderline and Avoidant Personality Disorders

    Nonsuicidal self-injury (NSSI) is a perplexing phenomenon that may have differing motives. The present study used experience sampling methods (ESM) which inquired explicitly about the motives for NSSI, but also enabled a temporal examination of the antecedents/consequences of NSSI; these allow us to infer other motives which were not explicitly endorsed. Adults (n = 152, aged 18–65) with borderline personality disorder (BPD), avoidant personality disorder (APD), or no psychopathology participated in a 3-week computerized diary study. We examined 5 classes of explicit motives for engaging in NSSI, finding support primarily for internally directed rather than interpersonally directed ones. We then used multilevel regression to examine changes in affect, cognition, and behavior surrounding moments of NSSI acts/urges compared with control moments (i.e., without NSSI). We examined changes in 5 scales of inferred motives, designed to correspond to the 5 classes of explicit motives. The results highlight differing motives for NSSI among individuals with BPD and APD, with some similarities (mostly in the explicit motives) and some differences (mostly in the inferred motives) between the disorders. Despite their infrequent explicit endorsement, fluctuations in interpersonally oriented scales were found surrounding NSSI acts/urges. This highlights the need to continue attending to interpersonal aspects of NSSI in research and in clinical practice. Additionally, NSSI urges, like acts, were followed by a decline in affective/interpersonal distress (although in a delayed manner). Thus, interventions that build distress tolerance and enhance awareness of affective changes, and of antecedent/consequence patterns in NSSI, could help individuals resist the urge to self-injure.