
    On the Representability of Complete Genomes by Multiple Competing Finite-Context (Markov) Models

    A finite-context (Markov) model of a given order yields the probability distribution of the next symbol in a sequence, given the recent past up to a depth equal to the model order. Markov modeling has long been applied to DNA sequences, for example to find gene-coding regions. With the first studies came the discovery that DNA sequences are non-stationary: distinct regions require distinct model orders. Since then, Markov and hidden Markov models have been extensively used to describe the gene structure of prokaryotes and eukaryotes. However, to our knowledge, a comprehensive study of the potential of Markov models to describe complete genomes is still lacking. We address this gap in this paper. Our approach relies on (i) multiple competing Markov models of different orders, (ii) careful programming techniques that allow orders as large as sixteen, (iii) adequate handling of inverted repeats, and (iv) probability estimates suited to the wide range of context depths used. To measure how well a model fits the data at a particular position in the sequence, we use the negative logarithm of the probability estimate at that position. This measure yields information profiles of the sequence, which are of independent interest. Its average over the entire sequence, which amounts to the average number of bits per base needed to describe the sequence, is used as a global performance measure. Our main conclusion is that, from the probabilistic or information-theoretic point of view and according to this performance measure, multiple competing Markov models explain entire genomes almost as well as, or even better than, state-of-the-art DNA compression methods such as XM, which rely on very different statistical models. This is surprising, because Markov models are local (short-range), in contrast with the statistical models underlying other methods, which exploit the extensive repetitions in DNA sequences and therefore have a non-local character.
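
    As a rough illustration of the information-profile measure described above, the sketch below implements a single adaptive order-k finite-context model with Laplace-style smoothing and reports the average number of bits per base. This is a minimal sketch under simplifying assumptions: the paper's combination of multiple competing orders, inverted-repeat handling, and depth-tuned probability estimators is not reproduced, and all names and parameters here are illustrative.

```python
from collections import defaultdict
from math import log2

def information_profile(seq, k=3, alpha=1.0):
    """Per-symbol information content -log2 P(symbol | context) under a single
    adaptive order-k finite-context model with additive (Laplace) smoothing."""
    alphabet = "ACGT"
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    profile = []
    for i, sym in enumerate(seq):
        ctx = seq[max(0, i - k):i]            # the k most recent symbols
        ctx_counts = counts[ctx]
        total = sum(ctx_counts.values())
        p = (ctx_counts[sym] + alpha) / (total + alpha * len(alphabet))
        profile.append(-log2(p))              # bits needed to encode this base
        ctx_counts[sym] += 1                  # update the model after coding
    return profile

seq = "ACGTACGTACGTTTACGTACGT"
profile = information_profile(seq, k=2)
print(f"average bits per base: {sum(profile) / len(profile):.3f}")
```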

    Protein interaction network of alternatively spliced isoforms from brain links genetic risk factors for autism

    Increased risk for autism spectrum disorders (ASD) is attributed to hundreds of genetic loci. The convergence of ASD variants has been investigated using various approaches, including protein interactions extracted from the published literature. However, these datasets are frequently incomplete, carry biases and are limited to interactions of a single splicing isoform, which may not be expressed in the disease-relevant tissue. Here we introduce a new interactome mapping approach by experimentally identifying interactions between brain-expressed alternatively spliced variants of ASD risk factors. The Autism Spliceform Interaction Network reveals that almost half of the detected interactions and about 30% of the newly identified interacting partners represent contributions from splicing variants, emphasizing the importance of isoform networks. Isoform interactions greatly contribute to establishing direct physical connections between proteins from the de novo autism CNVs. Our findings demonstrate the critical role of spliceform networks for translating genetic knowledge into a better understanding of human diseases.

    Efficient Network Reconstruction from Dynamical Cascades Identifies Small-World Topology of Neuronal Avalanches

    Cascading activity is commonly found in complex systems with directed interactions such as metabolic networks, neuronal networks, or disease spreading in social networks. Substantial insight into a system's organization can be obtained by reconstructing the underlying functional network architecture from the observed activity cascades. Here we focus on Bayesian approaches and reduce their computational demands by introducing the Iterative Bayesian (IB) and Posterior Weighted Averaging (PWA) methods. We introduce a special case of PWA, cast in nonparametric form, which we call the normalized count (NC) algorithm. NC efficiently reconstructs random and small-world functional network topologies and architectures from subcritical, critical, and supercritical cascading dynamics, and yields significant improvements over commonly used correlation methods. With experimental data, NC identified a functional and structural small-world topology and its corresponding traffic in cortical networks with neuronal avalanche dynamics.
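
    The abstract names the normalized count (NC) algorithm without giving its details, so the following is only a hedged, illustrative sketch of a normalized-count style estimator, not the published NC algorithm: when a node is active in one frame of a cascade, the nodes active in the preceding frame share the credit for that activation equally, and the accumulated credits are normalized by how often each source node had a chance to drive others. The function name, the cascade format, and the normalization choice are all assumptions.

```python
import numpy as np

def normalized_count(cascades, n_nodes):
    """Illustrative normalized-count style reconstruction (not the paper's exact
    NC algorithm). Each cascade is a list of frames; a frame is the set of node
    indices active at that time step. When node j is active in frame t+1, every
    node i active in frame t receives credit 1 / |frame t|."""
    w = np.zeros((n_nodes, n_nodes))
    n_source = np.zeros(n_nodes)  # how often each node appeared as a potential source
    for cascade in cascades:
        for prev, curr in zip(cascade[:-1], cascade[1:]):
            if not prev:
                continue
            share = 1.0 / len(prev)
            for i in prev:
                n_source[i] += 1
                for j in curr:
                    w[i, j] += share
    for i in range(n_nodes):  # normalize by each node's opportunities to drive others
        if n_source[i] > 0:
            w[i] /= n_source[i]
    return w

# Toy example: two short cascades on a 4-node system
cascades = [[{0}, {1, 2}, {3}], [{1}, {3}]]
print(normalized_count(cascades, n_nodes=4))
```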

    Significance testing as perverse probabilistic reasoning

    Truth claims in the medical literature rely heavily on statistical significance testing. Unfortunately, most physicians misunderstand the underlying probabilistic logic of significance tests and consequently often misinterpret their results. This near-universal misunderstanding is highlighted by means of a simple quiz that we administered to 246 physicians at two major academic hospitals, on which the proportion of incorrect responses exceeded 90%. A solid understanding of the fundamental concepts of probability theory is becoming essential to the rational interpretation of medical information. This essay provides a technically sound review of these concepts that is accessible to a medical audience. We also briefly review the debate in the cognitive sciences regarding physicians' aptitude for probabilistic inference.
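
    The misunderstanding targeted here is typically the conflation of the p-value with the probability that the tested hypothesis is true. As a hedged illustration (not taken from the essay; the numbers are hypothetical), Bayes' rule gives the probability that a real effect exists after a "significant" result, from the prior probability of a true effect, the test's power, and the significance level.

```python
def prob_effect_given_significant(prior, power, alpha=0.05):
    """Bayes' rule: P(real effect | significant result), given the prior
    probability of a real effect, the test's power P(sig | effect), and the
    false-positive rate alpha = P(sig | no effect)."""
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return true_pos / (true_pos + false_pos)

# Hypothetical numbers: 10% of tested hypotheses are true, 80% power, alpha = 0.05
print(f"{prob_effect_given_significant(prior=0.10, power=0.80):.2f}")  # 0.64, not 0.95
```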

    Stem Cell Therapy: Pieces of the Puzzle

    Acute ischemic injury and chronic cardiomyopathies can cause irreversible loss of cardiac tissue, leading to heart failure. Cellular therapy offers a new paradigm for the treatment of heart disease. Stem cell therapies in animal models show that transplantation of various cell preparations improves ventricular function after injury. The first clinical trials in patients produced some encouraging results, despite limited evidence for the long-term survival of transplanted cells. Ongoing research at the bench and the bedside aims to compare sources of donor cells, test methods of cell delivery, improve myocardial homing, bolster cell survival, and promote cardiomyocyte differentiation. This article reviews progress toward these goals.

    Bayes, Thomas (1702–1761)


    An Introduction to Parameter Estimation Using Bayesian Probability Theory

    Bayesian probability theory does not define a probability as a frequency of occurrence; rather, it defines it as a reasonable degree of belief. Because it does not define a probability as a frequency of occurrence, it is possible to assign probabilities to propositions such as "The probability that the frequency had a given value when the data were taken," or "The probability that hypothesis x is a better description of the data than hypothesis y." Problems of the first type are parameter estimation problems; they implicitly assume the correct model. Problems of the second type are more general; they are model selection problems and do not assume the model. Both types of problems are straightforward applications of the rules of Bayesian probability theory. This paper is a tutorial on parameter estimation. The basic rules for manipulating and assigning probabilities are given, and an example, the estimation of a single stationary sinusoidal frequency, is worked in detail. This example is su..
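
    As a companion to the worked example mentioned in the abstract, here is a minimal numerical sketch of Bayesian frequency estimation for a single stationary sinusoid in Gaussian noise, assuming the noise standard deviation is known and using the standard result that, with uniform priors on the quadrature amplitudes, the frequency posterior is approximately proportional to exp(C(omega)/sigma^2), where C is the Schuster periodogram. It is an illustration under those assumptions, not a reproduction of the paper's derivation; the signal parameters below are made up.

```python
import numpy as np

def frequency_posterior(d, t, omegas, sigma):
    """Posterior over the frequency of a single stationary sinusoid in Gaussian
    noise of known standard deviation sigma, evaluated on a frequency grid.
    p(omega | d) is taken as approximately proportional to
    exp(C(omega) / sigma**2), with C the Schuster periodogram."""
    n = len(d)
    C = np.array([np.abs(np.sum(d * np.exp(-1j * w * t))) ** 2 / n for w in omegas])
    log_post = C / sigma**2
    post = np.exp(log_post - log_post.max())                # subtract max for stability
    return post / (post.sum() * (omegas[1] - omegas[0]))    # normalize on the grid

# Simulated data: one sinusoid at omega = 2.0 rad/sample plus Gaussian noise
rng = np.random.default_rng(0)
t = np.arange(100.0)
d = np.cos(2.0 * t + 0.3) + rng.normal(0.0, 0.5, t.size)
omegas = np.linspace(1.8, 2.2, 2000)
post = frequency_posterior(d, t, omegas, sigma=0.5)
print("posterior mode:", omegas[np.argmax(post)])
```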