44 research outputs found

    Temporal ordering of substitutions in RNA evolution: uncovering the structural evolution of the human accelerated region 1

    The Human Accelerated Region 1 (HAR1) is the most rapidly evolving region in the human genome. It is part of two overlapping long non-coding RNAs, is only 118 nucleotides long, and features 18 human-specific changes compared to an ancestral sequence that is extremely well conserved across non-human primates. The human HAR1 forms a stable secondary structure that is strikingly different from the one in chimpanzee and other closely related species, again emphasizing its human-specific evolutionary history. This suggests that positive selection has acted to stabilize human-specific features in the ensemble of HAR1 secondary structures. To investigate the evolutionary history of the human HAR1 structure, we developed a computational model that evaluates the relative likelihood of evolutionary trajectories as a probabilistic version of a Hamiltonian path problem. The model predicts that the most likely last step in turning the ancestral primate HAR1 into the human HAR1 was exactly the substitution that distinguishes the modern human HAR1 sequence from that of Denisovan, an archaic human, providing independent support for our model. The MutationOrder software is available for download and can be applied to other instances of RNA structure evolution.
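
The idea of weighing all orderings of a fixed set of substitutions can be sketched as a dynamic program over subsets of already-applied substitutions. This is a minimal Python sketch and not the MutationOrder implementation: the function name `last_step_probabilities` and the `weight(mask, k)` callback are hypothetical, and how step weights would be derived from RNA structures is left open as an assumption.

```python
from typing import Callable, Dict


def last_step_probabilities(
    n: int, weight: Callable[[int, int], float]
) -> Dict[int, float]:
    """Probability that substitution k is applied last, summed over all orders.

    `weight(mask, k)` is a hypothetical, unnormalised weight for applying
    substitution k when the substitutions in bitmask `mask` are already present.
    """
    full = (1 << n) - 1
    # z[mask] = total weight of all orderings that reach exactly `mask`.
    z = [0.0] * (1 << n)
    z[0] = 1.0
    for mask in range(1 << n):
        if z[mask] == 0.0:
            continue
        for k in range(n):
            bit = 1 << k
            if not mask & bit:
                # Adding a bit always yields a larger index, so z[mask] is
                # already complete when it is read here.
                z[mask | bit] += z[mask] * weight(mask, k)
    # Condition on the final step: reach everything except k, then apply k.
    return {
        k: z[full ^ (1 << k)] * weight(full ^ (1 << k), k) / z[full]
        for k in range(n)
    }
```

With a constant weight every substitution is equally likely to come last. The table has 2^n entries, so the 18 substitutions mentioned in the abstract would need 262,144 of them, which stays tractable, whereas enumerating all 18! orderings would not.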

    ViennaRNA Package 2.0

    Background: Secondary structure forms an important intermediate level of description of nucleic acids that encapsulates the dominating part of the folding energy, is often well conserved in evolution, and is routinely used as a basis to explain experimental findings. Based on carefully measured thermodynamic parameters, exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties. Results: The ViennaRNA Package has been a widely used compilation of RNA secondary structure related computer programs for nearly two decades. Major changes in the structure of the standard energy model, the Turner 2004 parameters, the pervasive use of multi-core CPUs, and an increasing number of algorithmic variants prompted a major technical overhaul of both the underlying RNAlib and the interactive user programs. New features include an expanded repertoire of tools to assess RNA-RNA interactions and restricted ensembles of structures, additional output information such as centroid structures and maximum expected accuracy structures derived from base pairing probabilities, or z-scores for locally stable secondary structures, and support for input in fasta format. Updates were implemented without compromising the computational efficiency of the core algorithms and ensuring compatibility with earlier versions. Conclusions: The ViennaRNA Package 2.0, supporting concurrent computations via OpenMP, can be downloaded from http://www.tbi.univie.ac.at/RNA.
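
To illustrate the kind of exact dynamic programming the abstract refers to, here is a deliberately simplified Python sketch. It maximises the number of base pairs (the classic Nussinov recursion) instead of minimising free energy, so it is a stand-in and not the Turner-parameter energy model used by the ViennaRNA Package. The function name and the `min_loop` default are assumptions made for this example.

```python
def nussinov_max_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in `seq` (simplified model).

    A pair (k, j) is allowed only if at least `min_loop` unpaired
    nucleotides lie between k and j.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum pair count on the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The recursion fills a triangular table in order of increasing subsequence length. Energy-based folding follows the same overall pattern but uses a richer decomposition into loop types.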

    Discovering biomarkers for myocardial infarction from SELDI-TOF spectra

    Höner zu Siederdissen C, Ragg S, Rahmann S. Discovering biomarkers for myocardial infarction from SELDI-TOF spectra. In: Advances in Data Analysis. Proceedings of the 30th Annual Conference of the Gesellschaft für Klassifikation e.V. Studies in Classification, Data Analysis, and Knowledge Organization. Springer; 2007: 569-576

    Decoding cell-type contributions to the cfRNA transcriptomic landscape of liver cancer

    Background: Liquid biopsy, particularly cell-free RNA (cfRNA), has emerged as a promising non-invasive diagnostic tool for various diseases, including cancer, due to its accessibility and the wealth of information it provides. A key area of interest is the composition and cellular origin of cfRNA in the blood and the alterations in the cfRNA transcriptomic landscape during carcinogenesis. Investigating these changes can offer insights into the manifestations of tissue alterations in the blood, potentially leading to more effective diagnostic strategies. However, the consistency of these findings across different studies and their clinical utility remain to be fully elucidated, highlighting the need for further research in this area. Results: In this study, we analyzed over 350 blood samples from four distinct studies, investigating the cell type contributions to the cfRNA transcriptomic landscape in liver cancer. We found that an increase in hepatocyte proportions in the blood is a consistent feature across most studies and can be effectively utilized for classifying cancer and healthy samples. Moreover, our analysis revealed that in addition to hepatocytes, liver endothelial cell signatures are also prominent in the observed changes. By comparing the classification performance of cellular proportions to established markers, we demonstrated that cellular proportions could distinguish cancer from healthy samples as effectively as existing markers and can even enhance classification when used in combination with these markers. Conclusions: Our comprehensive analysis of liver cell-type composition changes in blood revealed robust effects that help distinguish cancer from healthy samples. This is especially noteworthy considering the heterogeneous nature of the datasets and the etiological distinctions of the samples. Furthermore, the observed differences in results across studies underscore the importance of integrative and comparative approaches in future research to determine the consistency and robustness of findings. This study contributes to the understanding of cfRNA composition in liver cancer and highlights the potential of cellular deconvolution in liquid biopsy.

    Algebraic dynamic programming over general data structures

    Background: Dynamic programming algorithms provide exact solutions to many problems in computational biology, such as sequence alignment, RNA folding, hidden Markov models (HMMs), and scoring of phylogenetic trees. Structurally analogous algorithms compute optimal solutions, evaluate score distributions, and perform stochastic sampling. This is explained in the theory of Algebraic Dynamic Programming (ADP) by a strict separation of state space traversal (usually represented by a context-free grammar), scoring (encoded as an algebra), and choice rule. A key ingredient in this theory is the use of yield parsers that operate on the ordered input data structure, usually strings or ordered trees. The computation of ensemble properties, such as a posteriori probabilities of HMMs or partition functions in RNA folding, requires the combination of two distinct, but intimately related algorithms, known as the inside and the outside recursion. Only the inside recursions are covered by the classical ADP theory. Results: The ideas of ADP are generalized to a much wider scope of data structures by relaxing the concept of parsing. This allows us to formalize the conceptual complementarity of inside and outside variables in a natural way. We demonstrate that outside recursions are generically derivable from inside decomposition schemes. In addition to rephrasing the well-known algorithms for HMMs, pairwise sequence alignment, and RNA folding, we show how the TSP and the shortest Hamiltonian path problem can be implemented efficiently in the extended ADP framework. As a showcase application we investigate the ancient evolution of HOX gene clusters in terms of shortest Hamiltonian paths. Conclusions: The generalized ADP framework presented here greatly facilitates the development and implementation of dynamic programming algorithms for a wide spectrum of applications.
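
The separation of state space traversal, scoring algebra, and choice rule described in this abstract can be illustrated with a small Python sketch for global pairwise alignment. It is not the authors' framework: the `Algebra` class, the `align` function, and the unit match, mismatch, and gap scores are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Algebra(Generic[T]):
    """Scoring functions plus a choice rule (hypothetical interface)."""
    empty: Callable[[], T]
    match: Callable[[T, str, str], T]   # align one symbol with another
    delete: Callable[[T, str], T]       # symbol of x against a gap
    insert: Callable[[T, str], T]       # symbol of y against a gap
    choice: Callable[[Iterable[T]], T]


def align(x: str, y: str, alg: Algebra[T]) -> T:
    """Traverse the alignment state space once; meaning comes from `alg`."""
    table = [[None] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i in range(len(x) + 1):
        for j in range(len(y) + 1):
            if i == 0 and j == 0:
                table[i][j] = alg.empty()
                continue
            candidates = []
            if i > 0 and j > 0:
                candidates.append(alg.match(table[i - 1][j - 1], x[i - 1], y[j - 1]))
            if i > 0:
                candidates.append(alg.delete(table[i - 1][j], x[i - 1]))
            if j > 0:
                candidates.append(alg.insert(table[i][j - 1], y[j - 1]))
            table[i][j] = alg.choice(candidates)
    return table[len(x)][len(y)]


# Optimisation algebra: +1 match, -1 mismatch or gap, keep the maximum.
score = Algebra(
    empty=lambda: 0,
    match=lambda s, a, b: s + (1 if a == b else -1),
    delete=lambda s, a: s - 1,
    insert=lambda s, b: s - 1,
    choice=max,
)

# Counting algebra: every step carries the count along, choice sums.
count = Algebra(
    empty=lambda: 1,
    match=lambda c, a, b: c,
    delete=lambda c, a: c,
    insert=lambda c, b: c,
    choice=sum,
)
```

The same `align` recursion returns the optimal score with `score` and the number of distinct alignments with `count`. Only the algebra changes, which is the point of the separation.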