Clustering strings with mutations using an expectation-maximization algorithm in the context of RNA structure prediction

Ponty, Yann; Regnier, Mireille; Saaidi, Afaf

Publication date: 19 October 2019
Publisher: HAL CCSD

Abstract

In comparative analysis, an RNA structure (a set of base pairs and unpaired nucleotides) is predicted from a set of RNA variants (similar sequences), under the assumption that the structure is conserved during evolution. Combining RNA variants with experimental data that inform on the local (nucleotide-level) structure may lead to more accurate structure prediction. The experimental protocol consists of mutating nucleotides that are likely to be 'unpaired'. Simultaneously reading the sequences of RNA variants that underwent the experimental mutation protocol raises the following issue: how can 'mutated' substrings of similar parent strings be clustered such that each substring is correctly assigned to its parent string? We developed an expectation-maximization algorithm that uses mutational profiles (mutation distributions) to assign the substrings to their strings of origin.
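To make the assignment problem concrete, the following is a minimal illustrative sketch of expectation-maximization over mutational profiles. It is not the authors' implementation, which the abstract does not detail. It assumes that every read is a substring already placed at a known offset in a coordinate system shared by all parents, that mutations are substitutions only, and that a mutated position is equally likely to show any of the three other nucleotides. The function name `em_assign` and its parameters (`init_rate`, `pseudo`) are hypothetical.

```python
import math

def em_assign(parents, reads, n_iter=20, init_rate=0.05, pseudo=1.0):
    """Soft-assign reads to parents (hypothetical helper, assumptions above).

    parents: equal-length strings; reads: list of (offset, substring).
    Returns (responsibilities, per-parent per-position mutation rates).
    """
    n_par = len(parents)
    weights = [1.0 / n_par] * n_par                  # mixing proportions
    rates = [[init_rate] * len(p) for p in parents]  # mutational profiles
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each parent for each read.
        resp = []
        for off, sub in reads:
            logp = []
            for k, par in enumerate(parents):
                ll = math.log(weights[k])
                for i, c in enumerate(sub):
                    m = rates[k][off + i]
                    # Match: no mutation; mismatch: one of 3 other bases.
                    ll += math.log(1.0 - m) if c == par[off + i] else math.log(m / 3.0)
                logp.append(ll)
            top = max(logp)                          # log-sum-exp stabilisation
            w = [math.exp(x - top) for x in logp]
            total = sum(w)
            resp.append([x / total for x in w])
        # M-step: re-estimate proportions and profiles from soft counts.
        for k, par in enumerate(parents):
            mass = sum(r[k] for r in resp)
            weights[k] = (mass + pseudo) / (len(reads) + n_par * pseudo)
            mut = [0.0] * len(par)
            cov = [0.0] * len(par)
            for (off, sub), r in zip(reads, resp):
                for i, c in enumerate(sub):
                    cov[off + i] += r[k]
                    if c != par[off + i]:
                        mut[off + i] += r[k]
            # The pseudocount keeps every rate strictly between 0 and 1.
            rates[k] = [(mut[j] + pseudo * init_rate) / (cov[j] + pseudo)
                        for j in range(len(par))]
    return resp, rates

# Toy data: two parents differing at positions 4 and 7.
parents = ["ACGTACGTAC", "ACGTTCGAAC"]
reads = [(0, "ACGTACGT"), (2, "GTTCGAAC"), (3, "TAGGTAC")]
resp, profiles = em_assign(parents, reads)
hard = [max(range(len(parents)), key=lambda k: r[k]) for r in resp]
```

Here `hard` holds the index of the most probable parent for each read, and `profiles[k][j]` is the estimated mutation rate of parent `k` at position `j`. The pseudocounts are a design choice of this sketch, used to avoid taking the logarithm of zero when a parent or a position receives no support.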

Available Versions

Archive Ouverte en Sciences de l'Information et de la Communication

oai:HAL:hal-02332313v1

Last time updated on 09/11/2019

HAL-Polytechnique

oai:HAL:hal-02332313v1

Last time updated on 05/12/2019