
4 research outputs found

Safely Filling Gaps with Partial Solutions Common to All Solutions

Author: Salmela Leena
Tomescu Alexandru I.
Publication date: 01/01/2019

Gap filling has emerged as a natural sub-problem of many de novo genome assembly projects. The gap filling problem generally asks for an s-t path in an assembly graph whose length matches the gap length estimate. Several methods have addressed it, but only a few have focused on strategies for dealing with multiple gap filling solutions and for guaranteeing reliable results. Such strategies include reporting only unique solutions, or exhaustively enumerating all filling solutions and heuristically creating their consensus. Our main contribution is a new method for reliable gap filling: filling gaps with those sub-paths common to all gap filling solutions. We call these partial solutions safe, following the framework of Tomescu and Medvedev (RECOMB 2016). We give an efficient safe algorithm running in O(dm) time and space, where d is the gap length estimate and m is the number of edges of the assembly graph. To show the benefits of this method, we implemented this algorithm for the problem of filling gaps in scaffolds. Our experimental results on bacterial and on conservative human assemblies show that, on average, our method can retrieve over 73 percent more safe and correct bases than previous methods, with similar precision. Peer reviewed
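The idea of keeping only what is common to all gap filling solutions can be illustrated with a small sketch. This is not the authors' algorithm. It assumes a strongly simplified setting in which the gap length d is counted in edges, every solution is an s-t walk of exactly d edges, and "safe" means that all solutions visit the same node at a given position. The function name `safe_positions` and the edge-list input format are hypothetical, chosen only for this example. The sketch runs in O(dm) time, since each of the d levels scans at most m edges in each direction.

```python
def safe_positions(edges, s, t, d):
    """Return, for each position 0..d, the node shared by ALL s-t walks
    of exactly d edges, or None where the walks disagree.
    Returns None if no such walk exists.

    Simplifying assumption (not from the paper): lengths are counted
    in edges and the gap length estimate is exact.
    """
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        pred.setdefault(v, []).append(u)

    # forward[i]: nodes reachable from s by a walk of exactly i edges
    forward = [set() for _ in range(d + 1)]
    forward[0] = {s}
    for i in range(d):
        for u in forward[i]:
            forward[i + 1].update(succ.get(u, ()))

    if t not in forward[d]:
        return None  # no walk of length d from s to t

    # backward[i]: nodes from which t is reachable in exactly d - i edges
    backward = [set() for _ in range(d + 1)]
    backward[d] = {t}
    for i in range(d, 0, -1):
        for v in backward[i]:
            backward[i - 1].update(pred.get(v, ()))

    # A node lies on some solution at position i iff it is in both sets.
    # If exactly one node qualifies, every solution passes through it.
    result = []
    for i in range(d + 1):
        viable = forward[i] & backward[i]
        result.append(next(iter(viable)) if len(viable) == 1 else None)
    return result


# Two alternative fillings s-a-c-t and s-b-c-t: position 1 is ambiguous,
# while positions 0, 2 and 3 are shared by both.
example_edges = [("s", "a"), ("s", "b"), ("a", "c"), ("b", "c"), ("c", "t")]
print(safe_positions(example_edges, "s", "t", 3))
```

In the example the two fillings differ only at position 1, so a conservative gap filler in this simplified model would report the shared nodes and leave that position unresolved instead of choosing one filling arbitrarily.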

Crossref

Helsingin yliopiston digitaalinen arkisto

Variant genotyping with gap filling

Author: Mäkinen Veli
Salmela Leena
Walve Riku
Publication date: 01/01/2017

Peer reviewed

Crossref

Directory of Open Access Journals

Helsingin yliopiston digitaalinen arkisto

An Optimal O(nm) Algorithm for Enumerating All Walks Common to All Closed Edge-covering Walks of a Graph

Author: Acosta Nidia Obscura
Cairo Massimo
Medvedev Paul
Rizzi Romeo
Tomescu Alexandru I.
Publication date: 01/01/2019

In this article, we consider the following problem. Given a directed graph G, output all walks of G that are sub-walks of all closed edge-covering walks of G. This problem was first considered by Tomescu and Medvedev (RECOMB 2016), who characterized these walks through the notion of omnitig. Omnitigs were shown to be relevant for the genome assembly problem from bioinformatics, where a genome sequence must be assembled from a set of reads from a sequencing experiment. Tomescu and Medvedev (RECOMB 2016) also proposed an algorithm for listing all maximal omnitigs, by launching an exhaustive visit from every edge. In this article, we prove new insights about the structure of omnitigs and solve several open questions about them. We combine these to achieve an O(nm)-time algorithm for outputting all the maximal omnitigs of a graph (with n nodes and m edges). This is also optimal, as we show families of graphs whose total omnitig length is Omega(nm). We implement this algorithm and show that it is 9-12 times faster in practice than the one of Tomescu and Medvedev (RECOMB 2016). Peer reviewed

Aaltodoc Publication Archive

Catalogo dei prodotti della ricerca

Helsingin yliopiston digitaalinen arkisto

Safely Filling Gaps with Partial Solutions Common to All Solutions

Author: Alexandru I. Tomescu
Leena Salmela
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref