Search CORE

52 research outputs found

Minimum triplet covers of binary phylogenetic X-trees

Author: A Dress
A Dress
A Guénoche
B Leclerc
J Moon
K. T. Huber
KT Huber
M. Steel
N Saitou
V. Moulton
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/01/2017
Field of study

Trees with labelled leaves and with all other vertices of degree three play an important role in systematic biology and other areas of classification. A classical combinatorial result ensures that such trees can be uniquely reconstructed from the distances between the leaves (when the edges are given any strictly positive lengths). Moreover, a linear number of these pairwise distance values suffices to determine both the tree and its edge lengths. A natural set of pairs of leaves is provided by any `triplet cover' of the tree (based on the fact that each non-leaf vertex is the median vertex of three leaves). In this paper we describe a number of new results concerning triplet covers of minimum size. In particular, we characterize such covers in terms of an associated graph being a 2-tree. Also, we show that minimum triplet covers are `shellable' and thereby provide a set of pairs for which the inter-leaf distance values will uniquely determine the underlying tree and its associated branch lengths

arXiv.org e-Print Archive

Crossref

UC Research Repository

University of East Anglia digital repository

Safe and complete contig assembly via omnitigs

Author: A Bankevich
A Guénoche
AR Rubinov
AS Motahari
C Kingsford
D Haussler
DR Zerbino
E Kapun
E Kapun
ES Lander
G Bresler
G Narzisi
I Lysov
JD Kececioglu
JR Miller
JT Simpson
JT Simpson
K Lam
K Sahlin
L Salmela
M Boetzer
M Boetzer
N Nagarajan
N Nagarajan
N Vyahhi
P Medvedev
P Medvedev
P Medvedev
PA Pevzner
PA Pevzner
R Chikhi
R Chikhi
R Luo
R Uricaru
RM Idury
SL Salzberg
Publication venue
Publication date: 16/08/2016
Field of study

Contig assembly is the first stage that most assemblers solve when reconstructing a genome from a set of reads. Its output consists of contigs -- a set of strings that are promised to appear in any genome that could have generated the reads. From the introduction of contigs 20 years ago, assemblers have tried to obtain longer and longer contigs, but the following question was never solved: given a genome graph

G

(e.g. a de Bruijn, or a string graph), what are all the strings that can be safely reported from

G

as contigs? In this paper we finally answer this question, and also give a polynomial time algorithm to find them. Our experiments show that these strings, which we call omnitigs, are 66% to 82% longer on average than the popular unitigs, and 29% of dbSNP locations have more neighbors in omnitigs than in unitigs.Comment: Full version of the paper in the proceedings of RECOMB 201

arXiv.org e-Print Archive

Crossref

Comparing performance of algorithms for generating concept lattices

Author: Aho A. V.
Blake C. L.
Bordat J. P.
Carpineto C.
Chein M.
Finn V. K.
Ganter B.
Ganter B.
Goldberg L. A.
Guénoche A.
Kuznetsov S. O.
Kuznetsov S. O.
Kuznetsov S. O.
Lindig C.
Mephu Nguifo E.
Norris E. M.
Obiedkov S. A.
Sergei A. Obiedkov
Sergei O. Kuznetsov
Valtchev P.
Van Der Merwe F. J.
Zabezhailo M. I.
Publication venue: 'Informa UK Limited'
Publication date
Field of study

Crossref

Assembly complexity of prokaryotic genomes using short reads

Author: A Guénoche
AR Rubinov
B Bollobás
B Haubold
C Smith
Carl Kingsford
D Gusfield
DH Huson
DR Zerbino
Dvan den Broek
E Myers
EW Myers
I Simon
J Butler
J Parkhill
JAA Quitzau
JC Dohm
JP Hutchinson
JP Hutchinson
M Antoniotti
M Margulies
Michael C Schatz
Mihai Pop
MJ Chaisson
MJ Chaisson
MS Waterman
N de Bruijn
N Whiteford
OG Troyanskaya
P Medvedev
PA Pevzner
PA Pevzner
R Barrangou
R Idury
S Batzoglou
T van Aardenne-Ehrenfest
TD Harris
WR Jeck
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Abstract Background De Bruijn graphs are a theoretical framework underlying several modern genome assembly programs, especially those that deal with very short reads. We describe an application of de Bruijn graphs to analyze the global repeat structure of prokaryotic genomes. Results We provide the first survey of the repeat structure of a large number of genomes. The analysis gives an upper-bound on the performance of genome assemblers for <it>de novo </it>reconstruction of genomes across a wide range of read lengths. Further, we demonstrate that the majority of genes in prokaryotic genomes can be reconstructed uniquely using very short reads even if the genomes themselves cannot. The non-reconstructible genes are overwhelmingly related to mobile elements (transposons, IS elements, and prophages). Conclusions Our results improve upon previous studies on the feasibility of assembly with short reads and provide a comprehensive benchmark against which to compare the performance of the short-read assemblers currently being developed.</p

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Repository at the University of Maryland