488 research outputs found

The Fibers and Range of Reduction Graphs in Ciliates

Author: A. Bergeron
A. Ehrenfeucht
Hendrik Jan Hoogeboom
J. Setubal
P. Pevzner
R. Brijder
Robert Brijder
S. Hannenhalli
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/02/2007

The biological process of gene assembly has been modeled based on three types of string rewriting rules, called string pointer rules, defined on so-called legal strings. It has been shown that reduction graphs for legal strings, which are based on the notion of the breakpoint graph in the theory of sorting by reversal, provide valuable insights into the gene assembly process. We characterize which legal strings obtain the same reduction graph (up to isomorphism), and moreover we characterize which graphs are (isomorphic to) reduction graphs. Comment: 24 pages, 13 figures

arXiv.org e-Print Archive

Crossref

A New Simulated Annealing Algorithm for the Multiple Sequence Alignment Problem: The approach of Polymers in a Random Media

Author: A. Godzik
D. Gunsfield
J. Kim
M. Hernández-Guía
M. Ishikawa
M. S. Waterman
P. Pevzner
R. Durbin
R. Mulet
S. Geman
S. Rodríguez-Pérez
Publication venue: 'American Physical Society (APS)'
Publication date: 10/01/2005

We propose a probabilistic algorithm to solve the Multiple Sequence Alignment problem. The algorithm is a Simulated Annealing (SA) scheme that exploits the representation of the multiple alignment between D sequences as a directed polymer in D dimensions. Within this representation we can easily track the evolution of the alignment in configuration space through local moves of low computational cost. In contrast to other probabilistic algorithms proposed to solve this problem, our approach allows for the creation and deletion of gaps without extra computational cost. The algorithm was tested by aligning proteins from the kinase family. When D=3 the results are consistent with those obtained using a complete algorithm. For D>3, where the complete algorithm fails, we show that our algorithm still converges to reasonable alignments. Moreover, we study the space of solutions obtained and show that, depending on the number of sequences aligned, the solutions are organized in different ways, suggesting a possible source of errors for progressive algorithms. Comment: 7 pages and 11 figures

arXiv.org e-Print Archive

Crossref

Structural and Functional Organization of the Vestibular Apparatus in Rats Subjected to Weightlessness for 19.5 Days Aboard the Kosmos-782 Satellite

Author: Aronova M. Z.
Bronshteyn A. A.
Gazenko O. G.
Govardovskiy V. I.
Gribakin G. G.
Kharkeyevich T. A.
Pevzner R. A.
Titova L. K.
Tsirulis T. P.
Vinnikov Y. A.

The vestibular apparatus was investigated in rats subjected to weightlessness for 19.5 days. The vestibular apparatus was removed and its sections were fixed in a glutaraldehyde solution for investigation by light and electron microscopy. Structural and functional changes were noted in the otolith portions of the ear, with the otolith particles clinging to the utricular receptor surface and with the peripheral arrangement of the nucleolus in the nuclei of the receptor cells. It is possible that increased edema of the vestibular tissue resulted in the destruction of some receptor cells and in changes in the form and structure of the otolith. In the horizontal crista, the cupula was separated

NASA Technical Reports Server

Expected length of the longest common subsequence for large alphabets

Author: A. Frieze
A. Vershik
B. Bollobás
B. Logan
C. Schensted
D. Aldous
J. Baik
J. Gravner
J.F.C. Kingman
K. Johannson
M. Kiwi
P. Erdös
P. Pevzner
R. Baeza-Yates
R. Stanley
S. Janson
S. Ulam
V. Chvátal
Publication date: 01/01/2003

We consider the length L of the longest common subsequence of two n-character words chosen uniformly and independently at random over a k-ary alphabet. Subadditivity arguments yield that the expected value of L, when normalized by n, converges to a constant C_k. We prove a conjecture of Sankoff and Mainville from the early 1980s claiming that C_k\sqrt{k} goes to 2 as k goes to infinity. Comment: 14 pages, 1 figure, LaTeX
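The quantity in this abstract can be explored numerically. The following is a minimal sketch, not the paper's method: it computes L with the standard dynamic-programming recurrence and averages L/n over random word pairs. The function names `lcs_length` and `estimate_normalized_lcs`, and the parameters `trials` and `seed`, are illustrative assumptions.

```python
import random


def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def estimate_normalized_lcs(n, k, trials, seed=0):
    """Monte Carlo average of L/n for two uniform random n-letter words over k letters.

    For finite n this only approximates the limiting constant C_k.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = [rng.randrange(k) for _ in range(n)]
        b = [rng.randrange(k) for _ in range(n)]
        total += lcs_length(a, b)
    return total / (trials * n)


if __name__ == "__main__":
    for k in (2, 4, 16):
        ratio = estimate_normalized_lcs(n=200, k=k, trials=20)
        print(k, ratio, ratio * k ** 0.5)
```

The last printed column is the empirical analogue of C_k\sqrt{k}. At these small values of n and k it should not be expected to sit close to the limiting value 2.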

arXiv.org e-Print Archive

CiteSeerX

Crossref

Safe and complete contig assembly via omnitigs

Author: A Bankevich
A Guénoche
AR Rubinov
AS Motahari
C Kingsford
D Haussler
DR Zerbino
E Kapun
ES Lander
G Bresler
G Narzisi
I Lysov
JD Kececioglu
JR Miller
JT Simpson
K Lam
K Sahlin
L Salmela
M Boetzer
N Nagarajan
N Vyahhi
P Medvedev
PA Pevzner
R Chikhi
R Luo
R Uricaru
RM Idury
SL Salzberg
Publication date: 16/08/2016

Contig assembly is the first stage that most assemblers solve when reconstructing a genome from a set of reads. Its output consists of contigs -- a set of strings that are promised to appear in any genome that could have generated the reads. From the introduction of contigs 20 years ago, assemblers have tried to obtain longer and longer contigs, but the following question was never solved: given a genome graph G (e.g. a de Bruijn, or a string graph), what are all the strings that can be safely reported from G as contigs? In this paper we finally answer this question, and also give a polynomial time algorithm to find them. Our experiments show that these strings, which we call omnitigs, are 66% to 82% longer on average than the popular unitigs, and 29% of dbSNP locations have more neighbors in omnitigs than in unitigs. Comment: Full version of the paper in the proceedings of RECOMB 201

arXiv.org e-Print Archive

Crossref

Anyui Volcano in Chukotka: Age, structure, pecularities of rocks' composition and eruptions

Author: Fedorov P. I.
Gertsev D. O.
Kushcheva Y.-U. V.
Pevzner M. M.
Romanenko F. A.
Publication date: 01/01/2017

The study of lavas and pyroclastics from Anyui Volcano made it possible to reconstruct the succession of its eruption events. The age of the eruption is estimated by isotopic methods to be 0.248 ± 0.030 Ma. It is established that the last episode of volcanic activity in northeastern Russia occurred 0.2‒0.5 Ma ago (in its continental part, 0.2‒0.3 Ma ago). This episode is chronologically close to the last peak of volcanic activity in the Arctic and Subarctic regions. The absence of features indicating glacial influence on lavas from Anyui Volcano suggests that no significant glaciations took place in the continental areas of western Chukotka during the last 250 ka.

ZENODO

Parking functions, labeled trees and DCJ sorting scenarios

Author: A. Bergeron
A. McLysaght
A.C. Siepel
A.G. Konheim
A.W. Xu
D. Sankoff
E. Barcucci
I. Miklós
M. Ozery-flato
M.D.V. Braga
P. Pevzner
R.P. Stanley
S. Bérard
S. Yancopoulos
Y. Ajana
Y. Diekmann
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

In genome rearrangement theory, one of the elusive questions raised in recent years is the enumeration of rearrangement scenarios between two genomes. This problem is related to the uniform generation of rearrangement scenarios, and the derivation of tests of statistical significance of the properties of these scenarios. Here we give an exact formula for the number of double-cut-and-join (DCJ) rearrangement scenarios of co-tailed genomes. We also construct effective bijections between the set of scenarios that sort a cycle and well-studied combinatorial objects such as parking functions and labeled trees. Comment: 12 pages, 3 figures
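The two combinatorial families named in this abstract are classically equinumerous: there are (n+1)^(n-1) parking functions of length n, which by Cayley's formula is also the number of labeled trees on n+1 vertices. The sketch below checks this by brute force for small n. It illustrates only that classical count. It is not the paper's bijection or its DCJ scenario formula, and the function names are illustrative assumptions.

```python
from itertools import product


def is_parking_function(seq):
    """True if seq (values in 1..n) is a parking function.

    Criterion: after sorting, the i-th smallest value is at most i.
    """
    return all(value <= position
               for position, value in enumerate(sorted(seq), start=1))


def count_parking_functions(n):
    """Count parking functions of length n by exhaustive enumeration."""
    return sum(1 for seq in product(range(1, n + 1), repeat=n)
               if is_parking_function(seq))


if __name__ == "__main__":
    for n in range(1, 6):
        # Second column: brute-force count; third: (n+1)^(n-1).
        print(n, count_parking_functions(n), (n + 1) ** (n - 1))
```

The enumeration visits n^n sequences, so it is only practical for very small n.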

arXiv.org e-Print Archive

CiteSeerX

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

Stationary Distribution and Eigenvalues for a de Bruijn Process

Author: Abbas Alhakim
Anthony Ralston
B. Nooten Van
Donald E. Knuth
Haiyan Chen
Herbert S. Wilf
J. Sherman
N. G. Bruijn
P. Flajolet
Pavel A. Pevzner
R A Blythe
R. Dawson
T. Aardenne-Ehrenfest van
T. Mori
V. V. Strok
W. T. Tutte
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/08/2011

We define a de Bruijn process with parameters n and L as a certain continuous-time Markov chain on the de Bruijn graph with words of length L over an n-letter alphabet as vertices. We determine explicitly its steady state distribution and its characteristic polynomial, which turns out to decompose into linear factors. In addition, we examine the stationary state of two specializations in detail. In the first one, the de Bruijn-Bernoulli process, this is a product measure. In the second one, the Skin-deep de Bruijn process, the distribution has constant density but nontrivial correlation functions. The two-point correlation function is determined using generating function techniques. Comment: Dedicated to Herb Wilf on the occasion of his 80th birthday
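The state space described in this abstract can be built directly. The sketch below constructs only the de Bruijn graph on words of length L over n letters. It does not model the Markov chain, whose transition rates the abstract does not give. The edge convention (drop the first letter, append a new one) and the function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def de_bruijn_graph(n, L):
    """Map each word of length L over {0..n-1} to its list of successors.

    Assumed convention: an edge goes from w to w[1:] + (c,) for every letter c.
    """
    alphabet = range(n)
    graph = {}
    for word in product(alphabet, repeat=L):
        graph[word] = [word[1:] + (letter,) for letter in alphabet]
    return graph


def in_degrees(graph):
    """Count incoming edges for every vertex that has at least one."""
    counts = Counter()
    for successors in graph.values():
        counts.update(successors)
    return counts


if __name__ == "__main__":
    g = de_bruijn_graph(n=2, L=3)
    print(len(g))          # number of vertices, n**L
    print(g[(0, 1, 1)])    # successors of the word 011
```

Every vertex has n outgoing and n incoming edges, so the graph is regular in both directions.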

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Limited Lifespan of Fragile Regions in Mammalian Evolution

Author: A. Bergeron
A. Bhutkar
A. Kulemzina
A. Ruiz-Herrera
A.E. Wind van der
C. Webber
D. Larkin
D. Misceo
D. San Mauro
D. Sankoff
D.M. Larkin
E. Mlynarski
E. Mongin
E.E. Eichler
G. Fertin
H. Hinsch
H. Kikuta
H. Zhao
J. Ma
J.H. Nadeau
L. Armengol
L. Gordon
M. Caceres
M. Longo
M.A. Alekseyev
M.R. Mehan
O. Lecompte
P. Pevzner
P.A. Pevzner
R. Koszul
S. Myers
S. Ohno
S. Yancopoulos
S. Zhao
W.J. Kent
W.J. Murphy
Y. Yue
Z. Jiang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

An important question in genome evolution is whether there exist fragile regions (rearrangement hotspots) where chromosomal rearrangements happen over and over again. Although nearly all recent studies supported the existence of fragile regions in mammalian genomes, the most comprehensive phylogenomic study of mammals (Ma et al. (2006) Genome Research 16, 1557-1565) raised some doubts about their existence. We demonstrate that fragile regions are subject to a "birth and death" process, implying that fragility has a limited evolutionary lifespan. This finding implies that fragile regions migrate to different locations in different mammals, explaining why there exist only a few chromosomal breakpoints shared between different lineages. The birth and death of fragile regions reinforces the hypothesis that rearrangements are promoted by matching segmental duplications and suggests putative locations of the currently active fragile regions in the human genome.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Reconstructing cancer genomes from paired-end sequencing data

Author: A Kotzig
AA Steinhardt
Anna Ritz
AR Quinlan
B Raphael
Benjamin J Raphael
BJ Druker
BJ Raphael
C Greenman
CD Greenman
CK Ng
D Hochbaum
DG Albertson
DR Bentley
DY Chiang
E Tuzun
ER Mardis
F Hormozdiari
JO Korbel
K Chen
Layla Oesper
LE Kelemen
M Meyerson
MA Alekseyev
MC Schatz
P Kauraniemi
P Medvedev
P Pevzner
PA Pevzner
PJ Campbell
PJ Stephens
R Wittler
R Xi
RE Mills
Ryan Drebin
S Durinck
S Hannenhalli
S Sindi
S Takakura
S Volik
S Yoon
SA Moestue
Sarah J Aerni
Y Jung
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or "blocks" of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data.

Results: By aligning paired reads from a cancer genome - and a matched germline genome, if available - to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem results in an Eulerian graph, containing an alternating Eulerian tour that corresponds to a cancer genome that is consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles.

Conclusions: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data. An implementation of the PREGO algorithm is available at http://compbio.cs.brown.edu/software/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central