Search CORE

66 research outputs found

Observation of Sommerfeld precursors on a fluid surface

Author: A. Sommerfeld
Claude Laroche
G. B. Whitham
H. Jeffreys
H. Lamb
H. C. Kranzer
I. Kececioglu
J. Aaviksoo
J. D. Jackson
J. E. Prins
J. J. Stoker
L. Brillouin
M. S. Smith
P. Flaud
R. Albanese
Stéphan Fauve
T. B. Moodie
T. H. Havelock
V. I. Karpman
Éric Falcon
Publication venue: American Physical Society (APS)
Publication date: 04/07/2003

We report the observation of two types of Sommerfeld precursors (or forerunners) on the surface of a layer of mercury. When the fluid depth increases, we observe a transition between these two precursor surface waves in good agreement with the predictions of asymptotic analysis. At depths small enough compared to the capillary length, high-frequency precursors propagate ahead of the "main signal", and their period and amplitude, measured at a fixed point, increase in time. For larger depths, low-frequency "precursors" follow the main signal with decreasing period and amplitude. These behaviors are understood in the framework of the analysis first introduced for linear transient electromagnetic waves in a dielectric medium by Sommerfeld and Brillouin [1]. Comment: to be published in Physical Review Letters

arXiv.org e-Print Archive

Crossref

CERN Document Server

Safe and complete contig assembly via omnitigs

Author: A Bankevich
A Guénoche
AR Rubinov
AS Motahari
C Kingsford
D Haussler
DR Zerbino
E Kapun
ES Lander
G Bresler
G Narzisi
I Lysov
JD Kececioglu
JR Miller
JT Simpson
K Lam
K Sahlin
L Salmela
M Boetzer
N Nagarajan
N Vyahhi
P Medvedev
PA Pevzner
R Chikhi
R Luo
R Uricaru
RM Idury
SL Salzberg
Publication date: 16/08/2016

Contig assembly is the first stage that most assemblers solve when reconstructing a genome from a set of reads. Its output consists of contigs -- a set of strings that are promised to appear in any genome that could have generated the reads. From the introduction of contigs 20 years ago, assemblers have tried to obtain longer and longer contigs, but the following question was never solved: given a genome graph G (e.g. a de Bruijn, or a string graph), what are all the strings that can be safely reported from G as contigs? In this paper we finally answer this question, and also give a polynomial time algorithm to find them. Our experiments show that these strings, which we call omnitigs, are 66% to 82% longer on average than the popular unitigs, and 29% of dbSNP locations have more neighbors in omnitigs than in unitigs. Comment: Full version of the paper in the proceedings of RECOMB 201
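
Illustrative sketch (not taken from the paper): the abstract compares omnitigs against unitigs, so it may help to see what a unitig is. The Python below builds a de Bruijn graph from reads and reports its maximal non-branching paths, which is one common definition of unitigs. The function names, the toy reads and the choice of k are assumptions made for this example; the omnitig algorithm itself is not reproduced here.

```python
from collections import defaultdict


def kmers_of(reads, k):
    """Collect the distinct k-mers of all reads; each k-mer is one graph edge."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers


def unitigs(reads, k):
    """Return maximal non-branching paths of the de Bruijn graph as strings.

    Nodes are (k-1)-mers; every k-mer is an edge from its prefix to its suffix.
    Isolated cycles (where every node has in-degree 1 and out-degree 1) are
    skipped in this simplified sketch.
    """
    out_edges = defaultdict(list)
    in_degree = defaultdict(int)
    for kmer in sorted(kmers_of(reads, k)):
        out_edges[kmer[:-1]].append(kmer[1:])
        in_degree[kmer[1:]] += 1
    nodes = set(out_edges) | set(in_degree)

    def is_internal(node):
        # A node can sit in the middle of a unitig only if it does not branch.
        return in_degree[node] == 1 and len(out_edges[node]) == 1

    paths = []
    for start in sorted(nodes):
        if is_internal(start):
            continue
        for nxt in list(out_edges[start]):
            path = start + nxt[-1]
            while is_internal(nxt):
                nxt = out_edges[nxt][0]
                path += nxt[-1]
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Two toy reads that agree on "CGT" and then diverge.
    print(unitigs(["ACGTC", "CGTA"], 3))
```

In this toy input the path stops at the node "GT", where the graph branches. According to the abstract, omnitigs are the strings that can still be safely reported in such situations and are longer on average than unitigs.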

arXiv.org e-Print Archive

Crossref

Dependence of paracentric inversion rate on tract length

Author: A Brehm
AH Sturtevant
B Larget
GA Watterson
I Miklos
J Kececioglu
K Yogeeswaran
M Caceres
R Durrett
R Pinter
Rasmus Nielsen
Rick Durrett
S Hannenhalli
Thomas L York
TL York
V Bafna
WJ Kent
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: We develop a Bayesian method based on MCMC for estimating the relative rates of pericentric and paracentric inversions from marker data from two species. The method also allows estimation of the distribution of inversion tract lengths. RESULTS: We apply the method to data from Drosophila melanogaster and D. yakuba. We find that pericentric inversions occur at a much lower rate compared to paracentric inversions. The average paracentric inversion tract length is approx. 4.8 Mb with small inversions being more frequent than large inversions. If the two breakpoints defining a paracentric inversion tract are uniformly and independently distributed over chromosome arms there will be more short tract-length inversions than long; we find an even greater preponderance of short tract lengths than this would predict. Thus there appears to be a correlation between the positions of breakpoints which favors shorter tract lengths. CONCLUSION: The method developed in this paper provides the first statistical estimator for estimating the distribution of inversion tract lengths from marker data. Application of this method for a number of data sets may help elucidate the relationship between the length of an inversion and the chance that it will get accepted.
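
Illustrative sketch (not the paper's MCMC method): the abstract's baseline claim is that two uniform, independent breakpoints already favor short tracts. For an arm of length L, the tract length |X - Y| has density 2(L - x)/L^2, so the probability of a tract shorter than L/2 is 3/4. The short Python simulation below checks this null model only. The function name, the arm length and the sample size are assumptions made for the example.

```python
import random


def short_tract_fraction(n_samples, arm_length=1.0, seed=0):
    """Estimate P(tract length < arm_length / 2) under uniform breakpoints."""
    rng = random.Random(seed)
    short = 0
    for _ in range(n_samples):
        first = rng.uniform(0.0, arm_length)
        second = rng.uniform(0.0, arm_length)
        if abs(first - second) < arm_length / 2:
            short += 1
    return short / n_samples


if __name__ == "__main__":
    # The analytic value under the null model is 0.75.
    print(short_tract_fraction(20000))
```

The abstract reports an excess of short tracts beyond this baseline, which is what suggests a correlation between breakpoint positions.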

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Copenhagen University Research Information System

eScholarship - University of California

Viral population estimation using pyrosequencing

Author: A Dempster
A Rambaut
AMN Tsibris
B Gaschen
Baback Gharizadeh
C Wang
Chunlin Wang
D O'Meara
DC Douek
E Domingo
E Halperin
EH Simpson
ES Lander
Glenn Tesler
GS Gottlieb
GW Tyson
H Fakhrai-Rad
I Malet
IM Rouzine
J Kececioglu
JE Hopcroft
JF Simons
K Chen
KJ Metzner
L Bacheler
L Doukhan
L Excoffier
Lior Pachter
LR Ford
M Breitbart
M Eigen
M Margulies
M Stephens
MA Nowak
MJ Gonzales
ML Collins
ML Sogin
Mostafa Ronaghi
MT Tammi
N Beerenwinkel
Nicholas Eriksson
Niko Beerenwinkel
P Jenkins
PA Pevzner
R Schmid
R Shankarappa
Robert W. Shafer
RP Dilworth
S Huse
S-Y Rhee
Soo-Yon Rhee
VA Johnson
Yumi Mitsuya
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2008

The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis of such sequence data and apply these techniques to pyrosequencing data obtained from HIV populations within patients harboring drug resistant virus strains. Our main result is the estimation of the population structure of the sample from the pyrosequencing reads. This inference is based on a statistical approach to error correction, followed by a combinatorial algorithm for constructing a minimal set of haplotypes that explain the data. Using this set of explaining haplotypes, we apply a statistical model to infer the frequencies of the haplotypes in the population via an EM algorithm. We demonstrate that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations. Thus, pyrosequencing can be used for cost-effective estimation of the structure of virus populations, promising new insights into viral evolutionary dynamics and disease control strategies. Comment: 23 pages, 13 figures
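
Illustrative sketch (a simplification, not the paper's model): the last inference step in the abstract estimates haplotype frequencies with an EM algorithm. The Python below shows a minimal EM for mixture proportions. It assumes each read is already error-corrected and is described only by the set of candidate haplotypes it is compatible with. The function name and the input encoding are hypothetical.

```python
def em_haplotype_frequencies(read_compat, n_haplotypes, n_iter=100):
    """Estimate haplotype frequencies from read/haplotype compatibility sets.

    read_compat: list of non-empty sets; each set holds the indices of the
    haplotypes that could have produced that read (an assumed encoding).
    """
    freqs = [1.0 / n_haplotypes] * n_haplotypes
    for _ in range(n_iter):
        expected = [0.0] * n_haplotypes
        # E-step: split each read among its compatible haplotypes in
        # proportion to the current frequency estimates.
        for compat in read_compat:
            total = sum(freqs[h] for h in compat)
            for h in compat:
                expected[h] += freqs[h] / total
        # M-step: new frequencies are the normalized expected read counts.
        freqs = [count / len(read_compat) for count in expected]
    return freqs


if __name__ == "__main__":
    # Two reads unique to haplotype 0, one unique to haplotype 1, one ambiguous.
    reads = [{0}, {0}, {1}, {0, 1}]
    print(em_haplotype_frequencies(reads, 2))
```

For this toy input the estimate converges to frequencies of 2/3 and 1/3, because the ambiguous read is shared according to the evidence from the unambiguous ones.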

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Repository for Publications and Research Data

Crossref

Directory of Open Access Journals

PubMed Central

Caltech Authors

Dislocation-induced spin tunneling in Mn-12 acetate

A comprehensive theory of quantum spin relaxation in Mn-12 acetate crystals is developed that takes into account imperfections of the crystal structure and is based upon the generalization of the Landau-Zener effect for incoherent tunneling from excited energy levels. It is shown that linear dislocations at plausible concentrations provide the transverse anisotropy which is the main source of tunneling in Mn-12. Local rotations of the easy axis due to dislocations result in a transverse magnetic field generated by the field applied along the c-axis of the crystal, which explains the presence of odd tunneling resonances. Long-range deformations due to dislocations produce a broad distribution of tunnel splittings. The theory predicts that at subkelvin temperatures the relaxation curves for different tunneling resonances can be scaled onto a single master curve. The magnetic relaxation in the thermally activated regime follows the stretched-exponential law with the exponent depending on the field, temperature, and concentration of defects. Comment: 17 pages, 14 figures, 1 table, submitted to PR

arXiv.org e-Print Archive

Crossref

How reliably can we predict the reliability of protein structure predictions?

Author: A Drummond
A Krogh
A Löytynoja
B Knudsen
B Redelings
Balázs Dombai
D Gusfield
D Kneller
D Metzler
DF Feng
F Ronquist
G Lunter
H Zhou
I Holmes
I Miklós
István Miklós
J Felsenstein
J Garnier
J Kececioglu
J Skolnick
JL Thorne
Jotun Hein
K Karplus
K Mizuguchi
L Wang
M Dayhoff
M Suchard
M Waterman
N Goldman
N Metropolis
O Gotoh
P Hogeweg
R Bradley
R Durbin
R Fleissner
S Eddy
S Wu
SB Needleman
T Hubbard
TF Smith
W Hastings
W Press
Ádám Novák
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Comparative methods have been the standard techniques for in silico protein structure prediction. The prediction is based on a multiple alignment that contains both reference sequences with known structures and the sequence whose unknown structure is predicted. Intensive research has been made to improve the quality of multiple alignments, since misaligned parts of the multiple alignment yield misleading predictions. However, sometimes all methods fail to predict the correct alignment, because the evolutionary signal is too weak to find the homologous parts due to the large number of mutations that separate the sequences. Results: Stochastic sequence alignment methods define a posterior distribution of possible multiple alignments. They can highlight the most likely alignment, and above that, they can give posterior probabilities for each alignment column. We made a comprehensive study on the HOMSTRAD database of structural alignments, predicting secondary structures in four different ways. We showed that alignment posterior probabilities correlate with the reliability of secondary structure predictions, though the strength of the correlation is different for different protocols. The correspondence between the reliability of secondary structure predictions and alignment posterior probabilities is the closest to the identity function when the secondary structure posterior probabilities are calculated from the posterior distribution of multiple alignments. The largest deviation from the identity function has been obtained in the case of predicting secondary structures from a single optimal pairwise alignment. We also showed that alignment posterior probabilities correlate with the 3D distances between Cα amino acids in superimposed tertiary structures. Conclusion: Alignment posterior probabilities can be used to a priori detect errors in comparative models on the sequence alignment level.

CiteSeerX

Crossref

SZTAKI Publication Repository

Springer - Publisher Connector

PubMed Central

Oxford University Research Archive

ELTE Digital Institutional Repository (EDIT)

Optimizing substitution matrix choice and gap parameters for sequence alignment

Author: CB Do
CN Dewey
D Gusfield
DT Jones
E Kim
G Blackshields
GA Price
GH Gonnet
I Van Walle
J Flannick
J Kececioglu
J Pei
JD Thompson
JG Henikoff
K Katoh
M Box
MA Larkin
MO Dayhoff
MP Styczynski
MS Waterman
O Chapelle
RC Edgar
Robert C Edgar
S Henikoff
T Lassmann
T Muller
TM Phuong
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: While substitution matrices can readily be computed from reference alignments, it is challenging to compute optimal or approximately optimal gap penalties. It is also not well understood which substitution matrices are the most effective when alignment accuracy is the goal rather than homolog recognition. Here a new parameter optimization procedure, POP, is described and applied to the problems of optimizing gap penalties and selecting substitution matrices for pair-wise global protein alignments. Results: POP is compared to a recent method due to Kim and Kececioglu and found to achieve from 0.2% to 1.3% higher accuracies on pair-wise benchmarks extracted from BALIBASE. The VTML matrix series is shown to be the most accurate on several global pair-wise alignment benchmarks, with VTML200 giving best or close to the best performance in all tests. BLOSUM matrices are found to be slightly inferior, even with the marginal improvements in the bug-fixed RBLOSUM series. The PAM series is significantly worse, giving accuracies typically 2% less than VTML. Integer rounding is found to cause slight degradations in accuracy. No evidence is found that selecting a matrix based on sequence divergence improves accuracy, suggesting that the use of this heuristic in CLUSTALW may be ineffective. Using VTML200 is found to improve the accuracy of CLUSTALW by 8% on BALIBASE and 5% on PREFAB. Conclusion: The hypothesis that more accurate alignments of distantly related sequences may be achieved using low-identity matrices is shown to be false for commonly used matrix types. Source code and test data are freely available from the author's web site at http://www.drive5.com/pop.
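
Illustrative sketch (not POP itself): the parameters the abstract optimizes, a substitution score and a gap penalty, enter a global pairwise alignment through the standard Needleman-Wunsch recurrence. The Python below computes only the optimal score. It assumes a linear gap penalty and a toy match/mismatch scoring function for brevity, whereas protein aligners typically use affine gaps and a matrix such as VTML200. All names here are hypothetical.

```python
def global_alignment_score(a, b, score, gap):
    """Optimal global alignment score with a linear gap penalty.

    score: function giving the substitution score of two residues.
    gap: penalty (a negative number) added for each gapped position.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align two residues
                prev[j] + gap,                            # gap in b
                curr[j - 1] + gap,                        # gap in a
            ))
        prev = curr
    return prev[-1]


def toy_score(x, y):
    """Toy substitution function: +1 for a match, -1 for a mismatch."""
    return 1 if x == y else -1


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GCATGCU", toy_score, -1))
```

Changing `gap` or swapping `toy_score` for a real substitution matrix changes which alignment is optimal, which is why the choice of these parameters affects benchmark accuracy.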

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

LOCAS – A Low Coverage Assembly Tool for Resequencing Projects

Author: A Doring
AR Quinlan
B Langmead
C Nusbaum
D Hernandez
D Weigel
Daniel H. Huson
DC Richter
Detlef Weigel
DR Zerbino
EW Myers
H Li
I Birol
JD Kececioglu
JO Korbel
JT Simpson
Juliane D. Klein
K Schneeberger
Korbinian Schneeberger
LE Palmer
M Pop
MC Wendl
MJ Chaisson
PA Pevzner
R Li
RM Durbin
S Ossowski
SL Salzberg
SM Rumble
SQ Le
Stephan Ossowski
T Rausch
Ying Xu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Motivation: Next Generation Sequencing (NGS) is a frequently applied approach to detect sequence variations between highly related genomes. Recent large-scale re-sequencing studies such as the Human 1000 Genomes Project utilize NGS data of low coverage to afford sequencing of hundreds of individuals. Here, SNPs and micro-indels can be detected by applying an alignment-consensus approach. However, computational methods capable of discovering other variations such as novel insertions or highly diverged sequence from low coverage NGS data are still lacking. Results: We present LOCAS, a new NGS assembler particularly designed for low coverage assembly of eukaryotic genomes using a mismatch sensitive overlap-layout-consensus approach. LOCAS assembles homologous regions in a homology-guided manner while it performs de novo assemblies of insertions and highly polymorphic target regions subsequent to an alignment-consensus approach. LOCAS has been evaluated in homology-guided assembly scenarios with low sequence coverage of Arabidopsis thaliana strains sequenced as part of the Arabidopsis 1001 Genomes Project. While assembling the same amount of long insertions as state-of-the-art NGS assemblers, LOCAS showed best results regarding contig size, error rate and runtime. Conclusion: LOCAS produces excellent results for homology-guided assembly of eukaryotic genomes with short reads and low sequencing depth, and therefore appears to be the assembly tool of choice for the detection of novel sequenc
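
Illustrative sketch (not LOCAS code): the overlap step of an overlap-layout-consensus assembler looks for a suffix of one read that matches a prefix of another, and a mismatch-sensitive variant tolerates a few differing bases. The Python below is a brute-force version of that idea. The function name, the parameters and the thresholds are assumptions made for the example, and a real assembler would use an index rather than trying every length.

```python
def best_overlap(left, right, min_len, max_mismatches):
    """Length of the longest suffix of `left` that matches a prefix of
    `right` with at most `max_mismatches` differing positions.

    Returns 0 when no overlap of at least `min_len` bases qualifies.
    """
    longest = min(len(left), len(right))
    for length in range(longest, min_len - 1, -1):
        suffix = left[len(left) - length:]
        prefix = right[:length]
        mismatches = sum(1 for x, y in zip(suffix, prefix) if x != y)
        if mismatches <= max_mismatches:
            return length
    return 0


if __name__ == "__main__":
    # Exact overlap "TGCA" between the two reads.
    print(best_overlap("ACGTTGCA", "TGCAGGTA", 3, 0))
    # "GTAC" versus "GTTC" overlaps only if one mismatch is tolerated.
    print(best_overlap("ACGTAC", "GTTCAA", 3, 1))
```

Allowing mismatches lets reads from polymorphic regions still overlap, at the cost of more spurious overlaps when the threshold is too loose.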

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

MPG.PuRe

ScholarBank@NUS