366 research outputs found

    A general coverage theory for shotgun DNA sequencing

    Get PDF

    Algebraic correction methods for computational assessment of clone overlaps in DNA fingerprint mapping

    Get PDF
    Background: The Sulston score is a well-established, though approximate, metric for probabilistically evaluating postulated clone overlaps in DNA fingerprint mapping. It is known to systematically over-predict match probabilities by various orders of magnitude, depending upon project-specific parameters. Although the exact probability distribution is also available for the comparison problem, it is rather difficult to compute and cannot be used directly in most cases. A methodology providing both improved accuracy and computational economy is required.

    Results: We propose a straightforward algebraic correction procedure, which takes the Sulston score as a provisional value and applies a power-law equation to obtain an improved result. Numerical comparisons indicate dramatically increased accuracy over the range of parameters typical of traditional agarose fingerprint mapping. Issues with extrapolating the method into parameter ranges characteristic of newer capillary electrophoresis-based projects are also discussed.

    Conclusion: Although only marginally more expensive to compute than the raw Sulston score, the correction provides a vastly improved probabilistic description of hypothesized clone overlaps. This will clearly be important in overlap assessment, and perhaps for other tasks as well, for example in using the ranking of overlap probabilities to assist in clone ordering.
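
    For orientation, here is a minimal Python sketch of the raw Sulston score in one common formulation (n and m are the clones' band counts, k the number of matched bands, t the matching tolerance as a fraction of gel length), followed by a placeholder power-law correction. The constants a and b are hypothetical stand-ins for the project-specific values the paper derives, not the paper's actual fit.

    ```python
    import math

    def band_match_prob(m: int, t: float) -> float:
        # Probability that one band matches at least one of the other
        # clone's m bands, given tolerance t (fraction of gel length).
        return 1.0 - (1.0 - 2.0 * t) ** m

    def sulston_score(n: int, m: int, k: int, t: float) -> float:
        # Binomial tail: probability that k or more of the n bands
        # match by chance alone (the raw Sulston score).
        p = band_match_prob(m, t)
        return sum(math.comb(n, j) * p**j * (1.0 - p) ** (n - j)
                   for j in range(k, n + 1))

    def corrected_score(s: float, a: float, b: float) -> float:
        # Hypothetical power-law correction S' = a * S**b; a and b are
        # placeholders for fitted, project-specific constants.
        return a * s**b
    ```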

    Characteristics of de novo structural changes in the human genome

    Get PDF

    Algebraic Torsion in Contact Manifolds

    Full text link
    We extract a nonnegative integer-valued invariant, which we call the "order of algebraic torsion", from the Symplectic Field Theory of a closed contact manifold, and show that its finiteness gives obstructions to the existence of symplectic fillings and exact symplectic cobordisms. A contact manifold has algebraic torsion of order zero if and only if it is algebraically overtwisted (i.e. has trivial contact homology), and any contact 3-manifold with positive Giroux torsion has algebraic torsion of order one (though the converse is not true). We also construct, for each nonnegative integer k, examples of contact 3-manifolds that have algebraic torsion of order k but not k - 1, and derive consequences for contact surgeries on such manifolds. The appendix by Michael Hutchings gives an alternative proof of our cobordism obstructions in dimension three using a refinement of the contact invariant in Embedded Contact Homology.

    Comment: 53 pages, 4 figures, with an appendix by Michael Hutchings; v.3 is a final update to agree with the published paper, and also corrects a minor error that appeared in the published version of the appendix.
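
    In symbols, the abstract's two structural claims can be recorded as follows (a paraphrase in LaTeX, writing AT for the minimal order of algebraic torsion and GT for Giroux torsion; "order one" is rendered as AT <= 1, since torsion of a given order implies torsion of every higher order):

    ```latex
    \[
      \mathrm{AT}(M,\xi) = 0 \;\Longleftrightarrow\; (M,\xi)\ \text{is algebraically overtwisted},
    \]
    \[
      \mathrm{GT}(M,\xi) > 0 \;\Longrightarrow\; \mathrm{AT}(M,\xi) \le 1 \qquad (\text{converse false}).
    \]
    ```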

    Statistical aspects of discerning indel-type structural variation via DNA sequence alignment

    Get PDF
    Background: Structural variations in the form of DNA insertions and deletions are an important aspect of human genetics and especially relevant to medical disorders. Investigations have shown that such events can be detected via tell-tale discrepancies in the aligned lengths of paired-end DNA sequencing reads. Quantitative aspects underlying this method remain poorly understood, despite its importance and conceptual simplicity. We report the statistical theory characterizing the length-discrepancy scheme for Gaussian libraries, including coverage-related effects that preceding models are unable to account for.

    Results: Deletion and insertion statistics both depend heavily on physical coverage, but otherwise differ dramatically, refuting a commonly held doctrine of symmetry. Specifically, coverage restrictions render insertions much more difficult to capture. Increased read length has the counterintuitive effect of worsening insertion detection characteristics of short inserts. Variance in library insert length is also a critical factor here and should be minimized to the greatest degree possible. Conversely, no significant improvement would be realized in lowering fosmid variances beyond current levels. Detection power is examined under a straightforward alternative hypothesis and found to be generally acceptable. We also consider the proposition of characterizing variation over the entire spectrum of variant sizes under constant risk of false-positive errors. At 1% risk, many designs will leave a significant gap in the 100 to 200 bp neighborhood, requiring unacceptably high redundancies to compensate. We show that a few modifications largely close this gap, and we give a few examples of feasible spectrum-covering designs.

    Conclusion: The theory resolves several outstanding issues and furnishes a general methodology for designing future projects from the standpoint of a spectrum-wide constant risk.
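
    To make the core length-discrepancy idea concrete, here is a minimal Python sketch, not the paper's exact statistics: a deletion of size d shifts each spanning pair's mapped insert length from the library mean mu to mu + d, so averaging over spanning pairs and testing against the Gaussian null gives a detection limit governed by sigma and physical coverage. All names are illustrative.

    ```python
    import math
    from statistics import NormalDist

    def discrepancy_z(obs_mean: float, mu: float, sigma: float, n_pairs: int) -> float:
        # Z-statistic for the mean mapped insert length of n_pairs read
        # pairs spanning a site, against a Gaussian library (mu, sigma).
        # Large positive values suggest a deletion; negative, an insertion.
        return (obs_mean - mu) / (sigma / math.sqrt(n_pairs))

    def min_detectable_size(sigma: float, n_pairs: int, alpha: float = 0.01) -> float:
        # Smallest shift that clears the one-sided critical value at
        # false-positive risk alpha; shows why library variance and
        # physical coverage dominate the detection limit.
        z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # ~2.33 at alpha = 0.01
        return z_alpha * sigma / math.sqrt(n_pairs)
    ```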

    The theory of discovering rare variants via DNA sequencing

    Get PDF
    Background: Rare population variants are known to have important biomedical implications, but their systematic discovery has only recently been enabled by advances in DNA sequencing. The design process of a discovery project remains formidable, being limited to ad hoc mixtures of extensive computer simulation and pilot sequencing. Here, the task is examined from a general mathematical perspective.

    Results: We pose and solve the population sequencing design problem and subsequently apply standard optimization techniques that maximize the discovery probability. Emphasis is placed on cases whose discovery thresholds place them within reach of current technologies. We find that parameter values characteristic of rare-variant projects lead to a general, yet remarkably simple set of optimization rules. Specifically, optimal processing occurs at constant values of the per-sample redundancy, refuting current notions that sample size should be selected outright. Optimal project-wide redundancy and sample size are then shown to be inversely proportional to the desired variant frequency. A second family of constants governs these relationships, permitting one to immediately establish the most efficient settings for a given set of discovery conditions. Our results largely concur with the empirical design of the Thousand Genomes Project, though they furnish some additional refinement.

    Conclusion: The optimization principles reported here dramatically simplify the design process and should be broadly useful as rare-variant projects become both more important and routine in the future.
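
    A minimal sketch of the underlying discovery probability, under simplifying assumptions not taken from the paper (Poisson per-allele coverage, heterozygous carriers, detection requiring at least k variant reads). Scanning per-sample redundancy r at a fixed total budget n*r illustrates the constant per-sample-redundancy optimum the abstract describes.

    ```python
    import math

    def detect_given_carrier(r: float, k: int = 2) -> float:
        # P(variant allele sampled >= k times) for a heterozygous
        # carrier, modeling per-allele depth as Poisson(r / 2).
        lam = r / 2.0
        return 1.0 - sum(math.exp(-lam) * lam**j / math.factorial(j)
                         for j in range(k))

    def discovery_prob(f: float, n: int, r: float, k: int = 2) -> float:
        # P(a variant at allele frequency f is found in >= 1 of n
        # samples); each sample carries it with probability ~2f
        # (rare-variant approximation).
        p_sample = 2.0 * f * detect_given_carrier(r, k)
        return 1.0 - (1.0 - p_sample) ** n

    # Example: fix the total redundancy budget and scan per-sample r.
    budget = 2000.0
    for r in (2.0, 4.0, 8.0, 16.0):
        n = int(budget / r)
        print(r, n, round(discovery_prob(0.005, n, r), 4))
    ```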

    Extension of Lander-Waterman theory for sequencing filtered DNA libraries

    Get PDF
    BACKGROUND: The degree to which conventional DNA sequencing techniques will be successful for highly repetitive genomes is unclear. Investigators are therefore considering various filtering methods to select against high-copy sequence in DNA clone libraries. The standard model for random sequencing, Lander-Waterman theory, does not account for two important issues in such libraries: discontinuities and position-based sampling biases (the so-called "edge effect"). We report an extension of the theory for analyzing such configurations.

    RESULTS: The edge effect cannot be neglected in most cases. Specifically, rates of coverage and gap reduction are appreciably lower than those for conventional libraries, as predicted by standard theory. Performance decreases as read length increases relative to island size. Although this is the opposite of what happens in a conventional library, the apparent paradox is readily explained in terms of the edge effect. The model agrees well with prototype gene-tagging experiments for Zea mays and Sorghum bicolor. Moreover, the associated density function suggests well-defined probabilistic milestones for the number of reads necessary to capture a given fraction of the gene space. An exception to applying standard theory arises if sequence redundancy is less than about 1-fold. Here, evolution of the random quantities is independent of library gaps and edge effects. This observation effectively validates the practice of using standard theory to estimate the genic enrichment of a library based on light shotgun sequencing.

    CONCLUSION: Coverage performance using a filtered library is significantly lower than that for an equivalent-sized conventional library, suggesting that directed methods may be more critical for the former. The proposed model should be useful for analyzing future projects.
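
    For contrast with the paper's extension, the unfiltered Lander-Waterman baseline is easy to state in code; filtered libraries with edge effects fall below these figures. A sketch of the standard formulas only, with R = NL/G the sequence redundancy:

    ```python
    import math

    def lw_covered_fraction(redundancy: float) -> float:
        # Standard Lander-Waterman expectation for the fraction of the
        # target covered at redundancy R = N*L/G: 1 - exp(-R).
        return 1.0 - math.exp(-redundancy)

    def lw_islands(n_reads: int, redundancy: float) -> float:
        # Expected number of islands (apparent contigs): N * exp(-R).
        return n_reads * math.exp(-redundancy)
    ```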

    Optimized signal deduction procedure for the MIEZE neutron spectroscopy technique

    Full text link
    We report a method to determine the phase and amplitude of sinusoidally modulated event rates binned into 4 bins per oscillation. The presented algorithm relies on a direct reconstruction of the unknown parameters. It omits a calculation-intensive fitting procedure and avoids contrast reduction due to averaging effects. It allows the current data-acquisition bottleneck to be relaxed by a factor of 4. Here, we explain the approach in detail and compare it to the established fitting procedures for time series having 4 and 16 time bins per oscillation. In addition, we present empirical estimates of the errors of the three methods and compare them to each other. We show that the reconstruction is unbiased, asymptotic, and efficient for estimating the phase. Reconstructing the contrast, which corresponds to the amplitude of the modulation, is roughly 10% less efficient than fitting oscillations with 16 time bins. Finally, we give analytical equations to estimate the errors of phase and contrast as a function of their initial values and counting statistics.

    Comment: 14 pages, 5 figures, submitted to IOP Measurement Science and Technology.
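
    The 4-bin reconstruction admits a compact closed form; here is a Python sketch of the general idea (assuming bin counts of the form B + A*sin(phi + k*pi/2) and a quarter-period integration window), not the paper's exact estimator or error model:

    ```python
    import math

    def reconstruct(y0: float, y1: float, y2: float, y3: float):
        # Differences of opposite bins isolate the quadrature components.
        a = y0 - y2                      # = 2*A*sin(phi)
        b = y1 - y3                      # = 2*A*cos(phi)
        mean = (y0 + y1 + y2 + y3) / 4.0
        amp = 0.5 * math.hypot(a, b)
        phase = math.atan2(a, b)
        # Integrating over a quarter-period bin damps the apparent
        # amplitude by sinc(pi/4) ~ 0.9003; dividing it out undoes the
        # averaging loss the abstract refers to.
        sinc = math.sin(math.pi / 4.0) / (math.pi / 4.0)
        contrast = (amp / mean) / sinc
        return phase, contrast
    ```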