10 research outputs found

Optimal Assembly for High Throughput Shotgun Sequencing

Author: Bresler Guy
Bresler Ma'ayan
Tse David
Publication date: 18/02/2013

We present a framework for the design of optimal assembly algorithms for shotgun sequencing under the criterion of complete reconstruction. We derive a lower bound on the read length and the coverage depth required for reconstruction in terms of the repeat statistics of the genome. Building on earlier works, we design a de Bruijn graph-based assembly algorithm which can achieve very close to the lower bound for repeat statistics of a wide range of sequenced genomes, including the GAGE datasets. The results are based on a set of necessary and sufficient conditions on the DNA sequence and the reads for reconstruction. The conditions can be viewed as the shotgun sequencing analogue of Ukkonen-Pevzner's necessary and sufficient conditions for Sequencing by Hybridization. Comment: 26 pages, 18 figures

arXiv.org e-Print Archive

PubMed Central

eScholarship - University of California

Telescoper: de novo assembly of highly repetitive regions.

Author: Bresler Ma'ayan
Chan Andrew H
Sheehan Sara
Song Yun S
Publication venue: eScholarship, University of California
Publication date: 01/01/2012

Motivation: With advances in sequencing technology, it has become faster and cheaper to obtain short-read data from which to assemble genomes. Although there has been considerable progress in the field of genome assembly, producing high-quality de novo assemblies from short-reads remains challenging, primarily because of the complex repeat structures found in the genomes of most higher organisms. The telomeric regions of many genomes are particularly difficult to assemble, though much could be gained from the study of these regions, as their evolution has not been fully characterized and they have been linked to aging. Results: In this article, we tackle the problem of assembling highly repetitive regions by developing a novel algorithm that iteratively extends long paths through a series of read-overlap graphs and evaluates them based on a statistical framework. Our algorithm, Telescoper, uses short- and long-insert libraries in an integrated way throughout the assembly process. Results on real and simulated data demonstrate that our approach can effectively resolve much of the complex repeat structures found in the telomeres of yeast genomes, especially when longer long-insert libraries are used. Availability: Telescoper is publicly available for download at sourceforge.net/p/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

PubMed Central

eScholarship - University of California

Haverford College: Haverford Scholarship

SMaSH: A Benchmarking Toolkit for Human Genome Variant Calling

Author: Bresler Ma'ayan
Curtis Kristal
Hartl Christopher
Jordan Michael I.
Liptrap Jesse
Newcomb Julie
Patterson David
Song Yun S.
Talwalkar Ameet
Terhorst Jonathan
Publication date: 05/01/2014

Motivation: Computational methods are essential to extract actionable information from raw sequencing data, and to thus fulfill the promise of next-generation sequencing technology. Unfortunately, computational tools developed to call variants from human sequencing data disagree on many of their predictions, and current methods to evaluate accuracy and computational performance are ad-hoc and incomplete. Agreement on benchmarking variant calling methods would stimulate development of genomic processing tools and facilitate communication among researchers. Results: We propose SMaSH, a benchmarking methodology for evaluating human genome variant calling algorithms. We generate synthetic datasets, organize and interpret a wide range of existing benchmarking data for real genomes, and propose a set of accuracy and computational performance metrics for evaluating variant calling methods on this benchmarking data. Moreover, we illustrate the utility of SMaSH to evaluate the performance of some leading single nucleotide polymorphism (SNP), indel, and structural variant calling algorithms. Availability: We provide free and open access online to the SMaSH toolkit, along with detailed documentation, at smash.cs.berkeley.edu

arXiv.org e-Print Archive

Crossref

PubMed Central

eScholarship - University of California

Distributed Approach to Maximizing Network Utility

Author: Bresler Ma'ayan
Publication date: 01/01/2006

Dataspace

Telescoper: de novo assembly of highly repetitive regions

Author: Bresler Ma'ayan
Publication date: 15/05/2020

Ezid

Towards Robust Multi-Layer Traffic Engineering: Optimization of Congestion Control and Routing

Author: Jennifer Rexford
Jiayue He
Ma'ayan Bresler
Mung Chiang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Telescoper: de novo assembly of highly repetitive regions.

Author: Bresler Ma'ayan
Chan Andrew H
Sheehan Sara
Song Yun S
Publication venue: eScholarship, University of California
Publication date: 01/09/2012

eScholarship - University of California

SMaSH: a benchmarking toolkit for human genome variant calling

Author: Bresler Ma'ayan
Curtis Kristal
Hartl Christopher
Jordan Michael I
Liptrap Jesse
Newcomb Julie
Patterson David
Song Yun S
Talwalkar Ameet
Terhorst Jonathan
Publication venue: eScholarship, University of California
Publication date: 01/10/2014

Motivation: Computational methods are essential to extract actionable information from raw sequencing data, and to thus fulfill the promise of next-generation sequencing technology. Unfortunately, computational tools developed to call variants from human sequencing data disagree on many of their predictions, and current methods to evaluate accuracy and computational performance are ad hoc and incomplete. Agreement on benchmarking variant calling methods would stimulate development of genomic processing tools and facilitate communication among researchers. Results: We propose SMaSH, a benchmarking methodology for evaluating germline variant calling algorithms. We generate synthetic datasets, organize and interpret a wide range of existing benchmarking data for real genomes and propose a set of accuracy and computational performance metrics for evaluating variant calling methods on these benchmarking data. Moreover, we illustrate the utility of SMaSH to evaluate the performance of some leading single-nucleotide polymorphism, indel and structural variant calling algorithms. Availability and implementation: We provide free and open access online to the SMaSH tool kit, along with detailed documentation, at smash.cs.berkeley.ed

eScholarship - University of California

Telescoper: de novo assembly of highly repetitive regions

Author: Alkan
Andrew H. Chan
Ariyaratne
Chaisson
Delcher
Drmanac
Earl
Gnerre
Harris
Iqbal
Kellis
Li
Ma'ayan Bresler
MacCallum
Margulies
McEachern
McKernan
Medvedev
Myers
Parrish
Peng
Pevzner
Rothberg
Salzberg
Sara Sheehan
Simpson
Simpson
Van Nieuwerburgh
Yun S. Song
Zerbino
Publication venue: 'Oxford University Press (OUP)'

Crossref

Optimal assembly for high throughput shotgun sequencing

Author: D Earl
David Tse
E Myers
E Ukkonen
ES Lander
EW Myers
GG Sutton
Guy Bresler
Iain Maccallum
J Gallant
Kececioglu D John
Koren Sergey
Ma'ayan Bresler
N Nagarajan
P Compeau
P Medvedev
P Medvedev
PA Pevzner
PA Pevzner
R Daniel
R Idury
RL Warren
SA Motahari
Salzberg L Steven
Sante Gnerre
Simpson T Jared
Wikipedia
X Huang
Y Peng
Publication venue: 'Springer Science and Business Media LLC'

Crossref