Search CORE

Simon Fraser University Institutional Repository

Nature Precedings

Circular RNAs: splicing’s enigma variations

Author: Capel B
Castell&oacute
Chen CY
Ebert MS
Hansen TB
Jeck WR
Matthias W Hentze
Mercer TR
Salzman J
Thomas Preiss
Wu Q
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

DecGPU: distributed error correction on massively parallel graphics processing units using CUDA and MPI

Author: B Langmead
B Schmidt
Bertil Schmidt
BH Bloom
Douglas L Maskell
DR Zerbino
E Lindholm
EW Myers
H Shi
H Shi
J Butler
J Nickolls
J Schröder
JC Dohm
JT Simpson
L Fan
L Salmela
MJ Chaisson
P Havlak
PA Pevzner
R Li
RL Warren
S Batzoglou
WR Jeck
X Huang
Y Liu
Y Liu
Y Liu
Y Liu
Yongchao Liu
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Abstract Background Next-generation sequencing technologies have led to the high-throughput production of sequence data (reads) at low cost. However, these reads are significantly shorter and more error-prone than conventional Sanger shotgun reads. This poses a challenge for the <it>de novo </it>assembly in terms of assembly quality and scalability for large-scale short read datasets. Results We present DecGPU, the first parallel and distributed error correction algorithm for high-throughput short reads (HTSRs) using a hybrid combination of CUDA and MPI parallel programming models. DecGPU provides CPU-based and GPU-based versions, where the CPU-based version employs coarse-grained and fine-grained parallelism using the MPI and OpenMP parallel programming models, and the GPU-based version takes advantage of the CUDA and MPI parallel programming models and employs a hybrid CPU+GPU computing model to maximize the performance by overlapping the CPU and GPU computation. The distributed feature of our algorithm makes it feasible and flexible for the error correction of large-scale HTSR datasets. Using simulated and real datasets, our algorithm demonstrates superior performance, in terms of error correction quality and execution speed, to the existing error correction algorithms. Furthermore, when combined with Velvet and ABySS, the resulting DecGPU-Velvet and DecGPU-ABySS assemblers demonstrate the potential of our algorithm to improve <it>de novo </it>assembly quality for <it>de</it>-<it>Bruijn</it>-graph-based assemblers. Conclusions DecGPU is publicly available open-source software, written in CUDA C++ and MPI. The experimental results suggest that DecGPU is an effective and feasible error correction algorithm to tackle the flood of short reads produced by next-generation sequencing technologies.</p

Springer - Publisher Connector

Statistically based splicing detection reveals neural enrichment and tissue-specific induction of circular RNA during human fetal development

Author: A Ivanov
A Kalsotra
A Kalsotra
A Matsumoto
A Roberts
AD Almeida
B Capel
CH Cho
Charles E. Murry
Chuan Jiang
CY Yu
D Liang
F Jafarifar
J Giudice
J Salzman
J Salzman
J Salzman
JA Gantz
JJ Chong
JJ Turunen
JM Westholm
JU Guo
Julia Salzman
Linda Szabo
Louise C. Laurent
LR Otake
LT Gehman
MA Laflamme
Mana M. Parast
MC Jordan
MC White
Nastaran Afari
Nathan J. Palpant
NF Lahens
NJ Palpant
P McCullagh
Peter L. Wang
PL Wang
R Ashwal-Fluss
RK Bradley
Robert Morey
S Hoffmann
S Markmiller
S Memczak
S Shen
S Starke
SL Paige
SY Ng
TB Hansen
TP van Gurp
WR Jeck
WR Jeck
X Lian
XF Li
XO Zhang
Y Gao
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results

Author: Andrey Rzhetsky
CS Keith
D Hernandez
David N. Kuhn
DM Church
DR Zerbino
F Sanger
G Narzisi
Isidore Rigoutsos
J Shendure
JA Reinhardt
JC Dohm
JR Miller
JR Miller
JT Simpson
K Mavromatis
Laxmi Parida
MJ Chaisson
ML Metzker
Niina Haiminen
R Blakesley
R Cronn
R Li
R Li
S Altschul
S DiGuistini
S Gnerre
S Gnerre
S Ossowski
S Rounsley
SL Salzberg
W Zhang
WR Jeck
Publication venue: Public Library of Science
Publication date: 01/01/2011
Field of study

Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole-genome assembly an appealing target application. In this paper we evaluate the feasibility of de novo genome assembly from short reads (≤100 nucleotides) through a detailed study involving genomic sequences of various lengths and origin, in conjunction with several of the currently popular assembly programs. Our extensive analysis demonstrates that, in addition to sequencing coverage, attributes such as the architecture of the target genome, the identity of the used assembly program, the average read length and the observed sequencing error rates are powerful variables that affect the best achievable assembly of the target sequence in terms of size and correctness

CiteSeerX

Sequence assembly using next generation sequencing data—challenges and solutions

Author: D Hernandez
DR Kelley
DR Zerbino
EA Rodland
EW Myers
F Sanger
Francis Y. L. Chin
H Leung
HCM Leung
Henry C. M. Leung
J Butler
JC Dohm
JT Simpson
K Salikhov
M Burrows
MJ Chaisson
MJ Chaisson
N Vyahhi
R Li
RL Warren
RW Holley
RW Holley
S. M. Yiu
W Fiers
W Min Jou
WR Jeck
Y Peng
Y Peng
Y Peng
Y Peng
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Assembly complexity of prokaryotic genomes using short reads

Author: A Guénoche
AR Rubinov
B Bollobás
B Haubold
C Smith
Carl Kingsford
D Gusfield
DH Huson
DR Zerbino
Dvan den Broek
E Myers
EW Myers
I Simon
J Butler
J Parkhill
JAA Quitzau
JC Dohm
JP Hutchinson
JP Hutchinson
M Antoniotti
M Margulies
Michael C Schatz
Mihai Pop
MJ Chaisson
MJ Chaisson
MS Waterman
N de Bruijn
N Whiteford
OG Troyanskaya
P Medvedev
PA Pevzner
PA Pevzner
R Barrangou
R Idury
S Batzoglou
T van Aardenne-Ehrenfest
TD Harris
WR Jeck
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Abstract Background De Bruijn graphs are a theoretical framework underlying several modern genome assembly programs, especially those that deal with very short reads. We describe an application of de Bruijn graphs to analyze the global repeat structure of prokaryotic genomes. Results We provide the first survey of the repeat structure of a large number of genomes. The analysis gives an upper-bound on the performance of genome assemblers for <it>de novo </it>reconstruction of genomes across a wide range of read lengths. Further, we demonstrate that the majority of genes in prokaryotic genomes can be reconstructed uniquely using very short reads even if the genomes themselves cannot. The non-reconstructible genes are overwhelmingly related to mobile elements (transposons, IS elements, and prophages). Conclusions Our results improve upon previous studies on the feasibility of assembly with short reads and provide a comprehensive benchmark against which to compare the performance of the short-read assemblers currently being developed.</p

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Digital Repository at the University of Maryland

Comparing De Novo Genome Assembly: The Long and Short of It

Author: A Phillippy
B Mishra
B Schmidt
Bud Mishra
C Alkan
C Aston
D Bryant
D Hernandez
D Schwartz
D Sommer
DR Zerbino
DR Zerbino
EW Myers
F Sanger
FR Blattner
G Narzisi
GG Sutton
Giuseppe Narzisi
IT Paulsen
J Butler
J Tarhio
JC Dohm
JC Mullikin
JM Kidd
JR Miller
JT Simpson
M Antoniotti
M Eppinger
M Hossain
M Wu
MJ Chaisson
P Green
P Medvedev
PA Pevzner
PN Ariyaratne
R Li
RL Warren
RW Hung
S Batzoglou
S Boisvert
S Gnerre
S Kim
S Kurtz
SL Salzberg
SR Gill
SS Hall
Stein Aerts
T Anantharaman
T Baba
TS Anantharaman
TS Anantharaman
WR Jeck
X Huang
X Huang
Publication venue: Public Library of Science
Publication date: 29/04/2011
Field of study

Recent advances in DNA sequencing technology and their focal role in Genome Wide Association Studies (GWAS) have rekindled a growing interest in the whole-genome sequence assembly (WGSA) problem, thereby, inundating the field with a plethora of new formalizations, algorithms, heuristics and implementations. And yet, scant attention has been paid to comparative assessments of these assemblers' quality and accuracy. No commonly accepted and standardized method for comparison exists yet. Even worse, widely used metrics to compare the assembled sequences emphasize only size, poorly capturing the contig quality and accuracy. This paper addresses these concerns: it highlights common anomalies in assembly accuracy through a rigorous study of several assemblers, compared under both standard metrics (N50, coverage, contig sizes, etc.) as well as a more comprehensive metric (Feature-Response Curves, FRC) that is introduced here; FRC transparently captures the trade-offs between contigs' quality against their sizes. For this purpose, most of the publicly available major sequence assemblers – both for low-coverage long (Sanger) and high-coverage short (Illumina) reads technologies – are compared. These assemblers are applied to microbial (Escherichia coli, Brucella, Wolbachia, Staphylococcus, Helicobacter) and partial human genome sequences (Chr. Y), using sequence reads of various read-lengths, coverages, accuracies, and with and without mate-pairs. It is hoped that, based on these evaluations, computational biologists will identify innovative sequence assembly paradigms, bioinformaticists will determine promising approaches for developing “next-generation” assemblers, and biotechnologists will formulate more meaningful design desiderata for sequencing technology platforms. A new software tool for computing the FRC metric has been developed and is available through the AMOS open-source consortium

Evaluating the Fidelity of De Novo Short Read Metagenomic Assembly Using Simulated Data

Author: A Brady
A Charuvaka
A Lopez-Bueno
AE Darling
Andrés Moya
D Hernandez
DB Jaffe
DB Rusch
DC Richter
DD Sommer
DH Haft
DH Huson
DR Zerbino
EW Myers
F Meyer
GG Sutton
GW Tyson
I Letunic
I Maccallum
J Laserson
J Qin
JA Huber
JC Dohm
JC Dohm
JC Wooley
JO Korbel
Jonathan H. Badger
JR Miller
JR Miller
JT Simpson
K Liolios
K Mavromatis
KE Wommack
L Krause
M de la Bastide
M Margulies
M Pop
M Stark
M Wu
Miguel Pignatelli
MJ Chaisson
NN Diaz
OU Nalbantoglu
PJ Turnbaugh
PJ Turnbaugh
R Li
R Seshadri
RD Finn
RL Tatusov
RL Warren
S Batzoglou
S Levy
S Yooseph
SM Huse
SR Gill
T Schoenfeld
TS Ghosh
VM Markowitz
WJ Kent
WR Jeck
X Huang
X Huang
Y Ye
Publication venue: Public Library of Science
Publication date: 23/05/2011
Field of study

A frequent step in metagenomic data analysis comprises the assembly of the sequenced reads. Many assembly tools have been published in the last years targeting data coming from next-generation sequencing (NGS) technologies but these assemblers have not been designed for or tested in multi-genome scenarios that characterize metagenomic studies. Here we provide a critical assessment of current de novo short reads assembly tools in multi-genome scenarios using complex simulated metagenomic data. With this approach we tested the fidelity of different assemblers in metagenomic studies demonstrating that even under the simplest compositions the number of chimeric contigs involving different species is noticeable. We further showed that the assembly process reduces the accuracy of the functional classification of the metagenomic data and that these errors can be overcome raising the coverage of the studied metagenome. The results presented here highlight the particular difficulties that de novo genome assemblers face in multi-genome scenarios demonstrating that these difficulties, that often compromise the functional classification of the analyzed data, can be overcome with a high sequencing effort

The biomarkers of immune dysregulation and inflammation response in Parkinson disease

Author: A Hurny
A Kumar
A Louveau
AJ Svahn
C Cebrián
CH Hsu
CH Stevens
CJ Wheeler
DA Ferrington
E Krüger
ED Ponomarev
F Odoardi
G Wang
GA Smith
H Dong
H González
H Wang
J Neefjes
J Trinh
JA Santiago
JAH Saunders
K Wakabayashi
L Alvarez-Erviti
L Wei
L-M Zhang
M Abe
M Barkhuizen
M Mishto
M Mo
M Orre
MA Nalls
MM Mouradian
MT Heneka
P Zhang
R Saba
RB Postuma
RL Mosley
S Zhang
SC Starossom
SL Chan
T Pringsheim
U Dettmer
V Sanchez-Guajardo
V Sanchez-Guajardo
W Peelaerts
WM Liu
WR Jeck
Y Tsuboi
Y Wei
Á Camarena
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study