185 research outputs found

QSRA – a quality-value guided de novo short read assembler

Author: D Hernandez
Douglas W Bryant
DR Zerbino
J Butler
J Dohm
J Kent
MJ Chaisson
NG de Bruijn
R Cronn
R Warren
Todd C Mockler
W Jeck
Weng-Keen Wong
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: New rapid high-throughput sequencing technologies have sparked the creation of a new class of assembler. Since all high-throughput sequencing platforms incorporate errors in their output, short-read assemblers must be designed to account for this error while utilizing all available data. Results: We have designed and implemented an assembler, Quality-value guided Short Read Assembler (QSRA), created to take advantage of quality-value scores as a further method of dealing with error. Compared to previously published algorithms, our assembler shows significant improvements not only in speed but also in output quality. Conclusion: QSRA generally produced the highest genomic coverage, while being faster than VCAKE. QSRA is extremely competitive in its longest contig and N50/N80 contig lengths, producing results of similar quality to those of EDENA and VELVET. QSRA provides a step closer to the goal of de novo assembly of complex genomes, improving upon the original VCAKE algorithm by not only drastically reducing runtimes but also increasing the viability of the assembly algorithm through further error handling capabilities.
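
The N50/N80 contig lengths mentioned in this abstract are standard assembly summary statistics. As a minimal sketch (not taken from the QSRA paper; the function name `nxx` and the sample lengths are hypothetical), they can be computed by sorting contig lengths in descending order and returning the length at which the running total first reaches the chosen percentage of all assembled bases:

```python
def nxx(lengths, percent=50):
    """Return the Nxx contig length for a list of contig lengths.

    Nxx is the length L such that contigs of length >= L together
    contain at least `percent` percent of all assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= percent * total:
            return length


# Hypothetical contig lengths, for illustration only.
contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
n50 = nxx(contigs, 50)
n80 = nxx(contigs, 80)
```

A higher N50 indicates that more of the assembly sits in long contigs, but it says nothing about correctness, so it is usually reported alongside coverage and the longest contig, as this abstract does.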

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

De novo assembly using low-coverage short read sequence data from the rice pathogen Pseudomonas syringae pv. oryzae

Author: Baltrus D. A.
Dangl J. L.
Jeck W. R.
Jones C. D.
Nishimura M. T.
Reinhardt J. A.
Publication date: 01/01/2008

We developed a novel approach for de novo genome assembly using only sequence data from high-throughput short read sequencing technologies. By combining data generated from the 454 Life Sciences (Roche) and Illumina (formerly Solexa) sequencing platforms, we reliably assembled genomes into large scaffolds at a fraction of the traditional cost and without use of a reference sequence. We applied this method to two isolates of the phytopathogenic bacterium Pseudomonas syringae. Sequencing and reassembly of the well-studied tomato and Arabidopsis pathogen, PtoDC3000, facilitated development and testing of our method. Sequencing of a distantly related rice pathogen, Por1_6, demonstrated our method's efficacy for de novo assembly of novel genomes. Our assembly of Por1_6 yielded an N50 scaffold size of 531,821 bp with >75% of the predicted genome covered by scaffolds over 100,000 bp. One of the critical phenotypic differences between strains of P. syringae is the range of plant hosts they infect. This is largely determined by their complement of type III effector proteins. The genome of Por1_6 is the first sequenced for a P. syringae isolate that is a pathogen of monocots, and, as might be predicted, its complement of type III effectors differs substantially from the previously sequenced isolates of this species. The genome of Por1_6 helps to define an expansion of the P. syringae pan-genome, a corresponding contraction of the core genome, and a further diversification of the type III effector complement for this important plant pathogen species.

Carolina Digital Repository

Circular RNAs are abundant, conserved, and associated with ALU repeats

Author: Burd C. E.
Jeck W. R.
Liu J.
Marzluff W. F.
Sharpless N. E.
Slevin M. K.
Sorrentino J. A.
Wang K.
Publication date: 01/01/2013

Circular RNAs composed of exonic sequence have been described in a small number of genes. Thought to result from splicing errors, circular RNA species possess no known function. To delineate the universe of endogenous circular RNAs, we performed high-throughput sequencing (RNA-seq) of libraries prepared from ribosome-depleted RNA with or without digestion with the RNA exonuclease, RNase R. We identified >25,000 distinct RNA species in human fibroblasts that contained non-colinear exons (a “backsplice”) and were reproducibly enriched by exonuclease degradation of linear RNA. These RNAs were validated as circular RNA (ecircRNA), rather than linear RNA, and were more stable than associated linear mRNAs in vivo. In some cases, the abundance of circular molecules exceeded that of associated linear mRNA by >10-fold. By conservative estimate, we identified ecircRNAs from 14.4% of actively transcribed genes in human fibroblasts. Application of this method to murine testis RNA identified 69 ecircRNAs in precisely orthologous locations to human circular RNAs. Of note, paralogous kinases HIPK2 and HIPK3 produce abundant ecircRNA from their second exon in both humans and mice. Though HIPK3 circular RNAs contain an AUG translation start, it and other ecircRNAs were not bound to ribosomes. Circular RNAs could be degraded by siRNAs and, therefore, may act as competing endogenous RNAs. Bioinformatic analysis revealed shared features of circularized exons, including long bordering introns that contained complementary ALU repeats. These data show that ecircRNAs are abundant, stable, conserved and nonrandom products of RNA splicing that could be involved in control of gene expression

PubMed Central

Carolina Digital Repository

Analysis of quality raw data of second generation sequencers with Quality Assessment Software

Author: A Smith
Adriana R Carneiro
Artur Silva
B Ewing
D Gordon
D Hernandez
DR Zerbino
DW Bryant
E Lande
H Li
J Butler
J Dohm
Jan Baumbach
M Chaisson
Maria PC Schneider
Rommel TJ Ramos
S Bentley
SC Schuster
V Pandey
Vasco Azevedo
W Jeck
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Second generation technologies have advantages over Sanger; however, they have resulted in new challenges for the genome construction process, especially because of the small size of the reads, despite the high degree of coverage. Independent of the program chosen for the construction process, DNA sequences are superimposed, based on identity, to extend the reads, generating contigs; mismatches indicate a lack of homology and are not included. This process improves our confidence in the sequences that are generated. Findings: We developed Quality Assessment Software, with which one can review graphs showing the distribution of quality values from the sequencing reads. This software allows us to adopt more stringent quality standards for sequence data, based on quality-graph analysis and estimated coverage after applying the quality filter, providing acceptable sequence coverage for genome construction from short reads. Conclusions: Quality filtering is a fundamental step in the process of constructing genomes, as it reduces the frequency of incorrect alignments that are caused by measuring errors, which can occur during the construction process due to the size of the reads, provoking misassemblies. Application of quality filters to sequence data, using the software Quality Assessment, along with graphing analyses, provided greater precision in the definition of cutoff parameters, which increased the accuracy of genome construction.
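
The idea of applying a quality filter and then checking the estimated coverage that remains can be illustrated with a short sketch. This is not the Quality Assessment Software itself: the function names, the mean-quality cutoff of 20, the Phred+33 encoding, and the sample reads are all assumptions chosen for illustration.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of a read, assuming Phred+33 ASCII encoding."""
    scores = [ord(char) - offset for char in quality_string]
    return sum(scores) / len(scores)


def filter_reads(reads, min_mean_quality=20):
    """Keep (sequence, quality) pairs whose mean Phred score meets the cutoff."""
    return [
        (seq, qual)
        for seq, qual in reads
        if qual and mean_phred(qual) >= min_mean_quality
    ]


def estimated_coverage(reads, genome_size):
    """Estimated coverage: total retained bases divided by genome size."""
    return sum(len(seq) for seq, _ in reads) / genome_size


# Hypothetical reads: 'I' encodes Q40, '5' encodes Q20, '!' encodes Q0.
reads = [("ACGT", "IIII"), ("ACGT", "!!!!"), ("ACGTAC", "5555II")]
kept = filter_reads(reads, min_mean_quality=20)
coverage = estimated_coverage(kept, genome_size=5)
```

Raising the cutoff discards more reads, so the estimated coverage after filtering shows whether a stricter threshold still leaves enough data for assembly.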

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results

Author: Andrey Rzhetsky
CS Keith
D Hernandez
David N. Kuhn
DM Church
DR Zerbino
F Sanger
G Narzisi
Isidore Rigoutsos
J Shendure
JA Reinhardt
JC Dohm
JR Miller
JT Simpson
K Mavromatis
Laxmi Parida
MJ Chaisson
ML Metzker
Niina Haiminen
R Blakesley
R Cronn
R Li
S Altschul
S DiGuistini
S Gnerre
S Ossowski
S Rounsley
SL Salzberg
W Zhang
WR Jeck
Publication venue: Public Library of Science
Publication date: 01/01/2011

Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of ever larger numbers of usable sequences per instrument-run continue to make whole-genome assembly an appealing target application. In this paper we evaluate the feasibility of de novo genome assembly from short reads (≤100 nucleotides) through a detailed study involving genomic sequences of various lengths and origin, in conjunction with several of the currently popular assembly programs. Our extensive analysis demonstrates that, in addition to sequencing coverage, attributes such as the architecture of the target genome, the choice of assembly program, the average read length and the observed sequencing error rates are powerful variables that affect the best achievable assembly of the target sequence in terms of size and correctness.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Sequence assembly using next generation sequencing data—challenges and solutions

Author: D Hernandez
DR Kelley
DR Zerbino
EA Rodland
EW Myers
F Sanger
Francis Y. L. Chin
H Leung
HCM Leung
Henry C. M. Leung
J Butler
JC Dohm
JT Simpson
K Salikhov
M Burrows
MJ Chaisson
N Vyahhi
R Li
RL Warren
RW Holley
S. M. Yiu
W Fiers
W Min Jou
WR Jeck
Y Peng
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Bartter- and Gitelman-like syndromes: salt-losing tubulopathies with loop or DCT defects

Author: A Bettinelli
A Caltik
A Chrispal
A Deng
A Fanconi
A Leonhardt
A Ohlsson
A Rudin
A Stolpe van de
AD Schachter
BK Kramer
C Malafronte
C Seidel
CA Pressler
D Bockenhauer
DB Simon
DG Bichet
DN Cruz
DR Phillips
DS Bell
E Puricelli
E Riveira-Munoz
F Bartter
F Emma
G Colussi
G Finer
G Talosi
H Seyberth
Hannsjörg W. Seyberth
HJ Gitelman
HW Seyberth
I Kurtz
I Zelikovic
J Peti-Peterdi
J Rodriguez-Soriano
J Schnermann
JW Hollifield
K Brochard
K Nozu
K Panichpisal
Karl P. Schlingmann
KG Hufnagle
KP Schlingmann
L Casatta
L Karolyi
LC Liaw
M Ito
M Konrad
M Kömhoff
M Peters
MC Sassen
MH Vaisbich
MR Pollak
N Besbas
N Jeck
NV Knoers
R Birkenhager
R Kleta
R Scognamiglio
R Vargas-Poussou
RJ Hene
RJ Unwin
RM Nüsing
RT Pachulski
S Reinalter
S Watanabe
SC Hebert
SC Reinalter
SH Lin
SJ Schurman
T Nijenhuis
TJ Smilde
TM Yu
U Gladziwa
UI Scholl
V Chadha
Publication venue: Springer-Verlag

Salt-losing tubulopathies with secondary hyperaldosteronism (SLT) comprise a set of well-defined inherited tubular disorders. Two segments along the distal nephron are primarily involved in the pathogenesis of SLTs: the thick ascending limb of Henle’s loop, and the distal convoluted tubule (DCT). The functions of these pre- and postmacula densa segments are quite distinct, and this has a major impact on the clinical presentation of loop and DCT disorders – the Bartter- and Gitelman-like syndromes. Defects in the water-impermeable thick ascending limb, with its greater salt reabsorption capacity, lead to major salt and water losses similar to the effect of loop diuretics. In contrast, defects in the DCT, with its minor capacity of salt reabsorption and its crucial role in fine-tuning of urinary calcium and magnesium excretion, provoke more chronic solute imbalances similar to the effects of chronic treatment with thiazides. The most severe disorder is a combination of a loop and DCT disorder similar to the enhanced diuretic effect of a co-medication of loop diuretics with thiazides. Besides salt and water supplementation, prostaglandin E2-synthase inhibition is the most effective therapeutic option in polyuric loop disorders (e.g., pure furosemide and mixed furosemide–amiloride type), especially in preterm infants with severe volume depletion. In DCT disorders (e.g., pure thiazide and mixed thiazide–furosemide type), renin–angiotensin–aldosterone system (RAAS) blockers might be indicated after salt, potassium, and magnesium supplementation are deemed insufficient. It appears that in most patients with SLT, a combination of solute supplementation with some drug treatment (e.g., indomethacin) is needed for a lifetime

Crossref

PubMed Central

NMR study of local magnetizations in diluted two-dimensional antiferromagnets

Author: A. F. M. Arts
B. J. Dikken
C. Dekker
D. J. Breed
H. Ikeda
H. van der Vlist
H. W. de Wijn
J. A. van Luijk
J. P. Gosso
K. N. Shrivastava
L. Onsager
R. A. Cowley
R. J. Birgeneau
R. K. Jeck
Publication venue: 'American Physical Society (APS)'

Crossref

Mutations in isocitrate dehydrogenase 1 and 2 occur frequently in intrahepatic cholangiocarcinomas and share hypermethylation targets with glioblastomas

Author: Andersen J. B.
Auman J. T.
Chiang D. Y.
Cibulskis K.
Dong Q.
Getz G.
Guan K. L.
Hoskins J. M.
Hunt H. V.
Jeck W. R.
Jiang W.
Kim J. W.
Kuan P. F.
Liu Y.
Misher A. D.
Moser C. D.
Qin L. X.
Roberts L. R.
Savich G. L.
Tan T. X.
Thorgeirsson S. S.
Wang P.
Xiong Y.
Ye D.
Yourstone S. M.
Zhang C.
Publication date: 01/01/2013

Mutations in the genes encoding isocitrate dehydrogenase, IDH1 and IDH2, have been reported in gliomas, myeloid leukemias, chondrosarcomas, and thyroid cancer. We discovered IDH1 and IDH2 mutations in 34 of 326 (10%) intrahepatic cholangiocarcinomas. Tumors with mutations in IDH1 or IDH2 had lower 5-hydroxymethylcytosine (5hmC) and higher 5-methylcytosine (5mC) levels, as well as increased dimethylation of histone H3K79. Mutations in IDH1 or IDH2 were associated with longer overall survival (p = 0.028) and were independently associated with a longer time to tumor recurrence after intrahepatic cholangiocarcinoma resection in multivariate analysis (p = 0.021). IDH1 and IDH2 mutations are significantly associated with increased levels of p53 in intrahepatic cholangiocarcinomas, but no mutations in the p53 gene were found, suggesting that mutations in IDH1 and IDH2 may cause a stress that leads to p53 activation. We identified 2,309 genes that were significantly hypermethylated in 19 cholangiocarcinomas with mutations in IDH1 or IDH2, compared with cholangiocarcinomas without these mutations. Hypermethylated CpG sites were significantly enriched in CpG shores and upstream of transcription start sites, suggesting a global regulation of transcriptional potential. Half of the hypermethylated genes overlapped with DNA hypermethylation in IDH1-mutant glioblastomas, suggesting the existence of a common set of genes whose expression may be affected by mutations in IDH1 or IDH2 in different types of tumors.

PubMed Central

Carolina Digital Repository

A Cytoplasmic Domain Mutation in ClC-Kb Affects Long-Distance Communication Across the Membrane

Author: A Bateman
A Hayama
A Heils
AM Engh
B Schwappach
C Kubisch
CL Beck
CW Lin
E Cleiren
E Stogmann
G Carr
G Zifarelli
Gilbert Q. Martinez
I Zelikovic
J Rodriguez-Soriano
JP Cox
K Haug
K Matulef
K Nozu
L Mo
M Janosik
M Konrad
M Maduke
M Poet
M Proudfoot
M Pusch
MC Koch
MD Miller
Merritt Maduke
MI Niemeyer
MM Mupanomunda
N Jeck
N Piwon
P Day
P Fong
R Estevez
R Vargas-Poussou
R Zhang
S Fukuyama
S Ignoul
S Markovic
S Meyer
S Uchida
S Waldegger
S Watanabe
SC Hebert
SE Lloyd
Shuguang Zhang
SS Wang
T Daniel
T Schmidt-Rose
TJ Jentsch
U Kornak
W Gunther
Y Kondo
Publication venue: Public Library of Science

BACKGROUND: ClC-Kb and ClC-Ka are homologous chloride channels that facilitate chloride homeostasis in the kidney and inner ear. Disruption of ClC-Kb leads to Bartter's Syndrome, a kidney disease. A point mutation in ClC-Kb, R538P, linked to Bartter's Syndrome and located in the C-terminal cytoplasmic domain was hypothesized to alter electrophysiological properties due to its proximity to an important membrane-embedded helix. METHODOLOGY/PRINCIPAL FINDINGS: Two-electrode voltage clamp experiments were used to examine the electrophysiological properties of the mutation R538P in both ClC-Kb and ClC-Ka. R538P selectively abolishes extracellular calcium activation of ClC-Kb but not ClC-Ka. In attempting to determine the reason for this specificity, we hypothesized that the ClC-Kb C-terminal domain had either a different oligomeric status or dimerization interface than that of ClC-Ka, for which a crystal structure has been published. We purified a recombinant protein corresponding to the ClC-Kb C-terminal domain and used multi-angle light scattering together with a cysteine-crosslinking approach to show that the dimerization interface is conserved between the ClC-Kb and ClC-Ka C-terminal domains, despite the fact that there are several differences in the amino acids that occur at this interface. CONCLUSIONS: The R538P mutation in ClC-Kb, which leads to Bartter's Syndrome, abolishes calcium activation of the channel. This suggests that a significant conformational change, ranging from the cytoplasmic side of the protein to the extracellular side of the protein, is involved in the Ca(2+)-activation process for ClC-Kb, and shows that the cytoplasmic domain is important for the channel's electrophysiological properties. In the highly similar ClC-Ka (90% identical), the R538P mutation does not affect activation by extracellular Ca(2+). This selective outcome indicates that ClC-Ka and ClC-Kb differ in how conformational changes are translated to the extracellular domain, despite the fact that the cytoplasmic domains share the same quaternary structure.

Crossref

Directory of Open Access Journals

PubMed Central