
44 research outputs found

R50 and N50 values with different k-mer sizes.

Author: Hideaki Koike (410323)
Hiroko Hagiwara (405278)
Itaru Takeda (410321)
Masayuki Machida (410324)
Myco Umemura (410319)
Tsutomu Ikegami (410322)
Yoshinori Koyama (410320)

(a) The R50 and (b) the N50 values for lib2.8.qv10 (closed square) and lib1.9.qv10 (open square). The R50 value corresponds to N50 using sequence fragments of the reference genome covered by highly accurate sequences of assembled scaffolds.

FigShare

Fine De Novo Sequencing of a Fungal Genome Using only SOLiD Short Read Data: Verification on Aspergillus oryzae RIB40

Author: Hideaki Koike (410323)
Hiroko Hagiwara (405278)
Itaru Takeda (410321)
Masayuki Machida (410324)
Myco Umemura (410319)
Tsutomu Ikegami (410322)
Yoshinori Koyama (410320)
Publication date: 07/05/2013

The development of next-generation sequencing (NGS) technologies has dramatically increased the throughput, speed, and efficiency of genome sequencing. The short read data generated from NGS platforms, such as SOLiD and Illumina, are quite useful for mapping analysis. However, SOLiD read data with lengths of <60 bp have been considered too short for de novo genome sequencing. Here, to investigate whether de novo sequencing of fungal genomes is possible using only SOLiD short read sequence data, we performed de novo assembly of the Aspergillus oryzae RIB40 genome using only SOLiD read data of 50 bp generated from mate-paired libraries with 2.8- or 1.9-kb insert sizes. The assembled scaffolds showed an N50 value of 1.6 Mb, 22-fold higher than those obtained using only SOLiD short reads in other published reports. In addition, almost 99% of the reference genome was accurately aligned with long assembled scaffold fragments. The sequences of secondary metabolite biosynthetic genes and clusters, whose products are of considerable interest in fungal studies due to their potential medicinal, agricultural, and cosmetic properties, were also well reconstructed in the assembled scaffolds. Based on these findings, we concluded that de novo genome sequencing using only SOLiD short reads is feasible and practical for molecular biological study of fungi. We also investigated the effects of filtering low-quality data, library insert size, and k-mer size on assembly performance, and recommend using mildly filtered read data, for which the N50 was not substantially degraded, a library with an insert size of ∼2.0 kb, and a k-mer size of 33.
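
For readers unfamiliar with the N50 statistic cited in the abstract, the following is a minimal sketch of how it is commonly computed from a list of scaffold lengths. It uses the usual definition (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembled length). The function name `n50` and the example lengths are illustrative assumptions, not taken from the paper, and the authors' own tooling may differ.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed (common) definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches at least
    half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths in bp, for illustration only.
example_lengths = [2, 3, 4, 5, 6]
print(n50(example_lengths))
```

The R50 described in the figure caption applies the same calculation to reference-genome fragments covered by accurately assembled sequence rather than to the scaffolds themselves.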

Directory of Open Access Journals

PubMed Central

FigShare

Number of ORFs reproduced in the assemblies.

Author: Hiroko Hagiwara (405278)
Isao Kojima (731067)
Kiyoshi Asai (92186)
Masayuki Machida (410324)
Myco Umemura (410319)
Toyohiro Inatsugi (731066)
Tsutomu Ikegami (410322)

a Number of ORFs aligned with e-values lower than 10^−100. b Total number of ORFs.

FigShare

Libraries, data filtering, and k-mer size used for the series of de novo assemblies.

Author: Hideaki Koike (410323)
Hiroko Hagiwara (405278)
Itaru Takeda (410321)
Masayuki Machida (410324)
Myco Umemura (410319)
Tsutomu Ikegami (410322)
Yoshinori Koyama (410320)

a Standard deviation of the library insert size. b Meaning all bases have QVs of >10, or 90% base-level accuracy.

FigShare

Dotplot alignments of assembled strands against the reference genome sequence of S. avermitilis.

Author: Hiroko Hagiwara (405278)
Isao Kojima (731067)
Kiyoshi Asai (92186)
Masayuki Machida (410324)
Myco Umemura (410319)
Toyohiro Inatsugi (731066)
Tsutomu Ikegami (410322)

Alignments shorter than 4000 bp were omitted from the plots. Forward and reverse alignments are plotted in red and blue, respectively. (a) The MSSH assembly, (b) the HHHH assembly.

FigShare

Characteristics of the A. oryzae^a contigs/scaffolds/strands from several assemblies^b.

Author: Hiroko Hagiwara (405278)
Isao Kojima (731067)
Kiyoshi Asai (92186)
Masayuki Machida (410324)
Myco Umemura (410319)
Toyohiro Inatsugi (731066)
Tsutomu Ikegami (410322)

a S. avermitilis results are omitted due to the erroneous SOLiD reads. b k-mer size is fixed at 45.

FigShare

Numbers of nucleotide insertions and deletions in assembled sequences aligned to the reference sequence.

Author: Hideaki Koike (410323)
Hiroko Hagiwara (405278)
Itaru Takeda (410321)
Masayuki Machida (410324)
Myco Umemura (410319)
Tsutomu Ikegami (410322)
Yoshinori Koyama (410320)

FigShare

Numbers of misjoin, inversion, deletion, and insertion, and total sizes of deletion and insertion (>500 bp).

Author: Hideaki Koike (410323)
Hiroko Hagiwara (405278)
Itaru Takeda (410321)
Masayuki Machida (410324)
Myco Umemura (410319)
Tsutomu Ikegami (410322)
Yoshinori Koyama (410320)

FigShare

Statistical information of input short read data.

Author: Hiroko Hagiwara (405278)
Isao Kojima (731067)
Kiyoshi Asai (92186)
Masayuki Machida (410324)
Myco Umemura (410319)
Toyohiro Inatsugi (731066)
Tsutomu Ikegami (410322)

a Insertion length is estimated by mapping paired reads on the assembled results. b Total length of the reference sequences [26, 29]. c SOLiD data are down-sampled at NlowQ = 25 (qv25) and 6 (qv6).

FigShare

Characteristic indices of the strands from several assemblies.

Author: Hiroko Hagiwara (405278)
Isao Kojima (731067)
Kiyoshi Asai (92186)
Masayuki Machida (410324)
Myco Umemura (410319)
Toyohiro Inatsugi (731066)
Tsutomu Ikegami (410322)

FigShare
