Generation of Artificial FASTQ Files to Evaluate the Performance of Next-Generation Sequencing Pipelines
<div><p>Pipelines for the analysis of Next-Generation Sequencing (NGS) data are generally composed of a set of publicly available software tools, configured together in order to map short reads of a genome and call variants. The fidelity of pipelines is variable. We have developed <em>ArtificialFastqGenerator</em>, which takes a reference genome sequence as input and outputs artificial paired-end FASTQ files containing Phred quality scores. Since these artificial FASTQs are derived from the reference genome, they provide a gold standard for read alignment and variant calling, thereby enabling the performance of any NGS pipeline to be evaluated. The user can customise DNA template/read length, the modelling of coverage based on GC content, whether to use real Phred base quality scores taken from existing FASTQ files, and whether to simulate sequencing errors. Detailed coverage and error summary statistics are output. Here we describe <em>ArtificialFastqGenerator</em> and illustrate its implementation in evaluating a typical bespoke NGS analysis pipeline under different experimental conditions. <em>ArtificialFastqGenerator</em> was released in January 2012. Source code, example files and binaries are freely available under the terms of the GNU General Public License v3.0 from <a href="https://sourceforge.net/projects/artfastqgen/">https://sourceforge.net/projects/artfastqgen/</a>.</p> </div>
The profile for regional GC content versus mean target coverage, produced by using the default settings for the relevant <i>ArtificialFastqGenerator</i> user parameters.
SIRs for leukemia in T1DM patients according to age at leukemia diagnosis.
<p>O, observed; SIR, standardized incidence ratio; CI, confidence interval; *P<0.05, **P<0.01.</p><p>Case numbers in the reference population ALL: 1733 (<10 y), 701 (10–20 y), 562 (21–50 y), 2996 (total); CLL: 55 (<10 y), 15 (10–20 y), 728 (21–50 y), 798 (total); AML: 301 (<10 y), 296 (10–20 y), 1506 (21–50 y), 2103 (total); CML: 88 (<10 y), 93 (10–20 y), 1201 (21–50 y), 1382 (total).</p>
SIRs for leukemia in T1D patients according to time at hospitalization for T1D and age at T1D diagnosis.
<p>O, observed; SIR, standardized incidence ratio; CI, confidence interval; *P<0.05, **P<0.01.</p>
Age-specific incidence of leukemia in T1D patients (symbols in three age bands, 0–9 y, 10–19 y and 20–49 y) compared to the Swedish background rate (solid lines in 10-year age bands).
<p>The case numbers for T1D patients are shown in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0039523#pone-0039523-t001" target="_blank">Table 1</a> and those for the background rates in the footnote to <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0039523#pone-0039523-t001" target="_blank">Table 1</a>.</p>
Additional file 6: of Impact of atopy on risk of glioma: a Mendelian randomisation study
Table S5. Simulation analyses. (XLSX 28 kb)
Additional file 5: of Impact of atopy on risk of glioma: a Mendelian randomisation study
Table S4. Range of odds ratios for which study had < 80% power, for each atopy-related trait (P = 0.05, two-sided). (XLSX 9 kb)
Additional file 7: of Impact of atopy on risk of glioma: a Mendelian randomisation study
Table S6. Inverse-variance weighting, maximum likelihood estimation, weighted median estimate, mode-based estimate and Mendelian randomisation-Egger test results for combined atopy-related instrumental variables and glioma subtypes. (XLSX 39 kb)
Additional file 4: of Impact of atopy on risk of glioma: a Mendelian randomisation study
Table S3. Percentage of variance explained by the combined sets of single nucleotide polymorphisms used as instrumental variables. (XLSX 33 kb)
Additional file 2: of Impact of atopy on risk of glioma: a Mendelian randomisation study
Table S1. Summary of the eight glioma genome-wide association studies. (XLSX 29 kb)