
    Calling accuracy decreases with homopolymer length.

    <p>Lines show mean accuracy for each kit by reference homopolymer length, across bases 10–100 and bases 10–200; the latter range is relevant only to the two 200 bp kits.</p>

    Relationship between base position and error rate for homopolymer (over-call/under-call) versus substitution errors.

    <p>Panel (a) shows the homopolymer error rate (insertions plus deletions) by read base position, and panel (b) shows the substitution error rate by base position. Each line is the raw mean error rate for a single dataset, with the kit and species as specified by the colour key.</p>

    Relationship between G+C% and the observed mean coverage for 100 bp bins in the reference genome.

    <p>Panel (a) is a boxplot of the distribution of the square-root normalised mean read depth across the 100 bp windows for each reference genome, broken down further by sequencing kit and G+C% bin. The coverage for each run was normalised by the mean coverage, so the boxplots show the square-root fold-change from the mean genomic coverage for each combination of species, kit and G+C% bin. Thus a value of 2 means the coverage was four times the mean for that sequencing run. The boxes display the central 50% of the values in each treatment, with the median represented by the solid black horizontal bar. The whiskers each extend for 1.5× the inter-quartile range, and the black dots represent extreme individual observations that fall outside this range. The variability observed in the high G+C bins is likely due to the small sample size for these G+C regions, shown in panel (b). The outliers are potentially due to repetitive content in the genome that our perfect-match repeat approach failed to mask.</p>
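
The normalisation described in this caption can be illustrated with a minimal sketch. It assumes the plotted value is sqrt(window depth / run-wide mean depth), which is consistent with the statement that a value of 2 corresponds to four times the mean. The function names and the toy depths are hypothetical and are not taken from the paper.

```python
import math

def gc_percent(window_seq):
    """G+C% of one reference window (a plain DNA string)."""
    seq = window_seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)

def sqrt_fold_change(window_depths):
    """Square-root fold-change of each window's mean depth
    relative to the mean depth over all windows of the run."""
    run_mean = sum(window_depths) / len(window_depths)
    return [math.sqrt(depth / run_mean) for depth in window_depths]

# Toy run of four windows: the first window has four times the run mean.
toy_depths = [80.0, 0.0, 0.0, 0.0]
toy_values = sqrt_fold_change(toy_depths)
```

Here the run mean is 20, so the first window (depth 80) has a fold-change of 4 and a plotted value of 2, matching the interpretation given in the caption.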

    Ion Torrent quality scores versus empirically estimated quality score for base.

    <p>The grey cloud surrounding the LOESS smoother function indicates the 95% confidence interval for the conditional mean. Individual observations for each quality are plotted as black points.</p>
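
As a rough illustration of how an empirical quality score might be estimated for each reported score, the sketch below applies the standard Phred relation Q = -10·log10(error rate) to bases grouped by their reported quality. The caption does not describe the paper's estimation procedure, so the grouping scheme, the function names and the toy observations are all assumptions.

```python
import math
from collections import defaultdict

def empirical_quality(n_errors, n_bases):
    """Phred-scaled quality from an observed error count."""
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    if n_errors == 0:
        return math.inf  # no observed errors: quality is unbounded
    return -10.0 * math.log10(n_errors / n_bases)

def empirical_by_reported(observations):
    """observations: iterable of (reported_q, is_error) pairs,
    one per aligned base. Returns {reported_q: empirical_q}."""
    counts = defaultdict(lambda: [0, 0])  # reported_q -> [errors, bases]
    for reported_q, is_error in observations:
        counts[reported_q][0] += int(is_error)
        counts[reported_q][1] += 1
    return {q: empirical_quality(e, n) for q, (e, n) in sorted(counts.items())}

# Toy data: 100 bases reported at Q20, one of which is an error.
toy_obs = [(20, True)] + [(20, False)] * 99
toy_result = empirical_by_reported(toy_obs)
```

In this toy case the observed error rate is 1 in 100, so the empirical score equals the reported Q20; systematic departures from that diagonal are what a plot of reported versus empirical quality would reveal.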

    Mean rates of insertion, deletion and substitution errors across the three sequencing kits.

    <p>Each boxplot shows the distribution of error rates for the specified type across the runs for the specified kit (species are aggregated).</p>

    Examples of over-call/under-call errors in homopolymers of length less than 2.

    <p>By aligning the read (derived from the rounded flow-values) and its corresponding reference sequence (considered the ‘true’ sequence) at the flow level, we can identify examples of over-calling a zero-length homopolymer (Flow Cycle #2) and under-calling a one-length homopolymer (Flow Cycle #6). Flow Cycle #5 demonstrates a zero-length homopolymer being correctly called as zero.</p>
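
The flow-level comparison described in this caption can be sketched as follows, assuming the read and reference are already aligned flow by flow and that each flow value is rounded to the nearest integer to give the called homopolymer length. The function name and the flow values are hypothetical; they are chosen only to reproduce the pattern in the caption (over-call at flow 2, correct zero at flow 5, under-call at flow 6).

```python
def classify_flows(flow_values, reference_lengths):
    """Label each flow as 'correct', 'over-call' or 'under-call'.

    flow_values: non-negative raw flow signals for the read.
    reference_lengths: true homopolymer length expected at each flow.
    """
    if len(flow_values) != len(reference_lengths):
        raise ValueError("read and reference must cover the same flows")
    labels = []
    for value, true_length in zip(flow_values, reference_lengths):
        called = int(value + 0.5)  # round half up (values are non-negative)
        if called > true_length:
            labels.append("over-call")
        elif called < true_length:
            labels.append("under-call")
        else:
            labels.append("correct")
    return labels

# Hypothetical flows 1..6 and their reference homopolymer lengths.
toy_flows = [1.02, 0.61, 0.98, 2.10, 0.12, 0.44]
toy_reference = [1, 0, 1, 2, 0, 1]
toy_labels = classify_flows(toy_flows, toy_reference)
```

With these values, flow 2 rounds to 1 against a reference of 0 (an over-called zero-length homopolymer), flow 5 rounds to 0 as expected, and flow 6 rounds to 0 against a reference of 1 (an under-called one-length homopolymer).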

    Effect of quality and flow trimming on dataset metrics, aggregated by kit used.

    <p>AT = Analysis trim, QT = Quality trim, HRI = High-residual ionogram trim (1-mers and 2-mers), HRI3 = High-residual ionogram trim (1-mers, 2-mers and 3-mers). The ‘comparison homopolymer rates’ are taken from other literature using the same kit and level of quality assurance (both cases used Torrent Server version 1.5.0).</p>

    Sequencing runs generated for this study.

    <p>The name of each run comprises the chip (314, 316), species (B – <i>Bacillus amyloliquefaciens</i>, S – <i>Sulfolobus tokodaii</i>), machine (a, b), and kit (100 - Ion OneTouch Template Kit, 200M - Ion Xpress Template 200 kit, 200 - Ion OneTouch 200 Template kit). Runs are listed in chronological order. ‘% Wells with ISPs’ describes the percentage of wells on the chip which contained a bead. Mean Length AT denotes length after 3′ adapter trimming.</p>

    Estimated main and deviance effects for each explanatory variable in the double-generalised linear model.

    <p>Position-in-cycle (PIC) effects are in <b><a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003031#pcbi.1003031.s011" target="_blank">Table S1</a></b>. The intercept represents the mean effect (or dispersion effect) for an observation with all settings at baseline (baseline factors in this model taken to be <i>B. amyloliquefaciens</i>, 100 bp OneTouch Kit and Chip 314). Each of the other coefficients is the difference from baseline when its respective factor is changed from its baseline level.</p>