
    The approach to separating repeat R into its two copies, R1 and R2.

    <p>(A) The branch is caused by different copies of repeat R (bold red). (B) The two copies of repeat R are resolved by two paths, P1 and P2, which are extracted according to the reads at that branch. The suffix of P1 overlaps the prefix of P2, and the reads of P2 are then adjusted to their correct positions.</p>
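
    The overlap step in panel (B) can be illustrated with a minimal sketch. It assumes that "adjusting" the reads of P2 means shifting their offsets by the position at which P2 starts once its prefix is laid over the suffix of P1; the caption does not spell this out. The function names, the exact-match overlap test, and the `min_len` parameter are hypothetical and are not taken from the source.

```python
def suffix_prefix_overlap(p1: str, p2: str, min_len: int = 1) -> int:
    """Length of the longest suffix of p1 that exactly equals a prefix of p2.

    Returns 0 when no overlap of at least min_len bases exists.
    Exact matching is an assumption; a real assembler would tolerate mismatches.
    """
    longest = min(len(p1), len(p2))
    for k in range(longest, max(min_len, 1) - 1, -1):
        if p1[-k:] == p2[:k]:
            return k
    return 0


def shift_reads(read_offsets: dict[str, int], p1: str, p2: str,
                min_len: int = 3) -> dict[str, int]:
    """Translate read offsets on p2 into offsets on the merged P1/P2 path."""
    k = suffix_prefix_overlap(p1, p2, min_len)
    if k == 0:
        raise ValueError("P1 and P2 do not overlap by at least min_len bases")
    shift = len(p1) - k  # position on the merged path where P2 begins
    return {name: offset + shift for name, offset in read_offsets.items()}


if __name__ == "__main__":
    p1, p2 = "ACGTTGCA", "TGCAGGT"
    print(suffix_prefix_overlap(p1, p2))
    print(shift_reads({"r1": 0, "r2": 3}, p1, p2))
```

    Scanning from the longest candidate overlap downward keeps the sketch short; it is quadratic in the worst case, which is acceptable for illustration only.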

    Scheme for removing erroneous bases.

    <p>Erroneous bases in reads cause dead ends and bubbles, which can be resolved implicitly because the errors are masked by the correct reads. Reads with low similarity can probably be assembled onto other contigs to which they are more similar.</p>
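
    One simple way to picture how correct reads mask an erroneous base is a per-column majority vote over the reads covering each position. The sketch below is an illustration under that assumption, not the scheme described in the source: the `consensus` function and its input format (a start offset paired with a read sequence) are hypothetical.

```python
from collections import Counter


def consensus(aligned_reads: list[tuple[int, str]]) -> str:
    """Majority-vote consensus over gap-free reads placed at known offsets.

    aligned_reads holds (start_offset, sequence) pairs. Coverage is assumed
    to be contiguous; uncovered positions are not represented in the output.
    """
    columns: dict[int, Counter] = {}
    for start, seq in aligned_reads:
        for i, base in enumerate(seq):
            columns.setdefault(start + i, Counter())[base] += 1
    # The most frequent base per column wins, so an isolated error is outvoted.
    return "".join(columns[pos].most_common(1)[0][0] for pos in sorted(columns))


if __name__ == "__main__":
    reads = [
        (0, "ACGTAC"),
        (2, "GTACGG"),
        (1, "CGAACG"),  # carries an erroneous 'A' at position 3
    ]
    print(consensus(reads))
```

    A column with a single covering read offers no chance to outvote an error, which is why the masking only works where coverage by correct reads is sufficient.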

    Align reads to contig for extension.

    <p>(A) Aligning reads to the contig. The path of the nearby genome region is extracted according to the reads that are partially aligned onto the contig; new reads with more than 90% of their bases aligned to the path are considered correctly aligned, otherwise they should be aligned to other genome regions rather than at that position. (B) Extension using paired-end reads. The contig is extended at the 3′ end according to the reads in Pool 1 whose mates are in Pool 2. There are two candidate bases, ‘C’ and ‘T’: ‘C’ is well supported by the mates in the two pools, whereas ‘T’ has no paired-end read support, so ‘C’ is chosen and appended to the contig. (C) Extension using single-end reads. When assembling the grey region, which cannot be assembled with paired-end reads and where the reads in Pool 1 have no mates in Pool 2, the reads in Pool 1 are used as single-end reads to extend the contig.</p>
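
    The decisions in panels (A) to (C) can be sketched as two small functions: a filter that keeps a read only when more than 90% of its bases align to the path, and a vote that prefers candidate bases backed by mate pairs and falls back to single-end support otherwise. This is a rough sketch under stated assumptions. The function names, the `(base, mate_supported)` input format, and the plain majority vote are hypothetical; the caption gives only the 90% threshold and the preference for paired-end support.

```python
from collections import Counter
from typing import Optional


def is_correctly_aligned(aligned_bases: int, read_length: int,
                         min_percent: int = 90) -> bool:
    """True when strictly more than min_percent of the read's bases align."""
    # Integer arithmetic avoids floating-point comparison at the threshold.
    return read_length > 0 and aligned_bases * 100 > min_percent * read_length


def choose_next_base(candidates: list[tuple[str, bool]]) -> Optional[str]:
    """Pick the base to append at the 3' end of the contig.

    candidates holds (base, mate_supported) pairs, one per read in Pool 1;
    mate_supported is True when the read's mate is found in Pool 2.
    """
    paired = Counter(base for base, supported in candidates if supported)
    if paired:
        # Panel (B): only bases with paired-end support take part in the vote.
        return paired.most_common(1)[0][0]
    # Panel (C): no mates in Pool 2, so treat the reads as single-end.
    single = Counter(base for base, _ in candidates)
    return single.most_common(1)[0][0] if single else None


if __name__ == "__main__":
    print(is_correctly_aligned(95, 100))
    print(choose_next_base([("C", True), ("C", True), ("T", False)]))
```

    Separating the two votes mirrors the caption's ordering, in which single-end extension is used only for regions that paired-end reads cannot assemble.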