26 research outputs found

    Additional file 1: Table S1. of Chloroplast genomes: diversity, evolution, and applications in genetic engineering

    The chloroplast genes that are absent in specific species, their knockout phenotypes, and their transfer to nuclear genomes. (DOCX 23 kb)

    Heterozygous variations, including heterozygous SNPs and hemizygous insertions/deletions/inversions, detected during assembly of diploid genome.


    A Genetic Algorithm for Diploid Genome Reconstruction Using Paired-End Sequencing

    The genomes of many species are diploid, consisting of paternal and maternal haplotypes. The differences between these two haplotypes range from single nucleotide polymorphisms (SNPs) to large-scale structural variations (SVs). Existing genome assemblers for next-generation sequencing platforms attempt to reconstruct one consensus sequence, which is a mosaic of the two parental haplotypes. Reconstructing the paternal and maternal haplotypes is an important task in linkage analysis and association studies. This study designs and implements HapSVAssembler on the basis of a genetic algorithm (GA) and paired-end sequencing. The proposed method builds a consensus sequence, identifies various types of heterozygous variants, and reconstructs the paternal and maternal haplotypes by solving an optimization problem with a GA. Experimental results indicate that HapSVAssembler has high accuracy and contiguity under various sequencing coverages, error rates, and insert sizes. The program is tested on pilot sequencing of a highly heterozygous genome, in which 12,781 heterozygous SNPs and 602 hemizygous SVs are identified. We observe that, although SVs are far fewer than SNPs, the genomic regions occupied by SVs are much larger, implying that heterozygosity computed using SNPs or the k-mer spectrum may be underestimated.

    Assembly accuracy and contiguity for different sequencing coverage and error rates.

    (a) Accuracy above 90% can be obtained in low-error-rate simulations even at low coverage; (b) comparison of N10/N50 for different sequencing coverages.
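The N10/N50 contiguity figures mentioned in this caption follow the usual Nx definition: the contig length L such that contigs of length at least L together cover x% of the total assembly. The sketch below is illustrative only; the function name `nx` and the example lengths are assumptions, not taken from HapSVAssembler.

```python
def nx(contig_lengths, x=50):
    """Return the Nx statistic for a list of contig lengths.

    Nx is the length of the contig at which the running total of
    contig lengths, taken from longest to shortest, first reaches
    x percent of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in the range (0, 100]")

    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length
    return lengths[-1]  # not reached for valid input; kept for safety


# Hypothetical contig lengths (bp), for illustration only.
example = [80, 70, 50, 40, 30, 20]
n50 = nx(example, 50)
n10 = nx(example, 10)
```

A higher N50 at the same coverage indicates a more contiguous assembly, which is how panel (b) should be read.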

    Identification of insertions or deletions.

    A discordant read r_j is mapped to the reference at two mapping loci, […] and […]. The spanning region of r_j is from […] to […], and the potential breakpoint pair of SV_i is initialized from […] to […].

    Illustration of breakpoint reads across SV boundaries.

    (a) A breakpoint read r_j whose right end matches the reference perfectly over its first 4 nucleotides, whereas the remaining bases are mismatched. The candidate breakpoint can be inferred at the 4th base from the right end of r_j; (b) The actual breakpoints of an SV can be determined from breakpoint reads.
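One way to read panel (a) is that the candidate breakpoint sits where the perfectly matching run at the right end of the read stops. The sketch below counts that run for an ungapped, same-length comparison. It is an assumption-laden illustration, not the HapSVAssembler procedure: the function name, the ungapped comparison, and the example sequences are all hypothetical.

```python
def matched_suffix_length(read, ref_window):
    """Count how many bases at the right end of `read` match `ref_window`.

    Both strings must have the same length (an ungapped comparison is
    assumed). Counting stops at the first mismatch from the right.
    """
    if len(read) != len(ref_window):
        raise ValueError("read and ref_window must have the same length")
    matched = 0
    for read_base, ref_base in zip(reversed(read), reversed(ref_window)):
        if read_base != ref_base:
            break
        matched += 1
    return matched


# Hypothetical sequences: only the last 4 bases agree with the reference.
read = "TTGCACGT"
ref_window = "AACTACGT"
run = matched_suffix_length(read, ref_window)
# Offset within the read at which the matching run begins,
# i.e. the candidate breakpoint position under this reading.
candidate_offset = len(read) - run
```

In practice an aligner's soft-clipping information would usually supply this boundary, and several breakpoint reads would be combined to settle the actual breakpoint, as panel (b) indicates.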

    The accuracy for different genome size and read length.

    The paternal and maternal genomes differ by 1% SNPs. The mean insert size is 250 bp with a standard deviation of 25 bp, the sequencing coverage is 20X, and the sequencing error rate is 1%. (a) The accuracy for different genome sizes; (b) The accuracy for different read lengths.

    Illustration of converting paired-reads to SNP matrix and SV matrix.

    (a) Paired-end reads r_1 and r_2 both contain SNPs but r_3 does not; therefore, r_1 and r_2 can be converted to read fragments f_1 and f_2, respectively. SNP s_2 is covered by r_2, and the allele at s_2 is given by the 4th nucleotide of r_2; (b) Single-end mapped reads r_1 and r_2 whose unmapped ends overlap sv_1 (e.g., a deletion); both of […] and […] can be assigned 1.
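The conversion in panel (a) can be sketched as building one matrix row per read pair, with an entry for every SNP site the pair covers. The encoding below (0 for the reference allele, 1 for the alternative allele, `None` for an uncovered site) and all names are assumptions chosen for illustration; the source does not state the encoding HapSVAssembler uses.

```python
def read_to_fragment(ends, snps):
    """Convert one read pair into a row of the SNP matrix.

    ends: list of (start, sequence) tuples, one per mapped end, where
          start is the 0-based reference position of the first base
          (ungapped alignment is assumed).
    snps: list of (position, ref_allele, alt_allele) tuples.

    Returns a list with 0, 1, or None per SNP site, or None when the
    pair covers no SNP site with a recognizable allele (such reads
    carry no phasing information and are dropped).
    """
    row = [None] * len(snps)
    for start, seq in ends:
        for i, (pos, ref, alt) in enumerate(snps):
            offset = pos - start
            if 0 <= offset < len(seq):
                base = seq[offset]
                if base == ref:
                    row[i] = 0
                elif base == alt:
                    row[i] = 1
                # Any other base is treated as uninformative here.
    return row if any(v is not None for v in row) else None


# Hypothetical SNP sites and read pairs.
snps = [(3, "A", "G"), (12, "C", "T")]
r1 = [(0, "ACGGT"), (10, "GGTAA")]   # covers both sites
r2 = [(1, "TTAC"), (30, "GG")]       # covers only the first site
r3 = [(5, "AAAA"), (20, "CCCC")]     # covers no SNP site

f1 = read_to_fragment(r1, snps)
f2 = read_to_fragment(r2, snps)
f3 = read_to_fragment(r3, snps)
```

Rows such as `f1` link alleles at several sites and are what a phasing step can use; a pair like `r3` yields no row, matching the caption's remark that r_3 is not converted.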

    Illustration of extended Haplotype blocks via heterozygous SVs.

    One end is represented by a solid arrow, and the two ends of the same read are connected by a dotted line. There is a heterozygous SV_1 between SNP_10 and SNP_11. (a) Without considering SVs, the entire haplotype is broken into three haplotype blocks; (b) In our approach, Block_2 and Block_3 in (a) are merged by bridging reads x and y in Block_2 and bridging read z in Block_3, which indicate heterozygous SV_1.