67 research outputs found

    The next generation of target capture technologies - large DNA fragment enrichment and sequencing determines regional genomic variation of high complexity

    Background: The ability to capture and sequence large contiguous DNA fragments represents a significant advancement towards the comprehensive characterization of complex genomic regions. While emerging sequencing platforms are capable of producing reads several kilobases long, the fragment sizes generated by current DNA target enrichment technologies remain a limiting factor, producing DNA fragments generally shorter than 1 kbp. The DNA enrichment methodology described herein, Region-Specific Extraction (RSE), produces DNA segments in excess of 20 kbp in length. Coupling this enrichment method to appropriate sequencing platforms will significantly enhance the ability to generate complete and accurate sequence characterization of any genomic region without the need for reference-based assembly. Results: RSE is a long-range DNA target capture methodology that relies on the specific hybridization of short (20-25 base) oligonucleotide primers to selected sequence motifs within the DNA target region. These capture primers are then enzymatically extended at the 3' end, incorporating biotinylated nucleotides into the DNA. Streptavidin-coated beads are subsequently used to pull down the original, long DNA template molecules via the newly synthesized, biotinylated DNA that is bound to them. We demonstrate the accuracy, simplicity and utility of the RSE method by capturing and sequencing a 4 Mbp stretch of the major histocompatibility complex (MHC). Our results show an average depth of coverage of 164X for the entire MHC. This depth of coverage contributes significantly to a 99.94% total coverage of the targeted region and to an accuracy that is over 99.99%. Conclusions: RSE represents a cost-effective target enrichment method capable of producing sequencing templates in excess of 20 kbp in length. Our method generated superior coverage across the MHC compared with other commercially available methodologies, with the added advantage of producing longer sequencing templates amenable to DNA sequencing on recently developed platforms. Although our demonstration of the method does not utilize these DNA sequencing platforms directly, our results indicate that the capture of long DNA fragments produces superior coverage of the targeted region.

    Sequence mining and transcript profiling to explore cyst nematode parasitism

    Background: Cyst nematodes are devastating plant parasites that become sedentary within plant roots and induce the transformation of normal plant cells into elaborate feeding cells with the help of secreted effectors, the parasitism proteins. These proteins are the translation products of parasitism genes and are secreted molecular tools that allow cyst nematodes to infect plants. Results: We present here the expression patterns of all previously described parasitism genes of the soybean cyst nematode, Heterodera glycines, in all major life stages except the adult male. These insights were gained by analyzing our gene expression dataset from experiments using the Affymetrix Soybean Genome Array GeneChip, which contains probeset sequences for 6,860 genes derived from preparasitic and parasitic H. glycines life stages. Targeting the identification of additional H. glycines parasitism-associated genes, we isolated 633 genes encoding secretory proteins using algorithms to predict secretory signal peptides. Furthermore, because some of the known H. glycines parasitism proteins have strongest similarity to proteins of plants and microbes, we searched for predicted protein sequences that showed their highest similarities to plant or microbial proteins and identified 156 H. glycines genes, some of which also contained a signal peptide. Analyses of the expression profiles of these genes allowed the formulation of hypotheses about potential roles in parasitism. This is the first study combining sequence analyses of a substantial EST dataset with microarray expression data of all major life stages (except adult males) for the identification and characterization of putative parasitism-associated proteins in any parasitic nematode. Conclusion: We have established an expression atlas for all known H. glycines parasitism genes. Furthermore, in an effort to identify additional H. glycines genes with putative functions in parasitism, we have reduced the currently known 6,860 H. glycines genes to a pool of 788 most promising candidate genes (including known parasitism genes) and documented their expression profiles. Using our approach to pre-select genes likely involved in parasitism now allows detailed functional analyses in a manner not feasible for larger numbers of genes. The generation of the candidate pool described here is an important enabling advance because it will significantly facilitate the unraveling of fascinating plant-animal interactions and deliver knowledge that can be transferred to other pathogen-host systems. Ultimately, the exploration of true parasitism genes verified from the gene pool delineated here will identify weaknesses in the nematode life cycle that can be exploited by novel anti-nematode efforts.

    CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics

    Background: Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist. Results: We developed a suite of software tools and resources (CNV Workshop) for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV. Conclusions: To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and is an ideal platform for coordinating multiple associated projects. Availability and Implementation: Available on the web at: http://sourceforge.net/projects/cnv

    Divergent evolution of arrested development in the dauer stage of Caenorhabditis elegans and the infective stage of Heterodera glycines

    The generation and analysis of over 20,000 ESTs allowed the identification and expression profiling of 6,860 predicted genes in the nematode Heterodera glycines. This revealed that gene expression patterns in the dauer stage of Caenorhabditis elegans are not conserved in H. glycines.

    Mitochondrial genome sequence analysis: A custom bioinformatics pipeline substantially improves Affymetrix MitoChip v2.0 call rate and accuracy

    BACKGROUND: Mitochondrial genome sequence analysis is critical to the diagnostic evaluation of mitochondrial disease. Existing methodologies differ widely in throughput, complexity, cost efficiency, and sensitivity of heteroplasmy detection. Affymetrix MitoChip v2.0, which uses a sequencing-by-genotyping technology, allows potentially accurate and high-throughput sequencing of the entire human mitochondrial genome to be completed in a cost-effective fashion. However, the relatively low call rate achieved using existing software tools has limited the wide adoption of this platform for either clinical or research applications. Here, we report the design and development of a custom bioinformatics software pipeline that achieves a much improved call rate and accuracy for the Affymetrix MitoChip v2.0 platform. We used this custom pipeline to analyze MitoChip v2.0 data from 24 DNA samples representing a broad range of tissue types (18 whole blood, 3 skeletal muscle, 3 cell lines), mutations (a 5.8 kilobase pair deletion and 6 known heteroplasmic mutations), and haplogroup origins. All results were compared to those obtained by at least one other mitochondrial DNA sequence analysis method, including Sanger sequencing, denaturing HPLC-based heteroduplex analysis, and/or the Illumina Genome Analyzer II next generation sequencing platform. RESULTS: An average call rate of 99.75% was achieved across all samples with our custom pipeline. Comparison of calls for 15 samples characterized previously by Sanger sequencing revealed a total of 29 discordant calls, which translates to an estimated 0.012% for the base call error rate. We successfully identified 4 known heteroplasmic mutations and 24 other potential heteroplasmic mutations across 20 samples that passed quality control. 
CONCLUSIONS: Affymetrix MitoChip v2.0 analysis using our optimized MitoChip Filtering Protocol (MFP) bioinformatics pipeline now offers the high sensitivity and accuracy needed for reliable, high-throughput and cost-efficient whole mitochondrial genome sequencing. This approach provides a viable alternative to traditional Sanger and other emerging sequencing technologies for whole mitochondrial genome analysis, with potential utility for both clinical diagnostic and research applications.
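
The reported base call error rate can be roughly reproduced from the figures in the abstract. The sketch below is a back-of-the-envelope check, not the authors' calculation: it assumes each of the 15 Sanger-characterized samples contributes the full 16,569-bp human mitochondrial reference length, and, as a second assumption, that the 99.75% average call rate (reported across all samples) also applies to those 15. The abstract does not state the denominator it used.

```python
# Assumption: the denominator is 15 samples x the 16,569-bp human
# mitochondrial reference length; the abstract does not state it.
MT_GENOME_BP = 16_569
SAMPLES_COMPARED = 15      # samples previously characterized by Sanger sequencing
DISCORDANT_CALLS = 29      # discordant calls reported in the abstract
CALL_RATE = 0.9975         # average call rate reported across all samples


def error_rate_percent(discordant, samples, bases_per_sample, call_rate=1.0):
    """Return discordant calls as a percentage of the bases called."""
    called_bases = samples * bases_per_sample * call_rate
    return 100.0 * discordant / called_bases


# Estimate over every position, then over called positions only.
all_bases = error_rate_percent(DISCORDANT_CALLS, SAMPLES_COMPARED, MT_GENOME_BP)
called_only = error_rate_percent(
    DISCORDANT_CALLS, SAMPLES_COMPARED, MT_GENOME_BP, CALL_RATE
)

print(f"all positions:    {all_bases:.3f}%")
print(f"called positions: {called_only:.3f}%")
```

Under either assumption the estimate rounds to the 0.012% given in the abstract.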

    Resistance to autosomal dominant Alzheimer's disease in an APOE3 Christchurch homozygote: a case report.

    We identified a PSEN1 (presenilin 1) mutation carrier from the world's largest autosomal dominant Alzheimer's disease kindred, who did not develop mild cognitive impairment until her seventies, three decades after the expected age of clinical onset. The individual had two copies of the APOE3 Christchurch (R136S) mutation, unusually high brain amyloid levels and limited tau and neurodegenerative measurements. Our findings have implications for the role of APOE in the pathogenesis, treatment and prevention of Alzheimer's disease.

    Expert Panel Curation of 113 Primary Mitochondrial Disease Genes for the Leigh Syndrome Spectrum

    OBJECTIVE: Primary mitochondrial diseases (PMDs) are heterogeneous disorders caused by inherited mitochondrial dysfunction. Classically defined neuropathologically as subacute necrotizing encephalomyelopathy, Leigh syndrome spectrum (LSS) is the most frequent manifestation of PMD in children, but may also present in adults. A major challenge for accurate diagnosis of LSS in the genomic medicine era is establishing gene-disease relationships (GDRs) for this syndrome with >100 monogenic causes across both nuclear and mitochondrial genomes. METHODS: The Clinical Genome Resource (ClinGen) Mitochondrial Disease Gene Curation Expert Panel (GCEP), comprising 40 international PMD experts, met monthly for 4 years to review GDRs for LSS. The GCEP standardized gene curation for LSS by refining the phenotypic definition, modifying the ClinGen Gene-Disease Clinical Validity Curation Framework to improve interpretation for LSS, and establishing a scoring rubric for LSS. RESULTS: The GDR with LSS across the nuclear and mitochondrial genomes was classified as definitive for 31/114 gene-disease relationships curated (27%); moderate for 38 (33%); limited for 43 (38%); and 2 as disputed (2%). Ninety genes were associated with autosomal recessive inheritance, 16 were maternally inherited, 5 autosomal dominant, and 3 X-linked. INTERPRETATION: GDRs for LSS were established for genes across both nuclear and mitochondrial genomes. Establishing these GDRs will allow accurate variant interpretation, expedite genetic diagnosis of LSS, and facilitate precision medicine, multi-system organ surveillance, recurrence risk counselling, reproductive choice, natural history studies and eligibility for interventional clinical trials.
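
The classification and inheritance tallies in the RESULTS can be cross-checked with a few lines of arithmetic. The sketch below only restates the counts given in the abstract; the dictionary names are illustrative, and rounding to whole percentages is an assumption about how the reported figures were derived.

```python
# Counts as reported in the abstract (per curated gene-disease relationship).
TOTAL_CURATED = 114
classification = {"definitive": 31, "moderate": 38, "limited": 43, "disputed": 2}
inheritance = {
    "autosomal recessive": 90,
    "maternal": 16,
    "autosomal dominant": 5,
    "X-linked": 3,
}

# Assumption: reported percentages are shares of 114, rounded to whole numbers.
percentages = {
    label: round(100 * count / TOTAL_CURATED)
    for label, count in classification.items()
}

classification_total = sum(classification.values())
inheritance_total = sum(inheritance.values())

for label, pct in percentages.items():
    print(f"{label}: {classification[label]}/{TOTAL_CURATED} = {pct}%")
print("classification total:", classification_total)
print("inheritance total:", inheritance_total)
```

Both tallies sum to 114 and the rounded shares match the reported 27%, 33%, 38% and 2%.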