51 research outputs found

    Distribution, functional impact, and origin mechanisms of copy number variation in the barley genome

    BACKGROUND There is growing evidence for the prevalence of copy number variation (CNV) and its role in phenotypic variation in many eukaryotic species. Here we use array comparative genomic hybridization to explore the extent of this type of structural variation in domesticated barley cultivars and wild barleys. RESULTS A collection of 14 barley genotypes including eight cultivars and six wild barleys were used for comparative genomic hybridization. CNV affects 14.9% of all the sequences that were assessed. Higher levels of CNV diversity are present in the wild accessions relative to cultivated barley. CNVs are enriched near the ends of all chromosomes except 4H, which exhibits the lowest frequency of CNVs. CNV affects 9.5% of the coding sequences represented on the array and the genes affected by CNV are enriched for sequences annotated as disease-resistance proteins and protein kinases. Sequence-based comparisons of CNV between cultivars Barke and Morex provided evidence that DNA repair mechanisms of double-strand breaks via single-stranded annealing and synthesis-dependent strand annealing play an important role in the origin of CNV in barley. CONCLUSIONS We present the first catalog of CNVs in a diploid Triticeae species, which opens the door for future genome diversity research in a tribe that comprises the economically important cereal species wheat, barley, and rye. Our findings constitute a valuable resource for the identification of CNV affecting genes of agronomic importance. 
We also identify potential mechanisms that can generate variation in copy number in plant genomes. This work was financially supported by the following grants: project GABI-BARLEX, German Federal Ministry of Education and Research (BMBF), #0314000 to MP, US, KFXM and NS; Triticeae Coordinated Agricultural Project, USDA-NIFA #2011-68002-30029 to GJM; and Agriculture and Food Research Initiative Plant Genome, Genetics and Breeding Program of USDA’s Cooperative State Research and Extension Service, #2009-65300-05645 to GJM

    Whole exome capture in solution with 3 Gbp of data

    We have developed a solution-based method for targeted DNA capture-sequencing that is directed to the complete human exome. This approach allows the discovery of more than 95% of all expected heterozygous single-base variants, requires as little as 3 Gbp of raw sequence data and constitutes an effective tool for identifying rare coding alleles in large-scale genomic studies

    Sorghum Genome Sequencing by Methylation Filtration

    Sorghum bicolor is a close relative of maize and is a staple crop in Africa and much of the developing world because of its superior tolerance of arid growth conditions. We have generated sequence from the hypomethylated portion of the sorghum genome by applying methylation filtration (MF) technology. The evidence suggests that 96% of the genes have been sequence tagged, with an average coverage of 65% across their length. Remarkably, this level of gene discovery was accomplished after generating a raw coverage of less than 300 megabases of the 735-megabase genome. MF preferentially captures exons and introns, promoters, microRNAs, and simple sequence repeats, and minimizes interspersed repeats, thus providing a robust view of the functional parts of the genome. The sorghum MF sequence set is beneficial to research on sorghum and is also a powerful resource for comparative genomics among the grasses and across the entire plant kingdom. Thousands of hypothetical gene predictions in rice and Arabidopsis are supported by the sorghum dataset, and genomic similarities highlight evolutionarily conserved regions that will lead to a better understanding of rice and Arabidopsis

    Mutation discovery in mice by whole exome sequencing

    We report the development and optimization of reagents for in-solution, hybridization-based capture of the mouse exome. By validating this approach in multiple inbred strains and in novel mutant strains, we show that whole exome sequencing is a robust approach for discovery of putative mutations, irrespective of strain background. We found strong candidate mutations for the majority of mutant exomes sequenced, including new models of orofacial clefting, urogenital dysmorphology, kyphosis and autoimmune hepatitis

    Heritable L1 retrotransposition in the mouse primordial germline and early embryo

    LINE-1 (L1) retrotransposons are a noted source of genetic diversity and disease in mammals. To expand its genomic footprint, L1 must mobilize in cells that will contribute their genetic material to subsequent generations. Heritable L1 insertions may therefore arise in germ cells and in pluripotent embryonic cells, prior to germline specification, yet the frequency and predominant developmental timing of such events remain unclear. Here, we applied mouse retrotransposon capture sequencing (mRC-seq) and whole-genome sequencing (WGS) to pedigrees of C57BL/6J animals, and uncovered an L1 insertion rate of ≥1 event per eight births. We traced heritable L1 insertions to pluripotent embryonic cells and, strikingly, to early primordial germ cells (PGCs). New L1 insertions bore structural hallmarks of target-site primed reverse transcription (TPRT) and mobilized efficiently in a cultured cell retrotransposition assay. Together, our results highlight the rate and evolutionary impact of heritable L1 retrotransposition and reveal retrotransposition-mediated genomic diversification as a fundamental property of pluripotent embryonic cells in vivo

    A comprehensive resequence analysis of the KLK15–KLK3–KLK2 locus on chromosome 19q13.33

    Single nucleotide polymorphisms (SNPs) in the KLK3 gene on chromosome 19q13.33 are associated with serum prostate-specific antigen (PSA) levels. Recent genome wide association studies of prostate cancer have yielded conflicting results for association of the same SNPs with prostate cancer risk. Since the KLK3 gene encodes the PSA protein that forms the basis for a widely used screening test for prostate cancer, it is critical to fully characterize genetic variation in this region and assess its relationship with the risk of prostate cancer. We have conducted a next-generation sequence analysis in 78 individuals of European ancestry to characterize common (minor allele frequency, MAF >1%) genetic variation in a 56 kb region on chromosome 19q13.33 centered on the KLK3 gene (chr19:56,019,829–56,076,043 bps). We identified 555 polymorphic loci in the process including 116 novel SNPs and 182 novel insertion/deletion polymorphisms (indels). Based on tagging analysis, 144 loci are necessary to tag the region at an r² threshold of 0.8 and MAF of 1% or higher, while 86 loci are required to tag the region at an r² threshold of 0.8 and MAF >5%. Our sequence data augments coverage by 35 and 78% as compared to variants in dbSNP and HapMap, respectively. We observed six non-synonymous amino acid or frame shift changes in the KLK3 gene and three changes in each of the neighboring genes, KLK15 and KLK2. Our study has generated a detailed map of common genetic variation in the genomic region surrounding the KLK3 gene, which should be useful for fine-mapping the association signal as well as determining the contribution of this locus to prostate cancer risk and/or regulation of PSA expression

    Fluorescence in situ hybridization with high-complexity repeat-free oligonucleotide probes generated by massively parallel synthesis

    The ability to visualize specific DNA sequences, on chromosomes and in nuclei, by fluorescence in situ hybridization (FISH) is fundamental to many aspects of genetics, genomics and cell biology. Probe selection is currently limited by the availability of DNA clones or the appropriate pool of DNA sequences for PCR amplification. Here, we show that liquid-phase probe pools from sequence capture technology can be adapted to generate fluorescently labelled pools of oligonucleotides that are very effective as repeat-free FISH probes in mammalian cells. As well as detection of small (15 kb) and larger (100 kb) specific loci in both cultured cells and tissue sections, we show that complex oligonucleotide pools can be used as probes to visualize features of nuclear organization. Using this approach, we dramatically reveal the disposition of exons around the outside of a chromosome territory core and away from the nuclear periphery

    Identification of Novel High-Frequency DNA Methylation Changes in Breast Cancer

    Recent data have revealed that epigenetic alterations, including DNA methylation and chromatin structure changes, are among the earliest molecular abnormalities to occur during tumorigenesis. The inherent thermodynamic stability of cytosine methylation and the apparent high specificity of the alterations for disease may accelerate the development of powerful molecular diagnostics for cancer. We report a genome-wide analysis of DNA methylation alterations in breast cancer. The approach efficiently identified a large collection of novel differentially DNA methylated loci (∼200), a subset of which was independently validated across a panel of over 230 clinical samples. The differential cytosine methylation events were independent of patient age, tumor stage, estrogen receptor status or family history of breast cancer. The power of the global approach for discovery is underscored by the identification of a single differentially methylated locus, associated with the GHSR gene, capable of distinguishing infiltrating ductal breast carcinoma from normal and benign breast tissues with a sensitivity and specificity of 90% and 96%, respectively. Notably, the frequency of these molecular abnormalities in breast tumors substantially exceeds the frequency of any other single genetic or epigenetic change reported to date. The discovery of over 50 novel DNA methylation-based biomarkers of breast cancer may provide new routes for development of DNA methylation-based diagnostics and prognostics, as well as reveal epigenetically regulated mechanisms involved in breast tumorigenesis

    Maize Inbreds Exhibit High Levels of Copy Number Variation (CNV) and Presence/Absence Variation (PAV) in Genome Content

    Following the domestication of maize over the past ∼10,000 years, breeders have exploited the extensive genetic diversity of this species to mold its phenotype to meet human needs. The extent of structural variation, including copy number variation (CNV) and presence/absence variation (PAV), which are thought to contribute to the extraordinary phenotypic diversity and plasticity of this important crop, has not been elucidated. Whole-genome, array-based, comparative genomic hybridization (CGH) revealed a level of structural diversity between the inbred lines B73 and Mo17 that is unprecedented among higher eukaryotes. A detailed analysis of altered segments of DNA conservatively estimates that there are several hundred CNV sequences between the two genotypes, as well as several thousand PAV sequences that are present in B73 but not Mo17. Haplotype-specific PAVs contain hundreds of single-copy, expressed genes that may contribute to heterosis and to the extraordinary phenotypic diversity of this important crop