6 research outputs found

    Digital genotyping of sorghum – a diverse plant species with a large repeat-rich genome

    BACKGROUND: Rapid acquisition of accurate genotyping information is essential for all genetic marker-based studies. For species with relatively small genomes, complete genome resequencing is a feasible approach for genotyping; however, for species with large and highly repetitive genomes, the acquisition of whole genome sequences for the purpose of genotyping is still relatively inefficient and too expensive to be carried out on a high-throughput basis. Sorghum bicolor is a C4 grass with a sequenced genome size of ~730 Mb, of which ~80% is highly repetitive. We have developed a restriction enzyme targeted genome resequencing method for genetic analysis, termed Digital Genotyping (DG), to be applied to sorghum and other grass species with large repeat-rich genomes. RESULTS: DG templates are generated using one of three methylation-sensitive restriction enzymes that recognize a nested set of 4, 6 or 8 bp GC-rich sequences, enabling varying depth of analysis and integration of results among assays. Variation in sequencing efficiency among DG markers was correlated with template GC-content and length. The expected DG allele sequence was obtained 97.3% of the time with a ratio of expected to alternative allele sequence acquisition of >20:1. A genetic map aligned to the sorghum genome sequence with an average resolution of 1.47 cM was constructed using 1,772 DG markers from 137 recombinant inbred lines. The DG map enhanced the detection of QTL for variation in plant height and precisely aligned QTL such as Dw3 to underlying genes/alleles. Higher-resolution NgoMIV-based DG haplotypes were used to trace the origin of DNA on SBI-06, spanning Ma1 and Dw2 from progenitors to BTx623 and IS3620C. DG marker analysis identified the correct location of two misassembled regions and located seven super contigs in the sorghum reference genome sequence. CONCLUSION: DG technology provides a cost-effective approach to rapidly generate accurate genotyping data in sorghum. Currently, data derived from DG are used for many marker-based analyses, including marker-assisted breeding, pedigree and QTL analysis, genetic map construction, map-based gene cloning and association studies. DG in combination with whole genome resequencing is dramatically accelerating all aspects of genetic analysis of sorghum, an important genetic reference for C4 grass species.

    A robust benchmark for detection of germline large deletions and insertions

    New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution, and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed the first sequence-resolved benchmark set for identification of both false negative and false positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle (GIAB) Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12745 isolated, sequence-resolved insertion (7281) and deletion (5464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5262 insertions and 4095 deletions supported by ≥1 diploid assembly. We demonstrate the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked-, and long-read sequencing and optical mapping.

    A verified genomic reference sample for assessing performance of cancer panels detecting small variants of low allele frequency

    Background: Oncopanel genomic testing, which identifies important somatic variants, is increasingly common in medical practice and especially in clinical trials. Currently, there is a paucity of reliable genomic reference samples having a suitably large number of pre-identified variants for properly assessing oncopanel assay analytical quality and performance. The FDA-led Sequencing and Quality Control Phase 2 (SEQC2) consortium analyze ten diverse cancer cell lines individually and their pool, termed Sample A, to develop a reference sample with suitably large numbers of coding positions with known (variant) positives and negatives for properly evaluating oncopanel analytical performance. Results: In reference Sample A, we identify more than 40,000 variants down to 1% allele frequency with more than 25,000 variants having less than 20% allele frequency with 1653 variants in COSMIC-related genes. This is 5–100× more than existing commercially available samples. We also identify an unprecedented number of negative positions in coding regions, allowing statistical rigor in assessing limit-of-detection, sensitivity, and precision. Over 300 loci are randomly selected and independently verified via droplet digital PCR with 100% concordance. Agilent normal reference Sample B can be admixed with Sample A to create new samples with a similar number of known variants at much lower allele frequency than what exists in Sample A natively, including known variants having allele frequency of 0.02%, a range suitable for assessing liquid biopsy panels. Conclusion: These new reference samples and their admixtures provide superior capability for performing oncopanel quality control, analytical accuracy, and validation for small to large oncopanels and liquid biopsy assays.

    Jones, Wendell; Gong, Binsheng; Novoradovskaya, Natalia; Li, Dan; Kusko, Rebecca; Richmond, Todd A.; Johann, Donald J.; Bisgin, Halil; Sahraeian, Sayed Mohammad Ebrahim; Bushel, Pierre R.; Pirooznia, Mehdi; Wilkins, Katherine; Chierici, Marco; Bao, Wenjun; Basehore, Lee Scott; Lucas, Anne Bergstrom; Burgess, Daniel; Butler, Daniel J.; Cawley, Simon; Chang, Chia-Jung; Chen, Guangchun; Chen, Tao; Chen, Yun-Ching; Craig, Daniel J.; del Pozo, Angela; Foox, Jonathan; Francescatto, Margherita; Fu, Yutao; Furlanello, Cesare; Giorda, Kristina; Grist, Kira P.; Guan, Meijian; Hao, Yingyi; Happe, Scott; Hariani, Gunjan; Haseley, Nathan; Jasper, Jeff; Jurman, Giuseppe; Kreil, David Philip; Łabaj, Paweł; Lai, Kevin; Li, Jianying; Li, Quan-Zhen; Li, Yulong; Li, Zhiguang; Liu, Zhichao; López, Mario Solís; Miclaus, Kelci; Miller, Raymond; Mittal, Vinay K.; Mohiyuddin, Marghoob; Pabón-Peña, Carlos; Parsons, Barbara L.; Qiu, Fujun; Scherer, Andreas; Shi, Tieliu; Stiegelmeyer, Suzy; Suo, Chen; Tom, Nikola; Wang, Dong; Wen, Zhining; Wu, Leihong; Xiao, Wenzhong; Xu, Chang; Yu, Ying; Zhang, Jiyang; Zhang, Yifan; Zhang, Zhihong; Zheng, Yuanting; Mason, Christopher E.; Willey, James C.; Tong, Weida; Shi, Leming; Xu, Joshua
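
    The admixture arithmetic described in the Results of this entry can be sketched as follows. This is only an illustration under stated assumptions: Sample B is taken to carry no copies of the variant at the position in question, and both samples are taken to contribute DNA in proportion to their mixing fractions. The function name and the 1:49 mixing ratio are hypothetical and are not taken from the study.

```python
def admixed_allele_frequency(af_in_a: float, fraction_a: float) -> float:
    """Expected allele frequency after diluting Sample A into Sample B.

    Assumes Sample B contributes only reference alleles at this position
    (an assumption made for illustration, not a claim from the study).
    """
    if not 0.0 <= af_in_a <= 1.0:
        raise ValueError("af_in_a must be between 0 and 1")
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must be between 0 and 1")
    # Only the Sample A portion of the mix carries the variant allele.
    return af_in_a * fraction_a


# Hypothetical example: a variant at 1% in Sample A, mixed 1 part A to 49 parts B.
diluted = admixed_allele_frequency(0.01, 1 / 50)
print(f"{diluted:.4%}")
```

    Under these assumptions, a variant at 1% in Sample A diluted 1:49 ends up near 0.02%, the lowest allele frequency the abstract mentions. The mixing ratios actually used by the consortium are not stated in this listing.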