GenomeVIP: A cloud platform for genomic variant discovery and interpretation

Author: Chen Ken
DeNardo Erin
Ding Li
Fenyö David
Handsaker Robert E
Huang Kuan-lin
Koboldt Daniel C
Mashl R. Jay
Niu Beifang
Raphael Benjamin J
Scott Adam D
Wendl Michael C
Wyczalkowski Matthew A
Ye Kai
Yellapantula Venkata D
Yoon Christopher J
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017

Identifying genomic variants is a fundamental first step toward the understanding of the role of inherited and acquired variation in disease. The accelerating growth in the corpus of sequencing data that underpins such analysis is making the data-download bottleneck more evident, placing substantial burdens on the research community to keep pace. As a result, the search for alternative approaches to the traditional “download and analyze” paradigm on local computing resources has led to a rapidly growing demand for cloud-computing solutions for genomics analysis. Here, we introduce the Genome Variant Investigation Platform (GenomeVIP), an open-source framework for performing genomics variant discovery and annotation using cloud- or local high-performance computing infrastructure. GenomeVIP orchestrates the analysis of whole-genome and exome sequence data using a set of robust and popular task-specific tools, including VarScan, GATK, Pindel, BreakDancer, Strelka, and Genome STRiP, through a web interface. GenomeVIP has been used for genomic analysis in large-data projects such as the TCGA PanCanAtlas and in other projects, such as the ICGC Pilots, CPTAC, ICGC-TCGA DREAM Challenges, and the 1000 Genomes SV Project. Here, we demonstrate GenomeVIP's ability to provide high-confidence annotated somatic, germline, and de novo variants of potential biological significance using publicly available data sets.

A comparative analysis of algorithms for somatic SNV detection in cancer

Author: Andreas W. Schreiber
Cibulskis
David L. Adelson
Ding
Dohm
Garique Glonek
Gundry
Hamish S. Scott
Koboldt
Larson
Lee
Liu
Loeb
Meacham
Meyerson
Nakamura
Nicola D. Roberts
Oki
Pleasance
R. Daniel Kortschak
Roth
Salk
Saunders
Susan Branford
Wendy T. Parker
Yang
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2013

Motivation: With the advent of relatively affordable high-throughput technologies, DNA sequencing of cancers is now common practice in cancer research projects and will be increasingly used in clinical practice to inform diagnosis and treatment. Somatic (cancer-only) single nucleotide variants (SNVs) are the simplest class of mutation, yet their identification in DNA sequencing data is confounded by germline polymorphisms, tumour heterogeneity and sequencing and analysis errors. Four recently published algorithms for the detection of somatic SNV sites in matched cancer–normal sequencing datasets are VarScan, SomaticSniper, JointSNVMix and Strelka. In this analysis, we apply these four SNV calling algorithms to cancer–normal Illumina exome sequencing of a chronic myeloid leukaemia (CML) patient. The candidate SNV sites returned by each algorithm are filtered to remove likely false positives, then characterized and compared to investigate the strengths and weaknesses of each SNV calling algorithm. Results: Comparing the candidate SNV sets returned by VarScan, SomaticSniper, JointSNVMix2 and Strelka revealed substantial differences with respect to the number and character of sites returned; the somatic probability scores assigned to the same sites; their susceptibility to various sources of noise; and their sensitivities to low-allelic-fraction candidates.

Adelaide Research & Scholarship

Caenorhabditis briggsae recombinant inbred line genotypes reveal inter-strain incompatibility and the evolution of recombination

Author: Baird Scott E
Chamberlin Helen M
Gupta Bhagwati P
Haag Eric S
Koboldt Daniel C
Miller Raymond D
Ross Joseph A
Staisch Julia E
Publication venue: Digital Commons@Becker
Publication date: 01/01/2011

A toolkit for rapid gene mapping in the nematode Caenorhabditis briggsae

Author: Baird Scott E
Chamberlin Helen M
Gupta Bhagwati P
Haag Eric S
Haines Karen
Koboldt Daniel C
Miller Raymond D
Staisch Julia
Thillainathan Bavithra
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The nematode C. briggsae serves as a useful model organism for comparative analysis of developmental and behavioral processes. The amenability of C. briggsae to genetic manipulations and the availability of its genome sequence have prompted researchers to study evolutionary changes in gene function and signaling pathways. These studies rely on the availability of forward genetic tools such as mutants and mapping markers. Results: We have computationally identified more than 30,000 polymorphisms (SNPs and indels) in C. briggsae strains AF16 and HK104. These include 1,363 SNPs that change restriction enzyme recognition sites (snip-SNPs) and 638 indels that range between 7 bp and 2 kb. We established bulk segregant and single animal-based PCR assay conditions and used these to test 107 polymorphisms. A total of 75 polymorphisms, consisting of 14 snip-SNPs and 61 indels, were experimentally confirmed with an overall success rate of 83%. The utility of polymorphisms in genetic studies was demonstrated by successful mapping of 12 mutations, including 5 that were localized to sub-chromosomal regions. Our mapping experiments have also revealed one case of a misassembled contig on chromosome 3. Conclusions: We report a comprehensive set of polymorphisms in C. briggsae wild-type strains and demonstrate their use in mapping mutations. We also show that molecular markers can be useful tools to improve the C. briggsae genome sequence assembly. Our polymorphism resource promises to accelerate genetic and functional studies of C. briggsae genes.
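
The abstract above defines snip-SNPs as SNPs that change a restriction enzyme recognition site. As a minimal illustration of that definition (not the authors' pipeline), the Python sketch below checks whether swapping the reference allele for the alternate allele creates or destroys a recognition site in the local sequence. The function name `is_snip_snp`, its arguments, and the example flanking sequences are hypothetical. The default site is the EcoRI recognition sequence GAATTC, which is palindromic; a non-palindromic site would also need a reverse-complement check, which this sketch omits.

```python
def is_snip_snp(left_flank, ref, alt, right_flank, site="GAATTC"):
    """Return True if the SNP creates or destroys `site` in the local sequence.

    Hypothetical helper: flanks are the bases immediately to either side of
    the SNP and should each be at least len(site) - 1 bases long.
    """
    ref_seq = (left_flank + ref + right_flank).upper()
    alt_seq = (left_flank + alt + right_flank).upper()
    # The site must be present for exactly one of the two alleles.
    return (site in ref_seq) != (site in alt_seq)


# The reference allele T completes GAATTC; the alternate allele C breaks it.
print(is_snip_snp("AAGAA", "T", "C", "TCGG"))
# Neither allele produces the site, so this SNP is not a snip-SNP.
print(is_snip_snp("AAAAA", "T", "C", "GGGGG"))
```

A SNP of this kind can be genotyped by digesting a PCR product with the enzyme, since only one allele is cut.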

Springer - Publisher Connector

Directory of Open Access Journals

Digital Repository at the University of Maryland

CORE

VarScan 2: Somatic mutation and copy number alteration discovery in cancer by exome sequencing

Author: Ding Li
Koboldt Daniel C.
Larson David E.
Lin Ling
Mardis Elaine R.
McLellan Michael D.
Miller Christopher A.
Shen Dong
Wilson Richard K.
Zhang Qunyuan
Publication venue: Digital Commons@Becker
Publication date: 01/01/2012

Cancer is a disease driven by genetic variation and mutation. Exome sequencing can be utilized for discovering these variants and mutations across hundreds of tumors. Here we present an analysis tool, VarScan 2, for the detection of somatic mutations and copy number alterations (CNAs) in exome data from tumor–normal pairs. Unlike most current approaches, our algorithm reads data from both samples simultaneously; a heuristic and statistical algorithm detects sequence variants and classifies them by somatic status (germline, somatic, or LOH); while a comparison of normalized read depth delineates relative copy number changes. We apply these methods to the analysis of exome sequence data from 151 high-grade ovarian tumors characterized as part of the Cancer Genome Atlas (TCGA). We validated some 7790 somatic coding mutations, achieving 93% sensitivity and 85% precision for single nucleotide variant (SNV) detection. Exome-based CNA analysis identified 29 large-scale alterations and 619 focal events per tumor on average. As in our previous analysis of these data, we observed frequent amplification of oncogenes (e.g., CCNE1, MYC) and deletion of tumor suppressors (NF1, PTEN, and CDKN2A). We searched for additional recurrent focal CNAs using the correlation matrix diagonal segmentation (CMDS) algorithm, which identified 424 significant events affecting 582 genes. Taken together, our results demonstrate the robust performance of VarScan 2 for somatic mutation and CNA detection and shed new light on the landscape of genetic alterations in ovarian cancer.
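
The abstract describes classifying each variant site as germline, somatic, or LOH by reading the tumor and normal samples together. The Python sketch below illustrates one way such a classification could work, comparing reference and variant read counts in the two samples with a two-sided Fisher's exact test. It is a simplified sketch, not VarScan 2's actual implementation: the function names, the 10% minimum variant allele fraction, the 0.05 significance cutoff, and the extra "reference", "unknown", and "no_call" labels are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def classify_site(normal_ref, normal_alt, tumor_ref, tumor_alt,
                  min_vaf=0.10, alpha=0.05):
    """Assign a somatic-status label from read counts (illustrative thresholds)."""
    normal_depth = normal_ref + normal_alt
    tumor_depth = tumor_ref + tumor_alt
    if normal_depth == 0 or tumor_depth == 0:
        return "no_call"
    in_normal = normal_alt / normal_depth >= min_vaf
    in_tumor = tumor_alt / tumor_depth >= min_vaf
    if not in_normal and not in_tumor:
        return "reference"
    p = fisher_exact_two_sided(normal_ref, normal_alt, tumor_ref, tumor_alt)
    if p < alpha:
        # Allele fractions differ significantly between the two samples.
        return "LOH" if in_normal else "somatic"
    # No significant difference: shared variant, or too little evidence.
    return "germline" if in_normal else "unknown"


print(classify_site(30, 0, 20, 10))   # variant seen only in the tumor
print(classify_site(15, 15, 14, 16))  # similar allele fraction in both samples
print(classify_site(15, 15, 2, 28))   # heterozygous normal, skewed tumor
```

The exact test is written out with `math.comb` so the sketch has no third-party dependencies; a statistics library would be a more natural choice at realistic sequencing depths.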

North Carolina macular dystrophy (MCDR1) caused by a novel tandem duplication of the PRDM13 gene

Author: Birch David G
Blanton Susan H
Bowne Sara J
Daiger Stephen P
Fulton Robert S
Jones Kaylie D
Koboldt Daniel C
Locke Kirsten G
Sullivan Lori S
Wheaton Dianna K
Wilson Richard K
Publication venue: Digital Commons@Becker
Publication date: 01/01/2016

PURPOSE: To identify the underlying cause of disease in a large family with North Carolina macular dystrophy (NCMD). METHODS: A large four-generation family (RFS355) with an autosomal dominant form of NCMD was ascertained. Family members underwent comprehensive visual function evaluations. Blood or saliva from six affected family members and three unaffected spouses was collected and DNA tested for linkage to the MCDR1 locus on chromosome 6q12. Three affected family members and two unaffected spouses underwent whole exome sequencing (WES) and subsequently, custom capture of the linkage region followed by next-generation sequencing (NGS). Standard PCR and dideoxy sequencing were used to further characterize the mutation. RESULTS: Of the 12 eyes examined in six affected individuals, all but two had Gass grade 3 macular degeneration features. Large central excavation of the retinal and choroid layers, referred to as a macular caldera, was seen in an age-independent manner in the grade 3 eyes. The calderas are unique to affected individuals with MCDR1. Genome-wide linkage mapping and haplotype analysis of markers from the chromosome 6q region were consistent with linkage to the MCDR1 locus. Whole exome sequencing and custom-capture NGS failed to reveal any rare coding variants segregating with the phenotype. Analysis of the custom-capture NGS sequencing data for copy number variants uncovered a tandem duplication of approximately 60 kb on chromosome 6q. This region contains two genes, CCNC and PRDM13. The duplication creates a partial copy of CCNC and a complete copy of PRDM13. The duplication was found in all affected members of the family and is not present in any unaffected members. The duplication was not seen in 200 ethnically matched normal chromosomes. 
CONCLUSIONS: The cause of disease in the original family with MCDR1 and several others has been recently reported to be dysregulation of the PRDM13 gene, caused by either single base substitutions in a DNase 1 hypersensitive site upstream of the CCNC and PRDM13 genes or a tandem duplication of the PRDM13 gene. The duplication found in the RFS355 family is distinct from the previously reported duplication and provides additional support that dysregulation of PRDM13, not CCNC, is the cause of NCMD mapped to the MCDR1 locus.

Re-sequencing expands our understanding of the phenotypic impact of variants at GWAS loci

Author: Ding Li
et al
Fronick Catrina
Fulton Lucinda L
Fulton Robert S
Koboldt Daniel C
Larson David E
Lin Ling
Magrini Vincent
McLellan Michael D
O'Laughlin Michele
Wilson Richard K
Zhang Qunyuan
Publication venue: Digital Commons@Becker
Publication date: 01/01/2014