    SMaSH: A Benchmarking Toolkit for Human Genome Variant Calling

    Motivation: Computational methods are essential to extract actionable information from raw sequencing data, and to thus fulfill the promise of next-generation sequencing technology. Unfortunately, computational tools developed to call variants from human sequencing data disagree on many of their predictions, and current methods to evaluate accuracy and computational performance are ad-hoc and incomplete. Agreement on benchmarking variant calling methods would stimulate development of genomic processing tools and facilitate communication among researchers. Results: We propose SMaSH, a benchmarking methodology for evaluating human genome variant calling algorithms. We generate synthetic datasets, organize and interpret a wide range of existing benchmarking data for real genomes, and propose a set of accuracy and computational performance metrics for evaluating variant calling methods on this benchmarking data. Moreover, we illustrate the utility of SMaSH to evaluate the performance of some leading single nucleotide polymorphism (SNP), indel, and structural variant calling algorithms. Availability: We provide free and open access online to the SMaSH toolkit, along with detailed documentation, at smash.cs.berkeley.edu
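
As a rough illustration of the kind of accuracy metric such a benchmark reports, the sketch below scores a call set against a truth set by exact matching. The `Variant` type, the exact-match rule, and the example variants are assumptions made for illustration; they are not taken from the SMaSH toolkit, which may normalize and compare variants differently.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a SMaSH data structure)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def precision_recall(called: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Return (precision, recall) using exact-match comparison of variants."""
    true_positives = len(called & truth)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Made-up example data: three true variants, four calls, two of them correct.
truth = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr1", 2000, "C", "T"),
    Variant("chr2", 500, "G", "GA"),
}
called = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr2", 500, "G", "GA"),
    Variant("chr2", 900, "T", "C"),
    Variant("chr3", 42, "AT", "A"),
}
precision, recall = precision_recall(called, truth)
```

Exact matching is the simplest possible comparison; indels and structural variants usually need position tolerance or normalization before they can be compared fairly.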

    Colorectal cancer linkage on chromosomes 4q21, 8q13, 12q24, and 15q22

    A substantial proportion of familial colorectal cancer (CRC) is not a consequence of known susceptibility loci, such as mismatch repair (MMR) genes, supporting the existence of additional loci. To identify novel CRC loci, we conducted a genome-wide linkage scan in 356 white families with no evidence of defective MMR (i.e., no loss of tumor expression of MMR proteins, no microsatellite instability (MSI)-high tumors, or no evidence of linkage to MMR genes). Families were ascertained via the Colon Cancer Family Registry multi-site NCI-supported consortium (Colon CFR), the City of Hope Comprehensive Cancer Center, and Memorial University of Newfoundland. A total of 1,612 individuals (average 5.0 per family including 2.2 affected) were genotyped using genome-wide single nucleotide polymorphism linkage arrays; parametric and non-parametric linkage analysis used MERLIN in a priori-defined family groups. Five lod scores greater than 3.0 were observed assuming heterogeneity. The greatest were among families with mean age of diagnosis less than 50 years at 4q21.1 (dominant HLOD = 4.51, α = 0.84, 145.40 cM, rs10518142) and among all families at 12q24.32 (dominant HLOD = 3.60, α = 0.48, 285.15 cM, rs952093). Among families with four or more affected individuals and among clinic-based families, a common peak was observed at 15q22.31 (101.40 cM, rs1477798; dominant HLOD = 3.07, α = 0.29; dominant HLOD = 3.03, α = 0.32, respectively). Analysis of families with only two affected individuals yielded a peak at 8q13.2 (recessive HLOD = 3.02, α = 0.51, 132.52 cM, rs1319036). These previously unreported linkage peaks demonstrate the continued utility of family-based data in complex traits and suggest that new CRC risk alleles remain to be elucidated. © 2012 Cicek et al

    Germline mutations in PMS2 and MLH1 in individuals with solitary loss of PMS2 expression in colorectal carcinomas from the Colon Cancer Family Registry Cohort

    Immunohistochemistry for DNA mismatch repair proteins is used to screen for Lynch syndrome in individuals with colorectal carcinoma (CRC). Although solitary loss of PMS2 expression is indicative of carrying a germline mutation in PMS2, previous studies reported MLH1 mutations in some cases. We determined the prevalence of MLH1 germline mutations in a large cohort of individuals with a CRC demonstrating solitary loss of PMS2 expression.

    Colorectal and other cancer risks for carriers and noncarriers from families with a DNA mismatch repair gene mutation: A Prospective Cohort Study

    We sought to determine whether cancer risks for carriers and noncarriers from families with a mismatch repair (MMR) gene mutation are increased above the risks of the general population. We prospectively followed a cohort of 446 unaffected carriers of an MMR gene mutation (MLH1, n = 161; MSH2, n = 222; MSH6, n = 47; and PMS2, n = 16) and 1,029 of their unaffected relatives who did not carry a mutation, every 5 years at recruitment centers of the Colon Cancer Family Registry. For comparison of cancer risk with the general population, we estimated country-, age-, and sex-specific standardized incidence ratios (SIRs) of cancer for carriers and noncarriers. Over a median follow-up of 5 years, mutation carriers had an increased risk of colorectal cancer (CRC; SIR, 20.48; 95% CI, 11.71 to 33.27; P < .001), endometrial cancer (SIR, 30.62; 95% CI, 11.24 to 66.64; P < .001), ovarian cancer (SIR, 18.81; 95% CI, 3.88 to 54.95; P < .001), renal cancer (SIR, 11.22; 95% CI, 2.31 to 32.79; P < .001), pancreatic cancer (SIR, 10.68; 95% CI, 2.68 to 47.70; P = .001), gastric cancer (SIR, 9.78; 95% CI, 1.18 to 35.30; P = .009), urinary bladder cancer (SIR, 9.51; 95% CI, 1.15 to 34.37; P = .009), and female breast cancer (SIR, 3.95; 95% CI, 1.59 to 8.13; P = .001). We found no evidence of their noncarrier relatives having an increased risk of any cancer, including CRC (SIR, 1.02; 95% CI, 0.33 to 2.39; P = .97). We confirmed that carriers of an MMR gene mutation were at increased risk of a wide variety of cancers, including some cancers not previously recognized as being a result of MMR mutations, and found no evidence of an increased risk of cancer for their noncarrier relatives.
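
For readers unfamiliar with the statistic, a standardized incidence ratio is the number of cancers observed in the cohort divided by the number expected from general-population rates. The sketch below computes an SIR with an approximate 95% confidence interval using Byar's approximation to the Poisson limits. The abstract does not say how its intervals were derived, so the choice of Byar's method is an assumption, and the counts in the example are hypothetical rather than taken from the study.

```python
import math
from typing import Tuple


def sir_with_ci(observed: int, expected: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (SIR, lower, upper) using Byar's approximation to the Poisson CI.

    observed: number of cancers seen in the cohort
    expected: number predicted from population rates (must be positive)
    z: normal quantile for the desired confidence level (1.96 for 95%)
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    sir = observed / expected
    if observed == 0:
        lower = 0.0  # the lower limit is zero when nothing was observed
    else:
        lower = observed * (1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))) ** 3 / expected
    shifted = observed + 1
    upper = shifted * (1 - 1 / (9 * shifted) + z / (3 * math.sqrt(shifted))) ** 3 / expected
    return sir, lower, upper


# Hypothetical counts (not from the study): 10 cancers observed, 2.0 expected.
sir, lower, upper = sir_with_ci(observed=10, expected=2.0)
```

With few observed events the interval is wide and asymmetric, which is consistent with the broad intervals reported above for the rarer cancers.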

    Linkage to chromosome 2q32.2-q33.3 in familial serrated neoplasia (Jass syndrome)

    Causative genetic variants have to date been identified for only a small proportion of familial colorectal cancer (CRC). While conditions such as Familial Adenomatous Polyposis and Lynch syndrome have well defined genetic causes, the search for variants underlying the remainder of familial CRC is plagued by genetic heterogeneity. The recent identification of families with a heritable predisposition to malignancies arising through the serrated pathway (familial serrated neoplasia or Jass syndrome) provides an opportunity to study a subset of familial CRC in which heterogeneity may be greatly reduced. A genome-wide linkage screen was performed on a large family displaying a dominantly-inherited predisposition to serrated neoplasia genotyped using the Affymetrix GeneChip Human Mapping 10 K SNP Array. Parametric and nonparametric analyses were performed and resulting regions of interest, as well as previously reported CRC susceptibility loci at 3q22, 7q31 and 9q22, were followed up by finemapping in 10 serrated neoplasia families. Genome-wide linkage analysis revealed regions of interest at 2p25.2-p25.1, 2q24.3-q37.1 and 8p21.2-q12.1. Finemapping linkage and haplotype analyses identified 2q32.2-q33.3 as the region most likely to harbour linkage, with heterogeneity logarithm of the odds (HLOD) 2.09 and nonparametric linkage (NPL) score 2.36 (P = 0.004). Five primary candidate genes (CFLAR, CASP10, CASP8, FZD7 and BMPR2) were sequenced and no segregating variants identified. There was no evidence of linkage to previously reported loci on chromosomes 3, 7 and 9

    Telomere structure and maintenance gene variants and risk of five cancer types.

    Telomeres cap chromosome ends, protecting them from degradation, double-strand breaks, and end-to-end fusions. Telomeres are maintained by telomerase, a reverse transcriptase encoded by TERT, and an RNA template encoded by TERC. Loci in the TERT and adjoining CLPTM1L region are associated with risk of multiple cancers. We therefore investigated associations between variants in 22 telomere structure and maintenance gene regions and colorectal, breast, prostate, ovarian, and lung cancer risk. We performed subset-based meta-analyses of 204,993 directly-measured and imputed SNPs among 61,851 cancer cases and 74,457 controls of European descent. Independent associations for SNP minor alleles were identified using sequential conditional analysis (with gene-level p value cutoffs ≤3.08 × 10⁻⁵). Of the thirteen independent SNPs observed to be associated with cancer risk, novel findings were observed for seven loci. Across the DCLRE1B region, rs974494 and rs12144215 were inversely associated with prostate and lung cancers, and colorectal, breast, and prostate cancers, respectively. Across the TERC region, rs75316749 was positively associated with colorectal, breast, ovarian, and lung cancers. Across the DCLRE1B region, rs974404 and rs12144215 were inversely associated with prostate and lung cancers, and colorectal, breast, and prostate cancers, respectively. Near POT1, rs116895242 was inversely associated with colorectal, ovarian, and lung cancers, and RTEL1 rs34978822 was inversely associated with prostate and lung cancers. The complex association patterns in telomere-related genes across cancer types may provide insight into mechanisms through which telomere dysfunction in different tissues influences cancer risk. Funding for the iCOGS infrastructure came from: the European Community’s Seventh Framework Programme under grant agreement n° 223175 (HEALTH-F2-2009-223175) (COGS), Cancer Research UK (C1287/A10118, C1287/A10710, C12292/A11174, C1281/A12014, C5047/A8384, C5047/A15007, C5047/A10692, C8197/A16565), the National Institutes of Health (CA128978) and Post-Cancer GWAS initiative (1U19 CA148537, 1U19 CA148065 and 1U19 CA148112 – the GAME-ON initiative), the Department of Defense (W81XWH-10-1-0341), the Canadian Institutes of Health Research (CIHR) for the CIHR Team in Familial Risks of Breast Cancer, Komen Foundation for the Cure, the Breast Cancer Research Foundation, and the Ovarian Cancer Research Fund. This is the author accepted manuscript. The final version is available from Wiley via http://dx.doi.org/10.1002/ijc.3028

    Age- and Tumor Subtype-Specific Breast Cancer Risk Estimates for CHEK2*1100delC Carriers.

    PURPOSE: CHEK2*1100delC is a well-established breast cancer risk variant that is most prevalent in European populations; however, there are limited data on risk of breast cancer by age and tumor subtype, which limits its usefulness in breast cancer risk prediction. We aimed to generate tumor subtype- and age-specific risk estimates by using data from the Breast Cancer Association Consortium, including 44,777 patients with breast cancer and 42,997 controls from 33 studies genotyped for CHEK2*1100delC. PATIENTS AND METHODS: CHEK2*1100delC genotyping was mostly done by a custom Taqman assay. Breast cancer odds ratios (ORs) for CHEK2*1100delC carriers versus noncarriers were estimated by using logistic regression and adjusted for study (categorical) and age. Main analyses included patients with invasive breast cancer from population- and hospital-based studies. RESULTS: Proportions of heterozygous CHEK2*1100delC carriers in controls, in patients with breast cancer from population- and hospital-based studies, and in patients with breast cancer from familial- and clinical genetics center-based studies were 0.5%, 1.3%, and 3.0%, respectively. The estimated OR for invasive breast cancer was 2.26 (95% CI, 1.90 to 2.69; P = 2.3 × 10⁻²⁰). The OR was higher for estrogen receptor (ER)-positive disease (2.55 [95% CI, 2.10 to 3.10; P = 4.9 × 10⁻²¹]) than it was for ER-negative disease (1.32 [95% CI, 0.93 to 1.88; P = .12]; P interaction = 9.9 × 10⁻⁴). The OR significantly declined with attained age for breast cancer overall (P = .001) and for ER-positive tumors (P = .001). Estimated cumulative risks for development of ER-positive and ER-negative tumors by age 80 in CHEK2*1100delC carriers were 20% and 3%, respectively, compared with 9% and 2%, respectively, in the general population of the United Kingdom. CONCLUSION: These CHEK2*1100delC breast cancer risk estimates provide a basis for incorporating CHEK2*1100delC into breast cancer risk prediction models and into guidelines for intensified screening and follow-up. Funding: NIH.
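
To make the odds-ratio figures concrete, the sketch below computes a crude (unadjusted) odds ratio from the two rounded carrier proportions quoted in the abstract: 0.5% in controls and 1.3% in patients from population- and hospital-based studies. This is a simplification and not the authors' method, which used logistic regression adjusted for study and age and analyzed invasive cancers only. The crude value of roughly 2.6 is therefore expected to differ from the reported 2.26. The function names are illustrative.

```python
def odds(proportion: float) -> float:
    """Convert a proportion strictly between 0 and 1 into odds."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return proportion / (1.0 - proportion)


def crude_odds_ratio(p_cases: float, p_controls: float) -> float:
    """Unadjusted odds ratio of carrying the variant, cases versus controls."""
    return odds(p_cases) / odds(p_controls)


# Rounded carrier proportions as quoted in the abstract.
carrier_freq_controls = 0.005  # 0.5% of controls
carrier_freq_cases = 0.013     # 1.3% of population- and hospital-based patients

crude_or = crude_odds_ratio(carrier_freq_cases, carrier_freq_controls)
```

Because the odds ratio is symmetric, the odds of carrying the variant in cases versus controls equal the odds of disease in carriers versus noncarriers, which is why carrier frequencies alone are enough for a crude estimate.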