
    Clinical validity assessment of a breast cancer risk model combining genetic and clinical information

    _Background:_ The extent to which common genetic variation can assist in breast cancer (BCa) risk assessment is unclear. We assessed the addition of risk information from a panel of BCa-associated single nucleotide polymorphisms (SNPs) on risk stratification offered by the Gail Model.

_Methods:_ We selected 7 validated SNPs from the literature and genotyped them among white women in a nested case-control study within the Women’s Health Initiative Clinical Trial. To model SNP risk, previously published odds ratios were combined multiplicatively. To produce a combined clinical/genetic risk, Gail Model risk estimates were multiplied by combined SNP odds ratios. We assessed classification performance using reclassification tables and receiver operating characteristic (ROC) curves. 
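
The multiplicative combination described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the example values are hypothetical, and the sketch omits any normalization of odds ratios to a population average, which the abstract does not describe.

```python
from math import prod

def combined_snp_odds_ratio(per_snp_odds_ratios):
    """Multiply per-SNP odds ratios into one combined SNP odds ratio."""
    return prod(per_snp_odds_ratios)

def combined_risk(gail_risk, per_snp_odds_ratios):
    """Scale a Gail Model risk estimate by the combined SNP odds ratio."""
    return gail_risk * combined_snp_odds_ratio(per_snp_odds_ratios)

# Hypothetical values for illustration only; none are taken from the study.
example_odds_ratios = [1.2, 0.9, 1.1]
example_gail_risk = 0.015  # a 1.5% five-year risk
print(combined_risk(example_gail_risk, example_odds_ratios))
```

With no SNP information the combined odds ratio is 1, so the combined risk falls back to the Gail estimate.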

_Results:_ The SNP risk score was well calibrated and nearly independent of Gail risk, and the combined predictor was more predictive than either Gail risk or SNP risk alone. In ROC curve analysis, the combined score had an area under the curve (AUC) of 0.594 compared to 0.557 for Gail risk alone. For reclassification with 5-year risk thresholds at 1.5% and 2%, the net reclassification index (NRI) was 0.085 (Z = 4.3, P = 1.0×10^−5^). Focusing on women with Gail 5-year risk of 1.5–2% results in an NRI of 0.195 (Z = 3.8, P = 8.6×10^−5^).
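
For readers unfamiliar with the NRI, the sketch below shows one common way to compute a categorical net reclassification index with the 1.5% and 2% thresholds mentioned above. It is an assumption-laden illustration rather than the study's analysis: the function names, the handling of risks that fall exactly on a threshold, and the toy data are all hypothetical.

```python
def risk_category(risk, thresholds=(0.015, 0.02)):
    """Return 0 below 1.5%, 1 from 1.5% up to (not including) 2%, 2 at 2% or above.

    Treating a risk exactly on a threshold as the higher band is an assumption.
    """
    return sum(risk >= t for t in thresholds)

def net_reclassification_index(old_risks, new_risks, is_case):
    """Categorical NRI: net upward moves among cases plus net downward moves among controls.

    Requires at least one case and one control.
    """
    up_case = down_case = up_ctrl = down_ctrl = 0
    n_case = n_ctrl = 0
    for old, new, case in zip(old_risks, new_risks, is_case):
        move = risk_category(new) - risk_category(old)
        if case:
            n_case += 1
            up_case += move > 0
            down_case += move < 0
        else:
            n_ctrl += 1
            up_ctrl += move > 0
            down_ctrl += move < 0
    return (up_case - down_case) / n_case + (down_ctrl - up_ctrl) / n_ctrl

# Toy data, purely illustrative: two cases followed by two controls.
old = [0.016, 0.016, 0.016, 0.016]
new = [0.021, 0.016, 0.014, 0.021]
case_flags = [True, True, False, False]
print(net_reclassification_index(old, new, case_flags))
```

A positive value means the new model moves cases up and controls down more often than the reverse.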

_Conclusions:_ Combining clinical risk factors and validated common genetic risk factors results in improvement in classification of BCa risks in white, postmenopausal women. This may have implications for informing primary prevention and/or screening strategies. Future research should assess the clinical utility of such strategies.

    Automated SNP genotype clustering algorithm to improve data completeness in high-throughput SNP genotyping datasets from custom arrays

    High-throughput SNP genotyping platforms use automated genotype calling algorithms to assign genotypes. While these algorithms work efficiently for individual platforms, they are not compatible with other platforms, and have individual biases that result in missed genotype calls. Here we present data on the use of a second complementary SNP genotype clustering algorithm. The algorithm was originally designed for individual fluorescent SNP genotyping assays, and has been optimized to permit the clustering of large datasets generated from custom-designed Affymetrix SNP panels. In an analysis of data from a 3K array genotyped on 1,560 samples, the additional analysis increased the overall number of genotypes by over 45,000, significantly improving the completeness of the experimental data. This analysis suggests that the use of multiple genotype calling algorithms may be advisable in high-throughput SNP genotyping experiments. The software is written in Perl and is available from the corresponding author.

    Accuracy and responses of genomic selection on key traits in apple breeding

    The application of genomic selection in fruit tree crops is expected to enhance breeding efficiency by increasing prediction accuracy, increasing selection intensity and decreasing generation interval. The objectives of this study were to assess the accuracy of prediction and selection response in commercial apple breeding programmes for key traits. The training population comprised 977 individuals derived from 20 pedigreed full-sib families. Historic phenotypic data were available on 10 traits related to productivity and fruit external appearance and genotypic data for 7829 SNPs obtained with an Illumina 20K SNP array. From these data, a genome-wide prediction model was built and subsequently used to calculate genomic breeding values of five application full-sib families. The application families had genotypes at 364 SNPs from a dedicated 512 SNP array, and these genotypic data were extended to the high-density level by imputation. These five families were phenotyped for 1 year and their phenotypes were compared to the predicted breeding values. Accuracy of genomic prediction across the 10 traits reached a maximum value of 0.5 and had a median value of 0.19. The accuracies were strongly affected by the phenotypic distribution and heritability of traits. In the largest family, significant selection response was observed for traits with high heritability and symmetric phenotypic distribution. Traits that showed non-significant response often had reduced and skewed phenotypic variation or low heritability. Among the five application families the accuracies were uncorrelated to the degree of relatedness to the training population. The results underline the potential of genomic prediction to accelerate breeding progress in outbred fruit tree crops that still need to overcome long generation intervals and extensive phenotyping costs.
    Muranty, H.; Troggio, M.; Sadok, I.B.; Mehdi, A.R.; Auwerkerken, A.; Banchi, E.; Velasco, R.; Stevanato, P.; Eric van de Weg, W.; Di Guardo, M.; Kumar, S.; Laurens, F.; Bink, M.C.A.M.
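
The abstract compares observed phenotypes with predicted breeding values but does not spell out the accuracy statistic. A common convention, assumed here, is the Pearson correlation between the two. The sketch below illustrates that convention only; the function name and inputs are hypothetical and the study may have used a different or adjusted measure.

```python
from math import sqrt

def prediction_accuracy(predicted, observed):
    """Pearson correlation between predicted breeding values and observed phenotypes.

    Assumes equal-length inputs with non-zero variance in both.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / sqrt(var_p * var_o)

# Toy values for illustration only; not data from the study.
print(prediction_accuracy([0.1, 0.4, 0.3, 0.8], [1.0, 1.5, 1.1, 2.0]))
```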

    Fast and accurate imputation of summary statistics enhances evidence of functional enrichment

    Imputation using external reference panels is a widely used approach for increasing power in GWAS and meta-analysis. Existing HMM-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants (increasing to 87% (60%) when summary LD information is available from target samples) versus 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and is computationally very fast. As an empirical demonstration, we apply our method to 7 case-control phenotypes from the WTCCC data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ² association statistics) compared to HMM-based imputation from individual-level genotypes at the 227 (176) published SNPs in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of 4 lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic vs. non-genic loci for these traits, as compared to an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Comment: 32 pages, 4 figures
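
The idea behind Gaussian imputation of summary statistics can be written as a conditional mean of a multivariate normal. The expression below is a sketch under stated assumptions, not a formula quoted from the abstract: it assumes z-scores at typed SNPs $t$ and an untyped SNP $i$ are jointly normal with covariance given by the reference-panel LD matrix $\Sigma$, and it uses a ridge term $\lambda I$ as one plausible way to account for the limited size of the reference panel. All symbols are notation introduced here.

```latex
\hat{z}_i \;=\; \Sigma_{i,t}\,\bigl(\Sigma_{t,t} + \lambda I\bigr)^{-1} z_t
```

Here $z_t$ is the vector of observed association z-scores at the typed SNPs, $\Sigma_{t,t}$ is the LD among typed SNPs, and $\Sigma_{i,t}$ is the LD between the untyped SNP and the typed SNPs.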

    Accurate estimation of SNP-heritability from biobank-scale data irrespective of genetic architecture.

    SNP-heritability is a fundamental quantity in the study of complex traits. Recent studies have shown that existing methods to estimate genome-wide SNP-heritability can yield biases when their assumptions are violated. While various approaches have been proposed to account for frequency- and linkage disequilibrium (LD)-dependent genetic architectures, it remains unclear which estimates reported in the literature are reliable. Here we show that genome-wide SNP-heritability can be accurately estimated from biobank-scale data irrespective of genetic architecture, without specifying a heritability model or partitioning SNPs by allele frequency and/or LD. We show analytically and through extensive simulations starting from real genotypes (UK Biobank, N = 337 K) that, unlike existing methods, our closed-form estimator is robust across a wide range of architectures. We provide estimates of SNP-heritability for 22 complex traits in the UK Biobank and show that, consistent with our results in simulations, existing biobank-scale methods yield estimates up to 30% different from our theoretically justified approach.

    Genetics of callous-unemotional behavior in children

    Callous-unemotional behavior (CU) is currently under consideration as a subtyping index for conduct disorder diagnosis. Twin studies routinely estimate the heritability of CU as greater than 50%. It is now possible to estimate genetic influence using DNA alone from samples of unrelated individuals, not relying on the assumptions of the twin method. Here we use this new DNA method (implemented in a software package called Genome-wide Complex Trait Analysis, GCTA) for the first time to estimate genetic influence on CU. We also report the first genome-wide association (GWA) study of CU as a quantitative trait. We compare these DNA results to those from twin analyses using the same measure and the same community sample of 2,930 children rated by their teachers at ages 7, 9 and 12. GCTA estimates of heritability were near zero, even though twin analysis of CU in this sample confirmed the high heritability of CU reported in the literature, and even though GCTA estimates of heritability were substantial for cognitive and anthropological traits in this sample. No significant associations were found in GWA analysis, which, like GCTA, only detects additive effects of common DNA variants. The phrase ‘missing heritability’ was coined to refer to the gap between variance associated with DNA variants identified in GWA studies versus twin study heritability. However, GCTA heritability, not twin study heritability, is the ceiling for GWA studies because both GCTA and GWA are limited to the overall additive effects of common DNA variants, whereas twin studies are not. This GCTA ceiling is very low for CU in our study, despite its high twin study heritability estimate. The gap between GCTA and twin study heritabilities will make it challenging to identify genes responsible for the heritability of CU.