125 research outputs found

    Clustering by genetic ancestry using genome-wide SNP data

    Abstract. Background: Population stratification can cause spurious associations in a genome-wide association study (GWAS), and occurs when differences in allele frequencies of single nucleotide polymorphisms (SNPs) are due to ancestral differences between cases and controls rather than the trait of interest. Principal components analysis (PCA) is the established approach to detect population substructure using genome-wide data and to adjust the genetic association for stratification by including the top principal components in the analysis. An alternative solution is genetic matching of cases and controls that requires, however, well defined population strata for appropriate selection of cases and controls. Results: We developed a novel algorithm to cluster individuals into groups with similar ancestral backgrounds based on the principal components computed by PCA. We demonstrate the effectiveness of our algorithm in real and simulated data, and show that matching cases and controls using the clusters assigned by the algorithm substantially reduces population stratification bias. Through simulation we show that the power of our method is higher than adjustment for PCs in certain situations. Conclusions: In addition to reducing population stratification bias and improving power, matching creates a clean dataset free of population stratification which can then be used to build prediction models without including variables to adjust for ancestry. The cluster assignments also allow for the estimation of genetic heterogeneity by examining cluster specific effects.
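The general idea of grouping individuals by their principal component scores can be sketched as follows. This is a minimal illustration, not the algorithm developed in the paper: it assumes simulated genotypes for two hypothetical populations, computes PCA through an SVD, and swaps in plain k-means with a farthest-first initialization as the clustering step. All function names, sample sizes, and allele frequencies here are assumptions made for the example.

```python
import numpy as np

def top_principal_components(genotypes, n_components=2):
    """Return each individual's scores on the top principal components."""
    # Center every SNP column so the PCs describe covariance, not mean allele count.
    centered = genotypes - genotypes.mean(axis=0)
    # SVD of the centered matrix; U * S gives the PC scores per individual.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with a deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(k - 1):
        # Distance from every point to its nearest already-chosen center.
        nearest = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[nearest.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Simulated data (an assumption): 40 individuals from each of two populations,
# 1000 SNPs, with independent allele frequencies per population.
rng = np.random.default_rng(1)
n_snps = 1000
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = rng.uniform(0.1, 0.9, n_snps)
genotypes = np.vstack([
    rng.binomial(2, freq_a, size=(40, n_snps)),  # allele counts 0/1/2
    rng.binomial(2, freq_b, size=(40, n_snps)),
]).astype(float)

scores = top_principal_components(genotypes, n_components=2)
labels = kmeans(scores, k=2)
```

Cases and controls could then be matched within each value of `labels`. Real data would need a principled way to choose the number of clusters, which this sketch does not attempt.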

    Imputation of missing genotypes: an empirical evaluation of IMPUTE

    Abstract. Background: Imputation of missing genotypes is becoming a very popular solution for synchronizing genotype data collected with different microarray platforms, but the effects of ethnic background, subject ascertainment, and amount of missing data on the accuracy of imputation are not well understood. Results: We evaluated the accuracy of the program IMPUTE to generate the genotype data of partially or fully untyped single nucleotide polymorphisms (SNPs). The program uses a model-based approach to imputation that reconstructs the genotype distribution given a set of referent haplotypes and the observed data, and uses this distribution to compute the marginal probability of each missing genotype for each individual subject that is used to impute the missing data. We assembled genome-wide data from five different studies and three different ethnic groups comprising Caucasians, African Americans and Asians. We randomly removed genotype data and then compared the observed genotypes with those generated by IMPUTE. Our analysis shows 97% median accuracy in Caucasian subjects when less than 10% of the SNPs are untyped and missing genotypes are accepted regardless of their posterior probability. The median accuracy increases to 99% when we require 0.95 minimum posterior probability for an imputed genotype to be acceptable. The accuracy decreases to 86% or 94% when subjects are African Americans or Asians. We propose a strategy to improve the accuracy by leveraging the level of admixture in African Americans. Conclusion: Our analysis suggests that IMPUTE is very accurate in samples of Caucasian origin, slightly less accurate in samples of Asian background, and substantially less accurate in samples of admixed background such as African Americans. Sample size and ascertainment do not seem to affect the accuracy of imputation.
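The evaluation described above (mask known genotypes, impute them, and accept a call only above a minimum posterior probability) can be illustrated with a small sketch. It assumes the imputation output is available as a table of posterior probabilities over the three genotypes (0, 1, or 2 copies of an allele); the function name and the toy numbers are hypothetical and do not come from IMPUTE or from the study.

```python
import numpy as np

def imputation_accuracy(true_genotypes, posteriors, min_posterior=0.0):
    """Accuracy of best-guess genotypes among calls that pass the threshold.

    Returns (accuracy, call_rate); accuracy is NaN if no call is accepted.
    """
    true_genotypes = np.asarray(true_genotypes)
    posteriors = np.asarray(posteriors, dtype=float)
    best_guess = posteriors.argmax(axis=1)   # most probable genotype per masked site
    confidence = posteriors.max(axis=1)      # its posterior probability
    accepted = confidence >= min_posterior
    call_rate = float(accepted.mean())
    if not accepted.any():
        return float("nan"), call_rate
    accuracy = float((best_guess[accepted] == true_genotypes[accepted]).mean())
    return accuracy, call_rate

# Toy data (an assumption): five masked genotypes and their posteriors.
truth = [0, 1, 2, 1, 0]
post = [
    [0.98, 0.02, 0.00],
    [0.10, 0.85, 0.05],
    [0.01, 0.03, 0.96],
    [0.55, 0.40, 0.05],  # best guess is 0, but the true genotype is 1
    [0.97, 0.02, 0.01],
]

acc_all, rate_all = imputation_accuracy(truth, post)
acc_strict, rate_strict = imputation_accuracy(truth, post, min_posterior=0.95)
```

In this toy case the stricter threshold raises accuracy from 0.8 to 1.0 while the call rate drops from 1.0 to 0.6, which is the trade-off a posterior probability cutoff introduces.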

    NIA Long Life Family Study: Objectives, design, and heritability of cross-sectional and longitudinal phenotypes

    The NIA Long Life Family Study (LLFS) is a longitudinal, multicenter, multinational, population-based multigenerational family study of the genetic and nongenetic determinants of exceptional longevity and healthy aging. The Visit 1 in-person evaluation (2006-2009) recruited 4,953 individuals from 539 two-generation families, selected from the upper 1% tail of the Family Longevity Selection Score (FLoSS, which quantifies the degree of familial clustering of longevity). Demographic, anthropometric, cognitive, activities of daily living, ankle-brachial index, blood pressure, physical performance, and pulmonary function, along with serum, plasma, lymphocytes, red cells, and DNA, were collected. A Genome Wide Association Scan (GWAS) (Illumina Omni 2.5M chip) followed by imputation was conducted. Visit 2 (2014-2017) repeated all Visit 1 protocols and added carotid ultrasonography of atherosclerotic plaque and wall thickness, additional cognitive testing, and perceived fatigability. On average, LLFS families show healthier aging profiles than reference populations, such as the Framingham Heart Study, at all age/sex groups, for many critical healthy aging phenotypes. However, participants are not uniformly protected. There is considerable heterogeneity among the pedigrees, with some showing exceptional cognition, others showing exceptional grip strength, others exceptional pulmonary function, etc. with little overlap in these families. There is strong heritability for key healthy aging phenotypes, both cross-sectionally and longitudinally, suggesting that at least some of this protection may be genetic. Little of the variance in these heritable phenotypes is explained by the common genome (GWAS + Imputation), which may indicate that rare protective variants for specific phenotypes may be running in selected families.

    Meta-analysis of genetic variants associated with human exceptional longevity

    Despite evidence from family studies that there is a strong genetic influence upon exceptional longevity, relatively few genetic variants have been associated with this trait. One reason could be that many genes individually have such weak effects that they cannot meet standard thresholds of genome-wide significance, but as a group in specific combinations of genetic variations, they can have a strong influence. Previously we reported that such genetic signatures of 281 genetic markers associated with about 130 genes can do a relatively good job of differentiating centenarians from non-centenarians, particularly if the centenarians are 106 years and older. This would support our hypothesis that the genetic influence upon exceptional longevity increases with older and older (and rarer) ages. We investigated this list of markers using similar genetic data from 5 studies of centenarians from the USA, Europe and Japan. The results from the meta-analysis show that many of these variants are associated with survival to these extreme ages in other studies. Since many centenarians compress morbidity and disability towards the end of their lives, these results could point to biological pathways and therefore new therapeutics to increase years of healthy lives in the general population.

    Protein signatures of centenarians and their offspring suggest centenarians age slower than other humans

    Using samples from the New England Centenarian Study (NECS), we sought to characterize the serum proteome of 77 centenarians, 82 centenarians' offspring, and 65 age-matched controls of the offspring (mean ages: 105, 80, and 79 years). We identified 1312 proteins that significantly differ between centenarians and their offspring and controls (FDR < 1%), and two different protein signatures that predict longer survival in centenarians and in younger people. By comparing the centenarian signature with 2 independent proteomic studies of aging, we replicated the association of 484 proteins of aging and we identified two serum protein signatures that are specific to extreme old age. The data suggest that centenarians acquire similar aging signatures as seen in younger cohorts that have short survival periods, suggesting that they do not escape normal aging markers, but rather acquire them much later than usual. For example, centenarian signatures are significantly enriched for senescence-associated secretory phenotypes, consistent with those seen with younger aged individuals, and from this finding, we provide a new list of serum proteins that can be used to measure cellular senescence. Protein co-expression network analysis suggests that a small number of biological drivers may regulate aging and extreme longevity, and that changes in gene regulation may be important to reach extreme old age. This centenarian study thus provides additional signatures that can be used to measure aging and provides specific circulating biomarkers of healthy aging and longevity, suggesting potential mechanisms that could help prolong health and support longevity.

    Genome-wide association study of personality traits in the Long Life Family Study

    Personality traits have been shown to be associated with longevity and healthy aging. In order to discover novel genetic modifiers associated with personality traits as related with longevity, we performed a genome-wide association study (GWAS) on personality factors assessed by NEO-FFI in individuals enrolled in the Long Life Family Study (LLFS), a study of 583 families (N up to 4595) with clustering for longevity in the United States and Denmark. Three SNPs, in almost perfect LD, associated with agreeableness reached genome-wide significance (p < 10^-8) and replicated in an additional sample of 1279 LLFS subjects, although one (rs9650241) failed to replicate and the other two were not available in two independent replication cohorts, the Baltimore Longitudinal Study of Aging and the New England Centenarian Study. Based on 10,000,000 permutations, the empirical p-value of 2 × 10^-7 was observed for the genome-wide significant SNPs. Seventeen SNPs that reached marginal statistical significance in the two previous GWASs (p-value < 10^-4 and 10^-5), were also marginally significantly associated in this study (p-value < 0.05), although none of the associations passed the Bonferroni correction. In addition, we tested age-by-SNP interactions and found some significant associations. Since scores of personality traits in LLFS subjects change in the oldest ages, and genetic factors outweigh environmental factors to achieve extreme ages, these age-by-SNP interactions could be a proxy for complex gene-gene interactions affecting personality traits and longevity.

    Health and function of participants in the Long Life Family Study: A comparison with other cohorts

    Individuals from families recruited for the Long Life Family Study (LLFS) (n = 4559) were examined and compared to individuals from other cohorts to determine whether the recruitment targeting longevity resulted in a cohort of individuals with better health and function. Other cohorts with similar data included the Cardiovascular Health Study, the Framingham Heart Study, and the New England Centenarian Study. Diabetes, chronic pulmonary disease and peripheral artery disease tended to be less common in LLFS probands and offspring compared to similar aged persons in the other cohorts. Pulse pressure and triglycerides were lower, high density lipids were higher, and a perceptual speed task and gait speed were better in LLFS. Age-specific comparisons showed differences that would be consistent with a higher peak, later onset of decline or slower rate of change across age in LLFS participants. These findings suggest several priority phenotypes for inclusion in future genetic analysis to identify loci contributing to exceptional survival.

    Genetic Signatures of Exceptional Longevity in Humans

    Like most complex phenotypes, exceptional longevity is thought to reflect a combined influence of environmental (e.g., lifestyle choices, where we live) and genetic factors. To explore the genetic contribution, we undertook a genome-wide association study of exceptional longevity in 801 centenarians (median age at death 104 years) and 914 genetically matched healthy controls. Using these data, we built a genetic model that includes 281 single nucleotide polymorphisms (SNPs) and discriminated between cases and controls of the discovery set with 89% sensitivity and specificity, and with 58% specificity and 60% sensitivity in an independent cohort of 341 controls and 253 genetically matched nonagenarians and centenarians (median age 100 years). Consistent with the hypothesis that the genetic contribution is largest with the oldest ages, the sensitivity of the model increased in the independent cohort with older and older ages (71% to classify subjects with an age at death>102 and 85% to classify subjects with an age at death>105). For further validation, we applied the model to an additional, unmatched 60 centenarians (median age 107 years) resulting in 78% sensitivity, and 2863 unmatched controls with 61% specificity. The 281 SNPs include the SNP rs2075650 in TOMM40/APOE that reached irrefutable genome wide significance (posterior probability of association = 1) and replicated in the independent cohort. Removal of this SNP from the model reduced the accuracy by only 1%. Further in-silico analysis suggests that 90% of centenarians can be grouped into clusters characterized by different “genetic signatures” of varying predictive values for exceptional longevity. The correlation between 3 signatures and 3 different life spans was replicated in the combined replication sets. The different signatures may help dissect this complex phenotype into sub-phenotypes of exceptional longevity
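Sensitivity and specificity, the two measures used above to report how well the model separates cases from controls, can be computed from true and predicted labels as in the sketch below. The labels are made up for illustration and are not data from the study; the function name is likewise hypothetical.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # cases called cases
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # cases missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # controls called controls
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # controls called cases
    sensitivity = tp / (tp + fn)  # fraction of true cases recovered
    specificity = tn / (tn + fp)  # fraction of true controls recovered
    return sensitivity, specificity

# Toy labels (an assumption): 4 cases followed by 5 controls.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(actual, predicted)
```

Here three of four cases are classified correctly (sensitivity 0.75) and three of five controls (specificity 0.6). Reporting sensitivity separately for older age strata, as the abstract does, amounts to calling the same function on the subset of cases above each age cutoff.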