125 research outputs found

Clustering by genetic ancestry using genome-wide SNP data

Author: Baldwin Clinton T
Hartley Stephen W
Perls Thomas T
Sebastiani Paola
Solovieff Nadia
Steinberg Martin H
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Population stratification can cause spurious associations in a genome-wide association study (GWAS), and occurs when differences in allele frequencies of single nucleotide polymorphisms (SNPs) are due to ancestral differences between cases and controls rather than the trait of interest. Principal components analysis (PCA) is the established approach to detect population substructure using genome-wide data and to adjust the genetic association for stratification by including the top principal components in the analysis. An alternative solution is genetic matching of cases and controls, which, however, requires well-defined population strata for appropriate selection of cases and controls. Results: We developed a novel algorithm to cluster individuals into groups with similar ancestral backgrounds based on the principal components computed by PCA. We demonstrate the effectiveness of our algorithm in real and simulated data, and show that matching cases and controls using the clusters assigned by the algorithm substantially reduces population stratification bias. Through simulation we show that the power of our method is higher than adjustment for PCs in certain situations. Conclusions: In addition to reducing population stratification bias and improving power, matching creates a clean dataset free of population stratification, which can then be used to build prediction models without including variables to adjust for ancestry. The cluster assignments also allow for the estimation of genetic heterogeneity by examining cluster-specific effects.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Imputation of missing genotypes: an empirical evaluation of IMPUTE

Author: Baldwin Clinton T
Chui David HK
Fucharoen Supan
Hartley Stephen W
Perls Thomas T
Sebastiani Paola
Steinberg Martin H
Timofeev Nadia
Zhao Zhenming
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: Imputation of missing genotypes is becoming a very popular solution for synchronizing genotype data collected with different microarray platforms, but the effects of ethnic background, subject ascertainment, and amount of missing data on the accuracy of imputation are not well understood. Results: We evaluated the accuracy of the program IMPUTE to generate the genotype data of partially or fully untyped single nucleotide polymorphisms (SNPs). The program uses a model-based approach to imputation that reconstructs the genotype distribution given a set of referent haplotypes and the observed data, and uses this distribution to compute the marginal probability of each missing genotype for each individual subject, which is then used to impute the missing data. We assembled genome-wide data from five different studies and three different ethnic groups comprising Caucasians, African Americans and Asians. We randomly removed genotype data and then compared the observed genotypes with those generated by IMPUTE. Our analysis shows 97% median accuracy in Caucasian subjects when less than 10% of the SNPs are untyped and missing genotypes are accepted regardless of their posterior probability. The median accuracy increases to 99% when we require 0.95 minimum posterior probability for an imputed genotype to be acceptable. The accuracy decreases to 86% or 94% when subjects are African Americans or Asians, respectively. We propose a strategy to improve the accuracy by leveraging the level of admixture in African Americans. Conclusion: Our analysis suggests that IMPUTE is very accurate in samples of Caucasian origin, slightly less accurate in samples of Asian background, but substantially less accurate in samples of admixed background such as African Americans. Sample size and ascertainment do not seem to affect the accuracy of imputation.

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Learning Bayesian Networks from Correlated Data

Author: Bae Harold
Montano Monty
Monti Stefano
Perls Thomas T.
Sebastiani Paola
Steinberg Martin H.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/06/2016

Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational studies are designed using some form of clustered sampling that introduces correlations between observations within the same cluster, and ignoring this correlation typically inflates the rate of false positive associations. We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. We compare different learning metrics using simulations and illustrate the method in two real examples: an analysis of genetic and non-genetic factors associated with human longevity from a family-based study, and an example of risk factors for complications of sickle cell anemia from a longitudinal study with repeated measures.

Harvard University - DASH

NIA Long Life Family Study: Objectives, design, and heritability of cross-sectional and longitudinal phenotypes

Author: Christensen Kaare
Jiuan Lin Shiow
Kulminski Alexander
Lee Joseph
Newman Anne
Perls Thomas T
Province Michael A
Sebastiani Paola
Wojczynski Mary K
Zmuda Joe M
Publication venue: Digital Commons@Becker
Publication date: 05/11/2021

The NIA Long Life Family Study (LLFS) is a longitudinal, multicenter, multinational, population-based multigenerational family study of the genetic and nongenetic determinants of exceptional longevity and healthy aging. The Visit 1 in-person evaluation (2006-2009) recruited 4,953 individuals from 539 two-generation families, selected from the upper 1% tail of the Family Longevity Selection Score (FLoSS, which quantifies the degree of familial clustering of longevity). Demographic, anthropometric, cognitive, activities of daily living, ankle-brachial index, blood pressure, physical performance, and pulmonary function data, along with serum, plasma, lymphocytes, red cells, and DNA, were collected. A Genome Wide Association Scan (GWAS) (Illumina Omni 2.5M chip) followed by imputation was conducted. Visit 2 (2014-2017) repeated all Visit 1 protocols and added carotid ultrasonography of atherosclerotic plaque and wall thickness, additional cognitive testing, and perceived fatigability. On average, LLFS families show healthier aging profiles than reference populations, such as the Framingham Heart Study, at all age/sex groups, for many critical healthy aging phenotypes. However, participants are not uniformly protected. There is considerable heterogeneity among the pedigrees, with some showing exceptional cognition, others showing exceptional grip strength, others exceptional pulmonary function, etc., with little overlap in these families. There is strong heritability for key healthy aging phenotypes, both cross-sectionally and longitudinally, suggesting that at least some of this protection may be genetic. Little of the variance in these heritable phenotypes is explained by the common genome (GWAS + Imputation), which may indicate that rare protective variants for specific phenotypes may be running in selected families.

Digital Commons@Becker

PubMed Central

Meta-analysis of genetic variants associated with human exceptional longevity

Author: Andersen Stacy L.
Bae Harold
Daw E. Warwick
Hirose Nobuyoshi
Kojima Toshio
Malovini Alberto
Perls Thomas T
Puca Annibale
Schupf Nicole
Sebastiani Paola
Sun Fangui X.
Publication venue: Digital Commons@Becker
Publication date: 01/01/2013

Despite evidence from family studies that there is a strong genetic influence upon exceptional longevity, relatively few genetic variants have been associated with this trait. One reason could be that many genes individually have such weak effects that they cannot meet standard thresholds of genome wide significance, but as a group in specific combinations of genetic variations, they can have a strong influence. Previously we reported that such genetic signatures of 281 genetic markers associated with about 130 genes can do a relatively good job of differentiating centenarians from non-centenarians particularly if the centenarians are 106 years and older. This would support our hypothesis that the genetic influence upon exceptional longevity increases with older and older (and rarer) ages. We investigated this list of markers using similar genetic data from 5 studies of centenarians from the USA, Europe and Japan. The results from the meta-analysis show that many of these variants are associated with survival to these extreme ages in other studies. Since many centenarians compress morbidity and disability towards the end of their lives, these results could point to biological pathways and therefore new therapeutics to increase years of healthy lives in the general population

Digital Commons@Becker

PubMed Central

Learning Bayesian Networks from Correlated Data

Author: Bae Harold
Montano Monty
Monti Stefano
Perls Thomas T.
Sebastiani Paola
Steinberg Martin H.
Publication venue: Nature Publishing Group

ScholarsArchive@OSU

Protein signatures of centenarians and their offspring suggest centenarians age slower than other humans

Author: Andersen Stacy L.
Chandler Kevin B.
Costello Catherine E.
Denis Gerald
Federico Anthony
Ferrucci Luigi
Glass David J.
Gurinovich Anastasia
Jennings Lori
Monti Stefano
Morris Melody
Perls Thomas T.
Sebastiani Paola
Tanaka Toshiko
Publication venue: FIU Digital Commons
Publication date: 01/02/2021

Using samples from the New England Centenarian Study (NECS), we sought to characterize the serum proteome of 77 centenarians, 82 centenarians' offspring, and 65 age-matched controls of the offspring (mean ages: 105, 80, and 79 years). We identified 1312 proteins that significantly differ between centenarians and their offspring and controls (FDR < 1%), and two different protein signatures that predict longer survival in centenarians and in younger people. By comparing the centenarian signature with 2 independent proteomic studies of aging, we replicated the association of 484 proteins of aging and we identified two serum protein signatures that are specific to extreme old age. The data suggest that centenarians acquire similar aging signatures as seen in younger cohorts that have short survival periods, suggesting that they do not escape normal aging markers, but rather acquire them much later than usual. For example, centenarian signatures are significantly enriched for senescence-associated secretory phenotypes, consistent with those seen with younger aged individuals, and from this finding, we provide a new list of serum proteins that can be used to measure cellular senescence. Protein co-expression network analysis suggests that a small number of biological drivers may regulate aging and extreme longevity, and that changes in gene regulation may be important to reach extreme old age. This centenarian study thus provides additional signatures that can be used to measure aging and provides specific circulating biomarkers of healthy aging and longevity, suggesting potential mechanisms that could help prolong health and support longevity.

DigitalCommons@Florida International University

Genome-wide association study of personality traits in the Long Life Family Study

Author: Antonio Terracciano
E Warwick Daw
Harold T Bae
Jenny X Sun
Luigi Ferrucci
Paola Sebastiani
Stacy L Andersen
Thomas T Perls
Publication venue: Digital Commons@Becker
Publication date: 01/01/2013

Personality traits have been shown to be associated with longevity and healthy aging. In order to discover novel genetic modifiers associated with personality traits as related to longevity, we performed a genome-wide association study (GWAS) on personality factors assessed by NEO-FFI in individuals enrolled in the Long Life Family Study (LLFS), a study of 583 families (N up to 4595) with clustering for longevity in the United States and Denmark. Three SNPs, in almost perfect LD, associated with agreeableness reached genome-wide significance (p < 10^-8) and replicated in an additional sample of 1279 LLFS subjects, although one (rs9650241) failed to replicate and the other two were not available in two independent replication cohorts, the Baltimore Longitudinal Study of Aging and the New England Centenarian Study. Based on 10,000,000 permutations, an empirical p-value of 2 × 10^-7 was observed for the genome-wide significant SNPs. Seventeen SNPs that reached marginal statistical significance in the two previous GWASs (p-value < 10^-4 and 10^-5) were also marginally significantly associated in this study (p-value < 0.05), although none of the associations passed the Bonferroni correction. In addition, we tested age-by-SNP interactions and found some significant associations. Since scores of personality traits in LLFS subjects change in the oldest ages, and genetic factors outweigh environmental factors to achieve extreme ages, these age-by-SNP interactions could be a proxy for complex gene-gene interactions affecting personality traits and longevity.

Crossref

Directory of Open Access Journals

Digital Commons@Becker

Frontiers - Publisher Connector

PubMed Central

Health and function of participants in the Long Life Family Study: A comparison with other cohorts

Author: Barral Sandra
Christensen Kaare
Glynn Nancy W.
Hadley Evan
Lee Joseph H.
Mayeux Richard
Newman Anne B.
Perls Thomas T.
Sebastiani Paola
Simonsick Eleanor M.
Taylor Christopher A.
Walston Jeremy D.
Yashin Anatoli I.
Zmuda Joseph M.
Publication venue: Impact Journals LLC
Publication date: 01/01/2011

Individuals from families recruited for the Long Life Family Study (LLFS) (n= 4559) were examined and compared to individuals from other cohorts to determine whether the recruitment targeting longevity resulted in a cohort of individuals with better health and function. Other cohorts with similar data included the Cardiovascular Health Study, the Framingham Heart Study, and the New England Centenarian Study. Diabetes, chronic pulmonary disease and peripheral artery disease tended to be less common in LLFS probands and offspring compared to similar aged persons in the other cohorts. Pulse pressure and triglycerides were lower, high density lipids were higher, and a perceptual speed task and gait speed were better in LLFS. Age-specific comparisons showed differences that would be consistent with a higher peak, later onset of decline or slower rate of change across age in LLFS participants. These findings suggest several priority phenotypes for inclusion in future genetic analysis to identify loci contributing to exceptional survival

PubMed Central

Syddansk Universitets Forskerportal

Genetic Signatures of Exceptional Longevity in Humans

Author: A Nebel
AC Need
AL Price
AM Herskind
Andrew T. DeWan
Annibale Puca
B Devlin
BJ Willcox
C Holscher
C Kooperberg
C Sabatti
CE Yu
Clinton T. Baldwin
CS Bloss
D Harman
D Harold
D Michie
Daniel A. Dworkis
DF Terry
DH Song
DJ Baker
DJ Balding
DJ Hand
Efthymia Melista
F Marroni
F Schachter
G Atzmon
G Lettre
GE Fraser
Greg Gibson
GW Beecham
H Gudmundsson
J Deelen
J Dupuis
J Evert
J Stessman
J Vijg
J Yang
JB Meigs
JC Lambert
Jemma B. Wilk
JH Wang
Josephine Hoh
JW Vaupel
K Christensen
KL Lunetta
Kyle M. Walsh
L Alpert
L Pawlikowska
L Rokach
L Wang
LA Hindorff
M Beekman
M Bonafe
M Eriksson
M Ramoni
M Schoenmaker
M Stephens
Martin H. Steinberg
MD Gray
ME Goddard
MF Ramoni
Monty Montano
N Barzilai
N Pankratz
N Solovieff
Nadia Solovieff
NP Paynter
NR Wray
P Sebastiani
Paola Sebastiani
Q Tan
R Hitt
R Saxena
RA Kerber
RD Young
RF Lane
RG Westendorp
Richard H. Myers
S Hekimi
S Okser
S Purcell
S Raychaudhuri
SG Potkin
SM Purcell
SN Rajpathak
Stacy Andersen
Stephen W. Hartley
T Perls
TH Meuwissen
Thomas T. Perls
TT Perls
X Li
YS Aulchenko
Z Wei
Publication venue: Public Library of Science
Publication date: 01/01/2012

Like most complex phenotypes, exceptional longevity is thought to reflect a combined influence of environmental (e.g., lifestyle choices, where we live) and genetic factors. To explore the genetic contribution, we undertook a genome-wide association study of exceptional longevity in 801 centenarians (median age at death 104 years) and 914 genetically matched healthy controls. Using these data, we built a genetic model that includes 281 single nucleotide polymorphisms (SNPs) and discriminated between cases and controls of the discovery set with 89% sensitivity and specificity, and with 58% specificity and 60% sensitivity in an independent cohort of 341 controls and 253 genetically matched nonagenarians and centenarians (median age 100 years). Consistent with the hypothesis that the genetic contribution is largest with the oldest ages, the sensitivity of the model increased in the independent cohort with older and older ages (71% to classify subjects with an age at death>102 and 85% to classify subjects with an age at death>105). For further validation, we applied the model to an additional, unmatched 60 centenarians (median age 107 years) resulting in 78% sensitivity, and 2863 unmatched controls with 61% specificity. The 281 SNPs include the SNP rs2075650 in TOMM40/APOE that reached irrefutable genome wide significance (posterior probability of association = 1) and replicated in the independent cohort. Removal of this SNP from the model reduced the accuracy by only 1%. Further in-silico analysis suggests that 90% of centenarians can be grouped into clusters characterized by different “genetic signatures” of varying predictive values for exceptional longevity. The correlation between 3 signatures and 3 different life spans was replicated in the combined replication sets. The different signatures may help dissect this complex phenotype into sub-phenotypes of exceptional longevity

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Archivio della Ricerca - Università di Salerno

The Francis Crick Institute