Accuracy of Administratively-Assigned Ancestry for Diverse Populations in an Electronic Medical Record-Linked Biobank
<div><p>Recently, the development of biobanks linked to electronic medical records has presented new opportunities for genetic and epidemiological research. Studies based on these resources, however, present unique challenges, including the accurate assignment of individual-level population ancestry. In this work we examine the accuracy of administratively-assigned race in diverse populations by comparing assigned races to genetically-defined ancestry estimates. Using 220 ancestry informative markers, we generated principal components for patients in our dataset, which were used to cluster patients into groups based on genetic ancestry. Consistent with other studies, we find a strong overall agreement (Kappa = 0.872) between genetic ancestry and assigned race, with higher rates of agreement for African-descent and European-descent assignments, and reduced agreement for Hispanic, East Asian-descent, and South Asian-descent assignments. These results suggest caution when selecting study samples of non-African and non-European backgrounds when administratively-assigned race from biobanks is used.</p></div>
Comparison of administratively-assigned race and genetic ancestry, based on principal component analysis.
<p>A) All pairwise combinations of principal components (PCs) 1 through 3, by administratively-assigned race. B) All pairwise combinations of PCs 1 through 3, by cluster assignments corresponding to genetic ancestry. Comparing Frames 1A and 1B identifies individuals whose administratively-assigned race differs from their genetically defined ancestry cluster. For example, the East Asian-descent cluster (1B; blue) contains individuals with administratively-assigned race (1A) of Caucasian (green), Hispanic (purple), and Other (orange).</p>
Agreement between genetic and assigned ancestry.
<p>Notation: Cohen's Kappa coefficient (standard error).</p><p>South Asian-descent includes individuals with Native American and Indian race codes in BioVU.</p><p>Samples with administratively-assigned race of "Unknown" were excluded from this analysis.</p>
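The Kappa coefficient named in the notation above can be illustrated with a minimal sketch. This is the textbook definition of Cohen's kappa, not the authors' code; the function name `cohens_kappa` and the toy labels are hypothetical and are not data from the study. The standard error reported in the table is not computed here.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of individuals where both labels match.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over categories of the product of marginal proportions.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)

# Hypothetical toy labels (assigned race vs. genetic ancestry cluster).
assigned = ["AFR", "AFR", "EUR", "EUR", "EUR", "HIS", "EAS", "EAS"]
genetic = ["AFR", "AFR", "EUR", "EUR", "EUR", "EUR", "EAS", "EUR"]
kappa = cohens_kappa(assigned, genetic)
print(round(kappa, 3))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which is the scale on which the reported overall value of 0.872 should be read.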
Percentages of each administratively-assigned race assigned to each genetic ancestry group.
<p>Percentages reflect the proportion of individuals assigned to a genetic ancestry cluster for a given administratively-assigned race.</p>
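As a rough illustration of how such row-wise percentages could be tabulated, the sketch below counts cluster membership within each assigned-race label and normalizes each row to 100%. The function name `row_percentages` and the toy labels are hypothetical; this is not the authors' pipeline.

```python
from collections import Counter, defaultdict

def row_percentages(assigned, clusters):
    """For each assigned label, the percent of its individuals in each cluster."""
    counts = defaultdict(Counter)
    for race, cluster in zip(assigned, clusters):
        counts[race][cluster] += 1
    table = {}
    for race, cluster_counts in counts.items():
        total = sum(cluster_counts.values())
        # Each row is normalized by the size of the assigned-race group.
        table[race] = {c: 100.0 * k / total for c, k in cluster_counts.items()}
    return table

# Hypothetical toy data: assigned labels "A"/"B", genetic clusters "X"/"Y".
toy_assigned = ["A", "A", "A", "A", "B", "B"]
toy_clusters = ["X", "X", "X", "Y", "Y", "Y"]
table = row_percentages(toy_assigned, toy_clusters)
for race, row in table.items():
    print(race, row)
```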
Distribution of administratively-assigned race.
<p>Race categories listed are based on classification options originating from the SD. Our BioVU dataset contained no individuals labeled Other (O). Vanderbilt University Medical Center is located in Davidson County, TN. 2010 US census data is shown for Davidson County, Tennessee <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0099161#pone.0099161-US1" target="_blank">[25]</a>. * For Davidson County, "Asian/Pacific" includes Asian (Non-Indian), Native Hawaiian, and Pacific Islander individuals, "Native American" includes Native American (American Indian) and Alaskan Native individuals, "Indian" includes Asian Indian individuals, and "Unknown" includes "some other race" and individuals who reported two or more races for the census. ** "Hispanic" is not listed as a race in the US Census; rather, Hispanic origin is indicated and is not exclusive to any racial category. For example, 25,156 individuals in Davidson County who self-identified as "White" also self-identified, separately, as Hispanic. Within Davidson County, 9.8% of individuals indicated Hispanic origin.</p>
<i>APOE</i> allele frequencies in suspected non-amyloid pathophysiology (SNAP) and the prodromal stages of Alzheimer's Disease
<div><p>Biomarker definitions for preclinical Alzheimer's disease (AD) have identified individuals with neurodegeneration (ND+) without β-amyloidosis (Aβ-) and labeled them with suspected non-AD pathophysiology (SNAP). We evaluated <i>Apolipoprotein E</i> (<i>APOE</i>) ε2 and ε4 allele frequencies across biomarker definitions (Aβ-/ND-, n = 268; Aβ+/ND-, n = 236; Aβ-/ND+ or SNAP, n = 78; Aβ+/ND+, n = 204), hypothesizing that SNAP would have an <i>APOE</i> profile comparable to Aβ-/ND-. Using AD Neuroimaging Initiative data (n = 786, 72±7 years, 48% female), amyloid status (Aβ+ or Aβ-) was defined by cerebrospinal fluid (CSF) Aβ-42 levels, and neurodegeneration status (ND+ or ND-) was defined by hippocampal volume from MRI. Binary logistic regression related biomarker status to <i>APOE</i> ε2 and ε4 allele carrier status, adjusting for age, sex, education, and cognitive diagnosis. Compared to the biomarker negative (Aβ-/ND-) participants, higher proportions of ε4 and lower proportions of ε2 carriers were observed among Aβ+/ND- (ε4: OR = 6.23, p<0.001; ε2: OR = 0.53, p = 0.03) and Aβ+/ND+ participants (ε4: OR = 12.07, p<0.001; ε2: OR = 0.29, p = 0.004). SNAP participants were statistically comparable to biomarker negative participants (p-values>0.30). In supplemental analyses, comparable results were observed when coding SNAP using amyloid imaging and when using CSF tau levels. In contrast to <i>APOE</i>, a polygenic risk score for AD that excluded <i>APOE</i> did not show an association with amyloidosis or neurodegeneration (p-values>0.15), but did show an association with SNAP defined using CSF tau (β = 0.004, p = 0.02). Thus, in a population with low levels of cerebrovascular disease and a lower prevalence of SNAP than the general population, <i>APOE</i> and known genetic drivers of AD do not appear to contribute to the neurodegeneration observed in SNAP. Additional work in population-based samples is needed to better elucidate the genetic contributors to various etiological drivers of SNAP.</p></div>
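The four biomarker groups in the abstract follow from two binary statuses, which the short sketch below makes explicit. The function name `biomarker_group` and the ASCII group labels are hypothetical. The CSF and hippocampal-volume cut-offs are not stated in the abstract, so the sketch takes the two statuses as already-determined booleans. The group sizes are the ones reported above.

```python
def biomarker_group(amyloid_positive, neurodegeneration_positive):
    """Map amyloid (Abeta) and neurodegeneration (ND) status to a group label."""
    if amyloid_positive:
        return "Abeta+/ND+" if neurodegeneration_positive else "Abeta+/ND-"
    # Neurodegeneration without amyloidosis is the SNAP group.
    return "SNAP (Abeta-/ND+)" if neurodegeneration_positive else "Abeta-/ND-"

# Group sizes as reported in the abstract.
group_sizes = {
    "Abeta-/ND-": 268,
    "Abeta+/ND-": 236,
    "SNAP (Abeta-/ND+)": 78,
    "Abeta+/ND+": 204,
}
total = sum(group_sizes.values())
print(total)  # matches the reported sample size, n = 786
print(biomarker_group(False, True))
```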
Associations between biomarker groups and APOE carrier status.
APOE genotypes across AD biomarker.
<p>Pie charts are presented by biomarker group based on amyloid status defined using levels of cerebrospinal fluid amyloid-β 42 (Aβ) and neurodegeneration defined using hippocampal volume (ND). Colors represent APOE genotype whereby gray represents homozygous ε3 allele carriers, shades of red represent ε4 allele carriers, shades of blue represent ε2 allele carriers, and purple is used to represent ε2/ε4 carriers. Sample sizes are presented below the segment label for each allele combination. Allele combinations that do not have any participants within a given biomarker group are labeled in light gray font.</p>
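The color scheme in this caption amounts to a small classification of the six possible APOE genotypes. A minimal sketch, assuming genotypes are encoded as pairs of allele numbers (2, 3, or 4); the function name `apoe_category` and the encoding are hypothetical and are not taken from the paper.

```python
def apoe_category(genotype):
    """Classify an APOE genotype, given as a pair of allele numbers, e.g. (3, 4)."""
    alleles = tuple(genotype)
    if len(alleles) != 2 or any(a not in (2, 3, 4) for a in alleles):
        raise ValueError("genotype must be a pair of alleles drawn from 2, 3, 4")
    has_e2 = 2 in alleles
    has_e4 = 4 in alleles
    if has_e2 and has_e4:
        return "e2/e4 carrier (purple)"
    if has_e4:
        return "e4 carrier (red)"
    if has_e2:
        return "e2 carrier (blue)"
    return "e3/e3 (gray)"

# All six unordered allele combinations.
for g in [(2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]:
    print(g, apoe_category(g))
```

Checking the mixed ε2/ε4 case first keeps those participants out of both single-allele carrier groups, mirroring the separate purple segment in the figure.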