8 research outputs found

    Admixture mapping implicates 13q33.3 as ancestry-of-origin locus for Alzheimer disease in Hispanic and Latino populations

    Alzheimer disease (AD) is the most common form of senile dementia, with high incidence late in life in many populations including Caribbean Hispanic (CH) populations. Such admixed populations, descended from more than one ancestral population, can present challenges for genetic studies, including limited sample sizes and unique analytical constraints. Therefore, CH populations and other admixed populations have not been well represented in studies of AD, and much of the genetic variation contributing to AD risk in these populations remains unknown. Here, we conduct genome-wide analysis of AD in multiplex CH families from the Alzheimer Disease Sequencing Project (ADSP). We developed, validated, and applied an implementation of a logistic mixed model for admixture mapping with binary traits that leverages genetic ancestry to identify ancestry-of-origin loci contributing to AD. We identified three loci on chromosome 13q33.3 associated with reduced risk of AD, where associations were driven by Native American (NAM) ancestry. This AD admixture mapping signal spans the FAM155A, ABHD13, TNFSF13B, LIG4, and MYO16 genes and was supported by evidence for association in an independent sample from the Alzheimer's Genetics in Argentina-Alzheimer Argentina consortium (AGA-ALZAR) study with considerable NAM ancestry. We also provide evidence of NAM haplotypes and key variants within 13q33.3 that segregate with AD in the ADSP whole-genome sequencing data. Interestingly, the widely used genome-wide association study approach failed to identify associations in this region. Our findings underscore the potential of leveraging genetic ancestry diversity in recently admixed populations to improve genetic mapping, in this case for AD-relevant loci.
    Affiliation: Horimoto, Andrea R.V.R. University of Washington; United States
    Affiliation: Boyken, Lisa A. University of Washington; United States
    Affiliation: Blue, Elizabeth E. University of Washington; United States. Brotman Baty Institute for Precision Medicine; United States
    Affiliation: Grinde, Kelsey E. University of Washington; United States. Macalester College; United States
    Affiliation: Nafikov, Rafael A. University of Washington; United States
    Affiliation: Sohi, Harkirat K. University of Washington; United States
    Affiliation: Nato, Alejandro Q. University of Washington; United States. Marshall University; United States
    Affiliation: Bis, Joshua C. University of Washington; United States
    Affiliation: Brusco, Luis Ignacio. Universidad de Buenos Aires. Facultad de Medicina; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Morelli, Laura. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
    Affiliation: Ramirez, Alfredo Jose. University of Cologne; Germany. Universitat Bonn; Germany. German Center for Neurodegenerative Diseases; Germany. University of Texas Health Science Center at San Antonio (UT Health San Antonio); University of Texas at San Antonio. Universidad Nacional Arturo Jauretche. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos. Provincia de Buenos Aires. Ministerio de Salud. Hospital Alta Complejidad en Red El Cruce Dr. Néstor Carlos Kirchner Samic. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos; Argentina
    Affiliation: Dalmasso, Maria Carolina. Universidad Nacional Arturo Jauretche. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos. Provincia de Buenos Aires. Ministerio de Salud. Hospital Alta Complejidad en Red El Cruce Dr. Néstor Carlos Kirchner Samic. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos; Argentina. University of Cologne; Germany
    Affiliation: Temple, Seth. University of Washington; United States
    Affiliation: Satizabal, Claudia. University of Texas Health Science Center at San Antonio (UT Health San Antonio); University of Texas at San Antonio. University of Texas at San Antonio; United States
    Affiliation: Browning, Sharon R. University of Washington; United States
    Affiliation: Seshadri, Sudha. University of Texas Health Science Center at San Antonio (UT Health San Antonio); University of Texas at San Antonio. University of Texas at San Antonio; United States
    Affiliation: Wijsman, Ellen M. University of Washington; United States
    Affiliation: Thornton, Timothy A. University of Washington; United States

    The Milky Way Tomography with SDSS: II. Stellar Metallicity

    Using effective temperature and metallicity derived from SDSS spectra for ~60,000 F and G type main sequence stars (0.2<g-r<0.6), we develop polynomial models for estimating these parameters from the SDSS u-g and g-r colors. We apply this method to SDSS photometric data for about 2 million F/G stars and measure the unbiased metallicity distribution for a complete volume-limited sample of stars at distances between 500 pc and 8 kpc. The metallicity distribution can be exquisitely modeled using two components with a spatially varying number ratio that correspond to disk and halo. The two components also possess the kinematics expected for disk and halo stars. The metallicity of the halo component is spatially invariant, while the median disk metallicity smoothly decreases with distance from the Galactic plane from -0.6 at 500 pc to -0.8 beyond several kpc. The absence of a correlation between metallicity and kinematics for disk stars conflicts with the traditional decomposition in terms of thin and thick disks. We detect coherent substructures in the kinematics--metallicity space, such as the Monoceros stream, which rotates faster than the LSR, and has a median metallicity of [Fe/H]=-0.96, with an rms scatter of only ~0.15 dex. We extrapolate our results to the performance expected from the Large Synoptic Survey Telescope (LSST) and estimate that the LSST will obtain metallicity measurements accurate to 0.2 dex or better, with proper motion measurements accurate to ~0.2 mas/yr, for about 200 million F/G dwarf stars within a distance limit of ~100 kpc (g<23.5). [abridged]
    Comment: 40 pages, 21 figures, emulateApJ style, accepted to ApJ, high resolution figures are available from http://www.astro.washington.edu/ivezic/sdss/mw/astroph0804.385

    PBAP: a pipeline for file processing and quality control of pedigree data with dense genetic markers

    Motivation: Huge genetic datasets with dense marker panels are now common. With the availability of sequence data and recognition of the importance of rare variants, smaller studies based on pedigrees are again common. Pedigree-based samples often start with a dense marker panel, a subset of which may be used for linkage analysis to reduce computational burden and to limit linkage disequilibrium between single-nucleotide polymorphisms (SNPs). Programs attempting to select markers for linkage panels exist but lack flexibility. Results: We developed a pedigree-based analysis pipeline (PBAP) suite of programs geared towards SNPs and sequence data. PBAP performs quality control, marker selection and file preparation. PBAP sets up files for MORGAN, which can handle analyses for small and large pedigrees, typically human, and results can be used with other programs and for downstream analyses. We evaluate and illustrate its features with two real datasets.

    Analysis of individual families implicates noncoding DNA variation and multiple biological pathways in Alzheimer’s disease risk

    Background: Late-onset Alzheimer’s disease (AD) is a complex disorder with multiple genetic risk factors. Linkage and association analyses have mapped dozens of loci in pooled analysis of many pedigrees or large numbers of unrelated cases and controls. Identification of the underlying DNA risk variants in the regions of interest (ROIs) has been complicated by both the genetic heterogeneity and the cost, until recently, of comprehensive DNA sequencing in ROIs. The known loci also leave much heritability unexplained. Method: We used the families in the AD Sequencing Project (ADSP) discovery family sample to identify variants of interest (VOIs) from whole genome sequences (WGS), and through the variants, genes implicated in risk. We used SNP-based multipoint linkage analysis to identify ROIs with rare VOIs, carrying out analysis without trimming pedigrees. We pursued all ROIs with family-specific lodmax scores >1.9, reducing the variants of interest by several filters. We carried out pedigree-based genotype imputation from the available WGS data, followed by family-based association analysis, filtered for low population minor allele frequency. We prioritized genes with a low false-discovery rate for association of single-cell transcription in brain with AD disease state (PMID: 31209304), and genes with high expression in bulk brain (PMID: 24309898). Result: We obtained 46 distinct ROIs representing lodmax 1.9-3.5 per ROI in each of 26 of the 110 ADSP discovery families analyzed. 29 ROIs further investigated in 16 of the families yielded 59 prioritized genes, with 1-11 genes/ROI. Only 4 out of 321 variants that passed all filters in these genes were in exons, with minimal overlap with genes identified in AD GWASs. Only one ROI occurred in two families, with evidence for a shared haplotype between these families, implicating FBXO2 and FBXO44. Both genes are implicated in ubiquitination, while FBXO2 interacts with BACE1. Multiple pathways, both known and new, are implicated, including the ubiquitin-proteasome system, neural development and maintenance, and mitochondrial functions. Conclusion: This analysis underscores the evidence for extensive genetic heterogeneity and rare variants underlying AD risk, along with multiple potential mechanisms. The preponderance of prioritized non-coding variants suggests alterations in gene regulation and/or expression as an aspect of AD genetic risk.

    Quality control and integration of genotypes from two calling pipelines for whole genome sequence data in the Alzheimer's disease sequencing project

    The Alzheimer's Disease Sequencing Project (ADSP) performed whole genome sequencing (WGS) of 584 subjects from 111 multiplex families at three sequencing centers. Genotype calling of single nucleotide variants (SNVs) and insertion-deletion variants (indels) was performed centrally using GATK-HaplotypeCaller and Atlas V2. The ADSP Quality Control (QC) Working Group applied QC protocols to project-level variant call format files (VCFs) from each pipeline, and developed and implemented a novel protocol, termed “consensus calling,” to combine genotype calls from both pipelines into a single high-quality set. QC was applied to autosomal bi-allelic SNVs and indels, and included pipeline-recommended QC filters, variant-level QC, and sample-level QC. Low-quality variants or genotypes were excluded, and sample outliers were noted. Quality was assessed by examining Mendelian inconsistencies (MIs) among 67 parent-offspring pairs, and MIs were used to establish additional genotype-specific filters for GATK calls. After QC, 578 subjects remained. Pipeline-specific QC excluded ~12.0% of GATK and 14.5% of Atlas SNVs. Between pipelines, ~91% of SNV genotypes across all QCed variants were concordant; 4.23% and 4.56% of genotypes were exclusive to Atlas or GATK, respectively; the remaining ~0.01% of discordant genotypes were excluded. For indels, variant-level QC excluded ~36.8% of GATK and 35.3% of Atlas indels. Between pipelines, ~55.6% of indel genotypes were concordant, while 10.3% and 28.3% were exclusive to Atlas or GATK, respectively; and ~0.29% of discordant genotypes were excluded. The final WGS consensus dataset contains 27,896,774 SNVs and 3,133,926 indels and is publicly available.
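
The consensus-calling idea described in this abstract can be illustrated with a minimal Python sketch. It is not the ADSP protocol itself: the function names (`normalize`, `consensus_call`, `is_mendelian_inconsistent`), the VCF-style genotype strings, and the status labels are all assumptions made for illustration. The only rules taken from the abstract are that concordant and pipeline-exclusive genotypes are retained, discordant genotypes are excluded, and Mendelian inconsistencies are examined in parent-offspring pairs.

```python
from typing import Optional, Tuple


def normalize(gt: Optional[str]) -> Optional[Tuple[str, ...]]:
    """Turn a VCF-style genotype string into a sorted allele tuple.

    Returns None for a missing call (None, or any '.' allele).
    Phasing is ignored, so '1|0' and '0/1' compare equal.
    """
    if gt is None:
        return None
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return tuple(sorted(alleles))


def consensus_call(gatk: Optional[str], atlas: Optional[str]) -> Tuple[Optional[str], str]:
    """Combine one genotype from each pipeline (hypothetical rule set).

    Returns (genotype or None, status label).
    """
    g, a = normalize(gatk), normalize(atlas)
    if g is None and a is None:
        return None, "missing"
    if g is None:
        return "/".join(a), "atlas_only"    # exclusive to Atlas: kept
    if a is None:
        return "/".join(g), "gatk_only"     # exclusive to GATK: kept
    if g == a:
        return "/".join(g), "concordant"    # both pipelines agree: kept
    return None, "discordant"               # pipelines disagree: excluded


def is_mendelian_inconsistent(parent: Optional[str], child: Optional[str]) -> bool:
    """Flag a parent-offspring pair that shares no allele at a site.

    A child inherits one allele from each parent, so sharing no allele
    with a parent is inconsistent. Missing calls are not flagged.
    """
    p, c = normalize(parent), normalize(child)
    if p is None or c is None:
        return False
    return not (set(p) & set(c))
```

For example, under these assumed rules `consensus_call("0/0", "0/1")` yields no genotype with the label `"discordant"`, and a `0/0` parent with a `1/1` child is flagged as a Mendelian inconsistency.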

    Human whole-exome genotype data for Alzheimer’s disease

    The heterogeneity of the whole-exome sequencing (WES) data generation methods presents a challenge to a joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer’s Disease Sequencing Project. The joint-genotype called variant call format (VCF) file contains only positions within the union of capture kits. The VCF was then processed specifically to account for the batch effects arising from the use of different capture kits from different studies. We identified 8.2 million autosomal variants. 96.82% of the variants are high-quality, and are located in 28,579 Ensembl transcripts. 41% of the variants are intronic and 1.8% of the variants have CADD > 30, indicating they are of high predicted pathogenicity. Here we show our new strategy can generate high-quality data from processing these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.