6 research outputs found

Human Whole-Exome Genotype Data For Alzheimer's Disease

Author: Brkanac Zoran
Bush William S
Cantwell Laura
Chou Yi-Fan
Clark Kaylyn
Cruchaga Carlos
Destefano Anita
Farrer Lindsay
Gangadharan Prabhakaran
Haines Jonathan
Hamilton-Nelson Kara
Kuzma Amanda B
Lee Wan-Ping
Leung Yuk Yee
Lin Honghuang
Martin Eden
Mayeux Richard P
Naj Adam C
Nicaretta Heather
Pericak-Vance Margaret
Qu Liming
Schellenberg Gerard D
Schmidt Michael
Seshadri Sudha
Valladares Otto
Wang Li-San
Wheeler Nicholas
Publication venue: DigitalCommons@TMC
Publication date: 23/01/2024

The heterogeneity of the whole-exome sequencing (WES) data generation methods presents a challenge to a joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer's Disease Sequencing Project. The joint-genotype called variant-called format (VCF) file contains only positions within the union of capture kits. The VCF was then processed specifically to account for the batch effects arising from the use of different capture kits from different studies. We identified 8.2 million autosomal variants. 96.82% of the variants are high-quality, and are located in 28,579 Ensembl transcripts. 41% of the variants are intronic and 1.8% of the variants have CADD > 30, indicating they are of high predicted pathogenicity. Here we show our new strategy can generate high-quality data from processing these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.

DigitalCommons@The Texas Medical Center

The early-onset Alzheimer's disease whole-genome sequencing project: Study design and methodology

Author: Ayodele Temitope
Baez Penelope
Beecham Gary W.
Bradley Joseph
Crane Paul K.
Cruchaga Carlos
Cuccaro Michael L.
Dalgard Clifton L.
Fernandez Victoria
Jean-Francois Melissa
Kuzma Amanda
Mayeux Richard
Nicaretta Heather
Pericak-Vance Margaret A.
Ray Nicholas R.
Reitz Christiane
Schellenberg Gerard D.
Sims Rebecca
Wang Li-San
Williams Julie
Publication venue: Wiley
Publication date: 30/06/2023

INTRODUCTION Sequencing efforts to identify genetic variants and pathways underlying Alzheimer's disease (AD) have largely focused on late-onset AD although early-onset AD (EOAD), accounting for ∼10% of cases, is largely unexplained by known mutations, resulting in a lack of understanding of its molecular etiology.
METHODS Whole-genome sequencing and harmonization of clinical, neuropathological, and biomarker data of over 5000 EOAD cases of diverse ancestries.
RESULTS A publicly available genomics resource for EOAD with extensive harmonized phenotypes. Primary analysis will (1) identify novel EOAD risk loci and druggable targets; (2) assess local-ancestry effects; (3) create EOAD prediction models; and (4) assess genetic overlap with cardiovascular and other traits.
DISCUSSION This novel resource complements over 50,000 control and late-onset AD samples generated through the Alzheimer's Disease Sequencing Project (ADSP). The harmonized EOAD/ADSP joint call will be available through upcoming ADSP data releases and will allow for additional analyses across the full onset range.
Highlights Sequencing efforts to identify genetic variants and pathways underlying Alzheimer's disease (AD) have largely focused on late-onset AD although early-onset AD (EOAD), accounting for ∼10% of cases, is largely unexplained by known mutations. This results in a significant lack of understanding of the molecular etiology of this devastating form of the disease. The Early-Onset Alzheimer's Disease Whole-genome Sequencing Project is a collaborative initiative to generate a large-scale genomics resource for early-onset Alzheimer's disease with extensive harmonized phenotype data. Primary analyses are designed to (1) identify novel EOAD risk and protective loci and druggable targets; (2) assess local-ancestry effects; (3) create EOAD prediction models; and (4) assess genetic overlap with cardiovascular and other traits. The harmonized genomic and phenotypic data from this initiative will be available through NIAGADS.

Online Research @ Cardiff

University of Miami: Scholarship Miami

The Alzheimer’s Disease Sequencing Project – Follow Up Study (ADSP‐FUS): APOE genotype status and demographic characteristics across datasets

Author: Adams Larry D.
Cuccaro Michael L.
Dalgard Clifton L.
Faber Kelley M.
Foroud Tatiana M.
Inciute Jovita D.
Kunkle Brian W.
Kuzma Amanda B
Martin Eden R.
Mayeux Richard
Mena Pedro R.
Naj Adam C.
Nicaretta Heather Issen
Pericak-Vance Margaret A.
Reyes-Dumeyer Dolly
Schellenberg Gerard D.
Vance Jeffery M.
Vardarajan Badri N
Wang Li‐San
Whitehead Patrice
Zaman Andrew
Publication date: 01/12/2023

Background The ADSP‐FUS is a National Institute on Aging (NIA) initiative focused on identifying genetic risk and protective variants for Alzheimer Disease (AD) by expanding the ADSP beyond non‐Hispanic Whites of European Ancestry (NHW‐EA) populations. Given the lack of diversity in the ADSP, the ADSP‐FUS was designed to whole genome sequence (WGS) existing ethnically diverse and unique cohorts. The upcoming phase, ADSP‐FUS 2.0: The Diverse Population Initiative, focuses on inclusion of Hispanic/Latino (HL), non‐Hispanic Black with African Ancestry (NHB‐AA), and Asian populations.
Methods ADSP‐FUS cohorts consist of studies of AD, dementia, and age‐related conditions. Clinical classifications are assigned based on standard criteria from clinical measures and history, as well as additional neuropathologic data. In addition to production of WGS, genome‐wide array and APOE genotyping is acquired or performed for all ADSP‐FUS samples.
Results The ADSP‐FUS currently consists of 38 cohorts comprising ∼40,000 individuals, with plans to sequence >100,000 individuals from diverse ancestries. Genotyping, sequencing, and clinical adjudication have been performed on 23,428 participants (cases N = 6,961, median age = 73; controls N = 13,007, median age = 72; ADRD N = 3,460, median age = 77). More participants are female (62.3%) than male, and this proportion is similar across cases (61.0%), controls (63.1%), and ADRD (61.8%). As expected, the most prevalent APOE genotype is APOE 3/3 (% by cases/controls for 2/2 = 0.2, 0.4; 2/3 = 4.3, 8.2; 2/4 = 2.2, 1.8; 3/3 = 43.8, 64.4; 3/4 = 39.5, 23.0; 4/4 = 10.1, 2.2). These proportions vary greatly between ethnicities; for example, the frequency of APOE 4/4 is highest in Asian participants (8.8%) and lowest in Hispanic participants (2.5%). Mean Braak stage for AD cases is higher (5.1 ± 1.2) than for controls (2.6 ± 1.3) and ADRD participants (3.5 ± 1.6).
Conclusion The results provide an overview of features of ADSP‐FUS cohorts. As the ADSP‐FUS expands in size and diversity, this genomic resource, available via NIAGADS, will be integrated with ADSP programs focused on phenotype harmonization, association analyses, functional genomics, and machine learning. In concert with these programs, the ADSP‐FUS will accelerate the identification and understanding of potential genetic risk and protective variants for AD across all populations with the target of developing new treatments that are globally effective.

University of Miami: Scholarship Miami

ADSP Whole Genome Sequencing (WGS) Release 4 Data Update from Genome Center for Alzheimer’s Disease

Background The Genome Center for Alzheimer’s Disease (GCAD) coordinates the integration of all available Alzheimer’s disease (AD) relevant whole genome sequencing (WGS) data with the goal of identifying AD risk or protective genetic variants and eventual therapeutic targets. The WGS datasets are generated through collaboration between investigators from the Alzheimer’s Disease Sequencing Project (ADSP) and GCAD. With the goal of minimizing data heterogeneity, introduced by different sequencing protocols and assays, GCAD processes all samples using standardized pipelines and performs quality control (QC)/quality assurance (QA) checks.
Methods Raw sequencing data (FASTQs or BAMs) were aligned to GRCh38/hg38 by BWA, and variant calling and joint genotyping on single nucleotide variants (SNVs), insertions and deletions (indels), were done by GATK. Structural variants (SVs) were called per sample using the Smoove, Manta, and Strelka packages. Preliminary QA checks including sex check, contamination, and genotype concordance were performed followed by QC per ADSP protocol to evaluate the quality of samples and variants. To facilitate access and usage of massive joint‐genotype called VCF files, a compact version for storing variant info and sample genotypes only was released first.
Results We dropped 275 (0.7%) samples of poor coverage (362M bi‐allelic variants, >58M multi‐allelic variants, with 95% of variants remaining after QC. SV calling is ongoing and data will be ready prior to the conference.
Conclusion The ADSP and GCAD generate high quality SNVs, indels and SV calls. Currently GCAD is preparing the next release of ∼60,000 more ancestrally‐diverse WGS samples sequenced primarily through the ADSP Follow‐Up Study, which we anticipate will be released in 2023 to greatly benefit the AD genetics community.

University of Miami: Scholarship Miami

Human whole-exome genotype data for Alzheimer's disease

The heterogeneity of the whole-exome sequencing (WES) data generation methods presents a challenge to a joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer's Disease Sequencing Project. The joint-genotype called variant-called format (VCF) file contains only positions within the union of capture kits. The VCF was then processed specifically to account for the batch effects arising from the use of different capture kits from different studies. We identified 8.2 million autosomal variants. 96.82% of the variants are high-quality, and are located in 28,579 Ensembl transcripts. 41% of the variants are intronic and 1.8% of the variants have CADD > 30, indicating they are of high predicted pathogenicity. Here we show our new strategy can generate high-quality data from processing these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.

Directory of Open Access Journals

University of Miami: Scholarship Miami

Human whole-exome genotype data for Alzheimer’s disease

Author: Adams Larry D.
Ahmad Shahzad
Amin Najaf
Antonacci-Fulton Lucinda
Appelbaum Elizabeth
Banks Eric
Barral Sandra
Beecham Gary
Beiser Alexa
Below Jennifer E.
Benchek Penelope
Bennett David A.
Bis Joshua C.
Blue Elizabeth
Booth Briana M.
Brkanac Zoran
Brown Lisa
Bush William S.
Butkiewicz Mariusz
Cantwell Laura
Chen Yuning
Choi Seung Hoan
Chou Yi Fan
Chung Jaeyoon
Clark Kaylyn
Cruchaga Carlos
Cuccaro Michael
Cupples L. Adrienne
Day Tyler
De Jager Phillip L.
Destefano Anita
Dinh Huyen
Doddapeneni Harsha
Dorschner Michael
Dugan-Perez Shannon
Dupuis Josee
English Adam
Faber Kelley
Farrell John
Farrer Lindsay
Feolo Michael
Foroud Tatiana
Fulton Robert S.
Gabriel Stacey
Gangadharan Prabhakaran
Gibbs Richard A.
Goate Alison
Gupta Namrata
Haines Jonathan
Hamilton-Nelson Kara
Han Yi
Haut Jacob
Horimoto Andrea R.
Hu Jianhong
Ikram M. Arfan
Iqbal Taha
Bressler Jan
Jayaseelan Joy
Jian Xueqiu
Jun Gyungah R.
Kalra Divya
Kapoor Manav
Khan Ziad
Koboldt Daniel C.
Korchina Viktoriya
Kunkle Brian
Kuzma Amanda B.
Larson David E.
Launer Lenore J.
Lee Sandra
Lee Wan Ping
Leung Yuk Yee
Lin Honghuang
Liu Ching Ti
Liu Xiuping
Liu Yue
Lunetta Kathy
Ma Yiyi
Malamon John
Marcora Edoardo
Martin Eden
Mayeux Richard P.
Mena Pedro
Mez Jesse
Mlynarski Elisabeth
Mosley Thomas H.
Muzny Donna
Nafikov Rafael
Naj Adam C.
Nasser Waleed
Nato Alejandro Q.
Navas Pat
Nguyen Hiep
Nicaretta Heather
Pericak-Vance Margaret
Psaty Bruce
Qu Liming
Rajabli Farid
Reitz Christiane
Renton Alan
Reyes-Dumeyer Dolly
Rice Kenneth
Saad Mohamad
Salerno William
Santibanez Jireh
Satizabal Claudia
Schellenberg Gerard D.
Schmidt Helena
Schmidt Michael
Schmidt Reinhold
Seshadri Sudha
Sha Jin
Skinner Evette
Smieszek Sandra
Sohi Harkirat
Song Yeunjoo
Stine Adam
Sun Fangui Jenny
Thornton Timothy
Tosto Giuseppe
Tsuang Debby
Valladares Otto
van der Lee Sven
van Duijn Cornelia
Vance Jeffrey M.
Vanderspek Ashley
Vardarajan Badri
Waligorski Jason
Wang Bowen
Wang Li San
Wheeler Nicholas
Wijsman Ellen
Wilson Richard K.
Witten Daniela
Worley Kim
Xia Li Charlie
Zhang Nancy
Zhang Xiaoling
Zhao Yi
Zhu Congcong
Zhu Yiming
Publication date: 23/01/2024

The heterogeneity of the whole-exome sequencing (WES) data generation methods presents a challenge to a joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer’s Disease Sequencing Project. The joint-genotype called variant-called format (VCF) file contains only positions within the union of capture kits. The VCF was then processed specifically to account for the batch effects arising from the use of different capture kits from different studies. We identified 8.2 million autosomal variants. 96.82% of the variants are high-quality, and are located in 28,579 Ensembl transcripts. 41% of the variants are intronic and 1.8% of the variants have CADD > 30, indicating they are of high predicted pathogenicity. Here we show our new strategy can generate high-quality data from processing these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.

EUR Research Repository