
160 research outputs found

Detecting Identity by Descent and Estimating Genotype Error Rates in Sequence Data

Author: Browning Brian L.
Browning Sharon R.
Publication venue: The American Society of Human Genetics. Published by Elsevier Inc.
Publication date: 07/11/2013

Existing methods for identity by descent (IBD) segment detection were designed for SNP array data, not sequence data. Sequence data have a much higher density of genetic variants and a different allele frequency distribution, and can have higher genotype error rates. Consequently, best practices for IBD detection in SNP array data do not necessarily carry over to sequence data. We present a method, IBDseq, for detecting IBD segments in sequence data and a method, SEQERR, for estimating genotype error rates at low-frequency variants by using detected IBD. The IBDseq method estimates probabilities of genotypes observed with error for each pair of individuals under IBD and non-IBD models. The ratio of estimated probabilities under the two models gives a LOD score for IBD. We evaluate several IBD detection methods that are fast enough for application to sequence data (IBDseq, Beagle Refined IBD, PLINK, and GERMLINE) under multiple parameter settings, and we show that IBDseq achieves high power and accuracy for IBD detection in sequence data. The SEQERR method estimates genotype error rates by comparing observed and expected rates of pairs of homozygote and heterozygote genotypes at low-frequency variants in IBD segments. We demonstrate the accuracy of SEQERR in simulated data, and we apply the method to estimate genotype error rates in sequence data from the UK10K and 1000 Genomes projects
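
The LOD score mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard definition of a LOD score as the base-10 logarithm of a likelihood ratio, and it assumes that per-marker scores are simply summed over a candidate segment. The function names and the example probabilities are hypothetical and are not taken from IBDseq itself.

```python
import math


def lod_score(p_ibd: float, p_non_ibd: float) -> float:
    """Base-10 log of the ratio of genotype probabilities under the
    IBD model versus the non-IBD model (standard LOD definition)."""
    if p_ibd <= 0 or p_non_ibd <= 0:
        raise ValueError("probabilities must be positive")
    return math.log10(p_ibd / p_non_ibd)


def segment_lod(per_marker):
    """Sum per-marker LOD scores over a candidate segment.

    Assumption for illustration only: markers contribute independently,
    so their log-likelihood ratios add.
    """
    return sum(lod_score(p_ibd, p_non) for p_ibd, p_non in per_marker)


# Hypothetical (p_ibd, p_non_ibd) pairs for three markers in one sample pair.
markers = [(0.02, 0.002), (0.5, 0.5), (0.01, 0.001)]
total = segment_lod(markers)
print(f"segment LOD = {total:.2f}")
```

A marker whose genotypes are equally probable under both models contributes a LOD of zero, so only markers that discriminate between the models move the segment score.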

Elsevier - Publisher Connector

PubMed Central

A Groupwise Association Test for Rare Mutations Using a Weighted Sum Statistic

Author: Browning Sharon R.
Madsen Bo Eskerod
Publication venue: Public Library of Science
Publication date: 01/02/2009

Resequencing is an emerging tool for identification of rare disease-associated mutations. Rare mutations are difficult to tag with SNP genotyping, as genotyping studies are designed to detect common variants. However, studies have shown that genetic heterogeneity is a probable scenario for common diseases, in which multiple rare mutations together explain a large proportion of the genetic basis for the disease. Thus, we propose a weighted-sum method to jointly analyse a group of mutations in order to test for groupwise association with disease status. For example, such a group of mutations may result from resequencing a gene. We compare the proposed weighted-sum method to alternative methods and show that it is powerful for identifying disease-associated genes, both on simulated and Encode data. Using the weighted-sum method, a resequencing study can identify a disease-associated gene with an overall population attributable risk (PAR) of 2%, even when each individual mutation has much lower PAR, using 1,000 to 7,000 affected and unaffected individuals, depending on the underlying genetic model. This study thus demonstrates that resequencing studies can identify important genetic associations, provided that specialised analysis methods, such as the weighted-sum method, are used

Directory of Open Access Journals

PubMed Central

Genome-wide association of white blood cell counts in Hispanic/Latino Americans: the Hispanic Community Health Study/Study of Latinos

Author: Auer Paul L.
Brown Lisa
Browning Brian L.
Browning Sharon R.
Hodonsky Chani J.
Jain Deepti
Laurie Cathy C.
Laurie Cecelia A.
Liu Yongmei
Loos Ruth J.F.
Minnerath Sharon
Morrison Jean V.
North Kari E.
Papanicolaou George
Reiner Alexander P.
Schick Ursula M.
Schurmann Claudia
Sofer Tamar
Taylor Kent D.
Thornton Timothy A.
Thyagarajan Bharat
Publication date: 01/01/2017

Circulating white blood cell (WBC) counts (neutrophils, monocytes, lymphocytes, eosinophils, basophils) differ by ethnicity. The genetic factors underlying basal WBC traits in Hispanics/Latinos are unknown. We performed a genome-wide association study of total WBC and differential counts in a large, ethnically diverse US population sample of Hispanics/Latinos ascertained by the Hispanic Community Health Study and Study of Latinos (HCHS/SOL). We demonstrate that several previously known WBC-associated genetic loci (e.g. the African Duffy antigen receptor for chemokines null variant for neutrophil count) are generalizable to WBC traits in Hispanics/Latinos. We identified and replicated common and rare germ-line variants at FLT3 (a gene often somatically mutated in leukemia) associated with monocyte count. The common FLT3 variant rs76428106 has a large allele frequency differential between African and non-African populations. We also identified several novel genetic loci involving or regulating hematopoietic transcription factors (CEBPE-SLC7A7, CEBPA and CRBN-TRNT1) associated with basophil count. The minor allele of the CEBPE variant associated with lower basophil count has been previously associated with Amerindian ancestry and higher risk of acute lymphoblastic leukemia in Hispanics. Together, these data suggest that germline genetic variation affecting transcriptional and signaling pathways that underlie WBC development and lineage specification can contribute to inter-individual as well as ethnic differences in peripheral blood cell counts (normal hematopoiesis) in addition to susceptibility to leukemia (malignant hematopoiesis)

Carolina Digital Repository

Performance of Genotype Imputation for Rare Variants Identified in Exons and Flanking Regions of Genes

Author: A Coventry
Andrew J. Slater
BL Browning
BN Howie
Brian L. Browning
C Francks
D Kasperaviciute
EL Heinzen
H Li
H Ling
J Marchini
Jennifer L. Aponte
John C. Whittaker
KA Frazer
Li Li
M Firmann
M Li
Margaret Gelder Ehm
Matthew R. Nelson
P Muglia
Peter Heutink
SE Baranzini
SG Pillai
Sharon R. Browning
SR Browning
Stephanie L. Chissoe
TL Assimes
Vincent E. Mooser
Xiangyang Kong
Y Li
Yun Li
Publication venue: Public Library of Science
Publication date: 01/01/2011

Genotype imputation has the potential to assess human genetic variation at a lower cost than assaying the variants using laboratory techniques. The performance of imputation for rare variants has not been comprehensively studied. We utilized 8865 human samples with high depth resequencing data for the exons and flanking regions of 202 genes and Genome-Wide Association Study (GWAS) data to characterize the performance of genotype imputation for rare variants. We evaluated reference sets ranging from 100 to 3713 subjects for imputing into samples typed for the Affymetrix (500K and 6.0) and Illumina 550K GWAS panels. The proportion of variants that could be well imputed (true r² > 0.7) with a reference panel of 3713 individuals was: 31% (Illumina 550K) or 25% (Affymetrix 500K) with MAF (Minor Allele Frequency) ≤ 0.001, 48% or 35% with 0.001 < MAF ≤ 0.005, 54% or 38% with 0.005 < MAF ≤ 0.01, 78% or 57% with 0.01 < MAF ≤ 0.05, and 97% or 86% with MAF > 0.05. The performance for common SNPs (MAF > 0.05) within exons and flanking regions is comparable to imputation of more uniformly distributed SNPs. The performance for rare SNPs (0.01 < MAF ≤ 0.05) was much more dependent on the GWAS panel and the number of reference samples. These results suggest routine use of genotype imputation for extending the assessment of common variants identified in humans via targeted exon resequencing into additional samples with GWAS data, but imputation of very rare variants (MAF ≤ 0.005) will require reference panels with thousands of subjects

Public Library of Science (PLOS)

Crossref

LSHTM Research Online

Directory of Open Access Journals

PubMed Central

Carolina Digital Repository

Genome-wide association study of red blood cell traits in Hispanics/Latinos: The Hispanic Community Health Study/Study of Latinos

Prior GWAS have identified loci associated with red blood cell (RBC) traits in populations of European, African, and Asian ancestry. These studies have not included individuals with an Amerindian ancestral background, such as Hispanics/Latinos, nor evaluated the full spectrum of genomic variation beyond single nucleotide variants. Using a custom genotyping array enriched for Amerindian ancestral content and 1000 Genomes imputation, we performed GWAS in 12,502 participants of Hispanic Community Health Study and Study of Latinos (HCHS/SOL) for hematocrit, hemoglobin, RBC count, RBC distribution width (RDW), and RBC indices. Approximately 60% of previously reported RBC trait loci generalized to HCHS/SOL Hispanics/Latinos, including African ancestral alpha- and beta-globin gene variants. In addition to the known 3.8kb alpha-globin copy number variant, we identified an Amerindian ancestral association in an alpha-globin regulatory region on chromosome 16p13.3 for mean corpuscular volume and mean corpuscular hemoglobin. We also discovered and replicated three genome-wide significant variants in previously unreported loci for RDW (SLC12A2 rs17764730, PSMB5 rs941718), and hematocrit (PROX1 rs3754140). Among the proxy variants at the SLC12A2 locus we identified rs3812049, located in a bi-directional promoter between SLC12A2 (which encodes a red cell membrane ion-transport protein) and an upstream anti-sense long-noncoding RNA, LINC01184, as the likely causal variant. We further demonstrate that disruption of the regulatory element harboring rs3812049 affects transcription of SLC12A2 and LINC01184 in human erythroid progenitor cells. Together, these results reinforce the importance of genetic study of diverse ancestral populations, in particular Hispanics/Latinos

Carolina Digital Repository

Admixture mapping implicates 13q33.3 as ancestry-of-origin locus for Alzheimer disease in Hispanic and Latino populations

Author: Bis Joshua C.
Blue Elizabeth E.
Boyken Lisa A.
Browning Sharon R.
Brusco Luis Ignacio
Dalmasso Maria Carolina
Grinde Kelsey E.
Horimoto Andrea R.V.R.
Morelli Laura
Nafikov Rafael A.
Nato Alejandro Q.
Ramirez Alfredo Jose
Satizabal Claudia
Seshadri Sudha
Sohi Harkirat K.
Temple Seth
Thornton Timothy A.
Wijsman Ellen M.
Publication venue: Cell Press
Publication date: 01/07/2023

Alzheimer disease (AD) is the most common form of senile dementia, with high incidence late in life in many populations including Caribbean Hispanic (CH) populations. Such admixed populations, descended from more than one ancestral population, can present challenges for genetic studies, including limited sample sizes and unique analytical constraints. Therefore, CH populations and other admixed populations have not been well represented in studies of AD, and much of the genetic variation contributing to AD risk in these populations remains unknown. Here, we conduct genome-wide analysis of AD in multiplex CH families from the Alzheimer Disease Sequencing Project (ADSP). We developed, validated, and applied an implementation of a logistic mixed model for admixture mapping with binary traits that leverages genetic ancestry to identify ancestry-of-origin loci contributing to AD. We identified three loci on chromosome 13q33.3 associated with reduced risk of AD, where associations were driven by Native American (NAM) ancestry. This AD admixture mapping signal spans the FAM155A, ABHD13, TNFSF13B, LIG4, and MYO16 genes and was supported by evidence for association in an independent sample from the Alzheimer's Genetics in Argentina—Alzheimer Argentina consortium (AGA-ALZAR) study with considerable NAM ancestry. We also provide evidence of NAM haplotypes and key variants within 13q33.3 that segregate with AD in the ADSP whole-genome sequencing data. Interestingly, the widely used genome-wide association study approach failed to identify associations in this region. Our findings underscore the potential of leveraging genetic ancestry diversity in recently admixed populations to improve genetic mapping, in this case for AD-relevant loci.

Affiliation: Horimoto, Andrea R.V.R.: University of Washington; United States
Affiliation: Boyken, Lisa A.: University of Washington; United States
Affiliation: Blue, Elizabeth E.: University of Washington; United States. Brotman Baty Institute for Precision Medicine; United States
Affiliation: Grinde, Kelsey E.: University of Washington; United States. Macalester College; United States
Affiliation: Nafikov, Rafael A.: University of Washington; United States
Affiliation: Sohi, Harkirat K.: University of Washington; United States
Affiliation: Nato, Alejandro Q.: University of Washington; United States. Marshall University; United States
Affiliation: Bis, Joshua C.: University of Washington; United States
Affiliation: Brusco, Luis Ignacio: Universidad de Buenos Aires. Facultad de Medicina; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Morelli, Laura: Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Ramirez, Alfredo Jose: University of Cologne; Germany. Universitat Bonn; Germany. German Center for Neurodegenerative Diseases; Germany. University of Texas Health Science Center at San Antonio (UT Health San Antonio); University of Texas at San Antonio. Universidad Nacional Arturo Jauretche. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos. Provincia de Buenos Aires. Ministerio de Salud. Hospital Alta Complejidad en Red El Cruce Dr. Néstor Carlos Kirchner Samic. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos; Argentina
Affiliation: Dalmasso, Maria Carolina: Universidad Nacional Arturo Jauretche. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos. Provincia de Buenos Aires. Ministerio de Salud. Hospital Alta Complejidad en Red El Cruce Dr. Néstor Carlos Kirchner Samic. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Unidad Ejecutora de Estudios en Neurociencias y Sistemas Complejos; Argentina. University of Cologne; Germany
Affiliation: Temple, Seth: University of Washington; United States
Affiliation: Satizabal, Claudia: University of Texas Health Science Center at San Antonio (UT Health San Antonio); University of Texas at San Antonio. University of Texas at San Antonio; United States
Affiliation: Browning, Sharon R.: University of Washington; United States
Affiliation: Seshadri, Sudha: University of Texas Health Science Center at San Antonio (UT Health San Antonio); University of Texas at San Antonio. University of Texas at San Antonio; United States
Affiliation: Wijsman, Ellen M.: University of Washington; United States
Affiliation: Thornton, Timothy A.: University of Washington; United States

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

A Transcription Factor Map as Revealed by a Genome-Wide Gene Expression Analysis of Whole-Blood mRNA Transcriptome in Multiple Sclerosis

Background: Several lines of evidence suggest that transcription factors are involved in the pathogenesis of Multiple Sclerosis (MS) but complete mapping of the whole network has been elusive. One of the reasons is that there are several clinical subtypes of MS and transcription factors that may be involved in one subtype may not be in others. We investigate the possibility that this network could be mapped using microarray technologies and contemporary bioinformatics methods on a dataset derived from whole blood in 99 untreated MS patients (36 Relapse Remitting MS, 43 Primary Progressive MS, and 20 Secondary Progressive MS) and 45 age-matched healthy controls. Methodology/Principal Findings: We have used two different analytical methodologies: a non-standard differential expression analysis and a differential co-expression analysis, which have converged on a significant number of regulatory motifs that are statistically overrepresented in genes that are either differentially expressed (or differentially co-expressed) in cases and controls (e.g., V$KROX_Q6, p-value < 3.31E-6; V$CREBP1_Q2, p-value < 9.93E-6; V$YY1_02, p-value < 1.65E-5). Conclusions/Significance: Our analysis uncovered a network of transcription factors that potentially dysregulate several genes in MS or one or more of its disease subtypes. The most significant transcription factor motifs were for the Early Growth Response EGR/KROX family, ATF2, YY1 (Yin and Yang 1), E2F-1/DP-1 and E2F-4/DP-2 heterodimers, SOX5, and CREB and ATF families. These transcription factors are involved in early T-lymphocyte specification and commitment as well as in oligodendrocyte dedifferentiation and development, both pathways that have significant biological plausibility in MS causation

University of Newcastle's Digital Repository

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Queensland University of Technology ePrints Archive

Research Repository

Macquarie University ResearchOnline

University of Melbourne Institutional Repository

Genetic Diversity and Association Studies in US Hispanic/Latino Populations: Applications in the Hispanic Community Health Study/Study of Latinos

US Hispanic/Latino individuals are diverse in genetic ancestry, culture, and environmental exposures. Here, we characterized and controlled for this diversity in genome-wide association studies (GWASs) for the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). We simultaneously estimated population-structure principal components (PCs) robust to familial relatedness and pairwise kinship coefficients (KCs) robust to population structure, admixture, and Hardy-Weinberg departures. The PCs revealed substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, and Central and South American). To control for variation among groups, we developed a multi-dimensional clustering method to define a “genetic-analysis group” variable that retains many properties of self-identified background while achieving substantially greater genetic homogeneity within groups and including participants with non-specific self-identification. In GWASs of 22 biomedical traits, we used a linear mixed model (LMM) including pairwise empirical KCs to account for familial relatedness, PCs for ancestry, and genetic-analysis groups for additional group-associated effects. Including the genetic-analysis group as a covariate accounted for significant trait variation in 8 of 22 traits, even after we fit 20 PCs. Additionally, genetic-analysis groups had significant heterogeneity of residual variance for 20 of 22 traits, and modeling this heteroscedasticity within the LMM reduced genomic inflation for 19 traits. Furthermore, fitting an LMM that utilized a genetic-analysis group rather than a self-identified background group achieved higher power to detect previously reported associations. We expect that the methods applied here will be useful in other studies with multiple ethnic groups, admixture, and relatedness

Elsevier - Publisher Connector

Crossref

PubMed Central

Carolina Digital Repository

University of Miami: Scholarship Miami