28 research outputs found

Recommended from our members

Accurate estimation of SNP-heritability from biobank-scale data irrespective of genetic architecture.

Authors: Burch Kathryn S; Hou Kangcheng; Majumdar Arunabha; Mancuso Nicholas; Pasaniuc Bogdan; Sankararaman Sriram; Shi Huwenbo; Wu Yue
Publication venue: eScholarship, University of California
Publication date: 01/08/2019

SNP-heritability is a fundamental quantity in the study of complex traits. Recent studies have shown that existing methods to estimate genome-wide SNP-heritability can yield biases when their assumptions are violated. While various approaches have been proposed to account for frequency- and linkage disequilibrium (LD)-dependent genetic architectures, it remains unclear which estimates reported in the literature are reliable. Here we show that genome-wide SNP-heritability can be accurately estimated from biobank-scale data irrespective of genetic architecture, without specifying a heritability model or partitioning SNPs by allele frequency and/or LD. We show analytically and through extensive simulations starting from real genotypes (UK Biobank, N = 337K) that, unlike existing methods, our closed-form estimator is robust across a wide range of architectures. We provide estimates of SNP-heritability for 22 complex traits in the UK Biobank and show that, consistent with our results in simulations, existing biobank-scale methods yield estimates up to 30% different from our theoretically justified approach.

eScholarship - University of California

Genetic mapping, inference and prediction across diverse human populations

Author: Hou Kangcheng
Publication venue: eScholarship, University of California
Publication date: 01/01/2024

Genome-wide association studies have revolutionized our understanding of genetic influences on common diseases and complex traits. However, the majority of discoveries have been limited to individuals of European ancestry, leading to a data collection bias that disproportionately under-samples non-European populations. This bias leads to missed discovery opportunities and differential prediction accuracy across sub-populations defined by genetic ancestry and socioeconomic factors. Although datasets with diverse genetic ancestry backgrounds are increasingly available, existing analytical tools often fail to account for the heterogeneity present in these datasets. Here, I introduce new computational and statistical methods for genetic mapping, inference, and prediction across diverse human populations. First, I investigate the power of genetic mapping approaches in populations with diverse genetic ancestry backgrounds. Second, I explore the inference of genetic architecture, estimating the cross-ancestry sharing of genetic effects. Third, I examine genetic prediction, quantifying differential polygenic scoring accuracy by context and developing an approach to account for such differences.

eScholarship - University of California

Accurate estimation of SNP-heritability from biobank-scale data irrespective of genetic architecture

Author: Hou Kangcheng
Publication date: 16/05/2020

Ezid

Estimation of regional polygenicity from GWAS provides insights into the genetic architecture of complex traits.

Authors: Bogdan Pasaniuc; Kangcheng Hou; Kathryn S Burch; Mario Paciuc; Ruth Johnson; Sriram Sankararaman
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2021

The number of variants that have a non-zero effect on a trait (i.e., polygenicity) is a fundamental parameter in the study of the genetic architecture of a complex trait. Although many previous studies have investigated polygenicity at a genome-wide scale, a detailed understanding of how polygenicity varies across genomic regions is currently lacking. In this work, we propose an accurate and scalable statistical framework to estimate regional polygenicity for a complex trait. We show that our approach yields approximately unbiased estimates of regional polygenicity in simulations across a wide range of genetic architectures. We then partition the polygenicity of anthropometric and blood pressure traits across 6-Mb genomic regions (N = 290K, UK Biobank) and observe that all analyzed traits are highly polygenic: over one-third of regions harbor at least one causal variant for each of the traits analyzed. Additionally, we observe wide variation in regional polygenicity: on average across all traits, 48.9% of regions contain at least 5 causal SNPs, and 5.44% of regions contain at least 50 causal SNPs. Finally, we find that heritability is proportional to polygenicity at the regional level, which is consistent with the hypothesis that heritability enrichments are largely driven by the variation in the number of causal SNPs.

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

From groupwise to individual brain functional networks parcellation and application

Authors: DongTao WEI; GuoRong WU; HeSheng LIU; Jiang QIU; KangCheng WANG; Xin HOU
Publication venue: 'Science China Press., Co. Ltd.'

Crossref

Efficient variance components analysis across millions of genomes

Authors: Aaron Zhou; Ali Pazokitoroudi; Bogdan Pasaniuc; Kangcheng Hou; Kathryn S. Burch; Sriram Sankararaman; Yue Wu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2020

Variance components analysis may be used for a variety of applications, including heritability estimation and association mapping. Here, the authors present a computationally efficient method, scalable to extremely large GWAS datasets, and use it for heritability analysis of 22 traits from UK Biobank.

Directory of Open Access Journals

Genotype error due to low-coverage sequencing induces uncertainty in polygenic scoring.

Authors: Bhattacharya Arjun; Ding Yi; Gusev Alexander; Hou Kangcheng; Pasaniuc Bogdan; Petter Ella; Zaitlen Noah
Publication venue: eScholarship, University of California
Publication date: 03/08/2023

Polygenic scores (PGSs) have emerged as a standard approach to predict phenotypes from genotype data in a wide array of applications from socio-genomics to personalized medicine. Traditional PGSs assume genotype data to be error-free, ignoring possible errors and uncertainties introduced from genotyping, sequencing, and/or imputation. In this work, we investigate the effects of genotyping error due to low-coverage sequencing on PGS estimation. We leverage SNP array and low-coverage whole-genome sequencing data (lcWGS, median coverage 0.04×) of 802 individuals from the Dana-Farber PROFILE cohort to show that PGS error correlates with sequencing depth (p = 1.2 × 10⁻⁷). We develop a probabilistic approach that incorporates genotype error in PGS estimation to produce well-calibrated PGS credible intervals and show that the probabilistic approach increases classification accuracy by up to 6% as compared to traditional PGSs that ignore genotyping error. Finally, we use simulations to explore the combined effect of genotyping and effect size errors and their implications for PGS-based risk stratification. Our results illustrate the importance of considering genotyping error as a source of PGS error, especially for cohorts with varying genotyping technologies and/or low-coverage sequencing.

eScholarship - University of California

Efficient variance components analysis across millions of genomes.

Authors: Burch Kathryn S; Hou Kangcheng; Pasaniuc Bogdan; Pazokitoroudi Ali; Sankararaman Sriram; Wu Yue; Zhou Aaron
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

While variance components analysis has emerged as a powerful tool in complex trait genetics, existing methods for fitting variance components do not scale well to large-scale datasets of genetic variation. Here, we present a method for variance components analysis that is accurate and efficient: capable of estimating one hundred variance components on a million individuals genotyped at a million SNPs in a few hours. We illustrate the utility of our method in estimating and partitioning variation in a trait explained by genotyped SNPs (SNP-heritability). Analyzing 22 traits with genotypes from 300,000 individuals across about 8 million common and low-frequency SNPs, we observe that per-allele squared effect size increases with decreasing minor allele frequency (MAF) and linkage disequilibrium (LD), consistent with the action of negative selection. Partitioning heritability across 28 functional annotations, we observe enrichment of heritability in FANTOM5 enhancers in asthma, eczema, thyroid and autoimmune disorders.

Directory of Open Access Journals

eScholarship - University of California