153 research outputs found
Finite Voronoi decompositions of infinite vertex transitive graphs
In this paper, we consider the Voronoi decompositions of an arbitrary
infinite vertex-transitive graph G. In particular, we are interested in the
following question: what is the largest number of Voronoi cells that must be
infinite, given sufficiently (but finitely) many Voronoi sites which are
sufficiently far from each other? We call this number the survival number s(G).
The survival number of a graph has an alternative characterization in terms
of covering, which we use to show that s(G) is always at least two. The
survival number is not a quasi-isometry invariant, but it remains open whether
finiteness of s(G) is. We show that all vertex-transitive graphs with
polynomial growth have a finite s(G); vertex-transitive graphs with infinitely
many ends have an infinite s(G); the lamplighter graph LL(Z), which has
exponential growth, has a finite s(G); and the lamplighter graph LL(Z^2), which
is Liouville, has an infinite s(G).
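As a small illustration of the Voronoi decomposition the abstract refers to, the sketch below computes Voronoi cells on a finite graph with a multi-source breadth-first search. It is only a finite toy and does not compute s(G). The function name `voronoi_cells`, the adjacency-dict representation, and the choice to report every nearest site for tied vertices are assumptions made for this example; the abstract does not say how the paper treats ties.

```python
from collections import deque


def voronoi_cells(adj, sites):
    """Map every reachable vertex to the set of its nearest sites.

    adj   -- dict: vertex -> iterable of neighbouring vertices (hypothetical format)
    sites -- list of Voronoi sites (vertices of the graph)
    Distances are unweighted graph distances; tied vertices keep all nearest sites.
    """
    dist = {s: 0 for s in sites}
    nearest = {s: {s} for s in sites}
    queue = deque(sites)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                # First time v is reached: it inherits the nearest sites of u.
                dist[v] = dist[u] + 1
                nearest[v] = set(nearest[u])
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # v is reached again at the same distance: record the tie.
                nearest[v] |= nearest[u]
    return nearest


# Toy example: a path 0 - 1 - ... - 6 with sites at both ends.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
cells = voronoi_cells(path, [0, 6])
print(cells[2], cells[3], cells[4])
```

On this path, vertex 3 is equidistant from both sites, so it is reported with both.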
Meta-analysis fine-mapping is often miscalibrated at single-variant resolution
Meta-analysis is pervasively used to combine multiple genome-wide association studies (GWASs). Fine-mapping of meta-analysis studies is typically performed as in a single-cohort study. Here, we first demonstrate that heterogeneity (e.g., of sample size, phenotyping, imputation) hurts calibration of meta-analysis fine-mapping. We propose a summary statistics-based quality-control (QC) method, suspicious loci analysis of meta-analysis summary statistics (SLALOM), that identifies suspicious loci for meta-analysis fine-mapping by detecting outliers in association statistics. We validate SLALOM in simulations and the GWAS Catalog. Applying SLALOM to 14 meta-analyses from the Global Biobank Meta-analysis Initiative (GBMI), we find that 67% of loci show suspicious patterns that call into question fine-mapping accuracy. These predicted suspicious loci are significantly depleted for having nonsynonymous variants as lead variant (2.7×; Fisher's exact p = 7.3 × 10^-4). We find limited evidence of fine-mapping improvement in the GBMI meta-analyses compared with individual biobanks. We urge extreme caution when interpreting fine-mapping results from meta-analysis of heterogeneous cohorts.
Functional Architectures of Local and Distal Regulation of Gene Expression in Multiple Human Tissues
Genetic variants that modulate gene expression levels play an important role in the etiology of human diseases and complex traits. Although large-scale eQTL mapping studies routinely identify many local eQTLs, the molecular mechanisms by which genetic variants regulate expression remain unclear, particularly for distal eQTLs, which these studies are not well powered to detect. Here, we leveraged all variants (not just those that pass stringent significance thresholds) to analyze the functional architecture of local and distal regulation of gene expression in 15 human tissues by employing an extension of stratified LD-score regression that produces robust results in simulations. The top enriched functional categories in local regulation of peripheral-blood gene expression included coding regions (11.41×), conserved regions (4.67×), and four histone marks (p < 5 × 10^-5 for all enrichments); local enrichments were similar across the 15 tissues. We also observed substantial enrichments for distal regulation of peripheral-blood gene expression: coding regions (4.47×), conserved regions (4.51×), and two histone marks (p < 3 × 10^-7 for all enrichments). Analyses of the genetic correlation of gene expression across tissues confirmed that local regulation of gene expression is largely shared across tissues but that distal regulation is highly tissue specific. Our results elucidate the functional components of the genetic architecture of local and distal regulation of gene expression.
Shared genetic aetiology of puberty timing between sexes and with health-related outcomes.
Understanding of the genetic regulation of puberty timing has come largely from studies of rare disorders and population-based studies in women. Here, we report the largest genomic analysis for puberty timing in 55,871 men, based on recalled age at voice breaking. Analysis across all genomic variants reveals strong genetic correlation (0.74, P = 2.7 × 10^-70) between male and female puberty timing. However, some loci show sex-divergent effects, including directionally opposite effects between sexes at the SIM1/MCHR2 locus (P_heterogeneity = 1.6 × 10^-12). We find five novel loci for puberty timing (P < 5 × 10^-8), in addition to nine signals in men that were previously reported in women. Newly implicated genes include two retinoic acid-related receptors, RORB and RXRA, and two genes reportedly disrupted in rare disorders of puberty, LEPR and KAL1. Finally, we identify genetic correlations that indicate shared aetiologies in both sexes between puberty timing and body mass index, fasting insulin levels, lipid levels, type 2 diabetes and cardiovascular disease. This work was supported by the Medical Research Council [U106179472; MC_U106179472; U106179471; MC_U106179471] and the National Human Genome Research Institute of the National Institutes of Health (grant number R44HG006981 to 23andMe). This is the final version of the article. It was first available from NPG via http://dx.doi.org/10.1038/ncomms984
Functionally informed fine-mapping and polygenic localization of complex trait heritability
Fine-mapping aims to identify causal variants impacting complex traits. We propose PolyFun, a computationally scalable framework to improve fine-mapping accuracy by leveraging functional annotations across the entire genome (not just genome-wide-significant loci) to specify prior probabilities for fine-mapping methods such as SuSiE or FINEMAP. In simulations, PolyFun + SuSiE and PolyFun + FINEMAP were well calibrated and identified >20% more variants with a posterior causal probability >0.95 than identified in their nonfunctionally informed counterparts. In analyses of 49 UK Biobank traits (average n = 318,000), PolyFun + SuSiE identified 3,025 fine-mapped variant-trait pairs with posterior causal probability >0.95, a >32% improvement versus SuSiE. We used posterior mean per-SNP heritabilities from PolyFun + SuSiE to perform polygenic localization, constructing minimal sets of common SNPs causally explaining 50% of common SNP heritability; these sets ranged in size from 28 (hair color) to 3,400 (height) to 2 million (number of children). In conclusion, PolyFun prioritizes variants for functional follow-up and provides insights into complex trait architectures. PolyFun is a computationally scalable framework for functionally informed fine-mapping that makes full use of genome-wide data. It prioritizes more variants than previous methods when applied to 49 complex traits from UK Biobank.
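The polygenic localization step described above (a minimal set of SNPs explaining 50% of heritability) can be sketched as a sort-and-accumulate procedure. This is a minimal sketch that assumes the per-SNP heritability estimates are already available as a plain dictionary. The function name `minimal_snp_set`, the SNP labels, and the numbers are hypothetical, and the published procedure may differ in how it handles estimation uncertainty.

```python
def minimal_snp_set(per_snp_h2, fraction=0.5):
    """Return the smallest set of SNPs whose summed heritability reaches
    `fraction` of the total, by taking SNPs in decreasing order of heritability.

    per_snp_h2 -- dict: SNP identifier -> non-negative per-SNP heritability
    """
    target = fraction * sum(per_snp_h2.values())
    chosen, running = [], 0.0
    ranked = sorted(per_snp_h2.items(), key=lambda kv: kv[1], reverse=True)
    for snp, h2 in ranked:
        if running >= target:
            break
        chosen.append(snp)
        running += h2
    return chosen


# Hypothetical per-SNP heritabilities (illustrative values only).
toy_h2 = {"snp_a": 0.40, "snp_b": 0.05, "snp_c": 0.30, "snp_d": 0.25}
print(minimal_snp_set(toy_h2))  # ['snp_a', 'snp_c']
```

Taking SNPs in decreasing order is what makes the resulting set minimal in size: no smaller set can reach the same cumulative total.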
Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights
Genome-wide association studies (GWAS) have identified over 100 risk loci for schizophrenia, but the causal mechanisms remain largely unknown. We performed a transcriptome-wide association study (TWAS) integrating a schizophrenia GWAS of 79,845 individuals from the Psychiatric Genomics Consortium with expression data from brain, blood, and adipose tissues across 3,693 primarily control individuals. We identified 157 TWAS-significant genes, of which 35 did not overlap a known GWAS locus. Of these 157 genes, 42 were associated with specific chromatin features measured in independent samples, thus highlighting potential regulatory targets for follow-up. Suppression of one identified susceptibility gene, mapk3, in zebrafish showed a significant effect on neurodevelopmental phenotypes. Expression and splicing from the brain captured most of the TWAS effect across all genes. This large-scale connection of associations to target genes, tissues, and regulatory features is an essential step in moving toward a mechanistic understanding of GWAS.
Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types.
We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.
14 Schizophrenia Working Group of the Psychiatric Genomics Consortium, Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) study
Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R^2 increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
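For orientation, the baseline that the abstract contrasts with LDpred (pruning followed by thresholding) can be sketched as below. This is not LDpred itself. It is a simplified, clumping-style variant in which markers are visited in order of increasing p value and dropped if they are in high LD with a marker already kept. The function names, the cutoffs `p_max` and `r2_max`, and all input values are assumptions chosen for illustration.

```python
def prune_and_threshold(pvalues, r2, p_max=0.05, r2_max=0.2):
    """Select marker indices that pass the p value threshold and are not in
    high LD (pairwise r^2 >= r2_max) with any marker already selected.

    pvalues -- list of association p values, one per marker
    r2      -- square matrix (list of lists) of pairwise LD r^2 values
    """
    kept = []
    for i in sorted(range(len(pvalues)), key=lambda k: pvalues[k]):
        if pvalues[i] > p_max:
            break  # remaining markers have even larger p values
        if all(r2[i][j] < r2_max for j in kept):
            kept.append(i)
    return kept


def risk_score(genotype, betas, kept):
    """Sum of effect size times allele count over the selected markers."""
    return sum(betas[i] * genotype[i] for i in kept)


# Hypothetical toy data: markers 0 and 1 are in strong LD with each other.
pvalues = [1e-8, 1e-6, 0.5, 0.01]
r2 = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
betas = [0.1, 0.2, 0.3, -0.05]
kept = prune_and_threshold(pvalues, r2)
score = risk_score([2, 1, 0, 1], betas, kept)
print(kept)  # [0, 3]
```

Marker 1 is discarded here even though it is strongly associated, which illustrates the information loss the abstract attributes to this approach.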
Large-scale genomic analyses link reproductive ageing to hypothalamic signaling, breast cancer susceptibility and BRCA1-mediated DNA repair
John Perry and colleagues report the results of a large genome-wide association study meta-analysis to identify variants influencing age at natural menopause. They identify 54 independent signals and find enrichment near genes involved in delayed puberty and DNA damage response.