9 research outputs found
Rare coding variants in CHRNB2 reduce the likelihood of smoking
Human genetic studies of smoking behavior have been thus far largely limited to common variants. Studying rare coding variants has the potential to identify drug targets. We performed an exome-wide association study of smoking phenotypes in up to 749,459 individuals and discovered a protective association in CHRNB2, encoding the β2 subunit of the α4β2 nicotinic acetylcholine receptor. Rare predicted loss-of-function and likely deleterious missense variants in CHRNB2 in aggregate were associated with a 35% decreased odds for smoking heavily (odds ratio (OR) = 0.65, confidence interval (CI) = 0.56–0.76, P = 1.9 × 10⁻⁸). An independent common variant association in the protective direction (rs2072659; OR = 0.96; CI = 0.94–0.98; P = 5.3 × 10⁻⁶) was also evident, suggesting an allelic series. Our findings in humans align with decades-old experimental observations in mice that β2 loss abolishes nicotine-mediated neuronal responses and attenuates nicotine self-administration. Our genetic discovery will inspire future drug designs targeting CHRNB2 in the brain for the treatment of nicotine addiction.
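The arithmetic behind the reported effect can be sketched as follows. This is an illustrative check, not the authors' analysis: it assumes the interval is a 95% Wald interval on the log-odds scale, and the function names are hypothetical. Because the published OR and CI are rounded, the recomputed P-value only approximates the reported one.

```python
import math


def percent_odds_reduction(odds_ratio):
    """Convert an odds ratio below 1 into a percent reduction in odds."""
    return (1.0 - odds_ratio) * 100.0


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z-score and two-sided P-value from an OR and its CI.

    Assumes (hypothetically) a symmetric 95% Wald interval on the log scale.
    """
    # Standard error of log(OR), recovered from the width of the interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Values reported for the CHRNB2 rare-variant burden.
reduction = percent_odds_reduction(0.65)
z, p = wald_p_from_ci(0.65, 0.56, 0.76)
print(f"reduction in odds: {reduction:.0f}%")
print(f"approximate z = {z:.2f}, approximate P = {p:.1e}")
```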
Genotyping, sequencing and analysis of 140,000 adults from Mexico City
The Mexico City Prospective Study is a prospective cohort of more than 150,000 adults recruited two decades ago from the urban districts of Coyoacán and Iztapalapa in Mexico City. Here we generated genotype and exome-sequencing data for all individuals and whole-genome sequencing data for 9,950 selected individuals. We describe high levels of relatedness and substantial heterogeneity in ancestry composition across individuals. Most sequenced individuals had admixed Indigenous American, European and African ancestry, with extensive admixture from Indigenous populations in central, southern and southeastern Mexico. Indigenous Mexican segments of the genome had lower levels of coding variation but an excess of homozygous loss-of-function variants compared with segments of African and European origin. We estimated ancestry-specific allele frequencies at 142 million genomic variants, with an effective sample size of 91,856 for Indigenous Mexican ancestry at exome variants, all available through a public browser. Using whole-genome sequencing, we developed an imputation reference panel that outperforms existing panels at common variants in individuals with high proportions of central, southern and southeastern Indigenous Mexican ancestry. Our work illustrates the value of genetic studies in diverse populations and provides foundational imputation and allele frequency resources for future genetic studies in Mexico and in the United States, where the Hispanic/Latino population is predominantly of Mexican descent.
Assessments of Significance for Genetic Association Analysis in Structured Samples
In this dissertation, we develop methods to address several problems that arise in the assessment of significance for genetic association analysis of complex traits in structured samples.
In Chapter 2, we focus on phenotype resampling methods for binary trait analysis. We develop BRASS, a permutation-based approach to testing association between a binary trait and an arbitrary predictor in samples with population structure and/or related individuals. BRASS is applicable in various contexts, including (1) correction for multiple comparisons when testing for region-wide or genome-wide significance, and (2) assessment of significance for tests that combine test statistics that perform well in different scenarios. Previous methods are applicable only to analysis of a quantitative trait and do not perform well for a binary trait. BRASS allows for covariates, ascertainment and simultaneous testing of multiple markers, and it does not place strong restrictions on the test statistic used. We use an estimating equation approach that can be viewed as a hybrid of logistic regression and linear mixed-effects model methods, and we use a combination of principal components and a genetic relatedness matrix to account for sample structure. In simulation studies, we demonstrate that BRASS maintains correct control of type 1 error. We illustrate the proposed approach in two genome-wide analyses of binary traits in domestic dog.
In Chapter 3, we focus on assessment of significance in genetic association analysis of single or multi-dimensional phenotypes where we consider test statistics of a certain form, allow association to be tested with single or multiple genetic markers simultaneously, and where there is population structure and/or relatedness. Existing approaches that can be used in this context are either computationally burdensome (permutation-based approaches), or do not perform well in settings such as small samples, high-dimensional traits, or misspecified phenotype model (asymptotic approximations based on prospective models), or require an assumption of second-order exchangeability of individuals’ genotypes, possibly after correction for ancestry-informative covariates (existing moment-matching methods for detecting association of two matrices). We develop JASPER, which can be viewed as an extension of existing moment-matching methods for detecting association of two matrices, to allow very general population structure and relatedness in the sample. JASPER can be used for a reasonably broad class of test statistics currently used in genetic association analysis, including most linear mixed model-based score tests and kernel-based test statistics. Notable features of JASPER are that it (1) is insensitive to misspecification of the phenotype model, (2) does not require knowledge of the distribution of the test statistic under the null hypothesis, (3) allows population structure, related individuals, covariates, ascertainment, rare variants, and multiple traits, and (4) with rare variant mapping, it does not require knowledge of the correlation structure among the rare variants. Through simulation studies, we demonstrate that JASPER properly controls type 1 error in the presence of sample structure and can provide substantial power gains compared to large-sample-based assessments of significance. 
JASPER is applied in a study of the genetic regulation of gene expression levels within biological pathways in data from the Framingham Heart Study.
BRASS: Permutation methods for binary traits in genetic association studies with structured samples
In genetic association analysis of complex traits, permutation testing can be a valuable tool for assessing significance when the distribution of the test statistic is unknown or not well-approximated. This commonly arises, e.g., in tests of gene-set, pathway or genome-wide significance, or when the statistic is formed by machine learning or data adaptive methods. Existing applications include eQTL mapping, association testing with rare variants, inclusion of admixed individuals in genetic association analysis, and epistasis detection, among many others. For genetic association testing in samples with population structure and/or relatedness, use of naive permutation can lead to inflated type 1 error. To address this in quantitative traits, the MVNpermute method was developed. However, for association mapping of a binary trait, the relationship between the mean and variance makes both naive permutation and the MVNpermute method invalid. We propose BRASS, a permutation method for binary traits, for use in association mapping in structured samples. In addition to modeling structure in the sample, BRASS allows for covariates, ascertainment and simultaneous testing of multiple markers, and it accommodates a wide range of test statistics. In simulation studies, we compare BRASS to other permutation and resampling-based methods in a range of scenarios that include population structure, familial relatedness, ascertainment and phenotype model misspecification. In these settings, we demonstrate the superior control of type 1 error by BRASS compared to the other 6 methods considered. We apply BRASS to assess genome-wide significance for association analyses in domestic dog for elbow dysplasia (ED) and idiopathic epilepsy (IE). For both traits we detect previously identified associations, and in addition, for ED, we detect significant association with a SNP on chromosome 35 that was not detected by previous analyses, demonstrating the potential of the method.
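For orientation, the naive permutation baseline that the abstract warns against can be sketched as below. This is not BRASS: it shuffles the binary trait as if all individuals were exchangeable, which is exactly the assumption that fails under population structure or relatedness. The test statistic (difference in mean genotype between cases and controls), the function names and the toy data are all hypothetical choices for illustration.

```python
import random


def mean_diff(x, y):
    """Mean of x among cases (y == 1) minus mean of x among controls (y == 0)."""
    cases = [xi for xi, yi in zip(x, y) if yi == 1]
    controls = [xi for xi, yi in zip(x, y) if yi == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def naive_permutation_pvalue(x, y, n_perm=999, seed=0):
    """Two-sided permutation P-value that treats individuals as exchangeable.

    Valid only for unstructured samples; with relatedness or population
    structure this naive scheme can inflate type 1 error.
    """
    rng = random.Random(seed)
    observed = abs(mean_diff(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # permute trait labels, keep genotypes fixed
        if abs(mean_diff(x, y_perm)) >= observed:
            hits += 1
    # Add-one correction so the estimated P-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: genotype dosages (0/1/2) and a binary trait for 12 individuals.
genotypes = [0, 0, 0, 1, 0, 0, 1, 2, 2, 1, 2, 2]
trait = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
p_naive = naive_permutation_pvalue(genotypes, trait)
print(f"naive permutation P-value: {p_naive:.4f}")
```

The add-one correction counts the observed statistic as one of the permutations, a common convention for Monte Carlo permutation tests.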
rgcgithub/regenie: Regenie v3.4
<ul>
<li>Reduction in memory usage for LD computation when dosages are present;<ul>
<li>compute LD matrix block-wise rather than all at once</li>
<li>expected memory usage is (3NB + M^2) * 8 bytes, where N is the sample size, B is the block size and M is the number of variants in the LD matrix</li>
<li>we recommend a block size of 1000, as choosing too small a block size will increase the number of block pairs evaluated</li>
</ul>
</li>
<li>Minor bug fixes for LD computation;</li>
<li>Bug fix for carriage return in optional files<ul>
<li>in keep/remove/extract/exclude/mask-definition/annotation files</li>
</ul>
</li>
</ul>
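The memory formula in the release notes can be evaluated directly. The sketch below is a plain calculator for (3NB + M^2) * 8 bytes, not part of regenie; the function name and the example sizes are hypothetical.

```python
def ld_memory_bytes(n_samples, block_size, n_variants):
    """Expected memory for block-wise LD computation, per the release notes.

    n_samples  -- N, the sample size
    block_size -- B, the number of variants per block
    n_variants -- M, the number of variants in the LD matrix
    """
    return (3 * n_samples * block_size + n_variants ** 2) * 8


# Hypothetical example: 500,000 samples, the recommended block size of 1000,
# and an LD matrix over 10,000 variants.
total = ld_memory_bytes(500_000, 1_000, 10_000)
print(f"{total} bytes = {total / 1e9:.1f} GB")
```

With these sizes the 3NB term dominates, so memory grows roughly in proportion to the block size until M^2 becomes comparable to 3NB.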
Germline Mutations in CIDEB and Protection against Liver Disease
BACKGROUND Exome sequencing in hundreds of thousands of persons may enable the identification of rare protein-coding genetic variants associated with protection from human diseases like liver cirrhosis, providing a strategy for the discovery of new therapeutic targets. METHODS We performed a multistage exome sequencing and genetic association analysis to identify genes in which rare protein-coding variants were associated with liver phenotypes. We conducted in vitro experiments to further characterize associations. RESULTS The multistage analysis involved 542,904 persons with available data on liver aminotransferase levels, 24,944 patients with various types of liver disease, and 490,636 controls without liver disease. We found that rare coding variants in APOB, ABCB4, SLC30A10, and TM6SF2 were associated with increased aminotransferase levels and an increased risk of liver disease. We also found that variants in CIDEB, which encodes a structural protein found in hepatic lipid droplets, had a protective effect. The burden of rare predicted loss-of-function variants plus missense variants in CIDEB (combined carrier frequency, 0.7%) was associated with decreased alanine aminotransferase levels (beta per allele, -1.24 U per liter; 95% confidence interval [CI], -1.66 to -0.83; P=4.8×10⁻⁹) and with 33% lower odds of liver disease of any cause (odds ratio per allele, 0.67; 95% CI, 0.57 to 0.79; P=9.9×10⁻⁷). Rare coding variants in CIDEB were associated with a decreased risk of liver disease across different underlying causes and different degrees of severity, including cirrhosis of any cause (odds ratio per allele, 0.50; 95% CI, 0.36 to 0.70). Among 3599 patients who had undergone bariatric surgery, rare coding variants in CIDEB were associated with a decreased nonalcoholic fatty liver disease activity score (beta per allele in score units, -0.98; 95% CI, -1.54 to -0.41 [scores range from 0 to 8, with higher scores indicating more severe disease]).
In human hepatoma cell lines challenged with oleate, CIDEB small interfering RNA knockdown prevented the buildup of large lipid droplets. CONCLUSIONS Rare germline mutations in CIDEB conferred substantial protection from liver disease.
Sequencing of 640,000 exomes identifies GPR75 variants associated with protection from obesity
Large-scale human exome sequencing can identify rare protein-coding variants with a large impact on complex traits such as body adiposity. We sequenced the exomes of 645,626 individuals from the United Kingdom, the United States, and Mexico and estimated associations of rare coding variants with body mass index (BMI). We identified 16 genes with an exome-wide significant association with BMI, including those encoding five brain-expressed G protein-coupled receptors (CALCR, MC4R, GIPR, GPR151, and GPR75). Protein-truncating variants in GPR75 were observed in ∼4/10,000 sequenced individuals and were associated with 1.8 kilograms per square meter lower BMI and 54% lower odds of obesity in the heterozygous state. Knockout of Gpr75 in mice resulted in resistance to weight gain and improved glycemic control in a high-fat diet model. Inhibition of GPR75 may provide a therapeutic strategy for obesity.
NOTCH3 p.Arg1231Cys is markedly enriched in South Asians and associated with stroke
Acknowledgements: Supported by Regeneron Pharmaceuticals, Inc. This research has been conducted using the UK Biobank Resource (project 26041). The authors thank everyone who made this work possible, particularly the UK Biobank team, their funders, the professionals from the member institutions who contributed to and supported this work, and most especially the UK Biobank participants, without whom this research would not be possible. The exome sequencing was funded by the UK Biobank Exome Sequencing Consortium (Bristol Myers Squibb, Regeneron, Biogen, Takeda, Abbvie, Alnylam, AstraZeneca and Pfizer). Ethical approval for the UK Biobank was previously obtained from the North West Center for Research Ethics Committee (11/NW/0382). Disclosure forms provided by the authors are available with the full text of this article.
The genetic factors of stroke in South Asians are largely unexplored. Exome-wide sequencing and association analysis (ExWAS) in 75K Pakistanis identified NM_000435.3(NOTCH3):c.3691C>T, encoding the missense amino acid substitution p.Arg1231Cys, enriched in South Asians (alternate allele frequency = 0.58% compared to 0.019% in Western Europeans), and associated with subcortical hemorrhagic stroke (odds ratio (OR) = 3.39, 95% confidence interval (CI) = [2.26, 5.10], p = 3.87 × 10⁻⁹) and all strokes (OR [CI] = 2.30 [1.77, 3.01], p = 7.79 × 10⁻¹⁰). NOTCH3 p.Arg1231Cys was strongly associated with white matter hyperintensity on MRI in United Kingdom Biobank (UKB) participants (effect [95% CI] in SD units = 1.1 [0.61, 1.5], p = 3.0 × 10⁻⁶). The variant accounts for approximately 2.0% of hemorrhagic strokes and 1.1% of all strokes in South Asians. These findings highlight the value of diversity in genetic studies and have major implications for genomic medicine and therapeutic development in South Asian populations.