191 research outputs found
A two-step multiple-marker strategy for genome-wide association studies
Genome-wide association studies raise study-design and analytical issues that are still being debated. Among them is the issue of reducing the number of markers to be genotyped without loss of efficiency in identifying trait loci, which can reduce the cost of studies and minimize the multiple testing problem. With this aim, we proposed a two-step strategy based on two analytical methods suited to examine sets of markers rather than single markers: the local score, which screens the genome to select candidate regions in step 1, and FBAT-LC, a multiple-marker family-based association test used to obtain significance levels of regions in step 2. The performance of this strategy was evaluated on all replicates of Genetic Analysis Workshop 15 Problem 3 simulated data, using the answers to that problem. Overall, seven of the nine generated trait loci were detected in at least 87% of the replicates using a framework designed to handle either association with the disease or association with the severity of disease. This multiple-marker strategy was compared to the single-marker approach. By considering regions instead of single markers, this strategy minimizes the multiple testing problem and the number of false-positive results
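The region-screening idea in step 1 can be illustrated with a minimal sketch. It assumes the common formulation of a local score, in which each marker contributes a score of -log10(p) minus a penalty and the statistic is the largest sum over any contiguous run of markers. The function name `local_score`, the penalty `xi`, and the example p-values are illustrative assumptions and are not taken from the abstract.

```python
import math

def local_score(p_values, xi=1.0):
    """Return (best_score, (start, end)) for the highest-scoring contiguous run.

    Each marker scores -log10(p) - xi (xi is an assumed penalty), and the
    running sum is reset to zero whenever it would become negative.
    """
    best, best_region = 0.0, None
    running, start = 0.0, 0
    for i, p in enumerate(p_values):
        if running == 0.0:
            start = i  # a new candidate region begins here
        running = max(0.0, running + (-math.log10(p) - xi))
        if running > best:
            best, best_region = running, (start, i)
    return best, best_region

# Hypothetical single-marker p-values along one chromosome segment.
example_p = [0.5, 0.6, 0.001, 0.01, 0.002, 0.7, 0.4]
score, region = local_score(example_p)
print(score, region)
```

A region selected this way would then be passed to a multiple-marker test in step 2; that second step is not sketched here.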
Inflated Type I Error Rates When Using Aggregation Methods to Analyze Rare Variants in the 1000 Genomes Project Exon Sequencing Data in Unrelated Individuals: Summary Results from Group 7 at Genetic Analysis Workshop 17
As part of Genetic Analysis Workshop 17 (GAW17), our group considered the application of novel and standard approaches to the analysis of genotype-phenotype association in next-generation sequencing data. Our group identified a major issue in the analysis of the GAW17 next-generation sequencing data: type I error and false-positive report probability rates higher than those expected based on empirical type I error levels (as high as 90%). Two main causes emerged: population stratification and long-range correlation (gametic phase disequilibrium) between rare variants. Population stratification was expected because of the diverse sample. Correlation between rare variants was attributable to both random causes (e.g., nearly 10,000 of 25,000 markers were private variants, and the sample size was small [n = 697]) and nonrandom causes (more correlation was observed than was expected by random chance). Principal components analysis was used to control for population structure and helped to minimize type I errors, but this was at the expense of identifying fewer causal variants. A novel multiple regression approach showed promise to handle correlation between markers. Further work is needed, first, to identify best practices for the control of type I errors in the analysis of sequencing data and then to explore and compare the many promising new aggregating approaches for identifying markers associated with disease phenotypes
Combining effects from rare and common genetic variants in an exome-wide association study of sequence data
Recent breakthroughs in next-generation sequencing technologies allow cost-effective methods for measuring a growing list of cellular properties, including DNA sequence and structural variation. Next-generation sequencing has the potential to revolutionize complex trait genetics by directly measuring common and rare genetic variants within a genome-wide context. Because for a given gene both rare and common causal variants can coexist and have independent effects on a trait, strategies that model the effects of both common and rare variants could enhance the power of identifying disease-associated genes. To date, little work has been done on integrating signals from common and rare variants into powerful statistics for finding disease genes in genome-wide association studies. In this analysis of the Genetic Analysis Workshop 17 data, we evaluate various strategies for association of rare, common, or a combination of both rare and common variants on quantitative phenotypes in unrelated individuals. We show that the analysis of common variants only using classical approaches can achieve higher power to detect causal genes than recently proposed rare variant methods and that strategies that combine association signals derived independently in rare and common variants can slightly increase the power compared to strategies that focus on the effect of either the rare variants or the common variants
lme4qtl: linear mixed models with flexible covariance structure for genetic studies of related individuals.
BACKGROUND: Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly performed by modeling such correlations as random effects in linear mixed models (LMMs). The R package lme4 is a well-established tool that implements major LMM features using sparse matrix methods; however, it is not fully adapted for QTL mapping association and linkage studies. In particular, two LMM features are lacking in the base version of lme4: the definition of random effects by custom covariance matrices; and parameter constraints, which are essential in advanced QTL models. Apart from applications in linkage studies of related individuals, such functionalities are of high interest for association studies in situations where multiple covariance matrices need to be modeled, a scenario not covered by many genome-wide association study (GWAS) software. RESULTS: To address the aforementioned limitations, we developed a new R package lme4qtl as an extension of lme4. First, lme4qtl contributes new models for genetic studies within a single tool integrated with lme4 and its companion packages. Second, lme4qtl offers a flexible framework for scenarios with multiple levels of relatedness and becomes efficient when covariance matrices are sparse. We showed the value of our package using real family-based data in the Genetic Analysis of Idiopathic Thrombophilia 2 (GAIT2) project. CONCLUSIONS: Our software lme4qtl enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances. lme4qtl is available at https://github.com/variani/lme4qtl
Exploring Genome-Wide – Dietary Heme Iron Intake Interactions and the Risk of Type 2 Diabetes
Aims/hypothesis: Genome-wide association studies have identified over 50 new genetic loci for type 2 diabetes (T2D). Several studies conclude that higher dietary heme iron intake increases the risk of T2D. Therefore we assessed whether the relation between genetic loci and T2D is modified by dietary heme iron intake. Methods: We used Affymetrix Genome-Wide Human 6.0 array data [681,770 single nucleotide polymorphisms (SNPs)] and dietary information collected in the Health Professionals Follow-up Study (n = 725 cases; n = 1,273 controls) and the Nurses' Health Study (n = 1,081 cases; n = 1,692 controls). We assessed whether genome-wide SNPs or iron metabolism SNPs interacted with dietary heme iron intake in relation to T2D, testing for associations in each cohort separately and then meta-analyzing to pool the results. Finally, we created 1,000 synthetic pathways matched to an iron metabolism pathway on number of genes, and number of SNPs in each gene. We compared the iron metabolic pathway SNPs with these synthetic SNP assemblies in their relation to T2D to assess if the pathway as a whole interacts with dietary heme iron intake. Results: Using a genomic approach, we found no significant gene-environment interactions with dietary heme iron intake in relation to T2D at a Bonferroni corrected genome-wide significance level of (top SNP in pooled analysis: intergenic rs10980508; ). Furthermore, no SNP in the iron metabolic pathway significantly interacted with dietary heme iron intake at a Bonferroni corrected significance level of (top SNP in pooled analysis: rs1805313; ). Finally, neither the main genetic effects (pooled empirical p by SNP = 0.41), nor gene – dietary heme-iron interactions (pooled empirical p-value for the interactions = 0.72) were significant for the iron metabolic pathway as a whole. Conclusions: We found no significant interactions between dietary heme iron intake and common SNPs in relation to T2D
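As a small worked illustration of how a Bonferroni-corrected threshold is obtained, the sketch below divides a family-wise error rate by the number of tests. The SNP count comes from the abstract; the family-wise rate of 0.05 and the function name `bonferroni_threshold` are assumptions, and the result is not claimed to be the threshold the authors reported.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

# 681,770 SNPs is the array size stated in the abstract; alpha = 0.05 is assumed.
genome_wide = bonferroni_threshold(681_770)
print(f"{genome_wide:.2e}")
```

The same function applies to a smaller pathway-level SNP set, where fewer tests give a less stringent per-test threshold.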
Screening for interaction effects in gene expression data
Expression quantitative trait locus (eQTL) studies are a powerful tool for identifying genetic variants that affect levels of messenger RNA. Since gene expression is controlled by a complex network of gene-regulating factors, one way to identify these factors is to search for interaction effects between genetic variants and mRNA levels of transcription factors (TFs) and their respective target genes. However, identification of interaction effects in gene expression data poses a variety of methodological challenges, and it has become clear that such analyses should be conducted and interpreted with caution. Investigating the validity and interpretability of several interaction tests when screening for eQTL SNPs whose effect on the target gene expression is modified by the expression level of a transcription factor, we characterized two important methodological issues. First, we stress the scale-dependency of interaction effects and highlight that commonly applied transformation of gene expression data can induce or remove interactions, making interpretation of results more challenging. We then demonstrate that, in the setting of moderate to strong interaction effects on the order of what may be reasonably expected for eQTL studies, standard interaction screening can be biased due to heteroscedasticity induced by true interactions. Using simulation and real data analysis, we outline a set of reasonable minimum conditions and sample size requirements for reliable detection of variant-by-environment and variant-by-TF interactions using the heteroscedasticity consistent covariance-based approach
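The scale-dependency point can be made concrete with a noise-free toy example. It assumes expression that is exactly additive in genotype and TF level on the log scale, then fits an ordinary least-squares model with a product term on both the log and the raw scale. All names and coefficients (`interaction_coef`, `g`, `tf`, 0.5, 0.3) are illustrative assumptions, not values from the study.

```python
import numpy as np

def interaction_coef(y, g, tf):
    """OLS coefficient of the g*tf product term in y ~ 1 + g + tf + g*tf."""
    X = np.column_stack([np.ones_like(g), g, tf, g * tf])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]

# Balanced toy design: genotype coded 0/1/2, TF expression at four levels.
g = np.tile([0.0, 1.0, 2.0], 4)
tf = np.repeat([0.0, 1.0, 2.0, 3.0], 3)

# Purely additive on the log scale (no interaction by construction).
log_y = 1.0 + 0.5 * g + 0.3 * tf
raw_y = np.exp(log_y)  # the same data on the untransformed scale

log_scale = interaction_coef(log_y, g, tf)
raw_scale = interaction_coef(raw_y, g, tf)
print(log_scale, raw_scale)
```

The product-term coefficient is essentially zero on the log scale and clearly positive on the raw scale, so the presence of an "interaction" here depends only on the transformation chosen.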
Fungal microbiota dysbiosis in IBD.
The bacterial intestinal microbiota plays major roles in human physiology and IBDs. Although some data suggest a role of the fungal microbiota in IBD pathogenesis, the available data are scarce. The aim of our study was to characterise the faecal fungal microbiota in patients with IBD. Bacterial and fungal composition of the faecal microbiota of 235 patients with IBD and 38 healthy subjects (HS) was determined using 16S and ITS2 sequencing, respectively. The obtained sequences were analysed using the Qiime pipeline to assess composition and diversity. Bacterial and fungal taxa associated with clinical parameters were identified using multivariate association with linear models. Correlation between bacterial and fungal microbiota was investigated using Spearman's test and distance correlation. We observed that fungal microbiota is skewed in IBD, with an increased Basidiomycota/Ascomycota ratio, a decreased proportion of Saccharomyces cerevisiae and an increased proportion of Candida albicans compared with HS. We also identified disease-specific alterations in diversity, indicating that a Crohn's disease-specific gut environment may favour fungi at the expense of bacteria. The concomitant analysis of bacterial and fungal microbiota showed a dense and homogenous correlation network in HS but a dramatically unbalanced network in IBD, suggesting the existence of disease-specific inter-kingdom alterations. Besides bacterial dysbiosis, our study identifies a distinct fungal microbiota dysbiosis in IBD characterised by alterations in biodiversity and composition. Moreover, we unravel here disease-specific inter-kingdom network alterations in IBD, suggesting that, beyond bacteria, fungi might also play a role in IBD pathogenesis
The association between serum lipids and intraocular pressure in two large UK cohorts
PURPOSE: Serum lipids are modifiable, routinely collected blood tests associated with cardiovascular health. We examined the association of commonly collected serum lipid measures (total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C) and triglycerides (TG)) with intraocular pressure (IOP). DESIGN: Cross-sectional study in the UK Biobank and EPIC-Norfolk cohorts. PARTICIPANTS: We included 94 323 participants of UK Biobank (mean age 57 years) and 6 230 participants of EPIC-Norfolk (mean age 68 years) with data on TC, HDL-C, LDL-C, TG collected between 2006 and 2009. METHODS: Multivariable linear regression adjusting for demographic, lifestyle, anthropometric, medical and ophthalmic covariables was used to examine the associations of serum lipids with corneal-compensated IOP (IOPcc). MAIN OUTCOME MEASURES: IOPcc. RESULTS: Higher levels of TC, HDL-C and LDL-C were independently associated with higher IOPcc in both cohorts after adjustment for key demographic, medical and lifestyle factors. For each standard deviation increase in TC, HDL-C, and LDL-C, IOPcc (mmHg) was higher by 0.09 (95% CI: 0.06-0.11; P<0.001), 0.11 (95% CI 0.08-0.13; P<0.001), 0.07 (95% CI: 0.05-0.09, P<0.001), respectively in the UK Biobank cohort. In the EPIC-Norfolk cohort, each additional standard deviation in TC, HDL-C, and LDL-C was associated with a higher IOPcc (mmHg) by 0.19 (95% CI 0.07-0.31, P=0.001), 0.14 (95% CI 0.03-0.25, P=0.016), and 0.17 (95% CI 0.06-0.29, P=0.003). An inverse association between TGs and IOP in the UK Biobank (-0.05, 95% CI -0.08 to -0.03, P<0.001) was not replicated in the EPIC cohort (P=0.30). CONCLUSION: Our findings suggest that serum TC, HDL-C and LDL-C are positively associated with IOP in two UK cohorts and TGs may be negatively associated. Future research is required to assess whether these associations are causal in nature
The Association of Physical Activity with Glaucoma and Related Traits in the UK Biobank
PURPOSE: To examine the association of physical activity (PA) with glaucoma and related traits, to assess whether genetic predisposition to glaucoma modified these associations, and to probe causal relationships using Mendelian randomization (MR). DESIGN: Cross-sectional observational and gene-environment interaction analyses in the UK Biobank. Two-sample MR experiments using summary statistics from large genetic consortia. PARTICIPANTS: UK Biobank participants with data on self-reported or accelerometer-derived PA and intraocular pressure (IOP; n = 94 206 and n = 27 777, respectively), macular inner retinal OCT measurements (n = 36 274 and n = 9991, respectively), and glaucoma status (n = 86 803 and n = 23 556, respectively). METHODS: We evaluated multivariable-adjusted associations of self-reported (International Physical Activity Questionnaire) and accelerometer-derived PA with IOP and macular inner retinal OCT parameters using linear regression and with glaucoma status using logistic regression. For all outcomes, we examined gene-PA interactions using a polygenic risk score (PRS) that combined the effects of 2673 genetic variants associated with glaucoma. MAIN OUTCOME MEASURES: Intraocular pressure, macular retinal nerve fiber layer (mRNFL) thickness, macular ganglion cell-inner plexiform layer (mGCIPL) thickness, and glaucoma status. RESULTS: In multivariable-adjusted regression models, we found no association of PA level or time spent in PA with glaucoma status. Higher overall levels and greater time spent in higher levels of both self-reported and accelerometer-derived PA were associated positively with thicker mGCIPL (P < 0.001 for trend for each). Compared with the lowest quartile of PA, participants in the highest quartiles of accelerometer-derived moderate- and vigorous-intensity PA showed a thicker mGCIPL by +0.57 µm (P < 0.001) and +0.42 µm (P = 0.005). No association was found with mRNFL thickness. High overall level of self-reported PA was associated with a modestly higher IOP of +0.08 mmHg (P = 0.01), but this was not replicated in the accelerometry data. No associations were modified by a glaucoma PRS, and MR analyses did not support a causal relationship between PA and any glaucoma-related outcome. CONCLUSIONS: Higher overall PA level and greater time spent in moderate and vigorous PA were not associated with glaucoma status but were associated with thicker mGCIPL. Associations with IOP were modest and inconsistent. Despite the well-documented acute reduction in IOP after PA, we found no evidence that high levels of habitual PA are associated with glaucoma status or IOP in the general population. FINANCIAL DISCLOSURE(S): Proprietary or commercial disclosure may be found after the references