194 research outputs found

    A two-step multiple-marker strategy for genome-wide association studies

    Genome-wide association studies raise study-design and analytical issues that are still being debated. Among them is the issue of reducing the number of markers to be genotyped without loss of efficiency in identifying trait loci, which can reduce the cost of studies and minimize the multiple testing problem. With this aim, we proposed a two-step strategy based on two analytical methods suited to examine sets of markers rather than single markers: the local score, which screens the genome to select candidate regions in Step 1, and FBAT-LC, a multiple-marker family-based association test used to obtain significance levels of regions in Step 2. The performance of this strategy was evaluated on all replicates of Genetic Analysis Workshop 15 Problem 3 simulated data, using the answers to that problem. Overall, seven of the nine generated trait loci were detected in at least 87% of the replicates using a framework designed to handle either association with the disease or association with the severity of disease. This multiple-marker strategy was compared to the single-marker approach. By considering regions instead of single markers, this strategy minimizes the multiple testing problem and the number of false-positive results
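
The Step 1 screening idea in the abstract above can be illustrated with a minimal sketch. It assumes one common formulation of a local score, in which each marker contributes -log10(p) minus a threshold and the score of a region is the largest sum over contiguous markers. The abstract does not give the scoring function, so the function name `local_score`, the threshold parameter `xi`, and the p-values below are all hypothetical.

```python
import math


def local_score(p_values, xi=1.0):
    """Find the contiguous run of markers with the highest cumulative score.

    Assumed scoring (not taken from the abstract): marker i contributes
    -log10(p_i) - xi, so only markers with p < 10**(-xi) add to the score.
    Returns (best_score, start_index, end_index), indices inclusive.
    """
    best_score, best_start, best_end = 0.0, None, None
    running, start = 0.0, 0
    for i, p in enumerate(p_values):
        contribution = -math.log10(p) - xi
        if running == 0.0:
            # A new candidate region starts at this marker.
            start = i
        # Reset to zero whenever the cumulative sum would go negative.
        running = max(0.0, running + contribution)
        if running > best_score:
            best_score, best_start, best_end = running, start, i
    return best_score, best_start, best_end


# Hypothetical single-marker p-values along one chromosome.
example_p = [0.5, 0.3, 0.001, 0.002, 0.4, 0.0005, 0.6, 0.7]
score, first, last = local_score(example_p, xi=1.0)
print(f"candidate region: markers {first}..{last}, local score {score:.3f}")
```

In this sketch a weak marker inside a run of strong ones (p = 0.4 above) lowers the cumulative score without ending the region, which is what lets a region-based screen pick up clusters that a single-marker threshold would split.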

    Inflated Type I Error Rates When Using Aggregation Methods to Analyze Rare Variants in the 1000 Genomes Project Exon Sequencing Data in Unrelated Individuals: Summary Results from Group 7 at Genetic Analysis Workshop 17

    As part of Genetic Analysis Workshop 17 (GAW17), our group considered the application of novel and standard approaches to the analysis of genotype-phenotype association in next-generation sequencing data. Our group identified a major issue in the analysis of the GAW17 next-generation sequencing data: type I error and false-positive report probability rates higher than those expected based on empirical type I error levels (as high as 90%). Two main causes emerged: population stratification and long-range correlation (gametic phase disequilibrium) between rare variants. Population stratification was expected because of the diverse sample. Correlation between rare variants was attributable to both random causes (e.g., nearly 10,000 of 25,000 markers were private variants, and the sample size was small [n = 697]) and nonrandom causes (more correlation was observed than was expected by random chance). Principal components analysis was used to control for population structure and helped to minimize type I errors, but this was at the expense of identifying fewer causal variants. A novel multiple regression approach showed promise to handle correlation between markers. Further work is needed, first, to identify best practices for the control of type I errors in the analysis of sequencing data and then to explore and compare the many promising new aggregating approaches for identifying markers associated with disease phenotypes

    lme4qtl: linear mixed models with flexible covariance structure for genetic studies of related individuals.

    BACKGROUND: Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly performed by modeling such correlations as random effects in linear mixed models (LMMs). The R package lme4 is a well-established tool that implements major LMM features using sparse matrix methods; however, it is not fully adapted for QTL mapping association and linkage studies. In particular, two LMM features are lacking in the base version of lme4: the definition of random effects by custom covariance matrices; and parameter constraints, which are essential in advanced QTL models. Apart from applications in linkage studies of related individuals, such functionalities are of high interest for association studies in situations where multiple covariance matrices need to be modeled, a scenario not covered by many genome-wide association study (GWAS) software packages. RESULTS: To address the aforementioned limitations, we developed a new R package lme4qtl as an extension of lme4. First, lme4qtl contributes new models for genetic studies within a single tool integrated with lme4 and its companion packages. Second, lme4qtl offers a flexible framework for scenarios with multiple levels of relatedness and becomes efficient when covariance matrices are sparse. We showed the value of our package using real family-based data in the Genetic Analysis of Idiopathic Thrombophilia 2 (GAIT2) project. CONCLUSIONS: Our software lme4qtl enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances. lme4qtl is available at https://github.com/variani/lme4qtl

    An Approach to Identify Gene-Environment Interactions and Reveal New Biological Insight in Complex Traits

    There is a long-standing debate about the magnitude of the contribution of gene-environment interactions to phenotypic variations of complex traits owing to the low statistical power and few reported interactions to date. To address this issue, the Gene-Lifestyle Interactions Working Group within the Cohorts for Heart and Aging Research in Genetic Epidemiology Consortium has been spearheading efforts to investigate G × E in large and diverse samples through meta-analysis. Here, we present a powerful new approach to screen for interactions across the genome, an approach that shares substantial similarity to the Mendelian randomization framework. We identify and confirm 5 loci (6 independent signals) that interact with either cigarette smoking or alcohol consumption for serum lipids, and empirically demonstrate that interaction and mediation are the major contributors to genetic effect size heterogeneity across populations. The estimated lower bound of the interaction and environmentally mediated heritability is significant (P < 0.02) for low-density lipoprotein cholesterol and triglycerides in cross-population data. Our study improves the understanding of the genetic architecture and environmental contributions to complex traits

    Fungal microbiota dysbiosis in IBD.

    The bacterial intestinal microbiota plays major roles in human physiology and IBDs. Although some data suggest a role of the fungal microbiota in IBD pathogenesis, the available data are scarce. The aim of our study was to characterise the faecal fungal microbiota in patients with IBD. Bacterial and fungal composition of the faecal microbiota of 235 patients with IBD and 38 healthy subjects (HS) was determined using 16S and ITS2 sequencing, respectively. The obtained sequences were analysed using the Qiime pipeline to assess composition and diversity. Bacterial and fungal taxa associated with clinical parameters were identified using multivariate association with linear models. Correlation between bacterial and fungal microbiota was investigated using Spearman's test and distance correlation. We observed that fungal microbiota is skewed in IBD, with an increased Basidiomycota/Ascomycota ratio, a decreased proportion of Saccharomyces cerevisiae and an increased proportion of Candida albicans compared with HS. We also identified disease-specific alterations in diversity, indicating that a Crohn's disease-specific gut environment may favour fungi at the expense of bacteria. The concomitant analysis of bacterial and fungal microbiota showed a dense and homogenous correlation network in HS but a dramatically unbalanced network in IBD, suggesting the existence of disease-specific inter-kingdom alterations. Besides bacterial dysbiosis, our study identifies a distinct fungal microbiota dysbiosis in IBD characterised by alterations in biodiversity and composition. Moreover, we unravel here disease-specific inter-kingdom network alterations in IBD, suggesting that, beyond bacteria, fungi might also play a role in IBD pathogenesis

    The association between serum lipids and intraocular pressure in two large UK cohorts

    PURPOSE: Serum lipids are modifiable, routinely collected blood tests associated with cardiovascular health. We examined the association of commonly collected serum lipid measures (total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C) and triglycerides (TG)) with intraocular pressure (IOP). DESIGN: Cross-sectional study in the UK Biobank and EPIC-Norfolk cohorts. PARTICIPANTS: We included 94 323 participants of UK Biobank (mean age 57 years) and 6 230 participants of EPIC-Norfolk (mean age 68 years) with data on TC, HDL-C, LDL-C and TG collected between 2006 and 2009. METHODS: Multivariable linear regression adjusting for demographic, lifestyle, anthropometric, medical and ophthalmic covariables was used to examine the associations of serum lipids with corneal-compensated IOP (IOPcc). MAIN OUTCOME MEASURES: IOPcc. RESULTS: Higher levels of TC, HDL-C and LDL-C were independently associated with higher IOPcc in both cohorts after adjustment for key demographic, medical and lifestyle factors. For each standard deviation increase in TC, HDL-C and LDL-C, IOPcc (mmHg) was higher by 0.09 (95% CI 0.06-0.11, P<0.001), 0.11 (95% CI 0.08-0.13, P<0.001) and 0.07 (95% CI 0.05-0.09, P<0.001), respectively, in the UK Biobank cohort. In the EPIC-Norfolk cohort, each additional standard deviation in TC, HDL-C and LDL-C was associated with a higher IOPcc (mmHg) by 0.19 (95% CI 0.07-0.31, P=0.001), 0.14 (95% CI 0.03-0.25, P=0.016) and 0.17 (95% CI 0.06-0.29, P=0.003), respectively. An inverse association between TG and IOP in the UK Biobank cohort (-0.05, 95% CI -0.08 to -0.03, P<0.001) was not replicated in the EPIC-Norfolk cohort (P=0.30). CONCLUSION: Our findings suggest that serum TC, HDL-C and LDL-C are positively associated with IOP in two UK cohorts and that TG may be negatively associated. Future research is required to assess whether these associations are causal in nature
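
The "per standard deviation" effect sizes quoted in the abstract above can be read as a raw regression slope rescaled by the spread of the exposure. The sketch below shows that rescaling for a simple, unadjusted one-predictor regression; the study itself used multivariable models, and the function name `per_sd_slope` and the toy numbers are hypothetical, not data from either cohort.

```python
import statistics


def per_sd_slope(exposure, outcome):
    """Change in outcome per one-SD increase in exposure (unadjusted).

    Fits outcome = a + b * exposure by ordinary least squares, then
    multiplies the raw slope b by the sample SD of the exposure.
    """
    mean_x = statistics.fmean(exposure)
    mean_y = statistics.fmean(outcome)
    sxx = sum((x - mean_x) ** 2 for x in exposure)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(exposure, outcome))
    raw_slope = sxy / sxx  # outcome units per exposure unit
    return raw_slope * statistics.stdev(exposure)


# Toy data: a lipid measure (mmol/L) and a pressure reading (mmHg).
lipid = [4.0, 5.0, 6.0, 7.0, 8.0]
pressure = [15.0 + 0.5 * value for value in lipid]
print(f"{per_sd_slope(lipid, pressure):.3f} mmHg per SD of the lipid measure")
```

Reporting slopes per SD puts exposures measured on different scales (here TC, HDL-C, LDL-C and TG) on a comparable footing, which is why the abstract can list the coefficients side by side.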