59 research outputs found

    kruX: Matrix-based non-parametric eQTL discovery

    The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data due to its robustness against variations in the underlying genetic model and expression trait distribution, but testing billions of marker-trait combinations one-by-one can become computationally prohibitive. We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to simultaneously calculate the Kruskal-Wallis test statistic for several millions of marker-trait combinations at once. KruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA and linear model methods. We found that the Kruskal-Wallis test is more robust against data outliers and heterogeneous genotype group sizes and detects a higher proportion of non-linear associations, but is more conservative for calling additive linear associations. In summary, kruX enables the use of robust non-parametric methods for massive eQTL mapping without the need for a high-performance computing infrastructure. Comment: minor revision; 6 pages, 5 figures; software available at http://krux.googlecode.co
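
The matrix formulation described in this abstract can be illustrated with a minimal NumPy sketch. This is not the kruX code itself: the function name `kruskal_wallis_matrix` is hypothetical, and the sketch assumes no tied expression values (so no tie correction is applied) and no missing genotypes. It shows how one matrix product per genotype group yields the rank sums for all marker-trait pairs at once.

```python
import numpy as np


def kruskal_wallis_matrix(genotypes, expression):
    """Kruskal-Wallis H for every marker-trait pair.

    genotypes:  (markers, samples) array of discrete genotype codes.
    expression: (traits, samples) array of expression values.
    Returns an array of shape (markers, traits).
    Assumption: no ties within a trait, so no tie correction is applied.
    """
    genotypes = np.asarray(genotypes)
    expression = np.asarray(expression, dtype=float)
    n_samples = expression.shape[1]

    # Rank each trait across samples (1..N); a double argsort gives ranks
    # when there are no ties.
    ranks = np.argsort(np.argsort(expression, axis=1), axis=1) + 1.0

    total = np.zeros((genotypes.shape[0], expression.shape[0]))
    for g in np.unique(genotypes):
        indicator = (genotypes == g).astype(float)         # markers x samples
        group_size = indicator.sum(axis=1, keepdims=True)  # markers x 1
        rank_sums = indicator @ ranks.T                    # markers x traits
        # An empty group has rank sum 0; dividing by 1 keeps its term at 0.
        safe_size = np.where(group_size > 0, group_size, 1.0)
        total += rank_sums ** 2 / safe_size

    return 12.0 / (n_samples * (n_samples + 1)) * total - 3.0 * (n_samples + 1)


# Toy data: two markers, one trait, six samples.
G = np.array([[0, 0, 1, 1, 2, 2],
              [0, 0, 0, 1, 1, 1]])
X = np.array([[1.2, 3.4, 2.2, 5.1, 0.3, 4.4]])
H = kruskal_wallis_matrix(G, X)
```

The loop runs only over the few genotype values, so the cost is dominated by the matrix products rather than by a per-pair Python loop. P-values would follow from a chi-squared distribution with (number of non-empty groups − 1) degrees of freedom, which is left out here.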

    Experimental investigation into l-Arg and l-Cys eco-friendly surfactants in enhanced oil recovery by considering IFT reduction and wettability alteration

    Surfactant flooding is an important technique for improving oil recovery from mature oil reservoirs: surfactant agents (cationic, anionic, non-ionic, or amphoteric) minimize the interfacial tension (IFT) between oil and water and/or alter the rock wettability toward water-wet. In this study, two amino-acid based surfactants, named lauroyl arginine (l-Arg) and lauroyl cysteine (l-Cys), were synthesized and used to reduce the IFT of oil–water systems and alter the wettability of carbonate rocks, thus improving oil recovery from oil-wet carbonate reservoirs. The synthesized surfactants were characterized using Fourier transform infrared spectroscopy and nuclear magnetic resonance analyses, and the critical micelle concentration (CMC) of surfactant solutions was determined using conductivity, pH, and turbidity techniques. Experimental results showed that the CMCs of l-Arg and l-Cys solutions were 2000 and 4500 ppm, respectively. With l-Arg and l-Cys solutions at their CMCs, the IFT was reduced from 34.5 mN/m to 18.0 and 15.4 mN/m, respectively, and the contact angle from 144° to 78° and 75°, respectively. The l-Arg and l-Cys solutions thus enabled approximately 11.9% and 8.9% additional recovery of OOIP (original oil in place). Both amino-acid surfactants can therefore be used to improve oil recovery, owing to their desirable effects on the EOR mechanisms at their CMC ranges

    Genetic susceptibility loci for cardiovascular disease and their impact on atherosclerotic plaques

    Background: Atherosclerosis is a chronic inflammatory disease in part caused by lipid uptake in the vascular wall, but the exact underlying mechanisms leading to acute myocardial infarction and stroke remain poorly understood. Large consortia identified genetic susceptibility loci that associate with large artery ischemic stroke and coronary artery disease. However, deciphering their underlying mechanisms is challenging. Histological studies identified destabilizing characteristics in human atherosclerotic plaques that associate with clinical outcome. To what extent established susceptibility loci for large artery ischemic stroke and coronary artery disease relate to plaque characteristics is thus far unknown but may point to novel mechanisms. Methods: We studied the associations of 61 established cardiovascular risk loci with 7 histological plaque characteristics assessed in 1443 carotid plaque specimens from the Athero-Express Biobank Study. We also assessed whether the genotyped cardiovascular risk loci impact tissue-specific gene expression in 2 independent biobanks, Biobank of Karolinska Endarterectomy and Stockholm Atherosclerosis Gene Expression. Results: A total of 21 established risk variants (out of 61) were nominally associated with a plaque characteristic. One variant (rs12539895, risk allele A) at 7q22 was associated with a reduction of intraplaque fat, P=5.09×10⁻⁶ after correction for multiple testing. We further characterized this 7q22 locus and show tissue-specific effects of rs12539895 on HBP1 expression in plaques and COG5 expression in whole blood, and provide data from public resources showing an association with decreased LDL (low-density lipoprotein) and increased HDL (high-density lipoprotein) in the blood. Conclusions: Our study supports the view that cardiovascular susceptibility loci may exert their effect by influencing the atherosclerotic plaque characteristics

    Detection of regulator genes and eQTLs in gene networks

    Genetic differences between individuals associated with quantitative phenotypic traits, including disease states, are usually found in non-coding genomic regions. These genetic variants are often also associated with differences in expression levels of nearby genes (they are "expression quantitative trait loci" or eQTLs for short) and presumably play a gene regulatory role, affecting the status of molecular networks of interacting genes, proteins and metabolites. Computational systems biology approaches to reconstruct causal gene networks from large-scale omics data have therefore become essential to understand the structure of networks controlled by eQTLs together with other regulatory genes, and to generate detailed hypotheses about the molecular mechanisms that lead from genotype to phenotype. Here we review the main analytical methods and software to identify eQTLs and their associated genes, to reconstruct co-expression networks and modules, to reconstruct causal Bayesian gene and module networks, and to validate predicted networks in silico. Comment: minor revision with typos corrected; review article; 24 pages, 2 figure

    Cross-Tissue Regulatory Gene Networks in Coronary Artery Disease

    Coronary artery disease (CAD) is the underlying cause of myocardial infarction and stroke that together are responsible for nearly 30% of all global deaths. CAD is a common complex disease caused by the interactions of multiple genetic and environmental risk factors acting across several metabolic and vascular tissues. Owing to the complexity of these interactions, systems genetics is an increasingly recognized path to a better understanding of complex diseases. In this thesis, we applied systems genetics by integrating the analysis of genotype (DNA) and global gene expression (RNA) data from metabolic and vascular tissues with phenotype data from the clinically well-characterized subjects in the Stockholm Atherosclerosis Gene Expression (STAGE) study. We validated the initial findings using genome-wide association studies (GWAS) and several gene expression datasets from mice and cell models. As a result, we inferred, for the first time, regulatory gene networks (RGNs) with key drivers of CAD, several of its main risk factors and atherosclerosis regression. In paper I, we designed a computational pipeline to reconstruct RGNs with key drivers in CAD using the STAGE study. Then, by integrating expression quantitative trait loci (eQTLs) of these RGNs with genotype data from several GWAS, 30 CAD-causal RGNs interconnected in blood, vascular and metabolic tissues were identified. Twelve of these RGNs were further validated in gene expression and phenotype data from the Hybrid Mouse Diversity Panel. As proof of concept, by targeting the key drivers AIP, DRAP1, POLR2I, and PQBP1 in a cross-species-validated, arterial-wall RGN involving RNA-processing genes, we re-identified this RGN in THP-1 foam cells and independent gene expression data from CAD macrophages and carotid lesions. In paper II, we developed a cross-tissue weighted gene co-expression network analysis (X-WGCNA) method (used in paper I) that reliably captures gene activities both within and across tissues. X-WGCNA is implemented as a package in R and is available online. In paper III, we inferred transcription factor (TF) RGNs from three plasma cholesterol lowering (PCL)-responsive gene sets causally related to regression of early, mature, and advanced mouse atherosclerosis. We then used THP-1 cells in an in vitro atherosclerosis regression model to successfully validate 3 key drivers in these RGNs driving regression in early (PPARG), mature (MLL5), and advanced (SRSF10/XRN2) atherosclerosis. In paper IV, we inferred the STAGE eQTLs (used in papers I and II) and identified subsets with gene regulatory effects across multiple tissues that according to GWAS were highly enriched in association with CAD. To better understand the pathophysiological role of these multi-tissue eQTLs, we identified and analyzed a number of associated gene sets. A key result of this thesis is a repository of RGNs with key drivers for CAD, CAD risk factors, and atherosclerosis regression. This repository together with the computational pipeline including X-WGCNA should be useful in future studies that aim to go beyond genetic loci identified by GWAS and provide opportunities for novel diagnostics and therapies

    Human Validation of Genes Associated With a Murine Atherosclerotic Phenotype

    Objective: The genetically modified mouse is the most commonly used animal model for studying the pathogenesis of atherosclerotic disease. We aimed to assess if mouse atherosclerosis-related genes could be validated in human disease through examination of results from genome-wide association studies. Approach and Results: We performed a systematic review to identify atherosclerosis-causing genes in mice and carried out gene-based association tests of their human orthologs for an association with human coronary artery disease and human large artery ischemic stroke. Moreover, we investigated the association of these genes with human atherosclerotic plaque characteristics. In addition, we assessed the presence of tissue-specific cis-acting expression quantitative trait loci for these genes in humans. Finally, using pathway analyses we show that the putative atherosclerosis-causing genes revealed few associations with human coronary artery disease, large artery ischemic stroke, or atherosclerotic plaque characteristics, despite the fact that the majority of these genes have cis-acting expression quantitative trait loci. Conclusions: A role for genes that has been observed in mice for atherosclerotic lesion development could scarcely be confirmed by studying associations of disease development with common human genetic variants. The value of murine atherosclerotic models for selection of therapeutic targets in human disease remains unclear

    Model-based clustering of multi-tissue gene expression data

    Motivation: Recently, it has become feasible to generate large-scale, multi-tissue gene expression data, where expression profiles are obtained from multiple tissues or organs sampled from dozens to hundreds of individuals. When traditional clustering methods are applied to this type of data, important information is lost, because they either require all tissues to be analyzed independently, ignoring dependencies and similarities between tissues, or to merge tissues in a single, monolithic dataset, ignoring individual characteristics of tissues. Results: We developed a Bayesian model-based multi-tissue clustering algorithm, revamp, which can incorporate prior information on physiological tissue similarity, and which results in a set of clusters, each consisting of a core set of genes conserved across tissues as well as differential sets of genes specific to one or more subsets of tissues. Using data from seven vascular and metabolic tissues from over 100 individuals in the STockholm Atherosclerosis Gene Expression (STAGE) study, we demonstrate that multi-tissue clusters inferred by revamp are more enriched for tissue-dependent protein-protein interactions compared to alternative approaches. We further demonstrate that revamp results in easily interpretable multi-tissue gene expression associations to key coronary artery disease processes and clinical phenotypes in the STAGE individuals

    Different prognostic impact of recurrent gene mutations in chronic lymphocytic leukemia depending on IGHV gene somatic hypermutation status: a study by ERIC in HARMONY

    Recent evidence suggests that the prognostic impact of gene mutations in patients with chronic lymphocytic leukemia (CLL) may differ depending on the immunoglobulin heavy variable (IGHV) gene somatic hypermutation (SHM) status. In this study, we assessed the impact of nine recurrently mutated genes (BIRC3, EGR2, MYD88, NFKBIE, NOTCH1, POT1, SF3B1, TP53, and XPO1) in pre-treatment samples from 4580 patients with CLL, using time-to-first-treatment (TTFT) as the primary end-point in relation to IGHV gene SHM status. Mutations were detected in 1588 (34.7%) patients at frequencies ranging from 2.3-9.8% with mutations in NOTCH1 being the most frequent. In both univariate and multivariate analyses, mutations in all genes except MYD88 were associated with a significantly shorter TTFT. In multivariate analysis of Binet stage A patients, performed separately for IGHV-mutated (M-CLL) and unmutated CLL (U-CLL), a different spectrum of gene alterations independently predicted short TTFT within the two subgroups. While SF3B1 and XPO1 mutations were independent prognostic variables in both U-CLL and M-CLL, TP53, BIRC3 and EGR2 aberrations were significant predictors only in U-CLL, and NOTCH1 and NFKBIE only in M-CLL. Our findings underscore the need for a compartmentalized approach to identify high-risk patients, particularly among M-CLL patients, with potential implications for stratified management