14 research outputs found

    Exploring microRNA Biology using Integrative Bioinformatics

    Deregulation of energy metabolism is one of the emerging hallmarks of cancer required for proliferation and metastasis. MicroRNAs (miRNAs) are small RNA molecules that have crucial roles in the regulation of biological processes in organisms, including metabolism. Due to the recent discovery of miRNAs in humans, the roles of miRNAs in the metabolism of tumour cells, and the effects these have on cancer patients, are still obscure and in need of further study. Currently, experimental and computational data on miRNAs are being analysed by a wide range of statistical methods; however, these methods in their original forms possess many limitations. Therefore, new ways of utilising these statistical methods are needed in order to unravel the roles of miRNAs in cancer metabolism. In this thesis, the roles of a specific miRNA, miR-22, and the three metabolic target genes were investigated using classical statistical methods, which revealed that miR-22, the metabolic target genes, and the interactions between them were beneficial to the survival outcome of breast cancer patients. Furthermore, novel combinations of conventional statistical methods were devised in order to investigate global miRNA regulation of metabolic target genes. These new procedures were demonstrated using publicly available data sets. In one analysis, a novel combination of statistical methods showed that miRNAs could be divided into six clusters according to their metabolic target genes. A new statistical method was also devised to provide a generalised means to test for clustering based on sets of correlations.

    NELFE-Dependent MYC Signature Identifies a Unique Cancer Subtype in Hepatocellular Carcinoma.

    The MYC oncogene is dysregulated in approximately 30% of liver cancers. In an effort to exploit MYC as a therapeutic target, including in hepatocellular carcinoma (HCC), strategies have been developed on the basis of MYC amplification or gene translocation. Because these strategies have failed to provide accurate diagnostic and prognostic value, we have developed a Negative Elongation Factor E (NELFE)-Dependent MYC Target (NDMT) gene signature. This signature, which consists of genes regulated by MYC and NELFE, an RNA binding protein that enhances MYC-induced hepatocarcinogenesis, is predictive of NELFE/MYC-driven tumors that would otherwise not be identified by gene amplification or translocation alone. We demonstrate the utility of the NDMT gene signature to predict a unique subtype of HCC, which is associated with a poor prognosis in three independent cohorts encompassing diverse etiologies, demographics, and viral status. The application of gene signatures, such as the NDMT signature, offers patients access to personalized risk assessments, which may be utilized to direct future care.

    MultiPhen: Joint Model of Multiple Phenotypes Can Increase Discovery in GWAS

    The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare the performance of modelling multiple phenotypes jointly with that of the standard univariate approach. We introduce a new method and software, MultiPhen, that models multiple phenotypes simultaneously in a fast and interpretable way. By performing ordinal regression, MultiPhen tests the linear combination of phenotypes most associated with the genotypes at each SNP, and thus potentially captures effects hidden to single phenotype GWAS. We demonstrate via simulation that this approach provides a dramatic increase in power in many scenarios. There is a boost in power for variants that affect multiple phenotypes and for those that affect only one phenotype. While other multivariate methods have similar power gains, we describe several benefits of MultiPhen over these. In particular, we demonstrate that other multivariate methods that assume the genotypes are normally distributed, such as canonical correlation analysis (CCA) and MANOVA, can have highly inflated type-1 error rates when testing case-control or non-normal continuous phenotypes, while MultiPhen produces no such inflation. To test the performance of MultiPhen on real data we applied it to lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966). In these data MultiPhen discovers 21% more independent SNPs with known associations than the standard univariate GWAS approach, while applying MultiPhen in addition to the standard approach provides 37% increased discovery. 
    The most associated linear combinations of the lipids estimated by MultiPhen at the leading SNPs accurately reflect the Friedewald Formula, suggesting that MultiPhen could be used to refine the definition of existing phenotypes or uncover novel heritable phenotypes.

    Tumor metabolism and associated serum metabolites define prognostic subtypes of Asian hepatocellular carcinoma.

    Treatment effectiveness in hepatocellular carcinoma (HCC) depends on early detection and precision-medicine-based patient stratification for targeted therapies. However, the lack of robust biomarkers, particularly a non-invasive diagnostic tool, precludes significant improvement of clinical outcomes for HCC patients. Serum metabolites are one of the best non-invasive means for determining patient prognosis, as they are stable end-products of biochemical processes in the human body. In this study, we aimed to identify prognostic serum metabolites in HCC. To determine serum metabolites that were relevant and representative of the tissue status, we performed a two-step correlation analysis among 49 HCC patients: first, to determine associations between metabolic genes and tissue metabolites, and second, between tissue metabolites and serum metabolites. The results were then validated in 408 additional Asian HCC patients with mixed etiologies. We found that certain metabolic genes, tissue metabolites and serum metabolites can independently stratify HCC patients into prognostic subgroups, which are consistent across these different data types and with our previous findings. The metabolic subtypes are associated with the β-oxidation process in fatty acid metabolism, where patients with worse survival outcomes have dysregulated fatty acid metabolism. These serum metabolites may be used as non-invasive biomarkers to define prognostic tumor molecular subtypes for HCC.
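
The two-step correlation analysis described in this abstract can be illustrated with a minimal sketch. The abstract does not state which correlation coefficient or cut-off was used, so the Pearson coefficient, the `threshold` parameter, the function names, and the toy data below are all assumptions made purely for illustration, not the authors' procedure.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def correlated_with_any(candidates, references, threshold):
    """Names of candidates whose |r| with at least one reference series meets the threshold."""
    return [
        name for name, values in candidates.items()
        if any(abs(pearson(values, ref)) >= threshold for ref in references.values())
    ]

def two_step_select(genes, tissue, serum, threshold=0.5):
    """Step 1: genes -> tissue metabolites. Step 2: selected tissue -> serum metabolites."""
    tissue_hits = correlated_with_any(tissue, genes, threshold)
    selected_tissue = {name: tissue[name] for name in tissue_hits}
    serum_hits = correlated_with_any(serum, selected_tissue, threshold)
    return tissue_hits, serum_hits

# Hypothetical toy data: one value per patient for each measured feature.
genes = {"g1": [1, 2, 3, 4, 5]}
tissue = {"t1": [2, 4, 6, 8, 10], "t2": [1, -1, 1, -1, 1]}
serum = {"s1": [10, 8, 6, 4, 2], "s2": [1, -1, 1, -1, 1]}
tissue_hits, serum_hits = two_step_select(genes, tissue, serum)
```

Chaining the two filters keeps only serum features that are linked to the tumour tissue through an intermediate tissue metabolite, which is the rationale the abstract gives for the two steps.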

    Behaviour of the different methods under the null.

    This table relates to the simulation study, described in the text, that tests the type 1 error rates of MultiPhen, CCA, and the univariate approach. The elements of the table show the number of results with P < 1e-5 in the scenario described by the corresponding row and column headers (which give the minor allele frequencies). Since 100000 replicates of SNP-phenotype associations were simulated under the null hypothesis of no association, the expectation for all elements of the table is 1; elements >1 indicate inflation of the type 1 error rate. Simulations with MAF = 30%, 0.5% were performed on a sample size of N = 5000. For the full results see Figures S1–S8 and Tables S1–S3 (doi:10.1371/journal.pone.0034861).
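
The expectation of 1 stated in this caption follows from multiplying the number of null replicates by the P-value threshold. The sketch below reproduces that arithmetic and, as an added assumption not taken from the caption, treats each replicate as an independent Bernoulli trial so that the chance of seeing a given count under a well-calibrated test can be gauged. The function names are illustrative.

```python
import math

def expected_null_hits(n_replicates: int, alpha: float) -> float:
    """Expected number of replicates with P < alpha when the null is true."""
    return n_replicates * alpha

def prob_at_least(k: int, n_replicates: int, alpha: float) -> float:
    """P(X >= k) for X ~ Binomial(n_replicates, alpha), assuming independent replicates."""
    below_k = sum(
        math.comb(n_replicates, i) * alpha**i * (1 - alpha) ** (n_replicates - i)
        for i in range(k)
    )
    return 1.0 - below_k

# Values from the caption: 100000 replicates, threshold P < 1e-5.
expected = expected_null_hits(100_000, 1e-5)
chance_of_five_or_more = prob_at_least(5, 100_000, 1e-5)
```

Under these assumptions a table cell of 5 or more would be unlikely for a calibrated method, which is why counts well above 1 are read as inflation.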

    The power of MultiPhen in different scenarios of effect and correlation between phenotypes.

    Power results based on simulations described in the text for MultiPhen (red lines) and the standard single-phenotype approach (black lines). Left panel: causal variant explains 0.5% of the phenotypic variance of both phenotypes. Middle panel: causal variant explains 0.5% of the phenotypic variance of the first phenotype and 0.1% of the variance of the second phenotype. Right panel: causal variant explains 0.5% of the phenotypic variance of the first phenotype and 0% of the second phenotype.

    Results under standard GWAS and MultiPhen approaches for genome-wide significant SNPs.

    ¶ Nyholt-Šidák corrected for 4 comparisons. § Nyholt-Šidák corrected for 3 comparisons. Results compare univariate and MultiPhen P values, presented on the -log10 scale for ease of comparison, for all SNPs with genome-wide significant P values (>7.301 on the -log10 scale) from either approach. Genome-wide significant results are shown in bold. The difference in terms of orders of magnitude of the MultiPhen P value on all phenotypes is relative to the most associated univariate phenotype; and the order of magnitude difference for MultiPhen where the most associated phenotype is excluded is relative to the univariate result also excluding the most associated phenotype.
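
As a hedged illustration of the quantities in this caption: a value of 7.301 on the -log10 scale corresponds to a P value of about 5e-8, and a Šidák-style correction for m comparisons can be written as 1 - (1 - p)^m. The caption does not say how the number of comparisons was derived, so the sketch below simply takes m as given (4 or 3); the function and constant names are assumptions.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # inferred from the 7.301 threshold: 10 ** -7.301 is about 5e-8

def neg_log10(p: float) -> float:
    """P value expressed on the -log10 scale used in the table."""
    return -math.log10(p)

def sidak_adjust(p: float, m: float) -> float:
    """Sidak-adjusted P value, 1 - (1 - p)**m, computed stably for very small p."""
    return -math.expm1(m * math.log1p(-p))

def is_genome_wide_significant(p: float) -> bool:
    """True when p falls below the inferred genome-wide threshold."""
    return p < GENOME_WIDE_ALPHA

threshold_on_log_scale = neg_log10(GENOME_WIDE_ALPHA)
adjusted = sidak_adjust(1e-9, 4)  # hypothetical raw P value, corrected for 4 comparisons
```

Using `log1p` and `expm1` avoids the loss of precision that a direct `1 - (1 - p)**m` suffers when p is as small as typical GWAS P values.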

    Genome-wide significant results from standard GWAS approach and MultiPhen tested on combinations of the lipids using NFBC1966 data.

    Each bar shows the number of SNPs reaching genome-wide significance for a given phenotype-combination analysis (specified by the first letters of each trait, such that CHL refers to an analysis of CHOL, HDL and LDL). The white segment of the bar shows the SNPs discovered by both the univariate approach and MultiPhen, the grey segment shows the SNPs discovered by the univariate approach only, and the black segment shows the SNPs discovered by MultiPhen only. The bars labelled ALL2 and ALL3 combine results across analyses on all combinations of two and three lipid traits, respectively, while ALL combines the results across the analyses of all 2, 3 and 4 combinations of the traits. A complete breakdown of these results is presented in Tables S5–S15 (doi:10.1371/journal.pone.0034861).

    The correlation structure between pairs of lipids.

    The left panel shows the correlation structure between total cholesterol (CHOL) and low-density lipoprotein (LDL) in 5655 individuals from the Northern Finland Birth Cohort 1966. Each circle depicts the value of CHOL (X-axis) and LDL (Y-axis) in mmol/L for each individual. The right panel shows the correlation structure between low-density lipoprotein (LDL) and high-density lipoprotein (HDL), in mmol/L, in the same individuals. The arrows in each plot show the direction of effect of a variant affecting only CHOL or only HDL, such that the genotypes of individuals underlying each plotted point are more likely to contain risk alleles for the labelled lipid moving through the points in the direction of the arrow. The diagonal arrows are based on the Friedewald Formula (Friedewald.72). The arrows indicate that effects of variants can be in very different directions in the 2-dimensional spaces shown; the aim of modelling and testing linear combinations of phenotypes is to capture effects in any direction.
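
For readers unfamiliar with the Friedewald Formula referenced in this caption, the sketch below states its commonly used form for values in mmol/L, in which LDL is estimated as a fixed linear combination of total cholesterol, HDL and triglycerides (TG). The caption itself does not spell the formula out, so the triglyceride term, the 2.2 divisor for mmol/L units, and all names in the code are taken from the standard textbook form rather than from this source.

```python
def friedewald_ldl_mmol(chol: float, hdl: float, tg: float) -> float:
    """Estimated LDL in mmol/L: CHOL - HDL - TG / 2.2 (standard Friedewald form)."""
    return chol - hdl - tg / 2.2

# The same relationship written as weights of a linear combination of phenotypes.
FRIEDEWALD_WEIGHTS = {"CHOL": 1.0, "HDL": -1.0, "TG": -1.0 / 2.2}

def linear_combination(values: dict, weights: dict) -> float:
    """Weighted sum of phenotype values, using only the traits named in weights."""
    return sum(weights[trait] * values[trait] for trait in weights)

# Hypothetical individual with lipid values in mmol/L.
person = {"CHOL": 5.0, "HDL": 1.5, "TG": 1.1}
ldl_estimate = friedewald_ldl_mmol(person["CHOL"], person["HDL"], person["TG"])
```

Writing the formula as a weight vector makes explicit why it defines a diagonal direction in plots of one lipid against another: a change in one trait shifts the derived trait in fixed proportion.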