
    Knowledge-guided multi-scale independent component analysis for biomarker identification

    Background: Many statistical methods have been proposed to identify disease biomarkers from gene expression profiles. However, from gene expression profile data alone, statistical methods often fail to identify biologically meaningful biomarkers related to a specific disease under study. In this paper, we develop a novel strategy, namely knowledge-guided multi-scale independent component analysis (ICA), to first infer regulatory signals and then identify biologically relevant biomarkers from microarray data.

    Results: Since gene expression levels reflect the joint effect of several underlying biological functions, disease-specific biomarkers may be involved in several distinct biological functions. To identify disease-specific biomarkers that provide unique mechanistic insights, a meta-data "knowledge gene pool" (KGP) is first constructed from multiple data sources to provide important information on the likely functions (such as gene ontology information) and regulatory events (such as promoter responsive elements) associated with potential genes of interest. The gene expression and biological meta data associated with the members of the KGP can then be used to guide subsequent analysis. ICA is then applied to multi-scale gene clusters to reveal regulatory modes reflecting the underlying biological mechanisms. Finally disease-specific biomarkers are extracted by their weighted connectivity scores associated with the extracted regulatory modes. A statistical significance test is used to evaluate the significance of transcription factor enrichment for the extracted gene set based on motif information. We applied the proposed method to yeast cell cycle microarray data and Rsf-1-induced ovarian cancer microarray data. The results show that our knowledge-guided ICA approach can extract biologically meaningful regulatory modes and outperform several baseline methods for biomarker identification.

    Conclusion: We have proposed a novel method, namely knowledge-guided multi-scale ICA, to identify disease-specific biomarkers. The goal is to infer knowledge-relevant regulatory signals and then identify corresponding biomarkers through a multi-scale strategy. The approach has been successfully applied to two expression profiling experiments to demonstrate its improved performance in extracting biologically meaningful and disease-related biomarkers. More importantly, the proposed approach shows promising results to infer novel biomarkers for ovarian cancer and extend current knowledge.

    Evolution-informed modeling improves outcome prediction for cancers

    Despite wide applications of high-throughput biotechnologies in cancer research, many biomarkers discovered by exploring large-scale omics data do not provide satisfactory performance when used to predict cancer treatment outcomes. This problem is partly due to the overlooking of functional implications of molecular markers. Here, we present a novel computational method that uses evolutionary conservation as prior knowledge to discover bona fide biomarkers. Evolutionary selection at the molecular level is nature's test on functional consequences of genetic elements. By prioritizing genes that show significant statistical association and high functional impact, our new method reduces the chances of including spurious markers in the predictive model. When applied to predicting therapeutic responses for patients with acute myeloid leukemia and to predicting metastasis for patients with prostate cancers, the new method gave rise to evolution-informed models that enjoyed low complexity and high accuracy. The identified genetic markers also have significant implications in tumor progression and embrace potential drug targets. Because evolutionary conservation can be estimated as a gene-specific, position-specific, or allele-specific parameter on the nucleotide level and on the protein level, this new method can be extended to apply to miscellaneous “omics” data to accelerate biomarker discoveries. The final version of this article, as published in Evolutionary Applications, can be viewed online at: http://onlinelibrary.wiley.com/doi/10.1111/eva.12417/ful

    Evolution‐informed modeling improves outcome prediction for cancers

    Despite wide applications of high‐throughput biotechnologies in cancer research, many biomarkers discovered by exploring large‐scale omics data do not provide satisfactory performance when used to predict cancer treatment outcomes. This problem is partly due to the overlooking of functional implications of molecular markers. Here, we present a novel computational method that uses evolutionary conservation as prior knowledge to discover bona fide biomarkers. Evolutionary selection at the molecular level is nature’s test on functional consequences of genetic elements. By prioritizing genes that show significant statistical association and high functional impact, our new method reduces the chances of including spurious markers in the predictive model. When applied to predicting therapeutic responses for patients with acute myeloid leukemia and to predicting metastasis for patients with prostate cancers, the new method gave rise to evolution‐informed models that enjoyed low complexity and high accuracy. The identified genetic markers also have significant implications in tumor progression and embrace potential drug targets. Because evolutionary conservation can be estimated as a gene‐specific, position‐specific, or allele‐specific parameter on the nucleotide level and on the protein level, this new method can be extended to apply to miscellaneous “omics” data to accelerate biomarker discoveries. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/135247/1/eva12417_am.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/135247/2/eva12417.pd

    Deep Learning for Genomics: A Concise Overview

    Advancements in genomic research such as high-throughput sequencing techniques have driven modern genomic studies into "big data" disciplines. This data explosion is constantly challenging conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in a variety of fields such as vision, speech, and text processing. Yet genomics entails unique challenges to deep learning since we are expecting from deep learning a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on insightful utilization of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to fit each particular task with a proper deep architecture, and remark on practical considerations of developing modern deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research, as well as pointing out potential opportunities and obstacles for future genomics applications. Comment: Invited chapter for Springer Book: Handbook of Deep Learning Application

    MICROENVIRONMENT-INDUCED PTEN LOSS BY EXOSOMAL MICRORNA PRIMES BRAIN METASTASIS OUTGROWTH

    Development of life-threatening cancer metastases at distant organs requires disseminated tumor cells’ adaptation to and co-evolution with the drastically different microenvironments of metastatic sites. Cancer cells of common origin manifest distinct gene expression patterns after metastasizing to different organs. Clearly, the dynamic interplay between metastatic tumor cells and extrinsic signals at individual metastatic organ sites critically impacts the subsequent metastatic outgrowth. Yet, it is unclear when and how disseminated tumor cells acquire the essential traits from the microenvironment of metastatic organs that prime their subsequent outgrowth. Here we show that primary tumor cells with normal expression of PTEN, an important tumor suppressor, lose PTEN expression after dissemination to the brain, but not to other organs. The PTEN level in PTEN-loss brain metastatic tumor cells is restored after they leave the brain microenvironment. This brain microenvironment-dependent, reversible PTEN mRNA and protein down-regulation is epigenetically regulated by microRNAs (miRNAs) from astrocytes. Mechanistically, astrocyte-derived exosomes mediate an intercellular transfer of PTEN-targeting miRNAs to metastatic tumor cells, while astrocyte-specific depletion of PTEN-targeting miRNAs or blockade of astrocyte exosome secretion rescues the PTEN loss and suppresses brain metastasis in vivo. Furthermore, this adaptive PTEN loss in brain metastatic tumor cells leads to an increased secretion of the cytokine chemokine (C-C motif) ligand 2 (CCL2), which recruits Iba1+ myeloid cells that reciprocally enhance outgrowth of brain metastatic tumor cells via enhanced proliferation and reduced apoptosis. Our findings demonstrate a remarkable plasticity of PTEN expression in metastatic tumor cells in response to different organ microenvironments, underpinning an essential role of co-evolution between the metastatic cells and their microenvironment during the adaptive metastatic outgrowth.
Our findings signify the dynamic and reciprocal cross-talk between tumor cells and the metastatic niche; importantly, they provide new opportunities for effective anti-metastasis therapies, especially of consequence for those brain metastasis patients who are in dire need

    The role of blood-based biomarkers in ischemic stroke


    Algorithms and Methods for Robust Processing and Analysis of Mass Spectrometry Data

    Liquid chromatography-mass spectrometry (LC-MS) and mass spectrometry imaging (MSI) are two techniques that are routinely used to study proteins, peptides, and metabolites at a large scale. Thousands of biological compounds can be identified and quantified in a single experiment with LC-MS, but many studies fail to convert this data to a better understanding of disease biology. One of the primary reasons for this is low reproducibility, which in turn is partially due to inaccurate and/or inconsistent data processing. Protein biomarkers and signatures for various types of cancer are frequently discovered with LC-MS, but their behavior in independent cohorts is often inconsistent with that in the discovery cohort. Biomarker candidates must be thoroughly validated in independent cohorts, which makes the ability to share data across different laboratories crucial to the future success of the MS-based research fields. The emergence and growth of public repositories for MSI data is a step in the right direction. Still, many of those data sets remain incompatible with one another due to inaccurate or incompatible preprocessing strategies. Ensuring compatibility between data generated in different labs is therefore necessary to gain access to the full potential of MS-based research. In two of the studies that I present in this thesis, we used LC-MS to characterize lymph node metastases from individuals with melanoma. Furthermore, my thesis work has resulted in two novel preprocessing methods for MSI data sets. The first one is a peak detection method that achieves considerably higher sensitivity for faintly expressed compounds than existing methods, and the second one is an accurate, robust, and general approach to mass alignment. Both algorithms deliberately rely on centroid spectra, which makes them compatible with most shared data sets.
I believe that the improvements demonstrated by these methods can lead to a higher reproducibility in the MS-based research fields, and, ultimately, to a better understanding of disease processes