121 research outputs found

    Machine Learning Models for Deciphering Regulatory Mechanisms and Morphological Variations in Cancer

    The exponential growth of multi-omics biological datasets is resulting in an emerging paradigm shift in fundamental biological research. In recent years, imaging and transcriptomics datasets have been increasingly incorporated into biological studies, pushing biology further into the domain of data-intensive sciences. New approaches and tools from statistics, computer science, and data engineering are profoundly influencing biological research. Harnessing this ever-growing deluge of multi-omics biological data requires the development of novel and creative computational approaches. In parallel, fundamental research in data sciences and Artificial Intelligence (AI) has advanced tremendously, allowing the scientific community to generate a massive amount of knowledge from data. Advances in Deep Learning (DL), in particular, are transforming many branches of engineering, science, and technology. Several of these methodologies have already been adapted for harnessing biological datasets; however, there is still a need to further adapt and tailor these techniques to new and emerging technologies. In this dissertation, we present computational algorithms and tools that we have developed to study gene regulation and cellular morphology in cancer. The models and platforms that we have developed are general and widely applicable to several problems relating to dysregulation of gene expression in diseases. Our pipelines and software packages are disseminated in public repositories for use by the larger scientific community. This dissertation is organized in three main projects. In the first project, we present Causal Inference Engine (CIE), an integrated platform for the identification and interpretation of active regulators of transcriptional response. The platform offers visualization tools and pathway enrichment analysis to map predicted regulators to Reactome pathways. 
We provide a parallelized R package for fast and flexible directional enrichment analysis to run the inference on custom regulatory networks. Next, we designed and developed MODEX, a fully automated text-mining system to extract and annotate causal regulatory interactions between Transcription Factors (TFs) and genes from the biomedical literature. MODEX uses putative TF-gene interactions derived from high-throughput ChIP-Seq or other experiments and seeks to collect evidence and metadata in the biomedical literature to validate and annotate the interactions. MODEX is a complementary platform to CIE that provides auxiliary information on CIE-inferred interactions by mining the literature. In the second project, we present a Convolutional Neural Network (CNN) classifier to perform a pan-cancer analysis of tumor morphology and predict mutations in key genes. The main challenges were to determine morphological features underlying a genetic status and assess whether these features were common in other cancer types. We trained an Inception-v3 based model to predict TP53 mutation in five cancer types with the highest rate of TP53 mutations. We also performed a cross-classification analysis to assess shared morphological features across multiple cancer types. Further, we applied a similar methodology to classify HER2 status in breast cancer and predict response to treatment in HER2-positive samples. For this study, our training slides were manually annotated by expert pathologists to highlight Regions of Interest (ROIs) associated with HER2+/- tumor microenvironment. Our results indicated that there are strong morphological features associated with each tumor type. Moreover, our predictions agree closely with manual annotations in the test set, indicating the feasibility of our approach in devising an image-based diagnostic tool for HER2 status and treatment response prediction. 
We have validated our model using samples from an independent cohort, which demonstrates the generalizability of our approach. Finally, in the third project, we present an approach to use spatial transcriptomics data to predict spatially resolved active gene regulatory mechanisms in tissues. Using spatial transcriptomics, we identified tissue regions with differentially expressed genes and applied our CIE methodology to predict active TFs that can potentially regulate the marker genes in the region. This project bridged the gap between inference of active regulators using molecular data and morphological studies using images. The results demonstrate a significant local pattern in TF activity across the tissue, indicating differential spatial regulation in tissues. The results suggest that the integrative analysis of spatial transcriptomics data with CIE can capture discriminant features and identify localized TF-target links in the tissue.

    The Era of Radiogenomics in Precision Medicine: An Emerging Approach to Support Diagnosis, Treatment Decisions, and Prognostication in Oncology

    With the rapid development of new technologies, including artificial intelligence and genome sequencing, radiogenomics has emerged as a state-of-the-art science in the field of individualized medicine. Radiogenomics combines a large volume of quantitative data extracted from medical images with individual genomic phenotypes and constructs a prediction model through deep learning to stratify patients, guide therapeutic strategies, and evaluate clinical outcomes. Recent studies of various types of tumors demonstrate the predictive value of radiogenomics. Some of the issues in radiogenomic analysis, and the solutions proposed in prior work, are also presented. Although the workflow criteria and internationally agreed guidelines for statistical methods need to be confirmed, radiogenomics represents a repeatable and cost-effective approach for the detection of continuous changes and is a promising surrogate for invasive interventions. Therefore, radiogenomics could facilitate computer-aided diagnosis, treatment, and prediction of the prognosis in patients with tumors in the routine clinical setting. Here, we summarize the integrated process of radiogenomics and introduce the crucial strategies and statistical algorithms involved in current studies.

    Deep Learning-Based Prediction of Molecular Tumor Biomarkers from H&E: A Practical Review

    Molecular and genomic properties are critical in selecting cancer treatments to target individual tumors, particularly for immunotherapy. However, the methods to assess such properties are expensive, time-consuming, and often not routinely performed. Applying machine learning to H&E images can provide a more cost-effective screening method. Dozens of studies over the last few years have demonstrated that a variety of molecular biomarkers can be predicted from H&E alone using advances in deep learning: molecular alterations, genomic subtypes, protein biomarkers, and even the presence of viruses. This article reviews the diverse applications across cancer types and the methodology to train and validate these models on whole slide images. From bottom-up to pathologist-driven to hybrid approaches, the leading trends include a variety of weakly supervised deep learning-based approaches, as well as mechanisms for training strongly supervised models in select situations. While the results of these algorithms look promising, some challenges persist, including small training sets, rigorous validation, and model explainability. Biomarker prediction models may yield a screening method to determine when to run molecular tests or an alternative when molecular tests are not possible. They also create new opportunities in quantifying intratumoral heterogeneity and predicting patient outcomes. Comment: 20 pages, 2 figures

    Pan-cancer image-based detection of clinically actionable genetic alterations

    Molecular alterations in cancer can cause phenotypic changes in tumor cells and their microenvironment. Routine histopathology tissue slides, which are ubiquitously available, can reflect such morphological changes. Here, we show that deep learning can consistently infer a wide range of genetic mutations, molecular tumor subtypes, gene expression signatures and standard pathology biomarkers directly from routine histology. We developed, optimized, validated and publicly released a one-stop-shop workflow and applied it to tissue slides of more than 5,000 patients across multiple solid tumors. Our findings show that a single deep learning algorithm can be trained to predict a wide range of molecular alterations from routine, paraffin-embedded histology slides stained with hematoxylin and eosin. These predictions generalize to other populations and are spatially resolved. Our method can be implemented on mobile hardware, potentially enabling point-of-care diagnostics for personalized cancer treatment. More generally, this approach could elucidate and quantify genotype–phenotype links in cancer.

    Systems Biology of Gastric Cancer: Perspectives on the Omics-Based Diagnosis and Treatment

    Gastric cancer is the fifth most diagnosed cancer in the world, affecting more than a million people and causing nearly 783,000 deaths each year. The prognosis of advanced gastric cancer remains extremely poor despite the use of surgery and adjuvant therapy. Therefore, understanding the mechanism of gastric cancer development and discovering novel diagnostic biomarkers and therapeutics are major goals in gastric cancer research. Here, we review recent progress in the application of omics technologies in gastric cancer research, with special focus on the use of systems biology approaches to integrate multi-omics data. In addition, the association between gastrointestinal microbiota and gastric cancer is discussed, which may offer insights for exploring novel microbiota-targeted therapeutics. Finally, the application of data-driven systems biology and machine learning approaches could provide a predictive understanding of gastric cancer and pave the way to the development of novel biomarkers and the rational design of cancer therapeutics.

    Clinical implications of intratumor heterogeneity: challenges and opportunities

    In this review, we highlight the role of intratumoral heterogeneity, focusing on the clinical and biological ramifications this phenomenon poses. Intratumoral heterogeneity arises through complex genetic, epigenetic, and protein modifications that drive phenotypic selection in response to environmental pressures. Functionally, heterogeneity provides tumors with significant adaptability. This ranges from mutually beneficial cooperation between cells, which nurtures features such as growth and metastasis, to the narrow escape and survival of clonal cell populations that have adapted to thrive under specific conditions such as hypoxia or chemotherapy. These dynamic intercellular interplays are guided by a Darwinian selection landscape between clonal tumor cell populations and the tumor microenvironment. Understanding the drivers involved and the functional consequences of such tumor heterogeneity is challenging but also promises to provide the novel insight needed to confront the problem of therapeutic resistance in tumors.