
    Predicting drug response of tumors from integrated genomic profiles by deep neural networks

    The study of high-throughput genomic profiles from a pharmacogenomics viewpoint has provided unprecedented insights into the oncogenic features modulating drug response. A recent screening of ~1,000 cancer cell lines against a collection of anti-cancer drugs illuminated the link between genotypes and vulnerability. However, due to essential differences between cell lines and tumors, translating these findings into predictions of drug response in tumors remains challenging. Here we propose a deep neural network (DNN) model to predict drug response based on the mutation and expression profiles of a cancer cell or a tumor. The model contains a mutation encoder and an expression encoder, both pre-trained using a large pan-cancer dataset to abstract core representations of high-dimensional data, followed by a drug response predictor network. Given a pair of mutation and expression profiles, the model predicts IC50 values of 265 drugs. We trained and tested the model on a dataset of 622 cancer cell lines and achieved an overall mean squared error of 1.96 (log-scale IC50 values). The performance was superior in prediction error or stability to two classical methods and four analog DNNs of our model. We then applied the model to predict drug response of 9,059 tumors of 33 cancer types. The model predicted both known drug targets, including EGFR inhibitors in non-small cell lung cancer and tamoxifen in ER+ breast cancer, and novel ones. The comprehensive analysis further revealed the molecular mechanisms underlying resistance to the chemotherapeutic drug docetaxel in a pan-cancer setting and the anti-cancer potential of a novel agent, CX-5461, in treating gliomas and hematopoietic malignancies. Overall, our model and findings improve the prediction of drug response and the identification of novel therapeutic options.
    Comment: Accepted for presentation at the International Conference on Intelligent Biology and Medicine (ICIBM 2018) in Los Angeles, CA, USA. Currently under consideration for publication in a Supplement Issue of BMC Genomics.
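
The architecture described in this abstract (two encoders feeding one predictor that outputs 265 values) can be illustrated with a minimal forward-pass sketch. Everything below is an assumption made for illustration: the layer sizes, the random untrained weights, and the names `build_model` and `predict_log_ic50` are hypothetical and are not taken from the paper, which does not give these details in the abstract.

```python
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, relu=True):
    """Apply one fully connected layer; ReLU is optional so the output can stay linear."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out

def build_model(n_mut, n_expr, n_code=8, n_drugs=265, seed=0):
    """Hypothetical layout: one encoder per data type, then a shared predictor."""
    rng = random.Random(seed)
    return {
        "mut_encoder": init_layer(n_mut, n_code, rng),
        "expr_encoder": init_layer(n_expr, n_code, rng),
        "predictor": init_layer(2 * n_code, n_drugs, rng),
    }

def predict_log_ic50(model, mutation, expression):
    """Encode both profiles, concatenate the codes, and predict one value per drug."""
    m_code = dense(mutation, model["mut_encoder"])
    e_code = dense(expression, model["expr_encoder"])
    return dense(m_code + e_code, model["predictor"], relu=False)

def mean_squared_error(predicted, observed):
    """Mean squared error, the metric the abstract reports on log-scale IC50."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)

# Toy inputs: 30 binary mutation flags and 40 expression values (sizes are arbitrary).
model = build_model(n_mut=30, n_expr=40)
mutation = [1.0 if i % 7 == 0 else 0.0 for i in range(30)]
expression = [0.1 * i for i in range(40)]
prediction = predict_log_ic50(model, mutation, expression)
```

The sketch only shows how the data flows; the pre-training of the encoders and the fitting of the predictor described in the abstract are not reproduced here.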

    NELFE-Dependent MYC Signature Identifies a Unique Cancer Subtype in Hepatocellular Carcinoma.

    The MYC oncogene is dysregulated in approximately 30% of liver cancers. In an effort to exploit MYC as a therapeutic target, including in hepatocellular carcinoma (HCC), strategies have been developed on the basis of MYC amplification or gene translocation. Due to the failure of these strategies to provide accurate diagnostic and prognostic value, we have developed a Negative Elongation Factor E (NELFE)-Dependent MYC Target (NDMT) gene signature. This signature, which consists of genes regulated by MYC and NELFE, an RNA binding protein that enhances MYC-induced hepatocarcinogenesis, is predictive of NELFE/MYC-driven tumors that would otherwise not be identified by gene amplification or translocation alone. We demonstrate the utility of the NDMT gene signature to predict a unique subtype of HCC, which is associated with a poor prognosis in three independent cohorts encompassing diverse etiologies, demographics, and viral status. The application of gene signatures, such as the NDMT signature, offers patients access to personalized risk assessments, which may be utilized to direct future care.

    Algorithm for early diagnosis of hepatocellular carcinoma based on gene pair similarity

    The article proposes an algorithm based on intelligent methods for the early diagnosis of hepatocellular carcinoma (HCC), commonly known as liver cancer, which ranks as the third leading cause of cancer deaths worldwide. Initial diagnosis of HCC is based on laboratory studies, computed tomography and X-ray examination. However, in some cases cancerous tissues are identified as similar non-cancerous tissues (cirrhotic and normal tissues), which makes gene analysis necessary for the diagnosis. To predict HCC from such numerous, diverse and heterogeneous unstructured data, preference is given to artificial intelligence methods, specifically machine learning. The article shows that machine learning methods can be applied to the problem of accurately identifying HCC, a problem that arises because HCC tissues closely resemble CwoHCC non-cancerous tissues. The technology of gene pair profiling using relevant peer databases is described, and the within-sample Relative Expression Orderings (REO) technique is used to determine gene pair similarity. The article also presents a new approach that combines the REO technique for determining gene pair similarity, the incremental feature selection method for feature selection, and Support Vector Machine methods for gene pair classification. The proposed approach constitutes the methodological basis of a decision support system for the early diagnosis of HCC, and the development of such a system may be beneficial for physician decision support in the relevant field.
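
The within-sample relative expression ordering idea mentioned in this abstract can be sketched in a few lines: for a gene pair (a, b), the feature records only whether a is expressed more highly than b inside one sample, which makes it insensitive to between-sample scaling. The gene names, the toy expression values, and the `reversal_score` ranking below are illustrative assumptions; the abstract does not specify how the article scores or selects pairs, and the incremental feature selection and SVM steps are not shown.

```python
from itertools import combinations

def reo_features(sample, pairs):
    """Binary REO vector for one sample: 1 if gene a is expressed above gene b."""
    return [1 if sample[a] > sample[b] else 0 for a, b in pairs]

def reversal_score(group_1, group_2, pair):
    """Absolute difference between groups in how often a > b holds (assumed scoring)."""
    a, b = pair
    freq_1 = sum(s[a] > s[b] for s in group_1) / len(group_1)
    freq_2 = sum(s[a] > s[b] for s in group_2) / len(group_2)
    return abs(freq_1 - freq_2)

# Toy expression values; G1..G3 are placeholder gene names, not real markers.
hcc = [{"G1": 5, "G2": 2, "G3": 3}, {"G1": 7, "G2": 1, "G3": 4}]
controls = [{"G1": 1, "G2": 4, "G3": 3}, {"G1": 2, "G2": 6, "G3": 7}]

pairs = list(combinations(["G1", "G2", "G3"], 2))
first_sample_vector = reo_features(hcc[0], pairs)

# Rank pairs so that orderings that flip most consistently between groups come first.
ranked = sorted(pairs, key=lambda p: reversal_score(hcc, controls, p), reverse=True)
```

In a pipeline like the one the abstract outlines, vectors such as `first_sample_vector` would be the inputs passed on to feature selection and classification.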

    Network-based approaches to explore complex biological systems towards network medicine

    Network medicine relies on different types of networks, from the molecular level of protein–protein interactions to gene regulatory networks and correlation studies of gene expression. Among network approaches based on the analysis of the topological properties of protein–protein interaction (PPI) networks, we discuss the widespread DIAMOnD (disease module detection) algorithm. Starting from the assumption that PPI networks can be viewed as maps where diseases can be identified with localized perturbations within a specific neighborhood (i.e., disease modules), DIAMOnD performs a systematic analysis of the human PPI network to uncover new disease-associated genes by exploiting connectivity significance instead of connection density. The past few years have witnessed increasing interest in understanding the molecular mechanisms of post-transcriptional regulation, with a special emphasis on non-coding RNAs, since they are emerging as key regulators of many cellular processes in both physiological and pathological states. Recent findings show that coding genes are not the only targets that microRNAs interact with. In fact, there is a pool of different RNAs, including long non-coding RNAs (lncRNAs), competing with each other to attract microRNAs for interactions, thus acting as competing endogenous RNAs (ceRNAs). The framework of regulatory networks provides a powerful tool to gather new insights into ceRNA regulatory mechanisms. Here, we describe a data-driven model recently developed to explore lncRNA-associated ceRNA activity in breast invasive carcinoma. In addition, a very promising example of a co-expression network is the one implemented by the software SWIM (switch miner), which combines topological properties of correlation networks with gene expression data in order to identify a small pool of genes, called switch genes, critically associated with drastic changes in cell phenotype. Here, we describe the SWIM tool along with its applications to cancer research and compare its predictions with DIAMOnD disease genes.
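
The contrast between connectivity significance and connection density can be made concrete with a hypergeometric tail probability: given a network of N nodes containing s seed (known disease) genes, how likely is it that a node with k links has at least k_s of them pointing at seeds by chance? The sketch below is a minimal version of that test under stated assumptions; the network size, seed count, and the two candidate nodes are made-up numbers, and the iterative module growth of the full algorithm is not shown.

```python
from math import comb

def connectivity_p_value(n_nodes, n_seeds, degree, seed_links):
    """P(at least `seed_links` of a node's `degree` links hit seed genes by chance)."""
    total = comb(n_nodes, degree)
    upper = min(degree, n_seeds)
    favourable = sum(
        comb(n_seeds, i) * comb(n_nodes - n_seeds, degree - i)
        for i in range(seed_links, upper + 1)
    )
    return favourable / total

# Hypothetical network: 1000 nodes, 20 of which are seed genes.
N_NODES, N_SEEDS = 1000, 20

# Candidate nodes as (degree, links to seeds). The hub has more seed links in
# absolute terms, but also far more links overall.
candidates = {"hub": (100, 4), "small": (4, 3)}

p_values = {
    name: connectivity_p_value(N_NODES, N_SEEDS, degree, seed_links)
    for name, (degree, seed_links) in candidates.items()
}

# Lower p-value means more significant connectivity to the seed set.
ranked = sorted(candidates, key=lambda name: p_values[name])
```

With these toy numbers the low-degree node ranks ahead of the hub, which is the behaviour that distinguishes a significance-based ranking from one based on raw link counts.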

    A Machine Learning Framework for Identifying Molecular Biomarkers from Transcriptomic Cancer Data

    Cancer is a complex molecular process caused by abnormal changes in the genome, such as mutation and copy number variation, and by epigenetic aberrations such as dysregulation of long non-coding RNA (lncRNA). These abnormal changes are reflected in the transcriptome by turning oncogenes on and tumor suppressor genes off, which are considered cancer biomarkers. However, transcriptomic data is high dimensional, and finding the best subset of genes (features) related to causing cancer is computationally challenging and expensive. Thus, developing a feature selection framework to discover molecular biomarkers for cancer is critical. Traditional approaches for biomarker discovery calculate the fold change for each gene, comparing expression profiles between tumor and healthy samples, and thus fail to capture the combined effect of the whole gene set. Also, these approaches do not always investigate the cancer-type prediction capabilities of the discovered biomarkers. In this work, we proposed a machine learning-based framework to address all of the above challenges in discovering lncRNA biomarkers. First, we developed a machine learning pipeline that takes lncRNA expression profiles of cancer samples as input and outputs a small set of key lncRNAs that can accurately predict multiple cancer types. A significant innovation of our work is its ability to identify biomarkers without using healthy samples. However, this initial framework cannot identify cancer-specific lncRNAs. Second, we extended our framework to identify cancer type-specific and subtype-specific lncRNAs. Third, we proposed to use a state-of-the-art deep learning algorithm, the concrete autoencoder (CAE), in an unsupervised setting, which efficiently identifies a subset of the most informative features. However, CAE does not identify reproducible features in different runs due to its stochastic nature. To address this issue, we proposed a multi-run CAE (mrCAE) to identify a stable set of features. Our deep learning-based pipeline significantly extended the previous state-of-the-art feature selection techniques. Finally, we showed that the discovered biomarkers are biologically relevant using literature review and prognostically significant using survival analyses. The discovered novel biomarkers could be used as a screening tool for different cancer diagnoses and as therapeutic targets.
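
One common way to stabilise a stochastic feature selector is to run it several times and keep the features that are selected most often. The sketch below shows only that aggregation step, as an assumption: the abstract does not state how mrCAE combines runs, the per-run outputs are hard-coded placeholders rather than results of a real concrete autoencoder, and `stable_features` is a hypothetical helper name.

```python
from collections import Counter

def stable_features(runs, k):
    """Return the k features selected in the most runs (ties broken by name)."""
    # Count each feature at most once per run.
    counts = Counter(feature for run in runs for feature in set(run))
    ranked = sorted(counts, key=lambda f: (-counts[f], f))
    return ranked[:k]

# Placeholder outputs of four independent runs of a stochastic selector;
# the lncRNA identifiers are invented for the example.
runs = [
    ["LNC_A", "LNC_B", "LNC_C"],
    ["LNC_A", "LNC_B", "LNC_D"],
    ["LNC_A", "LNC_E", "LNC_B"],
    ["LNC_A", "LNC_C", "LNC_F"],
]

selected = stable_features(runs, k=3)
```

Breaking ties by name keeps the result deterministic, so repeated aggregation of the same runs always yields the same feature set.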

    Pathway-extended gene expression signatures integrate novel biomarkers that improve predictions of patient responses to kinase inhibitors

    Cancer chemotherapy responses have been related to multiple pharmacogenetic biomarkers, often for the same drug. This study utilizes machine learning to derive multi-gene expression signatures that predict individual patient responses to specific tyrosine kinase inhibitors, including erlotinib, gefitinib, sorafenib, sunitinib, lapatinib and imatinib. Support Vector Machine learning was used to train mathematical models that distinguished sensitivity from resistance to these drugs using a novel systems biology-based approach. This began with expression of genes previously implicated in specific drug responses, then expanded to evaluate genes whose products were related through biochemical pathways and interactions. Optimal pathway-extended support vector machines predicted responses in patients at accuracies of 70% (imatinib), 71% (lapatinib), 83% (sunitinib), 83% (erlotinib), 88% (sorafenib) and 91% (gefitinib). These best performing pathway-extended models demonstrated improved balance predicting both sensitive and resistant patient categories, with many of these genes having a known role in cancer etiology. Ensemble machine learning-based averaging of multiple pathway-extended models derived for an individual drug increased accuracy to >70% for erlotinib, gefitinib, lapatinib, and sorafenib. Through incorporation of novel cancer biomarkers, machine learning-based pathway-extended signatures display strong efficacy predicting both sensitive and resistant patient responses to chemotherapy.

    Applying artificial intelligence to big data in hepatopancreatic and biliary surgery: a scoping review

    Aim: Artificial Intelligence (AI) and its applications in healthcare are rapidly developing. The healthcare industry generates ever-increasing volumes of data that should be used to improve patient care. This review aims to examine the use of AI and its applications in hepatopancreatic and biliary (HPB) surgery, highlighting studies leveraging large datasets. Methods: A PRISMA-ScR compliant scoping review using Medline and Google Scholar databases was performed (5th August 2022). Studies focusing on the development and application of AI to HPB surgery were eligible for inclusion. We undertook a conceptual mapping exercise to identify key areas where AI is under active development for use in HPB surgery. We considered studies and concepts in the context of patient pathways: before surgery (including diagnostics), around the time of surgery (supporting interventions) and after surgery (including prognostication). Results: 98 studies were included. Most studies were performed in China or the USA (n = 45). Liver surgery was the most common area studied (n = 51). Research into AI in HPB surgery has increased rapidly in recent years, with almost two-thirds published since 2019 (61/98). Of these studies, 11 have focused on using “big data” to develop and apply AI models. Nine of these studies came from the USA and nearly all focused on the application of Natural Language Processing. We identified several critical conceptual areas where AI is under active development, including improving preoperative optimization, image guidance and sensor fusion-assisted surgery, surgical planning and simulation, natural language processing of clinical reports for deep phenotyping and prediction, and image-based machine learning. Conclusion: Applications of AI in HPB surgery primarily focus on image analysis and computer vision to address diagnostic and prognostic uncertainties. Virtual 3D and augmented reality models to support complex HPB interventions are also under active development and likely to be used in surgical planning and education. In addition, natural language processing may be helpful in the annotation and phenotyping of disease, leading to new scientific insights.

    Computational Methods for the Analysis of Genomic Data and Biological Processes

    In recent decades, new technologies have made remarkable progress in helping to understand biological systems. Rapid advances in genomic profiling techniques such as microarrays or high-performance sequencing have brought new opportunities and challenges in the fields of computational biology and bioinformatics. Such genetic sequencing techniques allow large amounts of data to be produced, whose analysis and cross-integration could provide a complete view of organisms. As a result, it is necessary to develop new techniques and algorithms that carry out an analysis of these data with reliability and efficiency. This Special Issue collected the latest advances in the field of computational methods for the analysis of gene expression data, and, in particular, the modeling of biological processes. Here we present eleven works selected to be published in this Special Issue due to their interest, quality, and originality