7,296 research outputs found

Machine Learning Prediction of Cancer Cell Sensitivity to Drugs Based on Genomic and Chemical Properties

Author: Ballester Pedro J.
Benes Cyril Henri
Garnett Mathew
Iorio Francesco
McDermott Ultan
Menden Michael P.
Saez-Rodriguez Julio
Publication venue: Public Library of Science (PLoS)
Publication date: 18/03/2013

Predicting the response of a specific cancer to a therapy is a major goal in modern oncology that should ultimately lead to a personalised treatment. High-throughput screenings of potentially active compounds against a panel of genomically heterogeneous cancer cell lines have unveiled multiple relationships between genomic alterations and drug responses. Various computational approaches have been proposed to predict sensitivity based on genomic features, while others have used the chemical properties of the drugs to ascertain their effect. In an effort to integrate these complementary approaches, we developed machine learning models to predict the response of cancer cell lines to drug treatment, quantified through IC50 values, based on both the genomic features of the cell lines and the chemical properties of the considered drugs. Models predicted IC50 values in an 8-fold cross-validation and an independent blind test with coefficients of determination (R²) of 0.72 and 0.64, respectively. Furthermore, models were able to predict with comparable accuracy (R² of 0.61) IC50s of cell lines from a tissue not used in the training stage. Our in silico models can be used to optimise the experimental design of drug-cell screenings by estimating a large proportion of missing IC50 values rather than experimentally measuring them. The implications of our results go beyond virtual drug screening design: potentially thousands of drugs could be probed in silico to systematically test their potential efficacy as anti-tumour agents based on their structure, thus providing a computational framework to identify new drug repositioning opportunities and, ultimately, to support personalized medicine by linking the genomic traits of patients to drug sensitivity.
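
The evaluation protocol named in this abstract (cross-validation scored by the coefficient of determination) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the one-dimensional least-squares line is a stand-in for the paper's models, the fold assignment and the pooling of held-out predictions before scoring are guesses, and the toy data are made up.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists; sample i goes to fold i % n_folds."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_r2(xs, ys, n_folds=8):
    """Pool the held-out predictions from every fold, then score them once."""
    predicted = [0.0] * len(xs)
    for train, test in k_fold_indices(len(xs), n_folds):
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:
            predicted[i] = slope * xs[i] + intercept
    return r_squared(ys, predicted)


# Toy data (made up): a noiseless linear relationship.
toy_x = [float(i) for i in range(16)]
toy_y = [2.0 * x + 1.0 for x in toy_x]
toy_r2 = cross_validated_r2(toy_x, toy_y)
```

Because every prediction comes from a model that never saw the corresponding sample, the resulting R² estimates performance on unseen data rather than goodness of fit on the training set.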

arXiv.org e-Print Archive

CiteSeerX

Harvard University - DASH

FigShare

Artificial intelligence, machine learning, and drug repurposing in cancer

Author: Aittokallio Tero
Tanoli Ziaurrehman
Vähä-Koskela Markus
Publication date: 01/01/2021

Introduction: Drug repurposing provides a cost-effective strategy to re-use approved drugs for new medical indications. Several machine learning (ML) and artificial intelligence (AI) approaches have been developed for systematic identification of drug repurposing leads based on big data resources, hence further accelerating and de-risking the drug development process by computational means. Areas covered: The authors focus on supervised ML and AI methods that make use of publicly available databases and information resources. While most of the example applications are in the field of anticancer drug therapies, the methods and resources reviewed are widely applicable also to other indications including COVID-19 treatment. A particular emphasis is placed on the use of comprehensive target activity profiles that enable a systematic repurposing process by extending the target profile of drugs to include potent off-targets with therapeutic potential for a new indication. Expert opinion: The scarcity of clinical patient data and the current focus on genetic aberrations as primary drug targets may limit the performance of anticancer drug repurposing approaches that rely solely on genomics-based information. Functional testing of cancer patient cells exposed to a large number of targeted therapies and their combinations provides an additional source of repurposing information for tissue-aware AI approaches.

Helsingin yliopiston digitaalinen arkisto

NORA - Norwegian Open Research Archives

Dissecting the Genome for Drug Response Prediction

Author: Carrino Chiara
Helmer-Citterich Manuela
Parca Luca
Pepe Gerardo
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

Predicting the sensitivity of cancer cell lines to a specific treatment is one of the current challenges in precision medicine. With omics and pharmacogenomics data available for over 1000 cancer cell lines, several machine learning and deep learning algorithms have been proposed for drug sensitivity prediction. However, deciding which omics data to use and which computational methods can efficiently incorporate data from different sources is a challenge that several research groups are working on. In this review, we summarize recent advances in the representative computational methods that have been developed in the last 2 years on three public datasets: COSMIC, CCLE, NCI-60. These methods aim to improve the prediction of the sensitivity of cancer cell lines to a given treatment by incorporating the drug's chemical information in the input or by using a priori feature selection. Finally, we discuss the latest published method, which aims to improve the prediction of the clinical drug response of real patients starting from cancer cell line molecular profiles.

ART

Application of variations of non-linear CCA for feature selection in drug sensitivity prediction

Author: Shadbahr Tolou
Publication date: 17/06/2019

Cancer arises from genetic alterations in a patient's DNA. Many studies indicate that these alterations vary among patients and can dramatically affect the therapeutic effect of cancer treatment. Therefore, extensive studies focus on understanding these alterations and their effects. Pre-clinical models play an important role in cancer drug discovery, and cancer cell lines are one of the main ingredients of these pre-clinical studies, capturing many different aspects of the multi-omics properties of cancer cells. However, the assessment of cancer cell line responses to different drugs is error-prone and laborious. Therefore, in silico models that accurately predict drug sensitivity values enhance cancer drug discovery. In the past decade, many computational methods have achieved high performance by studying the similarity between cancer cell lines and drug compounds and using it to obtain an accurate predictive model for unknown instances. In this thesis, we study the effect of non-linear feature selection through two variations of canonical correlation analysis, KCCA and HSIC-SCCA, on the prediction of drug sensitivity. To estimate the performance of these features, we use pairwise kernel ridge regression to predict the drug sensitivity, measured as IC50 values. The data set under study is a subset of Genomics of Drug Sensitivity in Cancer comprising 124 cell lines and 124 drug compounds. The high diversity among the cell line and drug compound samples and the high dimension of the data matrices reduce the accuracy of the model obtained by pairwise kernel ridge regression. The accuracy was reduced when the HSIC-SCCA method was employed as a dimension reduction step, since HSIC-SCCA increased the differences among the samples by employing different projection vectors for samples in different folds of cross-validation. Therefore, the obtained variables are rotated to provide more homogeneous samples. This step slightly improved the accuracy of the model.
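
Pairwise kernel ridge regression, the predictor named in this abstract, can be sketched as follows. This is not the thesis implementation: the RBF base kernels, the product form of the pairwise kernel, the ridge value, and the toy feature vectors and targets are all assumptions chosen for illustration.

```python
import math


def rbf_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) similarity between two feature vectors (assumed kernel)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_kernel(pair_a, pair_b):
    """Product kernel on (drug, cell line) pairs: k_drug * k_cell."""
    (drug_a, cell_a), (drug_b, cell_b) = pair_a, pair_b
    return rbf_kernel(drug_a, drug_b) * rbf_kernel(cell_a, cell_b)


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting, then back substitution."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_pairwise_krr(pairs, targets, ridge=1e-3):
    """Solve (K + ridge * I) alpha = y for the dual coefficients alpha."""
    n = len(pairs)
    gram = [[pairwise_kernel(pairs[i], pairs[j]) + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    return solve_linear_system(gram, targets)


def predict_pairwise_krr(pairs, dual_coeffs, new_pair):
    """Prediction is a kernel-weighted sum over the training pairs."""
    return sum(a * pairwise_kernel(p, new_pair)
               for a, p in zip(dual_coeffs, pairs))


# Toy data (made up): 2 drugs x 2 cell lines with hypothetical IC50-like targets.
toy_drugs = [(0.0,), (1.0,)]
toy_cells = [(0.0, 0.0), (1.0, 0.5)]
toy_pairs = [(d, c) for d in toy_drugs for c in toy_cells]
toy_targets = [1.0, 2.0, 3.0, 4.0]
toy_alpha = fit_pairwise_krr(toy_pairs, toy_targets, ridge=1e-8)
```

With a near-zero ridge the model interpolates the training targets; a larger ridge trades that fit for smoother predictions on unseen drug and cell line pairs.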

Aaltodoc Publication Archive

Learning with multiple pairwise kernels for drug bioactivity prediction

Author: Anna Cichonska
Antti Airola
Heli Julkunen
Juho Rousu
Markus Heinonen
Sandor Szedmak
Tapio Pahikkala
Tero Aittokallio
Publication date: 01/07/2018

Motivation: Many inference problems in bioinformatics, including drug bioactivity prediction, can be formulated as pairwise learning problems, in which one is interested in making predictions for pairs of objects, e.g. drugs and their targets. Kernel-based approaches have emerged as powerful tools for solving problems of that kind, and especially multiple kernel learning (MKL) offers promising benefits as it enables integrating various types of complex biomedical information sources in the form of kernels, along with learning their importance for the prediction task. However, the immense size of pairwise kernel spaces remains a major bottleneck, making the existing MKL algorithms computationally infeasible even for a small number of input pairs. Results: We introduce pairwiseMKL, the first method for time- and memory-efficient learning with multiple pairwise kernels. pairwiseMKL first determines the mixture weights of the input pairwise kernels, and then learns the pairwise prediction function. Both steps are performed efficiently without explicit computation of the massive pairwise matrices, therefore making the method applicable to solving large pairwise learning problems. We demonstrate the performance of pairwiseMKL in two related tasks of quantitative drug bioactivity prediction using up to 167 995 bioactivity measurements and 3120 pairwise kernels: (i) prediction of anticancer efficacy of drug compounds across a large panel of cancer cell lines; and (ii) prediction of target profiles of anticancer compounds across their kinome-wide target spaces. We show that pairwiseMKL provides accurate predictions using sparse solutions in terms of selected kernels, and therefore it also automatically identifies data sources relevant for the prediction problem.
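
One way to see how a pairwise kernel can be used without ever forming the full pairwise matrix is the standard Kronecker product identity: multiplying the Kronecker product of a drug kernel and a cell line kernel by a vector equals two small matrix products. The sketch below illustrates that identity only. It is not the pairwiseMKL algorithm, and the function names and toy matrices are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Swap rows and columns."""
    return [list(row) for row in zip(*m)]


def kron_matvec(k_drug, k_cell, coeffs):
    """Compute (K_drug kron K_cell) applied to coeffs as K_drug @ A @ K_cell^T.

    coeffs is an n_drug x n_cell matrix A; entry (i, j) belongs to the pair
    (drug i, cell line j). The big pairwise matrix is never built.
    """
    return matmul(matmul(k_drug, coeffs), transpose(k_cell))


def explicit_kron_matvec(k_drug, k_cell, coeffs):
    """Reference version that does build the full pairwise kernel matrix."""
    n_d, n_c = len(k_drug), len(k_cell)
    flat = [coeffs[k][l] for k in range(n_d) for l in range(n_c)]
    big = [[k_drug[i][k] * k_cell[j][l]
            for k in range(n_d) for l in range(n_c)]
           for i in range(n_d) for j in range(n_c)]
    out = [sum(row[t] * flat[t] for t in range(len(flat))) for row in big]
    return [out[i * n_c:(i + 1) * n_c] for i in range(n_d)]


# Toy kernels and coefficients (made up, integer-valued for exact comparison).
toy_k_drug = [[2, 1], [1, 3]]
toy_k_cell = [[1, 0, 2], [0, 1, 1], [2, 1, 4]]
toy_coeffs = [[1, 2, 3], [4, 5, 6]]
fast_result = kron_matvec(toy_k_drug, toy_k_cell, toy_coeffs)
slow_result = explicit_kron_matvec(toy_k_drug, toy_k_cell, toy_coeffs)
```

For n_d drugs and n_c cell lines, the explicit matrix has (n_d * n_c)² entries, while the factored product only touches the two small kernels and the n_d x n_c coefficient matrix.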

Crossref

Aaltodoc Publication Archive

Helsingin yliopiston digitaalinen arkisto

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California