1,209 research outputs found

    Structured penalized regression for drug sensitivity prediction

    Large-scale in vitro drug sensitivity screens are an important tool in personalized oncology to predict the effectiveness of potential cancer drugs. The prediction of the sensitivity of cancer cell lines to a panel of drugs is a multivariate regression problem with high-dimensional heterogeneous multi-omics data as input data and with potentially strong correlations between the outcome variables which represent the sensitivity to the different drugs. We propose a joint penalized regression approach with structured penalty terms which allow us to utilize the correlation structure between drugs with group-lasso-type penalties and at the same time address the heterogeneity between omics data sources by introducing data-source-specific penalty factors to penalize different data sources differently. By combining integrative penalty factors (IPF) with tree-guided group lasso, we create the IPF-tree-lasso method. We present a unified framework to transform more general IPF-type methods to the original penalized method. Because the structured penalty terms have multiple parameters, we demonstrate how the interval-search Efficient Parameter Selection via Global Optimization (EPSGO) algorithm can be used to optimize multiple penalty parameters efficiently. Simulation studies show that IPF-tree-lasso can improve the prediction performance compared to other lasso-type methods, in particular for heterogeneous data sources. Finally, we employ the new methods to analyse data from the Genomics of Drug Sensitivity in Cancer project. Comment: Zhao Z, Zucknick M (2020). Structured penalized regression for drug sensitivity prediction. Journal of the Royal Statistical Society, Series C. 19 pages, 6 figures and 2 tables
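
    As a rough illustration of how data-source-specific penalty factors can be reduced to an ordinary penalized problem, here is a sketch for the simplest case only: a lasso with a single response and S data sources. The notation (X_s, \beta_s, \lambda_s, \gamma_s) is assumed for illustration and is not taken from the paper, which treats the multivariate tree-guided group lasso.

```latex
% Lasso with one penalty parameter \lambda_s per data source s = 1, ..., S
\min_{\beta_1,\dots,\beta_S}\;
  \frac{1}{2n}\Bigl\| y - \sum_{s=1}^{S} X_s \beta_s \Bigr\|_2^2
  + \sum_{s=1}^{S} \lambda_s \|\beta_s\|_1

% Rescale each block relative to the first source:
%   \tilde{X}_s = (\lambda_1 / \lambda_s) X_s , \qquad
%   \gamma_s    = (\lambda_s / \lambda_1) \beta_s ,
% so that X_s \beta_s = \tilde{X}_s \gamma_s and
% \lambda_s \|\beta_s\|_1 = \lambda_1 \|\gamma_s\|_1 .

% The problem becomes a standard lasso with the single parameter \lambda_1
\min_{\gamma_1,\dots,\gamma_S}\;
  \frac{1}{2n}\Bigl\| y - \sum_{s=1}^{S} \tilde{X}_s \gamma_s \Bigr\|_2^2
  + \lambda_1 \sum_{s=1}^{S} \|\gamma_s\|_1
```

    Under these assumptions, a standard lasso solver can be run on the rescaled blocks and the original coefficients recovered as \beta_s = (\lambda_1 / \lambda_s) \gamma_s.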

    Pharmacoproteomic characterisation of human colon and rectal cancer

    Most molecular cancer therapies act on protein targets but data on the proteome status of patients and cellular models for proteome-guided pre-clinical drug sensitivity studies are only beginning to emerge. Here, we profiled the proteomes of 65 colorectal cancer (CRC) cell lines to a depth of > 10,000 proteins using mass spectrometry. Integration with proteomes of 90 CRC patients and matched transcriptomics data defined integrated CRC subtypes, highlighting cell lines representative of each tumour subtype. Modelling the responses of 52 CRC cell lines to 577 drugs as a function of proteome profiles enabled predicting drug sensitivity for cell lines and patients. Among many novel associations, MERTK was identified as a predictive marker for resistance towards MEK1/2 inhibitors and immunohistochemistry of 1,074 CRC tumours confirmed MERTK as a prognostic survival marker. We provide the proteomic and pharmacological data as a resource to the community to, for example, facilitate the design of innovative prospective clinical trials. © 2017 The Authors. Published under the terms of the CC BY 4.0 license.

    Machine learning and data mining frameworks for predicting drug response in cancer: An overview and a novel in silico screening process based on association rule mining

    Towards Leveraging Inhibition State of the Kinome for Precision Oncology

    Protein phosphorylation forms the most common method of regulation in eukaryotes, and kinases are enzymes that chiefly enable its application. Due to their central role in physiology, dysregulation of the kinome is implicated in a myriad of diseases, particularly cancer. This dissertation demonstrates that the measured inhibition of the kinome (the kinome inhibition state) by cancer targeted therapies can be predictive of cell line and patient-derived xenograft (PDX) tumor responses to treatment by that therapy using interpretable machine learning models. The predictive capability of kinome inhibition states with currently used baseline genomics for monotherapy cancer cell line responses across diverse cancer types is demonstrated first using multi-dose kinome inhibition states, and second using multi-assay single-dose data. Then, the predictive value of kinome inhibition states is extended to kinase inhibitor combination therapies, demonstrating that combined kinome inhibition states can accurately predict cancer cell line sensitivity and synergy to combination treatments, providing the basis for rational kinome-informed drug combination selection. Finally, the predictive capacity of kinome inhibition states is demonstrated for PDX tumor responses in five common solid tumor types, confirming the generalizability of kinome inhibition-based prediction models in a preclinical setting, and emphasizing their potential for clinical translation and application in precision oncology. Overall, this dissertation provides compelling evidence that integrating kinome inhibition states in machine learning models can enhance the prediction of cancer cell line and PDX tumor responses. This work shows that kinome inhibition data has potential to be included in precision oncology platforms alongside baseline genomic profiling, aiding in the identification of effective therapeutic strategies and ultimately improving patient outcomes. Doctor of Philosophy

    A Patient-Derived Cell Atlas Informs Precision Targeting of Glioblastoma

    Glioblastoma (GBM) is a malignant brain tumor with few therapeutic options. The disease presents with a complex spectrum of genomic aberrations, but the pharmacological consequences of these aberrations are partly unknown. Here, we report an integrated pharmacogenomic analysis of 100 patient-derived GBM cell cultures from the human glioma cell culture (HGCC) cohort. Exploring 1,544 drugs, we find that GBM has two main pharmacological subgroups, marked by differential response to proteasome inhibitors and mutually exclusive aberrations in TP53 and CDKN2A/B. We confirm this trend in cell and in xenotransplantation models, and identify both Bcl-2 family inhibitors and p53 activators as potentiators of proteasome inhibitors in GBM cells. We can further predict the responses of individual cell cultures to several existing drug classes, presenting opportunities for drug repurposing and design of stratified trials. Our functionally profiled biobank provides a valuable resource for the discovery of new treatments for GBM. Patrik Johansson, Cecilia Krona and Soumi Kundu share first authorship.

    Transcriptomic data integration for precision medicine in leukemia

    This thesis comprises three studies demonstrating the application of different statistical and bioinformatic approaches to address distinct challenges of implementing precision medicine strategies for hematological malignancies. The approaches focus on the analysis of next-generation sequencing data, including both genomic and transcriptomic data, to deconvolute disease biology and underlying mechanisms of drug sensitivity and resistance. The outcomes of the studies have clinical implications for advancing current diagnosis and treatment paradigms in patients with hematological diseases. Study I: RNA sequencing has not been widely adopted in a clinical diagnostic setting due to continuous development and lack of standardization. Here, the aim was to evaluate the efficiency of two different RNA-seq library preparation protocols applied to cells collected from acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) patients. The poly-A-tailed mRNA selection (PA) and ribo-depletion (RD) based RNA-seq library preparation protocols were compared and evaluated for detection of gene fusions, variant calling and gene expression profiling. Overall, both protocols produced broadly consistent results and similar outcomes. However, the PA protocol was more efficient in quantifying expression of leukemia marker genes and drug targets. It also provided higher sensitivity and specificity for expression-based classification of leukemia. In contrast, the RD protocol was more suitable for gene fusion detection and captured a greater number of transcripts. Importantly, high technical variations were observed in samples from two leukemia patient cases, suggesting a need for further development of strategies for transcriptomic quantification and data analysis. Study II: The BCL-2 inhibitor venetoclax is an approved and effective agent in combination with hypomethylating agents or low-dose cytarabine for AML patients unfit for intensive induction chemotherapy. 
However, the limited number of patients responding to venetoclax and the development of resistance to the treatment present a challenge for using the drug to benefit the majority of AML patients. The aim was to investigate genomic and transcriptomic biomarkers for venetoclax sensitivity and enable identification of the patients who are most responsive to venetoclax treatment. We found that venetoclax-sensitive samples are enriched with WT1 and IDH1/IDH2 mutations. Intriguingly, HOX family genes, including HOXB9, HOXA5, HOXB3 and HOXB4, were found to be significantly overexpressed in venetoclax-sensitive patients. Thus, these HOX-cluster gene expression biomarkers can be explored in a clinical trial setting to stratify AML patients responding to venetoclax-based therapies. Study III: Venetoclax treatment does not benefit all AML patients, which calls for biomarkers to exclude non-responding patients from venetoclax-based therapies. The aim was to investigate transcriptomic biomarkers for ex vivo venetoclax resistance in AML patients. The correlation of ex vivo venetoclax response with gene expression profiles using a machine learning approach revealed significant overexpression of the S100 family genes S100A8 and S100A9. Moreover, high expression of S100A9 was found to be associated with birabresib (BET inhibitor) sensitivity. The overexpression of S100A8 and S100A9 could potentially be used to detect and monitor venetoclax resistance. The combination of BCL-2 and BET inhibitors may sensitize AML cells to venetoclax upon BET inhibition and block leukemic cell survival. In this thesis, the aim was to utilize gene expression information for advanced precision medicine outcomes in patients with hematological malignancies. 
In study I, the contemporary mainstream library preparation protocols used for RNA sequencing, ribo-depletion and poly-A enrichment, were compared in order to select the protocol that best meets the goal of the experiment, especially in patients with acute leukemias. In study II, we applied bioinformatics approaches to identify IDH1/2 mutations and HOX family gene expression correlated with ex vivo sensitivity to the BCL-2 inhibitor venetoclax in acute myeloid leukemia (AML) patients. In study III, statistical and machine learning methods were implemented to identify S100A8/A9 gene expression biomarkers for ex vivo resistance to venetoclax in AML patients. In summary, this thesis addresses the challenges of utilizing gene expression information to stratify patients based on biomarkers to promote precision medicine practice in hematological malignancies.

    Assessment of modeling strategies for drug response prediction in cell lines and xenografts

    Despite significant progress in cancer research, effective cancer treatment is still a challenge. Cancer treatment approaches are shifting from standard cytotoxic chemotherapy regimens towards a precision oncology paradigm, where the choice of treatment is personalized, i.e. based on a tumor’s molecular features. In order to match tumor molecular features with therapeutics we need to identify biomarkers of response and build predictive models. Recent growth of large-scale pharmacogenomics resources which combine drug sensitivity and multi-omics information on a large number of samples provides the necessary data for biomarker identification and drug response modelling. However, although many efforts have been made to use this information for drug response prediction, our ability to accurately predict drug response using genetic data remains limited. In this work we used pharmacogenomics data from the largest publicly available studies in order to systematically assess various aspects of the drug response model-building process with the ultimate goal of improving prediction accuracy. We applied several machine learning methods (regularized regression, support vector machines, random forest) for predicting response to a number of drugs. We found that while accuracy of response prediction varies across drugs (in most of the cases R2 values vary between 0.1 and 0.3), different machine learning algorithms applied to the same drug have similar prediction performance. Experiments with a range of different training sets for the same drug showed that the predictive power of a model depends on the type of molecular data, the selected drug response metric, and the size of the training set. It depends less on the number of features selected for modelling and on class imbalance in the training set. We also implemented and tested two methods for improving consistency for pharmacogenomics data coming from different datasets. 
We tested our ability to correctly predict response in xenografts and patients using models trained on cell lines. We obtained reasonably accurate predictions in only a fraction of the tested cases, particularly for response to erlotinib in the NSCLC xenograft cohort, and for responses to erlotinib and docetaxel in the NSCLC and BRCA patient cohorts respectively. This work also includes two applied pharmacogenomics analyses. The first is an analysis of a drug-sensitivity screen performed on a panel of Burkitt cell lines. This combines unsupervised data exploration with supervised modelling. The second is an analysis of drug-sensitivity data for the DKFZ-608 compound and the generation of the corresponding response prediction model. In summary, we applied machine learning techniques to available high-throughput pharmacogenomics data to study the determinants of accurate drug response prediction. Our results can help to draft guidelines for building accurate models for personalized drug response prediction and therefore contribute to the advancement of precision oncology.
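
    For readers unfamiliar with the R2 figures quoted in this abstract, the following is a minimal sketch of one common definition, the coefficient of determination. The abstract does not say which definition the thesis uses, so this choice is an assumption, and the function name and example numbers are purely illustrative.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumed definition for illustration; the thesis may use a
    different one (e.g. squared Pearson correlation).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")

    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after using the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical drug response values, not data from the thesis.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
score = r_squared(observed, predicted)
```

    Under this definition a value of 0 means the model does no better than predicting the mean response, so values between 0.1 and 0.3 correspond to explaining roughly 10 to 30 percent of the variance in response.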

    Quantitative modeling and analysis of drug screening data for personalized cancer medicine

    Despite recent progress in the field of molecular medicine, the treatment and cure of complex diseases such as cancer remains a challenge. Development of resistance to first-line chemotherapy is a common cause of current anticancer treatment failure. To deal with this problem, the personalized medicine (PM) approach has been adapted toward more targeted cancer research and management. The PM approach is based on each patient's genetic, epigenetic and drug response profiling, which is used to design the best treatment option for the given patient. As the PM approach is increasingly being adopted in clinical practice, there is an urgent need for computational models and data mining methods that allow fast processing and analysis of the massive relevant profiling datasets. High-throughput drug screening enables systematic profiling of cellular responses to a wide collection of oncology compounds and their combinations, hence providing an unbiased strategy for personalized drug treatment selection. However, screening experiments with patient-derived cell samples often result in high-dimensional data matrices, with inherent sources of noise. This complicates many downstream analyses, such as the detection of differential drug activity or understanding the mechanisms behind drug sensitivity and resistance in a given patient. To meet these challenges, a computational pipeline for drug response profiling was developed in this thesis. The pipeline was based on a novel metric to quantify drug response, called the drug sensitivity score (DSS). Further, by combining the normalized drug response profile of each cancer sample with a global drug-target interaction network, a target addiction score (TAS) was developed to de-convolute the selective protein targets and obtain knowledge on their functional importance. 
Finally, delta scoring was developed to quantify drug combination effects and to address the problem of the clonal evolution of cancer, which often leads to resistance to monotherapies. This novel computational pipeline improves understanding of cancer development and translates compound activities into informed treatment choices for clinicians. As exemplified in two case studies of adult acute myeloid leukemia (AML) and adult granulosa cell tumor (AGCT), the models developed here have the potential to significantly contribute to the effective analysis of data from individual cancer patients and from pan-cancer cell line panels. Hence, these models will play a substantial role in future personalized cancer treatment strategies and the selection of effective treatment options for individual cancer patients.

    Developing genomic models for cancer prevention and treatment stratification

    Malignant tumors remain one of the leading causes of mortality with over 8.2 million deaths worldwide in 2012. Over the last two decades, high-throughput profiling of the human transcriptome has become an essential tool to investigate molecular processes involved in carcinogenesis. In this thesis I explore how gene expression profiling (GEP) can be used in multiple aspects of cancer research, including prevention, patient stratification and subtype discovery. The first part details how GEP could be used to supplement or even replace the current gold standard assay for testing the carcinogenic potential of chemicals. This toxicogenomic approach coupled with a Random Forest algorithm allowed me to build models capable of predicting carcinogenicity with an area under the curve of up to 86.8% and provided valuable insights into the underlying mechanisms that may contribute to cancer development. The second part describes how GEP could be used to stratify heterogeneous populations of lymphoma patients into therapeutically relevant disease sub-classes, with a particular focus on diffuse large B-cell lymphoma (DLBCL). Here, I successfully translated established biomarkers from the Affymetrix platform to the clinically relevant Nanostring nCounter© assay. This translation allowed us to profile custom sets of transcripts from formalin-fixed samples, transforming these biomarkers into clinically relevant diagnostic tools. Finally, I describe my effort to discover tumor samples dependent on altered metabolism driven by oxidative phosphorylation (OxPhos) across multiple tissue types. This work was motivated by previous studies that identified a therapeutically relevant OxPhos sub-type in DLBCL, and by the hypothesis that this stratification might be applicable to other solid tumor types. 
To that end, I carried out a transcriptomics-based pan-cancer analysis, derived a generalized PanOxPhos gene signature, and identified mTOR as a potential regulator in primary tumor samples. High-throughput GEP coupled with statistical machine learning methods represents an important toolbox in modern cancer research. It provides a cost-effective and promising new approach for predicting cancer risk associated with chemical exposure, it can reduce the ever-increasing cost of the drug development process by identifying therapeutically actionable disease subtypes, and it can increase patients’ survival by matching them with the most effective drugs. 2016-12-01T00:00:00