
    Matching anticancer compounds and tumor cell lines by neural networks with ranking loss

    Computational drug sensitivity models have the potential to improve therapeutic outcomes by identifying targeted drug components that are likely to achieve the highest efficacy for a cancer cell line at hand at a therapeutic dose. State-of-the-art drug sensitivity models use regression techniques to predict the inhibitory concentration of a drug for a tumor cell line. This regression objective is not directly aligned with either of these principal goals of drug sensitivity models. We argue that drug sensitivity modeling should be seen as a ranking problem with an optimization criterion that quantifies a drug’s inhibitory capacity for the cancer cell line at hand relative to its toxicity for healthy cells. We derive an extension to the well-established drug sensitivity regression model PaccMann that employs a ranking loss and focuses on the ratio of inhibitory concentration and therapeutic dosage range. We find that the ranking extension significantly enhances the model’s capability to identify the most effective anticancer drugs for unseen tumor cell profiles based on in-vitro data.
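To make the difference between a regression objective and a ranking objective concrete, here is a minimal sketch of a pairwise logistic ranking loss. The abstract does not state which ranking loss the PaccMann extension uses, so the function name `pairwise_ranking_loss`, its inputs, and the choice of a logistic pairwise loss are all assumptions made for illustration.

```python
import math

def pairwise_ranking_loss(scores, targets):
    """Mean logistic loss over all drug pairs whose target ordering is known.

    scores  -- model outputs for each drug on one cell line (higher = ranked better)
    targets -- ground-truth effectiveness for the same drugs (higher = more effective)
    Both names are hypothetical; they are not taken from the paper.
    """
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if targets[i] > targets[j]:  # drug i should outrank drug j
                margin = scores[i] - scores[j]
                # Small when the model scores drug i well above drug j.
                losses.append(math.log1p(math.exp(-margin)))
    return sum(losses) / len(losses) if losses else 0.0

# Scores that agree with the target ordering give a lower loss than reversed scores.
good = pairwise_ranking_loss([2.0, 1.0, 0.0], [0.9, 0.5, 0.1])
bad = pairwise_ranking_loss([0.0, 1.0, 2.0], [0.9, 0.5, 0.1])
```

Unlike a squared-error objective, this loss depends only on score differences between drugs, so a model is rewarded for ordering drugs correctly even when the absolute predicted values are off.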

    A Patient-Derived Cell Atlas Informs Precision Targeting of Glioblastoma

    Glioblastoma (GBM) is a malignant brain tumor with few therapeutic options. The disease presents with a complex spectrum of genomic aberrations, but the pharmacological consequences of these aberrations are partly unknown. Here, we report an integrated pharmacogenomic analysis of 100 patient-derived GBM cell cultures from the human glioma cell culture (HGCC) cohort. Exploring 1,544 drugs, we find that GBM has two main pharmacological subgroups, marked by differential response to proteasome inhibitors and mutually exclusive aberrations in TP53 and CDKN2A/B. We confirm this trend in cell and in xenotransplantation models, and identify both Bcl-2 family inhibitors and p53 activators as potentiators of proteasome inhibitors in GBM cells. We can further predict the responses of individual cell cultures to several existing drug classes, presenting opportunities for drug repurposing and design of stratified trials. Our functionally profiled biobank provides a valuable resource for the discovery of new treatments for GBM. Patrik Johansson, Cecilia Krona and Soumi Kundu share first authorship.

    Pancreatic Cancer - Early Detection, Prognostic Factors, and Treatment

    Background: Pancreatic cancer is the fourth leading cause of cancer-related death. Only about 6% of patients are alive 5 years after diagnosis. One reason for this low survival rate is that most patients are diagnosed at a late stage, when the tumor has spread to surrounding tissues or distant organs. Less than 20% of cases are diagnosed at an early stage that allows them to undergo potentially curative surgery. However, even for patients with a tumor that has been surgically removed, local and systemic recurrence is common and the median survival is only 17-23 months. This underscores the importance of identifying factors that can predict postresection survival. With technical advances and centralization of care, pancreatic surgery has become a safe procedure. The future optimal treatment for pancreatic cancer is dependent on increased understanding of tumor biology and development of individualized and systemic treatment. Previous experimental studies have reported that mucins, especially the MUC4 mucin, may confer resistance to the chemotherapeutic agent gemcitabine and may serve as targets for the development of novel types of intervention. Aim: The aim of the thesis was to investigate strategies to improve management of pancreatic cancer, with special reference to early detection, prognostic factors, and treatment. Methods: In paper I, 27 prospectively collected serum samples from resectable pancreatic cancer (n=9), benign pancreatic disease (n=9), and healthy controls (n=9) were analyzed by high definition mass spectrometry (HDMSE). In paper II, an artificial neural network (ANN) model was constructed on 84 pancreatic cancer patients undergoing surgical resection. In paper III, we investigated the effects of transition from a low- to a high-volume center for pancreaticoduodenectomy in 221 patients.
In paper IV, the grade of concordance in terms of MUC4 expression was examined in 17 tissue sections from primary pancreatic cancer and matched lymph node metastases. In paper V, pancreatic xenograft tumors were generated in 15 immunodeficient mice by subcutaneous injection of MUC4+ human pancreatic cancer cell lines; Capan-1, HPAF-II, or CD18/HPAF. In paper VI, a 76-member combined epigenetics and phosphatase small-molecule inhibitor library was screened against Capan-1 (MUC4+) and Panc-1 (MUC4-) cells, followed by high content screening of protein expression. Results/Conclusion: 134 differentially expressed serum proteins were identified, of which 40 proteins showed a significant up-regulation in the pancreatic cancer group. Pancreatic disease link associations could be made for BAZ2A, CDK13, DAPK1, DST, EXOSC3, INHBE, KAT2B, KIF20B, SMC1B, and SPAG5, by pathway network linkages to p53, the most frequently altered tumor suppressor in pancreatic cancer (I). An ANN survival model was developed, identifying 7 risk factors. The C-index for the model was 0.79, and it performed significantly better than the Cox regression (II). We experienced improved surgical results for pancreaticoduodenectomy after the transition to a high-volume center (≥25 procedures/year), including decreased operative duration, blood loss, hemorrhagic complications, reoperations, and hospital stay. There was also a tendency toward reduced operative mortality, from 4% to 0% (III). MUC4 positivity was detected in most primary pancreatic cancer tissues, as well as in matched metastatic lymph nodes (15/17 vs. 14/17), with a high concordance level (82%) (IV). The tumor incidence was 100% in the xenograft model. The median MUC4 count was found to be highest in Capan-1 tumors. α-SMA and collagen extent were also highest in Capan-1 tumors (V). 
Apicidin (a histone deacetylase inhibitor) had potent antiproliferative activity against Capan-1 cells and significantly reduced the expression of MUC4 and its transcription factor HNF4α. The combined treatment of apicidin and gemcitabine synergistically inhibited growth of Capan-1 cells (VI).
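The C-index of 0.79 reported for the ANN survival model in paper II is a concordance statistic. As a rough illustration of what that number measures, the following sketch computes Harrell's C-index on toy data. The thesis abstract does not say how its C-index was calculated, so the function `concordance_index`, the handling of ties, and the example values are assumptions.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable patient pairs ordered correctly.

    times  -- observed survival or follow-up times
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- predicted risk scores (higher = shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's observed or censoring time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy cohort (made-up values): four patients, the third one censored.
c_index = concordance_index(
    times=[5, 10, 15, 20],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.7, 0.8, 0.1],
)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.79 should be read.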

    Enhancing drug and cell line representations via contrastive learning for improved anti-cancer drug prioritization

    Due to cancer's complex nature and variable response to therapy, precision oncology informed by omics sequence analysis has become the current standard of care. However, the amount of data produced for each patient makes it difficult to quickly identify the best treatment regimen. Moreover, limited data availability has hindered computational methods' abilities to learn patterns associated with effective drug-cell line pairs. In this work, we propose the use of contrastive learning to improve learned drug and cell line representations by preserving relationship structures associated with drug mechanism of action and cell line cancer types. In addition to achieving enhanced performance relative to a state-of-the-art method, we find that classifiers using our learned representations exhibit a more balanced reliance on drug- and cell line-derived features when making predictions. This facilitates more personalized drug prioritizations that are informed by signals related to drug resistance. Comment: 60 pages, 4 figures, 4 tables, 11 supplementary tables, 1 supplementary note, submitted to Nature Communication
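The idea of preserving relationship structure through contrastive learning can be illustrated with a classic margin-based pairwise contrastive loss. The abstract does not specify the loss the authors use, so this is a generic sketch: the function `contrastive_loss`, the `margin` parameter, and the use of Euclidean distance are assumptions, with "same class" standing in for a shared mechanism of action or cancer type.

```python
import math

def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    emb_a, emb_b -- embedding vectors (for example, of two drugs)
    same_class   -- True if the pair shares a label, such as mechanism of action
    margin       -- minimum distance wanted between embeddings of different classes
    """
    distance = math.dist(emb_a, emb_b)  # Euclidean distance
    if same_class:
        # Pull same-class embeddings together.
        return distance ** 2
    # Push different-class embeddings apart until they are at least `margin` away.
    return max(0.0, margin - distance) ** 2

pull = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True)
push = contrastive_loss([0.0, 0.0], [0.0, 0.0], same_class=False)
```

Under this loss, same-class pairs are penalized for being far apart, while different-class pairs are penalized only when they sit closer than the margin.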

    Identifier les variations conduisant au cancer dans le génome non codant et du transcriptome

    Functional annotation of somatic mutations has been a consistent focal point of cancer genomics studies. In the past, researchers preferentially focused on mutations in the coding fraction of the genome, for which ample bioinformatics tools were developed to distinguish cancer-driver mutations from neutral ones. In recent years, as an increasing number of variants have been identified as disease-associated in the non-coding genome, interpreting non-coding cancer mutations has become an urgent task. The completion of large-scale projects such as ENCODE has made functional interpretation of cancer variants achievable, and several programs have been built on this functional information. However, these prediction tools still have limitations, such as low prediction accuracy, lack of cancer mutation information, and significant ascertainment bias. In chapter 2 of this thesis, in order to functionally interpret non-coding mutations in cancer, we developed two independent random forest models, referred to as SNP and SOM. Given a combination of features at a given genome position, the SNP model predicts the expected fraction of rare SNPs (a measure of negative selection), and the SOM model predicts the expected somatic mutation density at this position. We applied our two models to score non-coding disease-associated clinvariant and HGMD variants and a set of random control SNPs. Results showed that disease-associated variants were scored higher than control SNPs with the SNP model and lower than control SNPs with the SOM model, supporting our hypothesis that purifying selection, as measured by fraction of rare SNPs and somatic mutation density, is informative for the evaluation of the functional impact of cancer mutations in the non-coding genome. In the past, researchers have preferentially considered protein-coding genes as critical to the initiation and progression of cancers. However, recent evidence has shown that ncRNAs, in particular lncRNAs, are actively implicated in various cancer processes. A chapter of this thesis is devoted to this class of non-coding transcripts. Similar to protein-coding genes, there might be a large number of lncRNAs with cancer-driving functions. The development of bioinformatics tools to identify and prioritize lncRNAs and other non-coding RNAs has become a new focus of research for computational oncologists. The last part of this thesis is devoted to the implementation of methods for discovering potential cancer-driving non-coding elements in lncRNA and protein-coding genes. We applied three third-party scoring tools, CADD, funSeq2, and GWAVA, together with our SNP and SOM scoring systems, to prioritize cancer-associated elements using a permutation-based algorithm. For each locus, we compute the average score of all observed variants using one of the models, then randomly take the same number of variants and compute their average score 1 million times to form a null distribution and obtain a P value for this locus. To validate our hypothesis and permutation model, we tested this system on 61 cancer-related lncRNAs and 452 cancer genes using somatic mutation data from liver cancer, lung cancer, CLL, and melanoma. We observed that both cancer lncRNAs and protein-coding genes had significantly lower average P values than total lncRNAs and protein-coding genes in all cases. Applying the permutation test to lncRNAs with five different scoring systems enabled us to prioritize hundreds to thousands of cancer-related lncRNA candidates. These candidates can be used for future experimental validation.
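The per-locus permutation procedure described in this abstract can be sketched as follows. This is an illustrative reading rather than the thesis implementation: the function `locus_p_value`, the one-sided comparison toward higher scores, the pool of background scores, the add-one correction, and the much smaller default number of permutations (the thesis reports 1 million) are all assumptions.

```python
import random

def locus_p_value(locus_scores, background_scores, n_permutations=10_000, seed=0):
    """Empirical P value for the mean variant score observed at one locus.

    locus_scores      -- scores of the variants observed at the locus
    background_scores -- pool of variant scores to draw random sets from (assumed)
    n_permutations    -- number of random draws forming the null distribution
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = len(locus_scores)
    observed = sum(locus_scores) / k
    hits = 0
    for _ in range(n_permutations):
        # Draw the same number of variants at random and average their scores.
        sample = rng.sample(background_scores, k)
        if sum(sample) / k >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible P value of 0.
    return (hits + 1) / (n_permutations + 1)

# Toy background pool of 100 scores between 0.00 and 0.99 (made-up values).
background = [i / 100 for i in range(100)]
p_high = locus_p_value([0.97, 0.98, 0.99], background, n_permutations=2000)
p_typical = locus_p_value([0.10, 0.20, 0.30], background, n_permutations=2000)
```

A locus whose variants score far above what random draws of the same size achieve gets a small P value, which is the signal used to prioritize candidate loci.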

    Development of Integrated Machine Learning and Data Science Approaches for the Prediction of Cancer Mutation and Autonomous Drug Discovery of Anti-Cancer Therapeutic Agents

    Few technological ideas have captivated the minds of biochemical researchers to the degree that machine learning (ML) and artificial intelligence (AI) have. Over the last few years, advances in the ML field have driven the design of new computational systems that improve with experience and are able to model increasingly complex chemical and biological phenomena. In this dissertation, we capitalize on these achievements and use machine learning to study drug receptor sites and design drugs to target these sites. First, we analyze the significance of various single nucleotide variations and assess their rate of contribution to cancer. Following that, we use a portfolio of machine learning and data science approaches to design new drugs to target protein kinase inhibitors. We show that these techniques exhibit strong promise in aiding cancer research and drug discovery.

    Oncological drug discovery: AI meets structure-based computational research

    The integration of machine learning and structure-based methods has proven valuable in the past as a way to prioritize targets and compounds in early drug discovery. In oncological research, these methods can be highly beneficial in addressing the diversity of neoplastic diseases portrayed by the different hallmarks of cancer. Here, we review six use case scenarios for integrated computational methods, namely driver prediction, computational mutagenesis, (off)-target prediction, binding site prediction, virtual screening, and allosteric modulation analysis. We address the heterogeneity of integration approaches and individual methods, while acknowledging their current limitations and highlighting their potential to bring drugs for personalized oncological therapies to the market faster. Medicinal Chemistr

    Computational Strategies in Cancer Drug Discovery
