469 research outputs found

    Explainable artificial intelligence for patient stratification and drug repositioning

    Enabling precision medicine requires developing robust patient stratification methods as well as drugs tailored to homogeneous subgroups of patients from a heterogeneous population. Developing de novo drugs is expensive and time consuming with an ultimately low FDA approval rate. These limitations make developing new drugs for a small portion of a disease population unfeasible. Therefore, drug repositioning is an essential alternative for developing new drugs for a disease subpopulation. There is a crucial need to develop data-driven approaches that find druggable homogeneous subgroups within the disease population and reposition the drugs for these subgroups. In this study, we developed an explainable AI approach for patient stratification and drug repositioning. Exploratory mining mimicking the trial recruitment process as well as network analysis were used to discover homogeneous subgroups within a disease population. For each subgroup, a biomedical network analysis was done to find the drugs that are most relevant to a given subgroup of patients. The set of candidate drugs for each subgroup was ranked using an aggregated drug score assigned to each drug. The method represents a human-in-the-loop framework, where medical experts use data-driven results to generate hypotheses and obtain insights into potential therapeutic candidates for patients who belong to a subgroup. To examine the validity of our method, we implemented our method on individual cancer types and on pan-cancer data to consider the inter- and intra-heterogeneity within a cancer type and among cancer types. Patients' phenotypic and genotypic data was utilized with a heterogeneous knowledge base because it gives a multi-view perspective for finding new indications for drugs outside of their original use. 
Our analysis of the top candidate drugs for the subgroups showed that most of these drugs are FDA-approved drugs for cancer, and others are non-cancer related, but have the potential to be repurposed for cancer. We have discovered novel cancer-related mechanisms that these drugs can target in different cancer types to reduce cancer treatment costs and improve patient survival. Further wet lab experiments to validate these findings are required prior to initiating clinical trials using these repurposed therapies. Includes bibliographical references.

    Application of Benchmark Dose Analysis to in vitro Genotoxicity Data for Compound Risk Characterisation

    Genotoxic risk from exposure to pharmaceutical compounds has historically been focussed on dichotomous hazard characterisation, with little regulatory acceptance of risk assessment paradigms. The regulations focus on testing novel compounds with outdated genotoxicity test systems. Recent overwhelming support of the Benchmark Dose (BMD) methodology provides the baseline for advanced exposure risk assessments. Novel flow cytometric in vitro DNA damage response assays (MultiFlow and ToxTracker) have been developed that provide quantitative dose-response information that can be used in a high-throughput screening environment. In the following work, BMD modelling is applied to the MultiFlow and ToxTracker biomarker dose-response datasets. This work demonstrates that the MultiFlow dose-response biomarker datasets are amenable to BMD analysis for a set of clastogens and aneugens, and that the biomarker dose-responses correlate with dose-responses from the gold-standard in vitro micronucleus assay. A detailed appraisal of BMD confidence intervals (CIs) is provided for a selection of 10 clastogens requiring metabolic activation (with S9), demonstrating the criticality of using BMD uncertainty measures in comparative potency analysis. A comparative potency algorithm is developed and utilised in machine learning to distinguish four S9-dependent groupings: high and low-level potentiation, no effect, and diminution. A deep dive case study is presented for MultiFlow and ToxTracker analysis of Topoisomerase II Poisons, where BMD CI potency ranks are shown to correlate broadly with compound structural information. The Adverse Outcome Pathway (AOP) for Topoisomerase-II Poisoning is developed upon, and the Lhasa Derek Nexus alerts are mapped to the AOP. 
A Quantitative Structural Activity Relationship model is developed using Topoisomerase-II Poison molecular descriptors and BMD measurements from MultiFlow and ToxTracker biomarkers that correspond to Key Events relative to the Topoisomerase-II Poison AOP. This thesis provides an all-encompassing report of in vitro DNA damage response biomarker BMD analysis for compound potency ranking and read across
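As a rough illustration of the benchmark dose idea described in this abstract, the sketch below computes a BMD from an assumed Hill-type dose-response curve and checks whether two BMD confidence intervals overlap, which is one simple way to decide whether two potencies can be told apart. The Hill model, the 10% default benchmark response, and all function names are illustrative assumptions; the thesis does not state which models or software were used.

```python
def hill(dose, background, top, ec50, n):
    """Assumed Hill-type dose-response curve (illustrative model choice)."""
    return background + (top - background) * dose**n / (ec50**n + dose**n)


def benchmark_dose(background, top, ec50, n, bmr=0.10):
    """Dose at which the response rises by a fraction `bmr` over background.

    Solves hill(d) = background * (1 + bmr) in closed form.
    """
    target_fraction = background * bmr / (top - background)
    if not 0 < target_fraction < 1:
        raise ValueError("the curve never reaches the requested benchmark response")
    return ec50 * (target_fraction / (1 - target_fraction)) ** (1 / n)


def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) BMD confidence intervals overlap."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
```

Under this sketch, two compounds would be ranked as differing in potency only when their BMD intervals do not overlap, which reflects the abstract's point that uncertainty measures matter in comparative potency analysis.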

    Machine learning-based virtual screening, molecular docking, drug-likeness, pharmacokinetics and toxicity analyses to identify new natural inhibitors of the glycoprotein spike (S1) of SARS-CoV-2

    This study aimed to identify natural bioactive compounds (NBCs) as potential inhibitors of the spike (S1) protein by means of in silico assays. NBCs with previously proven biological in vitro activity were obtained from the ZINC database and analyzed through virtual screening and molecular docking to identify those with higher affinity to the spike protein. Eight machine learning models were used to validate the results: Principal Component Analysis (PCA), Artificial Neural Network (ANN), Support Vector Machine (SVM), k-Nearest Neighbors (KNN), Partial Least Squares-Discriminant Analysis (PLS-DA), Gradient Boosted Tree Discriminant Analysis (XGBoostDA), Soft Independent Modelling of Class Analogies (SIMCA) and Logistic Regression Discriminant Analysis (LREG). Selected NBCs were submitted to drug-likeness prediction using Lipinski's rule of five and Veber's rules. A prediction of pharmacokinetic parameters and toxicity was also performed (ADMET). Antivirals currently used for COVID-19 (remdesivir and molnupiravir) were used as comparators. A total of 170,906 compounds were analyzed. Of these, 34 showed a greater affinity for S1 (affinity energy < -7 kcal mol⁻¹). Most of these compounds belonged to the class of coumarins (benzopyrones), presenting a benzene ring fused to a lactone (group of heterosides). The PLS-DA model was able to reproduce the results of the virtual screening and molecular docking (accuracy of 97.0%). Of the 34 compounds, only NBC5 (feselol), NBC14, NBC15, and NBC27 had better results in ADMET predictions. These had a similar binding affinity to S1 when compared to remdesivir and molnupiravir. Feselol and three other NBCs were the most promising candidates for treating COVID-19. In vitro and in vivo studies are needed to confirm these findings.
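The filtering steps named in this abstract (a docking affinity cut-off followed by drug-likeness rules) can be sketched as below. The thresholds are the commonly cited ones for Lipinski's rule of five and Veber's rules; the abstract does not give the exact criteria applied, so allowing at most one Lipinski violation, the `Compound` fields, and the function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    affinity_kcal_mol: float  # docking score; more negative means stronger binding
    mol_weight: float         # g/mol
    logp: float
    h_donors: int
    h_acceptors: int
    rotatable_bonds: int
    tpsa: float               # topological polar surface area, in square angstroms


def lipinski_violations(c: Compound) -> int:
    """Count violations of Lipinski's rule of five."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def passes_veber(c: Compound) -> bool:
    """Veber's rules: at most 10 rotatable bonds and TPSA of at most 140."""
    return c.rotatable_bonds <= 10 and c.tpsa <= 140


def shortlist(compounds, affinity_cutoff=-7.0, max_lipinski_violations=1):
    """Keep compounds that bind below the cut-off and look drug-like."""
    return [
        c for c in compounds
        if c.affinity_kcal_mol < affinity_cutoff
        and lipinski_violations(c) <= max_lipinski_violations
        and passes_veber(c)
    ]
```

Descriptor values would normally come from a cheminformatics toolkit; here they are passed in directly so the sketch stays self-contained.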

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.


    Molecular mechanisms and targets of new anticancer treatments

    The work presented in this thesis is an effort to decipher and understand the mechanism of action (MOA) of anticancer agents by building on and complementing chemical proteomics methods. The backbone of the thesis relies on a recent method called Functional Identification of Target by Expression Proteomics (FITExP) developed in Zubarev lab, where drug induced proteomic signatures are analyzed in various cell lines and top differentially regulated proteins with consistent behavior are determined, among which the drug target and mechanistic proteins are usually present. FITExP relies on the assumption that proteins most affected with a perturbation have a higher probability of being involved in that process. In this regard, Paper I aimed to enhance the performance of FITExP analysis by merging proteomic data from drug-treated matrix attached and detached cells. This is while the majority if not all proteomics and molecular biology experiments are performed in matrix attached cells, as the general belief is that detached cells lose their structural integrity and do not harbor valuable information. However, detached cells are those that are more sensitive to chemotherapeutics and might reflect the proteome changes better. The comparative proteomics of living and dying cells improved FITExP performance with regards to identification of targets and provided insight about proteins involved in cellular life and death decisions. Furthermore, the orthogonal partial least squares-discriminant analysis (OPLS-DA) paradigm presented in this study, was used throughout the thesis for contrasting and visualizing the proteomic signature of a molecule against others, to reveal targets and specific proteins changing in response to the molecule of interest. 
In Paper II, as a further development of FITExP and to demonstrate its applicability in a broader context, we built a proteome signature library of 56 clinical and experimental anticancer agents in A549 lung adenocarcinoma cell line. This resource called ProTargetMiner can be used for different purposes. The proximity of compounds in hierarchical clustering or t-SNE could be used for prediction of the mechanism of new compounds. Contrasting each molecule against other treatments using the OPLS-DA scheme presented in Paper I, revealed drug targets, mechanistic proteins, resistance factors, drug metabolizing enzymes and effects on protein complexes. Representative examples were used to demonstrate that the specificity factors extracted from the OPLS-DA models can help identify subtle but biologically significant processes, even when such an effect is as low as 15% fold change. Furthermore, we showed that the inclusion of 8-10 contrasting molecules in the OPLS-DA models can produce enough specificity for drug target deconvolution, which offered a miniaturization opportunity. Therefore, we built three deeper datasets using 9 compounds that showed the most diverse proteome changes in the orthogonal space in three cell lines from major cancer types: A549 lung, MCF-7 breast and RKO colon cancers. These datasets provide a unique depth of 7398, 8735 and 8551 respectively, with no missing values. Subsequently, a Shiny package was created in R, which can employ these datasets as a resource and merge it with user data and provide OPLS-DA output and target deconvolution opportunity for new compounds. Finally, using the original ProTargetMiner data, we also built a first of its kind proteomic correlation database which can find applications in deciphering the function of uncharacterized proteins. Moreover, the resource helped to identify a set of core or untouchable proteins with stable expression across all the treatments, revealing essential functions within the cells. 
Such proteins could be used as house-keeping controls in molecular biology experiments. In Paper III, we combined FITExP with other chemical proteomics tools, Thermal Proteome Profiling (TPP) and multiplexed redox proteomics, to study the target and mechanism space of auranofin. This also allowed us to assess the power, orthogonality and complementarity of these techniques in the realm of chemical proteomics. TPP is a recently developed technique that can monitor changes in the stability of proteins upon binding to small molecules. Redox proteomics is a method by which the oxidation level of the protein cysteinome can be quantitatively analyzed. Auranofin is an FDA-approved anti-inflammatory drug for treatment of rheumatoid arthritis, but due to its potent antitumor activity, it is currently in clinical trials against cancer. Although several MOAs have been suggested for auranofin, uncertainties exist regarding its cellular targets; therefore, this molecule was chosen as a challenging candidate to test the chemical proteomics tools. A combination of the above-mentioned tools confirmed thioredoxin reductase 1 (TXNRD1) (ranking 3rd) as the cognate target of auranofin and demonstrated that perturbation of the oxidoreductase pathway is the main route of auranofin cytotoxicity. We next showed that changes in the redox state of specific cysteines can be linked to protein stability in TPP. Some of these cysteines were mapped to the active sites of redox-active enzymes. In Paper IV, using quantitative multiplexed proteomics, we helped to show that b-AP15, a bis-benzylidine piperidone compound inhibiting deubiquitinases USP14 and UCHL5, produces a similar perturbation signature as bortezomib in colon cancer cells. However, in comparison with bortezomib, b-AP15 induces chaperone expression to a significantly higher level and leads to a more extensive accumulation of polyubiquitinated proteins. 
The polyubiquitinated proteins co-localize with the mitochondrial membrane and subsequently reduce oxidative phosphorylation. These results help define the atypical cell death induced by b-AP15 and describe why this molecule is effective against apoptosis-resistant cells in a variety of tumor models. Finally, in Paper V, we extended the applications of TPP and combined it with the specificity concept for proteome-wide discovery of specific protein substrates for enzymes. We developed a universal method called System-wide Identification of Enzyme Substrates by Thermal Analysis (SIESTA) that relies on the hypothesis that enzymatic post-translational modification of substrate proteins can potentially change their stability against thermal denaturation. Furthermore, we applied the concept of specificity similar to the above papers, to reveal potential substrates using OPLS-DA. SIESTA was applied to two enzyme systems, namely TXNRD1 and poly-(ADP-ribose) polymerase-10 (PARP10), identifying known and putative candidate substrates. A number of these candidate proteins were validated as PARP10 substrates by targeted mass spectrometry, chemiluminescence and other assays. SIESTA is an unbiased and system-wide approach and its broad application can improve our understanding of enzyme function in homeostasis and disease. In turn, specific protein substrates can serve as readouts in high throughput screening and facilitate drug discovery. Taken together, in this thesis, FITExP methodology was improved in two directions. In Paper I, we improved the performance of FITExP by combining the proteomics data from detached and attached cells. In Paper II, we demonstrated how the proteomics data on a multitude of drugs in a single cell line enables the discovery of compound targets and MOA. Furthermore, we built an R Shiny package which can serve as a resource for the cancer community in target and MOA deconvolution. 
In Papers III and IV, we applied an arsenal of chemical proteomics tools for characterization of two anticancer compounds. In Paper V, we expanded the applications of TPP to identification of specific protein substrates for enzymes in a system-wide manner
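The core intuition behind FITExP as described in this abstract, that proteins regulated strongly and consistently across several cell lines are the likeliest targets, can be sketched as follows. This is a deliberately simplified stand-in and not the published FITExP scoring: the consistency test (same direction in every cell line), the score (smallest absolute log2 fold change), and the function name are all assumptions.

```python
import math


def consistent_regulation_ranking(fold_changes):
    """Rank proteins by consistent regulation across cell lines.

    fold_changes maps a protein name to a list of treated/control
    abundance ratios, one per cell line. Proteins that change direction
    between cell lines are dropped; the rest are scored by their weakest
    absolute log2 fold change, so one strong outlier cannot dominate.
    """
    scores = {}
    for protein, ratios in fold_changes.items():
        log_ratios = [math.log2(r) for r in ratios]
        all_up = all(x > 0 for x in log_ratios)
        all_down = all(x < 0 for x in log_ratios)
        if all_up or all_down:
            scores[protein] = min(abs(x) for x in log_ratios)
    return sorted(scores, key=scores.get, reverse=True)
```

Scoring by the weakest response is one way to encode "consistent behavior"; a rank-product or a multivariate contrast such as the OPLS-DA scheme mentioned in the abstract would be alternatives.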

    Correlation between cell line chemosensitivity and protein expression pattern as new approach for the design of targeted anticancer small molecules

    BACKGROUND AND RATIONALE: Over the past few decades, several databases with a significant amount of biological data related to cancer cells and anticancer agents (e.g.: National Cancer Institute database, NCI; Cancer Cell Line Encyclopedia, CCLE; Genomic and Drug Sensitivity in Cancer portal, GDSC) have been developed. The huge amount of heterogeneous biological data extractable from these databanks (among all, drug response and protein expression) provides a real foundation for predictive cancer chemogenomics, which aims to investigate the relationships between genomic traits and the response of cancer cells to drug treatment with the aim to identify novel therapeutic molecules and targets. In very recent times many computational and statistical approaches have been proposed to integrate and correlate these heterogeneous biological data sequences (protein expression – drug response), with the aim to assign the putative mechanism of action of anticancer small molecules with unknown biological target/s. The main limitation of all these computational methods is the need for experimental drug response data (after screening data). From this point of view, the possibility to predict in silico the antiproliferative activity of new/untested small molecules against specific cell lines, could enable correlations to be found between the predicted drug response and protein expression of the desired target from the very earliest stages of research. Such an innovative approach could allow to select the compounds with molecular mechanisms that are more likely to be connected with the target of interest preliminary to the in vitro assays, which would be a critical aid in the design of new targeted anticancer agents. RESULTS: In the present study, we aimed to develop a new innovative computational protocol based on the correlation of drug activity and protein expression data to support the discovery of new targeted anticancer agents. 
Compared with the approaches reported in the literature, the main novelty of the proposed protocol was represented by the use of predicted antiproliferative activity data, instead of experimental ones. To this aim, in the first phase of the research the new in silico Antiproliferative Activity Predictor (AAP) tool able to predict the anticancer activity (expressed as GI50) of new/untested small molecules against the NCI-60 panel was developed. The ligand-based tool, which took the advantages of the consolidated expertise of the research group in the manipulation of molecular descriptors, was adequately validated and the reliability of the prediction was further confirmed by the analysis of an in-house database and subsequent evaluation of a set of molecules selected by the NCI for the one-dose/five-doses antiproliferative assays. In the second part of the study, a new computational method to correlate drug activity data and protein expression pattern data was proposed and evaluated by analysing several case studies of targeted drugs tested by NCI, confirming the reliability of the proposed method for the biological data analysis. In the last part of the project the proposed correlation approach was applied to design new small molecules as selective inhibitors of Cdc25 phosphatase, a well-known protein involved in carcinogenic processes. By means of this innovative approach, integrated with other classical ligand/structures-based techniques, it was possible to screen a large database of molecular structures, and to select the ones with optimal relationship with the focused target. In vitro antiproliferative and enzymatic inhibition assays of the selected compounds led to the identification of new structurally heterogeneous inhibitors of Cdc25 proteins and confirmed the results of the in silico analysis. 
CONCLUSIONS: Collectively, the obtained results showed that the correlation between protein expression pattern and chemosensitivity is an innovative, alternative, and effective method to identify new modulators for the selected targets. In contrast to traditional in silico methods, the proposed protocol allows for the selection of molecular structures with heterogeneous scaffolds, which are not strictly related to the binding sites and with chemical-physical features that may be more suitable for all the pathways involved in the overall mechanism. The biological assays further corroborate the robustness and the reliability of this new approach and encourage its application in the anticancer targeted drug discovery field
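The central step of this protocol, correlating a compound's activity profile across cell lines with the expression profile of candidate proteins, can be illustrated with a plain Pearson correlation. The abstract does not specify the correlation measure or data layout, so the use of Pearson's r, the -log10(GI50) activity scale, and the function names below are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_targets(activity, expression):
    """Rank proteins by how well their expression tracks drug activity.

    activity:   -log10(GI50) values, one per cell line (predicted or measured)
    expression: {protein: expression levels in the same cell-line order}
    Returns (protein, r) pairs, highest correlation first.
    """
    correlations = {
        protein: pearson(activity, levels)
        for protein, levels in expression.items()
    }
    return sorted(correlations.items(), key=lambda item: item[1], reverse=True)
```

A compound whose activity profile correlates strongly with the expression of the intended target would, under this reading, be prioritised for in vitro testing.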

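    Gene prioritization and drug discovery using computational tools and machine learning techniques in osteosarcoma [original title: Priorización de genes y búsqueda de fármacos por medio de herramientas informáticas y técnicas de aprendizaje de máquinas en osteosarcoma]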

    Programa Oficial de Doutoramento en Tecnoloxías da Información e as Comunicacións. 5032V01. Thesis by compendium of publications. [Abstract] Osteosarcoma is the most common subtype of primary bone cancer and mainly affects adolescents. In recent years, several studies have focused on elucidating the molecular mechanisms of this sarcoma; however, its molecular etiology has not yet been accurately determined. Moreover, its clinical diagnosis is generalist and its therapies have not changed in recent decades. Although 5-year survival rates can nowadays reach 60-70%, acute complications and late effects of osteosarcoma therapy are two of the limiting factors in treatment. 
Thus, the objective of this doctoral thesis is to develop a prioritization strategy that allows the identification of genes associated with the pathogenicity of osteosarcoma, and to explain more fully the etiology of this disease. In addition, it seeks to develop drug prediction algorithms based on machine learning techniques that allow new therapeutic agents to be proposed for the treatment of this disease. All the results obtained in this research were published in international scientific journals with a significant JCR impact factor.

    IN SILICO METHODS FOR DRUG DESIGN AND DISCOVERY

    Computer-aided drug design (CADD) methodologies are playing an ever-increasing role in drug discovery and are critical in the cost-effective identification of promising drug candidates. These computational methods are relevant in limiting the use of animal models in pharmacological research, for aiding the rational design of novel and safe drug candidates, and for repositioning marketed drugs, supporting medicinal chemists and pharmacologists during the drug discovery trajectory. Within this field of research, we launched a Research Topic in Frontiers in Chemistry in March 2019 entitled “In silico Methods for Drug Design and Discovery,” which involved two sections of the journal: Medicinal and Pharmaceutical Chemistry and Theoretical and Computational Chemistry. For the reasons mentioned, this Research Topic attracted the attention of scientists and received a large number of submitted manuscripts. Among them, 27 Original Research articles, five Review articles, and two Perspective articles have been published within the Research Topic. The Original Research articles cover most of the topics in CADD, reporting advanced in silico methods in drug discovery, while the Review articles offer a point of view on some computer-driven techniques applied to drug research. Finally, the Perspective articles provide a vision of specific computational approaches with an outlook on the modern era of CADD.

    Identification of novel modulators of protein synthesis and nucleolar biology using high throughput phenotypic screens

    Protein synthesis and ribosome biogenesis are fundamental steps in gene expression and constitute the most energy demanding processes in living cells. Dysregulation of these processes is associated to a variety of human disorders including cancer, metabolic diseases, immunodeficiency, neurological and developmental disorders, and physiological aging. Therapeutic strategies modulating protein synthesis and ribosome biogenesis or nucleolar biology, have proven to be efficient for several of these disorders, and some of them are already used in the clinic, predominantly in the context of cancer. However, the success of these drugs has been limited due to activation of mechanisms of resistance or lack of general effects among different cancer types. Additionally, the application of modulators of protein and ribosome production in other disease contexts is just starting to be explored. This is particularly important for disorders where altered translation control is a hallmark, such as in the case of some neurodegenerative diseases. Moreover, different disorders may require different therapeutic approaches, hence, research in less known disease areas opens possibilities of finding new ways of regulating protein synthesis and ribosome biogenesis, and perhaps new biology. In this thesis we have used high throughput phenotypic screens to discover new modulators of protein synthesis and nucleolar biology. Phenotypic screening allows for the systematic identification of regulators of an organismal feature (phenotype) without having any prior knowledge. In paper I we benefited from novel technologies allowing visualization of changes in protein synthesis to evaluate the effects of medically approved and well-characterized drugs in mRNA translation. Our screen failed to identify small molecules stimulating translation in cancer cells growing in complete media. 
Yet, it seems that translation can only be boosted when the translation machinery of cells is challenged, such as when cells are grown under starvation conditions. Nevertheless, our screen identified known down-regulators of translation, supporting the validity of our approach, and a new translation inhibitor, SKI-II. SKI-II was developed as a sphingosine kinase (SPHK) inhibitor, and this group of compounds has been explored extensively as anticancer drugs. However, in our hands, SKI-II inhibited translation by inducing the integrated stress response (ISR), causing physical damage to the endoplasmic reticulum (ER), which resulted in cell death. The toxicity of SKI-II and its clinically relevant analog ABC294640 was not abrogated when knocking out sphingosine kinases, while it was partially rescued upon inhibition of the ISR. Our work is the first to systematically examine the effect of known drugs on translation in cells and to report cytotoxic properties of SPHK inhibitors that are independent of SPHKs. In paper II we conducted a chemical screen to identify compounds limiting the toxicity of amyotrophic lateral sclerosis (ALS)-related dipeptide repeats (DPRs). ALS is a fatal neurodegenerative disease characterized by loss of upper and lower motor neurons, leading to muscular paralysis and death within 3 to 5 years after diagnosis. The expansion of G4C2 repeats within the first intron of the C9ORF72 gene constitutes the most common cause of ALS and frontotemporal dementia (FTD). Through repeat-associated non-ATG (RAN) translation, these expansions are translated into DPRs, some of which, poly-proline arginine (PR) and poly-glycine arginine (GR), bind to the nucleoli and lead to cell death. Here we conducted a screen to identify compounds reducing the toxicity of twenty-repeat poly-PR peptides (PR20) added exogenously to cells in culture. 
Our screen identified two BET bromodomain inhibitors (Bromosporine-1 and PFI-1) and sodium phenylbutyrate (Na-Phen), currently in clinical trials, as modifiers of PR20 toxicity in different cell lines and in developing zebrafish embryos. Our work shows that BET Bromodomain inhibitors rescue the nucleolar stress induced by PR20 and the known nucleolar stressor Actinomycin D (ActD). To our knowledge, this is the first time that compounds able to protect nucleolar integrity are reported in the literature, and therefore, they might have beneficial effects in diseases associated to nucleolar stress, such as ALS/FTD. Inspired by our results, we conducted four additional screens that are collected in the section preliminary results. Following paper I, we applied the same screening pipeline to identify novel modulators of translation among natural compounds (preliminary results I). Related to paper II, the literature points to two main issues with current modulators of ribosome biogenesis, promiscuity, even among the so-called selective modulators, and heterogeneity in the efficacy of compounds across different cancer types. Regarding the first, the discovery of regulators of ribosome biogenesis has advanced in parallel with the technology allowing their study. Current methods allow better characterization of the activities of these drugs and development of strategies to find more selective modulators, which we reviewed in annex I. Nevertheless, there is a growing need for novel modulators of nucleolar activity, and we benefited from publicly available image datasets to explore the effects of known drugs in the nucleolus (preliminary results II). Also, we conducted a genome-wide CRISPR/Cas9 screen to identify vulnerabilities to nucleolar stressors and systematically interrogate in which genetic backgrounds these drugs are suitable anticancer therapies (preliminary results III). 
Lastly, triggered by the discovery of “nucleolar protectors” in paper II, we conducted a chemical screen to explore novel nucleolar functions of known drugs using the Drug Repurposing Hub library 1 from the Broad Institute (preliminary results IV). Altogether, here we have used high throughput phenotypic screens to discover new modulators of protein synthesis and nucleolar biology relevant for disease contexts, and to uncover new biology linked to these processes