15 research outputs found

    The influence of the inactives subset generation on the performance of machine learning methods

    Background: A growing popularity of machine learning methods application in virtual screening, in both classification and regression tasks, can be observed in the past few years. However, their effectiveness is strongly dependent on many different factors. Results: In this study, the influence of the way of forming the set of inactives on the classification process was examined: random and diverse selection from the ZINC database, MDDR database and libraries generated according to the DUD methodology. All learning methods were tested in two modes: using one test set, the same for each method of inactive molecules generation and using test sets with inactives prepared in an analogous way as for training. The experiments were carried out for 5 different protein targets, 3 fingerprints for molecules representation and 7 classification algorithms with varying parameters. It appeared that the process of inactive set formation had a substantial impact on the machine learning methods performance. Conclusions: The level of chemical space limitation determined the ability of tested classifiers to select potentially active molecules in virtual screening tasks, as for example DUDs (widely applied in docking experiments) did not provide proper selection of active molecules from databases with diverse structures. The study clearly showed that inactive compounds forming training set should be representative to the highest possible extent for libraries that undergo screening
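
The abstract above contrasts random and diverse selection of inactives but does not say which diversity algorithm was used. The sketch below is only an illustration of the difference between the two modes: it swaps in a greedy MaxMin picker over Tanimoto distance, a common choice for diverse subset selection. The fingerprint representation (a frozenset of on-bit indices) and the function names `random_pick` and `maxmin_pick` are assumptions, not taken from the study.

```python
import random


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def random_pick(pool, k, seed=0):
    """Random selection: draw k fingerprints uniformly without replacement."""
    return random.Random(seed).sample(pool, k)


def maxmin_pick(pool, k, seed=0):
    """Diverse selection (hypothetical stand-in): greedy MaxMin picking.

    Start from a random compound, then repeatedly add the compound whose
    nearest already-picked neighbour is farthest away (distance = 1 - Tanimoto).
    """
    if not 0 < k <= len(pool):
        raise ValueError("k must be between 1 and len(pool)")
    rng = random.Random(seed)
    first = rng.randrange(len(pool))
    picked = [first]
    # Distance from every compound to its nearest picked compound.
    min_dist = [1.0 - tanimoto(pool[first], fp) for fp in pool]
    min_dist[first] = -1.0  # mark as picked so it is never chosen again
    while len(picked) < k:
        nxt = max(range(len(pool)), key=min_dist.__getitem__)
        picked.append(nxt)
        min_dist[nxt] = -1.0
        for i, fp in enumerate(pool):
            d = 1.0 - tanimoto(pool[nxt], fp)
            if d < min_dist[i]:
                min_dist[i] = d
    return [pool[i] for i in picked]


if __name__ == "__main__":
    pool = [
        frozenset({1, 2, 3}),
        frozenset({1, 2, 4}),
        frozenset({7, 8, 9}),
        frozenset({1, 2, 3, 4}),
    ]
    print("random :", random_pick(pool, 2))
    print("diverse:", maxmin_pick(pool, 2))
```

With a pool like the one in the example, the diverse picker always includes the structurally isolated fingerprint, whereas random selection may return two near-duplicates. This is the kind of difference in chemical-space coverage that the abstract reports as affecting classifier performance.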

    Virtual target screening to rapidly identify potential protein targets of natural products in drug discovery

    Inherent biological viability and diversity of natural products make them a potentially rich source for new therapeutics. However, identification of bioactive compounds with desired therapeutic effects and identification of their protein targets is a laborious, expensive process. Extracts from organism samples may show desired activity in phenotypic assays but specific bioactive compounds must be isolated through further separation methods and protein targets must be identified by more specific phenotypic and in vitro experimental assays. Still, questions remain as to whether all relevant protein targets for a compound have been identified. The desire is to understand breadth of purposing for the compound to maximize its use and intellectual property, and to avoid further development of compounds with insurmountable adverse effects. Previously we developed a Virtual Target Screening system that computationally screens one or more compounds against a collection of virtual protein structures. By scoring each compound-protein interaction, we can compare against averaged scores of synthetic drug-like compounds to determine if a particular protein would be a potential target of a compound of interest. Here we provide examples of natural products screened through our system as we assess advantages and shortcomings of our current system in regards to natural product drug discovery

    Inhibitors of Streptococcus pneumoniae Surface Endonuclease EndA Discovered by High-Throughput Screening Using a PicoGreen Fluorescence Assay

    The human commensal pathogen, Streptococcus pneumoniae, expresses a number of virulence factors that promote serious pneumococcal diseases, resulting in significant morbidity and mortality worldwide. These virulence factors may give S. pneumoniae the capacity to escape immune defenses, resist antimicrobial agents, or a combination of both. Virulence factors also present possible points of therapeutic intervention. The activities of the surface endonuclease, EndA, allow S. pneumoniae to establish invasive pneumococcal infection. EndA’s role in DNA uptake during transformation contributes to gene transfer and genetic diversification. Moreover, EndA’s nuclease activity degrades the DNA backbone of neutrophil extracellular traps (NETs), allowing pneumococcus to escape host immune responses. Given its potential impact on pneumococcal pathogenicity, EndA is an attractive target for novel antimicrobial therapy. Herein, we describe the development of a high-throughput screening assay for the discovery of nuclease inhibitors. Nuclease-mediated digestion of double-stranded DNA was assessed using fluorescence intensity changes of the DNA dye ligand, PicoGreen. Under optimized conditions, the assay provided robust and reproducible activity data (Z'=0.87) and was used to screen 4727 small molecules against an imidazole-rescued variant of EndA. In total, 10 small molecules were confirmed as novel EndA inhibitors that may have utility as research tools for understanding pneumococcal pathogenesis, and ultimately drug discovery
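
The Z' value quoted above is the standard assay-quality statistic for high-throughput screens, Z' = 1 - 3(σ_pos + σ_neg) / |μ_pos - μ_neg|, computed from positive- and negative-control wells. A minimal sketch of that calculation follows. The control readings are made-up illustrative numbers, not data from the study, and the function name `z_prime` is hypothetical.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z'-factor from control-well signals.

    Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|
    Uses the sample standard deviation of each control group.
    """
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation


if __name__ == "__main__":
    # Illustrative fluorescence readings (arbitrary units), not real data.
    undigested_dna = [100, 102, 98, 101, 99]  # high-signal controls
    digested_dna = [10, 11, 9, 10, 10]        # low-signal controls
    print(f"Z' = {z_prime(undigested_dna, digested_dna):.2f}")
```

Values above roughly 0.5 are conventionally read as a wide separation between the control distributions, which is consistent with the abstract describing the assay at Z' = 0.87 as robust.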

    From Knowledgebases to Toxicity Prediction and Promiscuity Assessment

    Polypharmacology marked a paradigm shift in drug discovery from the traditional ‘one drug, one target’ approach to a multi-target perspective, indicating that highly effective drugs favorably modulate multiple biological targets. This ability of drugs to show activity towards many targets is referred to as promiscuity, an essential phenomenon that may as well lead to undesired side-effects. While activity at therapeutic targets provides desired biological response, toxicity often results from non-specific modulation of off-targets. Safety, efficacy and pharmacokinetics have been the primary concerns behind the failure of a majority of candidate drugs. Computer-based (in silico) models that can predict the pharmacological and toxicological profiles complement the ongoing efforts to lower the high attrition rates. High-confidence bioactivity data is a prerequisite for the development of robust in silico models. Additionally, data quality has been a key concern when integrating data from publicly-accessible bioactivity databases. A majority of the bioactivity data originates from high-throughput screening campaigns and medicinal chemistry literature. However, large numbers of screening hits are considered false-positives due to a number of reasons. In stark contrast, many compounds do not demonstrate biological activity despite being tested in hundreds of assays. This thesis work employs cheminformatics approaches to contribute to the aforementioned diverse, yet highly related, aspects that are crucial in rationalizing and expediting drug discovery. Knowledgebase resources of approved and withdrawn drugs were established and enriched with information integrated from multiple databases. These resources are not only useful in small molecule discovery and optimization, but also in the elucidation of mechanisms of action and off-target effects. 
In silico models were developed to predict the effects of small molecules on nuclear receptor and stress response pathways and human Ether-à-go-go-Related Gene encoded potassium channel. Chemical similarity and machine-learning based methods were evaluated while highlighting the challenges involved in the development of robust models using public domain bioactivity data. Furthermore, the true promiscuity of the potentially frequent hitter compounds was identified and their mechanisms of action were explored at the molecular level by investigating target-ligand complexes. Finally, the chemical and biological spaces of the extensively tested, yet inactive, compounds were investigated to reconfirm their potential to be promising candidates.
German abstract (translated): Polypharmacology describes a paradigm shift from "one drug, one target" to "one drug, many targets" and at the same time shows that highly effective drugs can exert their full effect only by interacting with several targets. The biological activity of a drug is directly associated with its side effects, which can be explained by its interaction with therapeutic targets and off-targets (promiscuity). An imbalance in these interactions often results in insufficient efficacy, toxicity or unfavorable pharmacokinetics, which accounts for the failure of many potential drug candidates in preclinical and clinical development. Early prediction of the pharmacological and toxicological profile from the chemical structure using computer-based (in silico) models can help to improve the drug development process. Reliable bioactivity data are a prerequisite for successful prediction. However, data quality is often a central problem in data integration. 
The cause is the use of different bioassays and readouts, whose data are largely obtained from primary and confirmatory bioassays. While a large proportion of hits from primary assays are classified as false positives, some substances show no biological activity although they have been tested extensively in both assay types ("extensively assayed compounds"). In this work, various cheminformatics methods were developed and applied to address these problems, to point out possible solutions and ultimately to accelerate drug discovery. To this end, non-redundant, manually validated knowledgebases of approved and withdrawn drugs were created and enriched with further information in order to advance the discovery and optimization of small organic molecules. A decisive tool here is the elucidation of their mechanisms of action and off-target interactions. For the further characterization of side effects, the focus was placed on nuclear receptors, pathways involving stress receptors, and the hERG channel, which were simulated with in silico models. These models were built using an integrative approach combining state-of-the-art algorithms such as similarity comparisons and machine learning. To ensure high prediction quality, explicit attention was paid to data quality and chemical diversity when evaluating the data sets. The in silico models were further extended by taking a closer look at substructure filters in order to distinguish true mechanisms of action from nonspecific binding behavior (false-positive substances). 
Finally, the chemical and biological space of extensively tested yet inactive small organic molecules ("extensively assayed compounds") was examined and compared with currently approved drugs to confirm their potential as promising candidates

    Development and Characterization of Novel Mcl-1 Inhibitors for Treatment of Cancer.

    Myeloid cell leukemia-1 (Mcl-1) is a potent anti-apoptotic protein, member of the anti-apoptotic Bcl-2 family. Overexpression of Mcl-1 is associated with high tumor grade, resistance to chemotherapy, and poor prognosis in many types of cancers. Thus, Mcl-1 is emerging as a critical survival factor in a broad range of human cancers and represents an attractive molecular target for the development of a new class of cancer therapy. Applying an integrated screening strategy through combining high throughput and virtual screenings, multiple hit compounds with structural diversity were validated as Mcl-1 inhibitors using biochemical and biophysical methods. Based on the confirmed hit molecule and analyzing structure activity relationship (SAR) together with computational docking predicted binding poses supported by HSQC NMR studies, we have designed and optimized a novel class of selective small-molecule inhibitors of Mcl-1 using a 2,4,5 substituted benzoic acid as a scaffold. Several co-crystal structures of this class of inhibitors in complex with Mcl-1 have provided a basis for their further optimization, which ultimately led to the discovery of nanomolar potent and selective ligands that bind to the BH3 hydrophobic groove of the Mcl-1 protein. Mechanistic studies performed in genetically engineered cell lines revealed that our inhibitors have on-target activity and induce Bax/Bak dependent apoptosis; selectively antagonizing Mcl-1 function leading to the induction of the hallmarks of apoptosis. Using functional BH3 profiling assay, we identified heterogeneous dependency on Bcl-2 family members for survival in hematologic malignancies, as well as in solid human cancers. The mitochondrial response to selective Mcl-1 BH3 peptides (Noxa and MS1) predicted the in vitro sensitivity to Mcl-1 inhibitors of several cell lines found to be Mcl-1 dependent, including multiple myeloma cell line H929. 
483LM, one of the most potent Mcl-1 inhibitors developed, inhibited cell growth and induced mechanism-based apoptotic cell death in the H929 cells. Intraperitoneal treatment of the H929 cancer xenograft model with 483LM led to significant dose-dependent tumor regression. Collectively, our data demonstrate that the new class of Mcl-1 inhibitors has promising in vitro and in vivo efficacy, warranting further development toward clinical use in the treatment of human cancers.
PhD, Medicinal Chemistry, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/135852/1/asmady_1.pdf
asmady_1.pdf: restricted to UM users only

    Computational Analysis of Structure-Activity Relationships : From Prediction to Visualization Methods

    Understanding how structural modifications affect the biological activity of small molecules is one of the central themes in medicinal chemistry. By no means is structure-activity relationship (SAR) analysis a priori dependent on computational methods. However, as molecular data sets grow in size, we quickly approach our limits to access and compare structures and associated biological properties so that computational data processing and analysis often become essential. Here, different types of approaches of varying complexity for the analysis of SAR information are presented, which can be applied in the context of screening and chemical optimization projects. The first part of this thesis is dedicated to machine-learning strategies that aim at de novo ligand prediction and the preferential detection of potent hits in virtual screening. High emphasis is put on benchmarking of different strategies and a thorough evaluation of their utility in practical applications. However, an often claimed disadvantage of these prediction methods is their "black box" character because they do not necessarily reveal which structural features are associated with biological activity. Therefore, these methods are complemented by more descriptive SAR analysis approaches showing a higher degree of interpretability. Concepts from information theory are adapted to identify activity-relevant structure-derived descriptors. Furthermore, compound data mining methods exploring prespecified properties of available bioactive compounds on a large scale are designed to systematically relate molecular transformations to activity changes. Finally, these approaches are complemented by graphical methods that primarily help to access and visualize SAR data in congeneric series of compounds and allow the formulation of intuitive SAR rules applicable to the design of new compounds. The compendium of SAR analysis tools introduced in this thesis investigates SARs from different perspectives

    Development of an Advanced Molecular Profiling Pipeline for Human Population Screening

    The interaction between a human’s genes and their environment is dynamic, producing phenotypes that are subject to variance among individuals and across time. Metabolic interpretation of phenotypes, including the elucidation of underlying biochemical causes and effects for physiological or pathological processes, allows for the potential discovery of biomarkers and diagnostics which are important in understanding human health and disease. The study of large cohorts has been pursued in hopes of gaining sufficient statistical power to observe subtle biochemical processes relevant to human phenotypes. In order to minimise the effects of analytical variance in metabolic profiling and maximise extractable information, it is necessary to develop a refined analytical approach to large scale metabolic profiling that allows for efficient and high quality collection of data, facilitating analysis on a scale appropriate for molecular epidemiology applications. The analytical methods used for the multidimensional separation and detection of metabolic content from complex biofluids must be made fit for this purpose, deriving data with unprecedented reproducibility for direct comparison of metabolic profiles across thousands of individuals. Furthermore, computational methods must be established for collating this data into a form that is suitable for analysis and interpretation without compromising the quality achieved in the raw data. These developments together constitute a pipeline for large scale analysis, the components of which are explored and refined herein with a common thread of improving laboratory efficiency and measurement precision. Complementary chromatographic methods are developed and implemented in the separation of human urine samples, and further mated to separation and detection by mass spectrometry to provide information rich metabolic maps. 
This system is optimised to derive precision from sustained analysis, with emphasis on minimisation of sample batching thereby allowing the development of metabolite collation tools that leverage the chromatographic reproducibility. Finally, the challenge of metabolite identification in molecular profiling is conceptually addressed in a manner that does not preclude the further reinvention of the analytical approaches established within this thesis. In summary, the thesis offers a novel and practical analytical pipeline suitable for achieving high quality population phenotyping and metabolome wide association studies.
Open Access

    Application of Computational Methods for the Design of New Potential Therapeutic Agents

    Computer-aided drug discovery (CADD) represents a very useful tool to search for potential drug candidates and plays a strategic role in the discovery of new potential therapeutic agents for both pharmaceutical companies and academic research groups. Nevertheless, the modelling of biological systems still represents a challenge for computational chemists, and, at present, a single computational method able to face such challenge is not available. Computational tools are therefore evolving in the direction of combining molecular-mechanic (MM), molecular dynamics (MD), and quantum-mechanical (QM) approaches in order to achieve an overall better simulation of the actual molecular behaviour. In addition, many sampling methods have been developed and applied for the characterisation and comparison of the collective motions of protein structures related to the dynamics of proteins, protein folding and ligand-protein docking simulations. This prompted us, as computational medicinal chemists, to develop various CADD approaches, depending on the specific case under study, integrating theoretical and experimental data. In particular, the research activity carried out during the three years of my PhD led to: i) the development of three-dimensional (3-D) pharmacophore models for the analysis of 3-D structure-activity relationships (SARs) of bioactive compounds, ii) the identification of new molecular targets, iii) the simulation of large-scale protein conformational changes, iv) the simulation of protein/protein and ligand/protein interactions, and v) the design of new bioactive compounds. Computational studies were always performed in the frame of multi-disciplinary projects guided by a unique research strategy, which involved several international and national research groups, and were carried out by integrating and validating our computational studies with the experimental data coming from the other researchers involved in the various projects. 
The results made it possible to: i) identify a new class of anticancer agents active against paclitaxel-resistant cancer cells, ii) provide important information on the mechanism of action of cationic porphyrins, a novel class of proteasome conformational regulators with great potential as “lead” pharmacophores, and iii) optimise the cellular pharmacokinetic and pharmacodynamic properties of a new series of antimalarial agents. In addition, I spent an eight-month training period abroad at the Institute of Research in Biomedicine (IRB) in Barcelona, under the supervision of Prof. Modesto Orozco, during which I extended my computational background by learning and then performing metadynamics and MD simulations to investigate the open/close conformational transition of the human 20S proteasome.