61 research outputs found

    Optimization Algorithms for Chemoinformatics and Material-informatics

    Modeling complex phenomena in chemoinformatics and material-informatics can often be formulated as single-objective or multi-objective optimization problems (SOOPs or MOOPs). For example, the design of new drugs or new materials is inherently a MOOP, since drugs and materials require the simultaneous optimization of multiple parameters.
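To make the MOOP idea concrete, the sketch below filters a set of candidates down to its Pareto front, i.e., the candidates that no other candidate beats on every objective at once. The candidate values, the two objectives (toxicity and negated potency), and the function names are hypothetical illustrations and are not taken from the listed work.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of the non-dominated points."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical candidates: (predicted toxicity, negated potency).
# Both values are minimized, so higher potency means a more negative number.
candidates = [(0.2, -7.5), (0.5, -9.0), (0.6, -7.0), (0.1, -6.0), (0.5, -8.0)]
front = pareto_front(candidates)
print(front)
```

Candidates 2 and 4 drop out because another candidate is at least as good on both objectives; the remaining ones represent different trade-offs between the two.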

    Drug Screening Identifies Sigma-1-Receptor as a Target for the Therapy of VWM Leukodystrophy

    Vanishing white matter (VWM) disease is an autosomal genetic leukodystrophy caused by mutations in subunits of eukaryotic translation initiation factor 2B (eIF2B). The clinical symptoms exhibit progressive loss of white matter in both hemispheres of the brain, accompanied by deterioration of motor functions, neurological deficits, and early death. To date, there is no treatment for VWM disease. The aim of this work was to expedite rational development of a therapeutic opportunity. Our approach was to design a computer-aided strategy for an efficient and reliable screening of drug-like molecules; and to use primary cultures of fibroblasts isolated from the Eif2b5R132H/R132H VWM mouse model for screening. The abnormal mitochondria content phenotype of the mutant cells was chosen as a read-out for a simple cell-based fluorescent assay to assess the effect of the tested compounds. We obtained a hit rate of 0.04% (20 hits out of 50,000 compounds from the selected library). All primary hits decreased mitochondria content and brought it closer to WT levels. Structural similarities between our primary hits and other compounds with known targets allowed the identification of three putative cellular pathways/targets: 11β-hydroxysteroid dehydrogenase type 1, Sonic hedgehog (Shh), and Sigma-1-Receptor (S1R). In addition to initial experimental indication of Shh pathway impairment in VWM mouse brains, the current study provides evidence that S1R is a relevant target for pharmaceutical intervention for potential treatment of the disease. Specifically, we found lower expression level of S1R protein in fibroblasts, astrocytes, and whole brains isolated from Eif2b5R132H/R132H compared to WT mice, and confirmed that one of the hits is a direct binder of S1R, acting as agonist. Furthermore, we provide evidence that treatment of mutant mouse fibroblasts and astrocytes with various S1R agonists corrects the functional impairments of their mitochondria and prevents their need to increase their mitochondria content for compensation purposes. Moreover, S1R activation enhances the survival rate of mutant cells under ER stress conditions, bringing it to WT levels. This study marks S1R as a target for drug development toward treatment of VWM disease. Moreover, it further establishes the important connection between white matter well-being and S1R-mediated proper mitochondria/ER function.

    A Comparison between Enrichment Optimization Algorithm (EOA)-Based and Docking-Based Virtual Screening

    Virtual screening (VS) is a well-established method in the initial stages of many drug and material design projects. VS is typically performed using structure-based approaches such as molecular docking, or various ligand-based approaches. Most docking tools were designed to be as global as possible, and consequently only require knowledge of the 3D structure of the biotarget. In contrast, many ligand-based approaches (e.g., 3D-QSAR and pharmacophore) require prior development of project-specific predictive models. Depending on the type of model (e.g., classification or regression), predictive ability is typically evaluated using metrics of performance on either the training set (e.g., Q²_CV) or the test set (e.g., specificity, selectivity or Q²_F1/F2/F3). However, none of these metrics were developed with VS in mind, and consequently, their ability to reliably assess the performance of a model in the context of VS is at best limited. With this in mind, we have recently reported the development of the enrichment optimization algorithm (EOA). EOA derives QSAR models in the form of multiple linear regression (MLR) equations for VS by optimizing an enrichment-based metric in the space of the descriptors. Here we present an improved version of the algorithm which better handles active compounds and which also takes into account information on inactive (either known inactive or decoy) compounds. We compared the improved EOA in small-scale VS experiments with three common docking tools, namely, Glide-SP, GOLD and AutoDock Vina, employing five molecular targets (acetylcholinesterase, human immunodeficiency virus type 1 protease, MAP kinase p38 alpha, urokinase-type plasminogen activator, and trypsin I). We found that EOA consistently outperformed all docking tools in terms of the area under the ROC curve (AUC) and EF1% metrics that measured the overall and initial success of the VS process, respectively. This was the case when the docking metrics were calculated based on a consensus approach and when they were calculated based on two different sets of single crystal structures. Finally, we propose that EOA could be combined with molecular docking to derive target-specific scoring functions.
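The two VS metrics named in this abstract can be sketched as follows. This is a minimal illustration of the usual textbook definitions of the enrichment factor and the ROC AUC, not the EOA implementation; the toy scores and labels are made up, and a 20% cutoff stands in for the 1% used by EF1% because the toy set has only ten compounds.

```python
from typing import Sequence


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      fraction: float = 0.01) -> float:
    """Active rate in the top-scoring fraction divided by the active rate
    in the whole library (labels: 1 = active, 0 = inactive or decoy)."""
    n = len(scores)
    n_top = max(1, int(round(fraction * n)))
    # Rank compounds from best to worst score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_top = sum(labels[i] for i in order[:n_top])
    return (hits_top / n_top) / (sum(labels) / n)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random active outscores a random inactive
    (ties count as half), which equals the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up screening result: higher score = predicted more active.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

ef = enrichment_factor(scores, labels, fraction=0.2)
auc = roc_auc(scores, labels)
print(f"EF20% = {ef:.2f}, AUC = {auc:.2f}")
```

The enrichment factor only looks at the top of the ranked list (initial success), whereas the AUC summarizes the ranking of the whole library (overall success), which is why the abstract reports both.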

    Materials Informatics


    Anomeric Free Energy of d


    RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells

    An important aspect of chemoinformatics and material-informatics is the usage of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning datasets from noise. RANSAC could be used as a “one stop shop” algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development and predictions for test set samples using applicability domain. For “future” predictions (i.e., for samples not included in the original test set) RANSAC provides a statistical estimate for the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
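The core RANSAC loop (fit a model to a small random sample, count the points that agree with it, keep the largest consensus set, then refit on that set) can be sketched as below. This is the generic algorithm applied to a one-descriptor line fit; it is not the QSAR workflow of the listed paper, and the data, threshold, iteration count, and function names are assumptions chosen for illustration.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def fit_line(pts: List[Point]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    a = sxy / sxx
    return a, my - a * mx


def ransac_line(points: List[Point], n_iter: int = 100,
                threshold: float = 0.5, seed: int = 0):
    """Return (slope, intercept, inliers) of the largest consensus set."""
    rng = random.Random(seed)
    best_inliers: List[Point] = []
    for _ in range(n_iter):
        sample = rng.sample(points, 2)       # minimal sample for a line
        if sample[0][0] == sample[1][0]:     # vertical pair: cannot fit
            continue
        a, b = fit_line(sample)
        # Points whose residual is within the threshold support this model.
        inliers = [p for p in points if abs(p[1] - (a * p[0] + b)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    a, b = fit_line(best_inliers)            # final refit on the consensus set
    return a, b, best_inliers


# Made-up data: ten points on y = 2x + 1 plus two gross outliers.
data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
data += [(2.5, 30.0), (7.5, -20.0)]

slope, intercept, inliers = ransac_line(data)
print(slope, intercept, len(inliers))
```

Points left outside the consensus set are the ones treated as outliers, which is how a single procedure can cover both noise removal and model fitting.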

    Optimization of Molecular Representativeness

    Representative subsets selected from within larger data sets are useful in many chemoinformatics applications including the design of information-rich compound libraries, the selection of compounds for biological evaluation, and the development of reliable quantitative structure–activity relationship (QSAR) models. Such subsets can overcome many of the problems typical of diverse subsets, most notably the tendency of the latter to focus on outliers. Yet only a few algorithms for the selection of representative subsets have been reported in the literature. Here we report on the development of two algorithms for the selection of representative subsets from within parent data sets based on the optimization of a newly devised representativeness function either alone or simultaneously with the MaxMin function. The performances of the new algorithms were evaluated using several measures representing their ability to produce (1) subsets which are, on average, close to data set compounds; (2) subsets which, on average, span the same space as spanned by the entire data set; (3) subsets mirroring the distribution of biological indications in a parent data set; and (4) test sets which are well predicted by qualitative QSAR models built on data set compounds. We demonstrate that for three data sets (containing biological indication data, logBBB permeation data, and Plasmodium falciparum inhibition data), subsets obtained using the new algorithms are more representative than subsets obtained by hierarchical clustering, k-means clustering, or the MaxMin optimization in at least three of these measures.
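The tendency of diversity-driven selection to pick outliers can be seen in a small sketch of greedy MaxMin selection. The abstract does not define the new representativeness function, so the closeness measure below (mean distance from each data set point to its nearest subset member) is only an assumed stand-in in the spirit of measure (1); the 2D points and function names are hypothetical.

```python
import math
from typing import List, Sequence

Vec = Sequence[float]


def maxmin_select(points: List[Vec], k: int, start: int = 0) -> List[int]:
    """Greedy MaxMin: repeatedly add the point whose distance to its
    nearest already-selected point is largest."""
    selected = [start]
    while len(selected) < k:
        remaining = (i for i in range(len(points)) if i not in selected)
        best = max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[j]) for j in selected),
        )
        selected.append(best)
    return selected


def mean_nearest_distance(points: List[Vec], subset: List[int]) -> float:
    """Assumed closeness measure: average distance from every point to its
    nearest subset member (lower means the subset sits closer to the data)."""
    return sum(
        min(math.dist(p, points[j]) for j in subset) for p in points
    ) / len(points)


# Hypothetical descriptor space: a tight cluster plus one distant outlier.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.1, 0.1), (10.0, 10.0)]

picked = maxmin_select(points, k=2)
print(picked, mean_nearest_distance(points, picked))
```

With only two picks, MaxMin immediately spends one of them on the lone outlier at index 4, which illustrates why a separate representativeness criterion can be useful alongside it.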
