64 research outputs found

    CheS-Mapper - Chemical Space Mapping and Visualization in 3D

    Analyzing chemical datasets is a challenging task for scientific researchers in the field of chemoinformatics. It is important, yet difficult, to understand the relationship between the structure of chemical compounds, their physico-chemical properties, and their biological or toxic effects. In this respect, visualization tools can help to better comprehend the underlying correlations. Our recently developed 3D molecular viewer CheS-Mapper (Chemical Space Mapper) divides large datasets into clusters of similar compounds and subsequently arranges them in 3D space, such that their spatial proximity reflects their similarity. The user can indirectly determine similarity by selecting which features to employ in the process. The tool can use and calculate different kinds of features, such as structural fragments and quantitative chemical descriptors. These features can be highlighted within CheS-Mapper, which helps the chemist to better understand patterns and regularities and to relate the observations to established scientific knowledge. Finally, the tool can also be used to select and export specific subsets of a given dataset for further analysis.

    Data analysis and navigation in high-dimensional chemical and biological spaces

    The goal of this master's thesis is to develop and validate a visual data-mining approach suitable for the screening of chemicals in the context of REACH [Registration, Evaluation, Authorization and Restriction of Chemicals]. The proposed approach will facilitate the development and validation of non-testing methods via the exploration of environmental endpoints and their relationship with the chemical structure and physicochemical properties of chemicals. The use of an interactive chemical space data exploration tool using 3D visualization and navigation will enrich the information available with additional variables such as the size, texture and color of the objects of the scene (compounds). The features that distinguish this approach and make it unique are (i) the integration of multiple data sources, allowing complementary information on the studied compounds to be retrieved in real time, (ii) the integration of several algorithms for data analysis (dimensionality reduction, generation of composite variables and clustering) and (iii) direct user interaction with the data through the virtual navigation mechanism. All this is achieved without the need for specialized hardware or the use of specific devices and high-cost virtual reality and mixed reality

    Web-based 3D-visualization of the DrugBank chemical space

    BACKGROUND: Similarly to the periodic table for elements, chemical space offers an organizing principle for representing the diversity of organic molecules, usually in the form of multi-dimensional property spaces that are subjected to dimensionality reduction methods to obtain 3D-spaces or 2D-maps suitable for visual inspection. Unfortunately, tools to look at chemical space on the internet are currently very limited. RESULTS: Herein we present webDrugCS, a web application freely available at www.gdb.unibe.ch to visualize DrugBank (www.drugbank.ca, containing over 6000 investigational and approved drugs) in five different property spaces. WebDrugCS displays 3D-clouds of color-coded grid points representing molecules, whose structural formula is displayed on mouse over with an option to link to the corresponding molecule page at the DrugBank website. The 3D-clouds are obtained by principal component analysis of high dimensional property spaces describing constitution and topology (42D molecular quantum numbers MQN), structural features (34D SMILES fingerprint SMIfp), molecular shape (20D atom pair fingerprint APfp), pharmacophores (55D atom category extended atom pair fingerprint Xfp) and substructures (1024D binary substructure fingerprint Sfp). User defined molecules can be uploaded as SMILES lists and displayed together with DrugBank. In contrast to 2D-maps where many compounds fold onto each other, these 3D-spaces have a comparable resolution to their parent high-dimensional chemical space. CONCLUSION: To the best of our knowledge webDrugCS is the first publicly available web tool for interactive visualization and exploration of the DrugBank chemical space in 3D. WebDrugCS works on computers, tablets and phones, and facilitates the visual exploration of DrugBank to rapidly learn about the structural diversity of small molecule drugs. Graphical abstract: webDrugCS visualization of DrugBank projected in 3D MQN space color-coded by ring count, with pointer showing the drug 5-fluorouracil.
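The abstract above describes obtaining 3D point clouds by principal component analysis of high-dimensional fingerprints. As a rough illustration only, the following sketch projects a handful of made-up 6-bit fingerprints onto their first three principal components using power iteration with deflation. The fingerprint values, the function names (`center`, `covariance`, `top_eigenvector`, `pca_project`) and the choice of power iteration are all assumptions made for this example; none of it is taken from webDrugCS.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvector(matrix, iterations=500, seed=0):
    """Approximate the dominant eigenpair of a symmetric matrix by power iteration."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # nothing left to extract (zero matrix)
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue estimate for unit v.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v


def pca_project(rows, n_components=3):
    """Project rows onto their leading principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for k in range(n_components):
        eigenvalue, v = top_eigenvector(cov, seed=k)
        components.append(v)
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(r[j] * c[j] for j in range(d)) for c in components]
            for r in centered]


# Hypothetical binary fingerprints: 6 molecules x 6 bits (illustrative values only).
fingerprints = [
    [1, 0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 1],
]
coords = pca_project(fingerprints, n_components=3)
for point in coords:
    print([round(x, 3) for x in point])
```

Each molecule ends up as one (x, y, z) point that could be handed to a 3D plotting layer. For fingerprints with hundreds or thousands of dimensions, a linear algebra library would normally replace the hand-written power iteration.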

    ChemVA: Interactive visual analysis of chemical compound similarity in virtual screening

    In the modern drug discovery process, medicinal chemists deal with the complexity of analysis of large ensembles of candidate molecules. Computational tools, such as dimensionality reduction (DR) and classification, are commonly used to efficiently process the multidimensional space of features. These underlying calculations often hinder interpretability of results and prevent experts from assessing the impact of individual molecular features on the resulting representations. To provide a solution for scrutinizing such complex data, we introduce ChemVA, an interactive application for the visual exploration of large molecular ensembles and their features. Our tool consists of multiple coordinated views: Hexagonal view, Detail view, 3D view, Table view, and a newly proposed Difference view designed for the comparison of DR projections. These views display DR projections combined with biological activity, selected molecular features, and confidence scores for each of these projections. This conjunction of views allows the user to drill down through the dataset and to efficiently select candidate compounds. Our approach was evaluated on two case studies of finding structurally similar ligands with similar binding affinity to a target protein, as well as on an external qualitative evaluation. The results suggest that our system allows effective visual inspection and comparison of different high-dimensional molecular representations. Furthermore, ChemVA assists in the identification of candidate compounds while providing information on the certainty behind different molecular representations.
    Fil: Sabando, María Virginia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
    Fil: Ulbrich, Pavol. Masaryk University. Faculty of Sciences; República Checa
    Fil: Selzer, Matias Nicolas. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Laboratorio de Ciencias de la Imágenes; Argentina
    Fil: Byska, Jan. Masaryk University. Faculty of Sciences; República Checa
    Fil: Mican, Jan. Masaryk University. Faculty of Sciences; República Checa
    Fil: Ponzoni, Ignacio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
    Fil: Soto, Axel Juan. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
    Fil: Ganuza, María Luján. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Laboratorio de Ciencias de la Imágenes; Argentina
    Fil: Kozlikova, Barbora. Masaryk University. Faculty of Sciences; República Chec

    Visual analytics in cheminformatics: user-supervised descriptor selection for QSAR methods

    The design of QSAR/QSPR models is a challenging problem, where the selection of the most relevant descriptors constitutes a key step of the process. Several feature selection methods that address this step concentrate on statistical associations among descriptors and target properties, whereas chemical knowledge is left out of the analysis. For this reason, the interpretability and generality of the QSAR/QSPR models obtained by these feature selection methods are drastically affected. Therefore, an approach for integrating the domain expert's knowledge into the selection process is needed to increase the confidence in the final set of descriptors.
    Fil: Martínez, María Jimena. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Laboratorio de Investigación y Desarrollo en Computación Científica; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Fil: Ponzoni, Ignacio. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Laboratorio de Investigación y Desarrollo en Computación Científica; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Fil: Diaz, Monica Fatima. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Planta Piloto de Ingeniería Química. Universidad Nacional del Sur. Planta Piloto de Ingeniería Química; Argentina
    Fil: Vazquez, Gustavo Esteban. Universidad Católica del Uruguay. Facultad de Ingeniería y Tecnologías; Uruguay. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Fil: Soto, Axel Juan. Dalhousie University. Faculty of Computer Science; Canadá. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentin

    Selectivity profiling of BCRP versus P-gp inhibition: from automated collection of polypharmacology data to multi-label learning

    Additional file 1. The list of descriptor names, instructions on how to run the Python script, the distribution plots for the important descriptors, the heat map of activities for the dense dataset, the structure of the over-represented scaffolds in the sparse dataset, a 2D representation of a PCA run on Morgan fingerprints (ECFP-like) for both dense and sparse datasets, and the structures of the 9 misclassified compounds.

    An in-silico investigation of Morita-Baylis-Hillman accessible heterocyclic analogues for applications as novel HIV-1 C protease inhibitors

    Cheminformatic approaches have been employed to optimize the bis-coumarin scaffold identified by Onywera et al. (2012) as a potential hit against the HIV-1 protease. The Open Babel library of commands was used to access functions that were incorporated into a Markov chain recursive program that generated 17750 analogues of the bis-coumarin scaffold. The Morita-Baylis-Hillman accessible heterocycles were used to introduce structural diversity within the virtual library. In silico high-throughput virtual screening using AutoDock Vina was used to rapidly screen the virtual library ligand set against 61 protease models built by Onywera et al. (2012). CheS-Mapper computed a principal component analysis of the compounds based on 13 selected chemical descriptors. The compounds were plotted along the principal components within a 3-dimensional chemical space in order to inspect the diversity of the virtual library. The physicochemical properties and binding affinities were used to identify the top 3 performing ligands. ACPYPE was used to inspect the constitutional properties and to eliminate virtual compounds that possessed open valences. Chromene-based ligand 805 and ligand 6610 were selected as the lead candidates from the high-throughput virtual screening procedure we employed. Molecular dynamics simulations of the lead candidates, performed for 5 ns, allowed the stability of the ligand-protein complexes with protease model 305152 to be assessed. The free energy of binding of the leads with protease model 305152 was computed over the first 50 ps of simulation using the molecular mechanics Poisson-Boltzmann method. Analysis of structural features and energy profiles from molecular dynamics simulations of the protein-ligand complexes indicated that although ligand 805 had a weaker binding affinity in terms of docking, it outperformed ligand 6610 in terms of complex stability and free energy of binding. Medicinal chemistry approaches will be used to optimize the lead candidates before their analogues are synthesized and assayed for in vivo protease activity.

    Challenges in working towards an internal Threshold of Toxicological Concern (iTTC) for use in the safety assessment of cosmetics: Discussions from the Cosmetics Europe iTTC Working Group workshop

    The Threshold of Toxicological Concern (TTC) is an important risk assessment tool which establishes acceptable low-level exposure values to be applied to chemicals with limited toxicological data. One of the logical next steps in the continued evolution of TTC is to develop this concept further so that it is representative of internal exposures (TTC based on plasma concentration). An internal TTC (iTTC) would provide threshold values that could be utilized in exposure-based safety assessments. As part of a Cosmetics Europe (CosEu) research program, CosEu has initiated a project that is working towards the development of iTTCs that can be used for human safety assessment. Knowing that the development of an iTTC is an ambitious and broad-spanning topic, CosEu organized a Working Group comprising a balance of multiple stakeholders (cosmetics and chemical industries, the EPA and JRC, and academia) with relevant experience and expertise, and a workshop to critically evaluate the requirements to establish an iTTC. Outcomes from the workshop included an evaluation of the current state of the science for iTTC, the overall iTTC strategy, selection of chemical databases, capture and curation of chemical information, ADME and repeat dose data, expected challenges, as well as next steps and ongoing work.

    Satellite Imagery to Map Topsoil Organic Carbon Content over Cultivated Areas: An Overview

    There is a need to update soil maps and monitor soil organic carbon (SOC) in the upper horizons or plough layer for enabling decision support and land management, while complying with several policies, especially those favoring soil carbon storage. This review paper is dedicated to the satellite-based spectral approaches for SOC assessment that have been achieved from several satellite sensors, study scales and geographical contexts in the past decade. Most approaches relying on pure spectral models have been carried out since 2019 and have dealt with temperate croplands in Europe, China and North America at the scale of small regions, of some hundreds of km²: dry combustion and wet oxidation were the analytical determination methods used for 50% and 35% of the satellite-derived SOC studies, for which measured topsoil SOC contents mainly referred to mineral soils, typically cambisols and luvisols and to a lesser extent, regosols, leptosols, stagnosols and chernozems, with annual cropping systems with a SOC value of ~15 g·kg⁻¹ and a range of 30 g·kg⁻¹ in median. Most satellite-derived SOC spectral prediction models used limited preprocessing and were based on bare soil pixel retrieval after Normalized Difference Vegetation Index (NDVI) thresholding. About one third of these models used partial least squares regression (PLSR), while another third used random forest (RF), and the remaining included machine learning methods such as support vector machine (SVM). We did not find any studies either on deep learning methods or on all-performance evaluations and uncertainty analysis of spatial model predictions. Nevertheless, the literature examined here identifies satellite-based spectral information, especially derived under bare soil conditions, as an interesting approach that deserves further investigations. Future research includes considering the simultaneous analysis of imagery acquired at several dates i.e., temporal mosaicking, testing the influence of possible disturbing factors and mitigating their effects fusing mixed models incorporating non-spectral ancillary information
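The review above mentions bare soil pixel retrieval after NDVI thresholding. A minimal sketch of that step follows, assuming reflectance grids given as nested lists. NDVI is computed with its standard definition, (NIR - Red) / (NIR + Red). The threshold of 0.25, the sample reflectance values and the function names `ndvi` and `bare_soil_mask` are illustrative assumptions; the abstract does not state which thresholds the reviewed studies used.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    Returns NaN when both reflectances are zero, so that such pixels
    fail the comparison below and are never classified as bare soil.
    """
    denom = nir + red
    if denom == 0:
        return float("nan")
    return (nir - red) / denom


def bare_soil_mask(nir_band, red_band, threshold=0.25):
    """Mark pixels whose NDVI is below the threshold as bare soil.

    The default threshold is an assumption chosen for illustration only.
    """
    return [
        [ndvi(n, r) < threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2 x 2 reflectance grids (values invented for the example).
nir_band = [[0.30, 0.50], [0.25, 0.45]]
red_band = [[0.20, 0.08], [0.22, 0.10]]

mask = bare_soil_mask(nir_band, red_band)
print(mask)  # [[True, False], [True, False]]
```

Only the pixels flagged `True` would then be passed to a spectral SOC model such as PLSR or random forest.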