116 research outputs found

    Data Set Modelability by QSAR

    We introduce a simple MODelability Index (MODI) that estimates the feasibility of obtaining predictive QSAR models (Correct Classification Rate above 0.7) for a binary dataset of bioactive compounds. MODI is defined as an activity class-weighted ratio of the number of nearest-neighbor pairs of compounds with the same activity class to the total number of pairs. MODI values were calculated for more than 100 datasets, and a threshold of 0.65 was found to separate non-modelable from modelable datasets.
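One possible reading of the MODI definition above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each compound is a numeric descriptor vector, that nearest neighbors are found by Euclidean distance, and that "class-weighted" means averaging the per-class fractions with equal weight. The function name `modi` and the toy data are hypothetical.

```python
from math import dist


def modi(descriptors, labels):
    """Class-averaged fraction of compounds whose nearest neighbor shares their class.

    Assumptions (not stated in the abstract): Euclidean distance on
    descriptor vectors, and equal weight for every activity class.
    """
    classes = sorted(set(labels))
    same = {c: 0 for c in classes}   # compounds whose nearest neighbor has the same class
    total = {c: 0 for c in classes}  # all compounds in the class
    for i, (x, y) in enumerate(zip(descriptors, labels)):
        # Nearest neighbor of compound i, excluding the compound itself.
        j = min(
            (k for k in range(len(descriptors)) if k != i),
            key=lambda k: dist(x, descriptors[k]),
        )
        total[y] += 1
        if labels[j] == y:
            same[y] += 1
    return sum(same[c] / total[c] for c in classes) / len(classes)


# Toy data: two well-separated clusters, one per activity class.
clustered = modi([(0, 0), (0, 1), (5, 5), (5, 6)], [0, 0, 1, 1])
# Toy data: classes interleaved along one axis.
interleaved = modi([(0,), (1,), (2,), (10,)], [0, 1, 0, 1])
```

Under these assumptions, a value compared against the reported 0.65 threshold would give a quick modelability check before any model is trained.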

    Gelatin/carboxymethyl cellulose mucoadhesive films with lysozyme: Development and characterization

    The goal of our study is to develop and characterize mucoadhesive films with entrapped lysozyme, based on gelatin/sodium carboxymethyl cellulose, as a prospective antimicrobial preparation. Lysozyme in the mucoadhesive films retains more than 95% of its initial activity over 3 years of storage. Various physicochemical and biochemical characteristics of the films and the entrapped enzyme were evaluated, such as film thickness, weight, time of dissolution in water, bioadhesive force, in vitro lysozyme release, pH and temperature profiles of hydrolytic activity, the effect of γ-sterilization, etc. We have shown that gelatin/sodium carboxymethyl cellulose films have an adhesive force of about 4380 Pa. Scanning electron microscopy images show the relative uniformity of the gelatin surface with entrapped lysozyme. Mucoadhesive films with lysozyme have a 100% bactericidal effect on the test strain, Staphylococcus aureus ATCC 25923 F - 49, and thus could be considered a prospective antimicrobial preparation.

    Computer-Assisted Decision Support for Student Admissions Based on Their Predicted Academic Performance

    Objective. To develop predictive computational models forecasting the academic performance of students in the didactic-rich portion of a doctor of pharmacy (PharmD) curriculum as admission-assisting tools.

    Trust, But Verify: On the Importance of Chemical Structure Curation in Cheminformatics and QSAR Modeling Research

    Molecular modelers and cheminformaticians typically analyze experimental data generated by other scientists. Consequently, when it comes to data accuracy, cheminformaticians are always at the mercy of data providers who may inadvertently publish (partially) erroneous data. Thus, dataset curation is crucial for any cheminformatics analysis such as similarity searching, clustering, QSAR modeling, virtual screening, etc., especially now that the availability of chemical datasets in the public domain has skyrocketed. Despite the obvious importance of this preliminary step in the computational analysis of any dataset, there appears to be no commonly accepted guidance or set of procedures for chemical data curation. The main objective of this paper is to emphasize the need for a standardized chemical data curation strategy that should be followed at the onset of any molecular modeling investigation. Herein, we discuss several simple but important steps for cleaning chemical records in a database, including the removal of the fraction of the data that cannot be appropriately handled by conventional cheminformatics techniques. Such steps include the removal of inorganic and organometallic compounds, counterions, salts and mixtures; structure validation; ring aromatization; normalization of specific chemotypes; curation of tautomeric forms; and the deletion of duplicates. To emphasize the importance of data curation as a mandatory step in data analysis, we discuss several case studies where chemical curation of the original “raw” database enabled a successful modeling study (specifically, QSAR analysis) or resulted in a significant improvement in the model's prediction accuracy. We also demonstrate that in some cases rigorously developed QSAR models could even be used to correct erroneous biological data associated with chemical compounds. We believe that the good practices for curation of chemical records outlined in this paper will be of value to all scientists working in the fields of molecular modeling, cheminformatics, and QSAR studies.

    Computational assessment of environmental hazards of nitroaromatic compounds: influence of the type and position of aromatic ring substituents on toxicity

    This study summarizes the results of our recent QSAR and QSPR investigations on predicting numerous aspects of the environmental behavior of nitro compounds. In this study, we applied the QSAR/QSPR models previously developed by our group to the virtual screening of energetic compounds, their precursors and other compounds containing nitro groups. To make predictions on the environmental impact of nitro compounds, we analyzed the trends in the change of the experimentally obtained and QSAR/QSPR-predicted values of aqueous solubility, lipophilicity, Ames mutagenicity, bioavailability, blood–brain barrier penetration, aquatic toxicity on T. pyriformis and acute oral toxicity on rats as a function of the chemical structure of nitro compounds. All the models were developed using simplex descriptors in combination with random forest (RF) modeling techniques. We interpreted the possible environmental impact (different toxicological properties) by dividing the considered nitro compounds according to their hydrophobic and hydrophilic characteristics and by examining the influence of their molecular fragments that promote or interfere with toxicity. In particular, we found that, in general, the presence of amide or tertiary amine groups leads to an increase in toxicity. Also, it was predicted that compounds containing a NO2 group in the para-position of a benzene ring are more toxic than meta-isomers, which, in turn, are more toxic than ortho-isomers. In general, we concluded that hydrophobic nitroaromatic compounds, especially the ones with electron-accepting substituents, halogens and amino groups, are the most environmentally hazardous.

    The N-ary in the Coal Mine: Avoiding Mixture Model Failure with Proper Validation

    Modeling the properties of chemical mixtures is a difficult but important part of any modeling process intended to be applicable to the often messy and impure phenomena of everyday life, including food and environmental safety, healthcare, etc. Part of this difficulty stems from the increased complexity of designing suitable model validation schemes for mixture data, a fact which has been elucidated in previous work only in the case of binary mixture models. We extend these previously defined validation strategies for QSAR modeling of binary mixtures to the more complex case of general, N-ary mixtures and argue that these strategies are applicable to many modeling tasks beyond simple chemical mixtures. Additionally, we propose a method of establishing a baseline model performance for each mixture dataset to be used in model selection comparisons. This baseline is intended to account for the statistical dependence generically present between the properties of mixtures that share constituents. We contend that without such a baseline, estimates of model performance can be dramatically overestimated, and we demonstrate this with multiple case studies using real and simulated data. Comment: 22 pages, 1 figure

    QSAR-Based Virtual Screening: Advances and Applications in Drug Discovery

    Virtual screening (VS) has emerged in drug discovery as a powerful computational approach to screen large libraries of small molecules for new hits with desired properties that can then be tested experimentally. As with other computational approaches, the intention of VS is not to replace in vitro or in vivo assays, but to speed up the discovery process, to reduce the number of candidates to be tested experimentally, and to rationalize their choice. Moreover, VS has become very popular in pharmaceutical companies and academic organizations because it saves time, cost, resources, and labor. Among the VS approaches, quantitative structure–activity relationship (QSAR) analysis is the most powerful method due to its high and fast throughput and good hit rate. As the first step of QSAR model development, relevant chemogenomics data are collected from databases and the literature. Then, chemical descriptors are calculated on different levels of representation of molecular structure, ranging from 1D to nD, and correlated with the biological property using machine learning techniques. Once developed and validated, QSAR models are applied to predict the biological property of novel compounds. Although the experimental testing of computational hits is not an inherent part of QSAR methodology, it is highly desired and should be performed as an ultimate validation of developed models. In this mini-review, we summarize and critically analyze the recent trends of QSAR-based VS in drug discovery and demonstrate successful applications in identifying promising compounds with desired properties. Moreover, we provide some recommendations about the best practices for QSAR-based VS along with the future perspectives of this approach.

    Predicting Binding Affinity of CSAR Ligands Using Both Structure-Based and Ligand-Based Approaches

    We report on the prediction accuracy of ligand-based (2D QSAR) and structure-based (MedusaDock) methods used both independently and in consensus for ranking the congeneric series of ligands binding to three protein targets (UK, ERK2, and CHK1) from the CSAR 2011 benchmark exercise. An ensemble of predictive QSAR models was developed using known binders of these three targets extracted from the publicly available ChEMBL database. Selected models were used to predict the binding affinity of CSAR compounds towards the corresponding targets and rank them accordingly; the overall ranking accuracy evaluated by Spearman correlation was as high as 0.78 for UK, 0.60 for ERK2, and 0.56 for CHK1, placing our predictions in the top 10% among all the participants. In parallel, MedusaDock, which is designed to predict reliable docking poses, was also used for ranking the CSAR ligands according to their docking scores; the resulting accuracies (Spearman correlation) for UK, ERK2, and CHK1 were 0.76, 0.31, and 0.26, respectively. In addition, the performance of several consensus approaches combining MedusaDock and QSAR predicted ranks has been explored; the best approach yielded Spearman correlation coefficients for UK, ERK2, and CHK1 of 0.82, 0.50, and 0.45, respectively. This study shows that (i) externally validated 2D QSAR models were capable of ranking CSAR ligands at least as accurately as more computationally intensive structure-based approaches used both by us and by other groups and (ii) ligand-based QSAR models can complement structure-based approaches by boosting the prediction performance when used in consensus.
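The ranking evaluation described above can be illustrated with a small, dependency-free sketch. The abstract does not say how the QSAR and docking ranks were combined, so the rank averaging in `consensus_ranks` is an assumption made purely for illustration, as are all function names and the toy numbers. The sketch also assumes both score lists are oriented the same way (higher means stronger predicted binding); raw docking scores, where lower is often better, would need their sign flipped first.

```python
def ranks(values):
    """Return 1-based ranks (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


def consensus_ranks(scores_a, scores_b):
    """Hypothetical consensus: average the ranks given by two methods."""
    return [(x + y) / 2 for x, y in zip(ranks(scores_a), ranks(scores_b))]


# Toy example with made-up numbers (not data from the study).
measured = [5.1, 6.3, 7.0, 8.2]  # experimental affinities
qsar = [5.0, 6.8, 6.5, 8.0]      # QSAR-predicted affinities
docking = [4.0, 5.5, 7.5, 7.0]   # docking scores, oriented so higher = better
combined = consensus_ranks(qsar, docking)
rho = spearman(measured, combined)
```

Comparing `spearman(measured, qsar)`, `spearman(measured, docking)`, and `rho` on real data would show whether the consensus ranking improves on either method alone for a given target.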