17,652 research outputs found
Prediction Of Carboxylic Acid Toxicity Using Machine Learning Model
Carboxylic acids are organic compounds characterized by the presence of a carboxyl functional group capable of donating a proton and forming carboxylate ions in aqueous solutions. Carboxylic acids are widely used in manufacturing and medical applications. The rapid growth in their use has established a need to predict their toxicity. The purpose of this paper is to build predictive models of carboxylic acid toxicity from five molecular descriptors (refractive index, the octanol/water partition coefficient (log P), acid dissociation constant (pKa), density, and dipole moment) using machine learning algorithms. Prediction accuracy was evaluated for three types of models: Decision Tree, Random Forest and k-Nearest Neighbour (k-NN). Among the machine learning algorithms used, we have determined that the decision tree is the best model for predicting the toxicity of carboxylic acid. This finding demonstrates that the decision tree model exhibits an acceptable level of performance in predicting toxicity within the field of toxicology.
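As a rough illustration of how one of the three model types might be scored on five descriptors, the sketch below runs a k-NN classifier under leave-one-out evaluation. It is not the paper's implementation: the descriptor values and the "low"/"high" labels are synthetic placeholders, and the function names, the choice of k = 3, and the leave-one-out scheme are all assumptions made for this example.

```python
import math
from collections import Counter

def fit_scaler(rows):
    """Return per-column means and standard deviations (std of 0 becomes 1)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def scale(row, means, stds):
    """Standardize one descriptor vector with previously fitted statistics."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def leave_one_out_accuracy(x, y, k=3):
    """Hold out each sample once; fit the scaler on the remaining samples only."""
    hits = 0
    for i in range(len(x)):
        train_x, train_y = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        means, stds = fit_scaler(train_x)
        scaled_train = [scale(row, means, stds) for row in train_x]
        hits += knn_predict(scaled_train, train_y, scale(x[i], means, stds), k) == y[i]
    return hits / len(x)

# SYNTHETIC placeholder rows, not measured data. Column order (assumed):
# refractive index, log P, pKa, density, dipole moment.
X = [
    [1.37, -0.5, 3.8, 1.05, 1.7],
    [1.38, -0.3, 4.0, 1.04, 1.6],
    [1.39, -0.2, 4.2, 1.03, 1.8],
    [1.36, -0.6, 3.7, 1.06, 1.5],
    [1.43, 3.0, 4.8, 0.91, 0.8],
    [1.44, 3.4, 4.9, 0.90, 0.7],
    [1.45, 3.8, 5.0, 0.89, 0.9],
    [1.42, 2.8, 4.7, 0.92, 0.6],
]
y = ["low"] * 4 + ["high"] * 4  # placeholder toxicity classes

accuracy = leave_one_out_accuracy(X, y, k=3)
```

Standardizing inside each fold keeps the held-out sample out of the scaling statistics, which matters when descriptors such as log P and refractive index sit on very different numeric ranges.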
Computational methods for prediction of in vitro effects of new chemical structures
Background: With a constant increase in the number of new chemicals synthesized
every year, it becomes important to employ the most reliable and fast in
silico screening methods to predict their safety and activity profiles. In
recent years, in silico prediction methods have received great attention in an
attempt to reduce animal experiments for the evaluation of various
toxicological endpoints, complementing the theme of replace, reduce and
refine. Various computational approaches have been proposed for the prediction
of compound toxicity, ranging from quantitative structure-activity relationship
modeling to molecular similarity-based methods and machine learning. Within
the “Toxicology in the 21st Century” screening initiative, a crowd-sourcing
platform was established for the development and validation of computational
models to predict the interference of chemical compounds with nuclear receptor
and stress response pathways based on a training set containing more than
10,000 compounds tested in high-throughput screening assays. Results: Here, we
present the results of various molecular similarity-based and
machine-learning-based methods over an independent evaluation set containing 647 compounds as
provided by the Tox21 Data Challenge 2014. It was observed that the Random
Forest approach based on MACCS molecular fingerprints and a subset of 13
molecular descriptors selected based on statistical and literature analysis
performed best in terms of the area under the receiver operating
characteristic curve values. Further, we compared the individual and combined
performance of different methods. In retrospect, we also discuss the reasons
behind the superior performance of an ensemble approach, combining a
similarity search method with the Random Forest algorithm, compared to
individual methods while explaining the intrinsic limitations of the latter.
Conclusions: Our results suggest that, although prediction methods were
optimized individually for each modelled target, an ensemble of similarity and
machine-learning approaches provides promising performance, indicating its
broad applicability in toxicity prediction.
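The idea of combining a similarity search with a classifier can be sketched as follows. The abstract does not say how the two scores were combined, so the weighted mean here is purely an assumption, as are the function names, the `weight` parameter, and the representation of fingerprints as sets of on-bit indices. The classifier probability is passed in as a plain number and would come from a separately trained model.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def similarity_score(query, actives):
    """Highest Tanimoto similarity between the query and any known active."""
    return max((tanimoto(query, fp) for fp in actives), default=0.0)

def ensemble_score(query, actives, model_probability, weight=0.5):
    """Weighted mean of the similarity score and a classifier's probability.

    `weight` is a hypothetical tuning parameter, not a value from the paper.
    """
    return weight * similarity_score(query, actives) + (1 - weight) * model_probability

# Toy fingerprints with made-up bit positions, for illustration only.
known_actives = [{2, 3, 4}, {1, 2, 3}]
score = ensemble_score({1, 2, 3}, known_actives, model_probability=0.6)
```

A similarity search can only score compounds that resemble known actives, while a trained classifier generalizes from descriptor patterns, so averaging the two is one simple way to let each compensate for the other.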
Data Quality in Predictive Toxicology: Identification of Chemical Structures and Calculation of Chemical Descriptors
Every technique for toxicity prediction and for the detection of structure–activity relationships relies on the accurate estimation and representation of chemical and toxicologic properties. In this paper we discuss the potential sources of errors associated with the identification of compounds, the representation of their structures, and the calculation of chemical descriptors. The discussion is based on a case study where machine learning techniques were applied to data from noncongeneric compounds and a complex toxicologic end point (carcinogenicity). We propose methods applicable to the routine quality control of large chemical datasets, but our main intention is to raise awareness about this topic and to open a discussion about quality assurance in predictive toxicology. The accuracy and reproducibility of toxicity data will be reported in another paper.