6 research outputs found
Predictive model for identifying new CYP19A1 ligands on the KNIME analytical platform
The purpose of this study was to create a database of chemical compounds, low-molecular-weight ligands of human steroid-hydroxylating cytochrome CYP19A1 (aromatase), from analyzed in vitro data, in order to build a predictive model. Using this database and the random forest machine learning method, two predictive models were built on the KNIME analytical platform to identify the activity of ligands with steroidal (type I) and non-steroidal (type II) structure. Topological descriptors of chemical structure, which capture the correlation between the structure of a molecule and its biological effect, were used as training data. For each model, the most important descriptors were selected, the optimal random forest parameters were computed, and the applicability domain was defined. The ability of each model to predict the results of a test sample was assessed using the AUC quality measure. The resulting quality measures indicate a fairly high predictive ability and suggest that the models are promising for identifying new human CYP19A1 ligands, which can be considered as potential drugs for the treatment of hormone-dependent tumors.
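As a rough illustration of the AUC quality measure named in this abstract, the sketch below computes AUC as the fraction of (active, inactive) compound pairs in which the active compound receives the higher score. It is a minimal sketch, not the authors' KNIME workflow; the function name `roc_auc` and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random active outranks a random inactive.

    Tied scores count as half a win (the usual Mann-Whitney convention).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels (1 = active ligand) and predicted probabilities.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from inactives.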
Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine
Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper investigates the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and test sets, respectively. The classification performance is greatly improved compared to that reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers.
Predicting chemically-induced skin reactions. Part I: QSAR models of skin sensitization and their application to identify potentially hazardous compounds
Repetitive exposure to a chemical agent can induce an immune reaction in inherently susceptible individuals that leads to skin sensitization. Although many chemicals have been reported as skin sensitizers, there have been very few rigorously validated QSAR models with defined applicability domains (AD) that were developed using a large group of chemically diverse compounds. In this study, we have aimed to compile, curate, and integrate the largest publicly available dataset related to chemically-induced skin sensitization, use this data to generate rigorously validated QSAR models for skin sensitization, and employ these models as a virtual screening tool for identifying putative sensitizers among environmental chemicals. We followed best practices for model building and validation implemented with our predictive QSAR workflow using the random forest modeling technique in combination with SiRMS and Dragon descriptors. The Correct Classification Rate (CCR) for QSAR models discriminating sensitizers from non-sensitizers was 71 to 88% when evaluated on several external validation sets, within a broad AD, with positive (for sensitizers) and negative (for non-sensitizers) predicted rates of 85% and 79%, respectively. When compared to the skin sensitization module included in the OECD QSAR toolbox as well as to the skin sensitization model in publicly available VEGA software, our models showed a significantly higher prediction accuracy for the same sets of external compounds as evaluated by Positive Predicted Rate, Negative Predicted Rate, and CCR. These models were applied to identify putative chemical hazards in the ScoreCard database of possible skin or sense organ toxicants as primary candidates for experimental validation.
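The three metrics named in this abstract can be sketched as follows. This is a minimal illustration, assuming CCR is the mean of sensitivity and specificity (the common balanced-accuracy definition) and that the positive and negative predicted rates correspond to the usual positive and negative predictive values; the abstract itself does not spell out the formulas. The function name and the example labels are hypothetical.

```python
def classification_rates(y_true, y_pred):
    """Return CCR, PPV and NPV for binary labels (1 = sensitizer, 0 = non-sensitizer)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of sensitizers correctly flagged
    specificity = tn / (tn + fp)  # fraction of non-sensitizers correctly cleared
    return {
        "CCR": (sensitivity + specificity) / 2,  # assumed definition, see lead-in
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Hypothetical external-set labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
rates = classification_rates(y_true, y_pred)
print(rates)
```

Unlike plain accuracy, CCR defined this way is not inflated when one class dominates the validation set.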
Quantifying the Effects of Correlated Covariates on Variable Importance Estimates from Random Forests
Recent advances in computing technology have led to the development of algorithmic modeling techniques. These methods can be used to analyze data that are difficult to analyze using traditional statistical models. This study examined the effectiveness of variable importance estimates from the random forest algorithm in identifying the true predictor among a large number of candidate predictors. A simulation study was conducted using twenty different levels of association among the independent variables and seven different levels of association between the true predictor and the response. We conclude that the random forest method is an effective classification tool when the goals of a study are to produce an accurate classifier and to provide insight regarding the discriminative ability of individual predictor variables. These goals are common in gene expression analysis; therefore, we apply the random forest method to estimate variable importance on a microarray data set.
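The abstract does not say how the levels of association among the independent variables were induced. One common way to build such a simulation, shown here purely as an assumption, is to draw a decoy covariate as a weighted mix of the true predictor and independent noise so that their population correlation equals a chosen value `rho`. All names below are hypothetical, and the sketch covers only the data-generation step, not the random forest fit.

```python
import math
import random


def correlated_pair(n, rho, rng):
    """Draw n pairs (x, z) of standard normals with population correlation rho."""
    xs, zs = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        xs.append(x)
        # Mixing weights chosen so that Var(z) = 1 and Corr(x, z) = rho.
        zs.append(rho * x + math.sqrt(1.0 - rho ** 2) * e)
    return xs, zs


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x_true, x_decoy = correlated_pair(5000, 0.6, rng)
r = pearson(x_true, x_decoy)
print(f"sample correlation = {r:.3f}")
```

Repeating this over a grid of `rho` values would give one covariate set per association level, each of which could then be passed to a random forest to compare importance estimates for the true predictor and the decoys.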
A Knowledge Based Approach of Toxicity Prediction for Drug Formulation. Modelling Drug Vehicle Relationships Using Soft Computing Techniques
This multidisciplinary thesis is concerned with the prediction of drug formulations for the reduction of drug toxicity. Both scientific and computational approaches are utilised to make original contributions to the field of predictive toxicology.
The first part of this thesis provides a detailed scientific discussion on all aspects of drug formulation and toxicity. Discussions are focused around the principal mechanisms of drug toxicity and how drug toxicity is studied and reported in the literature. Furthermore, a review of the current technologies available for formulating drugs for toxicity reduction is provided. Examples of studies reported in the literature that have used these technologies to reduce drug toxicity are also reported. The thesis also provides an overview of the computational approaches currently employed in the field of in silico predictive toxicology. This overview focuses on the machine learning approaches used to build predictive QSAR classification models, with examples discovered from the literature provided.
Two methodologies have been developed as part of the main work of this thesis. The first focuses on the use of directed bipartite graphs and Venn diagrams for the visualisation and extraction of drug-vehicle relationships from large un-curated datasets which show changes in the patterns of toxicity. These relationships can be rapidly extracted and visualised using the methodology proposed in chapter 4.
The second methodology involves mining large datasets for the extraction of drug-vehicle toxicity data. The methodology uses an area-under-the-curve principle to make pairwise comparisons of vehicles, which are classified according to the toxicity protection they offer, and from which predictive classification models based on random forests and decision trees are built. The results of this methodology are reported in chapter 6.