
6 research outputs found

Predictive model for identifying new CYP19A1 ligands on the KNIME analytical platform

Author: M. I. Shaladonova
S. A. Usanov
Ya. V. Dzichenka
Publication venue: The Republican Unitary Enterprise Publishing House "Belaruskaya Navuka"
Publication date: 31/10/2023

A database of chemical compounds, low-molecular-weight ligands of human CYP19A1 (aromatase), was compiled from analyzed in vitro data. Using this database and the random forest machine learning method on the KNIME analytical platform, two predictive models were built to identify the activity of ligands with steroidal (type I) and non-steroidal (type II) structure. Topological descriptors of chemical structure, which account for the correlation between the structure of a molecule and its biological effect, were used as training data. For each model, the most significant features (descriptors) were selected, optimal random forest parameters were computed, and the applicability domain was determined. The ability of the models to predict the test set was assessed using the AUC quality metric. The resulting quality metrics indicate a fairly high predictive ability of the models and the promise of their use for identifying new human CYP19A1 ligands. Compounds found this way may be considered potential drug candidates for the treatment of hormone-dependent tumors.
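The abstract above names AUC as its quality metric without giving values or a formula. As a minimal sketch only, and not the authors' KNIME workflow, AUC can be computed as the fraction of (active, inactive) pairs that a model ranks correctly. The function name `roc_auc` and the labels and scores below are hypothetical illustration data.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels (1 = active ligand) and predicted scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the sample size, which is fine for small test sets; a rank-based formulation gives the same value more efficiently.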

Reports of the National Academy of Sciences of Belarus

Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine

Author: Andersen
Aptula
Chenzhong Cao
Estrada
Fedorowicz
Golla
Han
Hua Yuan
Huang
Jiang
Jianping Huang
Kennedy
Kubinyi
Langton
Li
Lin
Lushniak
Mosier
Ren
Roberts
Shen
Todeschini
Vapnik
Publication venue: Molecular Diversity Preservation International (MDPI)
Publication date: 01/07/2009

Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper has investigated the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. The classification performances were greatly improved compared to those reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

PubMed Central

Predicting chemically-induced skin reactions. Part I: QSAR models of skin sensitization and their application to identify potentially hazardous compounds

Author: Alves Vinicius M.
Andrade Carolina H.
Fourches Denis
Kleinstreuer Nicole
Muratov Eugene
Strickland Judy
Tropsha Alexander
Publication venue
Publication date: 01/01/2015

Repetitive exposure to a chemical agent can induce an immune reaction in inherently susceptible individuals that leads to skin sensitization. Although many chemicals have been reported as skin sensitizers, there have been very few rigorously validated QSAR models with defined applicability domains (AD) that were developed using a large group of chemically diverse compounds. In this study, we have aimed to compile, curate, and integrate the largest publicly available dataset related to chemically-induced skin sensitization, use these data to generate rigorously validated QSAR models for skin sensitization, and employ these models as a virtual screening tool for identifying putative sensitizers among environmental chemicals. We followed best practices for model building and validation implemented with our predictive QSAR workflow using the random forest modeling technique in combination with SiRMS and Dragon descriptors. The Correct Classification Rate (CCR) for QSAR models discriminating sensitizers from non-sensitizers was 71–88% when evaluated on several external validation sets, within a broad AD, with positive (for sensitizers) and negative (for non-sensitizers) predicted rates of 85% and 79% respectively. When compared to the skin sensitization module included in the OECD QSAR toolbox as well as to the skin sensitization model in publicly available VEGA software, our models showed a significantly higher prediction accuracy for the same sets of external compounds as evaluated by Positive Predicted Rate, Negative Predicted Rate, and CCR. These models were applied to identify putative chemical hazards in the ScoreCard database of possible skin or sense organ toxicants as primary candidates for experimental validation.
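The abstract above reports CCR and positive and negative predicted rates but does not define them. The sketch below assumes the common definitions: CCR as the mean of sensitivity and specificity (balanced accuracy), and the predicted rates as the fraction of correct calls among predicted sensitizers and predicted non-sensitizers. The function name `classification_rates` and the labels are hypothetical, not data from the study.

```python
def classification_rates(y_true, y_pred):
    """Return CCR, positive predicted rate, and negative predicted rate.

    Assumes binary labels (1 = sensitizer, 0 = non-sensitizer) and that
    both classes occur among the true labels and among the predictions.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "ccr": 0.5 * (sensitivity + specificity),  # assumed definition
        "ppr": tp / (tp + fp),  # correct among predicted sensitizers
        "npr": tn / (tn + fn),  # correct among predicted non-sensitizers
    }


# Hypothetical external-set labels and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
rates = classification_rates(y_true, y_pred)
print(rates)
```

Unlike plain accuracy, CCR weights the two classes equally, which matters when sensitizers and non-sensitizers are unevenly represented in a validation set.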

PubMed Central

Carolina Digital Repository

Quantifying the Effects of Correlated Covariates on Variable Importance Estimates from Random Forests

Author: Kimes Ryan Vincent
Publication venue: VCU Scholars Compass
Publication date: 01/01/2006

Recent advances in computing technology have led to the development of algorithmic modeling techniques. These methods can be used to analyze data that are difficult to analyze using traditional statistical models. This study examined the effectiveness of variable importance estimates from the random forest algorithm in identifying the true predictor among a large number of candidate predictors. A simulation study was conducted using twenty different levels of association among the independent variables and seven different levels of association between the true predictor and the response. We conclude that the random forest method is an effective classification tool when the goals of a study are to produce an accurate classifier and to provide insight regarding the discriminative ability of individual predictor variables. These goals are common in gene expression analysis; therefore, we apply the random forest method for the purpose of estimating variable importance on a microarray data set.

VCU Scholars Compass

Application of the Random Forest Method in Studies of Local Lymph Node Assay Based Skin Sensitization Data

Author: Adam Fedorowicz
Harshinder Singh
Shengqiao Li
Sidney C. Soderholm
Publication venue: American Chemical Society (ACS)
Publication date

Crossref

Recommended from our members

A Knowledge Based Approach of Toxicity Prediction for Drug Formulation. Modelling Drug Vehicle Relationships Using Soft Computing Techniques

Author: Mistry Pritesh
Publication venue: Faculty of Engineering and Informatics
Publication date: 01/01/2015

This multidisciplinary thesis is concerned with the prediction of drug formulations for the reduction of drug toxicity. Both scientific and computational approaches are utilised to make original contributions to the field of predictive toxicology. The first part of this thesis provides a detailed scientific discussion on all aspects of drug formulation and toxicity. Discussions are focused around the principal mechanisms of drug toxicity and how drug toxicity is studied and reported in the literature. Furthermore, a review of the current technologies available for formulating drugs for toxicity reduction is provided. Examples of studies reported in the literature that have used these technologies to reduce drug toxicity are also reported. The thesis also provides an overview of the computational approaches currently employed in the field of in silico predictive toxicology. This overview focuses on the machine learning approaches used to build predictive QSAR classification models, with examples from the literature. Two methodologies have been developed as part of the main work of this thesis. The first is focused on the use of directed bipartite graphs and Venn diagrams for the visualisation and extraction of drug-vehicle relationships from large un-curated datasets which show changes in the patterns of toxicity. These relationships can be rapidly extracted and visualised using the methodology proposed in chapter 4. The second methodology involves mining large datasets for the extraction of drug-vehicle toxicity data. The methodology uses an area-under-the-curve principle to make pairwise comparisons of vehicles, which are classified according to the toxicity protection they offer, from which predictive classification models based on random forests and decision trees are built. The results of this methodology are reported in chapter 6.

Bradford Scholars