6 research outputs found

    Predictive model for identifying new CYP19A1 ligands on the KNIME analytical platform

    A database of chemical compounds, low-molecular-weight ligands of human steroid-hydroxylating cytochrome CYP19A1 (aromatase), was compiled from analyzed in vitro data. Using this database, two predictive models were built on the KNIME analytical platform with the random forest machine learning method: one for ligands with steroidal structure (type I) and one for ligands with non-steroidal structure (type II). Topological descriptors of chemical structure, which capture the correlation between the structure of the molecule and the biological effect, served as training data. For each model, the most important features (descriptors) were selected, the optimal random forest parameters were computed, and the applicability domain was determined. The ability of each model to predict the results of a test set was assessed using the AUC quality metric. The resulting quality metrics indicate a fairly high predictive ability and suggest that the models are promising for identifying new human CYP19A1 ligands. Compounds found this way can be considered candidates for developing drugs to treat hormone-dependent tumors.
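
The abstract above reports AUC on a test set but gives no formula. As a minimal sketch, AUC can be read as the probability that a randomly chosen active compound receives a higher model score than a randomly chosen inactive one. The function name `roc_auc` and the labels and scores below are hypothetical illustrations, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (active, inactive) pairs ranked correctly.

    labels: 1 for active, 0 for inactive (hypothetical encoding).
    scores: model scores, higher meaning "more likely active".
    Ties between an active and an inactive score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and scores, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)  # 8 of 9 pairs are ranked correctly
```

The pairwise loop is quadratic in the number of compounds, which is acceptable for small test sets; a rank-based computation would be the usual choice for larger ones.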

    Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine

    Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper investigates the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. The classification performance was greatly improved compared to that reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers.

    Predicting chemically-induced skin reactions. Part I: QSAR models of skin sensitization and their application to identify potentially hazardous compounds

    Repetitive exposure to a chemical agent can induce an immune reaction in inherently susceptible individuals that leads to skin sensitization. Although many chemicals have been reported as skin sensitizers, there have been very few rigorously validated QSAR models with defined applicability domains (AD) that were developed using a large group of chemically diverse compounds. In this study, we have aimed to compile, curate, and integrate the largest publicly available dataset related to chemically-induced skin sensitization, use this data to generate rigorously validated QSAR models for skin sensitization, and employ these models as a virtual screening tool for identifying putative sensitizers among environmental chemicals. We followed best practices for model building and validation implemented with our predictive QSAR workflow using the random forest modeling technique in combination with SiRMS and Dragon descriptors. The Correct Classification Rate (CCR) for QSAR models discriminating sensitizers from non-sensitizers was 71–88% when evaluated on several external validation sets, within a broad AD, with positive (for sensitizers) and negative (for non-sensitizers) predicted rates of 85% and 79%, respectively. When compared to the skin sensitization module included in the OECD QSAR toolbox as well as to the skin sensitization model in publicly available VEGA software, our models showed a significantly higher prediction accuracy for the same sets of external compounds as evaluated by Positive Predicted Rate, Negative Predicted Rate, and CCR. These models were applied to identify putative chemical hazards in the ScoreCard database of possible skin or sense organ toxicants as primary candidates for experimental validation.
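
The abstract above names three metrics without defining them. The sketch below assumes the commonly used definitions: CCR as the mean of sensitivity and specificity, and the positive and negative predicted rates as the fractions of positive and negative predictions that are correct. Whether the paper uses exactly these definitions is an assumption, and the function name and example labels are hypothetical.

```python
def classification_rates(y_true, y_pred):
    """Return (ccr, positive_rate, negative_rate) for binary labels.

    1 = sensitizer, 0 = non-sensitizer (hypothetical encoding).
    Assumed definitions, not confirmed by the abstract:
      ccr           = (sensitivity + specificity) / 2
      positive_rate = TP / (TP + FP)
      negative_rate = TN / (TN + FN)
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Report 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    ccr = (sensitivity + specificity) / 2
    return ccr, ratio(tp, tp + fp), ratio(tn, tn + fn)


# Hypothetical external-set labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
ccr, positive_rate, negative_rate = classification_rates(y_true, y_pred)
```

Averaging sensitivity and specificity keeps the score from being dominated by the majority class, which matters when sensitizers and non-sensitizers are unevenly represented.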

    Quantifying the Effects of Correlated Covariates on Variable Importance Estimates from Random Forests

    Recent advances in computing technology have led to the development of algorithmic modeling techniques. These methods can be used to analyze data that are difficult to analyze using traditional statistical models. This study examined the effectiveness of variable importance estimates from the random forest algorithm in identifying the true predictor among a large number of candidate predictors. A simulation study was conducted using twenty different levels of association among the independent variables and seven different levels of association between the true predictor and the response. We conclude that the random forest method is an effective classification tool when the goals of a study are to produce an accurate classifier and to provide insight regarding the discriminative ability of individual predictor variables. These goals are common in gene expression analysis; we therefore apply the random forest method to estimate variable importance on a microarray data set.