4 research outputs found

    Computational techniques for prediction of effects of small molecules on model organisms

    Studying the response of model organisms exposed to chemicals can help us understand chemical activities, underlying biological processes and cell mechanisms. In this dissertation we designed a set of computational techniques for predicting the effect of chemical compounds on model organisms. As chemical descriptors we examined three annotation systems: QSAR-based descriptors, molecular fingerprints (presence of specific short fragments) and terms from the MeSH ontology. The use of MeSH terms is the distinctive feature of our approach. We developed a technique for computing MeSH term enrichment, which enabled us to identify subsets of chemicals containing a statistically significant proportion of chemicals with the target effect on the phenotype. To identify the most suitable chemical description, we also developed a method for evaluating different types of attribute-based chemical descriptions. We used support vector machines to predict the effect of chemical compounds. Using the developed methods, we analyzed data from an experiment in which the model organism D. discoideum was exposed to 1,045 different chemical compounds and relative growth inhibition was observed as the phenotype. In general, we were not able to predict the effects. However, when we split the chemicals into groups sharing a MeSH annotation term, we found terms for which our predictive procedures worked well. Results of the chemical description evaluation show that attributes based on MeSH terms are more suitable than QSAR-based descriptors and molecular fingerprints. We also identified 27 enriched (p < 0.02) MeSH terms, which define the same number of subsets with a statistically significant proportion of chemical compounds causing the observed phenotype in the model organism. The results confirm that the use of MeSH terms improves prediction of chemical impact on the model organism.
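
The abstract does not say which statistical test underlies the MeSH term enrichment. One common choice for this kind of analysis is the one-sided hypergeometric test, so the sketch below assumes it. It is an illustration, not the dissertation's actual procedure. The function name, its parameters and the toy counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, active, subset, active_in_subset):
    """Upper-tail hypergeometric probability P(X >= active_in_subset).

    population       -- number of chemicals screened
    active           -- chemicals in the population showing the target effect
    subset           -- chemicals annotated with the MeSH term under test
    active_in_subset -- annotated chemicals that show the target effect

    Assumption: enrichment is scored with a one-sided hypergeometric test;
    the source abstract does not name the test that was used.
    """
    total = comb(population, subset)
    upper = min(active, subset)
    # Sum the probabilities of drawing k active chemicals, for every k at
    # least as extreme as the observed count.
    tail = sum(
        comb(active, k) * comb(population - active, subset - k)
        for k in range(active_in_subset, upper + 1)
    )
    return tail / total


# Toy counts, invented for illustration: 10 chemicals, 5 of them active,
# and a term annotating 5 chemicals that are all active.
print(enrichment_p_value(10, 5, 5, 5))  # 1/252, roughly 0.004
```

A term would be reported as enriched when this value falls below the chosen threshold (the abstract uses p < 0.02). Testing many terms at once would normally also call for a multiple-testing correction, which this sketch leaves out.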

    Orange: data mining toolbox in Python

    Orange is a machine learning and data mining suite for data analysis through Python scripting and visual programming. Here we report on the scripting part, which features interactive data analysis and component-based assembly of data mining procedures. In the selection and design of components, we focus on the flexibility of their reuse: our principal intention is to let the user write simple and clear scripts in Python, which build upon C++ implementations of computationally intensive tasks. Orange is intended for experienced users and programmers as well as for students of data mining.