189 research outputs found

    Inductive queries for a drug designing robot scientist

    It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it; and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments.

    S.cerevisiae Complex Function Prediction with Modular Multi-Relational Framework

    Proceedings of: 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2010, Córdoba, Spain, June 1-4, 2010. Determining the functions of genes is essential for understanding how metabolisms work and for trying to correct their malfunctions. Genes usually work in groups rather than in isolation, so functions should be assigned to gene groups and not to individual genes. Moreover, genetic knowledge involves many relations and changes frequently. Thus, a propositional ad-hoc approach is not appropriate for the gene group function prediction domain. We propose the Modular Multi-Relational Framework (MMRF), which approaches the problem from a relational and flexible point of view. The MMRF consists of several modules covering all the domain tasks involved (grouping, representing and learning using computational prediction techniques). A specific application is described, including a relational representation language, where each module of MMRF is individually instantiated and refined to obtain a prediction under specific given conditions. This research work has been supported by CICYT, TRA 2007-67374-C02-02 project and by the expert biological knowledge of the Structural Computational Biology Group in Spanish National Cancer Research Centre (CNIO). The authors would like to thank members of Tilde tool developer group in K.U.Leuven for providing their help and many useful suggestions. Published.

    Predicting gene function using hierarchical multi-label decision tree ensembles

    Background. S. cerevisiae, A. thaliana and M. musculus are well-studied organisms in biology and the sequencing of their genomes was completed many years ago. It is still a challenge, however, to develop methods that assign biological functions to the ORFs in these genomes automatically. Different machine learning methods have been proposed to this end, but it remains unclear which method is to be preferred in terms of predictive performance, efficiency and usability. Results. We study the use of decision tree based models for predicting the multiple functions of ORFs. First, we describe an algorithm for learning hierarchical multi-label decision trees. These can simultaneously predict all the functions of an ORF, while respecting a given hierarchy of gene functions (such as FunCat or GO). We present new results obtained with this algorithm, showing that the trees found by it exhibit clearly better predictive performance than the trees found by previously described methods. Nevertheless, the predictive performance of individual trees is lower than that of some recently proposed statistical learning methods. We show that ensembles of such trees are more accurate than single trees and are competitive with state-of-the-art statistical learning and functional linkage methods. Moreover, the ensemble method is computationally efficient and easy to use. Conclusions. Our results suggest that decision tree based methods are a state-of-the-art, efficient and easy-to-use approach to ORF function prediction.
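    The abstract does not say how a tree keeps its predictions consistent with the function hierarchy, or how the ensemble combines its trees. The sketch below shows one plausible reading, not the authors' implementation: each leaf stores the per-class fraction of its training examples, ensemble scores are the plain average over trees, and one threshold turns scores into labels. The toy hierarchy, class names and all function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical toy hierarchy (child -> parent); not taken from FunCat or GO.
PARENT: Dict[str, Optional[str]] = {"1": None, "1/1": "1", "1/2": "1", "2": None}
CLASSES: List[str] = list(PARENT)


def with_ancestors(labels: Iterable[str]) -> Set[str]:
    """Close a label set upward so every class also carries its ancestors."""
    closed: Set[str] = set()
    for label in labels:
        node: Optional[str] = label
        while node is not None:
            closed.add(node)
            node = PARENT[node]
    return closed


def leaf_prototype(examples: List[Set[str]]) -> Dict[str, float]:
    """Per-class fraction of the examples in one leaf that carry the class."""
    closed = [with_ancestors(e) for e in examples]
    return {c: sum(c in s for s in closed) / len(closed) for c in CLASSES}


def ensemble_average(prototypes: List[Dict[str, float]]) -> Dict[str, float]:
    """Assumed combination rule: average the per-class scores over all trees."""
    return {c: sum(p[c] for p in prototypes) / len(prototypes) for c in CLASSES}


def predict(scores: Dict[str, float], threshold: float) -> Set[str]:
    """Predict every class whose score reaches the (single, shared) threshold."""
    return {c for c in CLASSES if scores[c] >= threshold}


def respects_hierarchy(labels: Set[str]) -> bool:
    """True if every predicted class has its parent predicted as well."""
    return all(PARENT[c] is None or PARENT[c] in labels for c in labels)


# Leaves reached by one ORF in two hypothetical trees.
tree_a = leaf_prototype([{"1/1"}, {"1/2"}, {"1/1", "2"}])
tree_b = leaf_prototype([{"1/1"}, {"2"}])
scores = ensemble_average([tree_a, tree_b])
predicted = predict(scores, threshold=0.5)
```

    Because every training label set is closed upward, a child's score can never exceed its parent's, and averaging preserves that ordering. Under these assumptions, thresholding with a single shared value cannot predict a function without also predicting its parent.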

    An annotated checklist of bryophytes of Europe, Macaronesia and Cyprus

    Introduction. Following on from work on the European bryophyte Red List, the taxonomically and nomenclaturally updated spreadsheets used for that project have been expanded into a new checklist for the bryophytes of Europe. Methods. A steering group of ten European bryologists was convened, and over the course of a year, the spreadsheets were compared with previous European checklists, and all changes noted. Recent literature was searched extensively. A taxonomic system was agreed, and the advice and expertise of many European bryologists sought. Key results. A new European checklist of bryophytes, comprising hornworts, liverworts and mosses, is presented. Fifteen new combinations are proposed. Conclusions. This checklist provides a snapshot of the current European bryophyte flora in 2019. It will already be out-of-date on publication, and further research, particularly molecular work, can be expected to result in many more changes over the next few years. Peer reviewed.

    Using classification and regression tree modelling to investigate response shift patterns in dentine hypersensitivity

    BACKGROUND: Dentine hypersensitivity (DH) affects people's quality of life (QoL). However, changes in the internal meaning of QoL, known as response shift (RS), may undermine longitudinal assessment of QoL. This study aimed to describe patterns of RS in people with DH using Classification and Regression Trees (CRT) and to explore the convergent validity of CRT with the then-test and ideals approaches. METHODS: Data from an 8-week clinical trial of mouthwashes for dentine hypersensitivity (n = 75), using the Dentine Hypersensitivity Experience Questionnaire (DHEQ) as the outcome measure, were analysed. CRT was used to examine 8-week changes in DHEQ total score as a dependent variable with clinical status for DH and each DHEQ subscale score (restrictions, coping, social, emotional and identity) as independent variables. Recalibration was inferred when the clinical change was not consistent with the DHEQ change score using a minimally important difference for DHEQ of 22 points. Reprioritization was inferred by changes in the relative importance of each subscale to the model over time. RESULTS: Overall, 50.7% of participants experienced a clinical improvement in their DH after treatment and 22.7% experienced an important improvement in their quality of life. Thirty-six per cent shifted their internal standards downward and 14.7% upwards, suggesting recalibration. Reprioritization occurred over time among the social and emotional impacts of DH. CONCLUSIONS: CRT was a useful method to reveal both the types and the nature of RS in people with a mild health condition and demonstrated convergent validity with design-based approaches to detect RS.
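    The recalibration rule in METHODS can be read as a simple consistency check between two yes/no outcomes. The sketch below is a simplified illustration of that reading, not the study's CRT analysis. The abstract does not state which direction of the DHEQ score counts as better, so the caller is assumed to supply the change already oriented as points of improvement. The function name and return labels are hypothetical, and the sketch does not try to separate downward from upward shifts.

```python
MID = 22  # minimally important difference for the DHEQ total score (from the abstract)


def classify_change(clinically_improved: bool, qol_gain: float) -> str:
    """Compare clinical change with the QoL change judged against the MID.

    qol_gain is the 8-week DHEQ change expressed as points of improvement
    (assumption: the caller has already oriented the sign so positive = better).
    """
    qol_improved = qol_gain >= MID
    if clinically_improved == qol_improved:
        return "consistent"
    # Clinical status and reported QoL disagree: candidate recalibration.
    return "possible recalibration"


# Illustrative inputs only; these are not data from the trial.
examples = [
    classify_change(True, 30),   # improved clinically and in QoL
    classify_change(True, 5),    # improved clinically, QoL change below the MID
    classify_change(False, 25),  # no clinical improvement, QoL gain above the MID
]
```

    The two disagreeing cases are the ones such a rule would flag as candidate recalibration. Which of them corresponds to a downward or upward shift in internal standards is left open here because the abstract does not define it.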