
    Algebraic Comparison of Partial Lists in Bioinformatics

    The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling the biological phenotype through a classification or regression model. Because of resampling protocols, or within a meta-analysis comparison, one often obtains not a single list but a set of alternative feature lists, possibly of different lengths. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms that evaluate stability for lists embedded in the full feature set or restricted to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset.
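
    The following is a minimal sketch of the kind of computation involved, assuming a Canberra-style distance between rank vectors and a simple averaging convention for features missing from a partial list; the function names and that convention are illustrative, not the paper's exact algebraic construction.

```python
from itertools import combinations

def ranks_from_partial(partial, all_features):
    """Assign rank i+1 to the i-th feature of a partial list; features absent
    from the list share the average of the remaining ranks k+1, ..., p."""
    k, p = len(partial), len(all_features)
    tail_rank = (k + 1 + p) / 2.0                    # average of the unused ranks
    ranks = {f: tail_rank for f in all_features}
    for i, f in enumerate(partial):
        ranks[f] = i + 1
    return ranks

def canberra_rank_distance(list_a, list_b, all_features):
    """Canberra distance between the rank vectors induced by two partial lists."""
    ra = ranks_from_partial(list_a, all_features)
    rb = ranks_from_partial(list_b, all_features)
    return sum(abs(ra[f] - rb[f]) / (ra[f] + rb[f]) for f in all_features)

def list_stability(lists, all_features):
    """Mean pairwise distance: lower values mean more stable feature lists."""
    pairs = list(combinations(lists, 2))
    return sum(canberra_rank_distance(a, b, all_features) for a, b in pairs) / len(pairs)

# Toy usage: three partial gene lists drawn from a set of six features.
features = ["g1", "g2", "g3", "g4", "g5", "g6"]
lists = [["g1", "g2", "g3"], ["g2", "g1", "g4"], ["g1", "g3"]]
print(list_stability(lists, features))
```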

    Ranking and reliable classification


    A Critical Analysis of Variants of the AUC


    An Overview of Algorithmic Randomness and its Application to Reliable Instance Classification

    Machine-learning classifiers are difficult to apply in application domains where incorrect predictions can have serious consequences. In these domains, classifiers can be applied only if they guarantee reliable predictions. The transductive confidence machines framework makes it possible to extend classifiers so that they produce predictions complemented with a confidence value. The confidence value provides an upper bound on the error rate and can be defined prior to classification. Transductive confidence machines are based on difficult mathematical concepts such as algorithmic randomness and Martin-Löf randomness tests. In this report we explain these concepts in detail, integrate them to motivate transductive confidence machines, and review crucial theoretical and practical properties of transductive confidence machines.
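
    As an illustration of the transductive idea, the sketch below computes p-values by tentatively labelling a new instance and comparing its nonconformity score with those of the training examples; the nearest-neighbour nonconformity measure and all helper names are assumptions made for the example, not the report's own construction.

```python
import numpy as np

def nonconformity(x, y, X, Y):
    """Nearest-neighbour nonconformity: distance to the nearest example of the
    same class divided by the distance to the nearest other-class example."""
    d = np.linalg.norm(X - x, axis=1)
    same = d[(Y == y) & (d > 0)]          # exclude the instance itself
    other = d[Y != y]
    return same.min() / max(other.min(), 1e-12)

def tcm_predict(x_new, X, Y, epsilon=0.05):
    """Return every label whose transductive p-value exceeds epsilon; under
    exchangeability the true label is excluded at most a fraction epsilon of the time."""
    region = {}
    for y in np.unique(Y):
        X_ext = np.vstack([X, x_new])     # tentatively add the new instance ...
        Y_ext = np.append(Y, y)           # ... with the candidate label y
        scores = np.array([nonconformity(X_ext[i], Y_ext[i], X_ext, Y_ext)
                           for i in range(len(Y_ext))])
        p_value = np.mean(scores >= scores[-1])
        if p_value > epsilon:
            region[int(y)] = float(p_value)
    return region

# Toy usage: the new point lies near class 0, so label 0 gets a high p-value.
# A coarse epsilon is used because the toy training set is tiny.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
Y = np.array([0, 0, 1, 1])
print(tcm_predict(np.array([0.05, 0.1]), X, Y, epsilon=0.2))
```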

    The ROC isometrics approach to construct reliable classifiers

    We address the problem of applying machine-learning classifiers in domains where incorrect classifications have severe consequences. In these domains we propose to apply classifiers only when their performance can be defined by the domain expert prior to classification. The classifiers so obtained are called reliable classifiers. In the article we present three main contributions. First, we establish the effect on an ROC curve when ambiguous instances are left unclassified. Second, we propose the ROC isometrics approach to tune and transform a classifier in such a way that it becomes reliable. Third, we provide an empirical evaluation of the approach. From our analysis and experimental evaluation we may conclude that the ROC isometrics approach is an effective and efficient way to construct reliable classifiers. In addition, a discussion of related work clearly shows the benefits of the approach compared with existing approaches that also have the option to leave ambiguous instances unclassified.
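
    The sketch below illustrates the underlying abstention mechanism, assuming a scoring classifier and two hand-picked score thresholds; how the ROC isometrics approach actually derives the thresholds from a target performance level is not reproduced here, and all names are illustrative.

```python
import numpy as np

def classify_with_reject(scores, lower, upper):
    """Label instances as positive (1) or negative (0) and leave the
    ambiguous score band between lower and upper unclassified (-1)."""
    labels = np.full(len(scores), -1)
    labels[scores >= upper] = 1
    labels[scores <= lower] = 0
    return labels

def accepted_accuracy(labels, y_true):
    """Accuracy and coverage measured only on the classified instances."""
    mask = labels != -1
    coverage = mask.mean()
    accuracy = (labels[mask] == y_true[mask]).mean() if mask.any() else float("nan")
    return accuracy, coverage

# Toy usage: widening the reject band trades coverage for accuracy.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
scores = np.clip(0.5 * y + 0.25 + 0.2 * rng.normal(size=200), 0, 1)
for lower, upper in [(0.5, 0.5), (0.4, 0.6), (0.3, 0.7)]:
    labels = classify_with_reject(scores, lower, upper)
    print(lower, upper, accepted_accuracy(labels, y))
```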

    A Comparison of Two Approaches to Classify with Guaranteed Performance


    Why Fuzzy Decision Trees are Good Rankers
