1,253 research outputs found

    Data complexity in machine learning

    Get PDF
    We investigate the role of data complexity in the context of binary classification problems. The universal data complexity is defined for a data set as the Kolmogorov complexity of the mapping enforced by the data set. It is closely related to several existing principles used in machine learning such as Occam's razor, the minimum description length, and the Bayesian approach. The data complexity can also be defined based on a learning model, which is more realistic for applications. We demonstrate the application of the data complexity in two learning problems, data decomposition and data pruning. In data decomposition, we illustrate that a data set is best approximated by its principal subsets which are Pareto optimal with respect to the complexity and the set size. In data pruning, we show that outliers usually have high complexity contributions, and propose methods for estimating the complexity contribution. Since in practice we have to approximate the ideal data complexity measures, we also discuss the impact of such approximations

    Increasing Fairness in Compromise on Accuracy via Weighted Vote with Learning Guarantees

    Full text link
    As the bias issue is being taken more and more seriously in widely applied machine learning systems, the decrease in accuracy in most cases deeply disturbs researchers when increasing fairness. To address this problem, we present a novel analysis of the expected fairness quality via weighted vote, suitable for both binary and multi-class classification. The analysis takes the correction of biased predictions by ensemble members into account and provides learning bounds that are amenable to efficient minimisation. We further propose a pruning method based on this analysis and the concepts of domination and Pareto optimality, which is able to increase fairness under a prerequisite of little or even no accuracy decline. The experimental results indicate that the proposed learning bounds are faithful and that the proposed pruning method can indeed increase ensemble fairness without much accuracy degradation.Comment: 18 pages, 15 figures, and 6 table

    Multiobjective Sparse Ensemble Learning by Means of Evolutionary Algorithms

    Get PDF
    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.Ensemble learning can improve the performance of individual classifiers by combining their decisions. The sparseness of ensemble learning has attracted much attention in recent years. In this paper, a novel multiobjective sparse ensemble learning (MOSEL) model is proposed. Firstly, to describe the ensemble classifiers more precisely the detection error trade-off (DET) curve is taken into consideration. The sparsity ratio (sr) is treated as the third objective to be minimized, in addition to false positive rate (fpr) and false negative rate (fnr) minimization. The MOSEL turns out to be augmented DET (ADET) convex hull maximization problem. Secondly, several evolutionary multiobjective algorithms are exploited to find sparse ensemble classifiers with strong performance. The relationship between the sparsity and the performance of ensemble classifiers on the ADET space is explained. Thirdly, an adaptive MOSEL classifiers selection method is designed to select the most suitable ensemble classifiers for a given dataset. The proposed MOSEL method is applied to well-known MNIST datasets and a real-world remote sensing image change detection problem, and several datasets are used to test the performance of the method on this problem. Experimental results based on both MNIST datasets and remote sensing image change detection show that MOSEL performs significantly better than conventional ensemble learning methods

    Beyond neural scaling laws: beating power law scaling via data pruning

    Full text link
    Widely observed neural scaling laws, in which error falls off as a power of the training set size, model size, or both, have driven substantial performance improvements in deep learning. However, these improvements through scaling alone require considerable costs in compute and energy. Here we focus on the scaling of error with dataset size and show how in theory we can break beyond power law scaling and potentially even reduce it to exponential scaling instead if we have access to a high-quality data pruning metric that ranks the order in which training examples should be discarded to achieve any pruned dataset size. We then test this improved scaling prediction with pruned dataset size empirically, and indeed observe better than power law scaling in practice on ResNets trained on CIFAR-10, SVHN, and ImageNet. Next, given the importance of finding high-quality pruning metrics, we perform the first large-scale benchmarking study of ten different data pruning metrics on ImageNet. We find most existing high performing metrics scale poorly to ImageNet, while the best are computationally intensive and require labels for every image. We therefore developed a new simple, cheap and scalable self-supervised pruning metric that demonstrates comparable performance to the best supervised metrics. Overall, our work suggests that the discovery of good data-pruning metrics may provide a viable path forward to substantially improved neural scaling laws, thereby reducing the resource costs of modern deep learning.Comment: Outstanding Paper Award @ NeurIPS 2022. Added github link to metric score

    A Diversity-Accuracy Measure for Homogenous Ensemble Selection

    Get PDF
    Several selection methods in the literature are essentially based on an evaluation function that determines whether a model M contributes positively to boost the performances of the whole ensemble. In this paper, we propose a method called DIversity and ACcuracy for Ensemble Selection (DIACES) using an evaluation function based on both diversity and accuracy. The method is applied on homogenous ensembles composed of C4.5 decision trees and based on a hill climbing strategy. This allows selecting ensembles with the best compromise between maximum diversity and minimum error rate. Comparative studies show that in most cases the proposed method generates reduced size ensembles with better performances than usual ensemble simplification methods
    • ā€¦
    corecore