The Shapley Value of Classifiers in Ensemble Games
What is the value of an individual model in an ensemble of binary
classifiers? We answer this question by introducing a class of transferable
utility cooperative games called \textit{ensemble games}. In machine learning
ensembles, pre-trained models cooperate to make classification decisions. To
quantify the importance of models in these ensemble games, we define
\textit{Troupe} -- an efficient algorithm which allocates payoffs based on
approximate Shapley values of the classifiers. We argue that the Shapley value
of models in these games is an effective decision metric for choosing a high
performing subset of models from the ensemble. Our analytical findings prove
that our Shapley value estimation scheme is precise and scalable; its
performance increases with the size of the dataset and the ensemble. Empirical results
on real-world graph classification tasks demonstrate that our algorithm
produces high quality estimates of the Shapley value. We find that Shapley
values can be utilized for ensemble pruning, and that adversarial models
receive a low valuation. Complex classifiers are frequently found to be
responsible for both correct and incorrect classification decisions.
Comment: Source code is available here:
https://github.com/benedekrozemberczki/shaple
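To make the notion of a classifier's Shapley value concrete, here is a minimal sketch. It is not the Troupe algorithm from the abstract: it assumes a simplified game in which a coalition's value is the accuracy of its unweighted majority vote, and it computes exact Shapley values by enumerating all orderings, which is feasible only for very small ensembles. The model names, predictions, and labels are hypothetical.

```python
from itertools import permutations


def majority_accuracy(coalition, predictions, labels):
    """Accuracy of the coalition's majority vote; the empty coalition scores 0."""
    if not coalition:
        return 0.0
    correct = 0
    for i, y in enumerate(labels):
        votes = sum(predictions[m][i] for m in coalition)
        # Strict majority is needed to predict class 1; ties go to class 0.
        pred = 1 if 2 * votes > len(coalition) else 0
        correct += (pred == y)
    return correct / len(labels)


def exact_shapley(predictions, labels):
    """Average each model's marginal contribution over all join orders."""
    models = list(predictions)
    totals = {m: 0.0 for m in models}
    n_orders = 0
    for order in permutations(models):
        n_orders += 1
        coalition, prev = [], 0.0
        for m in order:
            coalition.append(m)
            cur = majority_accuracy(coalition, predictions, labels)
            totals[m] += cur - prev
            prev = cur
    return {m: t / n_orders for m, t in totals.items()}


# Hypothetical toy data: two reasonable models and one adversarial model.
labels = [1, 0, 1, 1]
predictions = {
    "a": [1, 0, 1, 1],    # always right
    "b": [1, 0, 0, 1],    # one mistake
    "adv": [0, 1, 0, 0],  # always wrong
}
phi = exact_shapley(predictions, labels)
grand_value = majority_accuracy(list(predictions), predictions, labels)
```

In this toy game the values sum to the accuracy of the full ensemble (the efficiency property of the Shapley value), and the adversarial model receives the lowest valuation, which mirrors the qualitative claim in the abstract.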
Optimization of the Regression Ensemble Size
Ensemble learning algorithms such as bagging often generate unnecessarily large models, which consume extra computational resources and may degrade the generalization ability. Pruning can potentially reduce ensemble size as well as improve performance; however, researchers have previously focused more on pruning classifiers than regressors. This is because, in general, ensemble pruning is based on two metrics: diversity and accuracy. Many diversity metrics are known for problems dealing with a finite set of classes defined by discrete labels. Therefore, most of the work on ensemble pruning is focused on such problems: classification, clustering, and feature selection. For the regression problem, it is much more difficult to introduce a diversity metric. In fact, the only such metric known to date is a correlation matrix based on regressor predictions. This study seeks to address this gap. First, we introduce the mathematical condition that allows checking whether the regression ensemble includes redundant estimators, i.e., estimators whose removal improves the ensemble performance. Developing this approach, we propose a new ambiguity-based pruning (AP) algorithm that is based on the error-ambiguity decomposition formulated for a regression problem. To check the quality of AP, we compare it with two methods that directly minimize the error by sequentially including and excluding regressors, as well as with the state-of-the-art Ordered Aggregation algorithm. Experimental studies confirm that the proposed approach reduces the size of the regression ensemble while simultaneously improving its performance, and that it surpasses all compared methods.
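The error-ambiguity decomposition mentioned above can be checked numerically. The sketch below is not the AP algorithm itself; it assumes a uniformly weighted averaging ensemble and a single target value, and verifies the standard identity that the ensemble's squared error equals the mean member squared error minus the mean ambiguity (the spread of members around the ensemble prediction). The function name and the numbers are hypothetical.

```python
def ambiguity_decomposition(member_preds, target):
    """Return (ensemble error, mean member error, mean ambiguity) for one point."""
    m = len(member_preds)
    ensemble = sum(member_preds) / m  # uniform averaging ensemble
    ensemble_error = (ensemble - target) ** 2
    mean_member_error = sum((f - target) ** 2 for f in member_preds) / m
    ambiguity = sum((f - ensemble) ** 2 for f in member_preds) / m
    return ensemble_error, mean_member_error, ambiguity


# Hypothetical predictions of three regressors for a target of 3.0.
ens_err, avg_err, amb = ambiguity_decomposition([2.0, 3.5, 5.0], target=3.0)
```

Because the ambiguity term is never negative, the ensemble error cannot exceed the mean member error; a pruning criterion can therefore ask whether dropping a member lowers the mean error by more than it lowers the ambiguity.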
Increasing Fairness in Compromise on Accuracy via Weighted Vote with Learning Guarantees
As bias in widely applied machine learning systems is taken more and more
seriously, researchers are troubled that increasing fairness in most cases
decreases accuracy. To address this problem, we
present a novel analysis of the expected fairness quality via weighted vote,
suitable for both binary and multi-class classification. The analysis takes the
correction of biased predictions by ensemble members into account and provides
learning bounds that are amenable to efficient minimisation. We further propose
a pruning method based on this analysis and the concepts of domination and
Pareto optimality, which is able to increase fairness under a prerequisite of
little or even no accuracy decline. The experimental results indicate that the
proposed learning bounds are faithful and that the proposed pruning method can
indeed increase ensemble fairness without much accuracy degradation.
Comment: 18 pages, 15 figures, and 6 tables
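The concepts of domination and Pareto optimality named in the abstract can be illustrated with a short sketch. This is not the authors' pruning method: it assumes each candidate sub-ensemble has already been scored on two objectives (accuracy to maximise, an unfairness measure to minimise) and simply filters out dominated candidates. The candidate names and scores are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a["accuracy"] >= b["accuracy"] and a["unfairness"] <= b["unfairness"]
    better = a["accuracy"] > b["accuracy"] or a["unfairness"] < b["unfairness"]
    return no_worse and better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical scores for the full ensemble and three pruned sub-ensembles.
candidates = [
    {"name": "full", "accuracy": 0.90, "unfairness": 0.20},
    {"name": "A", "accuracy": 0.89, "unfairness": 0.10},
    {"name": "B", "accuracy": 0.85, "unfairness": 0.15},
    {"name": "C", "accuracy": 0.91, "unfairness": 0.25},
]
front_names = [c["name"] for c in pareto_front(candidates)]
```

Here sub-ensemble B is dropped because A is both more accurate and fairer; choosing among the remaining candidates is a trade-off, for example picking A to gain fairness at a small accuracy cost.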
Möbius-strip-like columnar functional connections are revealed in somato-sensory receptive field centroids
Receptive fields of neurons in the forelimb region of areas 3b and 1 of primary somatosensory cortex, in cats and monkeys, were mapped using extracellular recordings obtained sequentially from nearly radial penetrations. Locations of the field centroids indicated the presence of a functional system in which cortical homotypic representations of the limb surfaces are entwined in three-dimensional Möbius-strip-like patterns of synaptic connections. Boundaries of somatosensory receptive fields in nested groups irregularly overlie the centroid order, and are interpreted as arising from the superposition of learned connections upon the embryonic order. Since the theory of embryonic synaptic self-organisation used to model these results was devised and earlier used to explain findings in primary visual cortex, the present findings suggest the theory may be of general application throughout cortex, and may reveal a modular functional synaptic system which, only in some parts of the cortex and in some species, is manifest as anatomical ordering into columns.
A Framework for Designing the Architectures of Deep Convolutional Neural Networks
Recent advances in Convolutional Neural Networks (CNNs) have obtained promising results in difficult deep learning tasks. However, the success of a CNN depends on finding an architecture to fit a given problem. Hand-crafting an architecture is a challenging, time-consuming process that requires expert knowledge and effort because of the large number of architectural design choices. In this article, we present an efficient framework that automatically designs a high-performing CNN architecture for a given problem. In this framework, we introduce a new optimization objective function that combines the error rate and the information learnt by a set of feature maps using deconvolutional networks (deconvnet). The new objective function allows the hyperparameters of the CNN architecture to be optimized in a way that enhances the performance by guiding the CNN through better visualization of learnt features via deconvnet. The actual optimization of the objective function is carried out via the Nelder-Mead Method (NMM). Further, our new objective function results in much faster convergence towards a better architecture. The proposed framework has the ability to explore a CNN architecture’s numerous design choices in an efficient way and also allows effective, distributed execution and synchronization via web services. Empirically, we demonstrate that the CNN architecture designed with our approach outperforms several existing approaches in terms of its error rate. Our results are also competitive with state-of-the-art results on the MNIST dataset and perform reasonably against the state-of-the-art results on the CIFAR-10 and CIFAR-100 datasets. Our approach plays a significant role in increasing the depth, reducing the size of strides, and constraining some convolutional layers not followed by pooling layers in order to find a CNN architecture that produces high recognition performance.
https://doi.org/10.3390/e1906024
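For readers unfamiliar with the Nelder-Mead method named in the abstract, the following is a simplified, derivative-free sketch of it. It is not the authors' framework: the objective here is a hypothetical smooth stand-in for the combined error-rate and feature-information objective, the hyperparameters are treated as continuous, and only reflection, expansion, inside contraction, and shrink steps are implemented.

```python
def nelder_mead(f, x0, step=0.5, iters=200):
    """Minimise f from x0 with a basic Nelder-Mead simplex search."""
    n = len(x0)
    # Initial simplex: x0 plus one point offset along each coordinate axis.
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)

    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the worst vertex and the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [[(b + x) / 2 for b, x in zip(best, p)]
                                    for p in simplex[1:]]
    return min(simplex, key=f)


def toy_objective(h):
    # Hypothetical stand-in objective with its minimum at (1, -2).
    return (h[0] - 1.0) ** 2 + (h[1] + 2.0) ** 2


solution = nelder_mead(toy_objective, [0.0, 0.0])
```

In a real architecture search each evaluation of the objective would involve training a network, so the number of function evaluations, not the simplex bookkeeping, dominates the cost.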