179 research outputs found

    Building ensembles of adaptive nested dichotomies with random-pair selection

    A system of nested dichotomies is a method of decomposing a multi-class problem into a collection of binary problems. Such a system recursively applies binary splits to divide the set of classes into two subsets, and trains a binary classifier for each split. Although ensembles of nested dichotomies with random structure have been shown to perform well in practice, a more sophisticated class subset selection method can be used to improve classification accuracy. We investigate an approach to this problem called random-pair selection, and evaluate its effectiveness compared to other published methods of subset selection. We show that our method outperforms other methods in many cases when forming ensembles of nested dichotomies, and is at least on par in all other cases. The software related to this paper is available at https://svn.cms.waikato.ac.nz/svn/weka/trunk/packages/internal/ensemblesOfNestedDichotomies/
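The recursive decomposition described in this abstract can be sketched as follows. This is a minimal illustration of a nested dichotomy with random structure only; it is not the random-pair selection method, whose details the abstract does not give. The names `build_random_nd`, `leaves`, `class_probs`, and the `p_left` callback (a stand-in for a trained binary classifier) are hypothetical, and class labels are assumed not to be tuples.

```python
import random


def build_random_nd(classes, rng):
    """Recursively split a list of class labels into a random binary tree.

    A leaf is a single class label; an internal node is a (left, right) tuple.
    """
    if len(classes) == 1:
        return classes[0]
    shuffled = list(classes)
    rng.shuffle(shuffled)
    cut = rng.randint(1, len(shuffled) - 1)  # keeps both subsets non-empty
    return (build_random_nd(shuffled[:cut], rng),
            build_random_nd(shuffled[cut:], rng))


def leaves(node):
    """Return the class labels below a node."""
    if not isinstance(node, tuple):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def class_probs(node, p_left, prob=1.0):
    """Multiply branch probabilities along each root-to-leaf path.

    p_left(left_classes, right_classes) is an assumed callback returning the
    probability that the instance belongs to the left subset at this node.
    """
    if not isinstance(node, tuple):
        return {node: prob}
    left, right = node
    p = p_left(frozenset(leaves(left)), frozenset(leaves(right)))
    out = class_probs(left, p_left, prob * p)
    out.update(class_probs(right, p_left, prob * (1.0 - p)))
    return out


tree = build_random_nd(["a", "b", "c", "d"], random.Random(0))
probs = class_probs(tree, lambda left, right: 0.5)  # dummy binary model
```

In a real system each internal node would hold a binary classifier trained on the instances of the classes below it, and an ensemble would average `class_probs` over several differently structured trees.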

    A study of hierarchical and flat classification of proteins

    Automatic classification of proteins using machine learning is an important problem that has received significant attention in the literature. One feature of this problem is that expert-defined hierarchies of protein classes exist and can potentially be exploited to improve classification performance. In this article we investigate empirically whether this is the case for two such hierarchies. We compare multi-class classification techniques that exploit the information in those class hierarchies and those that do not, using logistic regression, decision trees, bagged decision trees, and support vector machines as the underlying base learners. In particular, we compare hierarchical and flat variants of ensembles of nested dichotomies. The latter have been shown to deliver strong classification performance in multi-class settings. We present experimental results for synthetic, fold recognition, enzyme classification, and remote homology detection data. Our results show that exploiting the class hierarchy improves performance on the synthetic data, but not in the case of the protein classification problems. Based on this we recommend that strong flat multi-class methods be used as a baseline to establish the benefit of exploiting class hierarchies in this area

    Tree-structured multiclass probability estimators

    Nested dichotomies are used as a method of transforming a multiclass classification problem into a series of binary problems. A binary tree structure is constructed over the label space that recursively splits the set of classes into subsets, and a binary classification model learns to discriminate between the two subsets of classes at each node. Several distinct nested dichotomy structures can be built in an ensemble for superior performance. In this thesis, we introduce two new methods for constructing more accurate nested dichotomies. Random-pair selection is a subset selection method that aims to group similar classes together in a non-deterministic fashion to easily enable the construction of accurate ensembles. Multiple subset evaluation takes this, and other subset selection methods, further by evaluating several different splits and choosing the best performing one. Finally, we also discuss the calibration of the probability estimates produced by nested dichotomies. We observe that nested dichotomies systematically produce under-confident predictions, even if the binary classifiers are well calibrated, and especially when the number of classes is high. Furthermore, substantial performance gains can be made when probability calibration methods are also applied to the internal models

    Ensembles of nested dichotomies with multiple subset evaluation

    A system of nested dichotomies (NDs) is a method of decomposing a multiclass problem into a collection of binary problems. Such a system recursively applies binary splits to divide the set of classes into two subsets, and trains a binary classifier for each split. Many methods have been proposed to perform this split, each with various advantages and disadvantages. In this paper, we present a simple, general method for improving the predictive performance of NDs produced by any subset selection technique that employs randomness to construct the subsets. We provide a theoretical expectation for performance improvements, as well as empirical results showing that our method improves the root mean squared error of NDs, regardless of whether they are employed as an individual model or in an ensemble setting
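The thesis abstract listed earlier describes multiple subset evaluation as evaluating several different splits and choosing the best-performing one. A minimal sketch of that idea at a single node might look like the following. The function names, the uniform random splitter, and the `score` callback (lower is better, for example the training error of a binary model fitted to the split) are assumptions for illustration and are not taken from the paper.

```python
import random


def random_split(classes, rng):
    """Draw one random two-way split of the classes (both sides non-empty)."""
    shuffled = list(classes)
    rng.shuffle(shuffled)
    cut = rng.randint(1, len(shuffled) - 1)
    return frozenset(shuffled[:cut]), frozenset(shuffled[cut:])


def best_of_k_splits(classes, score, k, rng):
    """Generate k candidate splits and keep the one with the lowest score."""
    candidates = [random_split(classes, rng) for _ in range(k)]
    return min(candidates, key=score)


def imbalance(split):
    """Toy score for demonstration only: size difference between subsets."""
    left, right = split
    return abs(len(left) - len(right))


classes = ["a", "b", "c", "d", "e"]
left, right = best_of_k_splits(classes, imbalance, k=5, rng=random.Random(0))
```

With `k=1` this reduces to the underlying random selection method, so the evaluation step is a wrapper around whichever randomised splitter is in use.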

    Automated Machine Learning for Multi-Label Classification

    On calibration of nested dichotomies

    Nested dichotomies (NDs) are used as a method of transforming a multiclass classification problem into a series of binary problems. A tree structure is induced that recursively splits the set of classes into subsets, and a binary classification model learns to discriminate between the two subsets of classes at each node. In this paper, we demonstrate that these NDs typically exhibit poor probability calibration, even when the binary base models are well-calibrated. We also show that this problem is exacerbated when the binary models are poorly calibrated. We discuss the effectiveness of different calibration strategies and show that accuracy and log-loss can be significantly improved by calibrating both the internal base models and the full ND structure, especially when the number of classes is high
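One way to see why the number of classes matters is to look at the arithmetic of a single root-to-leaf path: the class probability is a product of branch probabilities, so it shrinks as the tree gets deeper. The sketch below illustrates that arithmetic only, under the assumptions of a balanced tree and a fixed probability of 0.9 for the correct branch at every node. It is not the paper's analysis, and `path_probability` is a hypothetical helper.

```python
import math


def path_probability(branch_probs):
    """Class probability as the product of branch probabilities on its path."""
    return math.prod(branch_probs)


# Assumed setting: balanced tree, so depth = log2(number of classes),
# and every binary model gives 0.9 to the correct branch.
for n_classes in (2, 4, 16, 64):
    depth = int(math.log2(n_classes))
    print(n_classes, round(path_probability([0.9] * depth), 4))
```

Even with confident binary models, the probability assigned to the correct class falls from 0.9 with two classes to roughly 0.66 with sixteen.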

    Improving human movement sensing with micro models and domain knowledge

    Human sensing is concerned with techniques for inferring information about humans from various sensing modalities. Examples of human sensing applications include human activity (or action) recognition, emotion recognition, tracking and localisation, identification, presence and motion detection, occupancy estimation, gesture recognition, and breath rate estimation. The first question addressed in this thesis is whether micro or macro models are a better design choice for human sensing systems. Micro models are models exclusively trained with data from a single entity, such as a Wi-Fi link, user, or other identifiable data-generating component. We consider micro and macro models in two human sensing applications, viz. Human Activity Recognition (HAR) from wearable inertial sensor data and device-free human presence detection from Wi-Fi signal data. The HAR literature is dominated by person-independent macro models. The few empirical studies that consider both micro and macro models evaluate them with either only one data-set or only one HAR algorithm, and report contradictory results. The device-free sensing literature is dominated by link-specific micro models, and the few papers that do use macro models do not evaluate their micro counterparts. Given the scarce and contradictory evidence, it remains an open question whether micro or macro models are a better design choice. We evaluate person-specific micro and person-independent macro models across seven HAR benchmark data-sets and four learning algorithms. We show that person-specific models (PSMs) significantly outperform the corresponding person-independent model (PIM) when evaluated with known users. To apply PSMs to data from new users, we propose ensembles of PSMs, which are improved by weighting their constituent PSMs according to their performance on other training users. We propose link-specific micro models to detect human presence from ambient Wi-Fi signal data. We select a link-specific model from the available training links, and show that this approach outperforms multi-link macro models. The second question addressed in this thesis is whether human sensing methods can be improved with domain knowledge. Specifically, we propose expert hierarchies (EHs) as an intuitive way to encode domain knowledge and simplify multi-class HAR, without negatively affecting predictive performance. The advantages of EHs are that they have lower time complexity than domain-agnostic methods and that their constituent classifiers are statistically independent. This property enables targeted tuning, and modular and iterative development of increasingly fine-grained HAR. Although this has inspired several uses of domain-specific hierarchical classification for HAR applications, these have been ad-hoc and without comparison to standard domain-agnostic methods. Therefore, it remains unclear whether they carry a penalty on predictive performance. We design five EHs and compare them to the best-known domain-agnostic methods. Our results show that EHs indeed can compete with more popular multi-class classification methods, both on the original multi-class problem and on the EHs' topmost levels
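The weighted ensembles of PSMs mentioned in this abstract could be combined by a weighted vote. The sketch below assumes that each person-specific model has already produced a label for the new user's sample, and that each weight is that model's accuracy on the other training users. The abstract does not specify the weighting scheme, so both the scheme and the name `weighted_vote` are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across models."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Three assumed PSMs: one strong model says "walk", two weaker ones say "run".
label = weighted_vote(["walk", "run", "run"], [0.9, 0.3, 0.4])
```

Here the single well-performing model outweighs the two weaker ones (0.9 against 0.7), whereas an unweighted majority vote would pick the other label.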