2,392 research outputs found
Multilabel Consensus Classification
In the era of big data, a large amount of noisy and incomplete data can be
collected from multiple sources for prediction tasks. Combining multiple models
or data sources helps to counteract the effects of low data quality and the
bias of any single model or data source, and thus can improve the robustness
and the performance of predictive models. Out of privacy, storage and bandwidth
considerations, in certain circumstances one has to combine the predictions
from multiple models or data sources to obtain the final predictions without
accessing the raw data. Consensus-based prediction combination algorithms are
effective for such situations. However, current research on prediction
combination focuses on the single label setting, where an instance can have one
and only one label. Nonetheless, data nowadays are usually multilabeled, such
that more than one label have to be predicted at the same time. Direct
applications of existing prediction combination methods to multilabel settings
can lead to degenerated performance. In this paper, we address the challenges
of combining predictions from multiple multilabel classifiers and propose two
novel algorithms, MLCM-r (MultiLabel Consensus Maximization for ranking) and
MLCM-a (MLCM for microAUC). These algorithms can capture label correlations
that are common in multilabel classifications, and optimize corresponding
performance metrics. Experimental results on popular multilabel classification
tasks verify the theoretical analysis and effectiveness of the proposed
methods
A triple-random ensemble classification method for mining multi-label data
This paper presents a triple-random ensemble learning method for handling multi-label classification problems. The proposed method integrates and develops the concepts of random subspace, bagging and random k-label sets ensemble learning methods to form an approach to classify multi-label data. It applies the random subspace method to feature space, label space as well as instance space. The devised subsets selection procedure is executed iteratively. Each multi-label classifier is trained using the randomly selected subsets. At the end of the iteration, optimal parameters are selected and the ensemble MLC classifiers are constructed. The proposed method is implemented and its performance compared against that of popular multi-label classification methods. The experimental results reveal that the proposed method outperforms the examined counterparts in most occasions when tested on six small to larger multi-label datasets from different domains. This demonstrates that the developed method possesses general applicability for various multi-label classification problems.<br /
- …