4,462 research outputs found

A triple-random ensemble classification method for mining multi-label data

Author: Kouzani Abbas Z.
Nasierding Gulisong
Tsoumakas Grigorios
Publication venue: IEEE Computer Society
Publication date: 01/01/2010

This paper presents a triple-random ensemble learning method for handling multi-label classification problems. The proposed method integrates and extends the random subspace, bagging and random k-label sets ensemble learning methods to form an approach for classifying multi-label data. It applies the random subspace method to the feature space, the label space and the instance space. The devised subset selection procedure is executed iteratively, and each multi-label classifier is trained using the randomly selected subsets. At the end of the iteration, optimal parameters are selected and the ensemble of multi-label classifiers is constructed. The proposed method is implemented and its performance is compared against that of popular multi-label classification methods. The experimental results show that the proposed method outperforms the examined counterparts in most cases when tested on six multi-label datasets, ranging from small to large, from different domains. This demonstrates that the developed method is generally applicable to a variety of multi-label classification problems.

Deakin Research Online
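
The subset selection described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `draw_triple_random_subsets`, the sampling fractions, the label-set size `k_labels`, and the choice of bootstrap sampling for instances (as in bagging) are all assumptions made for illustration. Training a multi-label classifier on each subset and selecting optimal parameters are left out.

```python
import random


def draw_triple_random_subsets(n_instances, n_features, n_labels,
                               instance_frac=0.7, feature_frac=0.5,
                               k_labels=3, n_models=5, seed=0):
    """Draw one random (instances, features, labels) subset per ensemble member.

    All parameter names and default values are hypothetical.
    """
    rng = random.Random(seed)
    n_inst = max(1, round(instance_frac * n_instances))
    n_feat = max(1, round(feature_frac * n_features))
    k = min(k_labels, n_labels)

    subsets = []
    for _ in range(n_models):
        subsets.append({
            # Instance space: bootstrap sample (with replacement), as in bagging.
            "instances": [rng.randrange(n_instances) for _ in range(n_inst)],
            # Feature space: random subspace, sampled without replacement.
            "features": sorted(rng.sample(range(n_features), n_feat)),
            # Label space: a random k-label set, sampled without replacement.
            "labels": sorted(rng.sample(range(n_labels), k)),
        })
    return subsets


subsets = draw_triple_random_subsets(n_instances=100, n_features=20, n_labels=6)
for i, s in enumerate(subsets):
    print(i, len(s["instances"]), s["features"], s["labels"])
```

Each dictionary would then index the rows, columns, and label columns used to train one member of the ensemble.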

Triple random ensemble method for multi-label classification

Author: Kouzani Abbas Z.
Nasierding Gulisong
Tsoumakas Grigorios
Publication venue: 'Deakin University'
Publication date: 01/01/2010

Deakin Research Online

Empirical study of multi-label classification methods for image annotation and retrieval

Author: Kouzani Abbas Z.
Nasierding Gulisong
Publication venue: DICTA
Publication date: 01/01/2010

This paper presents an empirical study of multi-label classification methods and suggests which of them are effective for automatic image annotation applications. The study shows that the triple random ensemble multi-label classification algorithm (TREMLC) outperforms its counterparts, especially on the scene image dataset. The multi-label k-nearest neighbor (ML-kNN) and binary relevance (BR) learning algorithms perform well on the Corel image dataset. Based on the overall evaluation results, selected image examples are used to show the label prediction performance of the algorithms. This provides an indication of the suitability of different multi-label classification methods for automatic image annotation under different problem settings.

Deakin Research Online

Learning from Imbalanced Multi-label Data Sets by Using Ensemble Strategies

Author: Javidi Mohammad Masoud
Shamsezat Fatemeh
Publication venue: 'Faculty of Computer Science, Sriwijaya University'
Publication date: 18/02/2015

Multi-label classification is an extension of conventional classification in which a single instance can be associated with multiple labels. Problems of this type are ubiquitous in everyday life. For example, a movie can be categorized as action, crime, and thriller. Most multi-label classification algorithms are designed for balanced data and don't work well on imbalanced data. In real applications, however, most datasets are imbalanced. Therefore, we focus on improving multi-label classification performance on imbalanced datasets. In this paper, a state-of-the-art multi-label classification algorithm called IBLR_ML is employed. This algorithm combines the k-nearest neighbor and logistic regression algorithms. The logistic regression part of this algorithm is combined with two ensemble learning algorithms, Bagging and Boosting. The proposed approach is called IB-ELR. In this paper, for the first time, the ensemble bagging method with a stable learning algorithm as the base learner and imbalanced data sets as the training data is examined. Finally, to evaluate the proposed methods, they are implemented in Java. Experimental results show the effectiveness of the proposed methods. Keywords: Multi-label classification, Imbalanced data set, Ensemble learning, Stable algorithm, Logistic regression, Bagging, Boosting

ComEngApp-Journal

Computer Engineering and Applications Journal (ComEngApp, Universitas Sriwijaya)
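
The general idea of wrapping a logistic regression learner in bagging, as mentioned in the abstract above, can be sketched as follows. This is a generic illustration for a single label on toy data, not IBLR_ML or IB-ELR: it omits the k-nearest neighbor component entirely, and every function name, hyperparameter, and data value here is an assumption. For multi-label data, one such ensemble could be trained per label.

```python
import math
import random


def sigmoid(z):
    # Clamp the input to avoid overflow in math.exp.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Plain batch gradient descent; the last weight is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, x):
    # zip stops at the shorter sequence, so the bias w[-1] is added separately.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])


def fit_bagged(X, y, n_models=10, seed=0):
    """Train one logistic model per bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_logistic([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagged_proba(models, x):
    # Average the predicted probabilities of all ensemble members.
    return sum(predict_proba(w, x) for w in models) / len(models)


# Toy imbalanced data for one label: 15 negatives, 5 positives (made up).
X = [[-1.0 + 0.1 * i] for i in range(15)] + [[2.0 + 0.1 * i] for i in range(5)]
y = [0] * 15 + [1] * 5

models = fit_bagged(X, y)
print(bagged_proba(models, [0.0]), bagged_proba(models, [2.5]))
```

Averaging probabilities rather than taking a majority vote is one common way to combine bagged members; the abstract does not say which combination rule IB-ELR uses.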

CHIRPS: Explaining random forest classification

Author: Azad R. Muhammad Atif
Gaber Mohamed Medhat
Hatwell Julian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/06/2020

Modern machine learning methods typically produce “black box” models that are opaque to interpretation. Yet demand for them has been increasing in Human-in-the-Loop processes, that is, those processes that require a human agent to verify, approve or reason about the automated decisions before they can be applied. To facilitate this interpretation, we propose Collection of High Importance Random Path Snippets (CHIRPS), a novel algorithm for explaining random forest classification per data instance. CHIRPS extracts a decision path from each tree in the forest that contributes to the majority classification, and then uses frequent pattern mining to identify the most commonly occurring split conditions. A simple, conjunctive form rule is then constructed in which the antecedent terms are derived from the attributes that had the most influence on the classification. This rule is returned alongside estimates of the rule’s precision and coverage on the training data, along with counter-factual details. An experimental study involving nine data sets shows that classification rules returned by CHIRPS have a precision at least as high as the state of the art when evaluated on unseen data (0.91–0.99) and offer much greater coverage (0.04–0.54). Furthermore, CHIRPS uniquely controls against under- and over-fitting solutions by maximising novel objective functions that are better suited to the local (per instance) explanation setting.

Birmingham City University Open Access Repository

BCU Open Access
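
The first two steps described in the CHIRPS abstract above (keeping the paths of trees that voted with the majority, then finding commonly occurring split conditions) can be sketched in a heavily simplified form. This is not the CHIRPS algorithm itself: it counts single conditions instead of running full frequent pattern mining, it does not optimise any objective function, and the path representation, the `min_support` threshold, and the toy feature names are all hypothetical.

```python
from collections import Counter


def majority_paths(tree_votes):
    """tree_votes: list of (predicted_class, path) pairs, one per tree.

    A path is a list of (feature, operator, threshold) split conditions.
    Returns the majority class and the paths of the trees that voted for it.
    """
    counts = Counter(cls for cls, _ in tree_votes)
    majority = counts.most_common(1)[0][0]
    return majority, [path for cls, path in tree_votes if cls == majority]


def frequent_conditions(paths, min_support=0.6):
    """Keep conditions that appear in at least min_support of the paths."""
    counts = Counter(cond for path in paths for cond in set(path))
    n = len(paths)
    return [(cond, c / n) for cond, c in counts.most_common()
            if c / n >= min_support]


# Toy decision paths for one instance (made-up features and thresholds).
tree_votes = [
    ("spam", [("n_links", ">", 3), ("has_greeting", "<=", 0)]),
    ("spam", [("n_links", ">", 3), ("caps_ratio", ">", 0.5)]),
    ("ham", [("n_links", "<=", 3)]),
    ("spam", [("caps_ratio", ">", 0.5), ("n_links", ">", 3)]),
]

majority, paths = majority_paths(tree_votes)
rule = frequent_conditions(paths)
antecedent = " AND ".join(f"{f} {op} {t}" for (f, op, t), _ in rule)
print(f"IF {antecedent} THEN {majority}")
```

The surviving conditions are joined into a conjunctive antecedent; estimating the rule's precision and coverage on training data would be a separate step.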