
    KFHE-HOMER: A multi-label ensemble classification algorithm exploiting sensor fusion properties of the Kalman filter

    Multi-label classification allows a datapoint to be labelled with more than one class at the same time. In spite of their success in multi-class classification problems, ensemble methods based on approaches other than bagging have not been widely explored for multi-label classification problems. The Kalman Filter-based Heuristic Ensemble (KFHE) is a recent ensemble method that exploits the sensor fusion properties of the Kalman filter to combine several classifier models, and it has been shown to be very effective. This article proposes KFHE-HOMER, an extension of the KFHE ensemble approach to the multi-label domain. KFHE-HOMER sequentially trains multiple HOMER multi-label classifiers and aggregates their outputs using the sensor fusion properties of the Kalman filter. Experiments described in this article show that KFHE-HOMER performs consistently better than existing multi-label methods, including existing ensemble-based approaches. Comment: The paper is under consideration at Pattern Recognition Letters, Elsevier.
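    The sensor-fusion step at the heart of a KFHE-style ensemble can be sketched as a scalar Kalman measurement update applied to the score matrices produced by successive classifiers. The sketch below is illustrative only, not the authors' implementation; the variance values stand in for per-classifier error estimates, and the function name is invented for this example.

```python
import numpy as np

def kalman_fuse(predictions, variances):
    """Fuse per-instance label scores from several classifiers with
    scalar Kalman-filter measurement updates.

    predictions: list of (n_samples, n_labels) arrays of scores in [0, 1]
    variances:   one measurement-noise value per classifier, e.g. an
                 estimate of its error on a validation set
    """
    state = predictions[0].astype(float)   # initial state estimate
    p = float(variances[0])                # initial state variance
    for z, r in zip(predictions[1:], variances[1:]):
        k = p / (p + r)                    # Kalman gain
        state = state + k * (z - state)    # measurement update
        p = (1.0 - k) * p                  # variance update
    return state

# toy example: fuse three classifiers' scores for 2 samples, 3 labels
preds = [np.array([[0.7, 0.2, 0.6], [0.1, 0.9, 0.4]]),
         np.array([[0.6, 0.3, 0.7], [0.2, 0.8, 0.5]]),
         np.array([[0.8, 0.1, 0.5], [0.1, 0.7, 0.6]])]
print(kalman_fuse(preds, variances=[0.3, 0.2, 0.25]).round(2))
```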

    A triple-random ensemble classification method for mining multi-label data

    This paper presents a triple-random ensemble learning method for handling multi-label classification problems. The proposed method integrates and develops the concepts of the random subspace, bagging, and random k-label sets ensemble learning methods to classify multi-label data. It applies the random subspace method to the feature space, the label space, and the instance space. The devised subset selection procedure is executed iteratively, and each multi-label classifier is trained on the randomly selected subsets. At the end of the iterations, optimal parameters are selected and the ensemble of multi-label classifiers is constructed. The proposed method is implemented and its performance compared against that of popular multi-label classification methods. The experimental results reveal that the proposed method outperforms the examined counterparts on most occasions when tested on six small-to-large multi-label datasets from different domains, demonstrating that the developed method is applicable to a wide range of multi-label classification problems.
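    A rough sketch of the triple-random idea, sampling random subsets of features, labels, and instances for each base learner, might look like the following. The base learner (one-vs-rest logistic regression), the subset fractions, and the voting scheme are assumptions for illustration rather than the paper's actual configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)

def fit_triple_random(X, Y, n_estimators=10,
                      feat_frac=0.5, label_frac=0.5, inst_frac=0.7):
    """Each ensemble member sees a random subset of features, labels,
    and training instances (a simplified sketch of the idea)."""
    n, d = X.shape
    q = Y.shape[1]
    members = []
    for _ in range(n_estimators):
        feats = rng.choice(d, max(2, int(feat_frac * d)), replace=False)
        labels = rng.choice(q, max(2, int(label_frac * q)), replace=False)
        rows = rng.choice(n, max(2, int(inst_frac * n)), replace=False)
        clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
        clf.fit(X[np.ix_(rows, feats)], Y[np.ix_(rows, labels)])
        members.append((clf, feats, labels))
    return members

def predict_triple_random(members, X, q, threshold=0.5):
    """Average each label's votes over the members that saw it."""
    votes, counts = np.zeros((X.shape[0], q)), np.zeros(q)
    for clf, feats, labels in members:
        votes[:, labels] += clf.predict(X[:, feats])
        counts[labels] += 1
    return votes / np.maximum(counts, 1) >= threshold
```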

    An Ensemble Multilabel Classification for Disease Risk Prediction

    It is important to identify and prevent disease risk as early as possible through regular physical examinations. We formulate disease risk prediction as a multilabel classification problem. A novel Ensemble Label Power-set Pruned datasets Joint Decomposition (ELPPJD) method is proposed in this work. First, we transform the multilabel classification into a multiclass classification. Then, we propose the pruned datasets and joint decomposition methods to deal with the imbalanced learning problem. Two strategies, size balanced (SB) and label similarity (LS), are designed to decompose the training dataset. In the experiments, the dataset comes from real physical examination records. We contrast the performance of the ELPPJD method under the two decomposition strategies, and compare ELPPJD against the classic multilabel classification methods RAkEL and HOMER. The experimental results show that the ELPPJD method with the label similarity strategy has outstanding performance.
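    The label power-set transformation that turns the multilabel problem into a multiclass one, together with a crude flagging of rare label sets for pruning, can be sketched as follows. This is a simplified illustration; the joint decomposition and the SB/LS strategies of ELPPJD are not reproduced here, and the `min_count` threshold is an assumed parameter.

```python
from collections import Counter
import numpy as np

def label_powerset(Y, min_count=5):
    """Turn each row's label set into a single multiclass target and flag
    label sets seen fewer than `min_count` times as pruning candidates."""
    keys = [tuple(row) for row in np.asarray(Y, dtype=int)]
    counts = Counter(keys)
    class_of = {k: i for i, k in enumerate(sorted(counts))}
    y_multi = np.array([class_of[k] for k in keys])
    rare = {class_of[k] for k, c in counts.items() if c < min_count}
    return y_multi, class_of, rare

Y = np.array([[1, 0, 1], [1, 0, 1], [0, 1, 0], [1, 1, 1]])
y_multi, class_of, rare = label_powerset(Y, min_count=2)
print(y_multi, rare)   # multiclass targets and the ids of rare label sets
```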

    Multi-label learning by extended multi-tier stacked ensemble method with label correlated feature subset augmentation

    Classification is one of the basic and most important operations in data science and machine learning applications. Multi-label classification is an extension of the multi-class problem in which a set of class labels is associated with a particular instance at a time, whereas in a multi-class problem a single class label is associated with an instance. Although many stacked ensemble methods have been proposed, the complexity of multi-label problems leaves considerable scope for improving prediction accuracy. In this paper, we propose the novel extended multi-tier stacked ensemble (EMSTE) method, which selects label-correlated feature subsets and augments the intermediate dataset with them, improving prediction accuracy in the generalization phase of stacking. The performance of the proposed method has been compared with that of existing methods, and the results show that it outperforms them.
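    A minimal sketch of stacking with feature-subset augmentation is shown below: tier-1 out-of-fold predictions are concatenated with a chosen subset of the original features to build the intermediate dataset for the tier-2 learner. The learners, the cross-validation setup, and the assumption that `feat_subset` comes from a prior label-correlation analysis are illustrative, not the EMSTE specification.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.multiclass import OneVsRestClassifier

def fit_stacked(X, Y, feat_subset):
    """Tier-1 out-of-fold predictions, augmented with a chosen feature
    subset, form the intermediate dataset for the tier-2 learner."""
    base = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    Z = cross_val_predict(base, X, Y, cv=3)        # out-of-fold predictions
    meta_X = np.hstack([Z, X[:, feat_subset]])     # feature subset augmentation
    meta = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(meta_X, Y)
    return base.fit(X, Y), meta

def predict_stacked(base, meta, X, feat_subset):
    Z = base.predict(X)
    return meta.predict(np.hstack([Z, X[:, feat_subset]]))
```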

    Large scale biomedical texts classification: a kNN and an ESA-based approaches

    With the large and increasing volume of textual data, automated methods for identifying significant topics with which to classify textual documents have received growing interest. While many efforts have been made in this direction, the task remains a real challenge. Moreover, the issue is even more complex because full texts are not always freely available, so annotating these documents using only partial information is promising but remains a very ambitious goal. Methods: We propose two classification methods: a k-nearest neighbours (kNN)-based approach and an explicit semantic analysis (ESA)-based approach. Although the kNN-based approach is widely used in text classification, it needs to be improved to perform well in this specific classification problem, which deals with partial information. Compared to existing kNN-based methods, our method uses classical machine learning (ML) algorithms for ranking the labels. Additional features are also investigated in order to improve the classifiers' performance, and several learning algorithms are combined with various techniques for selecting the number of relevant topics. On the other hand, ESA seems promising for this classification task, as it has yielded interesting results in related problems such as semantic relatedness computation between texts and text classification. Unlike existing works, which use ESA to enrich the bag-of-words approach with additional knowledge-based features, our ESA-based method builds a standalone classifier. Furthermore, we investigate whether the results of this method could be useful as a complementary feature of our kNN-based approach. Results: Experimental evaluations performed on large standard annotated datasets, provided by the BioASQ organizers, show that the kNN-based method with the Random Forest learning algorithm achieves good performance compared with current state-of-the-art methods, reaching a competitive f-measure of 0.55%, while the ESA-based approach surprisingly yielded more modest results. Conclusions: We have proposed simple classification methods suitable for annotating textual documents using only partial information. They are therefore adequate for large-scale multi-label classification, particularly in the biomedical domain. Our work thus contributes to the extraction of relevant information from unstructured documents in order to facilitate their automated processing, and could be used for various purposes, including document indexing and information retrieval. Comment: Journal of Biomedical Semantics, BioMed Central, 201
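    The kNN side of such a pipeline can be sketched as ranking candidate labels by similarity-weighted votes from a document's nearest neighbours, using only titles or other partial text. This is a simplified stand-in for the paper's method; the TF-IDF representation, the cosine metric, and the fixed `top_n` cut-off are assumptions, whereas the paper learns the label ranking and the number of relevant topics with ML algorithms.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

def knn_label_ranking(train_texts, train_label_sets, test_texts, k=10, top_n=5):
    """Rank candidate labels for each test document from the labels of its
    k nearest training neighbours, weighted by cosine similarity."""
    vec = TfidfVectorizer(stop_words="english")
    X_train = vec.fit_transform(train_texts)
    X_test = vec.transform(test_texts)
    nn = NearestNeighbors(n_neighbors=k, metric="cosine").fit(X_train)
    dist, idx = nn.kneighbors(X_test)
    ranked_labels = []
    for neighbours, d in zip(idx, dist):
        scores = {}
        for j, w in zip(neighbours, 1.0 - d):      # similarity weights
            for label in train_label_sets[j]:
                scores[label] = scores.get(label, 0.0) + w
        ranked = sorted(scores, key=scores.get, reverse=True)
        ranked_labels.append(ranked[:top_n])
    return ranked_labels
```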

    Gravitation Theory Based Model for Multi-Label Classification

    The past decade has witnessed the growing popularity of multi-label classification algorithms in fields such as text categorization, music information retrieval, and the classification of videos and medical proteins. In the meantime, methods based on the principle of universal gravitation have been used extensively in machine learning classification owing to their simplicity and high performance. In light of the above, this paper proposes a novel multi-label classification algorithm called the interaction and data gravitation-based model for multi-label classification (ITDGM). The algorithm replaces the interaction between two objects with the attraction between two particles. The author carries out a series of experiments on five multi-label datasets. The experimental results show that the ITDGM performs better than several well-known multi-label classification algorithms. The effect of the proposed model is assessed by the example-based F1-measure and the label-based micro F1-measure.
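    The data-gravitation principle can be illustrated by scoring each label with the total "pull" its training points exert on a query, with attraction proportional to mass over squared distance. The sketch below is a generic illustration of that idea, not the ITDGM algorithm itself; the uniform masses and the scoring rule are assumptions.

```python
import numpy as np

def gravitation_scores(X_train, Y_train, x, masses=None, eps=1e-6):
    """Score each label for query x by the total 'gravitational pull'
    of that label's training points: sum of m / (distance^2 + eps)."""
    if masses is None:
        masses = np.ones(len(X_train))
    d2 = np.sum((X_train - x) ** 2, axis=1) + eps
    pull = masses / d2                         # per-point attraction
    # aggregate attraction per label (Y_train is a binary indicator matrix)
    return (Y_train * pull[:, None]).sum(axis=0)

X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
Y_train = np.array([[1, 0], [1, 1], [0, 1]])
print(gravitation_scores(X_train, Y_train, np.array([0.9, 1.1])))
# higher score = stronger pull toward that label
```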

    Why do Sequence Signatures Predict Enzyme Mechanism? Homology versus Chemistry

    We identify, firstly, InterPro sequence signatures representing evolutionary relatedness and, secondly, signatures identifying specific chemical machinery. Thus, we predict the chemical mechanisms of enzyme-catalysed reactions from "catalytic" and "non-catalytic" subsets of InterPro signatures. We first scanned our 249 sequences with InterProScan and then used the MACiE database to identify those amino acid residues which are important for catalysis. The sequences were mutated in silico to replace these catalytic residues with glycine and then scanned again with InterProScan. Those signature matches from the original scan which disappeared on mutation were called "catalytic". Mechanism was predicted using all signatures, only the 78 "catalytic" signatures, or only the 519 "non-catalytic" signatures. The non-catalytic signatures gave results indistinguishable from those for the whole feature set, with a precision of 0.991 and a sensitivity of 0.970. The catalytic signatures alone gave less impressive predictive performance, with precision and sensitivity of 0.791 and 0.735, respectively. These results show that our successful prediction of enzyme mechanism is mostly by homology rather than by identifying catalytic machinery.
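    The "catalytic" versus "non-catalytic" split described above reduces to comparing the signature sets found before and after the in-silico mutation. The sketch below shows only that set logic plus a presence/absence feature matrix; the data structures are assumed, InterProScan and MACiE are external tools not reproduced here, and the commented Random Forest lines merely indicate how one subset would feed mechanism prediction.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def split_signatures(wildtype_hits, mutant_hits):
    """Signatures present in the wild-type scan but lost after the catalytic
    residues are mutated to glycine are labelled 'catalytic'; the remainder
    are 'non-catalytic'. Both arguments map a sequence id to the set of
    InterPro signature ids matched for that sequence."""
    catalytic, non_catalytic = set(), set()
    for seq, sigs in wildtype_hits.items():
        lost = sigs - mutant_hits.get(seq, set())
        catalytic |= lost
        non_catalytic |= sigs - lost
    return catalytic, non_catalytic

def signature_matrix(hits, signature_list):
    """Binary presence/absence features over a chosen signature subset."""
    return np.array([[1 if s in hits[seq] else 0 for s in signature_list]
                     for seq in sorted(hits)])

# Mechanism prediction from one subset (data loading not shown):
# X = signature_matrix(wildtype_hits, sorted(non_catalytic))
# clf = RandomForestClassifier(n_estimators=200).fit(X, mechanisms)
```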