Search CORE

377 research outputs found

Efficient multi-label classification for evolving data streams

Author: Bifet Albert
Holmes Geoffrey
Pfahringer Bernhard
Read Jesse
Publication venue: University of Waikato, Department of Computer Science
Publication date: 01/05/2010
Field of study

Many real world problems involve data which can be considered as multi-label data streams. Efficient methods exist for multi-label classification in non streaming scenarios. However, learning in evolving streaming scenarios is more challenging, as the learners must be able to adapt to change using limited time and memory. This paper proposes a new experimental framework for studying multi-label evolving stream classification, and new efficient methods that combine the best practices in streaming scenarios with the best practices in multi-label classification. We present a Multi-label Hoeffding Tree with multilabel classifiers at the leaves as a base classifier. We obtain fast and accurate methods, that are well suited for this challenging multi-label classification streaming task. Using the new experimental framework, we test our methodology by performing an evaluation study on synthetic and real-world datasets. In comparison to well-known batch multi-label methods, we obtain encouraging results

Research Commons@Waikato

Nuevas tendencias en fundamentos teóricos y aplicaciones de la minería de datos aplicada a la clasificación de textos en lenguaje natural

Author: Carmona Cejudo José María
Publication venue: 'Universidad de Malaga'
Publication date: 01/01/2013
Field of study

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional Universidad de Málaga

Recommended from our members

Combined supervised and unsupervised learning to identify subclasses of disease for better prediction

Author: Alsaid Alyousef Awad
Publication venue: Brunel University London
Publication date: 01/01/2022
Field of study

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University LondonDisease subtyping, which aids in the development of personalised treatments, remains a challenge in data analysis because of the many different ways to group patients based upon their data. However, if I can identify subclasses of disease, this will help to develop better models that are more specific to individuals and should therefore improve prediction and understanding of the underlying characteristics of the disease in question. In addition, patients might suffer from multiple disease complications. Models that are tailored to individuals could improve both prediction of multiple complications and understanding of underlying disease characteristics. However, AI models can become outdated over time due to either sudden changes in the underlying data, such as those caused by new measurement methods, or incremental changes, such as the ageing of the study population. This thesis proposes a new algorithm that integrates consensus clustering methods with classification in order to overcome issues with sample bias. The method was tested on a freely available dataset of real-world breast cancer cases and data from a London hospital on systemic sclerosis, a rare and potentially fatal condition. The results show that nearest consensus clustering classification improves accuracy and prediction significantly when this algorithm is compared with competitive similar methods. In addition, this thesis proposes a new algorithm that integrates latent class models with classification. The new algorithm uses latent class models to cluster patients within groups; this results in improved classification and aids in the understanding of the underlying differences of the discovered groups. The method was tested on data from patients with systemic sclerosis (SSc), a rare and potentially fatal condition, and coronary heart disease. Results show that the latent class multi-label classification (MLC) model improves accuracy when compared with competitive similar methods. Finally, this thesis implemented the updated concept drift method (DDM) to monitor AI models over time and detect drifts when they occur. The method was tested on data from patients with SSc and patients with coronavirus disease (COVID)

Brunel University Research Archive

A Novel Progressive Multi-label Classifier for Classincremental Data

Author: Dave Mihika
Er Meng Joo
Tapiawala Sahil
Venkatesan Rajasekar
Publication venue
Publication date: 22/09/2016
Field of study

In this paper, a progressive learning algorithm for multi-label classification to learn new labels while retaining the knowledge of previous labels is designed. New output neurons corresponding to new labels are added and the neural network connections and parameters are automatically restructured as if the label has been introduced from the beginning. This work is the first of the kind in multi-label classifier for class-incremental learning. It is useful for real-world applications such as robotics where streaming data are available and the number of labels is often unknown. Based on the Extreme Learning Machine framework, a novel universal classifier with plug and play capabilities for progressive multi-label classification is developed. Experimental results on various benchmark synthetic and real datasets validate the efficiency and effectiveness of our proposed algorithm.Comment: 5 pages, 3 figures, 4 table

arXiv.org e-Print Archive

Crossref