480 research outputs found

A multi-label approach for diagnosis problems in energy systems using LAMDA algorithm

Author: Aguilar Castro José Lisandro
Quintero Gull Carlos
Rodríguez Moreno María Dolores
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/07/2022

2022 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 18-23 July 2022, Italy. In this paper, we propose a supervised multilabel algorithm called Learning Algorithm for Multivariate Data Analysis for Multilabel Classification (LAMDA-ML). This algorithm is based on the algorithms of the LAMDA family, in particular on the LAMDA-HAD (Higher Adequacy Grade) algorithm. Unlike previous algorithms in a multi-label context, LAMDA-ML is based on the Global Adequacy Degree (GAD) of an individual in multiple classes. In our proposal, we define a membership threshold (Gt) such that an individual is assigned to every class whose GAD value is above this threshold. To evaluate the performance of this proposal, a solar power generation dataset is used, with very encouraging results according to several multi-label metrics. European Commission; Agencia Estatal de Investigación; Junta de Comunidades de Castilla-La Mancha
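The threshold rule described in this abstract can be illustrated with a minimal sketch. It assumes the GAD values have already been computed by some other step; the function name `assign_labels`, the class names, and the numeric values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def assign_labels(gad_by_class: Dict[str, float], gt: float) -> List[str]:
    """Return every class whose GAD value is strictly above the threshold gt.

    gad_by_class maps a class name to the (precomputed) Global Adequacy
    Degree of one individual for that class. How the GAD is computed is
    outside the scope of this sketch.
    """
    return [label for label, gad in gad_by_class.items() if gad > gt]


# Hypothetical GAD values for a single individual (illustration only).
example_gads = {"fault_a": 0.82, "fault_b": 0.35, "fault_c": 0.67}

# With a threshold of 0.6, the individual receives two labels at once.
labels = assign_labels(example_gads, gt=0.6)
print(labels)
```

Raising `gt` yields fewer labels per individual and lowering it yields more, so in this sketch the threshold is the only parameter that controls how many classes an individual can belong to.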

e_Buah - Biblioteca Digital de la Universidad de Alcalá

Limitations of Transformers on Clinical Text Classification

Author: Alawad Mohammed
Coyle Linda
Doherty Jennifer
Durbin Eric B.
Gao Shang
Gounley John
Schaefferkoetter Noah
Stroup Antoinette
Tourassi Georgia D.
Wu Xiao-Cheng
Yoon Hong-Jun
Young Michael Todd
Publication venue: UKnowledge
Publication date: 26/02/2021

Bidirectional Encoder Representations from Transformers (BERT) and BERT-based approaches are the current state-of-the-art in many natural language processing (NLP) tasks; however, their application to document classification on long clinical texts is limited. In this work, we introduce four methods to scale BERT, which by default can only handle input sequences up to approximately 400 words long, to perform document classification on clinical texts several thousand words long. We compare these methods against two much simpler architectures -- a word-level convolutional neural network and a hierarchical self-attention network -- and show that BERT often cannot beat these simpler baselines when classifying MIMIC-III discharge summaries and SEER cancer pathology reports. In our analysis, we show that two key components of BERT -- pretraining and WordPiece tokenization -- may actually be inhibiting BERT's performance on clinical text classification tasks where the input document is several thousand words long and where correctly identifying labels may depend more on identifying a few key words or phrases than on understanding the contextual meaning of sequences of text.

PubMed Central

University of Kentucky

Combined supervised and unsupervised learning to identify subclasses of disease for better prediction

Author: Alsaid Alyousef Awad
Publication venue: Brunel University London
Publication date: 01/01/2022

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Disease subtyping, which aids in the development of personalised treatments, remains a challenge in data analysis because of the many different ways to group patients based upon their data. However, if I can identify subclasses of disease, this will help to develop better models that are more specific to individuals and should therefore improve prediction and understanding of the underlying characteristics of the disease in question. In addition, patients might suffer from multiple disease complications. Models that are tailored to individuals could improve both prediction of multiple complications and understanding of underlying disease characteristics. However, AI models can become outdated over time due to either sudden changes in the underlying data, such as those caused by new measurement methods, or incremental changes, such as the ageing of the study population. This thesis proposes a new algorithm that integrates consensus clustering methods with classification in order to overcome issues with sample bias. The method was tested on a freely available dataset of real-world breast cancer cases and data from a London hospital on systemic sclerosis, a rare and potentially fatal condition. The results show that nearest consensus clustering classification improves accuracy and prediction significantly when this algorithm is compared with competitive similar methods. In addition, this thesis proposes a new algorithm that integrates latent class models with classification. The new algorithm uses latent class models to cluster patients within groups; this results in improved classification and aids in the understanding of the underlying differences of the discovered groups. The method was tested on data from patients with systemic sclerosis (SSc), a rare and potentially fatal condition, and coronary heart disease.
Results show that the latent class multi-label classification (MLC) model improves accuracy when compared with competitive similar methods. Finally, this thesis implemented the updated concept drift method (DDM) to monitor AI models over time and detect drifts when they occur. The method was tested on data from patients with SSc and patients with coronavirus disease (COVID).

Brunel University Research Archive

Learning Interpretable Rules for Multi-label Classification

Author: A Gabriel
AA Freitas
AJ Knobbe
B Liu
B Minnaert
D Malerba
E Gibaja
E Loza Mencía
E Montañés
F Charte
F Herrera
F Janssen
F Thabtah
G Bosc
G Tsoumakas
Grigorios Tsoumakas
H Allahyari
J Arunadevi
J Demšar
J Fürnkranz
J Han
J Hipp
J Read
JN Sulzmann
K Dembczyński
L Chekina
L Raedt De
LE Sucar
M Atzmüller
M Beckerle
M Friedman
M Zhang
Miltiadis Allamanis
MR Boutell
P Kralj Novak
PJ Hayes
R Senge
RM Cameron-Jones
Shantanu Godbole
W Duivesteijn
W Waegeman
WW Cohen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2018

Multi-label classification (MLC) is a supervised learning problem in which, contrary to standard multiclass classification, an instance can be associated with several class labels simultaneously. In this chapter, we advocate a rule-based approach to multi-label classification. Rule learning algorithms are often employed when one is not only interested in accurate predictions, but also requires an interpretable theory that can be understood, analyzed, and qualitatively evaluated by domain experts. Ideally, by revealing patterns and regularities contained in the data, a rule-based theory yields new insights in the application domain. Recently, several authors have started to investigate how rule-based models can be used for modeling multi-label data. Discussing this task in detail, we highlight some of the problems that make rule learning considerably more challenging for MLC than for conventional classification. While mainly focusing on our own previous work, we also provide a short overview of related work in this area. Comment: Preprint version. To appear in: Explainable and Interpretable Models in Computer Vision and Machine Learning. The Springer Series on Challenges in Machine Learning. Springer (2018). See http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further information.

arXiv.org e-Print Archive

TUbiblio

Crossref

Multilabel classification via calibrated label ranking

Author: A. Elisseeff
B.-L. Lu
C. W. Coakley
C.-W. Hsu
D. D. Lewis
D. Price
E. Loza Mencía
Eneldo Loza Mencía
Eyke Hüllermeier
G. Salton
G. Tsoumakas
I. Tsochantaridis
J. Demšar
J. Fürnkranz
J. Putter
J. Rousu
Johannes Fürnkranz
K. Crammer
Klaus Brinker
M. R. Boutell
N. Weskamp
O. Dekel
Q. McNemar
R. A. Bradley
R. E. Schapire
S. Har-Peled
S. Knerr
S. Shalev-Shwartz
S.-H. Park
T. Gärtner
T. Hastie
U. H.-G. Kreßel
Y. Altun
Y. Yang
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref