480 research outputs found

    A multi-label approach for diagnosis problems in energy systems using LAMDA algorithm

    2022 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 18-23 July 2022, Italia. In this paper, we propose a supervised multi-label algorithm called Learning Algorithm for Multivariate Data Analysis for Multilabel Classification (LAMDA-ML). This algorithm builds on the LAMDA family of algorithms, in particular on LAMDA-HAD (Higher Adequacy Grade). Unlike previous algorithms in a multi-label context, LAMDA-ML is based on the Global Adequacy Degree (GAD) of an individual in multiple classes. In our proposal, we define a membership threshold (Gt) such that an individual is assigned to every class whose GAD value exceeds this threshold. To evaluate the proposal, we use a solar power generation dataset, with very encouraging results according to several multi-label metrics. European Commission; Agencia Estatal de Investigación; Junta de Comunidades de Castilla-La Mancha
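    The thresholding step described in this abstract can be illustrated with a minimal sketch. It assumes the GAD values for one individual have already been computed and are available as a mapping from class name to degree; the function name `assign_labels`, the class names, and the numeric values are hypothetical and do not come from the paper, and how LAMDA-ML computes the GAD itself is not shown.

```python
from typing import Dict, List


def assign_labels(gad: Dict[str, float], gt: float) -> List[str]:
    """Return every class whose GAD value is strictly above the threshold gt.

    `gad` maps a class name to the (already computed) Global Adequacy Degree
    of one individual in that class. Names and values are illustrative only.
    """
    return [label for label, degree in gad.items() if degree > gt]


# Hypothetical GAD values for a single individual (not taken from the paper).
gad_example = {"normal": 0.82, "shading": 0.64, "inverter_fault": 0.21}

# With Gt = 0.5, the individual is assigned to both classes above the threshold.
labels = assign_labels(gad_example, gt=0.5)
print(labels)
```

    Using a strict comparison is a choice made for this sketch; the abstract only states that GAD values above the threshold lead to assignment.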

    Limitations of Transformers on Clinical Text Classification

    Bidirectional Encoder Representations from Transformers (BERT) and BERT-based approaches are the current state of the art in many natural language processing (NLP) tasks; however, their application to document classification on long clinical texts is limited. In this work, we introduce four methods to scale BERT, which by default can only handle input sequences up to approximately 400 words long, to perform document classification on clinical texts several thousand words long. We compare these methods against two much simpler architectures, a word-level convolutional neural network and a hierarchical self-attention network, and show that BERT often cannot beat these simpler baselines when classifying MIMIC-III discharge summaries and SEER cancer pathology reports. In our analysis, we show that two key components of BERT, pretraining and WordPiece tokenization, may actually be inhibiting BERT's performance on clinical text classification tasks where the input document is several thousand words long and where correctly identifying labels may depend more on identifying a few key words or phrases than on understanding the contextual meaning of sequences of text.
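    To make the length constraint concrete, the sketch below splits a long document into consecutive windows of at most 400 words, the approximate limit quoted in the abstract. This is a generic illustration only: the abstract does not describe the four scaling methods, so this is not presented as any of them, and the function name `split_into_windows` and its default are assumptions.

```python
from typing import List


def split_into_windows(text: str, max_words: int = 400) -> List[str]:
    """Split text into consecutive, non-overlapping windows of at most max_words words.

    Whitespace splitting is a simplification; a real BERT pipeline counts
    WordPiece tokens rather than words.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# A synthetic 1,000-word document yields windows of 400, 400, and 200 words.
document = " ".join(["word"] * 1000)
windows = split_into_windows(document)
print([len(w.split()) for w in windows])
```

    Each window could then be classified separately and the results combined, but how to combine them is left open here.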

    Learning Interpretable Rules for Multi-label Classification

    Multi-label classification (MLC) is a supervised learning problem in which, contrary to standard multiclass classification, an instance can be associated with several class labels simultaneously. In this chapter, we advocate a rule-based approach to multi-label classification. Rule learning algorithms are often employed when one is not only interested in accurate predictions, but also requires an interpretable theory that can be understood, analyzed, and qualitatively evaluated by domain experts. Ideally, by revealing patterns and regularities contained in the data, a rule-based theory yields new insights in the application domain. Recently, several authors have started to investigate how rule-based models can be used for modeling multi-label data. Discussing this task in detail, we highlight some of the problems that make rule learning considerably more challenging for MLC than for conventional classification. While mainly focusing on our own previous work, we also provide a short overview of related work in this area. Comment: Preprint version. To appear in: Explainable and Interpretable Models in Computer Vision and Machine Learning. The Springer Series on Challenges in Machine Learning. Springer (2018). See http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further information.