
Learning Interpretable Rules for Multi-label Classification

Author: A Gabriel
AA Freitas
AJ Knobbe
B Liu
B Minnaert
D Malerba
E Gibaja
E Loza Mencía
E Montañés
F Charte
F Herrera
F Janssen
F Thabtah
G Bosc
G Tsoumakas
Grigorios Tsoumakas
H Allahyari
J Arunadevi
J Demšar
J Fürnkranz
J Han
J Hipp
J Read
JN Sulzmann
K Dembczyński
L Chekina
L Raedt De
LE Sucar
M Atzmüller
M Beckerle
M Friedman
M Zhang
Miltiadis Allamanis
MR Boutell
P Kralj Novak
PJ Hayes
R Senge
RM Cameron-Jones
Shantanu Godbole
W Duivesteijn
W Waegeman
WW Cohen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/11/2018

Multi-label classification (MLC) is a supervised learning problem in which, contrary to standard multiclass classification, an instance can be associated with several class labels simultaneously. In this chapter, we advocate a rule-based approach to multi-label classification. Rule learning algorithms are often employed when one is not only interested in accurate predictions, but also requires an interpretable theory that can be understood, analyzed, and qualitatively evaluated by domain experts. Ideally, by revealing patterns and regularities contained in the data, a rule-based theory yields new insights in the application domain. Recently, several authors have started to investigate how rule-based models can be used for modeling multi-label data. Discussing this task in detail, we highlight some of the problems that make rule learning considerably more challenging for MLC than for conventional classification. While mainly focusing on our own previous work, we also provide a short overview of related work in this area. Comment: Preprint version. To appear in: Explainable and Interpretable Models in Computer Vision and Machine Learning. The Springer Series on Challenges in Machine Learning. Springer (2018). See http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further information.
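
As an illustration of the distinction this abstract draws, the following is a minimal sketch of a rule-based multi-label predictor. The feature names, labels, and rules are hypothetical and are not taken from the chapter. The third rule, whose condition refers to an already predicted label, is an assumption about the kind of difficulty such models face; the abstract itself does not spell it out.

```python
from typing import Dict, Set

# Hypothetical instance type: boolean features keyed by name.
Instance = Dict[str, bool]


def predict_labels(x: Instance) -> Set[str]:
    """Apply a tiny hand-written rule list and return a SET of labels.

    In multiclass classification the result would be exactly one class;
    in multi-label classification any subset of the labels may be returned.
    """
    labels: Set[str] = set()
    # Rule 1 (hypothetical): condition uses input features only.
    if x["mentions_goal"]:
        labels.add("sports")
    # Rule 2 (hypothetical): condition uses input features only.
    if x["mentions_election"]:
        labels.add("politics")
    # Rule 3 (hypothetical, an assumption for illustration): the condition
    # also tests a label predicted by an earlier rule.
    if "sports" in labels and x["mentions_transfer_fee"]:
        labels.add("business")
    return labels


example = {
    "mentions_goal": True,
    "mentions_election": False,
    "mentions_transfer_fee": True,
}
print(sorted(predict_labels(example)))  # prints ['business', 'sports']
```

Each rule stays readable on its own, which is the interpretability argument; the cost is that rule order starts to matter once conditions may refer to labels.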

arXiv.org e-Print Archive

TUbiblio

Crossref

Ontology-based explanation of classifiers

Author: Catarci T.
Cima G.
Croce F.
Lenzerini M.
Publication venue: CEUR-WS
Publication date: 01/01/2020

The growing use of data mining and machine learning in many applications has brought new challenges related to classification. Here, we address one of them: how to interpret and understand the reasons behind a classifier's prediction. Understanding the behaviour of a classifier is widely recognized as an important prerequisite for the broad and safe adoption of machine learning and data mining technologies, especially in high-risk domains and when dealing with bias. We present preliminary work on a proposal to use the Ontology-Based Data Management paradigm to explain the behaviour of a classifier in terms of the concepts and relations that are meaningful in the classifier's domain.

Archivio della ricerca- Università di Roma La Sapienza

Machine Learning Techniques for Cervigram Image Analysis

Author: Xin Cheng
Publication venue: Lehigh Preserve

Machine learning has been widely used in recent decades to solve problems in many areas. In this work, we apply machine learning techniques to medical image analysis, specifically cervigram image analysis. Drawing on techniques from computer vision, we represent cervigram image data as a combination of a texture feature vector and a color feature vector. We treat the detection of the Cervical Intraepithelial Neoplasia (CIN) level as a classification problem and apply several popular machine learning classifiers to predict the categories. Using the receiver operating characteristic (ROC) curve as our performance measure, we carry out a comprehensive comparison of seven machine learning classification algorithms to see which are suitable models for this kind of problem. From our experiments, we conjecture that machine learning techniques can be a useful tool and that tree-ensemble models such as Random Forest, Gradient Boosting Decision Tree, and AdaBoost outperform the other algorithms on this task.
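
To make the ROC-based comparison concrete, here is a minimal sketch that computes the area under the ROC curve (AUC) from classifier scores using the pairwise-ranking formulation. The labels and scores are made-up toy values, not data from the thesis, and summarising the ROC curve by its AUC is an assumption; the abstract only says the ROC curve was the performance measure.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise-ranking formulation.

    AUC equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one, with
    ties counted as one half.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and scores from two hypothetical classifiers.
y = [0, 0, 1, 1]
scores_a = [0.1, 0.4, 0.35, 0.8]
scores_b = [0.2, 0.3, 0.6, 0.9]
print(roc_auc(y, scores_a))  # prints 0.75
print(roc_auc(y, scores_b))  # prints 1.0
```

The quadratic loop is fine for a sketch; a sort-based version would be preferable for large test sets.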

Lehigh University: Lehigh Preserve

Automatic detection of procedural knowledge in robotic-assisted surgical texts

Author: Diego Dall'Alba
Marco Bombieri
Marco Rospocher
Paolo Fiorini
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2021

Purpose: The automatic extraction of knowledge about intervention execution from surgical manuals would be of the utmost importance for developing expert surgical systems and assistants. In this work we assess the feasibility of automatically identifying the sentences of a surgical intervention text that contain procedural information, a subtask of the broader goal of extracting intervention workflows from surgical manuals. Methods: We frame the problem as a binary classification task. We first introduce a new public dataset of 1958 sentences from robotic surgery texts, manually annotated as procedural or non-procedural. We then apply different classification methods, from classical machine learning algorithms to more recent neural-network approaches and classification methods exploiting transformers (e.g., BERT, ClinicalBERT). We also analyze the benefits of applying balancing techniques to the dataset. Results: The neural-network architectures fed with FastText embeddings and the one based on ClinicalBERT outperform all other tested methods, empirically confirming the feasibility of the task. Adopting balancing techniques does not lead to substantial improvements in classification. Conclusion: This is the first work experimenting with machine and deep learning algorithms for automatically identifying procedural sentences in surgical texts. It also introduces the first public dataset that can be used for benchmarking different classification methods for the task.

PubMed Central

Catalogo dei prodotti della ricerca

Large-Scale Online Semantic Indexing of Biomedical Articles via an Ensemble of Multi-Label Classification Models

Author: Laliotis Manos
Markantonatos Nikos
Papanikolaou Yannis
Tsoumakas Grigorios
Vlahavas Ioannis
Publication date: 18/04/2017

Background: In this paper we present the approaches and methods employed to deal with a large-scale multi-label semantic indexing task of biomedical papers. This work was mainly implemented within the context of the BioASQ challenge of 2014. Methods: The main contribution of this work is a multi-label ensemble method that incorporates a McNemar statistical significance test in order to validate the combination of the constituent machine learning algorithms. Secondary contributions include a study of the temporal aspects of the BioASQ corpus (the observations also apply to BioASQ's superset, the PubMed articles collection) and the adaptation of the algorithms used to this challenging classification task. Results: The ensemble method we developed is compared to other approaches in experimental scenarios with subsets of the BioASQ corpus, giving positive results. During the BioASQ 2014 challenge we took first place in the first batch and third place in the two following batches. Our success in the BioASQ challenge showed that a fully automated machine-learning approach, which does not rely on any heuristics or rule-based components, can be highly competitive and outperform other approaches in similarly challenging contexts.
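
For readers unfamiliar with the McNemar test mentioned above, the following is a minimal sketch of its exact (binomial) form applied to two classifiers evaluated on the same examples. How the paper integrates the test into its ensemble is not described in the abstract, so this shows only the test itself; the function name and the toy correctness vectors are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple


def mcnemar_exact(correct_a: Sequence[bool],
                  correct_b: Sequence[bool]) -> Tuple[int, int, float]:
    """Exact two-sided McNemar test on paired correctness indicators.

    b counts examples that A gets right and B gets wrong, c the reverse.
    Under the null hypothesis that both classifiers have the same error
    rate, each discordant example falls in either group with
    probability 0.5, so min(b, c) follows a Binomial(b + c, 0.5) tail.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both classifiers must be scored on the same examples")
    b = sum(1 for a, o in zip(correct_a, correct_b) if a and not o)
    c = sum(1 for a, o in zip(correct_a, correct_b) if o and not a)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs: no evidence of a difference
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2.0 * tail)


# Toy data: 5 examples both get right, 1 only A gets right,
# 6 only B gets right, 2 both get wrong.
a_ok = [True] * 5 + [True] * 1 + [False] * 6 + [False] * 2
b_ok = [True] * 5 + [False] * 1 + [True] * 6 + [False] * 2
print(mcnemar_exact(a_ok, b_ok))  # prints (1, 6, 0.125)
```

Only the discordant pairs carry information here; examples on which both classifiers agree do not affect the p-value.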

arXiv.org e-Print Archive

Directory of Open Access Journals

An optimized multi-layer ensemble framework for sentiment analysis

Author: Alfred Rayner
Lai Po Hung
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

Public opinion plays an important role in decision-making tasks in various fields. Sentiment analysis is a key task in summarizing opinions, as it classifies opinion documents by sentiment group, positive or negative. Machine-learning-based classification is efficient and versatile. The ensemble concept is used to improve classification accuracy by combining the decisions of multiple classifiers. In this work, a framework for sentiment analysis is designed that extends the ensemble concept to all subtasks of machine learning classification in order to achieve better analysis. There are 3 subtasks in machine-learning-based sentiment analysis: feature extraction, feature selection, and classification. The ensemble concept is applied to all 3 subtasks by using different methods to perform each one and combining their results. Optimization is performed with a Genetic Algorithm to find the combination of methods that performs best. The proposed framework is tested on datasets from 4 different domains, and the sentiment analysis accuracy is shown to be very high. Future work includes testing the framework on other classification domains and with different optimization algorithms.
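
The idea of combining the decisions of multiple classifiers can be sketched with plain majority voting. This is one common combination scheme and an assumption here: the abstract does not say which scheme the framework uses, and the three keyword-based classifiers below are hypothetical stand-ins rather than the paper's methods. The Genetic Algorithm search over method combinations is not shown.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the label predicted by the largest number of classifiers."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]


# Three hypothetical stand-in classifiers (toy heuristics for illustration).
def lexicon_clf(text: str) -> str:
    return "positive" if "good" in text.lower() else "negative"


def negation_clf(text: str) -> str:
    return "negative" if "not" in text.lower().split() else "positive"


def exclaim_clf(text: str) -> str:
    return "positive" if text.endswith("!") else "negative"


def ensemble_predict(text: str,
                     classifiers: Sequence[Callable[[str], str]]) -> str:
    """Collect one prediction per classifier and combine them by majority vote."""
    return majority_vote([clf(text) for clf in classifiers])


voters = [lexicon_clf, negation_clf, exclaim_clf]
print(ensemble_predict("Good phone!", voters))       # prints positive
print(ensemble_predict("This is not good", voters))  # prints negative
```

Using an odd number of voters avoids ties in the two-class case; with an even number, a tie-breaking policy would have to be chosen explicitly.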

UMS Institutional Repository