
9 research outputs found

Mining Frequent Distributions in Time Series

Author: Coutinho José Carlos
de Sá Cláudio Rebelo
Moreira João Mendes
Publication venue: Springer
Publication date: 01/01/2019

Crossref

University of Twente Research Information

Discovering a taste for the unusual: exceptional models for preference mining

Author: Alípio Mário Jorge
Arno Knobbe
Carlos Soares
Cláudio Rebelo de Sá
CR Sá de
E Hüllermeier
F Chiclana
F M Harper
J Chomicki
L Umek
M Leeuwen van
N Jin
N Lavrac
P Brazdil
Paulo Azevedo
PJ Azevedo
V Svendová
W Duivesteijn
WD Cook
Wouter Duivesteijn
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Exceptional preferences mining (EPM) is a crossover between two subfields of data mining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where some preference relations between labels significantly deviate from the norm. It is a variant of subgroup discovery, with rankings of labels as the target concept. We employ several quality measures that highlight subgroups featuring exceptional preferences, where the focus of what constitutes 'exceptional' varies with the quality measure: two measures look for exceptional overall ranking behavior, one measure indicates whether a particular label stands out from the rest, and a fourth measure highlights subgroups with unusual pairwise label ranking behavior. We explore a few datasets and compare with existing techniques. The results confirm that the new task EPM can deliver interesting knowledge. This research has received funding from the ECSEL Joint Undertaking, the framework programme for research and innovation Horizon 2020 (2014-2020) under Grant Agreement Number 662189-MANTIS-2014-1.
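As a rough illustration of what "deviating from the norm" could mean for label rankings, the sketch below builds a pairwise preference matrix for a subgroup and for the whole dataset and measures how far apart they are. The function names, the matrix encoding (+1, -1, 0) and the Euclidean distance are assumptions made for this example; the abstract does not specify the paper's actual quality measures.

```python
from math import sqrt


def preference_matrix(rankings):
    """Average pairwise preferences over a list of rankings.

    rankings[r][i] is the rank of label i in observation r (1 = most preferred).
    Entry [i][j] is +1 if label i is always ranked above j, -1 if always below.
    This encoding is an assumption for illustration.
    """
    n_labels = len(rankings[0])
    totals = [[0.0] * n_labels for _ in range(n_labels)]
    for ranks in rankings:
        for i in range(n_labels):
            for j in range(n_labels):
                if ranks[i] < ranks[j]:
                    totals[i][j] += 1.0
                elif ranks[i] > ranks[j]:
                    totals[i][j] -= 1.0
    return [[value / len(rankings) for value in row] for row in totals]


def deviation(subgroup, dataset):
    """Euclidean distance between subgroup and dataset preference matrices
    (a hypothetical stand-in for a quality measure)."""
    sub = preference_matrix(subgroup)
    full = preference_matrix(dataset)
    return sqrt(sum((a - b) ** 2
                    for row_s, row_f in zip(sub, full)
                    for a, b in zip(row_s, row_f)))


# Toy data: three labels, four observations; one observation reverses the order.
dataset = [(1, 2, 3), (1, 2, 3), (1, 2, 3), (3, 2, 1)]
subgroup = [(3, 2, 1)]
score = deviation(subgroup, dataset)
```

A subgroup whose rankings mirror the dataset scores 0, and the score grows as its pairwise preferences diverge. A practical quality measure would usually also account for subgroup size.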

Universidade do Minho: RepositoriUM

Repository TU/e

Crossref

Pure OAI Repository

Leiden University Scholarly Publications

Variance-Based Feature Importance in Neural Networks

Author: de Sá Cláudio Rebelo
Džeroski Sašo
Kralj Novak Petra
Šmuc Tomislav
Publication venue: Springer Science and Business Media LLC
Publication date: 16/10/2019

This paper proposes a new method to measure the relative importance of features in Artificial Neural Network (ANN) models. Its underlying principle assumes that the more important a feature is, the more the weights connected to the respective input neuron will change during the training of the model. To capture this behavior, a running variance of every weight connected to the input layer is measured during training. For that, an adaptation of Welford’s online algorithm for computing the online variance is proposed. When the training is finished, for each input, the variances of the weights are combined with the final weights to obtain the measure of relative importance for each feature. This method was tested with shallow and deep neural network architectures on several well-known classification and regression problems. The results obtained confirm that this approach makes meaningful measurements. Moreover, the results showed that the importance scores are highly correlated with the variable importance method from Random Forests (RF).
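The running-variance idea can be sketched with the standard form of Welford's online algorithm. The combination step below (variance times absolute final weight, summed per input and normalised) is an assumption made for illustration, since the abstract does not give the formula, and the names `RunningVariance` and `feature_importance` are hypothetical.

```python
class RunningVariance:
    """Welford's online algorithm for the variance of one scalar stream."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self):
        # Sample variance; defined as 0 until two values have been seen.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def feature_importance(weight_history):
    """weight_history[t][i][h]: weight from input i to hidden unit h at step t.

    Assumed combination: sum over h of variance * |final weight|, normalised.
    """
    n_inputs = len(weight_history[0])
    n_hidden = len(weight_history[0][0])
    trackers = [[RunningVariance() for _ in range(n_hidden)]
                for _ in range(n_inputs)]
    for snapshot in weight_history:
        for i in range(n_inputs):
            for h in range(n_hidden):
                trackers[i][h].update(snapshot[i][h])
    final = weight_history[-1]
    scores = [sum(trackers[i][h].variance * abs(final[i][h])
                  for h in range(n_hidden))
              for i in range(n_inputs)]
    total = sum(scores)
    return [s / total for s in scores] if total > 0 else scores


# Toy history: input 0's weights move during training, input 1's stay fixed.
history = [
    [[0.0, 0.0], [0.5, 0.5]],
    [[1.0, -1.0], [0.5, 0.5]],
    [[2.0, -2.0], [0.5, 0.5]],
]
importances = feature_importance(history)
```

In a real training loop the trackers would be updated after each epoch or batch rather than from a stored history, which is the point of using an online algorithm: no past weights need to be kept.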

An Ensemble of Autonomous Auto-Encoders for Human Activity Recognition

Author: Cardoso João M.P.
Carvalho Tiago
de Carvalho André C.P.L.F.
Dearo Garcia Kemilly
Kok Joost N.
Mendes-Moreira João
Poel Mannes
Rebelo de Sá Cláudio
Publication venue: Elsevier BV
Publication date: 07/06/2021

Human Activity Recognition is focused on the use of sensing technology to classify human activities and to infer human behavior. While traditional machine learning approaches use hand-crafted features to train their models, recent advancements in neural networks allow for automatic feature extraction. Auto-encoders are a type of neural network that can learn complex representations of the data and are commonly used for anomaly detection. In this work we propose a novel multi-class algorithm which consists of an ensemble of auto-encoders where each auto-encoder is associated with a unique class. We compared the proposed approach with other state-of-the-art approaches in the context of human activity recognition. Experimental results show that ensembles of auto-encoders can be efficient, robust and competitive. Moreover, this modular classifier structure allows for more flexible models. For example, the number of classes can be extended by including new auto-encoders, without the need to retrain the whole model.

University of Twente Research Information

Mining Frequent Distributions in Time Series

Author: Allmendinger Richard
Camacho David
Coutinho José Carlos
de Sá Cláudio Rebelo
Menezes Ronaldo
Moreira João Mendes
Tallón-Ballesteros Antonio J.
Tino Peter
Yin Hujun
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

Time series data is composed of observations of one or more variables over a time period. By analyzing the variability of the variables we can reveal patterns that repeat or that are correlated, which helps to understand the behaviour of the variables over time. Our method finds frequent distributions of a target variable in time series data and discovers relationships between frequent distributions in consecutive time intervals. The frequent distributions are found using a new method, and relationships between them are found using association rule mining.
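To make the second step concrete, the sketch below derives simple "A is followed by B" rules from a sequence of per-interval distribution labels, using the usual support and confidence definitions. It assumes each time interval has already been assigned a distribution label (the abstract does not describe how that is done), and the function name, thresholds and labels are hypothetical.

```python
from collections import Counter


def consecutive_rules(labels, min_support=0.2, min_confidence=0.6):
    """Return rules (antecedent, consequent, support, confidence) between
    the labels of consecutive intervals, sorted by confidence.

    labels: one distribution label per time interval, in time order
    (an assumed input format).
    """
    pairs = list(zip(labels, labels[1:]))
    if not pairs:
        return []
    pair_counts = Counter(pairs)
    antecedent_counts = Counter(first for first, _ in pairs)
    rules = []
    for (first, second), count in pair_counts.items():
        support = count / len(pairs)
        confidence = count / antecedent_counts[first]
        if support >= min_support and confidence >= min_confidence:
            rules.append((first, second, support, confidence))
    # Highest confidence first; ties broken by support, then by label names.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


# Toy sequence of interval labels.
interval_labels = ["low", "high", "low", "high", "low", "low", "high"]
rules = consecutive_rules(interval_labels)
```

On this toy sequence the rule "high is followed by low" has confidence 1.0 and "low is followed by high" has confidence 0.75, while "low is followed by low" falls below the support threshold and is dropped.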

Ensemble Clustering for Novelty Detection in Data Streams

Author: Aggarwal Charu C.
de Carvalho André C.P.L.F.
de Faria Elaine Ribeiro
de Sá Cláudio Rebelo
Džeroski Sašo
Garcia Kemilly Dearo
Kok Joost N.
Kralj Novak Petra
Mendes-Moreira João
Šmuc Tomislav
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

In data streams, new classes can appear over time due to changes in the statistical distribution of the data. Consequently, models can become outdated, which requires the use of incremental learning algorithms capable of detecting and learning the changes over time. However, when a single classification model is used for novelty detection, there is a risk that its bias may not be suitable for new data distributions. A solution could be the combination of several models into an ensemble. Moreover, because models can only be updated when labeled data arrives, we propose two unsupervised ensemble approaches: one combining clustering partitions using the same clustering technique, and another using different clustering techniques. We compare the performance of the proposed methods with well-known novelty detection algorithms. The methods were tested on datasets commonly used in the novelty detection literature. The experimental results show that the proposed ensembles have competitive performance for novelty detection in data streams.

Providing proactiveness: Data analysis techniques portfolios

Pure OAI Repository