259 research outputs found

Versatile Decision Trees for Learning Over Multiple Contexts

Author: J Alcalá-Fdez
J Demšar
JG Moreno-Torres
JG Moreno-Torres
M Sugiyama
S Bickel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/08/2015

Crossref

Explore Bristol Research

A Genetic Tuning to Improve the Performance of Fuzzy Rule-Based Classification Systems with Interval-Valued Fuzzy Sets: Degree of Ignorance and Lateral Position

Author: A. Fernández
Akbarzadeh-Totonchi
Alcalá
Alcalá
Alcalá
Alcalá
Alcalá-Fdez
Alcalá-Fdez
Antonelli
Bustince
Bustince
Bustince
Casillas
Cazarez-Castro
Celikyilmaz
Cordón
Cordón
Cordón
Cortes
Coupland
delaOssa
Demšar
Deschrijver
F. Herrera
Fernández
Fernández
Gacto
García
García
García
Gorzałczany
H. Bustince
Herrera
Herrera
Herrera
Hidalgo
Ishibuchi
Ishibuchi
Ishibuchi
Ishibuchi
J. Sanz
Kaya
Liang
Luengo
Mansoori
Miller
Nojima
Palacios
Park
Sanz
Schaefer
Sheskin
Starczewski
Walker
Wu
Wu
Zarandi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

Fuzzy Rule-Based Systems are appropriate tools to deal with classification problems due to their good properties. However, they can suffer a lack of system accuracy as a result of the uncertainty inherent in the definition of the membership functions and the limitation of the homogeneous distribution of the linguistic labels. The aim of the paper is to improve the performance of Fuzzy Rule-Based Classification Systems by means of the Theory of Interval-Valued Fuzzy Sets and a post-processing genetic tuning step. In order to build the Interval-Valued Fuzzy Sets we define a new function called weak ignorance for modeling the uncertainty associated with the definition of the membership functions. Next, we adapt the fuzzy partitions to the problem in an optimal way through a cooperative evolutionary tuning in which we handle both the degree of ignorance and the lateral position (based on the 2-tuples fuzzy linguistic representation) of the linguistic labels. The experimental study is carried out over a large collection of data-sets and it is supported by a statistical analysis. Our results show empirically that the use of our methodology outperforms the initial Fuzzy Rule-Based Classification System. The application of our cooperative tuning enhances the results provided by the use of the isolated tuning approaches and also improves the behavior of the genetic tuning based on the 3-tuples fuzzy linguistic representation. Spanish Government TIN2008-06681-C06-01 TIN2010-1505

Elsevier - Publisher Connector

Crossref

Repositorio Institucional Universidad de Granada

Academica-e

A Sensitivity Analysis for Quality Measures of Quantitative Association Rules

Author: B. Alatas
B. Alatas
D. Li
J. Alcalá-Fdez
M. Martínez-Ballesteros
M. Martínez-Ballesteros
R. Pears
V. Pachón Álvarez
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2013

Several fitness function proposals combine weighted objectives to optimize the discovery of association rules. However, the measures used to assess the quality of the resulting rules can differ depending on the values of those weights. In such proposals, the user's choice of weights or coefficients for the optimized objectives is therefore very important. This work presents an analysis of the sensitivity of several quality measures when the weights included in the fitness function of the existing QARGA algorithm are modified. Finally, a comparative analysis of the results obtained for each weight setup is provided. MICYT TIN2011-28956-C02-00; Junta de Andalucía P11-TIC-752

Crossref

idUS. Depósito de Investigación Universidad de Sevilla

KEEL 3.0: an open source software for multi-stage analysis in data mining

Author: Alcalá-Fdez Jesús
del Jesús Maria José
Fernández Alberto
García Salvador
González Sergio
Herrera Francisco
Luengo Julian
Moyano Jose M.
Sánchez Luciano
Triguero Isaac
Publication venue: 'Atlantis Press'
Publication date: 01/01/2017

This paper introduces the 3rd major release of the KEEL Software. KEEL is an open source Java framework (GPLv3 license) that provides a number of modules to perform a wide variety of data mining tasks. It includes tools to perform data management, design of multiple kinds of experiments, statistical analyses, etc. This framework also contains KEEL-dataset, a data repository for multiple learning tasks featuring data partitions and algorithms' results over these problems. In this work, we describe the most recent components added to KEEL 3.0, including new modules for semi-supervised learning, multi-instance learning, imbalanced classification and subgroup discovery. In addition, a new interface in R has been incorporated to execute algorithms included in KEEL. These new features greatly improve the versatility of KEEL to deal with more modern data mining problems.

Nottingham ePrints

Nottingham eTheses

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repository@Nottingham

Repositorio Institucional de la Universidad de Oviedo

Directory of Open Access Journals

Repositorio Institucional Universidad de Granada

An ant colony-based semi-supervised approach for learning classification rules

Author: A Halder
AA Freitas
C Ginestet
C Hsu
D Angus
D Martens
F Otero
Fernando E. B. Otero
Gisele L. Pappa
I Triguero
J Alcalá-Fdez
J Wang
Julio Albinati
L Rokach
M Li
Samuel E. L. Oliveira
X Zhu
ZH Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/11/2015

Semi-supervised learning methods create models from a few labeled instances and a great number of unlabeled instances. They are a good option in scenarios where there is a lot of unlabeled data and the process of labeling instances is expensive, as is the case for most Web applications. This paper proposes a semi-supervised self-training algorithm called Ant-Labeler. Self-training algorithms take advantage of supervised learning algorithms to iteratively learn a model from the labeled instances and then use this model to classify unlabeled instances. The instances that receive labels with high confidence are moved from the unlabeled to the labeled set, and this process is repeated until a stopping criterion is met, such as labeling all unlabeled instances. Ant-Labeler uses an ACO algorithm as the supervised learning method in the self-training procedure to generate interpretable rule-based models, which are used as an ensemble to ensure accurate predictions. The pheromone matrix is reused across different executions of the ACO algorithm to avoid rebuilding the models from scratch every time the labeled set is updated. Results showed that the proposed algorithm obtains better predictive accuracy than three state-of-the-art algorithms in roughly half of the datasets on which it was tested, and the smaller the number of labeled instances, the better the Ant-Labeler performance.
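The generic self-training loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the Ant-Labeler algorithm: a one-dimensional nearest-centroid classifier stands in for the ACO rule learner, the confidence score and the `threshold` parameter are hypothetical, at least two classes are assumed, and neither the ensemble nor the pheromone-matrix reuse is modeled.

```python
from statistics import mean


def fit_centroids(labeled):
    """Stand-in supervised learner (an assumption): one centroid per class."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence) from the two nearest centroids.

    Confidence is a hypothetical margin score in [0, 1]:
    0 when x is equidistant from both centroids, 1 when x sits on one of them.
    """
    dists = sorted((abs(x - c), y) for y, c in centroids.items())
    (d1, label), (d2, _) = dists[0], dists[1]
    confidence = (d2 - d1) / (d1 + d2) if d1 + d2 > 0 else 0.0
    return label, confidence


def self_train(labeled, unlabeled, threshold=0.5, max_rounds=10):
    """Move confidently labeled instances into the labeled set, round by round."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        if not unlabeled:
            break  # stopping criterion: every instance has been labeled
        model = fit_centroids(labeled)
        confident, remaining = [], []
        for x in unlabeled:
            label, conf = predict_with_confidence(model, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # stopping criterion: no prediction is confident enough
        labeled.extend(confident)
        unlabeled = [x for x, _ in remaining]
    return fit_centroids(labeled), labeled, unlabeled


model, labeled, unlabeled = self_train(
    [(0.0, "a"), (10.0, "b")], [1.0, 9.0, 4.0, 6.0]
)
print(model, unlabeled)
```

With the default threshold, the two points near the class centroids are absorbed in the first round, while the two ambiguous points in the middle stay unlabeled.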

Crossref

Kent Academic Repository

EPRENNID: An evolutionary prototype reduction based ensemble for nearest neighbor classification of imbalanced data

Author: Alcalá-Fdez
Alpaydin
Barua
Batista
Blaszczynski
Breiman
Cano
Castro
Chawla
Chris Cornelis
Cover
Das
Datta
Demšar
Díez-Pastor
Fawcett
Friedman
Galar
García
García
García
García
García-Pedrajas
Hand
He
Hido
Isaac Triguero
Khoshgoftaar
Kononenko
Krawczyk
Krawczyk
Kuncheva
Lee
Lin
López
López
Neri
Pawlak
Ramentol
Sarah Vluymans
Schapire
Seiffert
Storn
Ting
Triguero
Triguero
Triguero
Triguero
Wang
Wilson
Wilson
Yijing
Yu
Yule
Yvan Saeys
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Classification problems with an imbalanced class distribution have received an increased amount of attention within the machine learning community over the last decade. They are encountered in a growing number of real-world situations and pose a challenge to standard machine learning techniques. We propose a new hybrid method specifically tailored to handle class imbalance, called EPRENNID. It performs an evolutionary prototype reduction focused on providing diverse solutions to prevent the method from overfitting the training set. It also allows us to explicitly reduce the underrepresented class, which the most common preprocessing solutions handling class imbalance usually protect. As part of the experimental study, we show that the proposed prototype reduction method outperforms state-of-the-art preprocessing techniques. The preprocessing step yields multiple prototype sets that are later used in an ensemble, performing a weighted voting scheme with the nearest neighbor classifier. EPRENNID is experimentally shown to significantly outperform previous proposals.

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

Ghent University Academic Bibliography

Feature based multivariate data imputation

Author: A Petrozziello
B Frènay
C Valdiviezo
CJ Willmott
CK Enders
I Jordanov
J Alcalá-Fdez
J Bartlett
J Cohen
J Osborne
JW Graham
M Gòmez-Carracedo
MC Lee
O Troyanskaya
P Schmitt
PA Whigham
S Oba
T Chai
X-Y Pan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2019

Crossref

Portsmouth University Research Portal (Pure)

Instance selection of linear complexity for big data

Over recent decades, database sizes have grown considerably. Larger sizes present new challenges, because machine learning algorithms are not prepared to process such large volumes of information. Instance selection methods can alleviate this problem when the size of the data set is medium to large. However, even these methods face similar problems with very large-to-massive data sets. In this paper, two new algorithms with linear complexity for instance selection purposes are presented. Both algorithms use locality-sensitive hashing to find similarities between instances. While the complexity of conventional methods (usually quadratic, O(n²), or log-linear, O(n log n)) means that they are unable to process large-sized data sets, the new proposal shows competitive results in terms of accuracy. Even more remarkably, it shortens execution time, as the proposal manages to reduce complexity and make it linear with respect to the data set size. The new proposal has been compared with some of the best known instance selection methods for testing and has also been evaluated on large data sets (up to a million instances). Supported by the Research Projects TIN 2011-24046 and TIN 2015-67534-P from the Spanish Ministry of Economy and Competitiveness.
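To illustrate how locality-sensitive hashing can give a single-pass, linear-time instance selection, here is a minimal sketch. It is not a reproduction of the two algorithms in the paper: the random-projection hash, the parameter names (`n_projections`, `bucket_width`, `seed`) and the rule of keeping the first instance per (bucket, class) pair are all assumptions made for the example.

```python
import math
import random


def make_lsh(dim, n_projections=4, bucket_width=1.0, seed=0):
    """Build a hypothetical random-projection hash for dim-dimensional points."""
    rng = random.Random(seed)
    projections = [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, bucket_width))
        for _ in range(n_projections)
    ]

    def hash_fn(x):
        # Nearby points tend to fall into the same quantized bucket
        # along every random direction.
        return tuple(
            math.floor(
                (sum(a * v for a, v in zip(direction, x)) + offset) / bucket_width
            )
            for direction, offset in projections
        )

    return hash_fn


def select_instances(data, hash_fn):
    """Single pass over the data: keep the first instance per (bucket, class)."""
    seen = set()
    kept = []
    for x, y in data:
        key = (hash_fn(x), y)
        if key not in seen:  # average O(1) set lookup, so the pass is O(n)
            seen.add(key)
            kept.append((x, y))
    return kept


data = [((0.0, 0.0), "a")] * 5 + [((0.0, 0.0), "b"), ((50.0, 50.0), "a")]
print(select_instances(data, make_lsh(dim=2)))
```

Each instance is hashed once and checked against a set, so the cost grows linearly with the number of instances rather than quadratically as with pairwise distance comparisons.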

Elsevier - Publisher Connector

Crossref

Repositorio Institucional de la Universidad de Burgos

Ensemble and fuzzy techniques applied to imbalanced traffic congestion datasets: a comparative study

Author: A Jurek
C Seiffert
D Mokeddem
D Pescaru
E Bauer
F Harandi
H Finner
J Alcala-Fdez
J Alcalá-Fdez
J Cervantes
J Otero
JR Quinlan
K Savetratanakaree
L Breiman
L Guo
L Rokach
LA Zadeh
M Antonelli
M Galar
M Jesus Del
M Lango
MJ Jesus Del
NV Chawla
P Lim
P Lopez-Garcia
S García
S Holm
S Kotsiantis
S Nama
S Wang
SB Kotsiantis
V López
Y Fang
Y Freund
Z Xu
Z Zhao
Publication date: 11/05/2018

Class imbalance is among the most persistent complications that the traditional supervised learning task may face in real-world applications. Among the different kinds of classification problems that have been studied in the literature, the imbalanced ones, particularly those that represent real-world problems, have attracted the interest of many researchers in recent years. To address these problems, different approaches have been used or proposed in the literature, among them soft computing and ensemble techniques. In this work, ensembles and fuzzy techniques have been applied to real-world traffic datasets in order to study their performance in imbalanced real-world scenarios. The KEEL platform is used to carry out this study. The results show that different ensemble techniques obtain the best results in the proposed datasets. Document type: Part of book or chapter of boo

Crossref

Scipedia

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Generalized additive and fuzzy models in environmental flow assessment: A comparison employing the West Balkan trout (Salmo farioides; Karaman, 1938)

Author: Ahmadi-Nedushan
Akaike
Alcalá-Fdez
Allouche
Anderson
Arthington
Austin
Ayllón
Ayllón
Beakes
Belgiorno
Benejam
Boavida
Casillas
Castro
Ch. Papadaki
Costa
Delling
Döll
E. Dimitriou
F. Martínez-Capel
Fukuda
Fukuda
Gibson
Guay
Hagen-Zanker
Hastie
Heggenes
Heggenes
Heggenes
Johnson
Jorde
Jowett
Jowett
Kalogeropoulos
Katopodis
Kottelat
L. Ntoanidis
Lamouroux
Li
Liaw
Lorenz
Maddock
Maggini
Mamdani
Martínez-Capel
Mathews
McClain
Mitaim
Mouton
Mouton
Mouton
Mouton
Mouton
Muñoz-Mas
Muñoz-Mas
Muñoz-Mas
Muñoz-Mas
Orth
Ovidio
Papadaki
Payne
Platts
R. Muñoz-Mas
Rincón
Riza
Rose
S. Zogaris
Schindler
Schneider
Strakosh
Sánchez-Hernández
Takagi
Tharme
Tomsic
Vismara
Visser
Waters
Wood
Wood
Wood
Woodward
Yao
Yi
Zadeh
Zika
Zogaris
Zogaris
Publication venue: 'Elsevier BV'
Publication date: 01/06/2016

Human activities have altered flow regimes, resulting in increased pressures and threats on river biota. Physical habitat simulation has been established as a standard approach among the methods for Environmental Flow Assessment (EFA). Traditionally, in EFA, univariate habitat suitability curves have been used to evaluate the habitat suitability at the microhabitat scale, whereas Generalized Additive Models (GAMs) and fuzzy logic are considered the most common multivariate approaches to do so. The assessment of the habitat suitability for three size classes of the West Balkan trout (Salmo farioides; Karaman, 1938) inferred with these multivariate approaches was compared at three different levels. First, the modelled patterns of habitat selection were compared by developing partial dependence plots. Then, the habitat assessment was compared in a spatially explicit manner by calculating the fuzzy kappa statistic. Finally, the habitat quantity and quality were compared broadly and at relevant flows under a hypothetical flow regulation, based on the Weighted Usable Area (WUA) vs. flow curves. The GAMs were slightly more accurate, and the WUA-flow curves demonstrated that they were more optimistic in the habitat assessment, with larger areas assessed with low to intermediate suitability (0.2 to 0.6). Nevertheless, both approaches coincided in the habitat assessment (the optimal areas were spatially coincident) and in the modelled patterns of habitat selection: large trout selected microhabitats with low flow velocity, large depth, coarse substrate and abundant cover. Medium-sized trout selected microhabitats with low flow velocity, middle-to-large depth, any kind of substrate but bedrock and some elements of cover. Finally, small trout selected microhabitats with low flow velocity, small depth, and light cover, avoiding only bedrock substrate. Furthermore, both approaches also rendered similar WUA-flow curves and coincided in the predicted increases and decreases of the WUA under the hypothetical flow regulation. Although, on an equal footing, GAMs performed slightly better, they do not automatically account for variable interactions. Conversely, fuzzy models do so and can be easily modified by experts to include new insights or to cover a wider range of environmental conditions. Therefore, as a consequence of the agreement between both approaches, we would advocate for combinations of GAMs and fuzzy models in fish-based EFA. This study was supported by the ECOFLOW project funded by the Hellenic General Secretariat of Research and Technology in the framework of the NSRF 2007-2013. We are grateful for field assistance of Dimitris Kommatas, Orfeas Triantafillou and Martin Palt and to Alcibiades N. Economou for assistance in discussions on trout biology and ecology. Muñoz Mas, R.; Papadaki, C.; Martinez-Capel, F.; Zogaris, S.; Ntoanidis, L.; Dimitriou, E. (2016). Generalized additive and fuzzy models in environmental flow assessment: A comparison employing the West Balkan trout (Salmo farioides; Karaman, 1938). Ecological Engineering. 91:365-377. doi:10.1016/j.ecoleng.2016.03.009

Crossref

RiuNet