Multilabel Prototype Generation for data reduction in K-Nearest Neighbour classification

Author: Alonso-Jiménez Pablo
Gallego Antonio-Javier
Serra Xavier
Valero-Mas Jose J.
Publication venue: 'Elsevier BV'
Publication date: 15/11/2022

Prototype Generation (PG) methods are typically considered for improving the efficiency of the k-Nearest Neighbour (kNN) classifier when tackling large corpora. Such approaches aim at generating a reduced version of the corpus without decreasing the classification performance when compared to the initial set. Despite their wide application in multiclass scenarios, very few works have addressed the proposal of PG methods for the multilabel space. In this regard, this work presents a novel adaptation of four multiclass PG strategies to the multilabel case. These proposals are evaluated with three multilabel kNN-based classifiers, 12 corpora comprising a varied range of domains and corpus sizes, and different noise scenarios artificially induced in the data. The results obtained show that the proposed adaptations significantly improve, in terms of both efficiency and classification performance, on the only reference multilabel PG work in the literature as well as on the case in which no PG method is applied, while also presenting statistically superior robustness in noisy scenarios. Moreover, these novel PG strategies allow prioritising either the efficiency or the efficacy criterion through their configuration depending on the target scenario, hence covering a wide area of the solution space not previously filled by other works.

This research was partially funded by the Spanish Ministerio de Ciencia e Innovación through the MultiScore (PID2020-118447RA-I00) and DOREMI (TED2021-132103A-I00) projects. The first author is supported by grant APOSTD/2020/256 from “Programa I+D+i de la Generalitat Valenciana”.

Repositorio Institucional de la Universidad de Alicante

Multilabel Prototype Generation for Data Reduction in k-Nearest Neighbour classification

Author: Alonso-Jiménez Pablo
Gallego Antonio Javier
Serra Xavier
Valero-Mas Jose J.
Publication date: 22/07/2022

Prototype Generation (PG) methods are typically considered for improving the efficiency of the k-Nearest Neighbour (kNN) classifier when tackling large corpora. Such approaches aim at generating a reduced version of the corpus without decreasing the classification performance when compared to the initial set. Despite their wide application in multiclass scenarios, very few works have addressed the proposal of PG methods for the multilabel space. In this regard, this work presents a novel adaptation of four multiclass PG strategies to the multilabel case. These proposals are evaluated with three multilabel kNN-based classifiers, 12 corpora comprising a varied range of domains and corpus sizes, and different noise scenarios artificially induced in the data. The results obtained show that the proposed adaptations significantly improve, in terms of both efficiency and classification performance, on the only reference multilabel PG work in the literature as well as on the case in which no PG method is applied, while also presenting statistically superior robustness in noisy scenarios. Moreover, these novel PG strategies allow prioritising either the efficiency or the efficacy criterion through their configuration depending on the target scenario, hence covering a wide area of the solution space not previously filled by other works.

arXiv.org e-Print Archive

Repositorio Institucional de la Universidad de Alicante

UPF Digital Repository

Fuzzy Modeling of Client Preference in Data-Rich Marketing Environments

Author: Kaymak U.
Setnes M.

Advances in computational methods have led, in the world of financial services, to huge databases of client and market information. In the past decade, various computational intelligence (CI) techniques have been applied in mining this data for obtaining knowledge and in-depth information about the clients and the markets. This paper discusses the application of fuzzy clustering in target selection from large databases for direct marketing (DM) purposes. Actual data from the campaigns of a large financial services provider are used as a test case. The results obtained with the fuzzy clustering approach are compared with those resulting from the current practice of using statistical tools for target selection.

Keywords: fuzzy clustering; direct marketing; client segmentation; fuzzy systems

Research Papers in Economics

Trends in Nearest Feature Classification for Face Recognition: Achievements and Perspectives

Author: Cé
Mauricio Orozco-Alzate
Publication venue: 'IntechOpen'
Publication date: 01/01/2009

IntechOpen

Enabling resource-awareness for in-network data processing in wireless sensor networks

Author: Gaber M.
Rohm U.
Tse Q.
Publication date: 01/01/2008

Portsmouth University Research Portal (Pure)

SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates

Author: Ahmetcik Emre
Curtarolo Stefano
Ghiringhelli Luca M.
Ouyang Runhai
Scheffler Matthias
Publication venue: 'American Physical Society (APS)'
Publication date: 27/06/2018

The lack of reliable methods for identifying descriptors - the sets of parameters capturing the underlying mechanisms of a materials property - is one of the key factors hindering efficient materials development. Here, we propose a systematic approach for discovering descriptors for materials properties, within the framework of compressed-sensing based dimensionality reduction. SISSO (sure independence screening and sparsifying operator) tackles immense and correlated feature spaces, and converges to the optimal solution from a combination of features relevant to the materials' property of interest. In addition, SISSO gives stable results even with small training sets. The methodology is benchmarked with the quantitative prediction of the ground-state enthalpies of octet binary materials (using ab initio data) and applied to the showcase example of predicting the metal/insulator classification of binaries (with experimental data). Accurate, predictive models are found in both cases. For the metal-insulator classification model, the predictive capability is tested beyond the training data: it rediscovers the available pressure-induced insulator->metal transitions and it allows for the prediction of yet unknown transition candidates, ripe for experimental validation. As a step forward with respect to previous model-identification methods, SISSO can become an effective tool for automatic materials development.

Comment: 11 pages, 5 figures, in press in Phys. Rev. Materials

arXiv.org e-Print Archive

MPG.PuRe