52,329 research outputs found

An experimental study on rank methods for prototype selection

Author: Calvo-Zaragoza Jorge
Iñesta José M.
Rico-Juan Juan Ramón
Valero-Mas Jose J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Prototype selection is one of the most popular approaches for addressing the low efficiency typically found in the well-known k-Nearest Neighbour classification rule. These techniques select a representative subset from an original collection of prototypes with the premise of maintaining the same classification accuracy. Most recently, rank methods have been proposed as an alternative way to develop new selection strategies. Following a certain heuristic, these methods sort the elements of the initial collection according to their relevance and then select the best possible subset by means of a parameter representing the amount of data to maintain. Due to the relative novelty of these methods, their performance and competitiveness against other strategies are still unclear. This work performs an exhaustive experimental study of such methods for prototype selection. A representative collection of both classic and sophisticated algorithms is compared to the aforementioned techniques on a number of datasets, including different levels of induced noise. Results report the remarkable competitiveness of these rank methods as well as their excellent trade-off between prototype reduction and achieved accuracy.

This work has been supported by the Vicerrectorado de Investigación, Desarrollo e Innovación de la Universidad de Alicante through the FPU programme (UAFPU2014-5883), the Spanish Ministerio de Educación, Cultura y Deporte through a FPU Fellowship (Ref. AP2012-0939), the Spanish Ministerio de Economía y Competitividad through Project TIMuL (No. TIN2013-48152-C2-1-R, supported by UE FEDER funds) and the Consejería de Educación de la Comunidad Valenciana through project PROMETEO/2012/017.
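The rank-then-select idea described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' method: the voting heuristic (a prototype earns a vote when it is the nearest neighbour of a same-class element) and the names `rank_prototypes`, `select_prototypes` and `keep_fraction` are assumptions made for the example.

```python
import math


def rank_prototypes(points, labels):
    """Return prototype indices sorted from most to least relevant.

    Hypothetical heuristic (an assumption, not taken from the paper):
    a prototype earns one vote each time it is the nearest neighbour
    of another element that carries the same label.
    """
    votes = [0] * len(points)
    for i, p in enumerate(points):
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        if labels[nearest] == labels[i]:
            votes[nearest] += 1
    # sorted() is stable, so tied prototypes keep their original order.
    return sorted(range(len(points)), key=lambda j: -votes[j])


def select_prototypes(points, labels, keep_fraction):
    """Keep the top-ranked share of the collection (at least one element)."""
    ranking = rank_prototypes(points, labels)
    n_keep = max(1, round(keep_fraction * len(points)))
    return sorted(ranking[:n_keep])
```

Here `keep_fraction` plays the role of the parameter that represents the amount of data to maintain; the reduced set would then be used as the training set of a k-Nearest Neighbour classifier.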

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Efficient Data Representation by Selecting Prototypes with Importance Weights

Author: Aggarwal Charu
Cecchi Guillermo
Dhurandhar Amit
Gurumoorthy Karthik S.
Publication date: 12/08/2019

Prototypical examples that best summarize and compactly represent an underlying complex data distribution communicate meaningful insights to humans in domains where simple explanations are hard to extract. In this paper we present algorithms with strong theoretical guarantees to mine these data sets and select prototypes, a.k.a. representatives, that optimally describe them. Our work notably generalizes the recent work by Kim et al. (2016): in addition to selecting prototypes, we also associate with them non-negative weights that are indicative of their importance. This extension provides a single coherent framework under which both prototypes and criticisms (i.e. outliers) can be found. Furthermore, our framework works for any symmetric positive definite kernel, thus addressing one of the key open questions laid out in Kim et al. (2016). By establishing that our objective function enjoys the key property of weak submodularity, we present a fast ProtoDash algorithm and also derive approximation guarantees for it. We demonstrate the efficacy of our method on diverse domains such as retail, digit recognition (MNIST) and 40 publicly available health questionnaires obtained from the Center for Disease Control (CDC) website maintained by the US Dept. of Health. We validate the results quantitatively as well as qualitatively, based on expert feedback and recently published scientific studies on public health, thus showcasing the power of our technique in providing actionability (for retail), utility (for MNIST) and insight (on CDC datasets), which arguably are the hallmarks of an effective data mining method.

Comment: Accepted for publication in International Conference on Data Mining (ICDM) 201

arXiv.org e-Print Archive

Crossref

Combination of linear classifiers using score function -- analysis of possible combination strategies

Author: AH Ko
AS Britto
B Cyganek
B. Bergmann
C Cortes
CD Manning
D Yekutieli
E Hüllermeier
F Wilcoxon
G Giacinto
Geoffrey J. McLachlan
H Drucker
J Demšar
Karl Pearson
L Xu
L.I. Kuncheva
Luc Devroye
M Friedman
M Hall
M Przybyła-Kasperek
M Przybyła-Kasperek
M Reif
M Skurichina
M Woźniak
Marina Sokolova
S Garcia
S Holm
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/05/2019

In this work, we address the issue of combining linear classifiers using their score functions. The value of the score function depends on the distance from the decision boundary. Two score functions were tested and four different combination strategies were investigated. In the experimental study, the proposed approach was applied to a heterogeneous ensemble and compared to two reference methods, majority voting and model averaging. The comparison was made in terms of seven different quality criteria. The results show that the combination strategies based on the simple average and the trimmed average are the best strategies for the geometrical combination.
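A minimal sketch of the two best-performing strategies named in this abstract, the simple average and the trimmed average of classifier scores, is given below. The abstract does not specify the two score functions that were tested, so the score used here (the signed Euclidean distance to each hyperplane) and all function names are assumptions made for illustration.

```python
import math


def signed_distance(w, b, x):
    """Signed Euclidean distance from x to the hyperplane w.x + b = 0.

    Used here as an assumed score function; the paper's own score
    functions are not described in the abstract.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def simple_average(scores):
    """Arithmetic mean of all classifier scores."""
    return sum(scores) / len(scores)


def trimmed_average(scores, trim=1):
    """Drop the `trim` lowest and `trim` highest scores, then average."""
    if 2 * trim >= len(scores):
        raise ValueError("trim removes every score")
    kept = sorted(scores)[trim:len(scores) - trim]
    return sum(kept) / len(kept)


def predict(classifiers, x, combine=simple_average):
    """Return +1 or -1 from the sign of the combined score.

    `classifiers` is a list of (w, b) pairs, one per linear classifier.
    """
    scores = [signed_distance(w, b, x) for w, b in classifiers]
    return 1 if combine(scores) >= 0 else -1
```

The trimmed average discards extreme scores before averaging, so a single classifier that places a point very far from its boundary has less influence on the combined decision than it has under the simple average.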

arXiv.org e-Print Archive

Crossref

Evolving Non-Dominated Parameter Sets for Computational Models from Multiple Experiments

Author: Gobet Fernand
Lane Peter
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 31/07/2013

© Peter C. R. Lane, Fernand Gobet. This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. (CC BY-NC 3.0)

Creating robust, reproducible and optimal computational models is a key challenge for theorists in many sciences. Psychology and cognitive science face particular challenges, as large amounts of data are collected and many models are not amenable to analytical techniques for calculating parameter sets. Particular problems are to locate the full range of acceptable model parameters for a given dataset, and to confirm the consistency of model parameters across different datasets. Resolving these problems will provide a better understanding of the behaviour of computational models, and so support the development of general and robust models. In this article, we address these problems using evolutionary algorithms to develop parameters for computational models against multiple sets of experimental data; in particular, we propose the ‘speciated non-dominated sorting genetic algorithm’ for evolving models in several theories. We discuss the problem of developing a model of categorisation using twenty-nine sets of data and models drawn from four different theories. We find that the evolutionary algorithms generate high quality models, adapted to provide a good fit to all available data.

Peer reviewed. Final Published version
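The notion of a non-dominated parameter set that this abstract relies on can be illustrated with a short sketch. It shows only the Pareto-dominance test over per-experiment errors, not the speciated algorithm the authors propose; the function names and the assumption that lower error is better on every experiment are choices made for the example.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are errors to be minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(candidates):
    """Return the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
```

Each candidate is a tuple holding one model's fit error on each experiment; the models that survive the filter represent different trade-offs across the datasets, none of which is uniformly worse than another.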

Crossref

University of Hertfordshire Research Archive

Recruitment and selection processes through an effective GDSS

Author: Alter
Ashton
Ball
Barber
Basak
Bellone
Benbasat
Binbasioglu
Brice
Bryson
Bui
Chen
Condor
Dale
Davey
Delbecq
DeSanctis
Dessler
Einhorn
Galanaki
Gallupe
Hatcher
Holsapple
Holsapple
Hsu-Shih Shih
Huan-Jyh Shyur
Huff
Hwang
Hwang
Iz
Iz
Kavanagh
Korhonen
Liang-Chih Huang
Lin
Madu
Mallach
Marakas
Matheson
Mohanty
Mondy
Murray
Ngwenyama
Niehaus
Nunamaker
O'Brein
Raghunathan
Saaty
Saaty
Sage
SAP
Schmidt
Shih
Sprague
Taylor
Turban
Vincke
Vitodo
Publication venue: 'Elsevier BV'
Publication date: 31/12/2005

This study proposes a group decision support system (GDSS) with multiple criteria to assist in the recruitment and selection (R&S) processes of human resources. A two-phase decision-making procedure is first suggested; various techniques involving multiple criteria and group participation are then defined for each step in the procedure. A wide scope of personnel characteristics is evaluated, and the concept of consensus is enhanced. The procedure recommended herein is expected to be more effective than traditional approaches. In addition, the procedure is implemented on a network-based PC system with web interfaces to support the R&S activities. In the final stage, key personnel at the human resources department of a chemical company in southern Taiwan verified the feasibility of the illustrated example.

Elsevier - Publisher Connector

Crossref

Tamkang University Institutional Repository