5 research outputs found

Transforming big data into smart data: An insight on the use of the k-nearest neighbors algorithm to obtain quality data

Author: Aha, Al-Fuqaha, Andoni, Angiulli, Arnaiz-González, Arya, Batista, Bertino, Bezdek, Biau, Cano, Chang, Chen, Chen, Cover, Datta, Dean, Derrac, Derrac, Dutta, Eiben, Fadili, Fan, Fan, Fernández, Fernández, Figueredo, Friedman, Frénay, Garcia, Garcia, García, García, García, García, García-Laencina, Gupta, Hart, Hernández, Iafrate, Iguyon, Keller, Kim, Kononenko, Lenk, Little, Little, Liu, Liu, Luengo, Luengo, Maillo, Maillo, Marx, Meng, Navot, Nguyen, Palma-Mendoza, Pan, Peralta, Philip-Chen, Quinlan, Raja, Ramírez-Gallego, Ramírez-Gallego, Ramírez-Gallego, Rastogi, Royston, Río, Schneider, Skalak, Snir, Sun, Sánchez, Sánchez, Tan, Tomek, Triguero, Triguero, Triguero, Triguero, Triguero, Triguero, Uhlmann, Weinberger, Wettschereck, White, Wilson, Xue, Zaharia, Zerhari, Zhang, Zhong, Zhu, Zou
Publication venue: 'Wiley'
Publication date: 01/03/2019

The k-nearest neighbours algorithm is characterised as a simple yet effective data mining technique. Its main drawback appears when massive amounts of data, likely to contain noise and imperfections, are involved, turning the algorithm into an imprecise and especially inefficient technique. These disadvantages have been the subject of research for many years and, among other approaches, data preprocessing techniques such as instance reduction or missing values imputation have targeted these weaknesses. As a result, these issues have turned into strengths, and the k-nearest neighbours rule has become a core algorithm to identify and correct imperfect data, removing noisy and redundant samples or imputing missing values, thereby transforming Big Data into Smart Data, that is, data of sufficient quality to expect a good outcome from any data mining algorithm. The role of this smart data gleaning algorithm in a supervised learning context will be investigated. This will include a brief overview of Smart Data, current and future trends for the k-nearest neighbour algorithm in the Big Data context, and the existing data preprocessing techniques based on this algorithm. We present the emerging big data-ready versions of these algorithms and develop some new methods to cope with Big Data. We carry out a thorough experimental analysis on a series of big datasets that provides guidelines on how to use the k-nearest neighbour algorithm to obtain Smart/Quality Data for a high-quality data mining process. Moreover, multiple Spark Packages have been developed including all the Smart Data algorithms analysed.
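
To illustrate the kind of noise removal the abstract refers to, here is a minimal single-machine sketch of a k-nearest-neighbour noise filter, in the spirit of edited-nearest-neighbour rules. It is not the paper's Spark implementation: the function name `knn_noise_filter`, the choice of `k=3`, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def knn_noise_filter(X, y, k=3):
    """Return a boolean mask that is True where a sample's label agrees
    with the majority label of its k nearest neighbours (itself excluded).
    Hypothetical helper for illustration, not an API from the paper."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances; O(n^2) memory, small data only.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    keep = np.empty(len(X), dtype=bool)
    for i, idx in enumerate(neighbours):
        labels, counts = np.unique(y[idx], return_counts=True)
        keep[i] = labels[np.argmax(counts)] == y[i]
    return keep

# Toy data (assumed): two well-separated clusters plus one mislabelled
# point at (0.5, 0.5), which sits inside cluster 0 but carries label 1.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
     [10, 10], [10, 11], [11, 10], [11, 11]]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

keep = knn_noise_filter(X, y, k=3)
X_clean = np.asarray(X, dtype=float)[keep]
y_clean = np.asarray(y)[keep]
```

On this toy set only the mislabelled point is dropped. The brute-force distance matrix is the part that does not scale, which is the inefficiency the abstract attributes to k-nearest neighbours on massive data.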

Crossref

Repository@Nottingham

Differential evolution for optimizing the positioning of prototypes in nearest neighbor classification

Author: Abramowitz, Aha, Alcalá-Fdez, Alpaydin, Ben-David, Bezdek, Brighton, Cano, Cervantes, Chang, Chaudhuri, Chen, Chen, Cover, Das, David, de Castro, Demšar, Derrac, Devijver, Eiben, Fayed, Fayed, Fernández, Francisco Herrera, Freitas, Garain, Garcia, García, García, García, García, Hart, Hodges, Holm, Isaac Triguero, Jahromi, Kennedy, Kim, Kohonen, Kononenko, Krasnogor, Kruskal, Lam, Li, Liaw, Liu, Lozano, Marchiori, Marchiori, Nanni, Neri, Nisbet, Papadopoulos, Pappa, Paredes, Poli, Pyle, Qin, Rahnamayan, Rothenberg, Salvador García, Sheskin, Steele, Storn, Sánchez, Triguero, Weinberger, Wilson, Wilson, Wilson, Witten, Zhang
Publication venue: 'Elsevier BV'

Crossref

ACG125/13: Creation of the Andalusian Inter-University Institute in Data Science and Computational Intelligence (Instituto Andaluz Interuniversitario en Data Science and Computational Intelligence)

Author: Universidad de Granada
Publication venue: 'Editorial de la Universidad de Granada'
Publication date: 08/11/2017

Repositorio Institucional Universidad de Granada