8 research outputs found
Clustering-based k-nearest neighbor classification for large-scale data with neural codes representation
While it stands as one of the most widely used and successful supervised classification algorithms, the k-Nearest Neighbor (kNN) classifier generally suffers from poor efficiency because it is an instance-based method. Approximated Similarity Search (ASS) is a possible alternative that mitigates these efficiency issues, typically at the expense of classification performance. In this paper we take as our starting point an ASS strategy based on clustering. We then improve its performance by addressing issues related to instances located close to cluster boundaries, enlarging the clusters, and using Deep Neural Networks to learn a representation suited to the classification task at hand. Results on a collection of eight different datasets show that the combined use of these two strategies yields a significant improvement in classification accuracy, with a considerable reduction in the number of distance computations needed to classify a sample compared to the basic kNN rule. This work has been supported by the Spanish Ministerio de Economía y Competitividad through Project TIMuL (No. TIN2013-48152-C2-1-R, supported by EU FEDER funds), the Spanish Ministerio de Educación, Cultura y Deporte through an FPU Fellowship (Ref. AP2012–0939), and by the Universidad de Alicante through the FPU program (UAFPU2014–5883) and the Instituto Universitario de Investigación Informática (IUII).
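The core pruning idea described in the abstract can be sketched as follows: partition the training data with k-means, then restrict each kNN query to the cluster whose centroid is nearest to the query point. This is a minimal illustration of cluster-based ASS in plain NumPy, not the authors' pipeline; the boundary-enlargement and neural-code steps they describe are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=20):
    # Simple Lloyd's algorithm to build the cluster index.
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

def cluster_knn_classify(q, X, y, centroids, labels, k=3):
    # Search kNN only inside the cluster nearest to q: far fewer distance
    # computations than brute-force kNN, at the cost of possible errors
    # for queries that fall near a cluster boundary.
    c = np.argmin(((centroids - q) ** 2).sum(-1))
    idx = np.where(labels == c)[0]
    d = ((X[idx] - q) ** 2).sum(-1)
    nearest = idx[np.argsort(d)[:k]]
    return np.bincount(y[nearest]).argmax()

# Toy data: two Gaussian blobs with labels 0 and 1.
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
centroids, labels = kmeans(X, k=4)
print(cluster_knn_classify(np.array([5.0, 5.0]), X, y, centroids, labels))  # → 1
```

With 4 clusters over 200 points, a query inspects roughly a quarter of the distances that brute-force kNN would compute.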
Support Vector Machine optimization with fractional gradient descent for data classification
Data classification faces several challenges, one of which is the large volume of data, which increases computing time. SVM is a reliable linear classifier for linear or non-linear data, but for large-scale data it runs into computational time constraints. The fractional gradient descent method is an unconstrained optimization algorithm for training support vector machine classifiers, whose training problem is convex. Compared to the classic integer-order model, a model built with fractional calculus has a significant advantage in accelerating computing time. This research investigates the current state of this new optimization method, based on fractional derivatives, and how it can be implemented in the classifier algorithm. The SVM classifier with fractional gradient descent optimization reaches its convergence point in approximately 50 fewer iterations than SVM-SGD. The model update step is smaller in the fractional case because the multiplier is less than 1 (a fraction). The SVM-Fractional SGD algorithm proves to be an effective method for rainfall forecast decisions.
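A hedged sketch of the idea: a linear SVM trained by stochastic gradient descent on the hinge loss, where the step is scaled by a simplified Caputo-type fractional factor |w - w_ref|^(1-α) / Γ(2-α). Setting α = 1 recovers plain SGD. The exact fractional update rule varies across the literature, so the reference point `w_ref`, the per-coordinate scaling, and the ε floor below are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np
from math import gamma

def svm_fractional_sgd(X, y, alpha=0.9, lr=0.05, lam=0.01, epochs=50):
    # Linear SVM (hinge loss + L2 penalty) with a simplified Caputo-type
    # fractional step. w_ref is reset to the weights at each epoch start.
    n, d = X.shape
    w = np.zeros(d)
    w_ref = np.zeros(d)
    eps = 1e-8  # keeps the fractional factor nonzero at w = w_ref (assumption)
    for _ in range(epochs):
        for i in range(n):
            margin = y[i] * (X[i] @ w)
            grad = lam * w - (y[i] * X[i] if margin < 1 else 0.0)
            frac = (np.abs(w - w_ref) + eps) ** (1.0 - alpha) / gamma(2.0 - alpha)
            w = w - lr * frac * grad  # fractional multiplier < 1 shrinks the step
        w_ref = w.copy()
    return w

# Toy linearly separable data.
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]])
y = np.array([1, 1, -1, -1])
w = svm_fractional_sgd(X, y)
print(np.sign(X @ w))  # → [ 1.  1. -1. -1.]
```

Because the fractional multiplier is below 1, each update is damped relative to integer-order SGD, which is the mechanism the abstract credits for the smaller model corrections.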
Fast and exact fixed-radius neighbor search based on sorting
Fixed-radius near neighbor search is a fundamental data operation that retrieves all data points within a user-specified distance of a query point. There are efficient algorithms that can provide fast approximate query responses, but they often have a very compute-intensive indexing phase and require careful parameter tuning. Therefore, exact brute-force and tree-based search methods are still widely used. Here we propose a new fixed-radius near neighbor search method, called SNN, that significantly improves over brute-force and tree-based methods in terms of index and query time, provably returns exact results, and requires no parameter tuning. SNN exploits a sorting of the data points by their first principal component to prune the query search space. Further speedup is gained from an efficient implementation using high-level Basic Linear Algebra Subprograms (BLAS). We provide theoretical analysis of our method and demonstrate its practical performance when used stand-alone and when applied within the DBSCAN clustering algorithm.
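The pruning idea described in the abstract can be sketched from first principles: projection onto a unit vector is 1-Lipschitz, so |score(x) - score(q)| ≤ ‖x - q‖, and any point within radius r of q must have a first-principal-component score within r of q's score. Sorting by that score turns candidate selection into two binary searches. This is an assumption-based reconstruction from the abstract, not the authors' BLAS-optimized implementation.

```python
import numpy as np

def build_index(X):
    # Index phase: center the data, take the first principal component
    # (first right singular vector), and sort points by their PC score.
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    pc = Vt[0]
    scores = (X - mean) @ pc
    order = np.argsort(scores)
    return order, scores[order], pc, mean

def radius_query(q, r, X, order, sorted_scores, pc, mean):
    # Only points whose PC score lies in [s_q - r, s_q + r] can be within
    # distance r of q, so the exact check runs on a contiguous slice.
    sq = (q - mean) @ pc
    lo = np.searchsorted(sorted_scores, sq - r, side="left")
    hi = np.searchsorted(sorted_scores, sq + r, side="right")
    cand = order[lo:hi]
    d = np.linalg.norm(X[cand] - q, axis=1)  # exact distances on candidates
    return cand[d <= r]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
order, sorted_scores, pc, mean = build_index(X)
hits = radius_query(X[0], 0.5, X, order, sorted_scores, pc, mean)
# hits matches a brute-force scan exactly, but far fewer distances are computed.
```

Because the bound is a true lower bound on the Euclidean distance, the result set is provably identical to brute force; the speedup depends on how much variance the first principal component captures.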