8 research outputs found
Clustering-based k-nearest neighbor classification for large-scale data with neural codes representation
While it stands as one of the most widely used and successful supervised classification algorithms, the k-Nearest Neighbor (kNN) classifier generally suffers from poor efficiency because it is an instance-based method. Approximated Similarity Search (ASS) is a possible alternative that mitigates these efficiency issues, typically at the expense of classification performance. In this paper we take as our starting point an ASS strategy based on clustering. We then improve its performance by addressing issues related to instances located close to cluster boundaries, enlarging the clusters, and using Deep Neural Networks to learn a representation suited to the classification task at hand. Results on a collection of eight different datasets show that the combined use of these two strategies yields a significant improvement in classification accuracy, with a considerable reduction in the number of distance computations needed to classify a sample compared to the basic kNN rule. This work has been supported by the Spanish Ministerio de Economía y Competitividad through Project TIMuL (No. TIN2013-48152-C2-1-R, supported by EU FEDER funds), the Spanish Ministerio de Educación, Cultura y Deporte through an FPU Fellowship (Ref. AP2012–0939), and by the Universidad de Alicante through the FPU program (UAFPU2014–5883) and the Instituto Universitario de Investigación Informática (IUII).
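The core pruning idea described in the abstract can be sketched as follows: partition the training data with k-means, then restrict each kNN query to the cluster whose centroid is nearest to the query point. This is a minimal illustration of cluster-based ASS in plain NumPy, not the authors' pipeline; the boundary-enlargement and neural-code steps they describe are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=20):
    # Simple Lloyd's algorithm to build the cluster index.
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

def cluster_knn_classify(q, X, y, centroids, labels, k=3):
    # Search kNN only inside the cluster nearest to q: far fewer distance
    # computations than brute-force kNN, at the cost of possible errors
    # for queries that fall near a cluster boundary.
    c = np.argmin(((centroids - q) ** 2).sum(-1))
    idx = np.where(labels == c)[0]
    d = ((X[idx] - q) ** 2).sum(-1)
    nearest = idx[np.argsort(d)[:k]]
    return np.bincount(y[nearest]).argmax()

# Toy data: two Gaussian blobs with labels 0 and 1.
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
centroids, labels = kmeans(X, k=4)
print(cluster_knn_classify(np.array([5.0, 5.0]), X, y, centroids, labels))  # → 1
```

With 4 clusters over 200 points, a query inspects roughly a quarter of the distances that brute-force kNN would compute.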
Support Vector Machine optimization with fractional gradient descent for data classification
Data classification faces several challenges, one of which is the large volume of data, which increases computing time. SVM is a reliable linear classifier for linear or non-linear data, but for large-scale data it runs into computational time constraints. The fractional gradient descent method is an unconstrained optimization algorithm for training support vector machine classifiers, whose training problem is convex. Compared to the classic integer-order model, a model built with fractional calculus has a significant advantage in accelerating computing time. This research investigates the current state of this new optimization method, based on fractional derivatives, and how it can be implemented in the classifier algorithm. The SVM classifier with fractional gradient descent optimization reaches its convergence point in approximately 50 fewer iterations than SVM-SGD. The model update step is smaller in the fractional case because the multiplier is less than 1 (a fraction). The SVM-Fractional SGD algorithm proves to be an effective method for rainfall forecast decisions.
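A hedged sketch of the idea: a linear SVM trained by stochastic gradient descent on the hinge loss, where the step is scaled by a simplified Caputo-type fractional factor |w - w_ref|^(1-α) / Γ(2-α). Setting α = 1 recovers plain SGD. The exact fractional update rule varies across the literature, so the reference point `w_ref`, the per-coordinate scaling, and the ε floor below are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np
from math import gamma

def svm_fractional_sgd(X, y, alpha=0.9, lr=0.05, lam=0.01, epochs=50):
    # Linear SVM (hinge loss + L2 penalty) with a simplified Caputo-type
    # fractional step. w_ref is reset to the weights at each epoch start.
    n, d = X.shape
    w = np.zeros(d)
    w_ref = np.zeros(d)
    eps = 1e-8  # keeps the fractional factor nonzero at w = w_ref (assumption)
    for _ in range(epochs):
        for i in range(n):
            margin = y[i] * (X[i] @ w)
            grad = lam * w - (y[i] * X[i] if margin < 1 else 0.0)
            frac = (np.abs(w - w_ref) + eps) ** (1.0 - alpha) / gamma(2.0 - alpha)
            w = w - lr * frac * grad  # fractional multiplier < 1 shrinks the step
        w_ref = w.copy()
    return w

# Toy linearly separable data.
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]])
y = np.array([1, 1, -1, -1])
w = svm_fractional_sgd(X, y)
print(np.sign(X @ w))  # → [ 1.  1. -1. -1.]
```

Because the fractional multiplier is below 1, each update is damped relative to integer-order SGD, which is the mechanism the abstract credits for the smaller model corrections.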
Fast and exact fixed-radius neighbor search based on sorting
Fixed-radius near neighbor search is a fundamental data operation that retrieves all data points within a user-specified distance of a query point. There are efficient algorithms that can provide fast approximate query responses, but they often have a very compute-intensive indexing phase and require careful parameter tuning. Therefore, exact brute-force and tree-based search methods are still widely used. Here we propose a new fixed-radius near neighbor search method, called SNN, that significantly improves over brute-force and tree-based methods in terms of index and query time, provably returns exact results, and requires no parameter tuning. SNN exploits a sorting of the data points by their first principal component to prune the query search space. Further speedup is gained from an efficient implementation using high-level Basic Linear Algebra Subprograms (BLAS). We provide theoretical analysis of our method and demonstrate its practical performance when used stand-alone and when applied within the DBSCAN clustering algorithm.
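The pruning idea described in the abstract can be sketched from first principles: projection onto a unit vector is 1-Lipschitz, so |score(x) - score(q)| ≤ ‖x - q‖, and any point within radius r of q must have a first-principal-component score within r of q's score. Sorting by that score turns candidate selection into two binary searches. This is an assumption-based reconstruction from the abstract, not the authors' BLAS-optimized implementation.

```python
import numpy as np

def build_index(X):
    # Index phase: center the data, take the first principal component
    # (first right singular vector), and sort points by their PC score.
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    pc = Vt[0]
    scores = (X - mean) @ pc
    order = np.argsort(scores)
    return order, scores[order], pc, mean

def radius_query(q, r, X, order, sorted_scores, pc, mean):
    # Only points whose PC score lies in [s_q - r, s_q + r] can be within
    # distance r of q, so the exact check runs on a contiguous slice.
    sq = (q - mean) @ pc
    lo = np.searchsorted(sorted_scores, sq - r, side="left")
    hi = np.searchsorted(sorted_scores, sq + r, side="right")
    cand = order[lo:hi]
    d = np.linalg.norm(X[cand] - q, axis=1)  # exact distances on candidates
    return cand[d <= r]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
order, sorted_scores, pc, mean = build_index(X)
hits = radius_query(X[0], 0.5, X, order, sorted_scores, pc, mean)
# hits matches a brute-force scan exactly, but far fewer distances are computed.
```

Because the bound is a true lower bound on the Euclidean distance, the result set is provably identical to brute force; the speedup depends on how much variance the first principal component captures.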