3 research outputs found

Impact of the initialization in tree-based fast similarity search techniques

Author: Micó Luisa
Oncina Jose
Serrano Díaz-Carrasco Aureo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Many fast similarity search techniques rely on pivots (specially selected points in the data set). These points are used to build specific structures (indexes) that speed up the search at query time. Pivot selection techniques are usually incremental, with the first pivot chosen at random. This article explores several techniques for choosing the first pivot in a tree-based fast similarity search technique. We provide experimental results showing that an adequate choice of this pivot leads to significant reductions in distance computations and time complexity. Moreover, most pivot tree-based indexes emphasize building balanced trees. We provide experimental and theoretical support for the claim that very unbalanced trees can be a better choice than balanced ones. The authors thank the Spanish CICyT for partial support of this work through projects TIN2009-14205-C04-C1, the IST Programme of the European Community, under the PASCAL Network of Excellence (IST-2006-216886), and the program Consolider Ingenio 2010 (CSD2007-00018).
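
The role of pivots described in this abstract can be illustrated with a minimal sketch. It is not the tree-based index studied in the article: it uses a flat table of point-to-pivot distances and the triangle inequality to skip distance computations, and it leaves the index of the first pivot as a parameter. The function names, the Euclidean distance, and the farthest-first rule for the remaining pivots are all assumptions made for illustration.

```python
import math
import random


def select_pivots(points, k, first=0):
    """Incremental selection (assumed rule): start at index `first`, then
    repeatedly add the point farthest from all pivots chosen so far.
    Assumes k <= number of distinct points."""
    pivots = [first]
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(pivots) < k:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        pivots.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return pivots


def build_table(points, pivots):
    """Precompute the distance from every point to every pivot."""
    return [[math.dist(p, points[j]) for j in pivots] for p in points]


def nearest(query, points, pivots, table):
    """Nearest-neighbour search that counts distance computations.
    A point x is skipped when |d(q, p) - d(x, p)| >= best distance so far,
    because the triangle inequality makes that value a lower bound on d(q, x)."""
    q_to_pivot = [math.dist(query, points[j]) for j in pivots]
    computations = len(pivots)
    best_idx, best_dist = None, math.inf
    for j, d in zip(pivots, q_to_pivot):
        if d < best_dist:
            best_idx, best_dist = j, d
    pivot_set = set(pivots)
    for i, p in enumerate(points):
        if i in pivot_set:
            continue
        lower = max(abs(dq - dp) for dq, dp in zip(q_to_pivot, table[i]))
        if lower >= best_dist:
            continue  # pruned without computing d(query, p)
        d = math.dist(query, p)
        computations += 1
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx, best_dist, computations


if __name__ == "__main__":
    random.seed(0)
    data = [(random.random(), random.random()) for _ in range(500)]
    # Compare two choices of first pivot by the number of distances computed.
    for first in (0, 250):
        pv = select_pivots(data, 4, first=first)
        tb = build_table(data, pv)
        print(first, nearest((0.3, 0.7), data, pv, tb))
```

Varying `first` changes the whole pivot set, and with it the number of distance computations per query, which is the kind of effect the article measures on its own tree structure.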

Repositorio Institucional de la Universidad de Alicante

Crossref

InstanceRank: Bringing order to datasets

Author: García Vallejo Carlos Antonio
Ortega Rodríguez Francisco Javier
Troyano Jiménez José Antonio
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010

In this paper we present InstanceRank, a ranking algorithm that reflects the relevance of the instances within a dataset. InstanceRank applies a solution similar to that used by PageRank, the web page ranking algorithm of the Google search engine. We also present ISR, an instance selection technique that uses InstanceRank. This algorithm chooses the most representative instances from a learning database. Experiments show that the ISR algorithm, with InstanceRank as its ranking criterion, achieves accuracy similar to that of other instance reduction techniques while noticeably reducing the size of the instance set. Ministerio de Educación y Ciencia HUM2007-66607-C04-0
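
The abstract does not give the details of InstanceRank, so the following is only a sketch of the general idea of ranking instances with a PageRank-style iteration. The similarity function, the fully connected graph, the damping factor of 0.85, and the name `rank_instances` are assumptions for illustration and should not be read as the authors' algorithm.

```python
import math


def similarity(a, b):
    """Assumed edge weight: closer instances are more strongly linked."""
    return 1.0 / (1.0 + math.dist(a, b))


def rank_instances(instances, damping=0.85, iterations=50):
    """PageRank-style power iteration on a weighted similarity graph.
    Each instance passes its score to the others in proportion to similarity.
    Assumes at least two instances."""
    n = len(instances)
    weights = [
        [0.0 if i == j else similarity(instances[i], instances[j])
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            incoming = sum(
                scores[i] * weights[i][j] / out_sum[i] for i in range(n)
            )
            new_scores.append((1.0 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


if __name__ == "__main__":
    # Four clustered instances and one far-away instance.
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    for point, score in zip(data, rank_instances(data)):
        print(point, round(score, 4))
```

Under these assumptions, an instance that is similar to many others accumulates a higher score than an isolated one, so sorting by score gives one possible ordering from which representative instances could be picked.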

idUS. Depósito de Investigación Universidad de Sevilla