Comparing representative selection strategies for dissimilarity representations

Author: Bunke Horst
Kandel Abraham
Last Mark
Reynolds Zane
Publication venue: Wiley
Publication date: 01/01/2006

Bern Open Repository and Information System (BORIS)

Comparing Representative Selection Strategies for Dissimilarity Representations

Author: Abraham K
Horst Bunke
Mark Last
Zane Reynolds

Abstract: Many of the computational intelligence techniques currently in use do not scale well in data type or computational performance, so selecting the right dimensionality reduction technique for the data is essential. By employing a dimensionality reduction technique called representative dissimilarity to create an embedded space, large spaces of complex patterns can be simplified to a fixed-dimensional Euclidean space of points. The only current suggestions as to how the representatives should be selected are principal component analysis, projection pursuit, and factor analysis. Several alternative representative selection strategies are proposed and empirically evaluated on a set of term vectors constructed from HTML documents. The results indicate that a representative dissimilarity representation with at least 50 representatives can achieve a significant increase in classification speed with a minimal sacrifice in accuracy. When the representatives are selected randomly, the time required to create the embedded space is also significantly reduced, again with a small penalty in accuracy.

Index Terms: Dimensionality reduction, dissimilarity representation, document classification, representative selection
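
The embedding described in the abstract, where each pattern is replaced by its vector of dissimilarities to a fixed set of representatives, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the Euclidean dissimilarity, and the synthetic stand-in for the term vectors are all assumptions made for the example. Only random selection is shown, since the abstract does not detail the other strategies.

```python
import math
import random

def euclidean(a, b):
    # Assumed dissimilarity measure; the abstract does not say which one was used.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_random_representatives(patterns, k, seed=0):
    # Hypothetical helper: pick k distinct pattern indices uniformly at random.
    rng = random.Random(seed)
    return rng.sample(range(len(patterns)), k)

def embed(patterns, rep_indices, dissimilarity=euclidean):
    # Each pattern becomes a k-dimensional point: its dissimilarity to each representative.
    reps = [patterns[i] for i in rep_indices]
    return [[dissimilarity(p, r) for r in reps] for p in patterns]

# Synthetic stand-in for term vectors: 100 patterns with 300 features each.
rng = random.Random(1)
patterns = [[rng.random() for _ in range(300)] for _ in range(100)]

rep_indices = select_random_representatives(patterns, k=50)
embedded = embed(patterns, rep_indices)
print(len(embedded), len(embedded[0]))  # prints: 100 50
```

With 50 representatives, every pattern is reduced from 300 features to 50 coordinates, whatever the original representation was. Any classifier that works on fixed-length vectors can then be trained on `embedded`.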

CiteSeerX