23 research outputs found

    Self-Tuning Spectral Clustering

    We study a number of open issues in spectral clustering: (i) selecting the appropriate scale of analysis, (ii) handling multi-scale data, (iii) clustering with irregular background clutter, and (iv) automatically finding the number of groups. We first propose that a ‘local’ scale should be used to compute the affinity between each pair of points. This local scaling leads to better clustering, especially when the data includes multiple scales and when the clusters are placed within a cluttered background. We further suggest exploiting the structure of the eigenvectors to automatically infer the number of groups. This leads to a new algorithm in which the final randomly initialized k-means stage is eliminated.
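
    The local-scaling idea in this abstract can be illustrated with a minimal sketch. It assumes the commonly cited form of the affinity, A_ij = exp(-d_ij^2 / (sigma_i * sigma_j)), where sigma_i is the distance from point i to its k-th nearest neighbor. The function name `local_scaling_affinity` and the default `k=7` are illustrative assumptions, not taken from the abstract.

```python
import math

def local_scaling_affinity(points, k=7):
    """Locally scaled affinity matrix (assumed form, see lead-in).

    A[i][j] = exp(-d(i, j)^2 / (sigma_i * sigma_j)), with sigma_i the
    distance from point i to its k-th nearest neighbor. k=7 is an
    assumed default, not a value stated in the abstract.
    """
    n = len(points)
    if n <= k:
        raise ValueError("need more than k points")
    # Full pairwise Euclidean distance matrix.
    dist = [[math.dist(p, q) for q in points] for p in points]
    # sorted(row)[0] is the zero distance of a point to itself,
    # so index k is the distance to its k-th nearest neighbor.
    sigma = [max(sorted(row)[k], 1e-12) for row in dist]
    # Zero diagonal: a point has no affinity with itself.
    return [
        [
            0.0 if i == j
            else math.exp(-dist[i][j] ** 2 / (sigma[i] * sigma[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]
```

    Because each pair is scaled by its own neighborhood sizes, a tight cluster and a sparse cluster can both produce strong within-cluster affinities without a single global scale parameter.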

    Intrinsic dimensionality detection criterion based on Locally Linear Embedding

    In this work we revisit the Locally Linear Embedding (LLE) algorithm, a widely employed technique in dimensionality reduction. With particular interest in the correspondence of nearest neighbors in the original and embedded spaces, we observe that, when a low-dimensional embedding space is prescribed, LLE remains merely a weight-preserving, rather than a neighborhood-preserving, algorithm. We thus propose a "neighborhood preserving ratio" criterion to estimate the minimal intrinsic dimensionality required for neighborhood preservation. We validate its efficiency on a set of synthetic data, including the S-curve and the swiss roll, as well as on a dataset of grayscale images.
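
    The abstract does not define the "neighborhood preserving ratio", so the following is only one plausible reading, offered as a sketch: the average fraction of each point's k nearest neighbors in the original space that are still among its k nearest neighbors in the embedded space. The names `knn_indices` and `neighborhood_preservation_ratio` are hypothetical.

```python
import math

def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        order = sorted(others, key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors

def neighborhood_preservation_ratio(original, embedded, k):
    """Average fraction of k-NN sets shared by the two spaces.

    This is an assumed definition, not necessarily the paper's:
    1.0 means every neighborhood is preserved, 0.0 means none is.
    """
    if len(original) != len(embedded):
        raise ValueError("both spaces must contain the same points")
    high = knn_indices(original, k)
    low = knn_indices(embedded, k)
    shared = sum(len(a & b) for a, b in zip(high, low))
    return shared / (k * len(original))
```

    Under this reading, one could compute the ratio for embeddings of increasing dimension and take the smallest dimension at which the ratio stops improving as the intrinsic dimensionality estimate.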

    Comparison of dimensionality reduction methods based on locality analysis

    This paper compares the main nonlinear dimensionality reduction techniques based on locality analysis: Locally linear embedding, Isometric feature mapping, and Maximum variance unfolding. The study aims to determine, under objective criteria, which of these techniques best preserves the local properties of the manifold and the global structure of the input data when mapping to a lower-dimensional space. The methods are analyzed with particular attention to visualization applications. The resulting embeddings are evaluated with two criteria: Preservation Neighborhood Error and Preserved Neighbors Average. Experimental validation uses artificial and real-world data sets that allow the quality of the embeddings to be confirmed visually. The results show that Maximum variance unfolding yields the highest-quality embeddings, because its optimization problem aims to preserve exactly the pairwise distances between neighboring points in the low-dimensional space while conserving the global structure of the manifold.