Natural data structure extracted from neighborhood-similarity graphs
'Big' high-dimensional data are commonly analyzed in low dimensions, after
performing a dimensionality-reduction step that inherently distorts the data
structure. Clustering methods are often used for the same purpose, but these
methods also introduce a bias, either by starting from the assumption of a
particular geometric form of the clusters, or by using iterative schemes to
enhance cluster contours, with uncontrollable consequences. The goal of data
analysis should, however, be to encode and detect structural data features at
all scales and densities simultaneously, without assuming a parametric form of
data point distances, or modifying them. We propose a novel approach that
directly encodes data point neighborhood similarities as a sparse graph. Our
non-iterative framework permits a transparent interpretation of data, without
altering the original data dimension and metric. Several natural and synthetic
data applications demonstrate the efficacy of our novel approach.
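
The abstract does not specify how neighborhood similarities are computed, so the following is only a minimal sketch of the general idea of encoding them as a sparse graph in one non-iterative pass. It assumes, purely for illustration, that similarity is the Jaccard overlap of k-nearest-neighbor sets under the Euclidean metric; the function name `neighborhood_similarity_graph` and the parameter `k` are hypothetical and not taken from the paper.

```python
import numpy as np


def neighborhood_similarity_graph(points, k=3):
    """Return a sparse edge dict {(i, j): similarity} with i < j.

    Assumption (not from the source): similarity is the Jaccard overlap
    of the two points' k-nearest-neighbor sets, each including the point itself.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Pairwise Euclidean distances in the original space (no projection).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude self from the neighbor search

    # k nearest neighbors of each point, then add the point itself.
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    neighborhoods = [set(row.tolist()) | {i} for i, row in enumerate(nearest)]

    # Single pass: only pairs linked by a k-NN relation get an edge,
    # which keeps the graph sparse.
    edges = {}
    for i in range(n):
        for j in neighborhoods[i]:
            if j == i:
                continue
            a, b = (i, j) if i < j else (j, i)
            if (a, b) in edges:
                continue
            shared = len(neighborhoods[a] & neighborhoods[b])
            union = len(neighborhoods[a] | neighborhoods[b])
            edges[(a, b)] = shared / union
    return edges


# Two well-separated groups of three points each (illustrative data).
demo = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
graph = neighborhood_similarity_graph(demo, k=2)
for edge, weight in sorted(graph.items()):
    print(edge, weight)
```

In this toy case the edges stay within each group, so the graph structure reflects the grouping without any dimensionality reduction or change to the distances. The brute-force distance matrix is quadratic in the number of points and is used here only to keep the sketch short.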
- …