Dimension Detection with Local Homology
Detecting the dimension of a hidden manifold from a point sample has become
an important problem in the current data-driven era. Indeed, estimating the
shape dimension is often the first step in studying the processes or phenomena
associated with the data. Among the many dimension detection algorithms proposed
in various fields, only a few provide theoretical guarantees on the correctness
of the estimated dimension. However, correctness usually requires certain
regularity of the input: the input points are either uniformly randomly sampled
in a statistical setting, or they form a so-called
(ε, δ)-sample, which can be neither too dense nor too sparse.
Here, we propose a purely topological technique to detect dimensions. Our
algorithm is provably correct and works under a more relaxed sampling
condition: we do not require uniformity, and we also allow Hausdorff noise. Our
approach detects dimension by determining local homology. The computation of
this topological structure is much less sensitive to the local distribution of
points, which leads to the relaxation of the sampling conditions. Furthermore,
by leveraging various developments in computational topology, we show that this
local homology at a point can be computed exactly for manifolds
using Vietoris-Rips complexes whose vertices are confined within a local
neighborhood of that point. We implement our algorithm and demonstrate the accuracy
and robustness of our method using both synthetic and real data sets.
Magnification Control in Self-Organizing Maps and Neural Gas
We consider different ways to control the magnification in self-organizing
maps (SOM) and neural gas (NG). Starting from early approaches of magnification
control in vector quantization, we then concentrate on different approaches for
SOM and NG. We show that three structurally similar approaches can be applied
to both algorithms: localized learning, concave-convex learning, and winner
relaxing learning. Thereby, the approach of concave-convex learning in SOM is
extended to a more general description, whereas the concave-convex learning for
NG is new. In general, the control mechanisms generate only slightly different
behavior when the two neural algorithms are compared. However, we emphasize that the NG
results are valid for any data dimension, whereas in the SOM case the results
hold only for the one-dimensional case.
Comment: 24 pages, 4 figures
Intrinsic Dimension Estimation: Relevant Techniques and a Benchmark Framework
When dealing with datasets comprising high-dimensional points, it is usually advantageous to discover some structure in the data. A fundamental piece of information needed for this is the minimum number of parameters required to describe the data while minimizing the information loss. This number, usually called the intrinsic dimension, can be interpreted as the dimension of the manifold from which the input data are supposed to be drawn. Due to its usefulness in many theoretical and practical problems, the concept of intrinsic dimension has gained considerable attention in the scientific community over the last decades, motivating the large number of intrinsic dimensionality estimators proposed in the literature. However, the problem is still open, since most techniques cannot efficiently deal with datasets drawn from manifolds of high intrinsic dimension that are nonlinearly embedded in higher-dimensional spaces. This paper surveys some of the most interesting, widely used, and advanced state-of-the-art methodologies. Unfortunately, since no benchmark database exists in this research field, an objective comparison among different techniques is not possible. Consequently, we suggest a benchmark framework and apply it to comparatively evaluate relevant state-of-the-art estimators.
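To make the notion of intrinsic dimension concrete, here is a minimal sketch of one well-known estimator, the nearest-neighbor maximum-likelihood estimator of Levina and Bickel, using the bias-corrected k - 2 normalization. It is an illustration only and is not necessarily among the estimators benchmarked in the paper above. The function names, the synthetic surface, and the choice k = 10 are assumptions made for this example.

```python
import math
import random


def mle_intrinsic_dimension(points, k=10):
    """Average a nearest-neighbor MLE of intrinsic dimension over all points.

    Illustrative sketch: uses the k - 2 normalization and a brute-force
    neighbor search, so it is only suitable for small samples.
    """
    estimates = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        t = dists[:k]
        if t[0] == 0.0:
            continue  # skip duplicated points to avoid division by zero
        # Sum of log ratios between the k-th neighbor distance and closer ones.
        s = sum(math.log(t[-1] / tj) for tj in t[:-1])
        if s > 0:
            estimates.append((k - 2) / s)
    return sum(estimates) / len(estimates)


def sample_surface(n, seed=0):
    """Hypothetical test data: a 2-D surface nonlinearly embedded in 5-D."""
    rng = random.Random(seed)
    pts = []
    for _ in range(n):
        u, v = rng.random(), rng.random()
        pts.append((u, v, u * u, math.sin(v), 0.0))
    return pts


if __name__ == "__main__":
    pts = sample_surface(400)
    print(round(mle_intrinsic_dimension(pts, k=10), 2))
```

On this synthetic surface the estimate should land near 2, the number of free parameters (u, v), even though each point has five coordinates. Boundary effects and the finite sample size can shift the value somewhat.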
A scale-based approach to finding effective dimensionality in manifold learning
The discovery of low-dimensional manifolds in high-dimensional data is one
of the main goals in manifold learning. We propose a new approach to identify
the effective dimension (intrinsic dimension) of low-dimensional manifolds. The
scale space viewpoint is the key to our approach, enabling us to meet the
challenge of noisy data. Our approach finds the effective dimensionality of the
data over all scales without any prior knowledge. It performs better
than other methods, especially in the presence of relatively large
noise, and is computationally efficient.
Comment: Published at http://dx.doi.org/10.1214/07-EJS137 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org)