Intrinsic Dimensionality
This entry for the SIGSPATIAL Special July 2010 issue on Similarity Searching
in Metric Spaces discusses the notion of intrinsic dimensionality of data in
the context of similarity search. Comment: 4 pages, 4 figures, latex; diagram (c) has been correcte
Fast Clustering with Lower Bounds: No Customer too Far, No Shop too Small
We study the Lower Bounded Center (LBC) problem, a clustering
problem that can be viewed as a variant of the k-Center problem. In the LBC
problem, we are given a set of points P in a metric space and a lower bound
\lambda, and the goal is to select a set C \subseteq P of centers and an
assignment that maps each point in P to a center of C such that each center of
C is assigned at least \lambda points. The price of an assignment is the
maximum distance between a point and the center it is assigned to, and the goal
is to find a set of centers and an assignment of minimum price. We give a
constant-factor approximation algorithm for the LBC problem that runs in
O(n \log n) time when the input points lie in d-dimensional Euclidean space
R^d, where d is a constant. We also prove that this problem cannot be
approximated within a factor of 1.8-\epsilon unless P = NP, even if the input
points lie in the Euclidean plane R^2. Comment: 14 page
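The definitions in the abstract above (lower bound \lambda on the points per center, price as the maximum point-to-center distance) can be made concrete with a small check. This is only a sketch of the objective, not the paper's approximation algorithm; the function name `assignment_price`, the index-based representation of the assignment, and the sample points are assumptions chosen for illustration.

```python
import math
from collections import Counter

def assignment_price(points, assignment, lam):
    """Price of an assignment, or None if it violates the lower bound.

    points:     list of coordinate tuples (the set P)
    assignment: assignment[i] is the index in `points` of the center of point i
    lam:        minimum number of points each used center must receive
    (All names here are hypothetical, chosen for this illustration.)
    """
    # Count how many points each chosen center receives.
    load = Counter(assignment)
    if any(count < lam for count in load.values()):
        return None  # some center is assigned fewer than lam points
    # Price = maximum distance between a point and its assigned center.
    return max(math.dist(points[i], points[c]) for i, c in enumerate(assignment))

# Two well-separated groups in the plane, each served by one center.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
assignment = [0, 0, 0, 3, 3, 3]

price = assignment_price(points, assignment, lam=3)
too_strict = assignment_price(points, assignment, lam=4)
```

With a lower bound of 3 the assignment is feasible; raising the bound to 4 makes it infeasible, since each center receives only three points.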
Ramified rectilinear polygons: coordinatization by dendrons
Simple rectilinear polygons (i.e. rectilinear polygons without holes or
cutpoints) can be regarded as finite rectangular cell complexes coordinatized
by two finite dendrons. The intrinsic l_1-metric is thus inherited from the
product of the two finite dendrons via an isometric embedding. The rectangular
cell complexes that share this same embedding property are called ramified
rectilinear polygons. The links of vertices in these cell complexes may be
arbitrary bipartite graphs, in contrast to simple rectilinear polygons where
the links of points are either 4-cycles or paths of length at most 3. Ramified
rectilinear polygons are particular instances of rectangular complexes obtained
from cube-free median graphs, or equivalently simply connected rectangular
complexes with triangle-free links. The underlying graphs of finite ramified
rectilinear polygons can be recognized among graphs in linear time by a
Lexicographic Breadth-First-Search. Whereas the symmetry of a simple
rectilinear polygon is very restricted (with automorphism group being a
subgroup of the dihedral group D_4), ramified rectilinear polygons are
universal: every finite group is the automorphism group of some ramified
rectilinear polygon. Comment: 27 pages, 6 figure
Indexability, concentration, and VC theory
The degrading performance of indexing schemes for exact similarity search in high
dimensions has long been linked to histograms of distributions of
distances and other 1-Lipschitz functions becoming concentrated. We discuss this
observation in the framework of the phenomenon of concentration of measure on
structures of high dimension and the Vapnik-Chervonenkis theory of
statistical learning. Comment: 17 pages, final submission to J. Discrete Algorithms (an expanded,
improved and corrected version of the SISAP'2010 invited paper, this e-print,
v3
- …
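The concentration of distance histograms mentioned in the last abstract can be observed in a small experiment. The sketch below assumes points drawn uniformly from the unit cube and Euclidean distance, and uses the ratio of standard deviation to mean of the pairwise distances as an informal measure of how concentrated the histogram is; the function name `relative_spread` and all parameters are illustrative choices, not taken from the paper.

```python
import math
import random
import statistics

def relative_spread(dim, n_points=100, seed=0):
    """Standard deviation divided by mean of all pairwise Euclidean distances.

    Points are sampled uniformly from [0, 1]^dim (an assumption made for
    this illustration). A smaller value means a more concentrated histogram.
    """
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [
        math.dist(pts[i], pts[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    ]
    return statistics.pstdev(dists) / statistics.mean(dists)

spread_low = relative_spread(2)     # low-dimensional sample
spread_high = relative_spread(200)  # high-dimensional sample
print(f"dim=2: {spread_low:.3f}  dim=200: {spread_high:.3f}")
```

In this setting the relative spread shrinks as the dimension grows, so distances to a query become harder to tell apart, which is the effect the abstract links to degrading index performance.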