6,047 research outputs found
Nonparametric ridge estimation
We study the problem of estimating the ridges of a density function. Ridge
estimation is an extension of mode finding and is useful for understanding the
structure of a density. It can also be used to find hidden structure in point
cloud data. We show that, under mild regularity conditions, the ridges of the
kernel density estimator consistently estimate the ridges of the true density.
When the data are noisy measurements of a manifold, we show that the ridges are
close and topologically similar to the hidden manifold. To find the estimated
ridges in practice, we adapt the modified mean-shift algorithm proposed by
Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical
experiments verify that the algorithm is accurate. Comment: Published at http://dx.doi.org/10.1214/14-AOS1218 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
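The abstract above describes ridge estimation as an extension of mode finding and relies on a modified mean-shift algorithm. As a rough illustration of the underlying idea only, the sketch below implements plain Gaussian mean shift, which climbs to a mode of the kernel density estimator; it is not the modified (ridge-seeking) variant of Ozertem and Erdogmus. The function name, the bandwidth, and the sample points are assumptions made for this example.

```python
import math

def mean_shift_mode(points, start, bandwidth=1.0, tol=1e-10, max_iter=500):
    """Plain Gaussian mean shift: climb from `start` to a mode of the KDE.

    Illustrative sketch only; the ridge-finding variant additionally projects
    each step onto a subspace derived from the Hessian, which is omitted here.
    """
    x = tuple(start)
    for _ in range(max_iter):
        # Gaussian kernel weight of each sample relative to the current position.
        weights = [
            math.exp(-sum((xi - pi) ** 2 for xi, pi in zip(x, p)) / (2 * bandwidth ** 2))
            for p in points
        ]
        total = sum(weights)
        # The update moves x to the kernel-weighted mean of the samples.
        new_x = tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(len(x))
        )
        shift = math.dist(x, new_x)
        x = new_x
        if shift < tol:
            break
    return x

# Hypothetical, symmetric sample: the KDE has its mode at the origin.
samples = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
mode = mean_shift_mode(samples, start=(0.5, 0.3))
```

With this symmetric sample and a bandwidth of 1, the iteration should settle at the origin, which is the single mode of the estimated density.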
Multi-Step Processing of Spatial Joins
Spatial joins are one of the most important operations for combining spatial objects of several relations. In this paper, spatial join processing is studied in detail for extended spatial objects in two-dimensional data space. We present an approach for spatial join processing that is based on three steps. First, a spatial join is performed on the minimum bounding rectangles of the objects, returning a set of candidates. Various approaches for accelerating this step of join processing were examined at last year's conference [BKS 93a]. In this paper, we focus on the problem of how to compute the answers from the set of candidates, which is handled by
the following two steps. First of all, sophisticated approximations
are used to identify answers as well as to filter out false hits from
the set of candidates. For this purpose, we investigate various types
of conservative and progressive approximations. In the last step, the
exact geometry of the remaining candidates has to be tested against
the join predicate. The time required for computing spatial join
predicates can be reduced substantially when objects are adequately
organized in main memory. In our approach, objects are first decomposed
into simple components which are exclusively organized
by a main-memory resident spatial data structure. Overall, we
present a complete approach to spatial join processing on complex
spatial objects. The performance of the individual steps of our approach
is evaluated with data sets from real cartographic applications.
The results show that our approach reduces the total execution
time of the spatial join by factors
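The first step described in this abstract, a join on minimum bounding rectangles (MBRs) that returns candidate pairs, can be sketched as follows. This is a minimal nested-loop version chosen for brevity, not the accelerated approach the abstract refers to; the function names and the sample polygons are assumptions made for this example.

```python
from itertools import product

def mbr(polygon):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a vertex list."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))

def mbrs_intersect(a, b):
    """True if two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def mbr_join(relation_r, relation_s):
    """Filter step: ids of all pairs whose MBRs intersect (candidates only)."""
    boxes_r = {rid: mbr(poly) for rid, poly in relation_r.items()}
    boxes_s = {sid: mbr(poly) for sid, poly in relation_s.items()}
    return [
        (rid, sid)
        for (rid, box_r), (sid, box_s) in product(boxes_r.items(), boxes_s.items())
        if mbrs_intersect(box_r, box_s)
    ]

# Hypothetical data: r1 and s1 have overlapping MBRs although the triangles
# themselves do not touch, so (r1, s1) is a false hit for later steps to remove.
R = {"r1": [(0, 0), (4, 0), (0, 4)]}
S = {"s1": [(4, 4), (4, 3), (3, 4)], "s2": [(10, 10), (11, 10), (10, 11)]}
candidates = mbr_join(R, S)
```

The pair (r1, s1) survives the filter even though the exact geometries are disjoint, which is the kind of false hit the second and third steps are meant to eliminate.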
Fast Fencing
We consider very natural "fence enclosure" problems studied by Capoyleas,
Rote, and Woeginger and Arkin, Khuller, and Mitchell in the early 90s. Given a
set of $n$ points in the plane, we aim at finding a set of closed curves
such that (1) each point is enclosed by a curve and (2) the total length of the
curves is minimized. We consider two main variants. In the first variant, we
pay a unit cost per curve in addition to the total length of the curves. An
equivalent formulation of this version is that we have to enclose $n$ unit
disks, paying only the total length of the enclosing curves. In the other
variant, we are allowed to use at most $k$ closed curves and pay no cost per
curve.
For the variant with at most $k$ closed curves, we present an algorithm that
is polynomial in both $n$ and $k$. For the variant with unit cost per curve, or
unit disks, we present a near-linear time algorithm.
Capoyleas, Rote, and Woeginger solved the problem with at most $k$ curves in
$n^{O(k)}$ time. Arkin, Khuller, and Mitchell used this to solve the unit cost
per curve version in exponential time. At the time, they conjectured that the
problem with $k$ curves is NP-hard for general $k$. Our polynomial time
algorithm refutes this unless P equals NP
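As a small point of reference for the problem statement above: when only a single closed curve is allowed and no per-curve cost is charged, the shortest enclosing curve is the boundary of the convex hull of the points. The sketch below computes that baseline length with Andrew's monotone chain. It illustrates only this one-curve special case and is not the algorithm of the paper; the function names and the sample points are assumptions made for this example.

```python
import math

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Convex hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

def fence_length(points):
    """Length of the shortest single closed curve enclosing all points."""
    hull = convex_hull(points)
    return sum(
        math.dist(hull[i], hull[(i + 1) % len(hull)]) for i in range(len(hull))
    )

# Hypothetical input: the corners of a 2 x 2 square plus one interior point.
pts = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
length = fence_length(pts)
```

For the square above the interior point does not affect the hull, so the fence length is the perimeter of the square.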
A Density-Based Approach to the Retrieval of Top-K Spatial Textual Clusters
Keyword-based web queries with local intent retrieve web content that is
relevant to supplied keywords and that represents points of interest that are
near the query location. Two broad categories of such queries exist. The first
encompasses queries that retrieve single spatial web objects that each satisfy
the query arguments. Most proposals belong to this category. The second
category, to which this paper's proposal belongs, encompasses queries that
support exploratory user behavior and retrieve sets of objects that represent
regions of space that may be of interest to the user. Specifically, the paper
proposes a new type of query, namely the top-k spatial textual clusters (k-STC)
query that returns the top-k clusters that (i) are located the closest to a
given query location, (ii) contain the most relevant objects with regard to
given query keywords, and (iii) have an object density that exceeds a given
threshold. To compute this query, we propose a basic algorithm that relies on
on-line density-based clustering and exploits an early stop condition. To
improve the response time, we design an advanced approach that includes three
techniques: (i) an object skipping rule, (ii) spatially gridded posting lists,
and (iii) a fast range query algorithm. An empirical study on real data
demonstrates that the paper's proposals offer scalability and are capable of
excellent performance
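The density threshold and range queries mentioned in this abstract are the building blocks of density-based clustering. The sketch below is a textbook DBSCAN-style procedure over plain 2D points, given only to make those terms concrete. It is not the k-STC algorithm, and it ignores keywords, ranking, and the early stop condition; the function names, parameters, and sample points are assumptions made for this example.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

# Hypothetical data: two dense groups and one isolated point.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (5, 5)]
labels = dbscan(pts, eps=1.5, min_pts=3)
```

Here the first four points form one cluster, the next three form a second, and the isolated point at (5, 5) is labeled as noise because its neighborhood falls below the density threshold.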
Fitting Voronoi Diagrams to Planar Tesselations
Given a tesselation of the plane, defined by a planar straight-line graph
$G$, we want to find a minimal set $S$ of points in the plane, such that the
Voronoi diagram associated with $S$ "fits" $G$. This is the Generalized
Inverse Voronoi Problem (GIVP), defined in \cite{Trin07} and rediscovered
recently in \cite{Baner12}. Here we give an algorithm that solves this problem
with a number of points that is linear in the size of $G$, assuming that the
smallest angle in $G$ is constant. Comment: 14 pages, 8 figures, 1 table. Presented at IWOCA 2013 (Int. Workshop
on Combinatorial Algorithms), Rouen, France, July 201