Net and Prune: A Linear Time Algorithm for Euclidean Distance Problems
We provide a general framework for getting expected linear time constant
factor approximations (and in many cases FPTAS's) to several well known
problems in Computational Geometry, such as k-center clustering and farthest
nearest neighbor. The new approach is robust to variations in the input
problem, and yet it is simple, elegant and practical. In particular, many of
these well-studied problems, which fit easily into our framework, either
previously had no linear time approximation algorithm or required rather
involved algorithms and analysis. A short list of the problems we consider
includes farthest nearest neighbor, k-center clustering, smallest disk
enclosing k points, kth largest distance, kth smallest m-nearest
neighbor distance, kth heaviest edge in the MST and other spanning forest
type problems, problems involving upward closed set systems, and more. Finally,
we show how to extend our framework such that the linear running time bound
holds with high probability.
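To make this concrete, here is a minimal sketch (illustrative Python, not the authors' code) of the greedy grid-based r-net computation that net-and-prune style frameworks build on: a point is kept only if no already-kept point lies within distance r, and a grid hash with cell size r limits each check to a constant number of cells for fixed dimension.

    import math
    from collections import defaultdict
    from itertools import product

    def r_net(points, r):
        """Greedy r-net in R^d: kept points are pairwise more than r apart,
        and every input point lies within r of some kept point."""
        d = len(points[0])
        grid = defaultdict(list)          # cell coordinates -> kept points
        offsets = list(product((-1, 0, 1), repeat=d))
        net = []
        for p in points:
            cell = tuple(math.floor(c / r) for c in p)
            # only the 3^d cells around p can contain a kept point within r
            nearby = (q for off in offsets
                      for q in grid[tuple(c + o for c, o in zip(cell, off))])
            if all(math.dist(p, q) > r for q in nearby):
                net.append(p)
                grid[cell].append(p)
        return net

The framework alternates such net construction with pruning of points that can no longer affect the answer; only the net primitive is sketched here.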
Fault Tolerant Clustering Revisited
In discrete k-center and k-median clustering, we are given a set of points P
in a metric space M, and the task is to output a set C \subseteq P, |C| = k,
such that the cost of clustering P using C is as small as possible. For
k-center, the cost is the furthest a point has to travel to its nearest center,
whereas for k-median, the cost is the sum of all point to nearest center
distances. In the fault-tolerant versions of these problems, we are given an
additional parameter 1 \leq \ell \leq k, such that when computing the cost
of clustering, points are assigned to their \ell-th nearest-neighbor in C,
instead of their nearest neighbor. We provide constant factor approximation
algorithms for these problems that are both conceptually simple and highly
practical from an implementation standpoint.
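Spelling out the objective makes the \ell-th nearest-neighbor twist clear. A minimal sketch of the cost computation, assuming Euclidean points (the approximation algorithms themselves are not shown):

    import math

    def fault_tolerant_cost(points, centers, ell, objective="k-center"):
        # Each point is charged the distance to its ell-th nearest center;
        # ell = 1 recovers the classic problems. k-center takes the maximum
        # charge, k-median the sum.
        charges = [sorted(math.dist(p, c) for c in centers)[ell - 1]
                   for p in points]
        return max(charges) if objective == "k-center" else sum(charges)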
Efficient Nearest Neighbor Classification Using a Cascade of Approximate Similarity Measures
Nearest neighbor classification using shape context can yield highly accurate results in a number of recognition problems. Unfortunately, the approach can be too slow for practical applications, and thus approximation strategies are needed to make shape context practical. This paper proposes a method for efficient and accurate nearest neighbor classification in non-Euclidean spaces, such as the space induced by the shape context measure. First, a method is introduced for constructing a Euclidean embedding that is optimized for nearest neighbor classification accuracy. Using that embedding, multiple approximations of the underlying non-Euclidean similarity measure are obtained, at different levels of accuracy and efficiency. The approximations are automatically combined to form a cascade classifier, which applies the slower approximations only to the hardest cases. Unlike typical cascade-of-classifiers approaches, which are applied to binary classification problems, our method constructs a cascade for a multiclass problem. Experiments with a standard shape data set indicate that a two-to-three order of magnitude speedup is gained over the standard shape context classifier, with minimal losses in classification accuracy.
National Science Foundation (IIS-0308213, IIS-0329009, EIA-0202067); Office of Naval Research (N00014-03-1-0108)
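The cascade structure can be illustrated with a short sketch (hypothetical names and a simple two-best margin rule; the paper's actual stage design and thresholds differ): each stage computes a cheaper approximation of the similarity measure and passes only ambiguous queries on to the next, slower stage.

    def cascade_classify(query, train, measures, margins):
        # measures: distance functions ordered cheapest to most accurate;
        # margins: per-stage confidence thresholds. Assumes >= 2 classes.
        label = None
        for measure, margin in zip(measures, margins):
            best = {}                              # class -> nearest distance
            for x, y in train:
                d = measure(query, x)
                if y not in best or d < best[y]:
                    best[y] = d
            ranked = sorted(best.items(), key=lambda kv: kv[1])
            label = ranked[0][0]
            if ranked[1][1] - ranked[0][1] > margin:
                return label                       # confident at this stage
        return label                               # last stage decides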
Computing recurrence coefficients of multiple orthogonal polynomials
Multiple orthogonal polynomials satisfy a number of recurrence relations, in
particular there is a (r+2)-term recurrence relation connecting the type II
multiple orthogonal polynomials near the diagonal (the so-called step-line
recurrence relation) and there is a system of r recurrence relations
connecting the nearest neighbors (the so-called nearest neighbor recurrence
relations). In this paper we deal with two problems. First we show how one can
obtain the nearest neighbor recurrence coefficients (and in particular the
recurrence coefficients of the orthogonal polynomials for each of the defining
measures) from the step-line recurrence coefficients. Secondly we show how one
can compute the step-line recurrence coefficients from the recurrence
coefficients of the orthogonal polynomials of each of the measures defining the
multiple orthogonality.
Comment: 22 pages, 2 figures; in Numerical Algorithms (2015).
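For reference, the two families of relations have the following standard form (a sketch in the usual notation, with multi-index \vec{n} = (n_1, ..., n_r), type II polynomials p_{\vec n}, and recurrence coefficients a, b, c):

    % Nearest neighbor recurrence relations, one per direction k = 1, ..., r:
    x \, p_{\vec n}(x) = p_{\vec n + \vec e_k}(x) + b_{\vec n, k} \, p_{\vec n}(x)
        + \sum_{j=1}^{r} a_{\vec n, j} \, p_{\vec n - \vec e_j}(x)

    % Step-line ((r+2)-term) recurrence for the polynomials near the diagonal:
    x \, P_m(x) = P_{m+1}(x) + \sum_{j=0}^{r} c_{m, j} \, P_{m-j}(x)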
Modified Large Margin Nearest Neighbor Metric Learning for Regression
The main objective of this letter is to formulate a new approach to learning a Mahalanobis distance metric for nearest neighbor regression from a training sample set. We propose a modified version of the large margin nearest neighbor metric learning method to deal with regression problems. As an application, the prediction of post-operative trunk 3-D shapes in scoliosis surgery using nearest neighbor regression is described. Accuracy of the proposed method is quantitatively evaluated through experiments on real medical data.
IRSC / CIHR
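In outline, the recipe is: learn a linear map L defining a Mahalanobis metric d(a, b) = ||L(a - b)||, with target neighbors chosen by similarity of the regression outputs rather than class labels, then predict with nearest neighbor regression under that metric. A minimal numpy sketch (the loss and neighbor selection below are assumptions for illustration, not the authors' exact formulation):

    import numpy as np

    def knn_regress(X, y, query, L, k=5):
        # k-NN regression under the learned metric d(a, b) = ||L(a - b)||.
        d = np.linalg.norm((X - query) @ L.T, axis=1)
        return y[np.argsort(d)[:k]].mean()

    def lmnn_regression_loss(L, X, y, targets, impostors, margin=1.0, mu=0.5):
        # LMNN-style objective adapted to regression: pull each point i
        # toward its target neighbors j (similar y values); hinge-penalize
        # impostors l (dissimilar y) that intrude within the margin.
        Z = X @ L.T
        pull = push = 0.0
        for i in range(len(X)):
            for j in targets[i]:
                d_ij = np.sum((Z[i] - Z[j]) ** 2)
                pull += d_ij
                for l in impostors[i]:
                    d_il = np.sum((Z[i] - Z[l]) ** 2)
                    push += max(0.0, margin + d_ij - d_il)
        return (1 - mu) * pull + mu * push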
Effect of Neighborhood Approximation on Downstream Analytics
Nearest neighbor search algorithms have been successful in finding practically useful solutions to computationally difficult problems. In the nearest neighbor search problem, the brute force approach is often more efficient than other algorithms for high-dimensional spaces. A special case exists for objects represented as sparse vectors, where algorithms take advantage of the fact that an object has a zero value for most features. In general, since exact nearest neighbor search methods suffer from the “curse of dimensionality,” many practitioners use approximate nearest neighbor search algorithms when faced with high dimensionality or large datasets. To a reasonable degree, it is known that relying on approximate nearest neighbors leads to some error in the solutions to the underlying data mining problems the neighbors are used to solve. However, no one has attempted to quantify this error or provide practitioners with guidance in choosing appropriate search methods for their task. In this thesis, we conduct several experiments on recommender systems with the goal of finding the degree to which approximate nearest neighbor algorithms are subject to these types of error propagation problems. Additionally, we provide persuasive evidence on the trade-off between search performance and analytics effectiveness. Our experimental evaluation demonstrates that a state-of-the-art approximate nearest neighbor search method (L2KNNGApprox) is not an effective solution in most cases. When tuned to achieve high search recall (80% or higher), it provides fairly competitive recommendation performance compared to an efficient exact search method but offers no advantage in terms of efficiency (0.1x-1.5x speedup). Low search recall (<60%) leads to poor recommendation performance. Finally, medium recall values (60%-80%) lead to reasonable recommendation performance but are hard to achieve and offer only a modest gain in efficiency (1.5x-2.3x).
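The search recall these results are keyed to can be measured directly. A minimal sketch assuming dense Euclidean vectors, where approx_knn stands in for the approximate index under test (e.g., the L2KNNGApprox method the thesis evaluates):

    import numpy as np

    def knn_search_recall(X, queries, approx_knn, k=10):
        # Fraction of the true k nearest neighbors (found by brute force)
        # that the approximate method returns, averaged over all queries.
        hits = 0
        for q in queries:
            exact = set(np.argsort(np.linalg.norm(X - q, axis=1))[:k])
            hits += len(exact & set(approx_knn(q, k)))
        return hits / (k * len(queries))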