
    Net and Prune: A Linear Time Algorithm for Euclidean Distance Problems

    We provide a general framework for getting expected linear time constant factor approximations (and in many cases FPTAS's) to several well known problems in Computational Geometry, such as k-center clustering and farthest nearest neighbor. The new approach is robust to variations in the input problem, and yet it is simple, elegant and practical. In particular, many of these well studied problems, which fit easily into our framework, either previously had no linear time approximation algorithm or required rather involved algorithms and analysis. A short list of the problems we consider includes farthest nearest neighbor, k-center clustering, smallest disk enclosing k points, kth largest distance, kth smallest m-nearest neighbor distance, kth heaviest edge in the MST and other spanning forest type problems, problems involving upward closed set systems, and more. Finally, we show how to extend our framework such that the linear running time bound holds with high probability.
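
    The abstract does not spell out the net-and-prune framework itself, so as a point of reference for one of the problems it targets, the sketch below shows the classical greedy 2-approximation for k-center due to Gonzalez. This is not the paper's algorithm (it runs in O(nk) time rather than expected linear time), and the sample points are made up for illustration.

        import math

        def k_center_greedy(points, k):
            """Gonzalez's greedy 2-approximation for k-center: repeatedly
            pick the point farthest from the centers chosen so far."""
            centers = [points[0]]
            # dist[i] = distance from points[i] to its nearest chosen center
            dist = [math.dist(p, centers[0]) for p in points]
            while len(centers) < k:
                i = max(range(len(points)), key=lambda j: dist[j])
                centers.append(points[i])
                dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
            return centers, max(dist)  # chosen centers and the k-center radius

        pts = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0), (10.0, 9.0), (5.0, 4.0)]
        centers, radius = k_center_greedy(pts, k=2)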

    Fault Tolerant Clustering Revisited

    In discrete k-center and k-median clustering, we are given a set of points P in a metric space M, and the task is to output a set C ⊆ P, |C| = k, such that the cost of clustering P using C is as small as possible. For k-center, the cost is the furthest a point has to travel to its nearest center, whereas for k-median, the cost is the sum of all point-to-nearest-center distances. In the fault-tolerant versions of these problems, we are given an additional parameter 1 ≤ ℓ ≤ k, such that when computing the cost of clustering, points are assigned to their ℓ-th nearest neighbor in C, instead of their nearest neighbor. We provide constant factor approximation algorithms for these problems that are both conceptually simple and highly practical from an implementation standpoint.
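
    Since the fault-tolerant objective is defined purely in terms of ℓ-th nearest centers, it can be stated directly in code. The following is a minimal sketch (the point coordinates and the helper name fault_tolerant_cost are made up for illustration; the paper's approximation algorithms are not reproduced here):

        import math

        def fault_tolerant_cost(P, C, ell, objective="k-center"):
            """Cost of clustering P with centers C when each point is served
            by its ell-th nearest center; ell = 1 recovers the standard
            k-center and k-median objectives."""
            d = [sorted(math.dist(p, c) for c in C)[ell - 1] for p in P]
            return max(d) if objective == "k-center" else sum(d)

        P = [(0.0, 0.0), (2.0, 0.0), (5.0, 5.0)]
        C = [(0.0, 0.0), (5.0, 5.0)]  # |C| = k = 2
        print(fault_tolerant_cost(P, C, ell=1))                        # k-center
        print(fault_tolerant_cost(P, C, ell=2, objective="k-median"))  # fault-tolerant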

    Efficient Nearest Neighbor Classification Using a Cascade of Approximate Similarity Measures

    Nearest neighbor classification using shape context can yield highly accurate results in a number of recognition problems. Unfortunately, the approach can be too slow for practical applications, and thus approximation strategies are needed to make shape context practical. This paper proposes a method for efficient and accurate nearest neighbor classification in non-Euclidean spaces, such as the space induced by the shape context measure. First, a method is introduced for constructing a Euclidean embedding that is optimized for nearest neighbor classification accuracy. Using that embedding, multiple approximations of the underlying non-Euclidean similarity measure are obtained, at different levels of accuracy and efficiency. The approximations are automatically combined to form a cascade classifier, which applies the slower approximations only to the hardest cases. Unlike typical cascade-of-classifiers approaches, which are applied to binary classification problems, our method constructs a cascade for a multiclass problem. Experiments with a standard shape data set indicate that a two-to-three order of magnitude speedup is gained over the standard shape context classifier, with minimal losses in classification accuracy.

    National Science Foundation (IIS-0308213, IIS-0329009, EIA-0202067); Office of Naval Research (N00014-03-1-0108)
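
    To illustrate the cascade idea (cheap measures answer the easy queries, expensive measures handle only the hard ones), here is a rough sketch. It is not the paper's construction: the learned Euclidean embedding and the automatic combination step are replaced by hand-supplied distance functions and confidence margins.

        def cascade_classify(query, train, measures, margins):
            """Nearest neighbor classification with a cascade of similarity
            measures ordered cheap-to-expensive. A stage answers only if the
            gap between the two closest classes exceeds its margin; otherwise
            the query falls through to the next, slower measure.
            train: list of (example, label); measures: callables d(x, y);
            margins: one threshold per non-final stage."""
            for level, dist in enumerate(measures):
                best = {}  # per class, distance of its closest training example
                for x, label in train:
                    d = dist(query, x)
                    if label not in best or d < best[label]:
                        best[label] = d
                ranked = sorted(best.items(), key=lambda kv: kv[1])
                if len(ranked) == 1:
                    return ranked[0][0]
                (l1, d1), (_, d2) = ranked[:2]
                if level == len(measures) - 1 or (d2 - d1) > margins[level]:
                    return l1  # confident enough, or no slower measure is left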

    Computing recurrence coefficients of multiple orthogonal polynomials

    Multiple orthogonal polynomials satisfy a number of recurrence relations; in particular there is an (r+2)-term recurrence relation connecting the type II multiple orthogonal polynomials near the diagonal (the so-called step-line recurrence relation) and there is a system of r recurrence relations connecting the nearest neighbors (the so-called nearest neighbor recurrence relations). In this paper we deal with two problems. First we show how one can obtain the nearest neighbor recurrence coefficients (and in particular the recurrence coefficients of the orthogonal polynomials for each of the defining measures) from the step-line recurrence coefficients. Secondly we show how one can compute the step-line recurrence coefficients from the recurrence coefficients of the orthogonal polynomials of each of the measures defining the multiple orthogonality.

    Comment: 22 pages, 2 figures; in Numerical Algorithms (2015)
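
    For reference, the two families of relations discussed above take the following shape for monic type II multiple orthogonal polynomials (notation as in Van Assche's formulation; the coefficient symbols a, b, c are naming choices here, not necessarily the paper's):

        \[
          x\,P_{\vec n}(x) = P_{\vec n+\vec e_k}(x) + b_{\vec n,k}\,P_{\vec n}(x)
            + \sum_{j=1}^{r} a_{\vec n,j}\,P_{\vec n-\vec e_j}(x),
          \qquad k = 1,\dots,r,
        \]

    give the r nearest neighbor recurrence relations, while along the step-line the polynomials P_m satisfy the (r+2)-term relation

        \[
          x\,P_m(x) = P_{m+1}(x) + \sum_{j=0}^{r} c_{m,j}\,P_{m-j}(x).
        \]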

    Modified Large Margin Nearest Neighbor Metric Learning for Regression

    The main objective of this letter is to formulate a new approach to learning a Mahalanobis distance metric for nearest neighbor regression from a training sample set. We propose a modified version of the large margin nearest neighbor metric learning method to deal with regression problems. As an application, the prediction of post-operative trunk 3-D shapes in scoliosis surgery using nearest neighbor regression is described. The accuracy of the proposed method is quantitatively evaluated through experiments on real medical data.

    IRSC / CIH
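
    As a sketch of what nearest neighbor regression under a Mahalanobis metric looks like, assuming the metric d(x, y) = sqrt((x - y)^T M (x - y)) with M = L^T L has already been learned (the LMNN-style optimization itself is not shown, and the data here is synthetic):

        import numpy as np

        def mahalanobis_knn_regress(X_train, y_train, x, L, k=3):
            """Predict by averaging the targets of the k training points
            nearest to x under the Mahalanobis metric induced by L."""
            diffs = (X_train - x) @ L.T        # L(x_i - x) for every row
            d = np.linalg.norm(diffs, axis=1)  # Mahalanobis distances to x
            nearest = np.argsort(d)[:k]
            return y_train[nearest].mean()

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 5))
        y = X @ np.array([1.0, 0.0, 0.0, 2.0, 0.0])
        L = np.eye(5)  # identity L reduces to plain Euclidean distance
        print(mahalanobis_knn_regress(X, y, X[0], L))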

    Effect of Neighborhood Approximation on Downstream Analytics

    Nearest neighbor search algorithms have been successful in finding practically useful solutions to computationally difficult problems. In the nearest neighbor search problem, the brute force approach is often more efficient than other algorithms for high-dimensional spaces. A special case exists for objects represented as sparse vectors, where algorithms take advantage of the fact that an object has a zero value for most features. In general, since exact nearest neighbor search methods suffer from the “curse of dimensionality,” many practitioners use approximate nearest neighbor search algorithms when faced with high dimensionality or large datasets. To a reasonable degree, it is known that relying on approximate nearest neighbors leads to some error in the solutions to the underlying data mining problems the neighbors are used to solve. However, no one has attempted to quantify this error or provide practitioners with guidance in choosing appropriate search methods for their task. In this thesis, we conduct several experiments on recommender systems with the goal of finding the degree to which approximate nearest neighbor algorithms are subject to these types of error propagation problems. Additionally, we provide persuasive evidence on the trade-off between search performance and analytics effectiveness. Our experimental evaluation demonstrates that a state-of-the-art approximate nearest neighbor search method (L2KNNGApprox) is not an effective solution in most cases. When tuned to achieve high search recall (80% or higher), it provides fairly competitive recommendation performance compared to an efficient exact search method but offers no advantage in terms of efficiency (0.1x–1.5x speedup). Low search recall (<60%) leads to poor recommendation performance. Finally, medium recall values (60%–80%) lead to reasonable recommendation performance but are hard to achieve and offer only a modest gain in efficiency (1.5x–2.3x).
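
    Search recall here is presumably the standard measure: the fraction of the true k nearest neighbors that the approximate method also returns, averaged over queries. A minimal sketch with toy neighbor lists (L2KNNGApprox itself is not invoked):

        import numpy as np

        def recall_at_k(exact_ids, approx_ids):
            """Average over queries of |exact ∩ approx| / k."""
            return float(np.mean([len(set(e) & set(a)) / len(e)
                                  for e, a in zip(exact_ids, approx_ids)]))

        exact = [[1, 2, 3], [4, 5, 6]]     # true top-3 neighbors per query
        approx = [[1, 2, 9], [4, 5, 6]]    # what the ANN index returned
        print(recall_at_k(exact, approx))  # 0.8333...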