31,360 research outputs found

    Result-Sensitive Binary Search with Noisy Information

    Get PDF
    We describe new algorithms for the predecessor problem in the Noisy Comparison Model. In this problem, given a sorted list L of n (distinct) elements and a query q, we seek the predecessor of q in L: denoted by u, the largest element less than or equal to q. In the Noisy Comparison Model, the result of a comparison between two elements is non-deterministic. Moreover, multiple comparisons of the same pair of elements might have different results: each is generated independently, and is correct with probability p > 1/2. Given an overall error tolerance Q, the cost of an algorithm is measured by the total number of noisy comparisons; these must guarantee the predecessor is returned with probability at least 1 - Q. Feige et al. showed that predecessor queries can be answered by a modified binary search with Theta(log (n/Q)) noisy comparisons. We design result-sensitive algorithms for answering predecessor queries. The query cost is related to the index, k, of the predecessor u in L. Our first algorithm answers predecessor queries with O(log ((log^{*(c)} n)/Q) + log (k/Q)) noisy comparisons, for an arbitrarily large constant c. The function log^{*(c)} n iterates c times the iterated-logarithm function, log^* n. Our second algorithm is a genuinely result-sensitive algorithm whose expected query cost is bounded by O(log (k/Q)), and is guaranteed to terminate after at most O(log((log n)/Q)) noisy comparisons. Our results strictly improve the state-of-the-art bounds when k is in omega(1) intersected with o(n^epsilon), where epsilon > 0 is some constant. Moreover, we show that our result-sensitive algorithms immediately improve not only predecessor-query algorithms, but also binary-search-like algorithms for solving key applications

    k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)

    Get PDF
    Perhaps the most straightforward classifier in the arsenal or machine learning techniques is the Nearest Neighbour Classifier -- classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because issues of poor run-time performance is not such a problem these days with the computational power that is available. This paper presents an overview of techniques for Nearest Neighbour classification focusing on; mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours and mechanisms for reducing the dimension of the data. This paper is the second edition of a paper previously published as a technical report. Sections on similarity measures for time-series, retrieval speed-up and intrinsic dimensionality have been added. An Appendix is included providing access to Python code for the key methods.Comment: 22 pages, 15 figures: An updated edition of an older tutorial on kN
    • …
    corecore