3 research outputs found

    Aspects of Metric Spaces in Computation

    Get PDF
    Metric spaces, which generalise the properties of commonly-encountered physical and abstract spaces into a mathematical framework, frequently occur in computer science applications. Three major kinds of questions about metric spaces are considered here: the intrinsic dimensionality of a distribution, the maximum number of distance permutations, and the difficulty of reverse similarity search. Intrinsic dimensionality measures the tendency for points to be equidistant, which is diagnostic of high-dimensional spaces. Distance permutations describe the order in which a set of fixed sites appears while moving away from a chosen point; the number of distinct permutations determines the amount of storage space required by some kinds of indexing data structure. Reverse similarity search problems are constraint satisfaction problems derived from distance-based index structures. Their difficulty reveals details of the structure of the space. Theoretical and experimental results are given for these three questions in a wide range of metric spaces, with commentary on the consequences for computer science applications and additional related results where appropriate

    Measuring the Difficulty of Distance-Based Indexing

    No full text
    Abstract. Data structures for similarity search are commonly evalu-ated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimen-sionality. The intrinsic dimensionality statistic of Chávez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equiv-alent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.
    corecore