    Ptolemaic Indexing

    This paper discusses a new family of bounds for use in similarity search, related to those used in metric indexing but based on Ptolemy's inequality rather than the metric axioms. Ptolemy's inequality holds for the well-known Euclidean distance, but is also shown here to hold for quadratic form metrics in general, with Mahalanobis distance as an important special case. The inequality is examined empirically on both synthetic and real-world data sets and is also found to hold approximately, with a very low degree of error, for important distances such as the angular pseudometric and several Lp norms. Indexing experiments demonstrate substantially higher filtering power than existing triangular methods. It is also shown that combining Ptolemaic and triangular filtering can lead to better results than using either approach on its own.
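
To make the kind of bound mentioned in this abstract concrete, here is a minimal sketch, not taken from the paper itself. It assumes Euclidean distance and a small set of pivot points. Ptolemy's inequality, d(q,p)·d(o,s) ≤ d(q,o)·d(p,s) + d(q,s)·d(o,p), rearranges to a lower bound on d(q,o) for any pivot pair (p, s). The function names (`triangle_lower_bound`, `ptolemaic_lower_bound`, `lower_bounds`) are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def triangle_lower_bound(d_qp, d_op):
    """Triangular bound: d(q, o) >= |d(q, p) - d(o, p)| for one pivot p."""
    return abs(d_qp - d_op)


def ptolemaic_lower_bound(d_qp, d_qs, d_op, d_os, d_ps):
    """Ptolemaic bound for a pivot pair (p, s); requires d(p, s) > 0.

    Follows from rearranging Ptolemy's inequality:
        d(q, o) >= |d(q, p) * d(o, s) - d(q, s) * d(o, p)| / d(p, s)
    """
    return abs(d_qp * d_os - d_qs * d_op) / d_ps


def lower_bounds(q, o, pivots, dist=math.dist):
    """Return (best triangular, best Ptolemaic) lower bound on dist(q, o).

    Illustrative only: a real index would precompute the object-to-pivot
    and pivot-to-pivot distances instead of recomputing them per query.
    """
    tri = max(triangle_lower_bound(dist(q, p), dist(o, p)) for p in pivots)
    pto = max(
        ptolemaic_lower_bound(
            dist(q, p), dist(q, s), dist(o, p), dist(o, s), dist(p, s)
        )
        for p, s in combinations(pivots, 2)
    )
    return tri, pto


if __name__ == "__main__":
    query, obj = (0.0, 0.0), (3.0, 4.0)
    pivots = [(1.0, 0.0), (0.0, 2.0), (5.0, 1.0)]  # assumed, distinct pivots
    tri, pto = lower_bounds(query, obj, pivots)
    print("true distance:", math.dist(query, obj))
    print("triangular bound:", tri, "ptolemaic bound:", pto)
```

In a range query with radius r, an object can be discarded without computing its true distance whenever either bound exceeds r. Taking the larger of the two bounds corresponds to the combined filtering the abstract mentions.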

    A geometric framework for modelling similarity search

    The aim of this paper is to propose a geometric framework for modelling similarity search in large and multidimensional data spaces of general nature, which seems to be flexible enough to address such issues as analysis of complexity, indexability, and the 'curse of dimensionality.' Such a framework is provided by the concept of the so-called similarity workload, which is a probability metric space $\Omega$ (query domain) with a distinguished finite subspace $X$ (dataset), together with an assembly of concepts, techniques, and results from metric geometry. They include such notions as metric transform, $\varepsilon$-entropy, and the phenomenon of concentration of measure on high-dimensional structures. In particular, we discuss the relevance of the latter to understanding the curse of dimensionality. As some of those concepts and techniques are being currently reinvented by the database community, it seems desirable to try and bridge the gap between database research and the relevant work already done in geometry and analysis. Comment: 11 pages, LaTeX 2.

    Multidimensional Binning Techniques for a Two Parameter Trilinear Gauge Coupling Estimation at LEP II

    This paper describes two generalization schemes of the Optimal Variables technique for simultaneously estimating two Trilinear Gauge Couplings. The first is an iterative procedure to perform a 2-dimensional fit using the linear terms of the expansion of the probability density function with respect to the corresponding couplings, whilst the second is a clustering method of probability distribution representation in five dimensions. The pair production of W's at 183 GeV center-of-mass energy, where one W decays leptonically and the other hadronically, was used to demonstrate the optimal properties of the proposed estimation techniques. Comment: (25 pages, 11 figures)