
    An LSH Index for Computing Kendall's Tau over Top-k Lists

    We consider the problem of similarity search within a set of top-k lists under the Kendall's Tau distance function. This distance describes how related two rankings are in terms of concordantly and discordantly ordered items. As top-k lists are usually very short compared to the global domain of possible items to be ranked, creating an inverted index to look up overlapping lists is possible but does not capture the similarity measure tightly enough. In this work, we investigate locality sensitive hashing schemes for the Kendall's Tau distance and evaluate the proposed methods using two real-world datasets. Comment: 6 pages, 8 subfigures, presented at the Seventeenth International Workshop on the Web and Databases (WebDB 2014) co-located with ACM SIGMOD201
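    The distance described above, counted over concordantly and discordantly ordered item pairs, can be illustrated with a minimal sketch. It assumes a simplification that is not taken from the paper: only items that appear in both lists are compared, whereas top-k variants of Kendall's Tau also penalize items missing from one list. The function name `kendall_tau_distance` is hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(a, b):
    """Count item pairs that rankings a and b order differently.

    Assumption (not from the paper): only items present in both
    lists are compared; items missing from one list are ignored.
    """
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    shared = [item for item in a if item in pos_b]

    discordant = 0
    for x, y in combinations(shared, 2):
        # A pair is discordant when the two rankings disagree on its order.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            discordant += 1
    return discordant
```

    Identical lists give a distance of 0, and a fully reversed list of n shared items gives n(n-1)/2.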

    Labeling Faces Victimization Bunch Primarily Based Internet Pictures Annotation to Produce Authentication in Security

    Automatic face annotation is important in many real-world information management systems. Face tagging in images and videos has many potential applications in multimedia information retrieval. Face annotation is a subfield of face detection and recognition. Mining weakly labeled facial images on the internet is a promising paradigm for automatic face annotation. This paradigm motivates the new research problem of secure authentication. The goal of the system is to annotate unseen faces in images and videos with the words that best describe the image. A framework called search-based face annotation (SBFA) provides a way to mine weakly labeled facial images. Facial images that are available on the World Wide Web (WWW) or in the image database created by the security department can be annotated. One challenging problem with the search-based face annotation system is how to perform annotation effectively by listing similar facial images and their weak labels, which are noisy and incomplete. To overcome this problem, the proposed approach uses unsupervised label refinement (ULR) to refine the labels of web facial images. To speed up the proposed system, a clustering-based approximation algorithm is used. Annotation will help users search for the desired image or video. In addition, if the system is implemented in a social network, it will overcome the drawback of the current existing system, which tags manually.

    Accelerating Nearest Neighbor Search on Manycore Systems

    We develop methods for accelerating metric similarity search that are effective on modern hardware. Our algorithms factor into easily parallelizable components, making them simple to deploy and efficient on multicore CPUs and GPUs. Despite the simple structure of our algorithms, their search performance is provably sublinear in the size of the database, with a factor dependent only on its intrinsic dimensionality. We demonstrate that our methods provide substantial speedups on a range of datasets and hardware platforms. In particular, we present results on a 48-core server machine, on graphics hardware, and on a multicore desktop.

    Automatic Solar Tracking System and Fault Detection Using Wireless Technology

    Renewable energy solutions are becoming increasingly popular. Photovoltaic (solar) systems are but one example. Maximizing power output from a solar system is desirable to increase efficiency. In order to maximize power output from the solar panels, one needs to keep the panels aligned with the sun. As such, a means of tracking the sun is required. This is a far more cost-effective solution than purchasing additional solar panels. It has been estimated that the yield from solar panels can be increased by 30 to 60 percent by utilizing a tracking system instead of a stationary array [1]. This project develops an automatic tracking system which will keep the solar panels aligned with the sun in order to maximize efficiency. In this paper, we propose an automatic solar tracking system with fault detection using wireless technology as a key to the new era.

    Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search

    Retrieval pipelines commonly rely on a term-based search to obtain candidate records, which are subsequently re-ranked. Some candidates are missed by this approach, e.g., due to a vocabulary mismatch. We address this issue by replacing the term-based search with a generic k-NN retrieval algorithm, where a similarity function can take into account subtle term associations. While an exact brute-force k-NN search using this similarity function is slow, we demonstrate that an approximate algorithm can be nearly two orders of magnitude faster at the expense of only a small loss in accuracy. A retrieval pipeline using an approximate k-NN search can be more effective and efficient than the term-based pipeline. This opens up new possibilities for designing effective retrieval pipelines. Our software (including data-generating code) and derivative data based on the Stack Overflow collection are available online.

    Link and code: Fast indexing with graphs and compact regression codes

    Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside the memory requirements. In this paper, we revisit these approaches by considering, additionally, the memory constraint required to index billions of images on a single server. This leads us to propose a method based both on graph traversal and compact representations. We encode the indexed vectors using quantization and exploit the graph structure to refine the similarity estimation. In essence, our method takes the best of these two worlds: the search strategy is based on nested graphs, thereby providing high precision with a relatively small set of comparisons. At the same time it offers a significant memory compression. As a result, our approach outperforms the state of the art on operating points considering 64-128 bytes per vector, as demonstrated by our results on two billion-scale public benchmarks.