22 research outputs found

    An LSH Index for Computing Kendall's Tau over Top-k Lists

    We consider the problem of similarity search within a set of top-k lists under the Kendall's Tau distance function. This distance describes how related two rankings are in terms of concordantly and discordantly ordered items. As top-k lists are usually very short compared to the global domain of possible items to be ranked, creating an inverted index to look up overlapping lists is possible but does not capture the similarity measure tightly enough. In this work, we investigate locality sensitive hashing schemes for the Kendall's Tau distance and evaluate the proposed methods using two real-world datasets. Comment: 6 pages, 8 subfigures, presented at the Seventeenth International Workshop on the Web and Databases (WebDB 2014) co-located with ACM SIGMOD201
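
As a rough illustration of counting discordantly ordered pairs between two top-k lists, here is a minimal sketch. It assumes one common generalisation of Kendall's Tau to top-k lists (the penalty-based variant attributed to Fagin et al.); the abstract does not state which variant the paper uses. The function name `kendall_tau_topk` and the penalty parameter `p` are illustrative choices, not taken from the paper.

```python
from itertools import combinations
import math


def _order(rank, i, j):
    """Return -1 if i precedes j, 1 if j precedes i, 0 if both are missing.

    A missing item is treated as ranked after every listed item.
    """
    ri = rank.get(i, math.inf)
    rj = rank.get(j, math.inf)
    return (ri > rj) - (ri < rj)


def kendall_tau_topk(list_a, list_b, p=0.5):
    """Kendall's Tau distance between two top-k lists (assumed variant).

    Each pair of items ordered differently by the two lists costs 1.
    A pair that appears in one list while neither item appears in the
    other costs p, because the second list gives no order for it.
    """
    rank_a = {item: pos for pos, item in enumerate(list_a)}
    rank_b = {item: pos for pos, item in enumerate(list_b)}
    distance = 0.0
    for i, j in combinations(set(rank_a) | set(rank_b), 2):
        order_a = _order(rank_a, i, j)
        order_b = _order(rank_b, i, j)
        if order_a == 0 or order_b == 0:
            distance += p          # one list says nothing about this pair
        elif order_a != order_b:
            distance += 1          # discordant pair
    return distance
```

For example, `kendall_tau_topk(["x", "y", "z"], ["y", "x", "w"])` counts two discordant pairs: (x, y), and (z, w), where each list contains only one of the two items.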

    Sign Language and Computing in a Developing Country: A Research Roadmap for the Next Two Decades in the Philippines

    PACLIC / The University of the Philippines Visayas Cebu College Cebu City, Philippines / November 20-22, 200

    Efficient and Robust Detection of Duplicate Videos in a Database

    This paper presents a duplicate detection method that retrieves the best-matching model video for a given query video using fingerprints. We use the Color Layout Descriptor and the Opponent Color Space to extract features from each frame, perform k-means clustering to generate fingerprints, and encode the fingerprints by Vector Quantization. The model-to-query video distance is computed using a new distance measure. For efficient search, a coarse-to-fine matching scheme retrieves the best match. We perform experiments on query videos and real-time video with an average duration of 60 sec; the duplicate video is detected with high similarity.

    Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search

    Retrieval pipelines commonly rely on a term-based search to obtain candidate records, which are subsequently re-ranked. Some candidates are missed by this approach, e.g., due to a vocabulary mismatch. We address this issue by replacing the term-based search with a generic k-NN retrieval algorithm, where a similarity function can take into account subtle term associations. While an exact brute-force k-NN search using this similarity function is slow, we demonstrate that an approximate algorithm can be nearly two orders of magnitude faster at the expense of only a small loss in accuracy. A retrieval pipeline using an approximate k-NN search can be more effective and efficient than the term-based pipeline. This opens up new possibilities for designing effective retrieval pipelines. Our software (including data-generating code) and derivative data based on the Stack Overflow collection are available online.
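
To make the exact brute-force baseline concrete, here is a minimal sketch of k-NN retrieval with a pluggable similarity function. The names `brute_force_knn` and `jaccard` are hypothetical, and word-level Jaccard overlap is only a stand-in: it does not model the term associations the abstract describes, and the approximate algorithm is not shown.

```python
import heapq


def jaccard(a, b):
    """Placeholder similarity: word-set overlap between two strings."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def brute_force_knn(query, records, similarity, k=3):
    """Score every record against the query and keep the k most similar.

    The cost grows linearly with the number of records, which is why
    an exact search of this kind becomes slow on large collections.
    """
    return heapq.nlargest(k, records, key=lambda r: similarity(query, r))
```

Any function that takes a query and a record and returns a number can be passed as `similarity`, so the retrieval step stays independent of how similarity is defined.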

    PinnerSage: Multi-Modal User Embedding Framework for Recommendations at Pinterest

    Latent user representations are widely adopted in the tech industry for powering personalized recommender systems. Most prior work infers a single high dimensional embedding to represent a user, which is a good starting point but falls short in delivering a full understanding of the user's interests. In this work, we introduce PinnerSage, an end-to-end recommender system that represents each user via multi-modal embeddings and leverages this rich representation of users to provide high quality personalized recommendations. PinnerSage achieves this by clustering users' actions into conceptually coherent clusters with the help of a hierarchical clustering method (Ward) and summarizes the clusters via representative pins (Medoids) for efficiency and interpretability. PinnerSage is deployed in production at Pinterest and we outline the several design decisions that make it run seamlessly at a very large scale. We conduct several offline and online A/B experiments to show that our method significantly outperforms single embedding methods. Comment: 10 pages, 7 figures
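
The idea of summarizing a cluster by a medoid can be sketched as follows. This is an illustration under assumptions, not PinnerSage's implementation: the function name `medoid` is hypothetical, Euclidean distance is an assumed default, and the Ward clustering step is omitted.

```python
import math


def medoid(points, distance=math.dist):
    """Return the cluster member with the smallest total distance to all members.

    Unlike a centroid (the mean), the medoid is always an actual member
    of the cluster, so it can be shown as a concrete representative item.
    """
    return min(points, key=lambda c: sum(distance(c, other) for other in points))
```

For the points `(0, 0)`, `(1, 0)` and `(10, 0)`, the medoid is `(1, 0)`, while the mean would fall between members and correspond to no real item.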

    Hotel and Flight Online Booking System for UTP Human Resourse Department

    This report presents a newly developed web-based system that supports the Human Resource and Administration department (HRMA) of Universiti Teknologi PETRONAS (UTP) in booking hotels and flights for its staff. Booking hotels and flights is a common task that the department performs almost every day. However, the current systems are manual and not yet integrated, which leads to inefficient data management, especially in monitoring the status of requests. This report gives an overview of the project: its objectives, the problems encountered, the scope of study, the methodology, and the findings. The system enables staff to book hotel rooms and flight tickets as requested, generates guarantee letters automatically, sends notifications to the respective endorsers, stores and retrieves data, and lets applicants submit requests from a personal computer or mobile device, eliminating manual work and increasing flexibility. Developed with the Prototyping Methodology, the system uses case-based reasoning to complete tasks independently. The case-based reasoning functionality lets the system and its users review similar past requests and reuse their solutions for a new request. A realistic case demonstrates how the overall system helps HRMA staff reduce the time needed to handle each request. This approach should help HRMA improve its database management. Implementation of the new system has been shown to reduce the time HRMA staff need to handle each case. It also improves the efficiency and effectiveness of storing data in a structured system, since the time taken to apply and endorse is reduced. Besides speeding up the overall process and increasing flexibility, the system reduces paper usage because hardcopy forms are no longer involved. Testing shows that the system provides the desired solution, allowing users to book hotels and flights according to their needs. This shows that the developed system will offer net advantages once implemented.

    Document Image Indexing Using Edit Distance Based Hashing


    Modelagem de um sistema para apoio à tomada de decisão com uso de técnicas de raciocínio baseado em casos (Modeling a decision-support system using case-based reasoning techniques)

    This paper introduces the development process of a knowledge management system built to support decision-making processes. The study covers knowledge, intellectual capital, strategies used by companies, and a system using case-based reasoning techniques. Artificial Intelligence, through Case-Based Reasoning (CBR), was applied to meet the company's challenge of making knowledge part of its culture. The proposal was applied in the business sector of a company, giving the organization a detailed analysis of the results and helping managers make strategic decisions. Three tools were created and adapted to make this application possible and distinctive: a linear range of interest for each attribute, a symmetric similarity matrix, and a definition table of values per attribute. This allowed a significant improvement in the accuracy of real-time information, supporting decision making and generating effective results.