4 research outputs found

    Improving metric access methods with bucket files

    Modern applications deal with complex data, and retrieval by similarity plays an important role in most of them. Complex data whose primary comparison mechanisms are similarity predicates are usually embedded in metric spaces. Metric Access Methods (MAMs) exploit the properties of the metric space to divide it into regions and so process similarity queries, such as range and k-nearest neighbor queries, efficiently. Existing MAMs use homogeneous data structures to improve query execution, following the same techniques employed by traditional methods developed to retrieve scalar and multidimensional data. In this paper, we combine hashing and hierarchical ball partitioning to obtain a hybrid index tuned for similarity queries over complex data sets, with search algorithms that reduce total execution time by aggressively reducing the number of distance calculations. We applied our technique to the Slim-tree and performed experiments over real data sets, showing that the proposed technique reduces the execution time of both range and k-nearest neighbor queries to at most half that of the Slim-tree. Moreover, the technique is general enough to be applied to many existing MAMs. CAPES; CNPq; FAPESP. International Conference on Similarity Search and Applications - SISAP (8. 2015 Glasgow)
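As a rough illustration of how ball partitioning can save distance calculations, the sketch below runs a range query over a flat set of balls and skips any ball that the triangle inequality rules out. It is not the paper's hybrid index or the Slim-tree: the `Ball` class, `build_balls`, `range_query`, and the choice of Euclidean distance are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here as the example metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class Ball:
    """Hypothetical region: a center, a covering radius, and its members."""
    center: Point
    radius: float = 0.0
    members: List[Point] = field(default_factory=list)


def build_balls(points, centers, dist=euclidean):
    """Assign each point to its nearest center and track the covering radius."""
    balls = [Ball(center=c) for c in centers]
    for p in points:
        ball = min(balls, key=lambda b: dist(p, b.center))
        ball.members.append(p)
        ball.radius = max(ball.radius, dist(p, ball.center))
    return balls


def range_query(balls, q, r, dist=euclidean):
    """Return (points within r of q, number of distance calculations made)."""
    results, calls = [], 0
    for ball in balls:
        d_qc = dist(q, ball.center)
        calls += 1
        # Triangle inequality: every member p satisfies
        # dist(q, p) >= d_qc - ball.radius, so if that bound exceeds r
        # the whole ball can be skipped without touching its members.
        if d_qc > ball.radius + r:
            continue
        for p in ball.members:
            calls += 1
            if dist(q, p) <= r:
                results.append(p)
    return results, calls


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
balls = build_balls(points, centers=[(0.0, 0.0), (10.0, 10.0)])
found, calls = range_query(balls, q=(0.5, 0.5), r=1.0)
```

In this toy data set the far ball is discarded after a single distance calculation to its center, so the query computes fewer distances than a linear scan over all six points would.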

    3D oceanographic data compression using 3D-ODETLAP

    This paper describes a 3D environmental data compression technique for oceanographic datasets. With proper point selection, our method approximates uncompressed marine data using an over-determined system of linear equations based on, but essentially different from, the Laplacian partial differential equation. This approximation is then refined using an error metric. The two steps alternate until an approximation that meets a predefined criterion is found. Using several different datasets and metrics, we demonstrate that our method achieves an excellent compression ratio. To further evaluate our method, we compare it with 3D-SPIHT. 3D-ODETLAP averages 20% better compression than 3D-SPIHT on our eight test datasets from World Ocean Atlas 2005. Our method provides up to approximately six times better compression on datasets with relatively small variance. Moreover, at the same approximate mean error, we demonstrate a significantly smaller maximum error than 3D-SPIHT and provide a feature that keeps the maximum error under a user-defined limit.
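The alternation between approximating from selected points and refining by an error metric can be sketched in one dimension. This is only a loose analogy: piecewise-linear interpolation stands in for the over-determined linear system, which the abstract does not specify, and the names `interpolate` and `compress` are hypothetical. What the sketch does show is the greedy loop that keeps adding the worst-approximated point until the maximum error falls under a user-defined limit.

```python
def interpolate(samples, n):
    """Rebuild n values from {index: value} samples by linear interpolation.

    Stand-in for the real approximation step; assumes indices 0 and n - 1
    are always among the samples.
    """
    idx = sorted(samples)
    out = [0.0] * n
    for left, right in zip(idx, idx[1:]):
        for i in range(left, right + 1):
            t = (i - left) / (right - left)
            out[i] = samples[left] + t * (samples[right] - samples[left])
    for i in idx:
        out[i] = samples[i]  # stored points are reproduced exactly
    return out


def compress(data, max_error):
    """Greedily select points until every value is within max_error."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    samples = {0: data[0], n - 1: data[n - 1]}
    while True:
        approx = interpolate(samples, n)                     # step 1: approximate
        errors = [abs(a - d) for a, d in zip(approx, data)]
        worst = max(range(n), key=errors.__getitem__)        # step 2: find worst point
        if errors[worst] <= max_error:
            return samples, approx
        samples[worst] = data[worst]                         # refine and repeat


data = [0.0, 1.0, 2.0, 3.0, 10.0, 3.0, 2.0, 1.0, 0.0]
samples, approx = compress(data, max_error=0.5)
```

The stored `samples` dictionary plays the role of the compressed representation here; a tighter `max_error` keeps more points and a looser one keeps fewer.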

    Uma proposta para execução de consultas complexas em uma grande base de dados de imagens horizontalmente fragmentada [A proposal for executing complex queries on a large, horizontally fragmented image database]

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2014. Information retrieval systems are growing in popularity and efficiency. However, the retrieval of complex objects (e.g., images, videos, time series) still presents huge challenges, particularly when it involves content similarity. The problem becomes even more intricate if the search conditions include conventional predicates logically connected to similarity-based predicates. The optimization of such queries remains an open problem. This work validates a proposal for improving the performance of queries that can be expressed as conjunctions of conventional and similarity-based predicates. The proposal employs data fragmentation according to diverse predicates that are compatible with the predicates used in queries. It is validated on a large image database named CoPhIR, which holds images together with the conventional data associated with them. The database is handled in a relational database system with extensions for similarity-based predicates; it is characterized according to the distribution of its contents, fragmented, and indexed with both conventional and metric access methods. The experiments show that some queries with conjunctive filtering clauses executed more efficiently on the proposed fragments than on the complete database.
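A minimal sketch of the idea, assuming a toy in-memory table rather than the relational system and metric indexes used in the dissertation: rows are fragmented horizontally on a conventional attribute, and a conjunctive query then evaluates its similarity predicate only inside the fragment selected by the conventional predicate. The column names (`license`, `features`), the helper functions, and the linear scan inside each fragment are all hypothetical.

```python
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two feature vectors (example metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def fragment_by(rows, attribute):
    """Horizontally fragment rows by the value of a conventional attribute."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[attribute]].append(row)
    return fragments


def conjunctive_query(fragments, attr_value, query_vec, radius):
    """Evaluate: attribute = attr_value AND dist(features, query_vec) <= radius.

    Only the fragment matching the conventional predicate is scanned, so
    rows in other fragments never reach the distance function.
    """
    candidates = fragments.get(attr_value, [])
    return [row for row in candidates
            if euclidean(row["features"], query_vec) <= radius]


rows = [
    {"id": 1, "license": "cc-by", "features": (0.1, 0.2)},
    {"id": 2, "license": "cc-by", "features": (0.9, 0.8)},
    {"id": 3, "license": "all-rights", "features": (0.1, 0.2)},
    {"id": 4, "license": "all-rights", "features": (0.5, 0.5)},
]
fragments = fragment_by(rows, "license")
hits = conjunctive_query(fragments, "cc-by", query_vec=(0.0, 0.0), radius=0.5)
hit_ids = [row["id"] for row in hits]
```

Row 3 has the same feature vector as row 1 but is never compared against the query, because it lives in a different fragment.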