    Progressive Similarity Search on Time Series Data

    Time series data are increasing at a dramatic rate, yet their analysis remains highly relevant in a wide range of human activities. Due to their volume, existing systems dealing with time series data cannot guarantee interactive response times, even for fundamental tasks such as similarity search. Therefore, in this paper, we present our vision to develop analytic approaches that support exploration and decision making by providing progressive results before the final and exact ones have been computed. We demonstrate through experiments that providing first approximate and then progressive answers is useful (and necessary) for similarity search queries on very large time series data. Our findings indicate that there is a gap between the time the most similar answer is found and the time when the search algorithm terminates, resulting in inflated waiting times without any improvement. We present preliminary ideas on computing probabilistic estimates of the final results that could help users decide when to stop the search process, i.e., deciding when improvement in the final answer is unlikely, thus eliminating waiting time. Finally, we discuss two additional challenges: how to compute these probabilistic estimates efficiently, and how to communicate them to users.

    Hercules Against Data Series Similarity Search

    We propose Hercules, a parallel tree-based technique for exact similarity search on massive disk-based data series collections. We present novel index construction and query answering algorithms that leverage different summarization techniques, carefully schedule costly operations, optimize memory and disk accesses, and exploit the multi-threading and SIMD capabilities of modern hardware to perform CPU-intensive calculations. We demonstrate the superiority and robustness of Hercules with an extensive experimental evaluation against state-of-the-art techniques, using many synthetic and real datasets, and query workloads of varying difficulty. The results show that Hercules performs up to one order of magnitude faster than the best competitor (which is not always the same). Moreover, Hercules is the only index that outperforms the optimized scan in all scenarios, including the hard query workloads on disk-based datasets. This paper was published in the Proceedings of the VLDB Endowment, Volume 15, Number 10, June 2022.

    Intelligence artificielle: Les défis actuels et l'action d'Inria - Livre blanc Inria

    Inria White Paper No. 01. Inria white papers look at major current challenges in informatics and mathematics and show actions conducted by our project-teams to address these challenges. This document is the first produced by the Strategic Technology Monitoring & Prospective Studies Unit. Thanks to a reactive observation system, this unit plays a lead role in helping Inria develop its strategic and scientific orientations. It also enables the institute to anticipate the impact of digital sciences on all social and economic domains. It has been coordinated by Bertrand Braunschweig with contributions from 45 researchers from Inria and from our partners. Special thanks to Peter Sturm for his precise and complete review. Thanks also to the STIP service of the Saclay – Île-de-France centre for the final proofreading of the French version.
