Progressive Similarity Search on Time Series Data
Time series data are increasing at a dramatic rate, yet their analysis remains highly relevant in a wide range of human activities. Due to their volume, existing systems dealing with time series data cannot guarantee interactive response times, even for fundamental tasks such as similarity search. Therefore, in this paper, we present our vision to develop analytic approaches that support exploration and decision making by providing progressive results before the final and exact ones have been computed. We demonstrate through experiments that providing first approximate and then progressive answers is useful (and necessary) for similarity search queries on very large time series data. Our findings indicate that there is a gap between the time the most similar answer is found and the time when the search algorithm terminates, resulting in inflated waiting times without any improvement. We present preliminary ideas on computing probabilistic estimates of the final results that could help users decide when to stop the search process, i.e., when improvement in the final answer is unlikely, thus eliminating waiting time. Finally, we discuss two additional challenges: how to compute these probabilistic estimates efficiently, and how to communicate them to users.
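The gap described above, between the moment the best answer is found and the moment the search terminates, can be illustrated with a small sketch. This is not the authors' system: it assumes a plain sequential scan with Euclidean distance, and the names `euclidean`, `progressive_1nn`, and the toy `collection` are hypothetical.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def progressive_1nn(
    query: Sequence[float], collection: List[Sequence[float]]
) -> Iterator[Tuple[int, int, float]]:
    """Scan the collection sequentially (an assumption of this sketch).

    Yields (series_visited, best_index, best_distance) every time the
    best-so-far answer improves, so a caller can display intermediate results.
    """
    best_idx, best_dist = -1, math.inf
    for i, series in enumerate(collection):
        d = euclidean(query, series)
        if d < best_dist:
            best_idx, best_dist = i, d
            yield i + 1, best_idx, best_dist


# Toy data, invented for illustration only.
collection = [
    [0.0, 5.0, 9.0],
    [1.0, 2.0, 3.5],
    [1.0, 2.0, 3.0],
    [7.0, 7.0, 7.0],
    [4.0, 4.0, 4.0],
]
query = [1.0, 2.0, 3.1]

updates = list(progressive_1nn(query, collection))
for visited, idx, dist in updates:
    print(f"after {visited} series: best index {idx}, distance {dist:.3f}")
```

In this toy run the final answer is already known after the third series, yet the scan still visits all five. An exact search cannot know that without finishing, which is why the abstract proposes probabilistic estimates to support early stopping.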
ProS: Data Series Progressive k-NN Similarity Search and Classification with Probabilistic Quality Guarantees
Existing systems dealing with the increasing volume of data series cannot
guarantee interactive response times, even for fundamental tasks such as
similarity search. Therefore, it is necessary to develop analytic approaches
that support exploration and decision making by providing progressive results,
before the final and exact ones have been computed. Prior works lack both
efficiency and accuracy when applied to large-scale data series collections. We
present and experimentally evaluate ProS, a new probabilistic learning-based
method that provides quality guarantees for progressive Nearest Neighbor (NN)
query answering. We develop our method for k-NN queries and demonstrate how it
can be applied with the two most popular distance measures, namely, Euclidean
and Dynamic Time Warping (DTW). We provide both initial and progressive
estimates of the final answer that improve during the similarity
search, as well as suitable stopping criteria for the progressive queries.
Moreover, we describe how this method can be used in order to develop a
progressive algorithm for data series classification (based on a k-NN
classifier), and we additionally propose a method designed specifically for the
classification task. Experiments with several diverse synthetic and real
datasets demonstrate that our prediction methods constitute the first practical
solutions to the problem, significantly outperforming competing approaches.
This paper was published in the VLDB Journal (2022).
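As a rough illustration of the ingredients named in this abstract (DTW distance and a progressive k-NN classifier), the following sketch reports the current majority label after every visited series. It does not reproduce ProS or its probabilistic quality guarantees. The sequential scan, the unconstrained DTW variant, and the names `dtw`, `progressive_knn_labels`, and `labelled` are assumptions made for this example.

```python
import heapq
from collections import Counter
from typing import Iterator, List, Sequence, Tuple


def dtw(a: Sequence[float], b: Sequence[float]) -> float:
    """Unconstrained DTW with squared point-wise cost.

    Returns the square root of the minimal accumulated cost.
    """
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)] ** 0.5


def progressive_knn_labels(
    query: Sequence[float],
    labelled: List[Tuple[Sequence[float], str]],
    k: int,
) -> Iterator[str]:
    """Yield the majority label of the k best-so-far neighbours after each series."""
    heap: list = []  # max-heap via negated distance: (-distance, index, label)
    for i, (series, label) in enumerate(labelled):
        d = dtw(query, series)
        if len(heap) < k:
            heapq.heappush(heap, (-d, i, label))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i, label))
        votes = Counter(lbl for _, _, lbl in heap)
        yield votes.most_common(1)[0][0]


# Toy labelled collection, invented for illustration only.
labelled = [
    ([5.0, 5.0, 5.0], "high"),
    ([1.0, 2.0, 3.0], "ramp"),
    ([1.0, 2.0, 2.0, 3.0], "ramp"),
    ([6.0, 5.0, 6.0], "high"),
    ([1.0, 1.0, 2.0, 3.0], "ramp"),
]
predictions = list(progressive_knn_labels([1.0, 2.0, 3.0], labelled, k=3))
print(predictions[0], "->", predictions[-1])
```

The intermediate labels may change as better neighbours are found. A stopping criterion of the kind the abstract describes would decide when a further change is unlikely enough to stop early.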
DROP: Dimensionality Reduction Optimization for Time Series
Dimensionality reduction is a critical step in scaling machine learning
pipelines. Principal component analysis (PCA) is a standard tool for
dimensionality reduction, but performing PCA over a full dataset can be
prohibitively expensive. As a result, theoretical work has studied the
effectiveness of iterative, stochastic PCA methods that operate over data
samples. However, stochastic PCA methods either run for a predetermined number
of iterations or until the solution converges, frequently sampling too many or
too few datapoints for end-to-end runtime improvements. We show how accounting
for downstream analytics operations during
DR via PCA allows stochastic methods to efficiently terminate after operating
over small (e.g., 1%) subsamples of input data, reducing whole workload
runtime. Leveraging this, we propose DROP, a DR optimizer that enables speedups
of up to 5x over Singular-Value-Decomposition-based PCA techniques, and
outperforms conventional approaches like FFT and PAA by up to 16x in end-to-end workloads.
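For readers unfamiliar with PAA (Piecewise Aggregate Approximation), one of the conventional baselines mentioned above, a minimal sketch follows. It shows only the baseline, not DROP. The function name `paa` and the requirement that the series length be divisible by the number of segments are simplifying assumptions of this example.

```python
from typing import List, Sequence


def paa(series: Sequence[float], segments: int) -> List[float]:
    """Reduce a series to `segments` values by averaging equal-width windows.

    Simplifying assumption: len(series) must be a multiple of `segments`.
    """
    n = len(series)
    if segments <= 0 or n % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    width = n // segments
    return [sum(series[i:i + width]) / width for i in range(0, n, width)]


reduced = paa([1.0, 3.0, 5.0, 7.0, 2.0, 4.0], segments=3)
print(reduced)  # [2.0, 6.0, 3.0]
```

PAA costs a single pass over each series, whereas PCA must be fitted on the data. That cost difference is why the abstract emphasizes terminating stochastic PCA after small subsamples.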
Context-driven progressive enhancement of mobile web applications: a multicriteria decision-making approach
Personal computing has become all about mobile and embedded devices. As a result, the adoption rate of smartphones is rapidly increasing, and this trend has created a need for mobile applications to be available at any time, anywhere and on any device. Despite the obvious advantages of such immersive mobile applications, software developers are increasingly facing challenges related to device fragmentation. Current application development solutions are insufficiently prepared for handling the enormous variety of software platforms and hardware characteristics covering the mobile ecosystem. As a result, maintaining a viable balance between development costs and market coverage has turned out to be a challenging issue when developing mobile applications. This article proposes a context-aware software platform for the development and delivery of self-adaptive mobile applications over the Web. An adaptive application composition approach is introduced, capable of autonomously bypassing context-related fragmentation issues. This goal is achieved by incorporating and validating the concept of fine-grained progressive application enhancements based on a multicriteria decision-making strategy.
- …