Exact and Heuristic Approaches to Speeding Up the MSM Time Series Distance Computation
The computation of the distance of two time series is time-consuming for any
elastic distance function that accounts for misalignments. Among those
functions, DTW is the most prominent. However, a recent extensive evaluation
has shown that the move-split-merge (MSM) metric is superior to DTW regarding
the analytical accuracy of the 1-NN classifier. Unfortunately, the running time
of the standard dynamic programming algorithm for MSM distance computation is
O(n^2), where n is the length of the longest time series. In this
paper, we provide approaches to reducing the cost of MSM distance computations
by using lower and upper bounds for early pruning paths in the underlying
dynamic programming table. For the case of one time series being a constant, we
present a linear-time algorithm. In addition, we propose new linear-time
heuristics and adapt heuristics known from DTW to computing the MSM distance.
One heuristic employs the metric property of MSM and the previously introduced
linear-time algorithm. Our experimental studies demonstrate that our
approaches achieve substantial speed-ups over previous MSM algorithms. In
particular, MSM distance computation becomes faster than a state-of-the-art
DTW distance computation on a majority of the popular UCR data sets.
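The quadratic dynamic program that the paper's pruning techniques accelerate can be sketched as follows; this is a minimal illustrative implementation of the standard MSM recurrence, not the authors' optimized code, and the helper names are my own.

```python
# Minimal sketch of the standard O(n^2) dynamic program for the
# move-split-merge (MSM) distance. Names and structure are illustrative.

def _transition_cost(new_point, x, y, c):
    """Cost of a split/merge step: c if new_point lies between its
    neighbours x and y, otherwise c plus the distance to the nearer one."""
    if min(x, y) <= new_point <= max(x, y):
        return c
    return c + min(abs(new_point - x), abs(new_point - y))

def msm_distance(a, b, c=1.0):
    """MSM distance between real-valued time series a and b, with
    split/merge cost constant c."""
    m, n = len(a), len(b)
    INF = float("inf")
    D = [[INF] * n for _ in range(m)]
    D[0][0] = abs(a[0] - b[0])
    for i in range(1, m):                      # first column: merges only
        D[i][0] = D[i - 1][0] + _transition_cost(a[i], a[i - 1], b[0], c)
    for j in range(1, n):                      # first row: splits only
        D[0][j] = D[0][j - 1] + _transition_cost(b[j], a[0], b[j - 1], c)
    for i in range(1, m):
        for j in range(1, n):
            D[i][j] = min(
                D[i - 1][j - 1] + abs(a[i] - b[j]),                       # move
                D[i - 1][j] + _transition_cost(a[i], a[i - 1], b[j], c),  # merge
                D[i][j - 1] + _transition_cost(b[j], a[i], b[j - 1], c),  # split
            )
    return D[m - 1][n - 1]
```

The paper's lower and upper bounds prune entries of the table D before they are ever computed; the sketch above fills all m × n cells, which is exactly the cost the proposed approaches avoid.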
On the Complexity of Computing Time Series Medians Under the Move-Split-Merge Metric
We initiate a study of the complexity of MSM-Median, the problem of computing a median of a set of k real-valued time series under the move-split-merge distance. This distance measure is based on three operations: moves, which may shift the value of a data point in a time series; splits, which replace one data point in a time series by two consecutive data points of the same value; and merges, which replace two consecutive data points of equal value by a single data point of that value. The cost of a move operation is the absolute difference between the data point's value before and after the operation; the cost of a split or merge operation is a given constant c.
Our main results are as follows. First, we show that MSM-Median is NP-hard and W[1]-hard with respect to k for time series with at most three distinct values. Under the Exponential Time Hypothesis (ETH), our reduction implies that a previous dynamic programming algorithm with running time |I|^{O(k)} [Holznigenkemper et al., Data Min. Knowl. Discov. '23] is essentially optimal. Here, |I| denotes the total input size. Second, we show that MSM-Median can be solved in 2^{O(d/c)} * |I|^{O(1)} time, where d is the total distance of the median to the input time series.
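For intuition on the cost model underlying MSM-Median, a toy sketch (my own illustration, not from the paper) that replays a scripted sequence of move, split, and merge operations on a series and accumulates the total cost:

```python
# Toy illustration of the MSM cost model (not from the paper): apply a
# scripted sequence of edit operations and accumulate their total cost.

def apply_ops(series, ops, c):
    """ops is a list of tuples:
    ('move',  i, v)  set series[i] to v, cost |series[i] - v|
    ('split', i)     duplicate series[i], cost c
    ('merge', i)     merge equal series[i], series[i+1] into one, cost c
    """
    s, total = list(series), 0.0
    for op in ops:
        if op[0] == "move":
            _, i, v = op
            total += abs(s[i] - v)
            s[i] = v
        elif op[0] == "split":
            _, i = op
            s.insert(i + 1, s[i])
            total += c
        elif op[0] == "merge":
            _, i = op
            assert s[i] == s[i + 1], "merge requires equal neighbours"
            del s[i + 1]
            total += c
    return s, total

# Transform [3] into [3, 5]: split the 3 (cost c), move the copy to 5 (cost 2).
result, cost = apply_ops([3.0], [("split", 0), ("move", 1, 5.0)], c=0.1)
```

The MSM distance between two series is then the minimum total cost over all such operation sequences transforming one into the other.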
Multi-Step Processing of Spatial Joins
Spatial joins are one of the most important operations for combining spatial objects of several relations. In this paper, spatial join processing is studied in detail for extended spatial objects in two-dimensional data space. We present an approach for spatial join processing that is based on three steps. First, a spatial join is performed on the minimum bounding rectangles of the objects, returning a set of candidates. Various approaches for accelerating this step of join processing were examined at last year's conference [BKS 93a]. In this paper, we focus on the problem of how to compute the answers from the set of candidates, which is handled by
the following two steps. First, sophisticated approximations
are used to identify answers as well as to filter out false hits from
the set of candidates. For this purpose, we investigate various types
of conservative and progressive approximations. In the last step, the
exact geometry of the remaining candidates has to be tested against
the join predicate. The time required for computing spatial join
predicates can be substantially reduced when objects are suitably
organized in main memory. In our approach, objects are first decomposed
into simple components which are exclusively organized
by a main-memory resident spatial data structure. Overall, we
present a complete approach of spatial join processing on complex
spatial objects. The performance of the individual steps of our approach
is evaluated with data sets from real cartographic applications.
The results show that our approach reduces the total execution
time of the spatial join by factors.
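The three-step filter-and-refine pipeline described above can be sketched as follows; the rectangle representation, object layout, and helper names are assumptions for illustration, not the paper's implementation.

```python
# Sketch of a three-step spatial join: MBR filter, approximation-based
# candidate pruning, exact geometry test. Representations are illustrative.

def mbr_overlap(a, b):
    """a, b are axis-aligned rectangles (xmin, ymin, xmax, ymax)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def spatial_join(objs_r, objs_s, exact_test):
    """Each object: dict with 'mbr' (a conservative approximation) and
    'inner' (a progressive approximation: a rectangle fully inside the
    object, or None). exact_test(r, s) evaluates the join predicate on
    the exact geometry and is only called for remaining candidates."""
    results = []
    for r in objs_r:
        for s in objs_s:
            # Step 1: conservative filter removes impossible pairs.
            if not mbr_overlap(r["mbr"], s["mbr"]):
                continue
            # Step 2: progressive approximations identify sure answers,
            # since overlapping inner rectangles imply overlapping objects.
            if r["inner"] and s["inner"] and mbr_overlap(r["inner"], s["inner"]):
                results.append((r["id"], s["id"]))
                continue
            # Step 3: exact geometry test on the remaining candidates.
            if exact_test(r, s):
                results.append((r["id"], s["id"]))
    return results
```

In this sketch the expensive exact_test is reached only for pairs that neither the conservative nor the progressive approximation could decide, which is the source of the savings the abstract reports.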
Efficient Processing of Spatial Joins Using R-Trees
Abstract: In this paper, we show that spatial joins are well suited to being processed on a parallel hardware platform. The parallel system is equipped with a so-called shared virtual memory, which is well-suited for the design and implementation of parallel spatial join algorithms. We start with an algorithm that consists of three phases: task creation, task assignment and parallel task execution. In order to reduce CPU- and I/O-cost, the three phases are processed in a fashion that preserves spatial locality. Dynamic load balancing is achieved by splitting tasks into smaller ones and reassigning some of the smaller tasks to idle processors. In an experimental performance comparison, we identify the advantages and disadvantages of several variants of our algorithm. The most efficient one shows an almost optimal speed-up under the assumption that the number of disks is sufficiently large. Topics: spatial database systems, parallel database systems
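The task-creation and load-balancing ideas above can be sketched in simplified form; the grid-based locality grouping and the size threshold are my own illustrative stand-ins, not the paper's shared-virtual-memory implementation.

```python
# Sketch of locality-preserving task creation and task splitting for
# load balancing (illustrative, not the paper's implementation). Tasks
# are formed per grid cell so rectangles that are close in space land
# in the same task; oversized tasks are then split so idle processors
# can pick up the pieces.

from collections import defaultdict

def create_tasks(rects, cell_size):
    """Group rectangles (xmin, ymin, xmax, ymax) by the grid cell
    containing their lower-left corner."""
    cells = defaultdict(list)
    for r in rects:
        key = (int(r[0] // cell_size), int(r[1] // cell_size))
        cells[key].append(r)
    return list(cells.values())

def balance(tasks, max_task_size):
    """Split tasks larger than max_task_size into smaller chunks that
    can be reassigned to idle processors."""
    out = []
    for t in tasks:
        while len(t) > max_task_size:
            out.append(t[:max_task_size])
            t = t[max_task_size:]
        out.append(t)
    return out
```

Keeping each task within one grid cell preserves spatial locality (nearby objects are read together, reducing I/O), while the splitting step trades some of that locality for a more even load across processors.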
Bayesian Experimental Design of Magnetic Resonance Imaging Sequences
We show how improved sequences for magnetic resonance imaging can be found through optimization of Bayesian design scores. Combining approximate Bayesian inference and natural image statistics with high-performance numerical computation, we propose the first Bayesian experimental design framework for this problem of high relevance to clinical and brain research. Our solution requires large-scale approximate inference for dense, non-Gaussian models. We propose a novel scalable variational inference algorithm, and show how powerful methods of numerical mathematics can be modified to compute primitives in our framework. Our approach is evaluated on raw data from a 3T MR scanner.
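As a simplified illustration of score-driven experimental design, the sketch below greedily selects measurement directions by expected information gain in a Gaussian linear model. This Gaussian toy model is my own stand-in for exposition; the paper's framework is non-Gaussian and uses scalable variational inference rather than these closed-form updates.

```python
# Toy Bayesian experimental design for a Gaussian linear model
# y = u^T w + noise: greedily pick the candidate u maximizing the
# information gain 0.5 * log(1 + u^T Sigma u / sigma2), then apply the
# rank-one posterior covariance update. Illustrative stand-in only.

import math

def info_gain(u, Sigma, sigma2):
    """Expected reduction in posterior entropy from measuring along u."""
    quad = sum(u[i] * sum(Sigma[i][j] * u[j] for j in range(len(u)))
               for i in range(len(u)))
    return 0.5 * math.log(1.0 + quad / sigma2)

def posterior_update(Sigma, u, sigma2):
    """Rank-one update of the posterior covariance after measuring u."""
    n = len(u)
    Su = [sum(Sigma[i][j] * u[j] for j in range(n)) for i in range(n)]
    denom = sigma2 + sum(u[i] * Su[i] for i in range(n))
    return [[Sigma[i][j] - Su[i] * Su[j] / denom for j in range(n)]
            for i in range(n)]

def greedy_design(candidates, Sigma, sigma2, budget):
    """Return the indices of `budget` greedily chosen measurements."""
    chosen = []
    for _ in range(budget):
        best = max(range(len(candidates)),
                   key=lambda k: info_gain(candidates[k], Sigma, sigma2))
        chosen.append(best)
        Sigma = posterior_update(Sigma, candidates[best], sigma2)
    return chosen
```

Each selection shrinks the posterior covariance along the measured direction, so subsequent scores automatically favour directions that are still uncertain, which is the core feedback loop behind sequential design of MRI sampling patterns.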