On the Complexity of Computing Time Series Medians Under the Move-Split-Merge Metric

Abstract

We initiate a study of the complexity of MSM-Median, the problem of computing a median of a set of k real-valued time series under the move-split-merge distance. This distance measure is based on three operations: moves, which change the value of a data point in a time series; splits, which replace one data point in a time series by two consecutive data points of the same value; and merges, which replace two consecutive data points of equal value by a single data point of the same value. The cost of a move operation is the absolute difference between the data point's value before and after the operation; the cost of split and merge operations is defined via a given constant c. Our main results are as follows. First, we show that MSM-Median is NP-hard and W[1]-hard with respect to k for time series with at most three distinct values. Under the Exponential Time Hypothesis (ETH), our reduction implies that a previous dynamic programming algorithm with running time |I|^{O(k)} [Holznigenkemper et al., Data Min. Knowl. Discov. '23] is essentially optimal. Here, |I| denotes the total input size. Second, we show that MSM-Median can be solved in 2^{O(d/c)} · |I|^{O(1)} time, where d is the total distance of the median to the input time series.
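To make the three operations concrete, the following is a minimal Python sketch of how each one acts on a time series and what it costs. It only illustrates the definitions above and is not the algorithm from the paper. The function names (`move`, `split`, `merge`), the list representation, and the choice c = 0.5 are assumptions made for illustration, as is the assumption that a split or merge costs exactly c.

```python
from typing import List, Tuple

Series = List[float]


def move(x: Series, i: int, new_value: float) -> Tuple[Series, float]:
    """Set x[i] to new_value; the cost is the absolute change in value."""
    y = list(x)
    cost = abs(new_value - y[i])
    y[i] = new_value
    return y, cost


def split(x: Series, i: int, c: float) -> Tuple[Series, float]:
    """Replace x[i] by two consecutive copies of itself; assumed cost c."""
    y = x[: i + 1] + [x[i]] + x[i + 1 :]
    return y, c


def merge(x: Series, i: int, c: float) -> Tuple[Series, float]:
    """Replace the equal neighbours x[i], x[i+1] by one point; assumed cost c."""
    if x[i] != x[i + 1]:
        raise ValueError("merge requires two consecutive points of equal value")
    y = x[:i] + x[i + 1 :]
    return y, c


# Hypothetical example: transform [1, 3] into [1, 2, 2] with c = 0.5.
c = 0.5
series, cost_move = move([1.0, 3.0], 1, 2.0)   # [1.0, 2.0], cost 1.0
series, cost_split = split(series, 1, c)       # [1.0, 2.0, 2.0], cost 0.5
total = cost_move + cost_split

# Merging the two trailing 2s undoes the split, again at cost c.
merged, cost_merge = merge(series, 1, c)

print(series, total)
```

The sum of the costs of such a sequence is an upper bound on the distance between the first and the last series; the move-split-merge distance itself is the minimum total cost over all sequences of operations that transform one series into the other.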
