
    A Better Alternative to Piecewise Linear Time Series Segmentation

    Time series are difficult to monitor, summarize and predict. Segmentation organizes a time series into a few intervals having uniform characteristics (flatness, linearity, modality, monotonicity and so on). For scalability, we require fast linear-time algorithms. The popular piecewise linear model can determine where the data goes up or down and at what rate. Unfortunately, when the data does not follow a linear model, the computation of the local slope creates overfitting. We propose an adaptive time series model where the polynomial degree of each interval varies (constant, linear and so on). Given a number of regressors, the cost of each interval is its polynomial degree: constant intervals cost 1 regressor, linear intervals cost 2 regressors, and so on. Our goal is to minimize the Euclidean (l_2) error for a given model complexity. Experimentally, we investigate the model where intervals can be either constant or linear. Over synthetic random walks, historical stock market prices, and electrocardiograms, the adaptive model provides a more accurate segmentation than the piecewise linear model without increasing the cross-validation error or the running time, while providing a richer vocabulary to applications. Implementation issues, such as numerical stability and real-world performance, are discussed.
    Comment: to appear in SIAM Data Mining 200
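
    The abstract gives the cost model but not an algorithm. The sketch below (my own illustration, with assumed names such as adaptive_segmentation; not the authors' code) computes the optimal constant-or-linear segmentation exactly by dynamic programming over a regressor budget, using prefix sums so each interval's squared error costs O(1). Note the trade-off: this exact method runs in O(n^2 k) time, whereas the abstract insists on linear-time algorithms, so it only illustrates the model being optimized.

```python
import numpy as np

def adaptive_segmentation(y, budget):
    """Minimise total l_2 error with constant (1 regressor) or linear
    (2 regressors) intervals, spending at most `budget` regressors.
    Returns (error, [(start, end, degree), ...]) with half-open intervals."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    x = np.arange(n, dtype=float)
    prefix = lambda a: np.concatenate(([0.0], np.cumsum(a)))
    Sy, Syy = prefix(y), prefix(y * y)
    Sx, Sxx, Sxy = prefix(x), prefix(x * x), prefix(x * y)

    def sse(i, j, degree):
        # Squared error of the best degree-0 or degree-1 fit on [i, j).
        m = j - i
        sy, syy = Sy[j] - Sy[i], Syy[j] - Syy[i]
        vyy = syy - sy * sy / m
        if degree == 0:
            return vyy
        sx = Sx[j] - Sx[i]
        vxx = (Sxx[j] - Sxx[i]) - sx * sx / m
        vxy = (Sxy[j] - Sxy[i]) - sx * sy / m
        return vyy - vxy * vxy / vxx  # vxx > 0 whenever m >= 2

    INF = float("inf")
    # dp[j][b]: least error for y[:j] using at most b regressors.
    dp = [[0.0] * (budget + 1)] + [[INF] * (budget + 1) for _ in range(n)]
    choice = {}
    for j in range(1, n + 1):
        for i in range(j):
            for deg in (0, 1):
                cost = deg + 1
                if j - i <= deg:  # a degree-d fit needs d + 1 points
                    continue
                e = sse(i, j, deg)
                for b in range(cost, budget + 1):
                    cand = dp[i][b - cost] + e
                    if cand < dp[j][b]:
                        dp[j][b] = cand
                        choice[(j, b)] = (i, deg)
    # Backtrack the chosen intervals.
    segments, j, b = [], n, budget
    while j > 0:
        i, deg = choice[(j, b)]
        segments.append((i, j, deg))
        j, b = i, b - (deg + 1)
    return dp[n][budget], segments[::-1]

# Example: a flat stretch followed by a ramp, with budget 3 = 1 + 2.
series = np.concatenate([np.full(50, 2.0), 2.0 + 0.1 * np.arange(50)])
print(adaptive_segmentation(series, budget=3))  # near-zero error
```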

    Fast likelihood-based change point detection

    Change point detection plays a fundamental role in many real-world applications, where the goal is to analyze and monitor the behaviour of a data stream. In this paper, we study change detection in binary streams. To this end, we use a likelihood ratio between two models as a measure for indicating change. The first model is a single Bernoulli variable, while the second model divides the stored data into two segments and models each segment with its own Bernoulli variable. Finding the optimal split can be done in O(n) time, where n is the number of entries since the last change point. This is too expensive for large n. To combat this, we propose an approximation scheme that yields a (1 − ε) approximation in O(ε⁻¹ log² n) time. The speed-up consists of several steps: first, we reduce the number of possible candidates by adopting a known result from segmentation problems. We then show that for fixed Bernoulli parameters we can find the optimal change point in logarithmic time. Finally, we show how to construct a candidate list of size O(ε⁻¹ log n) for the model parameters. We demonstrate empirically the approximation quality and the running time of our algorithm, showing that we can gain a significant speed-up with a minimal average loss in optimality.
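
    The abstract's baseline is concrete enough to sketch. Below is a minimal illustration (my own code, with assumed names such as best_split; not the paper's implementation) of the exact O(n) scan it calls too expensive for large n: accumulate the count of ones to the left of each split, and score every split by the log-likelihood ratio between the two-Bernoulli model and the single-Bernoulli model. The paper's actual contribution, the (1 − ε)-approximation in O(ε⁻¹ log² n) time via candidate pruning, is not reproduced here.

```python
import math

def bernoulli_loglik(ones, m):
    """Maximised Bernoulli log-likelihood of m bits, `ones` of them set."""
    if m == 0 or ones == 0 or ones == m:
        return 0.0  # MLE p-hat is 0 or 1: likelihood 1, log-likelihood 0
    p = ones / m
    return ones * math.log(p) + (m - ones) * math.log(1.0 - p)

def best_split(bits):
    """Exact O(n) scan: best (log-likelihood ratio, split index) over all splits."""
    n, total = len(bits), sum(bits)
    whole = bernoulli_loglik(total, n)  # single-Bernoulli model
    best, best_t = float("-inf"), None
    left_ones = 0
    for t in range(1, n):  # two-Bernoulli model: bits[:t] and bits[t:]
        left_ones += bits[t - 1]
        ratio = (bernoulli_loglik(left_ones, t)
                 + bernoulli_loglik(total - left_ones, n - t)
                 - whole)
        if ratio > best:
            best, best_t = ratio, t
    return best, best_t

# Example: an abrupt change in the Bernoulli parameter at index 40.
stream = [0] * 40 + [1] * 10
print(best_split(stream))  # large ratio, split index 40
```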
