On Optimal Multiple Changepoint Algorithms for Large Data
There is an increasing need for algorithms that can accurately detect
changepoints in long time-series, or equivalent, data. Many common approaches
to detecting changepoints, for example based on penalised likelihood or minimum
description length, can be formulated in terms of minimising a cost over
segmentations. Dynamic programming methods exist to solve this minimisation
problem exactly, but these tend to scale at least quadratically in the length
of the time-series. Algorithms, such as Binary Segmentation, exist that have a
computational cost that is close to linear in the length of the time-series,
but these are not guaranteed to find the optimal segmentation. Recently, pruning
ideas have been suggested that can speed up the dynamic programming algorithms,
whilst still being guaranteed to find the true minimum of the cost function. Here
we extend these pruning methods, and introduce two new algorithms for
segmenting data, FPOP and SNIP. Empirical results show that FPOP is
substantially faster than existing dynamic programming methods, and unlike the
existing methods its computational efficiency is robust to the number of
changepoints in the data. We evaluate the method at detecting Copy Number
Variations and observe that FPOP has a computational cost that is competitive
with that of Binary Segmentation.
Efficient analysis of complex changepoint problems
Many time series experience abrupt changes in structure. Detecting where these changes in structure, or changepoints, occur is required for effective modelling of the data. In this thesis we explore the common approaches used for detecting changepoints. We focus in particular on techniques which can be formulated in terms of minimising a cost over segmentations and solved exactly using a class of dynamic programming algorithms. Often implementations of these dynamic programming methods have a computational cost which scales poorly with the length of the time series. Recently, pruning ideas have been suggested that can speed up the dynamic programming algorithms, whilst still being guaranteed to be optimal. In this thesis we extend these methods. First we develop two new algorithms for segmenting piecewise constant data: FPOP and SNIP. We evaluate them against other methods in the literature. We then move on to develop the method OPPL for detecting changes in data subject to fitting a continuous piecewise linear model. We evaluate it against similar methods. We finally extend the OPPL method to deal with penalties that depend on the segment length.
A pruned dynamic programming algorithm to recover the best segmentations with 1 to $K_{max}$ change-points
A common computational problem in multiple change-point models is to recover
the segmentations with 1 to $K_{max}$ change-points of minimal cost with
respect to some loss function. Here we present an algorithm to prune the set of
candidate change-points which is based on a functional representation of the
cost of segmentations. We study the worst case complexity of the algorithm when
there is a unidimensional parameter per segment and demonstrate that it is at
worst equivalent to the complexity of the segment neighbourhood algorithm:
$O(K_{max} n^2)$. For a particular loss function we demonstrate that
pruning is on average efficient even if there are no change-points in the
signal. Finally, we empirically study the performance of the algorithm in the
case of the quadratic loss and show that it is faster than the segment
neighbourhood algorithm.
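The segment neighbourhood algorithm that this abstract uses as its complexity baseline is itself a short dynamic program: it computes the minimal cost of segmenting the data with each number of changepoints from 0 up to $K_{max}$. A minimal unpruned sketch, assuming a squared-error segment cost (the quadratic loss the abstract mentions); the function name is illustrative:

```python
def segment_neighbourhood(y, kmax):
    """Segment neighbourhood dynamic program, O(kmax * n^2).

    Returns a list of minimal squared-error costs for fitting y
    with exactly k changepoints, for k = 0, ..., kmax.
    """
    n = len(y)
    # Prefix sums give O(1) evaluation of each segment cost.
    S = [0.0] * (n + 1)
    S2 = [0.0] * (n + 1)
    for i, v in enumerate(y):
        S[i + 1] = S[i] + v
        S2[i + 1] = S2[i] + v * v

    def cost(s, t):
        # Squared-error cost of fitting a single mean to y[s:t].
        m = t - s
        return S2[t] - S2[s] - (S[t] - S[s]) ** 2 / m

    INF = float("inf")
    # F[k][t]: minimal cost of segmenting y[:t] with k changepoints.
    F = [[INF] * (n + 1) for _ in range(kmax + 1)]
    for t in range(1, n + 1):
        F[0][t] = cost(0, t)
    for k in range(1, kmax + 1):
        for t in range(k + 1, n + 1):
            # The k-th changepoint is at some s; recurse on y[:s].
            F[k][t] = min(F[k - 1][s] + cost(s, t) for s in range(k, t))
    return [F[k][n] for k in range(kmax + 1)]
```

The inner minimisation over all earlier positions $s$ for every $t$ and $k$ is what yields the $O(K_{max} n^2)$ cost; the pruning in the abstract works by discarding candidate values of $s$ that can no longer be optimal.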