
    On Optimal Multiple Changepoint Algorithms for Large Data

    There is an increasing need for algorithms that can accurately detect changepoints in long time-series or equivalent data. Many common approaches to detecting changepoints, for example those based on penalised likelihood or minimum description length, can be formulated in terms of minimising a cost over segmentations. Dynamic programming methods exist to solve this minimisation problem exactly, but these tend to scale at least quadratically in the length of the time-series. Algorithms such as Binary Segmentation have a computational cost that is close to linear in the length of the time-series, but they are not guaranteed to find the optimal segmentation. Recently, pruning ideas have been suggested that can speed up the dynamic programming algorithms whilst still being guaranteed to find the true minimum of the cost function. Here we extend these pruning methods and introduce two new algorithms for segmenting data, FPOP and SNIP. Empirical results show that FPOP is substantially faster than existing dynamic programming methods, and unlike the existing methods its computational efficiency is robust to the number of changepoints in the data. We evaluate the method on detecting Copy Number Variations and observe that FPOP has a computational cost that is competitive with that of Binary Segmentation.
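The penalised cost minimisation described in this abstract can be solved by the recursion F(t) = min_s { F(s) + C(y over (s, t]) + beta }. The sketch below is an illustrative Python implementation of this dynamic programme with PELT-style inequality pruning of candidate changepoints; it is not the authors' FPOP functional-pruning code, and the function name `pelt_mean_change` and the squared-error cost are assumptions made for the example.

```python
import numpy as np

def pelt_mean_change(y, beta):
    """Pruned optimal partitioning for changes in mean, squared-error cost.
    `beta` is the per-changepoint penalty. Illustrative sketch only: uses
    PELT-style inequality pruning, not FPOP's functional pruning."""
    n = len(y)
    y = np.asarray(y, float)
    # Prefix sums give O(1) evaluation of each segment cost.
    S = np.concatenate(([0.0], np.cumsum(y)))
    S2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def cost(s, t):  # squared-error cost of segment y[s:t], 0 <= s < t <= n
        return S2[t] - S2[s] - (S[t] - S[s]) ** 2 / (t - s)

    F = np.full(n + 1, np.inf)
    F[0] = -beta                     # so each changepoint pays beta exactly once
    last = np.zeros(n + 1, dtype=int)
    cands = [0]                      # candidate positions of the last changepoint
    for t in range(1, n + 1):
        vals = [F[s] + cost(s, t) + beta for s in cands]
        i = int(np.argmin(vals))
        F[t] = vals[i]
        last[t] = cands[i]
        # Pruning step: discard any s that can never again be optimal.
        cands = [s for s, v in zip(cands, vals) if v - beta <= F[t]]
        cands.append(t)
    # Backtrack to recover the changepoint locations.
    cps, t = [], n
    while t > 0:
        t = last[t]
        if t > 0:
            cps.append(t)
    return sorted(cps)
```

Because pruned candidates are provably never optimal at any later time, the pruning preserves exactness while keeping the candidate set small when changepoints are frequent.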

    Efficient analysis of complex changepoint problems

    Many time series experience abrupt changes in structure. Detecting where these changes in structure, or changepoints, occur is required for effective modelling of the data. In this thesis we explore the common approaches used for detecting changepoints. We focus in particular on techniques which can be formulated in terms of minimising a cost over segmentations and solved exactly using a class of dynamic programming algorithms. Implementations of these dynamic programming methods often have a computational cost which scales poorly with the length of the time series. Recently, pruning ideas have been suggested that can speed up the dynamic programming algorithms whilst still being guaranteed to be optimal. In this thesis we extend these methods. First we develop two new algorithms for segmenting piecewise constant data, FPOP and SNIP, and evaluate them against other methods in the literature. We then develop the method OPPL for detecting changes in data subject to fitting a continuous piecewise linear model, and evaluate it against similar methods. We finally extend the OPPL method to deal with penalties that depend on the segment length.
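Among the common approaches the thesis compares against is Binary Segmentation: repeatedly place the single best changepoint in a segment and recurse on the two halves, stopping when the cost reduction no longer beats the penalty. A minimal Python sketch of this greedy (not guaranteed-optimal) baseline follows; the squared-error cost and the name `binseg_mean` are assumptions for illustration.

```python
import numpy as np

def binseg_mean(y, beta):
    """Binary segmentation for changes in mean (greedy, not exact).
    Splits at the best single changepoint and recurses while the
    cost reduction exceeds the penalty `beta`. Illustrative sketch."""
    y = np.asarray(y, float)

    def seg_cost(x):  # squared-error cost of fitting one mean to x
        return float(np.sum((x - x.mean()) ** 2)) if len(x) else 0.0

    def split(lo, hi, out):
        base = seg_cost(y[lo:hi])
        best_gain, best_k = 0.0, None
        for k in range(lo + 1, hi):
            gain = base - seg_cost(y[lo:k]) - seg_cost(y[k:hi])
            if gain > best_gain:
                best_gain, best_k = gain, k
        if best_k is not None and best_gain > beta:
            split(lo, best_k, out)      # recurse left of the split
            out.append(best_k)
            split(best_k, hi, out)      # recurse right of the split
    out = []
    split(0, len(y), out)
    return out
```

This naive version recomputes segment costs from scratch; production implementations use prefix sums to make each split search linear, which is what gives Binary Segmentation its near-linear overall cost.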

    A pruned dynamic programming algorithm to recover the best segmentations with $1$ to $K_{max}$ change-points

    A common computational problem in multiple change-point models is to recover the segmentations with $1$ to $K_{max}$ change-points of minimal cost with respect to some loss function. Here we present an algorithm to prune the set of candidate change-points which is based on a functional representation of the cost of segmentations. We study the worst-case complexity of the algorithm when there is a unidimensional parameter per segment and demonstrate that it is at worst equivalent to the complexity of the segment neighbourhood algorithm: $\mathcal{O}(K_{max} n^2)$. For a particular loss function we demonstrate that pruning is on average efficient even if there are no change-points in the signal. Finally, we empirically study the performance of the algorithm in the case of the quadratic loss and show that it is faster than the segment neighbourhood algorithm.
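The unpruned segment neighbourhood recursion this abstract benchmarks against fills a table C[k][t], the minimal cost of segmenting the first t points with k change-points, at $\mathcal{O}(K_{max} n^2)$ cost. A minimal Python sketch (quadratic loss, no pruning; the function name is an assumption for the example):

```python
import numpy as np

def segment_neighbourhood(y, Kmax):
    """Segment neighbourhood dynamic programme, squared-error loss.
    C[k, t] = minimal cost of segmenting y[:t] with k change-points.
    No pruning, so the run time is O(Kmax * n^2). Illustrative sketch."""
    y = np.asarray(y, float)
    n = len(y)
    S = np.concatenate(([0.0], np.cumsum(y)))
    S2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def cost(s, t):  # squared-error cost of segment y[s:t]
        return S2[t] - S2[s] - (S[t] - S[s]) ** 2 / (t - s)

    C = np.full((Kmax + 1, n + 1), np.inf)
    back = np.zeros((Kmax + 1, n + 1), dtype=int)
    for t in range(1, n + 1):
        C[0, t] = cost(0, t)         # zero change-points: one segment
    for k in range(1, Kmax + 1):
        for t in range(k + 1, n + 1):
            for s in range(k, t):    # position of the k-th change-point
                v = C[k - 1, s] + cost(s, t)
                if v < C[k, t]:
                    C[k, t] = v
                    back[k, t] = s
    # Recover the best change-point locations for each k by backtracking.
    segs = {}
    for k in range(Kmax + 1):
        cps, t = [], n
        for j in range(k, 0, -1):
            t = back[j, t]
            cps.append(t)
        segs[k] = sorted(cps)
    return C[:, n], segs
```

The pruning proposed in the paper keeps this exact recursion but shrinks the inner minimisation over s by discarding candidates via a functional (piecewise) representation of the cost.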