Search CORE

15 research outputs found

Group lassoing change-points in piecewise-constant AR processes

Author: A Puig
A Ruszczynski
A Willsky
B Efron
D Rudoy
E Lebarbier
E Punskaya
EJ Candes
FR Bach
G Winkler
H Zou
I Gorodnitsky
J Fan
J Tropp
J-J Fuchs
JF Sturm
L Boysen
M Basseville
M Lavielle
M Lavielle
M Lavielle
M Yuan
MG Hall
N Dobigeon
P Fearnhead
P Stoica
P Tseng
PJ Brockwell
PM Djuric
R Tibshirani
R Tibshirani
U Appel
Y Nardi
YC Eldar
YC Eldar
Z Harchaoui
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Selective review of offline change point detection methods

Author: Oudre Laurent
Truong Charles
Vayatis Nicolas
Publication venue: 'Elsevier BV'
Publication date: 01/02/2020
Field of study

This article presents a selective survey of algorithms for the offline detection of multiple change points in multivariate time series. A general yet structuring methodological strategy is adopted to organize this vast body of work. More precisely, detection algorithms considered in this review are characterized by three elements: a cost function, a search method and a constraint on the number of changes. Each of those elements is described, reviewed and discussed separately. Implementations of the main algorithms described in this article are provided within a Python package called ruptures

arXiv.org e-Print Archive

HAL-Paris 13

Econometric methods and data Science techniques: A review of two strands of literature and an introduction to hybrid methods

Author: Jun YU
XIE Tian
ZENG Tao
Publication venue: SMU Economics and Statistics Working Paper Series, Paper No. 16-2020
Publication date: 01/05/2020
Field of study

Institutional Knowledge at Singapore Management University

Optimal Change Point Detection by Error Maximization

Author: Serre Aurélien
Publication venue
Publication date: 01/08/2019
Field of study

RÉSUMÉ : L’entreprise PREDICT qui est notre partenaire sur ce projet est entre autres spécialisé dans la surveillance d’installations industrielles. Cela consiste à surveiller certains paramètres mesurés sur un équipement (comme par exemple des mesures de température, pression, vibrations etc...) au cours du temps de manière à détecter d’éventuels signes précurseurs d’une panne. Cette tâche nécessite très souvent de détecter au préalable les instants auxquels des opérations de maintenance ont été effectuées, ou encore où l’usage fait de l’équipement change. De plus, la détection doit être faite simplement à l’aide des mesures surveillées, sans accès à de l’information supplémentaires à propos des événements à détecter. Ce problème peut être formulé comme un problème de détection de ruptures, qui consiste à estimer les instants où les propriétés statistiques d’une série temporelle changent de manière abrupte. La détection de ruptures a énormément été étudié en traitement du signal, et a des applications dans de nombreux domaines tels qu’en bioinformatique, en analyse du climat, en finance, en traitement de la parole, ainsi qu’en maintenance conditionnelle. Dans la littérature, de nombreuse méthodes fréquentistes existent qui consistent à associer un coût à toute configuration possible des positions des ruptures à l’aide d’un modèle statistique de la série temporelle. Le nombre de ruptures et leur positions sont ensuite estimés en maximisant ce coût sur toutes les configurations possibles. En général, le coût est conçu pour représenter l’ajustement du modèle constant par morceaux associé à la configuration. Dans ce mémoire, nous proposons une nouvelle approche du problème de détection de ruptures qui consiste à maximiser la différence des propriétés statistiques entre segments consécutifs séparés par les points de ruptures. Pour cela, nous développons un nouveau type de fonction objectif basé sur la différence de propriétés statistiques, par opposition aux fonctions objectif basées sur l’ajustement utilisées dans la littérature. Étant donné que ce nouveau type de fonction objectif n’est pas compatible avec les algorithmes existants, nous introduisons également deux algorithmes permettant la résolution du problème d’optimisation correspondant à cette nouvelle fonction objectif. Nous comparons les performances de cette nouvelle approche avec trois méthodes issues de la littérature. Deux d’entre elles, appelées Pruned Exact Linear Time and Segment Neighbourhood sont exacte, tandis que la troisième, Sliding Adjacent Windows est une méthode approximative basée sur une fonction objectif similaire à celle que nous proposons. Nous effectuons cette comparaison à l’aide de deux jeux de données empiriques, dont l’un nous a été fourni par PREDICT et correspond à un cas d’application qui les intéresse. Nous montrons que sur ces deux jeux de données montrent que notre méthode est capable d’estimer la position des ruptures de manière plus précise que les trois méthodes concurrentes. Notre approche peut être appliquée à de nombreux types de séries temporelles différentes, grâce au fait qu’elle peut être combinée avec de nombreux modèles différents pour décrire la série temporelle. De plus, cette approche se révèle efficaces dans des cas d’application concrets de notre partenaire PREDICT.----------ABSTRACT : Our partner PREDICT is specialized in condition monitoring of industrial systems, which consists in monitoring certain parameters measured on an equipment (such as temperature, pressure, vibration) through time in order to detect signs indicative of a developing fault. For performing this task, they often need to detect events such as the occurrence of maintenance operations or changes of the conditions in which the equipment is being operated. This detection task needs to be performed using only the monitored measurements, with no additional external information available about the events. This problem can be formulated as a change point detection problem, which consists in detecting and finding the positions of abrupt changes of the statistical properties of a time-series. Change point detection has been extensively studied in signal processing, and has applications in a wide range of fields such as bioinformatics, climate analysis, finance, speech processing and condition based maintenance. In the literature, many frequentist methods have been developed, where a statistical model of the time-series is used to assign a cost to any possible configuration of the change points. The estimated number and positions of the change points is then obtained by minimizing this cost over the set of all possible configurations. The cost is typically a measure of the goodness of fit of a piecewise constant model that changes at each change point. In this work, we propose a new approach to change point detection that consists in maximizing the discrepancy of the statistical properties between consecutive segments delimited by the change points. We do this by developing a new type of discrepancy-based objective function different from the goodness of fit-based cost functions from the literature. We also propose an appropriate algorithm for solving the associated optimization problem, since our new type of objective function is not compatible with the existing algorithms. We compare the performance of this new approach against two exact methods called Pruned Exact Linear Time and Segment Neighbourhood, as well as an approximate method based on a similar objective function called Sliding Adjacent Windows. This comparison is performed on two real-world datasets, one of them being supplied by our partner PREDICT, and corresponding to a use case they would be interested in. On both of these datasets, we show that our approach is able to estimate the positions of the change points more accurately than the three competitors. Our approach can be applied on a wide range of different types time-series, since it can accommodate many different models for the data. Moreover, it proves to be useful to our partner PREDICT on concrete use cases

PolyPublie

Shrinkage Estimation of Regression Models with Multiple Structural Changes

Author: QIAN Junhui
SU Liangjun
Publication venue: SMU Economics and Statistics Working Paper Series, No. 06-2014
Publication date: 01/08/2014
Field of study

Published in Econometric Theory https://doi.org/10.1017/S0266466615000237</p

Institutional Knowledge at Singapore Management University

Shrinkage estimation of regression models with multiple structural changes

Author: QIAN Junhai
SU Liangjun
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/12/2016
Field of study

Ministry of Education, Singapore under its Academic Research Funding Tier

Institutional Knowledge at Singapore Management University

Change-points Estimation in Statistical Inference and Machine Learning Problems

Author: Zhang Bingwen
Publication venue: Digital WPI
Publication date: 14/08/2017
Field of study

Statistical inference plays an increasingly important role in science, finance and industry. Despite the extensive research and wide application of statistical inference, most of the efforts focus on uniform models. This thesis considers the statistical inference in models with abrupt changes instead. The task is to estimate change-points where the underlying models change. We first study low dimensional linear regression problems for which the underlying model undergoes multiple changes. Our goal is to estimate the number and locations of change-points that segment available data into different regions, and further produce sparse and interpretable models for each region. To address challenges of the existing approaches and to produce interpretable models, we propose a sparse group Lasso (SGL) based approach for linear regression problems with change-points. Then we extend our method to high dimensional nonhomogeneous linear regression models. Under certain assumptions and using a properly chosen regularization parameter, we show several desirable properties of the method. We further extend our studies to generalized linear models (GLM) and prove similar results. In practice, change-points inference usually involves high dimensional data, hence it is prone to tackle for distributed learning with feature partitioning data, which implies each machine in the cluster stores a part of the features. One bottleneck for distributed learning is communication. For this implementation concern, we design communication efficient algorithm for feature partitioning data sets to speed up not only change-points inference but also other classes of machine learning problem including Lasso, support vector machine (SVM) and logistic regression

DigitalCommons@WPI