Univariate Mean Change Point Detection: Penalization, CUSUM and Optimality
The problem of univariate mean change point detection and localization based
on a sequence of $n$ independent observations with piecewise constant means has
been intensively studied for more than half a century, and serves as a blueprint
for change point problems in more complex settings. We provide a complete
characterization of this classical problem in a general framework in which the
upper bound $\sigma^2$ on the noise variance, the minimal spacing $\Delta$
between two consecutive change points and the minimal magnitude $\kappa$ of the
changes are allowed to vary with $n$. We first show that consistent
localization of the change points is impossible when the signal-to-noise ratio
$\frac{\kappa\sqrt{\Delta}}{\sigma} < \sqrt{\log(n)}$. In contrast, when
$\frac{\kappa\sqrt{\Delta}}{\sigma}$ diverges with $n$ at the rate of at least
$\sqrt{\log(n)}$, we demonstrate that two computationally-efficient change
point estimators, one based on the solution to an $\ell_0$-penalized least
squares problem and the other on the popular wild binary segmentation
algorithm, are both consistent and achieve a localization rate of the order
$\frac{\sigma^2}{\kappa^2}\log(n)$. We further show that such a rate is minimax
optimal, up to a $\log(n)$ term.
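To make the CUSUM idea in the abstract above concrete, here is a minimal sketch of plain binary segmentation for a univariate mean. It is not the wild binary segmentation or the penalized least squares estimator studied in the paper; the function names and the `threshold` parameter are assumptions chosen for illustration, and in practice the threshold would be tuned to the noise level.

```python
import numpy as np

def cusum(y, s, e, t):
    """CUSUM statistic for splitting the segment y[s:e] at t (s < t < e)."""
    n_left, n_right = t - s, e - t
    scale = np.sqrt(n_left * n_right / (n_left + n_right))
    return scale * (y[s:t].mean() - y[t:e].mean())

def best_split(y, s, e):
    """Return (t, |CUSUM|) for the split of y[s:e] with the largest |CUSUM|."""
    stats = [abs(cusum(y, s, e, t)) for t in range(s + 1, e)]
    k = int(np.argmax(stats))
    return s + 1 + k, stats[k]

def binary_segmentation(y, threshold, s=0, e=None):
    """Recursively split y[s:e] while the best |CUSUM| exceeds the threshold.

    Returns the sorted list of estimated change points, where a change
    point t means that y[t] is the first observation of a new segment.
    """
    e = len(y) if e is None else e
    if e - s < 2:
        return []
    t, stat = best_split(y, s, e)
    if stat <= threshold:
        return []
    left = binary_segmentation(y, threshold, s, t)
    right = binary_segmentation(y, threshold, t, e)
    return left + [t] + right
```

On a noiseless sequence with one jump, the largest absolute CUSUM value sits exactly at the jump, so the sketch recovers it and then stops because both halves are constant.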
Heterogeneous Change Point Inference
We propose HSMUCE (heterogeneous simultaneous multiscale change-point
estimator) for the detection of multiple change-points of the signal in a
heterogeneous Gaussian regression model. A piecewise constant function is
estimated by minimizing the number of change-points over the acceptance region
of a multiscale test which locally adapts to changes in the variance. The
multiscale test is a combination of local likelihood ratio tests which are
properly calibrated by scale dependent critical values in order to keep a
global nominal level alpha, even for finite samples. We show that HSMUCE
controls the error of over- and underestimation of the number of change-points.
To this end, new deviation bounds for F-type statistics are derived. Moreover,
we obtain confidence sets for the whole signal. All results are non-asymptotic
and uniform over a large class of heterogeneous change-point models. HSMUCE is
fast to compute, achieves the optimal detection rate and estimates the number
of change-points at almost optimal accuracy for vanishing signals, while still
being robust. We compare HSMUCE with several state-of-the-art methods in
simulations and analyse current recordings of a transmembrane protein in the
bacterial outer membrane with pronounced heterogeneity for its states. An
R-package is available online.
Multiscale Change-Point Inference
We introduce a new estimator SMUCE (simultaneous multiscale change-point
estimator) for the change-point problem in exponential family regression. An
unknown step function is estimated by minimizing the number of change-points
over the acceptance region of a multiscale test at a level \alpha. The
probability of overestimating the true number of change-points K is controlled
by the asymptotic null distribution of the multiscale test statistic. Further,
we derive exponential bounds for the probability of underestimating K. By
balancing these quantities, \alpha will be chosen such that the probability of
correctly estimating K is maximized. All results are even non-asymptotic for
the normal case. Based on the aforementioned bounds, we construct
asymptotically honest confidence sets for the unknown step function and its
change-points. At the same time, we obtain exponential bounds for estimating
the change-point locations which for example yield the minimax rate O(1/n) up
to a log term. Finally, SMUCE asymptotically achieves the optimal detection
rate of vanishing signals. We illustrate how dynamic programming techniques can
be employed for efficient computation of estimators and confidence regions. The
performance of the proposed multiscale approach is illustrated by simulations
and in two cutting-edge applications from genetic engineering and photoemission
spectroscopy.
Algebraic Change-Point Detection
Elementary techniques from operational calculus, differential algebra, and
noncommutative algebra lead to a new approach for change-point detection, which
is an important field of investigation in various areas of applied sciences and
engineering. Several successful numerical experiments are presented.
Graph-Based Change-Point Detection
We consider the testing and estimation of change-points -- locations where
the distribution abruptly changes -- in a data sequence. A new approach, based
on scan statistics utilizing graphs representing the similarity between
observations, is proposed. The graph-based approach is non-parametric, and can
be applied to any data set as long as an informative similarity measure on the
sample space can be defined. Accurate analytic approximations to the
significance of graph-based scan statistics for both the single change-point
and the changed interval alternatives are provided. Simulations reveal that the
new approach has better power than existing approaches when the dimension of
the data is moderate to high. The new approach is illustrated on two
applications: The determination of authorship of a classic novel, and the
detection of change in a network over time.
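As a rough illustration of the graph-based idea described above, the sketch below builds a nearest-neighbor similarity graph and, for each candidate split, counts the edges that join an observation before the split to one after it. Unusually few crossing edges suggest a change. This is a simplified sketch under stated assumptions: the nearest-neighbor graph, the Euclidean distance and all function names are illustrative choices, and the count is only centered by its expectation under random time ordering rather than fully standardized, so it is not the scan statistic analysed in the paper.

```python
import numpy as np

def nearest_neighbor_edges(points):
    """Edges (i, j) linking each observation to its nearest neighbor (Euclidean)."""
    pts = np.asarray(points, dtype=float)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # an observation is not its own neighbor
    return [(i, int(j)) for i, j in enumerate(dist.argmin(axis=1))]

def crossing_counts(edges, n):
    """counts[t - 1] = number of edges joining an index < t to an index >= t."""
    counts = np.zeros(n - 1, dtype=int)
    for i, j in edges:
        a, b = min(i, j), max(i, j)
        counts[a:b] += 1  # splits t = a + 1, ..., b separate a from b
    return counts

def estimate_split(points):
    """Split t (first t observations vs the rest) with the largest edge deficit."""
    n = len(points)
    edges = nearest_neighbor_edges(points)
    t = np.arange(1, n)
    # Expected crossing count if the time order were a uniformly random permutation.
    expected = len(edges) * 2 * t * (n - t) / (n * (n - 1))
    deficit = expected - crossing_counts(edges, n)
    return int(t[np.argmax(deficit)])
```

Only pairwise distances enter the computation, which mirrors the abstract's point that the approach needs nothing beyond an informative similarity measure on the sample space.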
Optimal change point detection and localization in sparse dynamic networks
We study the problem of change point localization in dynamic network models. We assume that we observe a sequence of independent adjacency matrices of the same size, each corresponding to a realization of an unknown inhomogeneous Bernoulli model. The underlying distributions of the adjacency matrices are piecewise constant, and may change over a subset of the time points, called change points. We are concerned with recovering the unknown number and positions of the change points. In our model setting, we allow all the model parameters to change with the total number of time points, including the network size, the minimal spacing between consecutive change points, the magnitude of the smallest change and the degree of sparsity of the networks. We first identify a region of impossibility in the space of the model parameters such that no change point estimator is provably consistent if the data are generated according to parameters falling in that region. We propose a computationally simple algorithm for network change point localization, called network binary segmentation, that relies on weighted averages of the adjacency matrices. We show that network binary segmentation is consistent over a range of the model parameters that nearly covers the complement of the impossibility region, thus demonstrating the existence of a phase transition for the problem at hand. Next, we devise a more sophisticated algorithm based on singular value thresholding, called local refinement, that delivers more accurate estimates of the change point locations. Under appropriate conditions, local refinement guarantees a minimax optimal rate for network change point localization while remaining computationally feasible.
Ratio tests for change point detection
We propose new tests to detect a change in the mean of a time series. Like
many existing tests, the new ones are based on the CUSUM process. Existing
CUSUM tests require an estimator of a scale parameter to make them
asymptotically distribution free under the no change null hypothesis. Even if
the observations are independent, the estimation of the scale parameter is not
simple since the estimator for the scale parameter should be at least
consistent under the null as well as under the alternative. The situation is
much more complicated in case of dependent data, where the empirical spectral
density at 0 is used to scale the CUSUM process. To circumvent these
difficulties, new tests are proposed which are ratios of CUSUM functionals. We
demonstrate the applicability of our method to detect a change in the mean when
the errors are AR(1) and GARCH(1,1) sequences.
Comment: Published at http://dx.doi.org/10.1214/193940307000000220 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org)
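The scale-free property described in the abstract above can be seen in a short sketch. For each candidate split, the maximal centered partial sum of the observations before the split is divided by the same quantity computed after the split, so any unknown scale parameter cancels. The exact functional, the `trim` parameter and the function names are assumptions made for illustration; the abstract does not specify the authors' precise statistic or its critical values.

```python
import numpy as np

def max_abs_centered_cusum(x):
    """max_i |sum_{j <= i} (x_j - mean(x))| for one block of observations."""
    return np.max(np.abs(np.cumsum(x - x.mean())))

def ratio_statistic(x, trim=0.1):
    """Largest ratio of before-split to after-split CUSUM functionals.

    Splits closer than a fraction `trim` to either end are skipped so that
    both blocks contain enough observations. A block that is exactly
    constant would give a zero denominator and is not handled here.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    lo = max(2, int(np.ceil(trim * n)))
    hi = min(n - 2, int(np.floor((1 - trim) * n)))
    ratios = []
    for k in range(lo, hi + 1):
        numerator = max_abs_centered_cusum(x[:k])
        denominator = max_abs_centered_cusum(x[k:])
        ratios.append(numerator / denominator)
    return max(ratios)
```

Multiplying the data by a constant or adding a constant to it leaves the value unchanged up to floating-point rounding, which is why no estimator of the scale parameter is needed to compute it.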
Dynamic change-point detection using similarity networks
From a sequence of similarity networks, with edges representing certain
similarity measures between nodes, we are interested in detecting a
change-point which changes the statistical properties of the networks. After the
change, a subset of anomalous nodes compares dissimilarly with the normal
nodes. We study a simple sequential change detection procedure based on
node-wise average similarity measures, and study its theoretical properties.
Simulation and real-data examples demonstrate that such a simple stopping procedure
has reasonably good performance. We further discuss faulty sensor isolation
(estimating anomalous nodes) using community detection.
Comment: appeared in Asilomar Conference 201
