Authors' reply to the discussion of 'Automatic change-point detection in time series via deep learning' at the discussion meeting on 'Probabilistic and statistical aspects of machine learning'
We would like to thank the proposer, seconder, and all discussants for their time in reading our article and their thought-provoking comments. We are glad to find a broad consensus that a neural-network-based approach offers a flexible framework for automatic change-point analysis. There are a number of common themes to the comments, and we have therefore structured our response around the topics of the theory, training, the importance of standardization and possible extensions, before addressing some of the remaining individual comments.
Robust mean change point testing in high-dimensional data with heavy tails
We study a mean change point testing problem for high-dimensional data with exponentially- or polynomially-decaying tails. In each case, depending on the $\ell_0$-norm of the mean change vector, we separately consider dense and sparse regimes. We characterise the boundary between the dense and sparse regimes under the above two tail conditions for the first time in the change point literature and propose novel testing procedures that attain optimal rates in each of the four regimes up to a poly-iterated logarithmic factor. Our results quantify the costs of heavy-tailedness on the fundamental difficulty of change point testing problems for high-dimensional data by comparison to previous results under Gaussian assumptions. To be specific, when the error vectors follow sub-Weibull distributions, a CUSUM-type statistic is shown to achieve a minimax testing rate up to a $\sqrt{\log\log(8n)}$ factor. When the error distributions have polynomially-decaying tails, admitting bounded $\alpha$-th moments for some $\alpha \geq 4$, we introduce a median-of-means-type test statistic that achieves a near-optimal testing rate in both dense and sparse regimes. In particular, in the sparse regime, we further propose a computationally-efficient test to achieve exact optimality. Surprisingly, our investigation of the even more challenging case of $2 \leq \alpha < 4$ unveils a new phenomenon: the minimax testing rate has no sparse regime, i.e. testing sparse changes is information-theoretically as hard as testing dense changes. This phenomenon implies a phase transition of the minimax testing rates at $\alpha = 4$.
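To make the flavour of such procedures concrete, the sketch below implements a generic max-type CUSUM scan for a single mean change in high-dimensional data. It is a minimal illustration, not the authors' exact test: their robust variants replace sample means with more robust aggregates (e.g. medians of block means), and the coordinate aggregation by Euclidean norm here is an assumed, illustrative choice.

import numpy as np

def cusum_statistic(X):
    # Max-type CUSUM scan for a single mean change in an (n, p) array:
    # returns the largest standardised CUSUM norm over candidate splits.
    n, p = X.shape
    prefix = np.cumsum(X, axis=0)              # prefix sums, shape (n, p)
    total = prefix[-1]
    best = 0.0
    for t in range(1, n):
        weight = np.sqrt(t * (n - t) / n)      # standard CUSUM weighting
        diff = prefix[t - 1] / t - (total - prefix[t - 1]) / (n - t)
        best = max(best, weight * np.linalg.norm(diff))
    return best

# Toy usage: a sparse mean shift halfway through a Gaussian panel.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
X[n // 2:, :5] += 1.0                          # change in the first 5 coordinates only
print(cusum_statistic(X))

The statistic is then compared with a threshold calibrated under the no-change null; the dense and sparse regimes discussed above differ in how the coordinates are aggregated and thresholded.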
Automatic Change-Point Detection in Time Series via Deep Learning
Detecting change-points in data is challenging because of the range of
possible types of change and types of behaviour of data when there is no
change. Statistically efficient methods for detecting a change will depend on
both of these features, and it can be difficult for a practitioner to develop
an appropriate detection method for their application of interest. We show how
to automatically generate new detection methods based on training a neural
network. Our approach is motivated by many existing tests for the presence of a
change-point being able to be represented by a simple neural network, and thus
a neural network trained with sufficient data should have performance at least
as good as these methods. We present theory that quantifies the error rate for
such an approach, and how it depends on the amount of training data. Empirical
results show that, even with limited training data, its performance is
competitive with the standard CUSUM test for detecting a change in mean when
the noise is independent and Gaussian, and can substantially outperform it in
the presence of auto-correlated or heavy-tailed noise. Our method also shows
strong results in detecting and localising changes in activity based on
accelerometer data.
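As a rough illustration of the training idea, the sketch below fits a small neural network to simulated labelled series (change present vs. absent). The architecture, shift size and training setup are assumptions made for illustration, not the configuration used in the paper.

import numpy as np
import torch
from torch import nn

def simulate(n_series=2000, n=100, seed=1):
    # Label-1 series receive a mean shift of size 1 at a random change-point.
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_series, n))
    y = rng.integers(0, 2, n_series)
    for i in np.flatnonzero(y):
        tau = rng.integers(n // 4, 3 * n // 4)  # change-point location
        X[i, tau:] += 1.0
    return (torch.tensor(X, dtype=torch.float32),
            torch.tensor(y, dtype=torch.float32))

X, y = simulate()
model = nn.Sequential(nn.Linear(100, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(200):                            # full-batch training, for brevity
    opt.zero_grad()
    loss = loss_fn(model(X).squeeze(1), y)
    loss.backward()
    opt.step()

# Classify a fresh series: estimated probability that it contains a change.
X_test, _ = simulate(n_series=1, seed=7)
print(torch.sigmoid(model(X_test)).item())

Because a CUSUM-type test can itself be written as a small network of this kind, a classifier trained with enough simulated data can in principle match it, and training on auto-correlated or heavy-tailed noise is what lets the learned detector adapt beyond it.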
A Survey of Bayesian Statistical Approaches for Big Data
The modern era is characterised as an era of information or Big Data. This
has motivated a huge literature on new methods for extracting information and
insights from these data. A natural question is how these approaches differ
from those that were available prior to the advent of Big Data. We present a
review of published studies that present Bayesian statistical approaches
specifically for Big Data and discuss the reported and perceived benefits of
these approaches. We conclude by addressing the question of whether focusing
only on improving computational algorithms and infrastructure will be enough to
face the challenges of Big Data.
Research on the Influence of Euro VI Diesel Engine Assembly Consistency on NOx Emissions
The assembly consistency of a diesel engine affects the variation of its nitrogen oxides (NOx) emissions. In order to improve the NOx emissions of diesel engines, a study was carried out based on the assembly tolerance variation of the diesel engine's combustion system. Firstly, a diesel engine that meets the Euro VI standards is selected and its experimental data are obtained. The mesh model and combustion model of the engine combustion system are built in the Converge software (version 2.4, Tecplot, Bellevue, WA, USA), and the experimental data are used to calibrate the combustion model. Then, four-factor, three-level orthogonal simulation experiments are carried out on the dimensional parameters: nozzle extension height, throat diameter, shrinkage diameter and combustion chamber depth. Mathematical analysis of the experimental data shows that variations in nozzle extension height and combustion chamber depth have a strong influence on NOx emissions, while variation in combustion chamber diameter has a weak influence on NOx production. According to the regression model obtained from the analysis, NOx emissions are a quadratic function of nozzle extension height, and the amount of NOx increases with increasing nozzle extension height. The relationship between emission performance and size parameters is complex. In the selected size range, the influence of the variation of the chamber diameter on NOx is linear. The variation of the chamber depth also has an effect on NOx production, and the simulation results vary with the assembly tolerance variation. Thus, in the engine assembly process, it is necessary to strictly control the nozzle extension height and combustion chamber depth. The research results are useful for improving the NOx emissions of diesel engines and provide a basis for the control strategy of selective catalytic reduction (SCR) devices.
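As a toy illustration of the kind of regression model described above, the sketch below fits a quadratic of NOx against nozzle extension height; the data points are hypothetical values invented for illustration, not results from the study.

import numpy as np

# Hypothetical (nozzle extension height in mm, NOx in g/kWh) pairs;
# the values are illustrative and not taken from the paper.
height = np.array([2.0, 2.5, 3.0, 3.5, 4.0])
nox = np.array([5.1, 5.3, 5.8, 6.6, 7.7])

# Fit the quadratic regression model NOx = a*h^2 + b*h + c.
a, b, c = np.polyfit(height, nox, deg=2)
print(f"NOx ~ {a:.3f}*h^2 + {b:.3f}*h + {c:.3f}")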