Change-point Problem and Regression: An Annotated Bibliography
The problems of identifying changes at unknown times and of estimating the location of changes in stochastic processes are referred to as the change-point problem or, in the Eastern literature, as the disorder problem.
The change-point problem, first introduced in the quality control context, has since developed into a fundamental problem in the areas of statistical control theory, stationarity of a stochastic process, estimation of the current position of a time series, testing and estimation of change in the patterns of a regression model, and most recently in the comparison and matching of DNA sequences in microarray data analysis.
Numerous methodological approaches have been implemented in examining change-point models. Maximum-likelihood estimation, Bayesian estimation, isotonic regression, piecewise regression, quasi-likelihood and non-parametric regression are among the methods which have been applied to resolving challenges in change-point problems. Grid-searching approaches have also been used to examine the change-point problem.
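As a concrete illustration of the grid-search approach mentioned above, the least-squares split for a single change point in a piecewise-constant mean model can be sketched as follows (the function name and simulated data are illustrative, not drawn from any of the surveyed articles):

```python
import numpy as np

def grid_search_changepoint(y):
    """Estimate a single change point in a piecewise-constant mean model
    by minimizing the total residual sum of squares over all split points."""
    n = len(y)
    best_tau, best_rss = None, np.inf
    for tau in range(1, n):  # candidate change point: split before index tau
        left, right = y[:tau], y[tau:]
        rss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if rss < best_rss:
            best_tau, best_rss = tau, rss
    return best_tau

rng = np.random.default_rng(0)
# mean shifts from 0 to 3 at observation 100
y = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])
print(grid_search_changepoint(y))  # estimate close to the true change point at 100
```

The same exhaustive search extends directly to segmented regression by fitting a separate regression line on each side of every candidate split, at quadratic cost in the sample size.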
Statistical analysis of change-point problems depends on the method of data collection. If the data collection is ongoing until some random time, then the appropriate statistical procedure is called sequential. If, however, a large finite set of data is collected with the purpose of determining if at least one change-point occurred, then this may be referred to as non-sequential. Not surprisingly, both the former and the latter have a rich literature with much of the earlier work focusing on sequential methods inspired by applications in quality control for industrial processes. In the regression literature, the change-point model is also referred to as two- or multiple-phase regression, switching regression, segmented regression, two-stage least squares (Shaban, 1980), or broken-line regression.
The area of the change-point problem has been the subject of intensive research in the past half-century. The subject has evolved considerably and found applications in many different areas. It seems rather impossible to summarize all of the research carried out over the past 50 years on the change-point problem. We have therefore confined ourselves to those articles on change-point problems which pertain to regression.
The important branch of sequential procedures in change-point problems has been left out entirely. We refer the readers to the seminal review papers by Lai (1995, 2001). The so-called structural change models, which occupy a considerable portion of the research in the area of change-point, particularly among econometricians, have not been fully considered. We refer the reader to Perron (2005) for an updated review in this area. Articles on change-point in time series are considered only if the methodologies presented in the paper pertain to regression analysis.
Sequential and Adaptive Inference Based on Martingale Concentration
Randomized experiments hold a well-deserved place at the top of the hierarchy of scientific evidence, and as such have received a great deal of attention from the statistical research community. In the simplest setting, a fixed group of subjects is available to the experimenter, who assigns one of two treatments to each subject via randomization, then observes corresponding outcomes. The goal is to draw inference about the effect of the experimental treatment on the observed outcome.
Classical, frequentist statistical inference provides a powerful set of tools for this fixed-sample setting. We begin with an observed sample of some deterministic size and seek procedures which yield valid hypothesis tests, p-values, and confidence intervals---for example, a t-test of the null hypothesis that the experimental treatment has no effect, on average, or a corresponding confidence interval for the average treatment effect. The fixed-sample paradigm demands that we plan the experiment ahead of time, including the size of the experimental sample and the exact hypotheses to be tested, and that we adhere rigidly to this plan.
In contrast, modern data analysis demands adaptivity. In particular, often the sample we choose to analyze is itself selected on the basis of observed data. For example, in an online A/B test, we may observe an ongoing stream of visitors enrolled into an experiment, so that the experimental sample is growing over time. The final experimental sample will include all of the visitors observed up to the time we decide to stop the experiment. The decision to stop could be made adaptively, by monitoring observed results and stopping early if a strong effect is observed, later if not. This is the realm of sequential, as opposed to fixed-sample, analysis.
There are many other kinds of adaptivity that arise in practice. A second example is in the analysis of nonrandomized, or observational, studies of causal effects. In testing for statistical evidence of an effect, we may choose to focus on a subpopulation which we believe to be highly affected by the treatment of interest. For example, in studying the effect of fish consumption on mercury levels in the blood, we may focus on individuals whose diets are especially high in fish. Classical statistics requires that we define precisely which diets will be classified as "especially high in fish" before we analyze outcomes, but experimenters may prefer for this choice to be guided by the observed outcomes themselves.
In both of the above examples---the sequential stopping of a randomized experiment and the adaptive choice of subgroup in an observational study---the use of fixed-sample methods, which do not account for adaptivity, will lead to violations of statistical guarantees such as false positive control. These violations are commonly included under the label "p-hacking" and have received much blame for the lack of reproducibility in various fields of scientific research. Fortunately, alternative statistical methods are available, methods that explicitly account for adaptivity to yield robust inference while placing fewer restrictions on the researcher. Such methods are the ultimate aim of the present work.
This thesis develops a framework for constructing sequential and adaptive statistical procedures by taking advantage of the time-uniform concentration properties of certain martingales. Chapter 1 begins by laying out a mathematical framework for the derivation of time-uniform concentration inequalities for various classes of martingales. This framework unifies and strengthens a plethora of results from the exponential concentration literature and provides a toolbox for developing sequential and adaptive statistical procedures. The remaining three chapters develop such procedures.
Chapter 2 builds upon the techniques of Chapter 1 to develop uniform concentration bounds which are somewhat more analytically and computationally complex but are much more useful for statistical applications. We frame these methods in terms of confidence sequences, that is, sequences of confidence intervals that are uniformly valid over an unbounded time horizon. One of the key results of this work is an empirical-Bernstein confidence sequence which provides a time-uniform, nonparametric, and non-asymptotic analogue of the t-test applicable to any distribution with bounded support. We explore applications to sequential estimation of average treatment effects in a randomized experiment, our first example above, as well as sequential estimation of a covariance matrix.
Chapter 3 applies ideas from Chapters 1 and 2 to develop methods for the two related problems of estimating quantiles and estimating the entire cumulative distribution function, based on i.i.d. samples. We present confidence sequences for these estimands which are valid uniformly over time for any distribution, and we explore applications to A/B testing and best-arm identification when objectives are based on quantiles rather than means. Finally, Chapter 4 explores an application of uniform martingale concentration to the second example given above, the adaptive choice of subgroup within the analysis of an observational study. We introduce Rosenbaum's sensitivity analysis framework for observational studies, and show how our procedure yields qualitative improvements over existing methods within this framework.
The martingale-based inferential methods we explore in this work trace their origins to Abraham Wald's work on the sequential probability ratio test during the 1940s, as well as to pioneering extensions developed in the late 1960s and early 1970s by Herbert Robbins, Donald Darling, David Siegmund, and Tze Leung Lai, not to mention many others. However, despite the decades of relevant literature, we believe most of the potential of the core ideas has yet to be realized. The key to unlocking this potential, we hope, is a fuller understanding of the nonparametric applicability of these methods, a detailed study of their implementation and tuning in practice, and an exploration of their utility beyond the sequential setting. While we propose several procedures that have immediate practical utility, we hope the larger contribution of the work will be as a first step towards a deeper appreciation of the power of martingale-based methods for adaptive inference, and ultimately to the development of a new class of statistical procedures which permit the kinds of adaptivity contemporary data analysts desire.
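To make the notion of a confidence sequence concrete, here is a deliberately crude sketch: a time-uniform interval for the mean of [0,1]-valued observations built from Hoeffding's inequality and a union bound over time. It is far more conservative than the martingale-based empirical-Bernstein boundaries the thesis develops, but it exhibits the defining property that validity holds simultaneously over all times:

```python
import math
import random

def hoeffding_cs(xs, alpha=0.05):
    """Time-uniform confidence sequence for the mean of i.i.d. observations
    in [0, 1]. At each time t we apply Hoeffding's inequality with error
    budget alpha_t = alpha / (t * (t + 1)); since sum_t alpha_t = alpha,
    the intervals are valid simultaneously over all t (a plain union bound,
    much wider than martingale mixture boundaries)."""
    total, intervals = 0.0, []
    for t, x in enumerate(xs, start=1):
        total += x
        alpha_t = alpha / (t * (t + 1))
        radius = math.sqrt(math.log(2.0 / alpha_t) / (2.0 * t))
        mean = total / t
        intervals.append((max(0.0, mean - radius), min(1.0, mean + radius)))
    return intervals

random.seed(1)
xs = [random.random() for _ in range(1000)]  # uniform draws, true mean 0.5
lo, hi = hoeffding_cs(xs)[-1]
print(lo, hi)  # an interval that (with probability >= 0.95) contains 0.5 at every t
```

Because the guarantee is uniform in time, an experimenter may stop the moment the interval excludes a null value without inflating the error rate, which is exactly the adaptivity the fixed-sample paradigm forbids.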
Neural-Kalman Schemes for Non-Stationary Channel Tracking and Learning
This Thesis focuses on channel tracking in Orthogonal Frequency-Division Multiplexing (OFDM), a widely-used method of data transmission in wireless communications, when abrupt changes occur in the channel. In highly mobile applications, new dynamics appear that might make channel tracking non-stationary, e.g. channels might vary with location, and location rapidly varies with time. Simple examples might be the different channel dynamics a train receiver faces when it is close to a station vs. crossing a bridge vs. entering a tunnel, or a car receiver on a route that grows more traffic-dense. Some of these dynamics can be modelled as channel taps dying or being reborn, and so tap birth-death detection is of the essence.
In order to improve the quality of communications, we delved into mathematical methods to detect such abrupt changes in the channel, drawing on the mathematical areas of Sequential Analysis/Abrupt Change Detection and Random Set Theory (RST), as well as on engineering advances in Neural Network schemes. This knowledge helped us find a solution to the problem of abrupt change detection by informing and inspiring the creation of low-complexity implementations for real-world channel tracking. In particular, two such novel trackers were created: the Simplified Maximum A Posteriori (SMAP) and the Neural-Network-switched Kalman Filtering (NNKF) schemes.
The SMAP is a computationally inexpensive, threshold-based abrupt-change detector. It applies the following three heuristics for tap birth-death detection: a) detect death if the tap gain jumps to approximately zero (memoryless detection); b) detect death if the tap gain has slowly converged to approximately zero (memory detection); c) detect birth if the tap gain is far from zero.
The precise parameters for these three simple rules can be approximated with simple theoretical derivations and then fine-tuned through extensive simulations. Using only these three computationally inexpensive threshold comparisons, the per-tap status detector achieves an error reduction matching that of a close-to-perfect path death/birth detector, as shown in simulations. This estimator was shown to greatly reduce channel tracking error in the target Signal-to-Noise Ratio (SNR) range at a very small computational cost, thus outperforming previously known systems.
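The three rules can be sketched as a toy per-tap classifier (the thresholds, window length, and function name below are illustrative placeholders, not the tuned values derived in the Thesis):

```python
import numpy as np

def smap_status(gain_history, t_death=0.05, t_conv=0.07, t_birth=0.1, window=5):
    """Toy sketch of the three SMAP heuristics for one channel tap.
    Takes recent tap-gain magnitudes; returns 'death', 'birth', or 'alive'."""
    g = np.abs(np.asarray(gain_history, dtype=float))
    # (a) memoryless death: latest gain jumps to approximately zero
    if g[-1] < t_death:
        return "death"
    # (b) memory death: gain has slowly converged to approximately zero
    if len(g) >= window and g[-window:].max() < t_conv:
        return "death"
    # (c) birth: gain far from zero
    if g[-1] > t_birth:
        return "birth"
    return "alive"

print(smap_status([0.6, 0.5, 0.01]))  # -> death (rule a)
print(smap_status([0.05, 0.2, 0.4]))  # -> birth (rule c)
```

Each call costs only a handful of comparisons per tap, which is what keeps the detector's complexity negligible next to the Kalman tracker it augments.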
The underlying RST framework for the SMAP was then extended to combined death/birth and SNR detection when the SNR is dynamic and may drift. We analyzed how different quasi-ideal SNR detectors affect the SMAP-enhanced Kalman tracker's performance. Simulations showed that the SMAP is robust to SNR drift, although it was also shown to benefit from accurate SNR detection.
The core idea behind the second novel tracker, the NNKF, is similar to the SMAP, but now the tap birth/death detection is performed by an artificial neural network (NN). Simulations show that the proposed NNKF estimator provides extremely good performance, practically identical to that of a detector with 100% accuracy. These proposed Neural-Kalman schemes can work as novel trackers for multipath channels, since they are robust to wide variations in the probabilities of tap birth and death. Such robustness suggests a single, low-complexity NNKF could be reused across different tap indices and communication environments.
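The switching idea can be sketched as a scalar Gauss-Markov Kalman filter whose state model is toggled by an external birth/death detector. Here a simple magnitude threshold stands in for the trained NN, and all model parameters are illustrative, not those of the Thesis:

```python
def nnkf_track(obs, detector, a=0.95, q=0.01, r=0.1):
    """Sketch of an NNKF-style tracker for a single channel tap: a scalar
    AR(1) Kalman filter whose dynamics are switched by an external
    birth/death detector (in the Thesis, a neural network; here, any
    callable on the current observation returning True while the tap
    is deemed alive)."""
    x, p = 0.0, 1.0                       # state estimate and its variance
    estimates = []
    for y in obs:
        if detector(y):                   # tap alive: predict, then update
            x_pred, p_pred = a * x, a * a * p + q
            k = p_pred / (p_pred + r)     # Kalman gain
            x = x_pred + k * (y - x_pred)
            p = (1.0 - k) * p_pred
        else:                             # tap dead: pin the estimate to zero
            x, p = 0.0, q
        estimates.append(x)
    return estimates

# stand-in "detector": a magnitude threshold instead of a trained NN
est = nnkf_track([0.8, 0.9, 0.85, 0.02, 0.01], detector=lambda y: abs(y) > 0.1)
print(est[-1])  # -> 0.0 once the detector declares the tap dead
```

The tracker's accuracy is thus bounded by the detector's: with a perfect birth/death oracle the filter never wastes gain on a dead tap, which is why an NN detector approaching 100% accuracy recovers essentially all of that bound.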
Furthermore, a different kind of abrupt change was proposed and analyzed: energy shifts from one channel tap to adjacent taps (partial tap lateral hops). This Thesis also discusses how to model, detect and track such changes, providing a geometric justification for this and additional non-stationary dynamics in vehicular situations, such as road scenarios where reflections on trucks and vans are involved, or the visual appearance/disappearance of drone swarms. An extensive literature review of empirically-backed abrupt-change dynamics in channel modelling/measuring campaigns is included.
For this generalized framework of abrupt channel changes that includes partial tap lateral
hopping, a neural detector for lateral hops with large energy transfers is introduced. Simulation
results suggest the proposed NN architecture might be a feasible lateral hop detector, suitable for
integration in NNKF schemes.
Finally, the newly found understanding of abrupt changes and of the interactions between Kalman filters and neural networks is leveraged to analyze the neural consequences of abrupt changes, briefly sketch a novel, abrupt-change-derived stochastic model for neural intelligence, extract some neurofinancial consequences of unstereotyped abrupt dynamics, and propose a new portfolio-building mechanism in finance: Highly Leveraged Abrupt Bets Against Failing Experts (HLABAFEOs). Some communication-engineering-relevant topics, such as a Bayesian stochastic stereotyper for hopping Linear Gauss-Markov (LGM) models, are discussed in the process.
The forecasting problem in the presence of expert disagreements is illustrated with a hopping LGM model, and a novel structure for a Bayesian stereotyper is introduced that might eventually solve such problems through bio-inspired, neuroscientifically-backed mechanisms, like dreaming and surprise (biological Neural-Kalman). A generalized framework for abrupt changes and expert disagreements was introduced with the novel concept of Neural-Kalman Phenomena. This Thesis suggests mathematical (Neural-Kalman Problem Category Conjecture), neuro-evolutionary and social reasons why Neural-Kalman Phenomena might exist, and finds significant evidence for their existence in the areas of neuroscience and finance.
Apart from providing specific examples, practical guidelines and historical (out)performance for some HLABAFEO investing portfolios, this multidisciplinary research suggests that a Neural-Kalman architecture for ever more granular stereotyping, providing a practical solution for continual learning in the presence of unstereotyped abrupt dynamics, would be extremely useful in communications and other continual learning tasks.
Doctoral Program in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. President: Luis Castedo Ribas. Secretary: Ana García Armada. Committee member: José Antonio Portilla Figuera.