Sparse modelling and estimation for nonstationary time series and high-dimensional data
Sparse modelling has attracted great attention as an efficient way of handling statistical problems in high dimensions. This thesis considers sparse modelling and estimation in a selection of problems such as breakpoint detection in nonstationary time series, nonparametric regression using piecewise constant functions, and variable selection in high-dimensional linear regression.
We first propose a method for detecting breakpoints in the second-order structure of piecewise stationary time series, assuming that those structural breakpoints are sufficiently scattered over time. Our choice of time series model is the locally stationary wavelet process (Nason et al., 2000), under which the entire second-order structure of a time series is described by wavelet-based local periodogram sequences. As the initial stage of breakpoint detection, we apply a binary segmentation procedure to wavelet periodogram sequences at each scale separately, which is followed by within-scale and across-scales post-processing steps. We show that the combined methodology achieves consistent estimation of the breakpoints in terms of their total number and locations, and investigate its practical performance using both simulated and real data.
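To fix ideas, here is a minimal sketch of the binary segmentation step in Python/NumPy, under simplifying assumptions: a generic one-dimensional input sequence (standing in for a single wavelet periodogram sequence at one scale), a classical CUSUM contrast as the split statistic, and a user-chosen threshold. The function names and the thresholding rule are illustrative, not the thesis's actual procedure; note that the CUSUM contrast at a split point is, up to sign, the inner product of the data with an Unbalanced Haar vector.

    import numpy as np

    def cusum(x, s, e):
        """CUSUM contrasts of x[s:e] at every candidate split point b."""
        n = e - s
        k = np.arange(1, n)                      # left-segment lengths
        S = np.cumsum(x[s:e])                    # partial sums S_1, ..., S_n
        stat = np.abs(np.sqrt((n - k) / (n * k)) * S[:-1]
                      - np.sqrt(k / (n * (n - k))) * (S[-1] - S[:-1]))
        return s + k, stat

    def binary_segmentation(x, s, e, threshold, out):
        """Recursively split x[s:e] wherever the maximal CUSUM exceeds the threshold."""
        if e - s < 2:
            return
        b, stat = cusum(x, s, e)
        i = int(np.argmax(stat))
        if stat[i] > threshold:
            out.append(b[i])                     # split into x[s:b[i]] and x[b[i]:e]
            binary_segmentation(x, s, b[i], threshold, out)
            binary_segmentation(x, b[i], e, threshold, out)

Calling binary_segmentation(y, 0, len(y), threshold, out) with an empty list fills out with estimated breakpoint locations; the thesis applies such a procedure per scale and then post-processes within and across scales.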
Next, we study the problem of nonparametric regression by means of piecewise constant functions, which are known to be flexible in approximating a wide range of function spaces. Among the many approaches developed for this purpose, we focus on comparing two well-performing techniques: the taut string (Davies & Kovac, 2001) and the Unbalanced Haar (Fryzlewicz, 2007) methods. While the multiscale nature of the latter is easily observed, it is less obvious that the former can also be interpreted as multiscale. We provide a unified multiscale representation for both methods, which offers insight into the relationship between them and suggests lessons that each method can learn from the other.
Lastly, we consider one of the most widely studied applications of sparse modelling and estimation: variable selection in high-dimensional linear regression. High dimensionality of the data brings in many complications, including (possibly spurious) non-negligible correlations among the variables, which may result in marginal correlation being unreliable as a measure of association between the variables and the response. We propose a new way of measuring the contribution of each variable to the response, which adaptively takes into account high correlations among the variables. A key ingredient of the proposed tilting procedure is hard-thresholding the sample correlation of the design matrix, which enables a data-driven switch between the use of marginal correlation and tilted correlation for each variable. We study the conditions under which this measure can discriminate between relevant and irrelevant variables, and thus be used as a tool for variable selection. In order to exploit these theoretical properties of tilted correlation, we construct an iterative variable screening algorithm and examine its practical performance in a comparative simulation study.
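As a rough illustration of the data-driven switch, the following Python/NumPy sketch hard-thresholds the sample correlation matrix and, for each variable, falls back to marginal correlation when no other variable is highly correlated with it; otherwise it correlates the response with the variable's projection onto the orthogonal complement of its highly correlated companions. The threshold and the exact form of the projection and rescaling are simplifications, not the thesis's precise tilting procedure.

    import numpy as np

    def tilted_correlations(X, y, threshold):
        """Sketch: switch between marginal and tilted correlation for each
        column of a column-standardised design matrix X (n x p)."""
        n, p = X.shape
        R = X.T @ X / n                                  # sample correlation matrix
        out = np.empty(p)
        for j in range(p):
            others = np.delete(np.arange(p), j)
            C = others[np.abs(R[j, others]) > threshold] # highly correlated set
            if C.size == 0:
                out[j] = X[:, j] @ y / n                 # marginal correlation
            else:
                Q, _ = np.linalg.qr(X[:, C])             # orthonormal basis of span(X_C)
                xj_perp = X[:, j] - Q @ (Q.T @ X[:, j])  # project X_C out of X_j
                out[j] = xj_perp @ y / n                 # tilted correlation
        return out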
Randomised and L1-penalty approaches to segmentation in time series and regression models
It is a common approach in statistics to assume that the parameters of a stochastic model change over time. The simplest such model involves parameters that are exactly or approximately piecewise constant. In such a model, the aim is the a posteriori detection of the number and locations in time of the changes in the parameters. This thesis develops segmentation methods for non-stationary time series and regression models using randomised methods or methods that involve L1 penalties, which force some of the coefficients in a regression model to be exactly zero. Randomised techniques are not commonly found in nonparametric statistics, whereas L1 methods draw heavily from the variable selection literature. Considering these two categories together, apart from other contributions, enables a comparison between them by pointing out their strengths and weaknesses. This is achieved by organising the thesis into three main parts.
First, we propose a new technique for detecting the number and locations of the change-points in the second-order structure of a time series. The core of the segmentation procedure is the Wild Binary Segmentation method (WBS) of Fryzlewicz (2014), a technique which involves a certain randomised mechanism. The advantage of WBS over the standard Binary Segmentation lies in its localisation feature, thanks to which it works in cases where the spacings between change-points are short. Our main change-point detection statistic is the wavelet periodogram which allows a rigorous estimation of the local autocovariance of a piecewise-stationary process. We provide a proof of consistency and examine the performance of the method on simulated and real data sets.
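A hedged Python/NumPy sketch of the randomised mechanism follows: the CUSUM contrast is maximised over M randomly drawn subintervals (plus the full interval) before recursing. This is a simplified recursive variant of WBS with illustrative defaults, not the paper's exact formulation, and it operates on a generic sequence rather than on the wavelet periodogram.

    import numpy as np

    rng = np.random.default_rng(0)

    def max_cusum(x, s, e):
        """Location and size of the maximal CUSUM contrast on x[s:e]."""
        n = e - s
        k = np.arange(1, n)
        S = np.cumsum(x[s:e])
        stat = np.abs(np.sqrt((n - k) / (n * k)) * S[:-1]
                      - np.sqrt(k / (n * (n - k))) * (S[-1] - S[:-1]))
        i = int(np.argmax(stat))
        return s + 1 + i, stat[i]

    def wbs(x, s, e, threshold, M, out):
        """Wild Binary Segmentation sketch: random subintervals localise the CUSUM."""
        if e - s < 2:
            return
        cands = [(s, e)]                         # always keep the full interval
        for _ in range(M):
            a, b = sorted(rng.integers(s, e + 1, size=2))
            if b - a >= 2:
                cands.append((a, b))
        b0, stat = max((max_cusum(x, a, b) for a, b in cands), key=lambda t: t[1])
        if stat > threshold:
            out.append(b0)
            wbs(x, s, b0, threshold, M, out)
            wbs(x, b0, e, threshold, M, out)

The localisation benefit comes from the random intervals: when change-points are closely spaced, some drawn subinterval is likely to contain exactly one of them, where the CUSUM contrast is largest.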
Second, we study the fused lasso estimator which, in its simplest form, deals with the estimation of a piecewise constant function contaminated with Gaussian noise (Friedman et al., 2007). We show a fast way of implementing the solution path algorithm of Tibshirani and Taylor (2011) and we make a connection between their algorithm and the taut-string method of Davies and Kovac (2001). In addition, a theoretical result and a simulation study indicate that the fused lasso estimator is suboptimal in detecting the location of a change-point.
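For the simplest form mentioned above, a compact sketch using the cvxpy package (an assumed dependency for illustration, not one the thesis uses) computes the estimator directly from its definition; the chapter's fast solution-path algorithm is not reproduced here.

    import cvxpy as cp
    import numpy as np

    def fused_lasso_1d(y, lam):
        """1-d fused lasso / total variation denoising:
        minimise 0.5 * ||y - b||_2^2 + lam * sum_i |b_{i+1} - b_i|."""
        b = cp.Variable(len(y))
        cp.Problem(cp.Minimize(0.5 * cp.sum_squares(y - b)
                               + lam * cp.norm1(cp.diff(b)))).solve()
        return b.value

For a suitable correspondence between the penalty parameter lam and the taut string's tube width, the two estimates are known to coincide, which is the connection the chapter develops.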
Finally, we propose a method to estimate regression models in which the coefficients vary with respect to some covariate such as time. In particular, we present a path algorithm based on Tibshirani and Taylor (2011) and the fused lasso method of Tibshirani et al. (2005). Thanks to the adaptability of the fused lasso penalty, our proposed method goes beyond the estimation of piecewise constant models to models where the underlying coefficient function can be piecewise linear, quadratic or cubic. Our simulation studies show that in most cases the method outperforms smoothing splines, a common approach to estimating this class of models.
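The extension beyond piecewise constant fits can be indicated with the same kind of sketch: penalising the l1 norm of higher-order differences yields piecewise linear, quadratic or cubic fits (k = 2, 3, 4 below). For clarity this is the intercept-only (pure signal) case solved directly via cvxpy, not the chapter's path algorithm for coefficients varying with a covariate.

    import cvxpy as cp

    def trend_filter(y, lam, k=2):
        """Sketch: l1 penalty on k-th differences gives piecewise polynomial
        fits of degree k - 1 (k = 1 recovers the fused lasso above)."""
        b = cp.Variable(len(y))
        cp.Problem(cp.Minimize(0.5 * cp.sum_squares(y - b)
                               + lam * cp.norm1(cp.diff(b, k=k)))).solve()
        return b.value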
Tail-greedy bottom-up data decompositions and fast multiple change-point detection
This article proposes a ‘tail-greedy’, bottom-up transform for one-dimensional data, which results in a nonlinear but conditionally orthonormal, multiscale decomposition of the data with respect to an adaptively chosen Unbalanced Haar wavelet basis. The ‘tail-greediness’ of the decomposition algorithm, whereby multiple greedy steps are taken in a single pass through the data, both enables fast computation and makes the algorithm applicable in the problem of consistent estimation of the number and locations of multiple change-points in data. The resulting agglomerative change-point detection method avoids the disadvantages of the classical divisive binary segmentation, and offers very good practical performance. It is implemented in the R package breakfast, available from CRAN.
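A minimal sketch of the tail-greedy bottom-up pass, in Python/NumPy, under simplifying assumptions: in each pass through the data, the ceil(rho * m) adjacent merges with the smallest Unbalanced Haar detail coefficients are all performed at once (the 'tail' of the greedy ordering). The proportion rho and the bookkeeping are illustrative; the breakfast implementation records the full conditionally orthonormal decomposition and applies consistency-guaranteeing thresholding, neither of which is reproduced here.

    import numpy as np

    def tail_greedy_merge(x, rho=0.04):
        """Bottom-up sketch: repeatedly merge adjacent regions, taking several
        smallest-detail merges per pass rather than one greedy merge at a time."""
        sums = [float(v) for v in x]             # region totals
        lens = [1] * len(x)                      # region lengths
        merges = []
        while len(sums) > 1:
            details = []
            for i in range(len(sums) - 1):
                a, b = lens[i], lens[i + 1]
                # Unbalanced Haar detail between neighbouring regions i and i+1
                d = np.sqrt(a * b / (a + b)) * (sums[i] / a - sums[i + 1] / b)
                details.append(abs(d))
            k = max(1, int(np.ceil(rho * len(details))))
            used, to_merge = set(), []
            for i in np.argsort(details)[:k]:    # smallest details first
                if i not in used and i + 1 not in used:
                    to_merge.append(int(i))
                    used.update((i, i + 1))      # keep merges non-overlapping
            for i in sorted(to_merge, reverse=True):
                merges.append((i, details[i]))
                sums[i] += sums.pop(i + 1)
                lens[i] += lens.pop(i + 1)
        return merges

The recorded sequence of detail magnitudes defines the bottom-up decomposition; regions whose details survive thresholding correspond to estimated change-points.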
Methods for change-point detection with additional interpretability
The main purpose of this dissertation is to introduce and critically assess some novel statistical methods for change-point detection that help better understand the nature of processes underlying observable time series.
First, we advocate the use of change-point detection for local trend estimation in financial return data and propose a new approach developed to capture the oscillatory behaviour of financial returns around piecewise-constant trend functions. The core of the method is a data-adaptive, hierarchically-ordered basis of Unbalanced Haar vectors which decomposes the piecewise-constant trend underlying observed daily returns into a binary-tree structure of one-step constant functions. We illustrate how this framework can provide a new perspective for the interpretation of change-points in financial returns. Moreover, the approach yields a family of forecasting operators for financial return series which can be adjusted flexibly depending on the forecast horizon or the loss function.
Second, we discuss change-point detection under model misspecification, focusing in particular on normally distributed data with changing mean and variance. We argue that ignoring the presence of changes in mean or variance when testing for changes in, respectively, variance or mean can negatively affect the application of statistical methods. After illustrating the difficulties arising from this kind of model misspecification, we propose a new method to address them using sequential testing on intervals with varying length, and show in a simulation study how this approach compares to competitors in mixed-change situations.
The third contribution of this thesis is a data-adaptive procedure to evaluate EEG data, which can improve the understanding of an epileptic seizure recording. This change-point detection method characterizes the evolution of frequency-specific energy as measured on the human scalp. It provides new insights into this high-dimensional, high-frequency data and has attractive computational and scalability features. In addition to contrasting our method with existing approaches, we analyse and interpret the method's output in the application to a seizure data set.
Multiscale Change-Point Inference
We introduce a new estimator, SMUCE (simultaneous multiscale change-point estimator), for the change-point problem in exponential family regression. An unknown step function is estimated by minimizing the number of change-points over the acceptance region of a multiscale test at a level α. The probability of overestimating the true number of change-points K is controlled by the asymptotic null distribution of the multiscale test statistic. Further, we derive exponential bounds for the probability of underestimating K. By balancing these quantities, α will be chosen such that the probability of correctly estimating K is maximized. All results are even non-asymptotic for the normal case. Based on the aforementioned bounds, we construct asymptotically honest confidence sets for the unknown step function and its change-points. At the same time, we obtain exponential bounds for estimating the change-point locations which, for example, yield the minimax rate O(1/n) up to a log term. Finally, SMUCE asymptotically achieves the optimal detection rate of vanishing signals. We illustrate how dynamic programming techniques can be employed for efficient computation of estimators and confidence regions. The performance of the proposed multiscale approach is illustrated by simulations and in two cutting-edge applications from genetic engineering and photoemission spectroscopy.
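In symbols, and only as a schematic rendering consistent with this abstract (not the paper's exact display), the estimator solves

    \hat\vartheta \in \operatorname*{arg\,min}_{\vartheta \in \mathcal{S}} \#J(\vartheta)
    \quad \text{subject to} \quad T_n(Y, \vartheta) \le q_n(\alpha),

where \mathcal{S} denotes the candidate step functions, \#J(\vartheta) the number of change-points of \vartheta, T_n the multiscale test statistic and q_n(\alpha) the α-quantile of its (asymptotic) null distribution.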
Constrained estimation via the fused lasso and some generalizations
This dissertation studies structurally constrained statistical estimators. Two entwined main themes are developed: computationally efficient algorithms, and strong statistical guarantees of estimators across a wide range of frameworks.
In the first chapter we discuss a unified view of optimization problems that enforce constraints, such as smoothness, in statistical inference. This in turn helps to incorporate spatial and/or temporal information about the data.
The second chapter studies the fused lasso, a non-parametric regression estimator commonly used for graph denoising. It has been widely used in applications where the graph structure indicates that neighboring nodes have similar signal values. I prove, for the fused lasso on arbitrary graphs, an upper bound on the mean squared error that depends on the total variation of the underlying signal on the graph. Moreover, I provide a surrogate estimator that can be found in linear time and attains the same upper bound.
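Written out in a standard formulation consistent with the text (with G = (V, E) the graph and λ the tuning parameter, both notational assumptions here), the graph fused lasso is

    \hat\beta = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^{|V|}}
        \tfrac{1}{2} \sum_{i \in V} (y_i - \beta_i)^2
        + \lambda \sum_{(i,j) \in E} |\beta_i - \beta_j|,

so the penalty is exactly the total variation of β over the edges of the graph, the quantity appearing in the mean squared error bound.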
In the third chapter I present an approach for penalized tensor decomposition (PTD) that estimates smoothly varying latent factors in multiway data. This generalizes existing work on sparse tensor decomposition and penalized matrix decomposition, in a manner parallel to the generalized lasso for regression and smoothing problems. I present an efficient coordinate-wise optimization algorithm for PTD, and characterize its convergence properties.
The fourth chapter proposes histogram trend filtering, a novel approach for density estimation. This estimator arises from a surrogate Poisson model for counts of observations in a partition of the support of the data.
The fifth chapter develops a class of estimators for deconvolution in mixture models based on a simple two-step bin-and-smooth procedure applied to histogram counts. The method is both statistically and computationally efficient. By exploiting recent advances in convex optimization, we are able to provide a full deconvolution path that shows the estimate for the mixing distribution across a range of plausible degrees of smoothness, at far less cost than a full Bayesian analysis.
Finally, the sixth chapter summarizes my contributions and provides possible directions for future work.
Wavelet Methods and Inverse Problems
Archaeological investigations are designed to acquire information without damaging the archaeological site. Magnetometry is one of the important techniques for producing a surface grid of readings, which can be used to infer underground features. The inversion of these data, to give a fitted model, is an inverse problem. This type of problem can be ill-posed or ill-conditioned, making the estimation of model parameters less stable or even impossible. More precisely, the relationship between the archaeological data and the model parameters is expressed by a likelihood; it is not possible to use the standard regression estimate obtained through the likelihood, which means that no stable maximum likelihood estimate exists.
Instead, various constraints can be added through a prior distribution, with an estimate produced using the posterior distribution. Current approaches incorporate prior information describing smoothness, which is not always appropriate. The biggest challenge is that the reconstruction of an archaeological site as a single layer requires various physical features, such as depth and extent, to be assumed; when a smoothing prior is applied in the analysis of stratigraphy data, however, these features are not easily estimated. Wavelet analysis has proved to be highly efficient at eliciting information from noisy data; additionally, complicated signals can be described by interpreting only a small number of wavelet coefficients. It is possible that a modelling approach which attempts to describe an underlying function in terms of a multi-level wavelet representation will be an improvement on standard techniques. Further, a newly proposed method uses an elastic-net based distribution as the prior. Two methods are used to solve the problem: one is based on one-stage estimation and the other on two stages. The one-stage method considers two approaches: a single prior for all wavelet resolution levels, and a level-dependent prior with separate priors at each resolution level. In a simulation study and a real data analysis, all these techniques are compared to several existing methods. It is shown that the methodology using a single prior provides good reconstruction, comparable even to several established wavelet methods that use mixture priors.
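To make the wavelet machinery concrete, here is a minimal non-Bayesian sketch of wavelet shrinkage using the PyWavelets package (an assumed tool for illustration; the thesis's elastic-net prior and posterior estimation are not reproduced), with the classical universal threshold and a robust noise estimate from the finest scale:

    import numpy as np
    import pywt

    def wavelet_denoise(y, wavelet="db4", level=4):
        """Soft-threshold detail coefficients at sigma * sqrt(2 log n)."""
        coeffs = pywt.wavedec(y, wavelet, level=level)
        sigma = np.median(np.abs(coeffs[-1])) / 0.6745   # MAD noise estimate
        t = sigma * np.sqrt(2 * np.log(len(y)))
        coeffs[1:] = [pywt.threshold(c, t, mode="soft") for c in coeffs[1:]]
        return pywt.waverec(coeffs, wavelet)[: len(y)]

The point of the sketch is only that complicated signals are captured by a small number of large coefficients, which is what a sparsity-inducing prior such as the elastic-net exploits in the Bayesian formulation.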
An Algorithmic Theory of Dependent Regularizers, Part 1: Submodular Structure
We present an exploration of the rich theoretical connections between several classes of regularized models, network flows, and recent results in submodular function theory. This work unifies key aspects of these problems under a common theory, leading to novel methods for working with several important models of interest in statistics, machine learning and computer vision.
In Part 1, we review the concepts of network flows and submodular function optimization theory foundational to our results. We then examine the connections between network flows and the minimum-norm algorithm from submodular optimization, extending and improving several current results. This leads to a concise representation of the structure of a large class of pairwise regularized models important in machine learning, statistics and computer vision.
In Part 2, we describe the full regularization path of a class of penalized regression problems with dependent variables that includes the graph-guided LASSO and total variation constrained models. This description also motivates a practical algorithm, which allows us to efficiently find the regularization path of the discretized version of TV penalized models. Ultimately, our new algorithms scale up to high-dimensional problems with millions of variables.
Echocardiography
The book "Echocardiography - New Techniques" brings worldwide contributions from highly acclaimed clinical and imaging science investigators, and representatives from academic medical centers. Each chapter is designed and written to be accessible to those with a basic knowledge of echocardiography. Additionally, the chapters are meant to be stimulating and educational to the experts and investigators in the field of echocardiography. This book is aimed primarily at cardiology fellows on their basic echocardiography rotation, fellows in general internal medicine, radiology and emergency medicine, and experts in the arena of echocardiography. Over the last few decades, the rate of technological advancements has developed dramatically, resulting in new techniques and improved echocardiographic imaging. The authors of this book focused on presenting the most advanced techniques useful in today's research and in daily clinical practice. These advanced techniques are utilized in the detection of different cardiac pathologies in patients, in contributing to their clinical decision, as well as follow-up and outcome predictions. In addition to the advanced techniques covered, this book expounds upon several special pathologies with respect to the functions of echocardiography