Multiscale change-point segmentation: beyond step functions.
Modern multiscale-type segmentation methods are known to detect multiple change-points with high statistical accuracy while allowing for fast computation. The underpinning (minimax) estimation theory has been developed mainly for models that assume the signal is a piecewise constant function. In this paper, for a large collection of multiscale segmentation methods (including various existing procedures), this theory is extended to certain function classes beyond step functions in a nonparametric regression setting. On the one hand this broadens the interpretation of such methods, and on the other hand it reveals them as robust to deviations from piecewise constant functions. Our main finding is adaptation over nonlinear approximation classes for a universal thresholding, which includes bounded variation functions and (piecewise) Hölder functions of smoothness order 0 < α ≤ 1 as special cases. From this we derive statistical guarantees on feature detection in terms of jumps and modes. Another key finding is that these multiscale segmentation methods perform nearly (up to a log-factor) as well as the oracle piecewise constant segmentation estimator (with known jump locations) and the best piecewise constant approximants of the (unknown) true signal. The theoretical findings are examined in various numerical simulations.
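As a minimal, hypothetical illustration of the multiscale criterion underlying such methods, the sketch below flags intervals on which the standardized empirical mean exceeds the scale-dependent universal threshold sqrt(2 log(n/length)); the function name, the brute-force interval scan, and the known-noise-level assumption are ours, not from the paper.

```python
import numpy as np

def multiscale_violations(y, sigma=1.0):
    """Return intervals whose empirical mean deviates significantly from 0,
    using the scale-dependent threshold sqrt(2*log(n/length)) common to
    multiscale change-point methods (simplified illustration)."""
    n = len(y)
    cs = np.concatenate(([0.0], np.cumsum(y)))  # prefix sums for O(1) means
    hits = []
    for i in range(n):
        for j in range(i, n):
            length = j - i + 1
            mean = (cs[j + 1] - cs[i]) / length
            stat = abs(mean) * np.sqrt(length) / sigma
            thresh = np.sqrt(2 * np.log(n / length))
            if stat > thresh:
                hits.append((i, j))
    return hits
```

In actual multiscale segmentation procedures this system of interval tests is inverted into a confidence statement over candidate step functions; the sketch only shows the thresholding side.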
Fitting Jump Models
We describe a new framework for fitting jump models to a sequence of data.
The key idea is to alternate between minimizing a loss function to fit multiple
model parameters, and minimizing a discrete loss function to determine which
set of model parameters is active at each data point. The framework is quite
general and encompasses popular classes of models, such as hidden Markov models
and piecewise affine models. The shape of the chosen loss functions determines
the shape of the resulting jump model.
Comment: Accepted for publication in Automatica.
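The alternation the abstract describes can be sketched for a toy piecewise-constant jump model. Everything concrete below is an illustrative assumption, not the paper's formulation: the model parameters are K state means, the fitting loss is squared error, and the discrete loss adds a fixed cost `lam` per change of active state, minimized over state sequences by a Viterbi-style dynamic program.

```python
import numpy as np

def fit_jump_model(y, K=2, lam=1.0, n_iter=20):
    """Alternating minimization for a toy jump model: alternate between
    (1) the optimal state sequence for fixed state means (dynamic
    programming with jump cost lam) and (2) refitting each state's mean."""
    n = len(y)
    theta = np.linspace(y.min(), y.max(), K)  # crude deterministic init
    s = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Discrete step: best state sequence for current parameters.
        loss = (y[:, None] - theta[None, :]) ** 2      # (n, K) pointwise loss
        V = loss[0].copy()
        back = np.zeros((n, K), dtype=int)
        for t in range(1, n):
            trans = V[:, None] + lam * (1 - np.eye(K))  # jump cost off-diagonal
            back[t] = np.argmin(trans, axis=0)
            V = trans[back[t], np.arange(K)] + loss[t]
        s[-1] = int(np.argmin(V))
        for t in range(n - 1, 0, -1):
            s[t - 1] = back[t, s[t]]
        # Continuous step: refit each state's mean on its assigned points.
        for k in range(K):
            if np.any(s == k):
                theta[k] = y[s == k].mean()
    return theta, s
```

Swapping the squared-error loss or the per-jump cost changes the recovered model, which is the sense in which the chosen losses determine the shape of the fit.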
Fully-Automatic Multiresolution Idealization for Filtered Ion Channel Recordings: Flickering Event Detection
We propose a new model-free segmentation method, JULES, which combines recent
statistical multiresolution techniques with local deconvolution for
idealization of ion channel recordings. The multiresolution criterion takes
into account scales down to the sampling rate enabling the detection of
flickering events, i.e., events on small temporal scales, even below the filter
frequency. For such small scales the deconvolution step allows for a precise
determination of dwell times and, in particular, of amplitude levels, a task
which is not possible with common thresholding methods. This is confirmed
theoretically and in a comprehensive simulation study. In addition, JULES can
be applied as a preprocessing method for a refined hidden Markov analysis. Our
new methodology allows us to show that gramicidin A flickering events have the
same amplitude as the slow gating events. JULES is available as an R function
jules in the package clampSeg.
VIVA: An Online Algorithm for Piecewise Curve Estimation Using ℓ0 Norm Regularization
Many processes deal with piecewise input functions, which occur naturally as a result of digital commands, user interfaces requiring a confirmation action, or discrete-time sampling. Examples include the assembly of protein polymers and hourly adjustments to the infusion rate of IV fluids during treatment of burn victims. Estimation of the input is straightforward regression when the observer has access to the timing information. More work is needed if the input can change at unknown times. Successful recovery of the change timing is largely dependent on the choice of cost function minimized during parameter estimation.
Optimal estimation of a piecewise input will often proceed by minimization of a cost function which includes an estimation error term (most commonly mean square error) and the number (cardinality) of input changes (number of commands). Because the cardinality (ℓ0 norm) is not convex, the ℓ2 norm (quadratic smoothing) and ℓ1 norm (total variation minimization) are often substituted because they permit the use of convex optimization algorithms. However, these penalize the magnitude of input changes and therefore bias the piecewise estimates. Another disadvantage is that global optimization methods must be run after the end of data collection.
One approach to unbiasing the piecewise parameter fits would include application of total variation minimization to recover timing, followed by piecewise parameter fitting. Another method is presented herein: a dynamic programming approach which iteratively develops populations of candidate estimates of increasing length, pruning those proven to be dominated. Because the usage of input data is entirely causal, the algorithm recovers timing and parameter values online. A functional definition of the algorithm, which is an extension of Viterbi decoding and integrates the pruning concept from branch-and-bound, is presented. Modifications are introduced to improve handling of non-uniform sampling, non-uniform confidence, and burst errors. Performance tests using synthesized data sets as well as volume data from a research system recording fluid infusions show five-fold (piecewise-constant data) and 20-fold (piecewise-linear data) reduction in error compared to total variation minimization, along with improved sparsity and reduced sensitivity to the regularization parameter. Algorithmic complexity and delay are also considered
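The ℓ0 objective the abstract targets (estimation error plus a penalty per input change) can be written down directly. The sketch below is an offline optimal-partitioning dynamic program for that cost over piecewise-constant fits; it is not the online VIVA algorithm itself, and the function name is ours.

```python
import numpy as np

def l0_segment(y, beta):
    """Minimize sum of per-segment squared errors + beta * (number of
    change-points) over piecewise-constant fits, by classic O(n^2)
    optimal partitioning. Returns the change-point positions."""
    n = len(y)
    cs = np.concatenate(([0.0], np.cumsum(y)))
    cs2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def seg_cost(i, j):  # SSE of the best constant fit on y[i:j]
        m = j - i
        return cs2[j] - cs2[i] - (cs[j] - cs[i]) ** 2 / m

    F = np.full(n + 1, np.inf)
    F[0] = -beta                      # so the first segment carries no penalty
    last = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        for i in range(j):
            c = F[i] + seg_cost(i, j) + beta
            if c < F[j]:
                F[j], last[j] = c, i
    cps, j = [], n                    # backtrack change-point positions
    while j > 0:
        j = last[j]
        if j > 0:
            cps.append(j)
    return sorted(cps)
```

Unlike the ℓ1/ℓ2 surrogates discussed above, this penalty does not scale with the magnitude of a change, so segment levels are unbiased; the price is a combinatorial search, which VIVA addresses online via candidate populations and pruning.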
Support Vector Machines in R
Support vector machines are among the most popular and efficient classification and regression methods currently available, and implementations exist in almost every popular programming language. Currently four R packages contain SVM-related software. The purpose of this paper is to present and compare these implementations.
Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches
Imaging spectrometers measure electromagnetic energy scattered in their
instantaneous field view in hundreds or thousands of spectral channels with
higher spectral resolution than multispectral cameras. Imaging spectrometers
are therefore often referred to as hyperspectral cameras (HSCs). Higher
spectral resolution enables material identification via spectroscopic analysis,
which facilitates countless applications that require identifying materials in
scenarios unsuitable for classical spectroscopic analysis. Due to low spatial
resolution of HSCs, microscopic material mixing, and multiple scattering,
spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus,
accurate estimation requires unmixing. Pixels are assumed to be mixtures of a
few materials, called endmembers. Unmixing involves estimating all or some of:
the number of endmembers, their spectral signatures, and their abundances at
each pixel. Unmixing is a challenging, ill-posed inverse problem because of
model inaccuracies, observation noise, environmental conditions, endmember
variability, and data set size. Researchers have devised and investigated many
models searching for robust, stable, tractable, and accurate unmixing
algorithms. This paper presents an overview of unmixing methods from the time
of Keshava and Mustard's unmixing tutorial [1] to the present. Mixing models
are first discussed. Signal-subspace, geometrical, statistical, sparsity-based,
and spatial-contextual unmixing algorithms are described. Mathematical problems
and potential solutions are described. Algorithm characteristics are
illustrated experimentally.
Comment: This work has been accepted for publication in the IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing.
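Under the linear mixing model sketched in the abstract, a pixel spectrum is y ≈ M a, where the columns of M are endmember signatures and the abundances a are nonnegative and sum to one. The following is an illustrative solver under those standard constraints (projected gradient with a Euclidean simplex projection); it is not one of the surveyed algorithms, and the function names are ours.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto the probability simplex (abundances
    nonnegative, summing to one)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def unmix_pixel(y, M, n_iter=500):
    """Estimate abundances a for the linear mixing model y ~ M @ a under
    nonnegativity and sum-to-one constraints, by projected gradient descent."""
    p = M.shape[1]
    a = np.full(p, 1.0 / p)
    step = 1.0 / np.linalg.norm(M.T @ M, 2)   # 1 / Lipschitz constant
    for _ in range(n_iter):
        a = project_simplex(a - step * M.T @ (M @ a - y))
    return a
```

The hard parts surveyed in the paper (estimating the number of endmembers, extracting M itself, and handling endmember variability and nonlinearity) are all assumed away here.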
Narrowest Significance Pursuit: inference for multiple change-points in linear models
We propose Narrowest Significance Pursuit (NSP), a general and flexible
methodology for automatically detecting localised regions in data sequences,
each of which must contain a change-point, at a prescribed global significance
level. Here, change-points are understood as abrupt changes in the parameters
of an underlying linear model. NSP works by fitting the postulated linear model
over many regions of the data, using a certain multiresolution sup-norm loss,
and identifying the shortest interval on which the linearity is significantly
violated. The procedure then continues recursively to the left and to the right
until no further intervals of significance can be found. The use of the
multiresolution sup-norm loss is a key feature of NSP, as it enables the
transfer of significance considerations to the domain of the unobserved true
residuals, a substantial simplification. It also guarantees important
stochastic bounds which directly yield exact desired coverage probabilities,
regardless of the form or number of the regressors.
NSP works with a wide range of distributional assumptions on the errors,
including Gaussian with known or unknown variance, some light-tailed
distributions, and some heavy-tailed, possibly heterogeneous distributions via
self-normalisation. It also works in the presence of autoregression. The
mathematics of NSP is, by construction, uncomplicated, and its key
computational component uses simple linear programming. In contrast to the
widely studied "post-selection inference" approach, NSP enables the opposite
viewpoint and paves the way for the concept of "post-inference selection".
Pre-CRAN R code implementing NSP is available at https://github.com/pfryz/nsp
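The recursive shortest-significant-interval scheme can be caricatured as follows. This is a deliberately simplified sketch: a standardized CUSUM test for a constant mean stands in for NSP's multiresolution sup-norm check, so it carries none of NSP's exact coverage guarantees, and all names are ours.

```python
import numpy as np

def cusum_stat(y):
    """Max standardized CUSUM on y; large values contradict a constant mean."""
    n = len(y)
    z = y - y.mean()
    cs = np.cumsum(z)[:-1]
    k = np.arange(1, n)
    return np.max(np.abs(cs) / np.sqrt(k * (n - k) / n))

def nsp_style_search(y, lo, hi, thresh, out):
    """Find the shortest sub-interval of y[lo:hi] on which constancy is
    significantly violated, record it, then recurse to its left and right
    (an NSP-style search, simplified)."""
    n = hi - lo
    best = None
    for length in range(2, n + 1):          # shortest intervals first
        for s in range(lo, hi - length + 1):
            if cusum_stat(y[s:s + length]) > thresh:
                best = (s, s + length)
                break
        if best:
            break
    if best is None:
        return
    out.append(best)
    nsp_style_search(y, lo, best[0], thresh, out)      # recurse left
    nsp_style_search(y, best[1], hi, thresh, out)      # recurse right
```

In NSP proper, the multiresolution sup-norm test on the fitted linear model is what transfers significance to the unobserved true residuals and yields the exact coverage probabilities described above.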