
    Fast change point analysis on the Hurst index of piecewise fractional Brownian motion

    In this presentation, we introduce a new method for change point analysis on the Hurst index of a piecewise fractional Brownian motion. We first set out the model and the statistical problem. The proposed method transposes the FDpV (Filtered Derivative with p-value) method, introduced in Bertrand et al. (2011) for detecting change points in the mean, to the case of changes in the Hurst index. The statistic underlying this FDpV variant is a new estimator of the Hurst index, the so-called Increment Bernoulli Statistic (IBS). Both FDpV and IBS have linear time and memory complexity with respect to the length of the series, so the resulting method for change point analysis on the Hurst index also has linear complexity.
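
    The filtered derivative step that FDpV builds on can be sketched as follows. This is a minimal illustration for changes in the mean only, not the IBS-based Hurst variant described above and not the authors' code. The function names, the `window` and `threshold` parameters, and the greedy candidate selection are assumptions made for the example. Prefix sums keep the derivative computation linear in the series length; the simple selection loop is not optimized.

```python
from itertools import accumulate


def filtered_derivative(x, window):
    """Return (k, D_k) pairs with D_k = mean(x[k:k+window]) - mean(x[k-window:k])."""
    prefix = [0.0] + list(accumulate(x))  # prefix[i] == sum(x[:i])
    out = []
    for k in range(window, len(x) - window + 1):
        right = prefix[k + window] - prefix[k]
        left = prefix[k] - prefix[k - window]
        out.append((k, (right - left) / window))
    return out


def candidate_change_points(x, window, threshold):
    """Greedily pick the largest |D_k| above threshold, one per neighbourhood.

    Hypothetical selection rule for illustration: after picking k, all
    indices within `window` of k are discarded before the next pick.
    """
    remaining = {k: abs(d) for k, d in filtered_derivative(x, window)}
    candidates = []
    while remaining:
        k_best = max(remaining, key=remaining.get)
        if remaining[k_best] < threshold:
            break
        candidates.append(k_best)
        for k in range(k_best - window, k_best + window + 1):
            remaining.pop(k, None)
    return sorted(candidates)
```

    In the full FDpV procedure a second step would compute a p-value for each candidate and discard false alarms; that step is omitted here.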

    Multiscale change-point segmentation: beyond step functions.

    Modern multiscale type segmentation methods are known to detect multiple change-points with high statistical accuracy, while allowing for fast computation. The underpinning (minimax) estimation theory has been developed mainly for models that assume the signal is a piecewise constant function. In this paper, for a large collection of multiscale segmentation methods (including various existing procedures), such theory is extended to certain function classes beyond step functions in a nonparametric regression setting. On the one hand this extends the interpretation of such methods; on the other hand it shows that these methods are robust to deviations from piecewise constant functions. Our main finding is the adaptation over nonlinear approximation classes for a universal thresholding, which includes bounded variation functions and (piecewise) Hölder functions of smoothness order 0 < alpha <= 1 as special cases. From this we derive statistical guarantees on feature detection in terms of jumps and modes. Another key finding is that these multiscale segmentation methods perform nearly (up to a log-factor) as well as the oracle piecewise constant segmentation estimator (with known jump locations) and the best piecewise constant approximants of the (unknown) true signal. Theoretical findings are examined by various numerical simulations.

    Bayesian Regression of Piecewise Constant Functions

    We derive an exact and efficient Bayesian regression algorithm for piecewise constant functions of unknown segment number, boundary location, and levels. It works for any noise and segment level prior, e.g. Cauchy, which can handle outliers. We derive simple but good estimates for the in-segment variance. We also propose a Bayesian regression curve as a better way of smoothing data without blurring boundaries. The Bayesian approach also allows straightforward determination of the evidence, break probabilities and error estimates, useful for model selection and significance and robustness studies. We discuss the performance on synthetic and real-world examples. Many possible extensions will be discussed. Comment: 27 pages, 18 figures, 1 table, 3 algorithms.

    Heterogeneous Change Point Inference

    We propose HSMUCE (heterogeneous simultaneous multiscale change-point estimator) for the detection of multiple change-points of the signal in a heterogeneous Gaussian regression model. A piecewise constant function is estimated by minimizing the number of change-points over the acceptance region of a multiscale test which locally adapts to changes in the variance. The multiscale test is a combination of local likelihood ratio tests which are properly calibrated by scale-dependent critical values in order to keep a global nominal level alpha, even for finite samples. We show that HSMUCE controls the error of over- and underestimation of the number of change-points. To this end, new deviation bounds for F-type statistics are derived. Moreover, we obtain confidence sets for the whole signal. All results are non-asymptotic and uniform over a large class of heterogeneous change-point models. HSMUCE is fast to compute, achieves the optimal detection rate and estimates the number of change-points at almost optimal accuracy for vanishing signals, while still being robust. We compare HSMUCE with several state-of-the-art methods in simulations and analyse current recordings of a transmembrane protein in the bacterial outer membrane with pronounced heterogeneity for its states. An R package is available online.

    Binscatter Regressions

    We introduce the Stata (and R) package Binsreg, which implements the binscatter methods developed in Cattaneo, Crump, Farrell and Feng (2019). The package includes the commands binsreg, binsregtest, and binsregselect. The first command (binsreg) implements binscatter for the regression function and its derivatives, offering several point estimation, confidence interval and confidence band procedures, with particular focus on constructing binned scatter plots. The second command (binsregtest) implements hypothesis testing procedures for parametric specification and for nonparametric shape restrictions of the unknown regression function. Finally, the third command (binsregselect) implements data-driven selectors for the number of bins, using either quantile-spaced or evenly-spaced binning/partitioning. All the commands allow for covariate adjustment, smoothness restrictions, weighting and clustering, among other features. A companion R package with the same capabilities is also available.
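
    The basic idea behind a binned scatter plot with quantile-spaced bins can be sketched in a few lines. This is a plain illustration of the simplest case (one mean point per bin), not the Binsreg package or its interface; the function names and the rank-based bin assignment are assumptions made for the example, and none of the package's inference, covariate adjustment or bin selection features are reproduced.

```python
def quantile_bins(x, n_bins):
    """Assign each observation to one of n_bins bins with near-equal counts.

    Assumption for this sketch: bins are formed from the ranks of x, so
    tied x values may fall into different bins.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    bins = [0] * len(x)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(x)
    return bins


def binscatter(x, y, n_bins):
    """Return one (mean x, mean y) point per non-empty bin, ordered by x."""
    bins = quantile_bins(x, n_bins)
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]  # [sum x, sum y, count]
    for xi, yi, b in zip(x, y, bins):
        sums[b][0] += xi
        sums[b][1] += yi
        sums[b][2] += 1
    return [(sx / n, sy / n) for sx, sy, n in sums if n > 0]
```

    Plotting the returned points gives the familiar binned scatter; evenly-spaced bins would instead cut the range of x into intervals of equal width.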

    Characterizing Ranked Chinese Syllable-to-Character Mapping Spectrum: A Bridge Between the Spoken and Written Chinese Language

    One important aspect of the relationship between spoken and written Chinese is the ranked syllable-to-character mapping spectrum, which is the list of syllables ranked by the number of characters that map to each syllable. Previously, this spectrum was analyzed for more than 400 syllables without distinguishing the four tones. In the current study, the spectrum with 1280 toned syllables is analyzed by a logarithmic function, a Beta rank function, and a piecewise logarithmic function. Of the three fitting functions, the two-piece logarithmic function fits the data best, both by the smallest sum of squared errors (SSE) and by the lowest Akaike information criterion (AIC) value. The Beta rank function is a close second. By sampling from a Poisson distribution whose parameter value is chosen from the observed data, we empirically estimate the p-value for testing the hypothesis that the two-piece logarithmic function is better than the Beta rank function to be 0.16. For practical purposes, the piecewise logarithmic function and the Beta rank function can be considered a tie. Comment: 15 pages, 4 figures.
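
    The SSE and AIC comparison between a single and a two-piece logarithmic fit can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the study's code: the model form value = a + b*ln(rank), the independent (not necessarily continuous) pieces, the exhaustive breakpoint search, and the least-squares form of AIC up to an additive constant are all assumptions. The Beta rank function and the Poisson resampling test are not covered.

```python
import math


def fit_log(ranks, values):
    """Least-squares fit of values ~ a + b*ln(rank); returns (a, b, sse)."""
    u = [math.log(r) for r in ranks]
    n = len(u)
    mu, mv = sum(u) / n, sum(values) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, values))
    b = suv / suu
    a = mv - b * mu
    sse = sum((vi - a - b * ui) ** 2 for ui, vi in zip(u, values))
    return a, b, sse


def fit_two_piece_log(ranks, values, min_seg=3):
    """Try every split point and fit each side separately; returns (split, sse)."""
    best = None
    for s in range(min_seg, len(ranks) - min_seg + 1):
        sse = fit_log(ranks[:s], values[:s])[2] + fit_log(ranks[s:], values[s:])[2]
        if best is None or sse < best[1]:
            best = (s, sse)
    return best


def aic_least_squares(sse, n, k):
    """Gaussian-error AIC up to an additive constant: n*ln(SSE/n) + 2k."""
    return n * math.log(sse / n) + 2 * k
```

    Under these assumptions the single fit has k = 2 parameters and the two-piece fit has k = 5 (two coefficients per piece plus the breakpoint); the model with the lower AIC is preferred.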