45 research outputs found

    Smooth tail index estimation

    Both parametric distribution functions appearing in extreme value theory - the generalized extreme value distribution and the generalized Pareto distribution - have log-concave densities if the extreme value index gamma is in [-1,0]. Replacing the order statistics in tail index estimators by their corresponding quantiles from the distribution function that is based on the estimated log-concave density leads to novel smooth quantile and tail index estimators. These new estimators are aimed at estimating the tail index, especially in small samples. Acting as a smoother of the empirical distribution function, the log-concave distribution function estimator reduces estimation variability to a much greater extent than it introduces bias. As a consequence, Monte Carlo simulations demonstrate that the smoothed versions of the estimators are clearly superior to their non-smoothed counterparts in terms of mean squared error. Comment: 17 pages, 5 figures. Slightly changed Pickands' estimator, added some more introduction and discussion
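    The idea of swapping order statistics for quantiles can be made concrete with a small sketch. The block below implements the classical, unsmoothed Pickands estimator from order statistics; it is an assumption that this textbook form is close to what the paper uses, since the comment above says the authors slightly changed Pickands' estimator. The function name `pickands_estimator` and the choice of `k` are illustrative only. A smoothed variant would pass quantiles of a fitted log-concave distribution in place of the raw sorted sample.

```python
import math


def pickands_estimator(sample, k):
    """Classical Pickands tail index estimate (hypothetical helper, not the paper's variant).

    Uses the order statistics X_(n-k+1), X_(n-2k+1) and X_(n-4k+1)
    of the sample, where X_(1) <= ... <= X_(n).
    """
    xs = sorted(sample)
    n = len(xs)
    if k < 1 or 4 * k > n:
        raise ValueError("k must satisfy 1 <= k <= n / 4")
    # xs is 0-indexed, so X_(m) is xs[m - 1].
    upper = xs[n - k] - xs[n - 2 * k]
    lower = xs[n - 2 * k] - xs[n - 4 * k]
    if upper <= 0 or lower <= 0:
        raise ValueError("ties in the upper order statistics; choose another k")
    return math.log(upper / lower) / math.log(2.0)


# Feeding exact quantiles on the grid i / (n + 1) instead of random draws
# mimics a perfectly smooth distribution function estimate.
n = 1000
uniform_quantiles = [i / (n + 1) for i in range(1, n + 1)]
exponential_quantiles = [-math.log(1 - i / (n + 1)) for i in range(1, n + 1)]

gamma_uniform = pickands_estimator(uniform_quantiles, k=100)
gamma_exponential = pickands_estimator(exponential_quantiles, k=100)
```

    The uniform distribution has extreme value index -1 and the exponential distribution has index 0, both inside the interval [-1,0] on which the abstract says the densities are log-concave, so the two values computed above give a quick sanity check of the sketch.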

    Non-Gaussian component analysis: testing the dimension of the signal subspace

    Dimension reduction is a common strategy in multivariate data analysis that seeks a subspace containing all interesting features needed for the subsequent analysis. For this purpose, non-Gaussian component analysis attempts to divide the data into a non-Gaussian part, the signal, and a Gaussian part, the noise. We show that two scatter functionals used simultaneously can achieve this division, and we suggest a bootstrap test for the dimension of the non-Gaussian subspace. Sequential application of the test can then be used, for example, to estimate the signal dimension.

    Concentration Inequalities and Confidence Bands for Needlet Density Estimators on Compact Homogeneous Manifolds

    Let $X_1,...,X_n$ be a random sample from some unknown probability density $f$ defined on a compact homogeneous manifold $\mathbf M$ of dimension $d \ge 1$. Consider a 'needlet frame' $\{\phi_{j \eta}\}$ describing a localised projection onto the space of eigenfunctions of the Laplace operator on $\mathbf M$ with corresponding eigenvalues less than $2^{2j}$, as constructed in \cite{GP10}. We prove non-asymptotic concentration inequalities for the uniform deviations of the linear needlet density estimator $f_n(j)$ obtained from an empirical estimate of the needlet projection $\sum_\eta \phi_{j \eta} \int f \phi_{j \eta}$ of $f$. We apply these results to construct risk-adaptive estimators and non-asymptotic confidence bands for the unknown density $f$. The confidence bands are adaptive over classes of differentiable and Hölder-continuous functions on $\mathbf M$ that attain their Hölder exponents. Comment: Probability Theory and Related Fields, to appear

    The Newcomb-Benford Law in Its Relation to Some Common Distributions

    An often reported but nevertheless persistently striking observation, formalized as the Newcomb-Benford law (NBL), is that the frequencies with which the leading digits of numbers occur in a large variety of data are far from uniform. Most spectacular seems to be the fact that in many data sets the leading digit 1 occurs in nearly one third of all cases. Proposed explanations for this uneven distribution of the leading digits include, among others, scale and base invariance. Little attention, however, has been paid to the interrelation between the distribution of the significant digits and the distribution of the observed variable. It is shown here by simulation that long right-tailed distributions of a random variable are compatible with the NBL, and that for distributions of the ratio of two random variables the fit generally improves. Distributions that do not put most of their mass on small values of the random variable (e.g. symmetric distributions) fail to fit. Hence, the validity of the NBL requires the predominance of small values and, in real-world data, a majority of small entities. Analyses of data on stock prices, the areas and numbers of inhabitants of countries, and the starting page numbers of papers from a bibliography support this conclusion. In all, these findings may help to understand the mechanisms behind the NBL and the conditions needed for its validity. That this law is not only of scientific interest per se but also has substantial implications can be seen from the fields in which it has been suggested for practical use. These fields range from the detection of irregularities in data (e.g. economic fraud) to optimizing the architecture of computers regarding number representation, storage, and round-off errors.
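    The simulation argument in this abstract can be illustrated with a minimal sketch. Under the NBL the leading digit d occurs with probability log10(1 + 1/d), which gives about 0.301 for d = 1. The block below compares leading-digit frequencies of a long right-tailed sample and a symmetric sample against these probabilities. The specific distributions (a lognormal with log-scale standard deviation 2 and a normal with mean 50 and standard deviation 5), the sample size, the seed, and all function names are assumptions chosen for illustration, not the settings used in the paper.

```python
import math
import random
from collections import Counter


def benford_probs():
    """Leading-digit probabilities log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant digit of a nonzero finite number, read off scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def digit_frequencies(values):
    """Relative frequency of each leading digit 1..9, ignoring exact zeros."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def max_abs_deviation(values):
    """Largest absolute gap between observed digit frequencies and the NBL."""
    freq = digit_frequencies(values)
    target = benford_probs()
    return max(abs(freq[d] - target[d]) for d in range(1, 10))


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000
right_tailed = [rng.lognormvariate(0.0, 2.0) for _ in range(n)]  # long right tail
symmetric = [rng.gauss(50.0, 5.0) for _ in range(n)]             # mass far from small values

deviation_right_tailed = max_abs_deviation(right_tailed)
deviation_symmetric = max_abs_deviation(symmetric)
```

    In this setup the right-tailed sample should track the NBL closely, while the symmetric sample concentrates on the leading digits 4 and 5 and fails to fit, in line with the abstract's conclusion.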

    On an Auxiliary Function for Log-Density Estimation


    logcondens: Computations related to univariate log-concave density estimation

    Maximum likelihood estimation of a log-concave density has attracted considerable attention over the last few years. Several algorithms have been proposed to estimate such a density. Two of those algorithms, an iterative convex minorant and an active set algorithm, are implemented in the R package logcondens. While these algorithms are discussed elsewhere, we describe in this paper the use of the logcondens package and discuss functions and datasets related to log-concave density estimation contained in the package. In particular, we provide functions to (1) compute the maximum likelihood estimate (MLE) as well as a smoothed log-concave density estimator derived from the MLE, (2) evaluate the estimated density, distribution and quantile functions at arbitrary points, (3) compute the characterizing functions of the MLE, (4) sample from the estimated distribution, and finally (5) perform a two-sample permutation test using a modified Kolmogorov-Smirnov test statistic. In addition, logcondens makes two datasets available that have been used to illustrate log-concave density estimation.

    Regularization by Gaussian process priors
