
    EbayesThresh: R Programs for Empirical Bayes Thresholding

    Suppose that a sequence of unknown parameters is observed subject to independent Gaussian noise. The EbayesThresh package in the S language implements a class of Empirical Bayes thresholding methods that can take advantage of possible sparsity in the sequence, to improve the quality of estimation. The prior for each parameter in the sequence is a mixture of an atom of probability at zero and a heavy-tailed density. Within the package, this can be either a Laplace (double exponential) density or else a mixture of normal distributions with tail behavior similar to the Cauchy distribution. The mixing weight, or sparsity parameter, is chosen automatically by marginal maximum likelihood. If estimation is carried out using the posterior median, this is a random thresholding procedure; the estimation can also be carried out using other thresholding rules with the same threshold, and the package provides the posterior mean, and hard and soft thresholding, as additional options. This paper reviews the method, and gives details (far beyond those previously published) of the calculations needed for implementing the procedures. It explains and motivates both the general methodology, and the use of the EbayesThresh package, through simulated and real data examples. When estimating the wavelet transform of an unknown function, it is appropriate to apply the method level by level to the transform of the observed data. The package can carry out these calculations for wavelet transforms obtained using various packages in R and S-PLUS. Details, including a motivating example, are presented, and the application of the method to image estimation is also explored. The final topic considered is the estimation of a single sequence that may become progressively sparser along the sequence. An iterated least squares isotone regression method allows for the choice of a threshold that depends monotonically on the order in which the observations are made. An alternative possibility, also discussed in detail, is a particular parametric dependence of the sparsity parameter on the position in the sequence.
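
    The core idea can be sketched outside R. The snippet below is a minimal Python illustration, not the package itself: it places a spike-and-Laplace mixture prior on each mean, estimates the mixing weight w by marginal maximum likelihood, and then keeps an observation only when its posterior probability of a nonzero mean exceeds one half. That keep-or-kill rule is a simplified stand-in for the posterior-median thresholding the package computes via its ebayesthresh function; the Laplace scale a = 0.5 and the simulated data are illustrative choices.

        import numpy as np
        from scipy.stats import norm
        from scipy.special import logsumexp
        from scipy.optimize import minimize_scalar

        A = 0.5  # scale of the heavy-tailed (Laplace) component; illustrative choice

        def log_marginal_laplace(x, a=A):
            # log g(x), where g is N(0,1) noise convolved with a Laplace(a) prior:
            # g(x) = (a/2) [exp(a^2/2 - a|x|) Phi(|x| - a) + exp(a^2/2 + a|x|) Phi(-|x| - a)]
            x = np.abs(x)
            t1 = a * a / 2 - a * x + norm.logcdf(x - a)
            t2 = a * a / 2 + a * x + norm.logcdf(-x - a)
            return np.log(a / 2) + logsumexp(np.vstack([t1, t2]), axis=0)

        def fit_sparsity_weight(x, a=A):
            # marginal maximum likelihood for the weight w of the nonzero component
            lg, lphi = log_marginal_laplace(x, a), norm.logpdf(x)
            def nll(w):
                return -logsumexp(np.vstack([np.log(w) + lg,
                                             np.log1p(-w) + lphi]), axis=0).sum()
            return minimize_scalar(nll, bounds=(1e-4, 1 - 1e-4), method="bounded").x

        rng = np.random.default_rng(0)
        theta = np.zeros(1000)
        theta[:50] = rng.normal(0.0, 5.0, 50)        # sparse signal: a few large means
        x = theta + rng.standard_normal(theta.size)  # observed with unit Gaussian noise

        w_hat = fit_sparsity_weight(x)
        # posterior probability that theta_i is nonzero under the fitted mixture prior
        post_nonzero = 1.0 / (1.0 + (1.0 - w_hat) / w_hat *
                              np.exp(norm.logpdf(x) - log_marginal_laplace(x)))
        theta_hat = np.where(post_nonzero > 0.5, x, 0.0)   # crude keep-or-kill rule
        print(f"estimated sparsity weight: {w_hat:.3f}")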

    Gini estimation under infinite variance

    We study the problems related to the estimation of the Gini index in the presence of a fat-tailed data generating process, i.e. one in the stable distribution class with finite mean but infinite variance (i.e. with tail index α ∈ (1,2)). We show that, in such a case, the Gini coefficient cannot be reliably estimated using conventional nonparametric methods, because of a downward bias that emerges under fat tails. This has important implications for the ongoing discussion about economic inequality. We start by discussing how the nonparametric estimator of the Gini index undergoes a phase transition in the symmetry structure of its asymptotic distribution, as the data distribution shifts from the domain of attraction of a light-tailed distribution to that of a fat-tailed one, especially in the case of infinite variance. We also show how the nonparametric Gini bias increases with lower values of α. We then prove that maximum likelihood estimation outperforms nonparametric methods, requiring a much smaller sample size to reach efficiency. Finally, for fat-tailed data, we provide a simple correction mechanism to the small sample bias of the nonparametric estimator based on the distance between the mode and the mean of its asymptotic distribution.
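
    As a quick illustration of the downward bias, the sketch below simulates Pareto samples with tail index α = 1.5, used here as a convenient fat-tailed stand-in for the stable class considered in the paper (finite mean, infinite variance, and a known true Gini of 1/(2α − 1)), and compares the average nonparametric Gini estimate with the true value. The sample size and number of replications are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(1)
        alpha = 1.5                      # tail index in (1, 2): finite mean, infinite variance
        true_gini = 1 / (2 * alpha - 1)  # exact Gini of a Pareto(alpha) distribution

        def sample_gini(x):
            # standard nonparametric (plug-in) Gini estimator
            xs = np.sort(x)
            n = xs.size
            i = np.arange(1, n + 1)
            return 2 * np.sum(i * xs) / (n * xs.sum()) - (n + 1) / n

        n, reps = 1000, 500
        estimates = [sample_gini(rng.pareto(alpha, n) + 1) for _ in range(reps)]
        print(f"true Gini        : {true_gini:.3f}")
        print(f"mean sample Gini : {np.mean(estimates):.3f}  (downward bias under fat tails)")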

    Tail index estimation, concentration and adaptivity

    This paper presents an adaptive version of the Hill estimator based on Lepski's model selection method. This simple data-driven index selection method is shown to satisfy an oracle inequality and to achieve the lower bound recently derived by Carpentier and Kim. In order to establish the oracle inequality, we derive non-asymptotic variance bounds and concentration inequalities for Hill estimators. These concentration inequalities are derived from Talagrand's concentration inequality for smooth functions of independent exponentially distributed random variables, combined with three tools of Extreme Value Theory: the quantile transform, Karamata's representation of slowly varying functions, and Rényi's characterisation of the order statistics of exponential samples. The performance of this computationally and conceptually simple method is illustrated using Monte Carlo simulations.
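
    The sketch below implements the plain Hill estimator together with a deliberately crude Lepski-type selection rule: among a grid of values of k, it keeps the largest k whose estimate stays inside simple bands around every estimate built from fewer order statistics. The band constant c = 1.2, the band shape c/√j, the grid, and the Student-t test data are illustrative assumptions and do not reproduce the paper's calibration or its concentration-based bounds.

        import numpy as np

        def hill(x, k):
            # Hill estimator of the tail index gamma = 1/alpha from the k largest order statistics
            xs = np.sort(x)[::-1]
            return np.mean(np.log(xs[:k])) - np.log(xs[k])

        def lepski_select(x, ks, c=1.2):
            # crude Lepski-type rule: largest k whose estimate stays within c/sqrt(j)
            # of every estimate based on fewer order statistics
            h = {k: hill(x, k) for k in ks}
            best = ks[0]
            for k in ks:
                if all(abs(h[k] - h[j]) <= c / np.sqrt(j) for j in ks if j <= k):
                    best = k
            return best, h[best]

        rng = np.random.default_rng(2)
        x = np.abs(rng.standard_t(df=2, size=5000))   # tail index 2, so gamma = 0.5
        ks = np.unique(np.geomspace(10, 2000, 30).astype(int))
        k_star, gamma_hat = lepski_select(x, ks)
        print(f"selected k = {k_star}, Hill estimate of gamma = {gamma_hat:.3f} (true 0.5)")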

    Second-order refined peaks-over-threshold modelling for heavy-tailed distributions

    Modelling excesses over a high threshold using the Pareto or generalized Pareto distribution (PD/GPD) is the most popular approach in extreme value statistics. This method typically requires high thresholds in order for the (G)PD to fit well and in such a case applies only to a small upper fraction of the data. The extension of the (G)PD proposed in this paper is able to describe the excess distribution for lower thresholds in the case of heavy-tailed distributions. This yields a statistical model that can be fitted to a larger portion of the data. Moreover, estimates of tail parameters display stability for a larger range of thresholds. Our findings are supported by asymptotic results, simulations and a case study. (To appear in the Journal of Statistical Planning and Inference.)
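
    For orientation, the sketch below carries out the classical peaks-over-threshold step that the paper extends: it fits an ordinary GPD (via scipy.stats.genpareto) to the excesses over a few candidate thresholds and prints the estimated shape parameter, whose drift across thresholds is the instability the proposed refinement is meant to reduce. The half-Cauchy test data and threshold quantiles are illustrative assumptions; the paper's extended model is not implemented here.

        import numpy as np
        from scipy.stats import genpareto

        rng = np.random.default_rng(3)
        x = np.abs(rng.standard_cauchy(20000))   # heavy-tailed sample, tail index 1

        # classical peaks-over-threshold: fit a plain GPD to excesses over several thresholds
        for q in (0.90, 0.95, 0.99):
            u = np.quantile(x, q)
            excesses = x[x > u] - u
            xi, _, sigma = genpareto.fit(excesses, floc=0)   # shape xi, scale sigma
            print(f"threshold = {q:.0%} quantile: xi = {xi:.2f}, sigma = {sigma:.2f}, "
                  f"{excesses.size} excesses")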

    Efficient inference about the tail weight in multivariate Student t distributions

    We propose a new testing procedure about the tail weight parameter of multivariate Student t distributions by having recourse to the Le Cam methodology. Our test is asymptotically as efficient as the classical likelihood ratio test, but outperforms the latter by its flexibility and simplicity: indeed, our approach allows the location and scatter nuisance parameters to be estimated by any root-n consistent estimators, thereby avoiding numerically complex maximum likelihood estimation. The finite-sample properties of our test are analyzed in a Monte Carlo simulation study, and we apply our method to a financial data set. We conclude the paper by indicating how to use this framework for efficient point estimation.
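
    The Le Cam construction itself is beyond a short snippet, but the flavour of plugging in simple nuisance estimators can be sketched. Below, location and scatter are estimated by the sample mean and a rescaled sample covariance, and the tail-weight (degrees-of-freedom) parameter is then read off a plug-in profile likelihood over a grid. This is a naive illustration of handling the nuisance parameters with cheap root-n consistent estimates, not the paper's efficient test; the dimension, sample size, and grid are assumptions.

        import numpy as np
        from scipy.stats import multivariate_t

        rng = np.random.default_rng(4)
        p, n, true_nu = 3, 2000, 5.0
        X = multivariate_t.rvs(loc=np.zeros(p), shape=np.eye(p), df=true_nu,
                               size=n, random_state=rng)

        # plug-in estimates of the nuisance parameters (any root-n consistent choice works)
        mu_hat = X.mean(axis=0)
        S = np.cov(X, rowvar=False)              # covariance = nu/(nu-2) * scatter for nu > 2

        def profile_loglik(nu):
            scatter = (nu - 2) / nu * S          # rescale the covariance into a scatter matrix
            return multivariate_t.logpdf(X, loc=mu_hat, shape=scatter, df=nu).sum()

        grid = np.linspace(2.5, 20, 200)
        nu_hat = grid[np.argmax([profile_loglik(nu) for nu in grid])]
        print(f"plug-in profile-likelihood estimate of nu: {nu_hat:.2f} (true {true_nu})")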

    Extreme Value Theory Filtering Techniques for Outlier Detection

    We introduce asymptotic parameter-free hypothesis tests based on extreme value theory to detect outlying observations in finite samples. Our tests have nontrivial power for detecting outliers for general forms of the parent distribution and can be implemented when this is unknown and needs to be estimated. Using these techniques, this article also develops an algorithm to uncover outliers masked by the presence of influential observations.
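
    The article's specific tests are not reproduced here, but the general EVT-filtering idea can be sketched: fit a generalized Pareto tail to the sample with the suspect maximum left out, then ask how surprising that maximum is under the fitted tail. The threshold quantile, the GPD fit, and the leave-one-out construction below are illustrative assumptions, not the article's parameter-free procedure.

        import numpy as np
        from scipy.stats import genpareto

        def evt_max_outlier_pvalue(x, q=0.95):
            # Fit a GPD to excesses over a high threshold, excluding the sample maximum,
            # then ask how surprising the maximum is under that fitted tail.
            x = np.asarray(x, dtype=float)
            body = np.sort(x)[:-1]                       # leave the candidate outlier out
            u = np.quantile(body, q)
            excesses = body[body > u] - u
            xi, _, sigma = genpareto.fit(excesses, floc=0)
            # P(X > observed max) = P(X > u) * P(excess > max - u) under the fitted GPD
            p_exceed = (body > u).mean() * genpareto.sf(x.max() - u, xi, loc=0, scale=sigma)
            return 1 - (1 - p_exceed) ** body.size       # chance any of n points exceeds the max

        rng = np.random.default_rng(5)
        clean = rng.standard_normal(2000)
        contaminated = np.append(clean, 8.0)             # inject one gross outlier
        print("p-value, clean sample  :", round(evt_max_outlier_pvalue(clean), 3))
        print("p-value, contaminated  :", round(evt_max_outlier_pvalue(contaminated), 6))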

    Quantile estimation with adaptive importance sampling

    We introduce new quantile estimators with adaptive importance sampling. The adaptive estimators are based on weighted samples that are neither independent nor identically distributed. Using a new law of the iterated logarithm for martingales, we prove the convergence of the adaptive quantile estimators for general distributions with nonunique quantiles, thereby extending the work of Feldman and Tucker [Ann. Math. Statist. 37 (1966) 451–457]. We illustrate the algorithm with an example from credit portfolio risk analysis. (Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/09-AOS745.)
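
    A non-adaptive version of the underlying estimator is easy to sketch: draw from a fixed shifted proposal, attach likelihood-ratio weights, and invert the self-normalized weighted empirical CDF to read off the quantile. The shifted-normal proposal, the N(0,1) target, and the 0.999 level below are illustrative assumptions; the paper's estimators additionally adapt the proposal across stages, which is what the martingale law of the iterated logarithm is needed for.

        import numpy as np
        from scipy.stats import norm

        def is_quantile(level, n=100_000, shift=3.0, seed=6):
            # Estimate the `level`-quantile of N(0,1) by sampling from a shifted
            # proposal N(shift, 1) and inverting the weighted empirical CDF.
            rng = np.random.default_rng(seed)
            y = rng.normal(shift, 1.0, n)                            # proposal draws
            w = np.exp(norm.logpdf(y) - norm.logpdf(y, loc=shift))   # importance weights p/q
            order = np.argsort(y)
            y, w = y[order], w[order]
            cdf = np.cumsum(w) / w.sum()                             # self-normalized weighted CDF
            return y[np.searchsorted(cdf, level)]

        level = 0.999
        print(f"IS estimate : {is_quantile(level):.4f}")
        print(f"exact value : {norm.ppf(level):.4f}")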