EbayesThresh: R Programs for Empirical Bayes Thresholding
Suppose that a sequence of unknown parameters is observed subject to independent Gaussian noise. The EbayesThresh package in the S language implements a class of Empirical Bayes thresholding methods that can take advantage of possible sparsity in the sequence, to improve the quality of estimation. The prior for each parameter in the sequence is a mixture of an atom of probability at zero and a heavy-tailed density. Within the package, this can be either a Laplace (double exponential) density or else a mixture of normal distributions with tail behavior similar to the Cauchy distribution. The mixing weight, or sparsity parameter, is chosen automatically by marginal maximum likelihood. If estimation is carried out using the posterior median, this is a random thresholding procedure; the estimation can also be carried out using other thresholding rules with the same threshold, and the package provides the posterior mean, and hard and soft thresholding, as additional options. This paper reviews the method, and gives details (far beyond those previously published) of the calculations needed for implementing the procedures. It explains and motivates both the general methodology, and the use of the EbayesThresh package, through simulated and real data examples. When estimating the wavelet transform of an unknown function, it is appropriate to apply the method level by level to the transform of the observed data. The package can carry out these calculations for wavelet transforms obtained using various packages in R and S-PLUS. Details, including a motivating example, are presented, and the application of the method to image estimation is also explored. The final topic considered is the estimation of a single sequence that may become progressively sparser along the sequence. An iterated least squares isotone regression method allows for the choice of a threshold that depends monotonically on the order in which the observations are made. 
An alternative possibility, also discussed in detail, is a particular parametric dependence of the sparsity parameter on the position in the sequence.
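The core of the method above can be sketched in a few lines. The following is a minimal Python illustration (the package itself is in R/S, and none of these function names belong to it): the marginal density of an observation under a spike-at-zero plus Laplace-tail prior, the sparsity weight chosen by marginal maximum likelihood, and a crude keep-or-kill rule that zeroes an observation when the posterior probability of a nonzero parameter falls below one half. The true posterior-median rule is more delicate; this is only a simplified stand-in.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import logsumexp
from scipy.stats import norm

def log_laplace_conv(x, a=0.5):
    # log density of a Laplace(a) prior convolved with N(0,1) noise:
    # g(x) = (a/2) e^{a^2/2} [e^{-ax} Phi(x-a) + e^{ax} Phi(-x-a)]
    x = np.asarray(x, dtype=float)
    t1 = -a * x + norm.logcdf(x - a)
    t2 = a * x + norm.logcdf(-x - a)
    return np.log(a / 2) + a**2 / 2 + logsumexp([t1, t2], axis=0)

def mml_weight(x, a=0.5):
    # choose the sparsity (mixing) weight w by marginal maximum likelihood
    log_g = log_laplace_conv(x, a)
    log_phi = norm.logpdf(x)
    def neg_loglik(w):
        return -np.sum(logsumexp(
            [np.log1p(-w) + log_phi, np.log(w) + log_g], axis=0))
    res = minimize_scalar(neg_loglik, bounds=(1e-4, 1 - 1e-4), method="bounded")
    return res.x

def ebayes_keep_or_kill(x, a=0.5):
    # crude stand-in for the posterior-median rule: zero out x_i when the
    # posterior probability that theta_i != 0 is below 1/2
    w = mml_weight(x, a)
    log_g, log_phi = log_laplace_conv(x, a), norm.logpdf(x)
    log_m = logsumexp([np.log1p(-w) + log_phi, np.log(w) + log_g], axis=0)
    p_nonzero = np.exp(np.log(w) + log_g - log_m)
    return np.where(p_nonzero >= 0.5, x, 0.0), w
```

Applied level by level to wavelet coefficients, this kind of rule kills small (noise-dominated) coefficients while leaving large ones essentially untouched.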
Gini estimation under infinite variance
We study the problems related to the estimation of the Gini index in presence
of a fat-tailed data generating process, i.e. one in the stable distribution
class with finite mean but infinite variance (i.e. with tail index
α ∈ (1,2)). We show that, in such a case, the Gini coefficient cannot be
reliably estimated using conventional nonparametric methods, because of a
downward bias that emerges under fat tails. This has important implications for
the ongoing discussion about economic inequality.
We start by discussing how the nonparametric estimator of the Gini index
undergoes a phase transition in the symmetry structure of its asymptotic
distribution, as the data distribution shifts from the domain of attraction of
a light-tailed distribution to that of a fat-tailed one, especially in the case
of infinite variance. We also show how the nonparametric Gini bias increases
with lower values of α. We then prove that maximum likelihood estimation
outperforms nonparametric methods, requiring a much smaller sample size to
reach efficiency.
Finally, for fat-tailed data, we provide a simple correction mechanism to the
small sample bias of the nonparametric estimator based on the distance between
the mode and the mean of its asymptotic distribution.
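The contrast drawn above can be made concrete. Below is a hedged Python sketch (function names are mine, not the paper's): the classical nonparametric Gini estimator via its sorted-data identity, and a maximum-likelihood plug-in under a Pareto model, where the tail index α determines the Gini coefficient through G = 1/(2α − 1) for α > 1.

```python
import numpy as np

def gini_nonparametric(x):
    # classical nonparametric estimator, via the sorted-data identity
    # G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    return 2 * np.sum(np.arange(1, n + 1) * x) / (n * np.sum(x)) - (n + 1) / n

def gini_pareto_mle(x, x_min=None):
    # ML plug-in under a Pareto(alpha, x_min) model: G = 1 / (2*alpha - 1)
    x = np.asarray(x, dtype=float)
    if x_min is None:
        x_min = x.min()
    alpha_hat = x.size / np.sum(np.log(x / x_min))
    return 1.0 / (2.0 * alpha_hat - 1.0), alpha_hat
```

Simulating from a Pareto with α ∈ (1,2) (finite mean, infinite variance) and comparing the two estimators to the true value 1/(2α − 1) reproduces the downward bias of the nonparametric estimator discussed in the abstract.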
Tail index estimation, concentration and adaptivity
This paper presents an adaptive version of the Hill estimator based on
Lepski's model selection method. This simple data-driven index selection method
is shown to satisfy an oracle inequality and is checked to achieve the lower
bound recently derived by Carpentier and Kim. In order to establish the oracle
inequality, we derive non-asymptotic variance bounds and concentration
inequalities for Hill estimators. These concentration inequalities are derived
from Talagrand's concentration inequality for smooth functions of independent
exponentially distributed random variables combined with three tools of Extreme
Value Theory: the quantile transform, Karamata's representation of slowly
varying functions, and Rényi's characterisation of the order statistics of
exponential samples. The performance of this computationally and conceptually
simple method is illustrated using Monte-Carlo simulations.
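For readers unfamiliar with the building blocks, here is a short Python sketch: the Hill estimator of the tail index from the k largest order statistics, together with a deliberately simplified Lepski-style rule for choosing k (take the largest k whose estimate stays within variance-scaled bands of all estimates at smaller k). The selection rule here is only a caricature of the paper's calibrated procedure; the constants and bands are illustrative assumptions.

```python
import numpy as np

def hill_estimator(x, k):
    # Hill estimator of gamma = 1/alpha from the k largest observations
    xs = np.sort(np.asarray(x, dtype=float))[::-1]  # descending order
    return np.mean(np.log(xs[:k] / xs[k]))

def lepski_select_k(x, ks, delta=2.0):
    # simplified Lepski-style rule: the asymptotic sd of the Hill estimator
    # is roughly gamma / sqrt(k), so accept the largest k consistent (within
    # delta * sd bands) with every estimate at smaller k
    ks = sorted(ks)
    best = ks[0]
    for k in ks:
        ok = all(abs(hill_estimator(x, k) - hill_estimator(x, j))
                 <= delta * hill_estimator(x, j) / np.sqrt(j)
                 for j in ks if j < k)
        if ok:
            best = k
    return best
```

On exact Pareto data the Hill estimator is unbiased for any k; the bias-variance tension that motivates adaptive selection appears once the tail is only regularly varying.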
Second-order refined peaks-over-threshold modelling for heavy-tailed distributions
Modelling excesses over a high threshold using the Pareto or generalized
Pareto distribution (PD/GPD) is the most popular approach in extreme value
statistics. This method typically requires high thresholds in order for the
(G)PD to fit well and in such a case applies only to a small upper fraction of
the data. The extension of the (G)PD proposed in this paper is able to describe
the excess distribution for lower thresholds in the case of heavy-tailed
distributions. This yields a statistical model that can be fitted to a larger
portion of the data. Moreover, estimates of tail parameters display stability
for a larger range of thresholds. Our findings are supported by asymptotic
results, simulations and a case study. Comment: to appear in the Journal of Statistical Planning and Inference.
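The standard peaks-over-threshold fit that this paper extends can be sketched in a few lines of Python using scipy's generalized Pareto distribution (the extended second-order model of the paper is not implemented here; `fit_pot` is an invented helper name):

```python
import numpy as np
from scipy.stats import genpareto

def fit_pot(data, threshold):
    # peaks-over-threshold: fit a GPD to the excesses over a high threshold
    data = np.asarray(data, dtype=float)
    excesses = data[data > threshold] - threshold
    # pin the GPD location at 0, since excesses start at the threshold
    shape, loc, scale = genpareto.fit(excesses, floc=0.0)
    return shape, scale, excesses.size
```

The threshold-stability property of the GPD (excesses of a GPD above a higher threshold are again GPD with the same shape) is what makes plots of the fitted shape against the threshold a standard diagnostic; the paper's extension aims to flatten such plots over a wider threshold range.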
Efficient inference about the tail weight in multivariate Student distributions
We propose a new testing procedure about the tail weight parameter of
multivariate Student distributions by having recourse to the Le Cam
methodology. Our test is asymptotically as efficient as the classical
likelihood ratio test, but outperforms the latter by its flexibility and
simplicity: indeed, our approach allows the location and scatter nuisance
parameters to be estimated by any root-n consistent estimators, thereby avoiding
numerically complex maximum likelihood estimation. The finite-sample properties
of our test are analyzed in a Monte Carlo simulation study, and we apply our
method on a financial data set. We conclude the paper by indicating how to use
this framework for efficient point estimation.
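The idea of plugging in simple root-n consistent nuisance estimates before testing the tail-weight parameter can be illustrated in the univariate case (the paper's Le Cam-efficient multivariate test is not reproduced here; this sketch uses a plain profile-likelihood-ratio statistic with median/MAD plug-ins, and all names are mine):

```python
import numpy as np
from scipy.stats import t as student_t

def lr_stat_tailweight(x, nu0, nu_grid):
    # plug in simple root-n consistent nuisance estimates: the median for
    # location and a MAD-based scale (0.6745 is the Gaussian-consistency
    # constant, a rough choice for Student data)
    x = np.asarray(x, dtype=float)
    loc = np.median(x)
    scale = np.median(np.abs(x - loc)) / 0.6745
    def loglik(nu):
        return np.sum(student_t.logpdf(x, df=nu, loc=loc, scale=scale))
    # profile over a grid of candidate degrees of freedom, including nu0,
    # so the likelihood-ratio statistic is nonnegative by construction
    grid = np.append(np.asarray(nu_grid, dtype=float), nu0)
    return 2.0 * (max(loglik(v) for v in grid) - loglik(nu0))
```

The appeal, as in the abstract, is that no numerically delicate joint maximum likelihood fit over location, scatter, and tail weight is needed.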
Extreme Value Theory Filtering Techniques for Outlier Detection
We introduce asymptotic parameter-free hypothesis tests based on extreme value theory to detect outlying observations in finite samples. Our tests have nontrivial power for detecting outliers for general forms of the parent distribution and can be implemented when this is unknown and needs to be estimated. Using these techniques this article also develops an algorithm to uncover outliers masked by the presence of influential observations.
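The flavor of spacing-based outlier detection can be conveyed by a toy rule (this is not the paper's EVT-calibrated test, only a crude heuristic motivated by the behavior of extreme-order-statistic spacings; the function and threshold constant are invented for illustration):

```python
import numpy as np

def flag_top_outlier(x, c=10.0):
    # crude spacing rule: flag the sample maximum when the top gap between
    # consecutive order statistics dwarfs the typical gap
    xs = np.sort(np.asarray(x, dtype=float))
    gaps = np.diff(xs)
    typical = np.median(gaps)
    return bool(typical > 0 and gaps[-1] > c * typical)
```

A proper test would calibrate the critical value from the estimated tail of the parent distribution, which is exactly where the extreme value theory in the abstract enters.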
Quantile estimation with adaptive importance sampling
We introduce new quantile estimators with adaptive importance sampling. The
adaptive estimators are based on weighted samples that are neither independent
nor identically distributed. Using a new law of the iterated logarithm for
martingales, we prove the convergence of the adaptive quantile estimators for
general distributions with nonunique quantiles, thereby extending the work of
Feldman and Tucker [Ann. Math. Statist. 37 (1966) 451--457]. We illustrate the
algorithm with an example from credit portfolio risk analysis. Comment: Published
in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics; DOI: http://dx.doi.org/10.1214/09-AOS745.
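The basic ingredient of such estimators, a quantile computed from a weighted sample, can be sketched as follows (the paper's adaptive scheme produces dependent, non-identically-distributed weighted samples; this Python sketch uses a single fixed importance-sampling proposal, and the function names are mine):

```python
import numpy as np
from scipy.stats import norm

def weighted_quantile(x, w, p):
    # p-quantile of a weighted sample: invert the weighted empirical CDF
    x, w = np.asarray(x, dtype=float), np.asarray(w, dtype=float)
    order = np.argsort(x)
    cw = np.cumsum(w[order]) / np.sum(w)
    return x[order][np.searchsorted(cw, p)]

def is_quantile_normal_tail(p=0.99, n=50_000, shift=2.0, seed=0):
    # importance sampling for an upper N(0,1) quantile: draw from N(shift, 1),
    # shifted toward the tail, and reweight each draw by the likelihood ratio
    # phi(y) / phi(y - shift)
    rng = np.random.default_rng(seed)
    y = rng.normal(shift, 1.0, n)
    w = norm.pdf(y) / norm.pdf(y - shift)
    return weighted_quantile(y, w, p)
```

Shifting the proposal into the tail places many more samples near the quantile of interest, which is why importance sampling pays off for the extreme loss quantiles arising in credit portfolio risk.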