
### Central Limit Theorem and convergence to stable laws in Mallows distance

We give a new proof of the classical Central Limit Theorem, in the Mallows
($L^r$-Wasserstein) distance. Our proof is elementary in the sense that it does
not require complex analysis, but rather makes use of a simple subadditive
inequality related to this metric. The key is to analyse the case where
equality holds. We provide some results concerning rates of convergence. We
also consider convergence to stable distributions, and obtain a bound on the
rate of such convergence.

Comment: 21 pages; improved version - one result strengthened, exposition improved, paper to appear in Bernoulli
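The convergence described above can be illustrated numerically. The sketch below (our own construction, not the paper's proof technique) estimates the $L^1$-Wasserstein distance between the standardised sum of $n$ iid exponential variables and a standard normal sample, using `scipy.stats.wasserstein_distance`; the distance shrinks as $n$ grows.

```python
# Empirical illustration of CLT convergence in the L1-Wasserstein (Mallows r=1)
# distance; the sample sizes and the Exp(1) choice are ours.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

def mallows_to_normal(n, reps=20000):
    """Empirical L1-Wasserstein distance between the standardised sum of
    n iid Exp(1) variables and a standard normal sample."""
    x = rng.exponential(size=(reps, n))
    s = (x.sum(axis=1) - n) / np.sqrt(n)   # Exp(1) sums have mean n, variance n
    z = rng.standard_normal(reps)
    return wasserstein_distance(s, z)

d1, d100 = mallows_to_normal(1), mallows_to_normal(100)
print(d1, d100)   # the distance to normality shrinks as n grows
```

Note that the empirical distance is bounded below by Monte Carlo noise of order $m^{-1/2}$ in the number of replications, so it will not reach zero even for large $n$.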

### Theoretical properties of the log-concave maximum likelihood estimator of a multidimensional density

We present theoretical properties of the log-concave maximum likelihood
estimator of a density based on an independent and identically distributed
sample in $\mathbb{R}^d$. Our study covers both the case where the true
underlying density is log-concave, and where this model is misspecified. We
begin by showing that for a sequence of log-concave densities, convergence in
distribution implies much stronger types of convergence -- in particular, it
implies convergence in Hellinger distance and even in certain exponentially
weighted total variation norms. In our main result, we prove the existence and
uniqueness of a log-concave density that minimises the Kullback--Leibler
divergence from the true density over the class of all log-concave densities, and
also show that the log-concave maximum likelihood estimator converges almost
surely in these exponentially weighted total variation norms to this minimiser.
In the case of a correctly specified model, this demonstrates a strong type of
consistency for the estimator; in a misspecified model, it shows that the
estimator converges to the log-concave density that is closest in the
Kullback--Leibler sense to the true density.

Comment: 20 pages, 0 figures
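The two discrepancies appearing above can be computed directly in a toy case. The sketch below (our own construction, not the paper's estimator) evaluates the Hellinger distance and the Kullback--Leibler divergence between a non-log-concave truth (a two-component normal mixture) and a single normal density standing in for a log-concave approximation, via numerical quadrature.

```python
# Toy illustration of the Hellinger and KL discrepancies for a misspecified
# (non-log-concave) truth; the mixture and the normal approximant are ours.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def f_true(x):   # bimodal mixture: not log-concave
    return 0.5 * norm.pdf(x, -2, 1) + 0.5 * norm.pdf(x, 2, 1)

def g(x):        # normal matching the mixture's mean 0 and variance 5
    return norm.pdf(x, 0, np.sqrt(5))

# Squared Hellinger distance: H^2 = (1/2) * int (sqrt(f) - sqrt(g))^2 dx
h2_integral, _ = quad(lambda x: (np.sqrt(f_true(x)) - np.sqrt(g(x)))**2, -15, 15)
hellinger = np.sqrt(0.5 * h2_integral)

# KL divergence: KL(f || g) = int f log(f / g) dx
kl, _ = quad(lambda x: f_true(x) * np.log(f_true(x) / g(x)), -15, 15)
print(hellinger, kl)
```

The paper's KL minimiser over the log-concave class is generally not this moment-matched normal; the sketch only shows how the relevant divergences are evaluated.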

### Efficient two-sample functional estimation and the super-oracle phenomenon

We consider the estimation of two-sample integral functionals, of the type
that occur naturally, for example, when the object of interest is a divergence
between unknown probability densities. Our first main result is that, in wide
generality, a weighted nearest neighbour estimator is efficient, in the sense
of achieving the local asymptotic minimax lower bound. Moreover, we also prove
a corresponding central limit theorem, which facilitates the construction of
asymptotically valid confidence intervals for the functional, having
asymptotically minimal width. One interesting consequence of our results is the
discovery that, for certain functionals, the worst-case performance of our
estimator may improve on that of the natural `oracle' estimator, which is given
access to the values of the unknown densities at the observations.

Comment: 82 pages
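A simplified version of a nearest-neighbour two-sample functional estimator can be sketched as follows; this is an unweighted 1-NN Kullback--Leibler divergence estimate in the general spirit of the estimators above, and it omits the weighting scheme that achieves efficiency.

```python
# Unweighted 1-NN estimate of KL(P || Q) from two samples; a simplified sketch,
# not the weighted efficient estimator of the paper.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

def kl_1nn(x, y):
    """1-NN KL(P || Q) estimate from x ~ P of shape (n, d), y ~ Q of shape (m, d)."""
    n, d = x.shape
    m = y.shape[0]
    rho = cKDTree(x).query(x, k=2)[0][:, 1]   # NN distance within x, skipping self
    nu = cKDTree(y).query(x, k=1)[0]          # NN distance from each x_i into y
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))

x = rng.normal(0.0, 1.0, size=(2000, 1))      # P = N(0, 1)
y = rng.normal(1.0, 1.0, size=(2000, 1))      # Q = N(1, 1); true KL = 0.5
est = kl_1nn(x, y)
print(est)
```

Here the estimate should be close to the true divergence of 0.5, though a fixed-`k` unweighted estimator of this kind does not in general attain the efficiency discussed above.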

### Asymptotics and optimal bandwidth selection for highest density region estimation

We study kernel estimation of highest-density regions (HDR). Our main
contributions are two-fold. First, we derive a uniform-in-bandwidth asymptotic
approximation to a risk that is appropriate for HDR estimation. This
approximation is then used to derive a bandwidth selection rule for HDR
estimation possessing attractive asymptotic properties. We also present the
results of numerical studies that illustrate the benefits of our theory and
methodology.

Comment: Published at http://dx.doi.org/10.1214/09-AOS766 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
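The plug-in approach to HDR estimation can be sketched in a few lines: fit a kernel density estimate, then take the level set containing probability $1-\alpha$ of the estimated density. The sketch below uses scipy's default bandwidth, not the paper's selection rule, and the density-quantile trick for choosing the level is a standard device, not taken from the paper.

```python
# Plug-in highest-density-region (HDR) estimation sketch: KDE + level set.
# Bandwidth is scipy's default (Scott's rule), not the paper's selector.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-3, 1, 500), rng.normal(3, 1, 500)])
kde = gaussian_kde(sample)

alpha = 0.1
heights = kde(sample)                    # estimated density at each observation
level = np.quantile(heights, alpha)      # level whose upper set carries ~90% mass
grid = np.linspace(-8, 8, 1000)
hdr = grid[kde(grid) >= level]           # grid points inside the estimated HDR
print(hdr.min(), hdr.max())
```

For this bimodal sample the estimated 90% HDR is a union of two intervals around the modes, excluding the low-density region near zero; that this region is a union of intervals rather than a single interval is exactly what distinguishes HDRs from symmetric interval estimates.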

### A useful variant of the Davis--Kahan theorem for statisticians

The Davis--Kahan theorem is used in the analysis of many statistical
procedures to bound the distance between subspaces spanned by population
eigenvectors and their sample versions. It relies on an eigenvalue separation
condition between certain relevant population and sample eigenvalues. We
present a variant of this result that depends only on a population eigenvalue
separation condition, making it more natural and convenient for direct
application in statistical contexts, and improving the bounds in some cases. We
also provide an extension to situations where the matrices under study may be
asymmetric or even non-square, and where interest is in the distance between
subspaces spanned by corresponding singular vectors.

Comment: 12 pages
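A numerical sketch of the population-gap variant is below. We assume the bound takes the form $\sin\theta \le 2\|\hat\Sigma-\Sigma\|_{\mathrm{op}}/\delta$ for the top eigenvector, where $\delta$ is the *population* eigengap, as in the published version of this result; the matrices and perturbation scale are ours.

```python
# Checking the population-eigengap Davis--Kahan variant numerically:
# sin(angle between top eigenvectors) <= 2 * ||E||_op / (lambda_1 - lambda_2).
import numpy as np

rng = np.random.default_rng(1)
d = 5
sigma = np.diag([5.0, 2.0, 1.0, 0.5, 0.2])   # population matrix, eigengap 3
E = 0.1 * rng.standard_normal((d, d))
E = (E + E.T) / 2                            # symmetric perturbation
sigma_hat = sigma + E                        # "sample" version

def top_evec(m):
    w, v = np.linalg.eigh(m)
    return v[:, np.argmax(w)]

v, v_hat = top_evec(sigma), top_evec(sigma_hat)
sin_theta = np.sqrt(1 - min(1.0, abs(v @ v_hat)) ** 2)
gap = 5.0 - 2.0                              # population eigengap only
bound = 2 * np.linalg.norm(E, 2) / gap       # operator norm of the perturbation
print(sin_theta, bound)
```

The point of the variant is visible here: the bound is computable from the population eigengap and the perturbation alone, with no reference to sample eigenvalues.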

### Importance Tempering

Simulated tempering (ST) is an established Markov chain Monte Carlo (MCMC)
method for sampling from a multimodal density $\pi(\theta)$. Typically, ST
involves introducing an auxiliary variable $k$ taking values in a finite subset
of $[0,1]$ and indexing a set of tempered distributions, say $\pi_k(\theta)
\propto \pi(\theta)^k$. In this case, small values of $k$ encourage better
mixing, but samples from $\pi$ are only obtained when the joint chain for
$(\theta,k)$ reaches $k=1$. However, the entire chain can be used to estimate
expectations under $\pi$ of functions of interest, provided that importance
sampling (IS) weights are calculated. Unfortunately this method, which we call
importance tempering (IT), can disappoint. This is partly because the most
immediately obvious implementation is na\"ive and can lead to high variance
estimators. We derive a new optimal method for combining multiple IS estimators
and prove that the resulting estimator has a highly desirable property related
to the notion of effective sample size. We briefly report on the success of the
optimal combination in two modelling scenarios requiring reversible-jump MCMC,
where the na\"ive approach fails.

Comment: 16 pages, 2 tables, significantly shortened from version 4 in response to referee comments, to appear in Statistics and Computing
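The core reweighting idea can be sketched with a single fixed temperature: sample the tempered density $\pi^k$ (small $k$ mixes across modes) with random-walk Metropolis, then attach importance weights $\pi^{1-k}$ to recover expectations under $\pi$. This is a minimal sketch of the IS step only; the paper's optimal combination of several such estimators is not reproduced, and the target and tuning choices are ours.

```python
# Fixed-temperature importance-tempering sketch: sample pi^k by random-walk
# Metropolis, reweight by pi^(1-k) via self-normalised importance sampling.
import numpy as np

rng = np.random.default_rng(0)

def log_pi(theta):   # bimodal target: equal mixture of N(-4, 1) and N(4, 1)
    return np.logaddexp(-0.5 * (theta + 4) ** 2, -0.5 * (theta - 4) ** 2)

k = 0.3              # small k flattens the density and eases mode-hopping
theta, chain = 0.0, []
for _ in range(50000):
    prop = theta + rng.normal(0, 2.0)
    # Metropolis acceptance for the tempered target pi^k
    if np.log(rng.uniform()) < k * (log_pi(prop) - log_pi(theta)):
        theta = prop
    chain.append(theta)
chain = np.array(chain[5000:])               # discard burn-in

log_w = (1 - k) * log_pi(chain)              # IS weights: pi / pi^k = pi^(1-k)
w = np.exp(log_w - log_w.max())              # stabilise before exponentiating
est = np.sum(w * chain**2) / np.sum(w)       # self-normalised IS estimate
print(est)                                   # E_pi[theta^2] = 17 for this target
```

With a weight distribution this skewed, the effective sample size of the reweighted chain can be much smaller than its length, which is precisely the issue the optimal combination above is designed to address.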
