Asymptotically minimax Bayes predictive densities
Given a random sample from a distribution with density function that depends
on an unknown parameter, we are interested in accurately estimating
the true parametric density function at a future observation from the same
distribution. The asymptotic risk of Bayes predictive density estimates with
Kullback--Leibler loss function is used to examine various ways of choosing prior
distributions; the principal type of choice studied is minimax. We seek
asymptotically least favorable predictive densities for which the corresponding
asymptotic risk is minimax. A result resembling Stein's paradox for estimating
normal means by the maximum likelihood holds for the uniform prior in the
multivariate location family case: when the dimensionality of the model is at
least three, the Jeffreys prior is minimax, though inadmissible. The Jeffreys
prior is both admissible and minimax for one- and two-dimensional location
problems. Comment: Published at http://dx.doi.org/10.1214/009053606000000885 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
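As a reading aid, the objects named in this abstract can be written out as below. The notation is an assumption chosen here for illustration and is not taken from the paper: theta is the unknown parameter, x^n the observed sample, y the future observation, pi the prior, and \hat p_\pi the Bayes predictive density.

```latex
% Bayes predictive density under a prior \pi (notation assumed, not from the paper)
\hat p_\pi(y \mid x^n) = \int p(y \mid \theta)\, \pi(\theta \mid x^n)\, d\theta ,
\qquad
\pi(\theta \mid x^n) \propto \pi(\theta) \prod_{i=1}^{n} p(x_i \mid \theta)

% Kullback--Leibler risk of a predictive density \hat p at parameter \theta
R(\theta, \hat p) = \mathbb{E}_{\theta}\!\left[ \int p(y \mid \theta)\,
  \log \frac{p(y \mid \theta)}{\hat p(y \mid x^n)}\, dy \right]

% \hat p is minimax if it attains \inf_{\hat p} \sup_{\theta} R(\theta, \hat p)
```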
J. K. Ghosh's contribution to statistics: A brief outline
Professor Jayanta Kumar Ghosh has contributed massively to various areas of
Statistics over the last five decades. Here, we survey some of his most
important contributions. In roughly chronological order, we discuss his major
results in the areas of sequential analysis, foundations, asymptotics, and
Bayesian inference. It is seen that he progressed from thinking about data
points, to thinking about data summarization, to the limiting cases of data
summarization as they relate to parameter estimation, and then to more
general aspects of modeling including prior and model selection. Comment: Published at http://dx.doi.org/10.1214/074921708000000011 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org).
Medical image registration using Edgeworth-based approximation of Mutual Information
We propose a new similarity measure for iconic medical image registration: an Edgeworth-based third-order approximation of Mutual Information (MI), named 3-EMI. Contrary to classical Edgeworth-based MI approximations, such as those proposed for independent component analysis, the 3-EMI measure is able to deal with potentially correlated variables. The performance of 3-EMI is then evaluated and compared with the Gaussian and B-Spline kernel-based estimates of MI, and the validation is carried out in three steps. First, we compare the intrinsic behavior of the measures as a function of the number of samples and the variance of an additive Gaussian noise. Then, they are evaluated in the context of multimodal rigid registration, using the RIRE data. We finally validate the use of our measure in the context of thoracic monomodal non-rigid registration, using the database proposed during the MICCAI EMPIRE10 challenge. The results show the wide range of clinical applications for which our measure can perform, including non-rigid registration, which remains a challenging problem. They also demonstrate that 3-EMI outperforms classical estimates of MI for a low number of samples or a strong additive Gaussian noise. More generally, our measure gives competitive registration results, with a much lower numerical complexity compared to classical estimators such as the reference B-Spline kernel estimator, which makes 3-EMI a good candidate for fast and accurate registration tasks.
Demystifying Fixed k-Nearest Neighbor Information Estimators
Estimating mutual information from i.i.d. samples drawn from an unknown joint
density function is a basic statistical problem of broad interest with
multitudinous applications. The most popular estimator is one proposed by
Kraskov, Stögbauer, and Grassberger (KSG) in 2004, and is nonparametric and
based on the distances of each sample to its k-th nearest neighboring
sample, where k is a fixed small integer. Despite its widespread use (part of
scientific software packages), theoretical properties of this estimator have
been largely unexplored. In this paper we demonstrate that the estimator is
consistent and also identify an upper bound on the rate of convergence of the
bias as a function of number of samples. We argue that the superior performance
benefits of the KSG estimator stem from a curious "correlation boosting"
effect and build on this intuition to modify the KSG estimator in novel ways to
construct a superior estimator. As a byproduct of our investigations, we obtain
nearly tight rates of convergence of the error of the well known fixed
k-nearest neighbor estimator of differential entropy by Kozachenko and
Leonenko. Comment: 55 pages, 8 figures
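To make the fixed k-nearest neighbor idea concrete, here is a minimal sketch of the textbook Kozachenko-Leonenko differential entropy estimate, written from the standard formula and not from this paper. The function names (`digamma_int`, `log_unit_ball_volume`, `kl_entropy`) are hypothetical, distances are Euclidean, the neighbor search is brute force, and the samples are assumed to contain no exact duplicates.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def log_unit_ball_volume(d):
    """Log-volume of the d-dimensional Euclidean unit ball."""
    return 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1.0)


def kl_entropy(samples, k=3):
    """Kozachenko-Leonenko entropy estimate (in nats) with fixed k.

    samples: list of equal-length tuples; assumes no exact duplicates,
    since a zero k-th neighbor distance would make log() fail.
    """
    n = len(samples)
    d = len(samples[0])
    log_r_sum = 0.0
    for i, x in enumerate(samples):
        # Brute-force distances from x to every other sample (O(n^2) overall).
        dists = sorted(math.dist(x, y) for j, y in enumerate(samples) if j != i)
        log_r_sum += math.log(dists[k - 1])  # distance to the k-th nearest neighbor
    return (digamma_int(n) - digamma_int(k)
            + log_unit_ball_volume(d) + d * log_r_sum / n)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(rng.gauss(0.0, 1.0),) for _ in range(600)]
    print("estimate:", kl_entropy(data, k=3))
    print("exact N(0,1) entropy:", 0.5 * math.log(2 * math.pi * math.e))
```

The brute-force search keeps the sketch dependency-free; a spatial index such as a k-d tree would be the usual choice for larger samples.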
Long runs under point conditioning. The real case
This paper presents a sharp approximation of the density of long runs of a
random walk conditioned on its end value or on an average of a function of its
summands as their number tends to infinity. The conditioning event is of
moderate or large deviation type. The result extends the Gibbs conditional
principle in the sense that it provides a description of the distribution of
the random walk on long subsequences. An algorithm for the simulation of such
long runs is presented, together with an algorithm determining their maximal
length for which the approximation is valid up to a prescribed accuracy.
Parameter estimation and the statistics of nonlinear cosmic fields
The large scale distribution of matter in the universe contains valuable information
about fundamental cosmological parameters, the properties of dark matter
and the formation processes of galaxies. The best hope of retrieving this information
lies in providing a statistical description of the matter distribution that
may be used for comparing models with observation. Unfortunately much of the
important information lies on scales below which nonlinear gravitational effects
have taken hold, complicating both models and statistics considerably. This thesis
deals with the distribution of matter - mass and galaxies - on such scales. The
aim is to develop new statistical tools that make use of the nonlinear evolution
for the purposes of constraining cosmological models. A new derivation for the 1-point probability distribution function (PDF) for density
inhomogeneities is presented first. The calculation is based upon an exact
statistical treatment, using the Chapman-Kolmogorov equation and second order
Eulerian perturbation theory to propagate the initial density field into the nonlinear
regime. The analysis yields the generating function for moments, allowing
for a straightforward derivation of the skewness. A new dependence upon the
perturbation spectrum is found for the skewness at second order. The results of
the analysis are compared against other methods for deriving the 1-point PDF
and against data from numerical N-body simulations. Good agreement is found
in both cases. The 1-point PDF for galaxies is derived next, taking into account nonlinear biasing
of the density field and the distorting effects associated with working in
redshift space. Once again perturbation theory is used to evolve the density field
into the nonlinear regime and the Chapman-Kolmogorov equation to propagate
the initial probabilities. Transformation of the dark matter density to a biased
galaxy distribution is done through an Eulerian biasing prescription, expanding
the nonlinear bias function to second order. An advantage of the Chapman-Kolmogorov
approach is the natural way that different initial conditions and biasing
models may be incorporated. It is shown that the method is general enough
to allow a non-deterministic (hidden variable) bias. The dependence on cosmological
parameters of the evolution of the galaxy 1-point PDF is demonstrated
and a method for differentiating between degenerate models in linear theory is
presented. A new derivation of the skewness for a biased density field in redshift space is also given and shown to depend significantly on the density and
bias parameters. The results compare favourably with those of numerical
simulations. Finally a new, general formalism for analysing parameter information from non-Gaussian cosmic fields is developed. The method is general enough for application
to a range of problems including the measurement of parameters from galaxy
redshift surveys, weak lensing surveys and velocity field surveys. It may also be
used to test for non-Gaussianity in the Cosmic Microwave Background. Generalising
maximum likelihood analysis to second order, the non-Gaussian Fisher
information matrix is derived and the detailed shapes of likelihood surfaces in parameter
space are explored via a parameter entropy function. Concentrating on
non-Gaussianity due to nonlinear evolution under gravity, the generalised Fisher
analysis is applied to a model of a galaxy redshift survey, including the effects
of biasing, redshift space distortions and shot noise. Incorporating second order
moments into the parameter estimation is found to have a large effect, breaking
all of the degeneracies between parameters. The results indicate that using
nonlinear likelihood analysis may yield parameter uncertainties around the few
percent level from forthcoming large galaxy redshift surveys.