34 research outputs found
Penalized maximum likelihood and semiparametric second-order efficiency
We consider the problem of estimation of a shift parameter of an unknown
symmetric function in Gaussian white noise. We introduce a notion of
semiparametric second-order efficiency and propose estimators that are
semiparametrically efficient and second-order efficient in our model. These
estimators are of a penalized maximum likelihood type with an appropriately
chosen penalty. We argue that second-order efficiency is crucial in
semiparametric problems since only the second-order terms in the asymptotic
expansion for the risk account for the behavior of the "nonparametric
component" of a semiparametric procedure, and they are not dramatically
smaller than the first-order terms. Comment: Published at http://dx.doi.org/10.1214/009053605000000895 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Pivotal estimation in high-dimensional regression via linear programming
We propose a new method of estimation in the high-dimensional linear regression
model. It allows for very weak distributional assumptions including
heteroscedasticity, and does not require the knowledge of the variance of
random errors. The method is based on linear programming only, so that its
numerical implementation is faster than for previously known techniques using
conic programs, and it allows one to deal with higher dimensional models. We
provide upper bounds for estimation and prediction errors of the proposed
estimator showing that it achieves the same rate as in the more restrictive
situation of fixed design and i.i.d. Gaussian errors with known variance.
Following Gautier and Tsybakov (2011), we obtain the results under weaker
sensitivity assumptions than the restricted eigenvalue or assimilated
conditions.
Time series prediction via aggregation: an oracle bound including numerical cost
We address the problem of forecasting a time series meeting the Causal
Bernoulli Shift model, using a parametric set of predictors. The aggregation
technique provides a predictor with well-established and quite satisfactory
theoretical properties expressed by an oracle inequality for the prediction
risk. The numerical computation of the aggregated predictor usually relies on a
Markov chain Monte Carlo method whose convergence should be evaluated. In
particular, it is crucial to bound the number of simulations needed to achieve
a numerical precision of the same order as the prediction risk. In this
direction we present a fairly general result which can be seen as an oracle
inequality including the numerical cost of the predictor computation. The
numerical cost appears by letting the oracle inequality depend on the number of
simulations required in the Monte Carlo approximation. Some numerical
experiments are then carried out to support our findings.
Systems of Hess-Appel'rot Type and Zhukovskii Property
We start with a review of a class of systems with invariant relations, the
so-called systems of Hess-Appel'rot type, which generalize the classical
Hess-Appel'rot rigid body case. The systems of Hess-Appel'rot type carry an
interesting combination of both integrable and non-integrable properties.
Further, following the integrable line, we study partial reductions and systems
having what we call the Zhukovskii property: these are Hamiltonian
systems with invariant relations, such that partially reduced systems are
completely integrable. We prove that the Zhukovskii property is a quite general
characteristic of systems of Hess-Appel'rot type. The partial reduction
neglects the most interesting and challenging part of the dynamics of the
systems of Hess-Appel'rot type, the non-integrable part, some analysis of
which may be seen as a reconstruction problem. We show that an integrable
system, the magnetic pendulum on the oriented Grassmannian, has a
natural interpretation within the Zhukovskii property and is equivalent to a
partial reduction of a certain system of Hess-Appel'rot type. We perform a
classical and an algebro-geometric integration of the system, as an example of
an isoholomorphic system. The paper presents a lot of examples of systems of
Hess-Appel'rot type, giving an additional argument in favor of further study of
this class of systems. Comment: 42 pages
Revisiting clustering as matrix factorisation on the Stiefel manifold
This paper studies clustering for possibly high-dimensional data (e.g. images, time series, gene expression data, and many other settings), and rephrases it as low-rank matrix estimation in the PAC-Bayesian framework. Our approach leverages the well-known Burer-Monteiro factorisation strategy from large-scale optimisation, in the context of low-rank estimation. Moreover, our Burer-Monteiro factors are shown to lie on a Stiefel manifold. We propose a new generalized Bayesian estimator for this problem and prove novel prediction bounds for clustering. We also devise a componentwise Langevin sampler on the Stiefel manifold to compute this estimator.
Noisy Monte Carlo: Convergence of Markov chains with approximate transition kernels
Monte Carlo algorithms often aim to draw from a distribution $\pi$ by
simulating a Markov chain with transition kernel $P$ such that $\pi$ is
invariant under $P$. However, there are many situations for which it is
impractical or impossible to draw from the transition kernel $P$. For instance,
this is the case with massive datasets, where it is prohibitively expensive to
calculate the likelihood, and is also the case for intractable likelihood models
arising from, for example, Gibbs random fields, such as those found in spatial
statistics and network analysis. A natural approach in these cases is to
replace $P$ by an approximation $\hat{P}$. Using theory from the stability of
Markov chains we explore a variety of situations where it is possible to
quantify how 'close' the chain given by the transition kernel $\hat{P}$ is to
the chain given by $P$. We apply these results to several examples from spatial
statistics and network analysis. Comment: This version: results extended to non-uniformly ergodic Markov chains
Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity
We study the problem of aggregation under the squared loss in the model of
regression with deterministic design. We obtain sharp PAC-Bayesian risk bounds
for aggregates defined via exponential weights, under general assumptions on
the distribution of errors and on the functions to aggregate. We then apply
these results to derive sparsity oracle inequalities.
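
As a rough illustration of what an aggregate defined via exponential weights can look like, the following is a minimal sketch and not the estimator analysed in the paper. It assumes a uniform prior over a finite set of candidate functions evaluated at fixed design points; the function name `exponential_weights` and the temperature parameter `beta` are hypothetical choices made for this example.

```python
import numpy as np

def exponential_weights(predictions, y, beta):
    """Aggregate M candidates by exponential weighting (uniform prior assumed).

    predictions: array of shape (M, n), candidate values at the n design points
    y:           array of shape (n,), observed responses
    beta:        positive temperature parameter (hypothetical tuning choice)
    """
    # Empirical squared loss of each candidate on the observed responses.
    losses = np.sum((predictions - y) ** 2, axis=1)
    # Weights proportional to exp(-loss / beta); subtract the max for stability.
    logits = -losses / beta
    logits = logits - logits.max()
    weights = np.exp(logits)
    weights = weights / weights.sum()
    # The aggregate is the weighted average of the candidates.
    return weights, weights @ predictions

# Usage with synthetic data (all values are illustrative only).
rng = np.random.default_rng(0)
candidates = rng.normal(size=(5, 20))
responses = candidates[2] + 0.1 * rng.normal(size=20)
w, aggregate = exponential_weights(candidates, responses, beta=1.0)
```

A smaller `beta` concentrates the weights on the candidate with the lowest empirical loss, while a larger `beta` moves the aggregate toward a plain average.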
Systems of Hess-Appel'rot type
We construct higher-dimensional generalizations of the classical
Hess-Appel'rot rigid body system. We give a Lax pair with a spectral parameter
leading to an algebro-geometric integration of this new class of systems, which
is closely related to the integration of the Lagrange bitop performed by us
recently and uses the Mumford relation for theta divisors of double unramified
coverings. Based on the basic properties satisfied by such a class of systems
related to bi-Poisson structure, quasi-homogeneity, and conditions on the
Kowalevski exponents, we suggest an axiomatic approach leading to what we call
the "class of systems of Hess-Appel'rot type". Comment: 40 pages. Comm. Math. Phys. (to appear)
Asymptotic equivalence of discretely observed diffusion processes and their Euler scheme: small variance case
This paper establishes the global asymptotic equivalence, in the sense of the
Le Cam $\Delta$-distance, between scalar diffusion models with unknown drift
function and small variance on the one side, and nonparametric autoregressive
models on the other side. The time horizon is kept fixed and both the cases
of discrete and continuous observation of the path are treated. We allow a
non-constant diffusion coefficient, bounded but possibly tending to zero. The
asymptotic equivalences are established by constructing explicit equivalence
mappings. Comment: 21 pages
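
For readers unfamiliar with the Euler scheme named in the title, the sketch below simulates one for a scalar diffusion with a small noise level. It is an illustration under stated assumptions, not the construction used in the paper: the parametrisation $dX_t = b(X_t)\,dt + \varepsilon\,\sigma(X_t)\,dW_t$, the function name `euler_scheme`, and all argument names are hypothetical.

```python
import numpy as np

def euler_scheme(drift, sigma, x0, T, n, eps, rng):
    """Simulate an Euler scheme on [0, T] with n steps.

    Assumed model (for illustration): dX = drift(X) dt + eps * sigma(X) dW.
    Returns an array of length n + 1 containing the discretised path.
    """
    dt = T / n
    path = np.empty(n + 1)
    path[0] = x0
    for i in range(n):
        # One autoregressive step: previous value, drift increment, scaled Gaussian noise.
        noise = rng.standard_normal()
        path[i + 1] = (path[i] + drift(path[i]) * dt
                       + eps * sigma(path[i]) * np.sqrt(dt) * noise)
    return path

# Usage: mean-reverting drift with a small noise level (illustrative values).
rng = np.random.default_rng(1)
sample_path = euler_scheme(lambda x: -x, lambda x: 1.0,
                           x0=1.0, T=1.0, n=100, eps=0.05, rng=rng)
```

Each step expresses the next observation as a function of the previous one plus Gaussian noise, which is the autoregressive structure the abstract refers to.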
Non-parametric Bayesian drift estimation for stochastic differential equations
We consider non-parametric Bayesian estimation of the drift coefficient of a
one-dimensional stochastic differential equation from discrete-time
observations on the solution of this equation. Under suitable regularity
conditions that are weaker than those previously suggested in the literature, we
establish posterior consistency in this context. Furthermore, we show that
posterior consistency extends to the multidimensional setting as well, which,
to the best of our knowledge, is a new result in this setting. Comment: 27 pages