34 research outputs found

    Penalized maximum likelihood and semiparametric second-order efficiency

    We consider the problem of estimating a shift parameter of an unknown symmetric function in Gaussian white noise. We introduce a notion of semiparametric second-order efficiency and propose estimators that are semiparametrically efficient and second-order efficient in our model. These estimators are of a penalized maximum likelihood type with an appropriately chosen penalty. We argue that second-order efficiency is crucial in semiparametric problems since only the second-order terms in the asymptotic expansion of the risk account for the behavior of the "nonparametric component" of a semiparametric procedure, and they are not dramatically smaller than the first-order terms. Comment: Published at http://dx.doi.org/10.1214/009053605000000895 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Pivotal estimation in high-dimensional regression via linear programming

    We propose a new method of estimation in the high-dimensional linear regression model. It allows for very weak distributional assumptions, including heteroscedasticity, and does not require knowledge of the variance of the random errors. The method is based on linear programming only, so that its numerical implementation is faster than for previously known techniques using conic programs, and it allows one to deal with higher-dimensional models. We provide upper bounds for the estimation and prediction errors of the proposed estimator, showing that it achieves the same rate as in the more restrictive situation of fixed design and i.i.d. Gaussian errors with known variance. Following Gautier and Tsybakov (2011), we obtain the results under weaker sensitivity assumptions than the restricted eigenvalue or assimilated conditions.

    Time series prediction via aggregation : an oracle bound including numerical cost

    We address the problem of forecasting a time series satisfying the Causal Bernoulli Shift model, using a parametric set of predictors. The aggregation technique provides a predictor with well-established and quite satisfying theoretical properties, expressed by an oracle inequality for the prediction risk. The numerical computation of the aggregated predictor usually relies on a Markov chain Monte Carlo method whose convergence should be evaluated. In particular, it is crucial to bound the number of simulations needed to achieve a numerical precision of the same order as the prediction risk. In this direction we present a fairly general result which can be seen as an oracle inequality including the numerical cost of the predictor computation. The numerical cost appears by letting the oracle inequality depend on the number of simulations required in the Monte Carlo approximation. Some numerical experiments are then carried out to support our findings.

    Systems of Hess-Appel'rot Type and Zhukovskii Property

    We start with a review of a class of systems with invariant relations, the so-called systems of Hess-Appel'rot type, that generalizes the classical Hess-Appel'rot rigid body case. The systems of Hess-Appel'rot type carry an interesting combination of both integrable and non-integrable properties. Further, following the integrable line, we study partial reductions and systems having what we call the Zhukovskii property: these are Hamiltonian systems with invariant relations, such that the partially reduced systems are completely integrable. We prove that the Zhukovskii property is a quite general characteristic of systems of Hess-Appel'rot type. The partial reduction neglects the most interesting and challenging part of the dynamics of the systems of Hess-Appel'rot type, the non-integrable part, some analysis of which may be seen as a reconstruction problem. We show that an integrable system, the magnetic pendulum on the oriented Grassmannian $Gr^+(4,2)$, has a natural interpretation within the Zhukovskii property and that it is equivalent to a partial reduction of a certain system of Hess-Appel'rot type. We perform a classical and an algebro-geometric integration of the system, as an example of an isoholomorphic system. The paper presents many examples of systems of Hess-Appel'rot type, giving an additional argument in favor of further study of this class of systems. Comment: 42 pages

    Revisiting clustering as matrix factorisation on the Stiefel manifold

    This paper studies clustering for possibly high-dimensional data (e.g. images, time series, gene expression data, and many other settings), and rephrases it as low-rank matrix estimation in the PAC-Bayesian framework. Our approach leverages the well-known Burer-Monteiro factorisation strategy from large-scale optimisation, in the context of low-rank estimation. Moreover, our Burer-Monteiro factors are shown to lie on a Stiefel manifold. We propose a new generalized Bayesian estimator for this problem and prove novel prediction bounds for clustering. We also devise a componentwise Langevin sampler on the Stiefel manifold to compute this estimator.

    Noisy Monte Carlo: Convergence of Markov chains with approximate transition kernels

    Monte Carlo algorithms often aim to draw from a distribution $\pi$ by simulating a Markov chain with transition kernel $P$ such that $\pi$ is invariant under $P$. However, there are many situations for which it is impractical or impossible to draw from the transition kernel $P$. For instance, this is the case with massive datasets, where it is prohibitively expensive to calculate the likelihood, and it is also the case for intractable likelihood models arising from, for example, Gibbs random fields, such as those found in spatial statistics and network analysis. A natural approach in these cases is to replace $P$ by an approximation $\hat{P}$. Using theory from the stability of Markov chains, we explore a variety of situations where it is possible to quantify how 'close' the chain given by the transition kernel $\hat{P}$ is to the chain given by $P$. We apply these results to several examples from spatial statistics and network analysis. Comment: This version: results extended to non-uniformly ergodic Markov chains

    Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity

    We study the problem of aggregation under the squared loss in the model of regression with deterministic design. We obtain sharp PAC-Bayesian risk bounds for aggregates defined via exponential weights, under general assumptions on the distribution of errors and on the functions to aggregate. We then apply these results to derive sparsity oracle inequalities.

    Systems of Hess-Appel'rot type

    We construct higher-dimensional generalizations of the classical Hess-Appel'rot rigid body system. We give a Lax pair with a spectral parameter leading to an algebro-geometric integration of this new class of systems, which is closely related to the integration of the Lagrange bitop performed by us recently and uses the Mumford relation for theta divisors of double unramified coverings. Based on the basic properties satisfied by such a class of systems, related to bi-Poisson structure, quasi-homogeneity, and conditions on the Kowalevski exponents, we suggest an axiomatic approach leading to what we call the "class of systems of Hess-Appel'rot type". Comment: 40 pages. Comm. Math. Phys. (to appear)

    Asymptotic equivalence of discretely observed diffusion processes and their Euler scheme: small variance case

    This paper establishes the global asymptotic equivalence, in the sense of the Le Cam $\Delta$-distance, between scalar diffusion models with unknown drift function and small variance on the one side, and nonparametric autoregressive models on the other side. The time horizon $T$ is kept fixed, and both the cases of discrete and continuous observation of the path are treated. We allow a non-constant diffusion coefficient, bounded but possibly tending to zero. The asymptotic equivalences are established by constructing explicit equivalence mappings. Comment: 21 pages

    Non-parametric Bayesian drift estimation for stochastic differential equations

    We consider non-parametric Bayesian estimation of the drift coefficient of a one-dimensional stochastic differential equation from discrete-time observations of the solution of this equation. Under suitable regularity conditions that are weaker than those previously suggested in the literature, we establish posterior consistency in this context. Furthermore, we show that posterior consistency extends to the multidimensional setting as well, which, to the best of our knowledge, is a new result in this setting. Comment: 27 pages