1,030 research outputs found

    Adaptive goodness-of-fit tests in a density model

    Given an i.i.d. sample drawn from a density $f$, we propose to test that $f$ equals some prescribed density $f_0$ or that $f$ belongs to some translation/scale family. We introduce a multiple testing procedure based on an estimation of the $\mathbb{L}_2$-distance between $f$ and $f_0$ or between $f$ and the parametric family that we consider. For each sample size $n$, our test has level of significance $\alpha$. In the case of simple hypotheses, we prove that our test is adaptive: it achieves the optimal rates of testing established by Ingster [J. Math. Sci. 99 (2000) 1110--1119] over various classes of smooth functions simultaneously. As for composite hypotheses, we obtain similar results up to a logarithmic factor. We carry out a simulation study to compare our procedures with the Kolmogorov--Smirnov tests, or with goodness-of-fit tests proposed by Bickel and Ritov [in Nonparametric Statistics and Related Topics (1992) 51--57] and by Kallenberg and Ledwina [Ann. Statist. 23 (1995) 1594--1608]. Comment: Published at http://dx.doi.org/10.1214/009053606000000119 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Adaptive estimation of linear functionals by model selection

    We propose an estimation procedure for linear functionals based on Gaussian model selection techniques. We show that the procedure is adaptive, and we give a non-asymptotic oracle inequality for the risk of the selected estimator with respect to the $\mathbb{L}_p$ loss. An application to the problem of estimating a signal or its $r^{th}$ derivative at a given point is developed, and minimax rates are proved to hold uniformly over Besov balls. We also apply our non-asymptotic oracle inequality to the estimation of the mean of the signal on an interval with length depending on the noise level. Simulations are included to illustrate the performance of the procedure for the estimation of a function at a given point. Our method provides a pointwise adaptive estimator. Comment: Published at http://dx.doi.org/10.1214/07-EJS127 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    The two-sample problem for Poisson processes: adaptive tests with a non-asymptotic wild bootstrap approach

    Considering two independent Poisson processes, we address the question of testing equality of their respective intensities. We first propose single tests whose test statistics are U-statistics based on general kernel functions. The corresponding critical values are constructed from a non-asymptotic wild bootstrap approach, leading to level $\alpha$ tests. Various choices for the kernel functions are possible, including projection, approximation or reproducing kernels. In this last case, we obtain a parametric rate of testing for a weak metric defined in the RKHS associated with the considered reproducing kernel. Then we introduce, in the other cases, an aggregation procedure, which allows us to import ideas from adaptive estimation by model selection, thresholding and/or approximation kernels. The resulting multiple tests are proved to be of level $\alpha$, and to satisfy non-asymptotic oracle type conditions for the classical $\mathbb{L}_2$-norm. From these conditions, we deduce that they are adaptive in the minimax sense over a large variety of classes of alternatives based on classical and weak Besov bodies in the univariate case, but also Sobolev and anisotropic Nikol'skii-Besov balls in the multivariate case.

    Fluidités victoriennes


    De Statisticien à Data Scientist: Développements pédagogiques à l'INSA de Toulouse

    According to a recent report from the European Commission, the world generates 1.7 million billion bytes of data every minute, the equivalent of 360,000 DVDs, and companies that build their decision-making processes by exploiting these data increase their productivity. The processing and exploitation of massive data has consequences for the employment of statistics graduates. Which additional skills do students trained in statistics need to acquire to become data scientists? How should training evolve so that future graduates can adapt to rapid changes in this area, without neglecting traditional jobs and the fundamental and lasting foundation of the training? After considering the notion of big data and questioning the emergence of a "new" science, Data Science, we present the current developments in the training of engineers in Mathematical Engineering and Modeling (Génie Mathématique et Modélisation) at INSA Toulouse.