29 research outputs found

Asymptotic normality for the counting process of weak records and $\delta$-records in discrete models

Author: Gouet Raúl
López F. Javier
Sanz Gerardo
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 01/01/2007

Let $\{X_n, n\ge 1\}$ be a sequence of independent and identically distributed random variables, taking non-negative integer values, and call $X_n$ a $\delta$-record if $X_n>\max\{X_1,\ldots,X_{n-1}\}+\delta$, where $\delta$ is an integer constant. We use martingale arguments to show that the counting process of $\delta$-records among the first $n$ observations, suitably centered and scaled, is asymptotically normally distributed for $\delta\ne 0$. In particular, taking $\delta=-1$ we obtain a central limit theorem for the number of weak records.

Comment: Published at http://dx.doi.org/10.3150/07-BEJ6027 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
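The record definition in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `count_delta_records` is hypothetical, and the code assumes the common convention that the maximum of an empty set is minus infinity, so the first observation always counts as a record.

```python
from typing import Iterable, Optional


def count_delta_records(xs: Iterable[int], delta: int) -> int:
    """Count observations x_n with x_n > max(x_1, ..., x_{n-1}) + delta.

    Assumption (not stated in the abstract): the first observation is
    always counted, as if the maximum of no observations were -infinity.
    """
    count = 0
    running_max: Optional[int] = None  # maximum of the observations seen so far
    for x in xs:
        if running_max is None:
            count += 1            # first observation, counted by convention
            running_max = x
        else:
            if x > running_max + delta:
                count += 1        # x exceeds the previous maximum plus delta
            running_max = max(running_max, x)
    return count


sample = [3, 1, 3, 2, 5, 4, 5]
print(count_delta_records(sample, 0))   # usual (strict) records: 2
print(count_delta_records(sample, -1))  # weak records, ties included: 4
print(count_delta_records(sample, 1))   # must beat the maximum by more than 1: 2
```

With integer observations, `delta = -1` turns the strict inequality into "greater than or equal to the previous maximum", which is why ties are counted as weak records.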

arXiv.org e-Print Archive

Crossref

Repositorio Universidad de Zaragoza

Geodesic PCA in the Wasserstein space

Author: Alfredo López
Dmia Isae
Jérémie Bigot
Raúl Gouet
Thierry Klein
Publication date: 03/10/2014

We introduce the method of Geodesic Principal Component Analysis (GPCA) on the space of probability measures on the line, with finite second moment, endowed with the Wasserstein metric. We discuss the advantages of this approach over a standard functional PCA of probability densities in the Hilbert space of square-integrable functions. We establish the consistency of the method by showing that the empirical GPCA converges to its population counterpart as the sample size tends to infinity. A key property in the study of GPCA is the isometry between the Wasserstein space and a closed convex subset of the space of square-integrable functions, with respect to an appropriate measure. Therefore, we consider the general problem of PCA in a closed convex subset of a separable Hilbert space, which serves as a basis for the analysis of GPCA and is also of interest in its own right. We provide illustrative examples on simple statistical models to show the benefits of this approach for data analysis. The method is also applied to a real dataset of population pyramids.

arXiv.org e-Print Archive

CiteSeerX

Geometric PCA of Images

Author: Bigot Jérémie
Gouet Raúl
López Alfredo
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2012

We describe a method for analyzing the principal modes of geometric variability of images. For this purpose, we propose a general framework based on the use of deformation operators for modeling the geometric variability of images around a reference mean pattern. In this setting, we describe a simple algorithm for estimating the geometric variability of a set of images. Some numerical experiments on real data are presented to highlight the benefits of this approach. The consistency of this procedure is also analyzed in statistical deformable models.

CiteSeerX

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL-INSA Toulouse

Extrapolation of Urn Models via Poissonization: Accurate Measurements of the Microbial Unknown

Author: Gouet Raúl
Lladser Manuel
Reeder Jens
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 28/06/2011

The availability of high-throughput parallel methods for sequencing microbial communities is increasing our knowledge of the microbial world at an unprecedented rate. Though most attention has focused on determining lower bounds on the alpha-diversity, i.e., the total number of different species present in the environment, tight bounds on this quantity may be highly uncertain because a small fraction of the environment could be composed of a vast number of different species. To better assess what remains unknown, we propose instead to predict the fraction of the environment that belongs to unsampled classes. Modeling samples as draws with replacement of colored balls from an urn with an unknown composition, and under the sole assumption that there are still undiscovered species, we show that conditionally unbiased predictors and exact prediction intervals (of constant length in logarithmic scale) are possible for the fraction of the environment that belongs to unsampled classes. Our predictions are based on a Poissonization argument, which we have implemented in what we call the Embedding algorithm. With fixed (i.e., non-randomized) sample sizes, the algorithm leads to very accurate predictions on a sub-sample of the original sample. We quantify the effect of fixed sample sizes on our prediction intervals and test our methods and others found in the literature against simulated environments, which we devise taking into account datasets from a human-gut and -hand microbiota. Our methodology applies to any dataset that can be conceptualized as a sample with replacement from an urn. In particular, it could be applied, for example, to quantify the proportion of all the unseen solutions to a binding site problem in a random RNA pool, or to reassess the surveillance of a certain terrorist group, predicting the conditional probability that it deploys a new tactic in a next attack.

Comment: 14 pages, 7 figures, 4 tables

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Near-Record Values in Discrete Random Sequences

Author: Gouet Raúl
Lafuente Miguel
López F. Javier
Sanz Gerardo
Publication venue: 'MDPI AG'
Publication date: 01/01/2022

Given a sequence $(X_n)$ of random variables, $X_n$ is said to be a near-record if $X_n\in(M_{n-1}-a, M_{n-1}]$, where $M_n=\max\{X_1,\ldots,X_n\}$ and $a>0$ is a parameter. We investigate the point process $\eta$ on $[0,\infty)$ of near-record values from an integer-valued, independent and identically distributed sequence, showing that it is a Bernoulli cluster process. We derive the probability generating functional of $\eta$ and formulas for the expectation, variance and covariance of the counting variables $\eta(A)$, $A\subset[0,\infty)$. We also derive the strong convergence and asymptotic normality of $\eta([0,n])$, as $n\to\infty$, under mild regularity conditions on the distribution of the observations. For heavy-tailed distributions, with square-summable hazard rates, we prove that $\eta([0,n])$ grows to a finite random limit and compute its probability generating function. We present examples of the application of our results to particular distributions, covering a wide range of behaviours in terms of their right tails.
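As a rough illustration of the near-record definition above, the following sketch collects the near-record values of a finite integer sequence; counting how many of them fall in a set then gives the corresponding counting variable for that sample. The function name `near_record_values` is hypothetical and the code is not taken from the paper. It assumes the first observation cannot be a near-record, since no previous maximum exists.

```python
from typing import Iterable, List, Optional


def near_record_values(xs: Iterable[int], a: float) -> List[int]:
    """Return the values x_n lying in (M_{n-1} - a, M_{n-1}].

    M_{n-1} is the maximum of the observations before x_n.
    Assumption: the first observation is never a near-record.
    """
    values: List[int] = []
    running_max: Optional[int] = None  # M_{n-1}, undefined before the first draw
    for x in xs:
        if running_max is not None and running_max - a < x <= running_max:
            values.append(x)           # close to, but not above, the current maximum
        running_max = x if running_max is None else max(running_max, x)
    return values


sample = [3, 1, 3, 2, 5, 4, 5]
vals = near_record_values(sample, a=2)
print(vals)                              # [3, 2, 4, 5]
print(sum(1 for v in vals if v <= 3))    # near-record values in [0, 3]: 2
```

For integer observations and `a = 1`, the interval contains only the current maximum, so near-records reduce to ties with the running maximum.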

Multidisciplinary Digital Publishing Institute

Repositorio Universidad de Zaragoza

A martingale approach to strong convergence in a generalized Pólya-Eggenberger urn model

Author: Gouet Raúl

We obtain strong convergence for the proportion $W_n/T_n$ of white balls in a generalized Pólya-Eggenberger urn scheme. We use straightforward martingale arguments that do not require moment estimations.

Keywords: urn model, martingales, limit theorems

Research Papers in Economics

Embedding in extremal processes and the asymptotic behavior of sums of minima

Author: Gouet Raúl

Limit theorems for sums of minima of positive i.i.d. random variables are obtained by embedding the sequence of maxima in a suitable extremal process.

Keywords: extremal process, sums of extremes, limit theorems

Research Papers in Economics

Strong convergence of proportions in a multicolor Pólya urn

Author: Johnson
Neveu
Raúl Gouet
Publication venue: 'JSTOR'

Crossref