2,293 research outputs found
Asymptotic normality for the counting process of weak records and \delta-records in discrete models
Let $\{X_n, n \ge 1\}$ be a sequence of independent and identically distributed
random variables, taking non-negative integer values, and call $X_n$ a
$\delta$-record if $X_n > \max\{X_1, \ldots, X_{n-1}\} + \delta$, where $\delta$ is an
integer constant. We use martingale arguments to show that the counting process
of $\delta$-records among the first $n$ observations, suitably centered and
scaled, is asymptotically normally distributed for $\delta \ne 0$. In particular,
taking $\delta = -1$ we obtain a central limit theorem for the number of weak
records.
Comment: Published at http://dx.doi.org/10.3150/07-BEJ6027 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
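The counting process in this abstract is easy to explore by simulation. The sketch below assumes the standard definition of a $\delta$-record, $X_n > \max\{X_1, \ldots, X_{n-1}\} + \delta$, and counts records in a simulated discrete sample; it is an illustration, not the paper's martingale analysis.

```python
import random

def count_delta_records(xs, delta):
    """Count delta-records in a sequence of non-negative integers.

    X_n is a delta-record if X_n > max(X_1, ..., X_{n-1}) + delta;
    the first observation is counted as a record by convention.
    """
    count = 1
    running_max = xs[0]
    for x in xs[1:]:
        if x > running_max + delta:
            count += 1
        running_max = max(running_max, x)
    return count

random.seed(0)
# Geometric-like discrete observations (floor of an exponential).
sample = [int(random.expovariate(1.0)) for _ in range(10_000)]
# delta = -1 counts weak records (new value >= current maximum);
# delta = 0 counts ordinary (strict) records.
weak = count_delta_records(sample, delta=-1)
strict = count_delta_records(sample, delta=0)
print(weak, strict)
```

Repeating this over many independent samples and standardizing the counts is a simple way to visualize the asymptotic normality the abstract establishes.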
Geodesic PCA in the Wasserstein space
We introduce the method of Geodesic Principal Component Analysis (GPCA) on
the space of probability measures on the line, with finite second moment,
endowed with the Wasserstein metric. We discuss the advantages of this
approach over a standard functional PCA of probability densities in the
Hilbert space of square-integrable functions. We establish the consistency of
the method by showing that the empirical GPCA converges to its population
counterpart as the sample size tends to infinity. A key property in the study
of GPCA is the isometry between the Wasserstein space and a closed convex
subset of the space of square-integrable functions, with respect to an
appropriate measure. Therefore, we consider the general problem of PCA in a
closed convex subset of a separable Hilbert space, which serves as a basis for
the analysis of GPCA and is also of interest in its own right. We provide
illustrative examples on simple statistical models to show the benefits of
this approach for data analysis. The method is also applied to a real dataset
of population pyramids.
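The isometry mentioned in this abstract rests on a classical fact: for measures on the line, the 2-Wasserstein distance equals the L2 distance between quantile functions. The sketch below uses that embedding to run a plain functional PCA on empirical quantile functions; it illustrates the geometry only and is not the constrained geodesic PCA the paper develops. The toy Gaussian model and all variable names are my own.

```python
import numpy as np

def quantile_embedding(samples, grid):
    """Map each 1-D sample to its empirical quantile function on `grid`.

    In this embedding, Euclidean distance between rows approximates the
    2-Wasserstein distance between the underlying distributions.
    """
    return np.stack([np.quantile(s, grid) for s in samples])

rng = np.random.default_rng(0)
grid = np.linspace(0.01, 0.99, 99)
# Toy model: Gaussian samples with random location and scale, so the
# quantile curves form (up to noise) a two-dimensional affine family.
samples = [rng.normal(loc=rng.normal(),
                      scale=np.exp(0.3 * rng.normal()),
                      size=500)
           for _ in range(50)]
Q = quantile_embedding(samples, grid)

# Ordinary (unconstrained) PCA on the embedded quantile functions.
Q_centered = Q - Q.mean(axis=0)
_, svals, _ = np.linalg.svd(Q_centered, full_matrices=False)
explained = svals**2 / np.sum(svals**2)
print(explained[:3])  # location and scale dominate the variability
```

Geodesic PCA additionally restricts the principal directions so that projections stay inside the convex set of quantile functions (i.e., remain non-decreasing), which the unconstrained PCA above does not enforce.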
Viral envelope glycoproteins swing into action
Analysis of the tick-borne encephalitis virus E protein reveals considerable structural diversity in the glycoproteins that clothe enveloped viruses and hints at the conformational gyrations in this molecule that lead to viral fusion.
Multilayer parking with screening on a random tree
In this paper we present a multilayer particle deposition model on a random
tree. We derive the time dependent densities of the first and second layer
analytically and show that in all trees the limiting density of the first layer
exceeds the density in the second layer. We also provide a procedure to
calculate higher layer densities and prove that random trees have a higher
limiting density in the first layer than regular trees. Finally, we compare
densities between the first and second layer and between regular and random
trees.
Comment: 15 pages, 2 figures
Geometric PCA of Images
We describe a method for analyzing the principal modes of geometric variability of images. For this purpose, we propose a general framework based on deformation operators for modeling the geometric variability of images around a reference mean pattern. In this setting, we describe a simple
algorithm for estimating the geometric variability of a set of images. Numerical experiments on real data highlight the benefits of this approach. The consistency of the procedure is also analyzed in the setting of statistical deformable models.
Extrapolation of Urn Models via Poissonization: Accurate Measurements of the Microbial Unknown
The availability of high-throughput parallel methods for sequencing microbial
communities is increasing our knowledge of the microbial world at an
unprecedented rate. Though most attention has focused on determining
lower bounds on the alpha-diversity, i.e. the total number of different species
present in the environment, tight bounds on this quantity may be highly
uncertain because a small fraction of the environment could be composed of a
vast number of different species. To better assess what remains unknown, we
propose instead to predict the fraction of the environment that belongs to
unsampled classes. Modeling samples as draws with replacement of colored balls
from an urn with an unknown composition, and under the sole assumption that
there are still undiscovered species, we show that conditionally unbiased
predictors and exact prediction intervals (of constant length in logarithmic
scale) are possible for the fraction of the environment that belongs to
unsampled classes. Our predictions are based on a Poissonization argument,
which we have implemented in what we call the Embedding algorithm. For fixed,
i.e. non-randomized, sample sizes, the algorithm leads to very accurate
predictions on a sub-sample of the original sample. We quantify the effect of
fixed sample sizes on our prediction intervals and test our methods and others
found in the literature against simulated environments, which we devise taking
into account datasets from a human-gut and -hand microbiota. Our methodology
applies to any dataset that can be conceptualized as a sample with replacement
from an urn. In particular, it could be applied, for example, to quantify the
proportion of all the unseen solutions to a binding site problem in a random
RNA pool, or to reassess the surveillance of a certain terrorist group,
predicting the conditional probability that it deploys a new tactic in a next
attack.
Comment: 14 pages, 7 figures, 4 tables
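A classical baseline for the quantity this abstract targets, the probability mass of unsampled classes, is the Good-Turing estimate: the fraction of the sample made up of species seen exactly once. The sketch below compares it with the true unseen mass on a hypothetical Zipf-like urn; it is a baseline for orientation, not the authors' Embedding algorithm or their Poissonization argument.

```python
import random
from collections import Counter

def good_turing_unseen_mass(sample):
    """Estimate the probability mass of unseen classes.

    Good-Turing estimate: the fraction of the sample that consists of
    species observed exactly once (singletons).
    """
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)

random.seed(1)
# Hypothetical urn with a long tail: many rare species, a few abundant
# ones (Zipf-like abundances, for illustration only).
species = list(range(1000))
weights = [1.0 / (k + 1) for k in species]
sample = random.choices(species, weights=weights, k=2000)

est = good_turing_unseen_mass(sample)
seen = set(sample)
true_unseen = (sum(w for s, w in zip(species, weights) if s not in seen)
               / sum(weights))
print(f"estimated unseen mass: {est:.3f}, true unseen mass: {true_unseen:.3f}")
```

Unlike this point estimate, the paper's approach yields conditionally unbiased predictors with exact prediction intervals, which is precisely what a point estimate such as Good-Turing does not provide.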
Absorption of a pulse by an optically dense medium: An argument for field quantization
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/98713/1/AJP000527.pd