26,664 research outputs found

    On Multivariate Records from Random Vectors with Independent Components

    Let $\boldsymbol{X}_1,\boldsymbol{X}_2,\dots$ be independent copies of a random vector $\boldsymbol{X}$ with values in $\mathbb{R}^d$ and with a continuous distribution function. The random vector $\boldsymbol{X}_n$ is a complete record if each of its components is a record. As we require $\boldsymbol{X}$ to have independent components, crucial results for univariate records clearly carry over. But there are substantial differences as well: while there are infinitely many records in the case $d=1$, only finitely many occur in the series if $d\geq 2$. Consequently, there is a terminal complete record with probability one. We compute the distribution of the random total number of complete records and investigate the distribution of the terminal record. For complete records, the sequence of waiting times forms a Markov chain but, unlike in the univariate case, the state infinity is now an absorbing element of the state space.
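
    The finiteness claim in this abstract can be checked numerically. The sketch below is not taken from the paper. It assumes the standard fact that, for a continuous distribution with independent components, the $n$-th observation is a record in one component with probability $1/n$, so it is a complete record with probability $n^{-d}$. The function names (`count_complete_records`, `expected_complete_records`, `simulate_mean`) and the choice of uniform components are illustrative assumptions.

```python
import random


def count_complete_records(vectors):
    """Count vectors in which every component strictly exceeds
    all earlier values of that same component."""
    running_max = None
    count = 0
    for x in vectors:
        if running_max is None:
            # By convention the first observation is a complete record.
            count += 1
            running_max = list(x)
            continue
        if all(xi > mi for xi, mi in zip(x, running_max)):
            count += 1
        # Update the componentwise running maxima.
        running_max = [max(xi, mi) for xi, mi in zip(x, running_max)]
    return count


def expected_complete_records(n, d):
    """Expected number of complete records among the first n vectors,
    assuming P(n-th vector is a complete record) = n**(-d)."""
    return sum(k ** (-d) for k in range(1, n + 1))


def simulate_mean(n, d, trials, seed=0):
    """Monte Carlo estimate of the mean number of complete records,
    using i.i.d. uniform components (an illustrative choice)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        sample = [[rng.random() for _ in range(d)] for _ in range(n)]
        total += count_complete_records(sample)
    return total / trials
```

    For $d=2$ the expected count converges to $\pi^2/6 \approx 1.645$ as $n$ grows, whereas for $d=1$ it is the harmonic number $H_n$, which diverges. Under the stated assumption, this agrees with the contrast the abstract draws between the two cases.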

    A hierarchical Bayesian approach to record linkage and population size problems

    We propose and illustrate a hierarchical Bayesian approach for matching statistical records observed on different occasions. We show how this model can be profitably adopted both in record linkage problems and in capture-recapture setups, where the size of a finite population is the real object of interest. There are at least two important differences between the proposed model-based approach and the current practice in record linkage. First, the statistical model is built on the actually observed categorical variables, and no reduction (to 0-1 comparisons) of the available information takes place. Second, the hierarchical structure of the model allows a two-way propagation of the uncertainty between the parameter estimation step and the matching procedure, so that no plug-in estimates are used and the correct uncertainty is accounted for both in estimating the population size and in performing the record linkage. We illustrate and motivate our proposal through a real data example and simulations. Comment: Published at http://dx.doi.org/10.1214/10-AOAS447 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Hierarchically nested factor model from multivariate data

    We show how to achieve a statistical description of the hierarchical structure of a multivariate data set. Specifically, we show that the similarity matrix resulting from a hierarchical clustering procedure is the correlation matrix of a factor model, the hierarchically nested factor model. In this model, factors are mutually independent and hierarchically organized. Finally, we use a bootstrap-based procedure to reduce the number of factors in the model, with the aim of retaining only those factors that are significantly robust with respect to the statistical uncertainty due to the finite length of data records. Comment: 7 pages, 5 figures; accepted for publication in Europhys. Lett.; the Appendix corresponds to the additional material of the accepted letter

    A randomness test for functional panels

    Functional panels are collections of functional time series, and arise often in the study of high-frequency multivariate data. We develop a portmanteau-style test to determine if the cross-sections of such a panel are independent and identically distributed. Our framework allows the number of functional projections and/or the number of time series to grow with the sample size. A large-sample justification is based on a new central limit theorem for random vectors of increasing dimension. With a proper normalization, the limit is standard normal, potentially making this result easily applicable in other FDA contexts in which projections on a subspace of increasing dimension are used. The test is shown to have correct size and excellent power using simulated panels whose random structure mimics the realistic dependence encountered in real panel data. It is expected to find application in climatology, finance, ecology, economics, and geophysics. We apply it to Southern Pacific sea surface temperature data, precipitation patterns in the South-West United States, and temperature curves in Germany. Comment: Supplemental material from the authors' homepage or upon request

    The effects of estimation of censoring, truncation, transformation and partial data vectors

    The purpose of this research was to attack statistical problems concerning the estimation of distributions for purposes of predicting and measuring assembly performance as it appears in biological and physical situations. Various statistical procedures were proposed to attack problems of this sort, that is, to produce the statistical distributions of the outcomes of biological and physical situations which employ characteristics measured on constituent parts. The techniques are described.

    Record statistics in random vectors and quantum chaos

    The record statistics of complex random states are analytically calculated, and it is shown that the probability of a record intensity is a Bernoulli process. The correlation due to normalization leads to a probability distribution of the records that is non-universal but tends to the Gumbel distribution asymptotically. The quantum standard map is used to study these statistics for the effect of correlations apart from normalization. It is seen that in the mixed phase space regime the number of intensity records is a power law in the dimensionality of the state, as opposed to the logarithmic growth for random states. Comment: figures redrawn, discussion added