4,910 research outputs found

    Estimation of the Covariance Matrix of Large Dimensional Data

    This paper deals with the problem of estimating the covariance matrix of a series of independent multivariate observations, in the case where the dimension of each observation is of the same order as the number of observations. Although such a regime is of interest for many current statistical signal processing and wireless communication issues, traditional methods fail to produce consistent estimators, and results relying on large random matrix theory have been unveiled only recently. In this paper, we develop the parametric framework proposed by Mestre, and consider a model where the covariance matrix to be estimated has a (known) finite number of eigenvalues, each with an unknown multiplicity. The main contributions of this work are essentially threefold with respect to existing results, and in particular to Mestre's work: to relax the (restrictive) separability assumption, to provide joint consistent estimates for the eigenvalues and their multiplicities, and to study the variance error by means of a central limit theorem.

    Online Updating of Statistical Inference in the Big Data Setting

    We present statistical methods for big data arising from online analytical processing, where large amounts of data arrive in streams and require fast analysis without storage of, or access to, the historical data. In particular, we develop iterative estimating algorithms and statistical inferences for linear models and estimating equations that update as new data arrive. These algorithms are computationally efficient, minimally storage-intensive, and allow for possible rank deficiencies in the subset design matrices due to rare-event covariates. Within the linear model setting, the proposed online-updating framework leads to predictive residual tests that can be used to assess the goodness-of-fit of the hypothesized model. We also propose a new online-updating estimator under the estimating equation setting. Theoretical properties of the goodness-of-fit tests and proposed estimators are examined in detail. In simulation studies and real data applications, our estimator compares favorably with competing approaches under the estimating equation setting. Comment: Submitted to Technometrics
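The idea of updating a linear-model fit as data arrive, without keeping the history, can be illustrated with a minimal sketch. It is not the estimator from the paper: it only accumulates the cross-products X'X and X'y chunk by chunk and solves the normal equations on demand. The class name `OnlineLeastSquares` and the helper `solve` are hypothetical names chosen for this illustration, and the sketch assumes the cumulative X'X is invertible by the time coefficients are requested.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's matrices are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


class OnlineLeastSquares:
    """Hypothetical illustration: keeps only X'X and X'y, never the raw data."""

    def __init__(self, p):
        self.xtx = [[0.0] * p for _ in range(p)]
        self.xty = [0.0] * p

    def update(self, rows, ys):
        # Add one chunk's contribution to the running cross-products.
        for x, y in zip(rows, ys):
            for i, xi in enumerate(x):
                self.xty[i] += xi * y
                for j, xj in enumerate(x):
                    self.xtx[i][j] += xi * xj

    def coefficients(self):
        # Assumes the cumulative X'X is invertible at this point.
        return solve(self.xtx, self.xty)


# Noise-free data y = 1 + 2x, fed in three chunks that are discarded after use.
xs = [float(i) for i in range(9)]
model = OnlineLeastSquares(2)
for chunk in (xs[0:3], xs[3:6], xs[6:9]):
    rows = [[1.0, x] for x in chunk]
    ys = [1.0 + 2.0 * x for x in chunk]
    model.update(rows, ys)
beta_online = model.coefficients()
print(beta_online)
```

Because the data here are noise-free, the printed coefficients are the intercept 1 and slope 2 up to rounding. A single chunk may be rank deficient without causing trouble in this sketch, since nothing is inverted until `coefficients()` is called.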

    Testing for residual correlation of any order in the autoregressive process

    We are interested in the implications of a linearly autocorrelated driven noise on the asymptotic behavior of the usual least squares estimator in a stable autoregressive process. We show that the least squares estimator is not consistent and we suggest a sharp analysis of its almost sure limiting value as well as its asymptotic normality. We also establish the almost sure convergence and the asymptotic normality of the estimated serial correlation parameter of the driven noise. Then, we derive a statistical procedure for testing for correlation of any order in the residuals of an autoregressive model, giving clearly better results than the commonly used portmanteau tests of Ljung-Box and Box-Pierce, and appearing to outperform the Breusch-Godfrey procedure on small-sized samples. Comment: 29 pages, 4 figures
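The inconsistency of least squares under autocorrelated noise can be seen in a small simulation. The sketch below covers only the simplest first-order case, which is an assumption of this illustration and not the general setting of the paper: X_k = theta X_{k-1} + eps_k with eps_k = rho eps_{k-1} + V_k and Gaussian V_k. In that case X is an AR(2) process, and the least squares ratio converges to its lag-1 autocorrelation, (theta + rho) / (1 + theta rho). The function names are hypothetical.

```python
import random


def simulate_ar1_with_ar1_noise(theta, rho, n, seed=0):
    """Simulate X_k = theta*X_{k-1} + eps_k, where eps_k = rho*eps_{k-1} + V_k."""
    rng = random.Random(seed)
    x_prev, eps_prev = 0.0, 0.0
    xs = []
    for _ in range(n):
        eps = rho * eps_prev + rng.gauss(0.0, 1.0)
        x = theta * x_prev + eps
        xs.append(x)
        x_prev, eps_prev = x, eps
    return xs


def least_squares_ar1(xs):
    """Usual least squares estimate of theta: sum X_k X_{k-1} / sum X_{k-1}^2."""
    num = sum(xs[k] * xs[k - 1] for k in range(1, len(xs)))
    den = sum(xs[k - 1] ** 2 for k in range(1, len(xs)))
    return num / den


theta, rho = 0.5, 0.5
theta_hat = least_squares_ar1(simulate_ar1_with_ar1_noise(theta, rho, n=50_000))
# Lag-1 autocorrelation of the implied AR(2) process (first-order case only).
limit = (theta + rho) / (1 + theta * rho)
print("estimate:", theta_hat, "limit:", limit, "true theta:", theta)
```

With rho = 0 the same code recovers theta, which makes the effect of the serial correlation easy to isolate.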

    Quantifying identifiability in independent component analysis

    We are interested in consistent estimation of the mixing matrix in the ICA model, when the error distribution is close to (but different from) Gaussian. In particular, we consider $n$ independent samples from the ICA model $X = A\epsilon$, where we assume that the coordinates of $\epsilon$ are independent and identically distributed according to a contaminated Gaussian distribution, and the amount of contamination is allowed to depend on $n$. We then investigate how the ability to consistently estimate the mixing matrix depends on the amount of contamination. Our results suggest that in an asymptotic sense, if the amount of contamination decreases at rate $1/\sqrt{n}$ or faster, then the mixing matrix is only identifiable up to transpose products. These results also have implications for causal inference from linear structural equation models with near-Gaussian additive noise. Comment: 22 pages, 2 figures

    Non-stationary log-periodogram regression

    We study asymptotic properties of the log-periodogram semiparametric estimate of the memory parameter d for non-stationary (d ≥ 1/2) time series with Gaussian increments, extending the results of Robinson (1995) for stationary and invertible Gaussian processes. We generalize the definition of the memory parameter d for non-stationary processes in terms of the (successively) differentiated series. We obtain that the log-periodogram estimate is asymptotically normal for d ∈ [1/2, 3/4) and still consistent for d ∈ [1/2, 1). We show that with adequate data tapers, a modified estimate is consistent and asymptotically normally distributed for any d, including both non-stationary and non-invertible processes. The estimates are invariant to the presence of certain deterministic trends, without any need of estimation.
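For readers unfamiliar with log-periodogram regression, a minimal untapered sketch follows. It regresses the log periodogram at the first m Fourier frequencies on -2 log(lambda_j), so the slope estimates d. This is the basic form of the estimate only; the tapered, modified estimate described in the abstract is not implemented, and the function names and the choices n = 1024, m = 64 are assumptions made for this illustration.

```python
import math
import random


def periodogram(xs, j):
    """Periodogram of xs at the Fourier frequency 2*pi*j/n (direct O(n) sum)."""
    n = len(xs)
    lam = 2.0 * math.pi * j / n
    re = sum(x * math.cos(lam * t) for t, x in enumerate(xs))
    im = sum(x * math.sin(lam * t) for t, x in enumerate(xs))
    return (re * re + im * im) / (2.0 * math.pi * n)


def log_periodogram_estimate(xs, m):
    """Least squares slope of log I(lambda_j) on -2*log(lambda_j), j = 1..m."""
    n = len(xs)
    regressors = [-2.0 * math.log(2.0 * math.pi * j / n) for j in range(1, m + 1)]
    responses = [math.log(periodogram(xs, j)) for j in range(1, m + 1)]
    r_bar = sum(regressors) / m
    y_bar = sum(responses) / m
    sxy = sum((r - r_bar) * (y - y_bar) for r, y in zip(regressors, responses))
    sxx = sum((r - r_bar) ** 2 for r in regressors)
    return sxy / sxx


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]  # white noise, d = 0

walk, level = [], 0.0
for v in noise:  # random walk built from the same noise, d = 1
    level += v
    walk.append(level)

d_noise = log_periodogram_estimate(noise, m=64)
d_walk = log_periodogram_estimate(walk, m=64)
print("white noise:", d_noise, "random walk:", d_walk)
```

The white-noise estimate should sit near 0 and the random-walk estimate well above 1/2; the exact values depend on the seed and on the bandwidth m.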

    On the exponential functional of Markov Additive Processes, and applications to multi-type self-similar fragmentation processes and trees

    A Markov Additive Process is a bivariate Markov process $(\xi,J)=\big((\xi_t,J_t),t\geq0\big)$ which should be thought of as a multi-type Lévy process: the second component $J$ is a Markov chain on a finite space $\{1,\ldots,K\}$, and the first component $\xi$ behaves locally as a Lévy process, with local dynamics depending on $J$. In the subordinator-like case where $\xi$ is nondecreasing, we establish several results concerning the moments of $\xi$ and of its exponential functional $I_{\xi}=\int_{0}^{\infty} e^{-\xi_t}\,\mathrm dt$, extending the work of Carmona et al., and Bertoin and Yor. We then apply these results to the study of multi-type self-similar fragmentation processes: these are self-similar analogues of Bertoin's homogeneous multi-type fragmentation processes. Notably, we encode the genealogy of the process in a tree, and under some Malthusian hypotheses, compute its Hausdorff dimension in a generalisation of our previous work. Comment: Minor corrections and typos
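As background for the moment results mentioned above, the classical single-type case ($K=1$, so $\xi$ is an ordinary subordinator) has a closed form. The display below states that known single-type formula only, as an orientation aid; it is not the multi-type result of the paper, and the symbol $\phi$ for the Laplace exponent is notation introduced here.

```latex
% Single-type case: \xi is a subordinator with Laplace exponent \phi
\mathbb{E}\big[e^{-q\xi_t}\big] = e^{-t\phi(q)}, \qquad
\mathbb{E}\big[I_{\xi}^{\,n}\big] = \frac{n!}{\prod_{k=1}^{n}\phi(k)}, \quad n \ge 1 .
% Sanity check: pure drift \xi_t = ct gives \phi(q) = cq and I_\xi = 1/c,
% and the formula returns n!/(c^n n!) = c^{-n}, as expected.
```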