Estimation of the Covariance Matrix of Large Dimensional Data
This paper deals with the problem of estimating the covariance matrix of a
series of independent multivariate observations, in the case where the
dimension of each observation is of the same order as the number of
observations. Although such a regime is of interest for many current
statistical signal processing and wireless communication issues, traditional
methods fail to produce consistent estimators and only recently results relying
on large random matrix theory have been unveiled. In this paper, we develop the
parametric framework proposed by Mestre, and consider a model where the
covariance matrix to be estimated has a (known) finite number of eigenvalues,
each with an unknown multiplicity. The main contributions of this work
are essentially threefold with respect to existing results, and in particular
to Mestre's work: to relax the (restrictive) separability assumption, to
provide jointly consistent estimates for the eigenvalues and their
multiplicities, and to study the variance error by means of a Central Limit
Theorem.
Online Updating of Statistical Inference in the Big Data Setting
We present statistical methods for big data arising from online analytical
processing, where large amounts of data arrive in streams and require fast
analysis without storage/access to the historical data. In particular, we
develop iterative estimating algorithms and statistical inferences for linear
models and estimating equations that update as new data arrive. These
algorithms are computationally efficient, minimally storage-intensive, and
allow for possible rank deficiencies in the subset design matrices due to
rare-event covariates. Within the linear model setting, the proposed
online-updating framework leads to predictive residual tests that can be used
to assess the goodness-of-fit of the hypothesized model. We also propose a new
online-updating estimator under the estimating equation setting. Theoretical
properties of the goodness-of-fit tests and proposed estimators are examined in
detail. In simulation studies and real data applications, our estimator
compares favorably with competing approaches under the estimating equation
setting. Comment: Submitted to Technometrics
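The linear-model part of the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible scheme: each incoming block contributes only its cross-products X'X and X'y to running totals, and the raw block is then discarded. The class name `OnlineLeastSquares` and the helper `solve` are hypothetical, and this is not claimed to be the authors' algorithm; it only shows why a rank-deficient block is harmless as long as the cumulative X'X is invertible.

```python
from typing import List, Sequence


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("cumulative X'X is singular; more data needed")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


class OnlineLeastSquares:
    """Keeps only X'X, X'y and the row count; past blocks are not stored."""

    def __init__(self, p: int) -> None:
        self.p = p
        self.xtx = [[0.0] * p for _ in range(p)]
        self.xty = [0.0] * p
        self.n = 0

    def update(self, X: Sequence[Sequence[float]], y: Sequence[float]) -> None:
        """Fold one block of rows into the running sufficient statistics."""
        for row, yi in zip(X, y):
            for j in range(self.p):
                self.xty[j] += row[j] * yi
                for k in range(self.p):
                    self.xtx[j][k] += row[j] * row[k]
            self.n += 1

    def coefficients(self) -> List[float]:
        """Least squares estimate based on all blocks seen so far."""
        return solve(self.xtx, self.xty)
```

For example, a first block whose covariate is identically zero is rank deficient on its own, yet after a second block arrives `coefficients()` returns the same estimate as a single fit on the pooled data.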
Testing for residual correlation of any order in the autoregressive process
We are interested in the implications of a linearly autocorrelated driven
noise on the asymptotic behavior of the usual least squares estimator in a
stable autoregressive process. We show that the least squares estimator is not
consistent and we suggest a sharp analysis of its almost sure limiting value as
well as its asymptotic normality. We also establish the almost sure convergence
and the asymptotic normality of the estimated serial correlation parameter of
the driven noise. Then, we derive a statistical procedure for testing for
correlation of any order in the residuals of an autoregressive model,
giving clearly better results than the commonly used portmanteau tests of
Ljung-Box and Box-Pierce, and appearing to outperform the Breusch-Godfrey
procedure on small-sized samples. Comment: 29 pages, 4 figures
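The inconsistency claimed in the abstract above can be seen in a small simulation. This sketch assumes the first-order special case only: X_t = theta X_{t-1} + eps_t with eps_t = rho eps_{t-1} + V_t and Gaussian V_t. In that case the series is an AR(2) process, and the least squares estimator converges to its lag-one autocorrelation, (theta + rho) / (1 + theta rho), rather than to theta. The function names are hypothetical and the code is not taken from the paper.

```python
import random
from typing import List


def simulate_ar1_with_ar1_noise(n: int, theta: float, rho: float,
                                seed: int = 0) -> List[float]:
    """Simulate X_t = theta*X_{t-1} + eps_t, eps_t = rho*eps_{t-1} + V_t."""
    rng = random.Random(seed)
    x_prev, eps_prev = 0.0, 0.0
    xs = []
    for _ in range(n):
        eps = rho * eps_prev + rng.gauss(0.0, 1.0)
        x = theta * x_prev + eps
        xs.append(x)
        x_prev, eps_prev = x, eps
    return xs


def least_squares_ar1(xs: List[float]) -> float:
    """Usual least squares estimator of the AR(1) coefficient."""
    num = sum(cur * prev for cur, prev in zip(xs[1:], xs[:-1]))
    den = sum(prev * prev for prev in xs[:-1])
    return num / den


def limiting_value(theta: float, rho: float) -> float:
    """Lag-one autocorrelation of the implied AR(2) process."""
    return (theta + rho) / (1.0 + theta * rho)
```

With theta = 0.5 and rho = 0.4 the limiting value is 0.75, so on a long simulated path `least_squares_ar1` should settle near 0.75 instead of 0.5.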
Quantifying identifiability in independent component analysis
We are interested in consistent estimation of the mixing matrix in the ICA
model, when the error distribution is close to (but different from) Gaussian.
In particular, we consider $n$ independent samples from the ICA model $X = A\epsilon$, where we assume that the coordinates of $\epsilon$ are independent
and identically distributed according to a contaminated Gaussian distribution,
and the amount of contamination is allowed to depend on $n$. We then
investigate how the ability to consistently estimate the mixing matrix depends
on the amount of contamination. Our results suggest that in an asymptotic
sense, if the amount of contamination decreases at rate $1/\sqrt{n}$ or faster,
then the mixing matrix is only identifiable up to transpose products. These
results also have implications for causal inference from linear structural
equation models with near-Gaussian additive noise. Comment: 22 pages, 2 figures
Non-stationary log-periodogram regression
We study asymptotic properties of the log-periodogram semiparametric estimate of the memory parameter d for non-stationary (d ≥ 1/2) time series with Gaussian increments, extending the results of Robinson (1995) for stationary and invertible Gaussian processes. We generalize the definition of the memory parameter d for non-stationary processes in terms of the (successively) differenced series. We obtain that the log-periodogram estimate is asymptotically normal for d ∈ [1/2, 3/4) and still consistent for d ∈ [1/2, 1). We show that with adequate data tapers, a modified estimate is consistent and asymptotically normally distributed for any d, including both non-stationary and non-invertible processes. The estimates are invariant to the presence of certain deterministic trends, without any need to estimate them.
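The basic, untapered log-periodogram estimate discussed above can be sketched as follows. The sketch assumes one common form of the regression: the log periodogram at the first m Fourier frequencies lambda_j = 2 pi j / n is regressed on -2 log(lambda_j), and the slope is taken as the estimate of d. The function names and the bandwidth argument `m` are hypothetical; the tapered, modified estimate of the abstract is not implemented here.

```python
import cmath
import math
from typing import List, Sequence


def periodogram(x: Sequence[float], j: int) -> float:
    """Periodogram of x at the Fourier frequency 2*pi*j/n."""
    n = len(x)
    lam = 2.0 * math.pi * j / n
    w = sum(v * cmath.exp(1j * lam * t) for t, v in enumerate(x))
    return abs(w) ** 2 / (2.0 * math.pi * n)


def log_periodogram_estimate(x: Sequence[float], m: int) -> float:
    """Slope of log I(lambda_j) on -2*log(lambda_j) for j = 1, ..., m."""
    n = len(x)
    if not 2 <= m < n / 2:
        raise ValueError("need 2 <= m < n/2")
    reg: List[float] = []
    logs: List[float] = []
    for j in range(1, m + 1):
        reg.append(-2.0 * math.log(2.0 * math.pi * j / n))
        logs.append(math.log(periodogram(x, j)))
    reg_mean = sum(reg) / m
    log_mean = sum(logs) / m
    num = sum((r - reg_mean) * (v - log_mean) for r, v in zip(reg, logs))
    den = sum((r - reg_mean) ** 2 for r in reg)
    return num / den
```

Because the regression includes an intercept, rescaling the series shifts every log-periodogram ordinate by the same constant and leaves the slope, and hence the estimate of d, unchanged.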
On the exponential functional of Markov Additive Processes, and applications to multi-type self-similar fragmentation processes and trees
A Markov Additive Process is a bi-variate Markov process
$(\xi,J)=\big((\xi_t,J_t),t\geq0\big)$ which should be thought of as a
multi-type Lévy process: the second component $J$ is a Markov chain on a
finite space $\{1,\ldots,K\}$, and the first component $\xi$ behaves locally as
a Lévy process, with local dynamics depending on $J$. In the
subordinator-like case where $\xi$ is nondecreasing, we establish several
results concerning the moments of $\xi$ and of its exponential functional
$I_{\xi}=\int_{0}^{\infty} e^{-\xi_t}\,\mathrm{d}t$, extending the work of Carmona
et al., and Bertoin and Yor.
We then apply these results to the study of multi-type self-similar
fragmentation processes: these are self-similar analogues of Bertoin's
homogeneous multi-type fragmentation processes. Notably, we encode the genealogy
of the process in a tree, and under some Malthusian hypotheses, compute its
Hausdorff dimension in a generalisation of our previous work. Comment: Minor corrections and typos