On the approximate maximum likelihood estimation for diffusion processes
The transition density of a diffusion process does not admit an explicit
expression in general, which prevents the full maximum likelihood estimation
(MLE) based on discretely observed sample paths. Aït-Sahalia [J. Finance
54 (1999) 1361--1395; Econometrica 70 (2002) 223--262] proposed asymptotic
expansions of the transition densities of diffusion processes, which lead to an
approximate maximum likelihood estimation (AMLE) for parameters. Building on
Aït-Sahalia's [Econometrica 70 (2002) 223--262; Ann. Statist. 36 (2008)
906--937] proposal and analysis of the AMLE, we establish the consistency and
convergence rate of the AMLE, which reveal the roles played by the number of
terms used in the asymptotic density expansions and the sampling interval
between successive observations. We find conditions under which the AMLE has
the same asymptotic distribution as that of the full MLE. A first-order
approximation to the Fisher information matrix is proposed.
Comment: Published at http://dx.doi.org/10.1214/11-AOS922 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Principal component analysis for second-order stationary vector time series
We extend the principal component analysis (PCA) to second-order stationary
vector time series in the sense that we seek a contemporaneous linear
transformation for a p-variate time series such that the transformed series
is segmented into several lower-dimensional subseries, and those subseries are
uncorrelated with each other both contemporaneously and serially. Therefore
those lower-dimensional series can be analysed separately as far as the linear
dynamic structure is concerned. Technically it boils down to an eigenanalysis
for a positive definite matrix. When the dimension p is large, an additional step is
required to perform a permutation in terms of either maximum cross-correlations
or FDR based on multiple tests. The asymptotic theory is established for both
fixed and diverging p when the sample size n tends to infinity.
Numerical experiments with both simulated and real data sets indicate that the
proposed method is an effective initial step in analysing multiple time series
data, which leads to substantial dimension reduction in modelling and
forecasting high-dimensional linear dynamical structures. Unlike PCA for
independent data, there is no guarantee that the required linear transformation
exists. When it does not, the proposed method provides an approximate
segmentation which brings advantages in, for example, forecasting
future values. The method can also be adapted to segment multiple volatility
processes.
Comment: The original title, dating back to October 2014, is "Segmenting Multiple
Time Series by Contemporaneous Linear Transformation: PCA for Time Series
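The eigenanalysis step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes (the abstract does not say this) that the positive definite matrix is the sum of S(k) S(k)^T over lags k = 0, ..., k0, where S(k) is the lag-k sample autocovariance of the standardized series. The names `lagged_cov`, `segment_transform` and the default `k0=5` are hypothetical, and the permutation step for grouping components is omitted.

```python
import numpy as np

def lagged_cov(y, k):
    """Lag-k sample autocovariance of a (T, p) array with centered columns."""
    T = y.shape[0]
    return y[k:].T @ y[:T - k] / T

def segment_transform(x, k0=5):
    """Return transformed series, orthogonal transform and eigenvalues.

    Assumed recipe (not taken from the abstract): whiten the series,
    then eigen-decompose W = sum_{k=0}^{k0} S(k) S(k)^T.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=0)
    T, p = x.shape

    # Whiten so that the lag-0 sample covariance becomes the identity.
    s0 = x.T @ x / T
    vals, vecs = np.linalg.eigh(s0)
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    y = x @ inv_sqrt

    # Accumulate the symmetric positive definite matrix over lags 0..k0.
    w = np.zeros((p, p))
    for k in range(k0 + 1):
        s = lagged_cov(y, k)
        w += s @ s.T

    # Columns of gamma are orthonormal eigenvectors of w.
    eigvals, gamma = np.linalg.eigh(w)
    z = y @ gamma  # candidate component series, before any grouping
    return z, gamma, eigvals
```

Calling `segment_transform(x, k0=3)` on a `(T, p)` array returns the rotated series `z`; deciding which columns of `z` belong to the same subseries would need the additional permutation step the abstract describes.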
High dimensional stochastic regression with latent factors, endogeneity and nonlinearity
We consider a multivariate time series model which represents a high
dimensional vector process as a sum of three terms: a linear regression of some
observed regressors, a linear combination of some latent and serially
correlated factors, and a vector white noise. We investigate the inference
without imposing stationarity conditions on the target multivariate time series,
the regressors and the underlying factors. Furthermore, we deal with
endogeneity, that is, correlation between the observed regressors and
the unobserved factors. We also consider the model with a nonlinear regression
term which can be approximated by a linear regression function with a large
number of regressors. The convergence rates for the estimators of regression
coefficients, the number of factors, factor loading space and factors are
established in settings where the dimension of the time series and the number
of regressors may both tend to infinity together with the sample size. The
proposed method is illustrated with both simulated and real data examples.
High dimensional generalized empirical likelihood for moment restrictions with dependent data
This paper considers the maximum generalized empirical likelihood (GEL)
estimation and inference on parameters identified by high dimensional moment
restrictions with weakly dependent data when the dimensions of the moment
restrictions and the parameters diverge along with the sample size. The
consistency with rates and the asymptotic normality of the GEL estimator are
obtained by properly restricting the growth rates of the dimensions of the
parameters and the moment restrictions, as well as the degree of data
dependence. It is shown that even in the high dimensional time series setting,
the GEL ratio can still behave like a chi-square random variable
asymptotically. A consistent test for over-identification is proposed. A
penalized GEL method is also provided for estimation under a sparsity setting.
Estimation of subgraph density in noisy networks
While it is common practice in applied network analysis to report various
standard network summary statistics, these numbers are rarely accompanied by
uncertainty quantification. Yet any error inherent in the measurements
underlying the construction of the network, or in the network construction
procedure itself, must propagate to any summary statistics
reported. Here we study the problem of estimating the density of an arbitrary
subgraph, given a noisy version of some underlying network as data. Under a
simple model of network error, we show that consistent estimation of such
densities is impossible when the rates of error are unknown and only a single
network is observed. Accordingly, we develop method-of-moments estimators of
network subgraph densities and error rates for the case where a minimal number
of network replicates are available. These estimators are shown to be
asymptotically normal as the number of vertices increases to infinity. We also
provide confidence intervals for quantifying the uncertainty in these estimates
based on the asymptotic normality. To construct the confidence intervals, a new
and non-standard bootstrap method is proposed to compute asymptotic variances,
which would otherwise be infeasible. We illustrate the proposed methods in the
context of gene coexpression networks.
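The impossibility claim in the abstract above can be made concrete for the simplest subgraph, a single edge. The sketch below assumes a hypothetical error model that the abstract does not spell out: a true edge density `delta`, a false-positive rate `alpha` (a non-edge recorded as an edge) and a false-negative rate `beta` (an edge recorded as a non-edge). All three names are illustrative.

```python
def expected_observed_edge_density(delta, alpha, beta):
    """Expected edge density of one noisy network under the assumed model.

    A true non-edge (probability 1 - delta) is observed as an edge with
    probability alpha; a true edge (probability delta) survives with
    probability 1 - beta.
    """
    return (1.0 - delta) * alpha + delta * (1.0 - beta)

# Two different parameter settings that give the same expected observation:
noise_free = expected_observed_edge_density(delta=0.2, alpha=0.0, beta=0.0)
noisy = expected_observed_edge_density(delta=0.1, alpha=1.0 / 9.0, beta=0.0)
```

Under this assumed model a single observed network supplies one moment equation for three unknowns, so `delta` cannot be recovered without further information such as replicates.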
Marginal empirical likelihood and sure independence feature screening
We study a marginal empirical likelihood approach in scenarios where the
number of variables grows exponentially with the sample size. The marginal
empirical likelihood ratios as functions of the parameters of interest are
systematically examined, and we find that the marginal empirical likelihood
ratio evaluated at zero can be used to differentiate whether an explanatory
variable is contributing to a response variable or not. Based on this finding,
we propose a unified feature screening procedure for linear models and the
generalized linear models. Unlike most existing feature screening
approaches, which rely on the magnitudes of some marginal estimators to identify
true signals, the proposed screening approach is capable of further
incorporating the level of uncertainty of such estimators. This merit
is inherited from the self-studentization property of the empirical likelihood approach,
and extends the insights of existing feature screening methods. Moreover, we
show that our screening approach is less restrictive in its distributional
assumptions, and can be conveniently adapted to a broad range of
scenarios such as models specified using general moment conditions. Our
theoretical results and extensive numerical examples by simulations and data
analysis demonstrate the merits of the marginal empirical likelihood approach.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1139 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
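A rough sketch of screening by the marginal empirical likelihood ratio evaluated at zero, for the linear-model case, is given here. It assumes (this goes beyond the abstract) that the marginal estimating function for predictor j at zero is the product x_ij * y_i of standardized predictor and centered response, and it solves for the Lagrange multiplier by plain bisection. The function names and the `keep` parameter are hypothetical.

```python
import numpy as np

def el_ratio_at_zero(g, iters=200):
    """Empirical likelihood ratio for H0: E[g] = 0, i.e. 2 * sum(log(1 + lam * g)).

    lam solves sum(g / (1 + lam * g)) = 0. If zero is not strictly inside
    the range of g, the ratio is taken to be infinite.
    """
    g = np.asarray(g, dtype=float)
    gmin, gmax = g.min(), g.max()
    if gmin >= 0 or gmax <= 0:
        return np.inf
    # Valid multipliers keep every weight positive: 1 + lam * g_i > 0.
    lo, hi = -1.0 / gmax, -1.0 / gmin
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # The score is strictly decreasing in lam on (lo, hi).
        if np.sum(g / (1.0 + mid * g)) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * g))

def marginal_el_screen(X, y, keep):
    """Rank predictors by their marginal EL ratio at zero; keep the largest."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    stats = np.array([el_ratio_at_zero(Xs[:, j] * yc) for j in range(Xs.shape[1])])
    order = np.argsort(-stats)
    return order[:keep], stats
```

A large ratio for predictor j is evidence against a zero marginal contribution. Because the statistic is self-normalizing, the ranking reflects both the size of the marginal effect and its variability, which matches the motivation given in the abstract.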