
    On the approximate maximum likelihood estimation for diffusion processes

    The transition density of a diffusion process does not, in general, admit an explicit expression, which prevents full maximum likelihood estimation (MLE) based on discretely observed sample paths. Aït-Sahalia [J. Finance 54 (1999) 1361--1395; Econometrica 70 (2002) 223--262] proposed asymptotic expansions of the transition densities of diffusion processes, which lead to an approximate maximum likelihood estimation (AMLE) for parameters. Building on Aït-Sahalia's [Econometrica 70 (2002) 223--262; Ann. Statist. 36 (2008) 906--937] proposal and analysis of the AMLE, we establish the consistency and convergence rate of the AMLE, which reveal the roles played by the number of terms used in the asymptotic density expansions and by the sampling interval between successive observations. We find conditions under which the AMLE has the same asymptotic distribution as the full MLE. A first-order approximation to the Fisher information matrix is proposed. Comment: Published at http://dx.doi.org/10.1214/11-AOS922 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Principal component analysis for second-order stationary vector time series

    We extend principal component analysis (PCA) to second-order stationary vector time series in the sense that we seek a contemporaneous linear transformation for a $p$-variate time series such that the transformed series is segmented into several lower-dimensional subseries, and those subseries are uncorrelated with each other both contemporaneously and serially. Those lower-dimensional series can therefore be analysed separately as far as the linear dynamic structure is concerned. Technically, it boils down to an eigenanalysis of a positive definite matrix. When $p$ is large, an additional step is required to perform a permutation in terms of either maximum cross-correlations or FDR based on multiple tests. The asymptotic theory is established for both fixed $p$ and diverging $p$ when the sample size $n$ tends to infinity. Numerical experiments with both simulated and real data sets indicate that the proposed method is an effective initial step in analysing multiple time series data, which leads to substantial dimension reduction in modelling and forecasting high-dimensional linear dynamical structures. Unlike PCA for independent data, there is no guarantee that the required linear transformation exists. When it does not, the proposed method provides an approximate segmentation which leads to advantages in, for example, forecasting future values. The method can also be adapted to segment multiple volatility processes. Comment: The original title, dating back to October 2014, is "Segmenting Multiple Time Series by Contemporaneous Linear Transformation: PCA for Time Series
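The eigenanalysis step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the positive definite matrix is built, so the construction here is an assumption made for illustration: the series is standardized to have identity sample covariance, and the matrix is taken to be the sum of products of lagged sample autocovariance matrices up to a hypothetical maximum lag `k0`. The function names `lagged_cov` and `segment_transform` are also hypothetical, and the permutation and grouping step for large $p$ is not shown.

```python
import numpy as np


def lagged_cov(x, k):
    """Sample autocovariance matrix at lag k for an (n, p) array x."""
    n = x.shape[0]
    xc = x - x.mean(axis=0)
    return xc[k:].T @ xc[: n - k] / n


def segment_transform(y, k0=5):
    """Return (b, z), where z = (y - mean) @ b.T is the transformed series.

    Assumed construction (not taken from the abstract):
    W = sum_{k=0}^{k0} S(k) S(k)^T, with S(k) the lag-k sample
    autocovariance of the standardized series.
    """
    n, p = y.shape
    yc = y - y.mean(axis=0)

    # Standardize so that the sample covariance becomes the identity.
    cov0 = yc.T @ yc / n
    evals, evecs = np.linalg.eigh(cov0)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T  # symmetric cov0^{-1/2}
    v = yc @ whitener

    # Accumulate the symmetric positive definite matrix W.
    w = np.zeros((p, p))
    for k in range(k0 + 1):
        s = lagged_cov(v, k)
        w += s @ s.T

    # Eigenvectors of W give an orthogonal rotation of the standardized series.
    _, gamma = np.linalg.eigh(w)
    b = gamma.T @ whitener
    z = yc @ b.T
    return b, z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.standard_normal((500, 4))
    b, z = segment_transform(y, k0=3)
    print(b.shape, z.shape)
```

Because the rotation is orthogonal and is applied after standardization, the components of `z` are contemporaneously uncorrelated in sample by construction. Deciding which components belong to the same subseries is a separate step that this sketch leaves out.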

    High dimensional stochastic regression with latent factors, endogeneity and nonlinearity

    We consider a multivariate time series model which represents a high dimensional vector process as a sum of three terms: a linear regression on some observed regressors, a linear combination of some latent and serially correlated factors, and a vector white noise. We investigate the inference without imposing stationarity conditions on the target multivariate time series, the regressors and the underlying factors. Furthermore, we deal with endogeneity, that is, the existence of correlations between the observed regressors and the unobserved factors. We also consider the model with a nonlinear regression term which can be approximated by a linear regression function with a large number of regressors. The convergence rates for the estimators of the regression coefficients, the number of factors, the factor loading space and the factors are established under settings in which the dimension of the time series and the number of regressors may both tend to infinity together with the sample size. The proposed method is illustrated with both simulated and real data examples.

    High dimensional generalized empirical likelihood for moment restrictions with dependent data

    This paper considers maximum generalized empirical likelihood (GEL) estimation and inference on parameters identified by high dimensional moment restrictions with weakly dependent data when the dimensions of the moment restrictions and the parameters diverge along with the sample size. The consistency with rates and the asymptotic normality of the GEL estimator are obtained by properly restricting the growth rates of the dimensions of the parameters and the moment restrictions, as well as the degree of data dependence. It is shown that even in the high dimensional time series setting, the GEL ratio can still behave like a chi-square random variable asymptotically. A consistent test for over-identification is proposed. A penalized GEL method is also provided for estimation under a sparsity setting.

    Estimation of subgraph density in noisy networks

    While it is common practice in applied network analysis to report various standard network summary statistics, these numbers are rarely accompanied by uncertainty quantification. Yet any error inherent in the measurements underlying the construction of the network, or in the network construction procedure itself, must necessarily propagate to any summary statistics reported. Here we study the problem of estimating the density of an arbitrary subgraph, given a noisy version of some underlying network as data. Under a simple model of network error, we show that consistent estimation of such densities is impossible when the rates of error are unknown and only a single network is observed. Accordingly, we develop method-of-moments estimators of network subgraph densities and error rates for the case where a minimal number of network replicates are available. These estimators are shown to be asymptotically normal as the number of vertices increases to infinity. We also provide confidence intervals for quantifying the uncertainty in these estimates based on the asymptotic normality. To construct the confidence intervals, a new and non-standard bootstrap method is proposed to compute the asymptotic variances, which would otherwise be infeasible. We illustrate the proposed methods in the context of gene coexpression networks.

    Marginal empirical likelihood and sure independence feature screening

    We study a marginal empirical likelihood approach in scenarios where the number of variables grows exponentially with the sample size. The marginal empirical likelihood ratios as functions of the parameters of interest are systematically examined, and we find that the marginal empirical likelihood ratio evaluated at zero can be used to determine whether or not an explanatory variable contributes to a response variable. Based on this finding, we propose a unified feature screening procedure for linear models and generalized linear models. Unlike most existing feature screening approaches, which rely on the magnitudes of some marginal estimators to identify true signals, the proposed screening approach is capable of further incorporating the level of uncertainty of such estimators. This merit is inherited from the self-studentization property of the empirical likelihood approach, and extends the insights of existing feature screening methods. Moreover, we show that our screening approach is less restrictive in its distributional assumptions, and can be conveniently adapted to a broad range of scenarios such as models specified using general moment conditions. Our theoretical results and extensive numerical examples by simulations and data analysis demonstrate the merits of the marginal empirical likelihood approach. Comment: Published at http://dx.doi.org/10.1214/13-AOS1139 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org