3,238 research outputs found

    Spectrum Estimation: A Unified Framework for Covariance Matrix Estimation and PCA in Large Dimensions

    Covariance matrix estimation and principal component analysis (PCA) are two cornerstones of multivariate analysis. Classic textbook solutions perform poorly when the dimension of the data is of a magnitude similar to the sample size, or even larger. In such settings, there is a common remedy for both statistical problems: nonlinear shrinkage of the eigenvalues of the sample covariance matrix. The optimal nonlinear shrinkage formula depends on unknown population quantities and is thus not available. It is, however, possible to consistently estimate an oracle nonlinear shrinkage, which is motivated on asymptotic grounds. A key tool to this end is consistent estimation of the set of eigenvalues of the population covariance matrix (also known as the spectrum), an interesting and challenging problem in its own right. Extensive Monte Carlo simulations demonstrate that our methods have desirable finite-sample properties and outperform previous proposals. Comment: 40 pages, 8 figures, 5 tables, University of Zurich, Department of Economics, Working Paper No. 105, Revised version, July 201
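The idea of shrinking sample eigenvalues while keeping the sample eigenvectors can be sketched as follows. This is only an illustration under stated assumptions: the helper names `shrink_eigenvalues` and `linear_toward_mean` are hypothetical, and the shrinkage rule used here is a simple linear pull toward the mean eigenvalue with an arbitrary intensity of 0.5. It is not the oracle nonlinear shrinkage formula the abstract refers to, which requires an estimate of the population spectrum.

```python
import numpy as np

def shrink_eigenvalues(S, shrink):
    """Keep the eigenvectors of S and replace its eigenvalues by shrink(eigenvalues)."""
    eigvals, eigvecs = np.linalg.eigh(S)      # S is symmetric, so eigh applies
    new_vals = shrink(eigvals)
    return (eigvecs * new_vals) @ eigvecs.T   # V diag(new_vals) V^T

def linear_toward_mean(eigvals, intensity=0.5):
    """Placeholder rule (an assumption): pull every eigenvalue toward the mean."""
    return (1 - intensity) * eigvals + intensity * eigvals.mean()

# Dimension larger than sample size, so the sample covariance matrix is singular.
rng = np.random.default_rng(0)
n, p = 40, 60
X = rng.normal(size=(n, p))
S = X.T @ X / n

S_shrunk = shrink_eigenvalues(S, linear_toward_mean)
min_eig_before = np.linalg.eigvalsh(S).min()
min_eig_after = np.linalg.eigvalsh(S_shrunk).min()
```

In this setting the smallest sample eigenvalue is numerically zero, while the shrunk matrix is positive definite and has the same trace as the sample covariance matrix. A nonlinear rule would be plugged in through the `shrink` argument.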

    Inference in Linear Regression Models with Many Covariates and Heteroskedasticity

    The linear regression model is widely used in empirical work in Economics, Statistics, and many other disciplines. Researchers often include many covariates in their linear model specification in an attempt to control for confounders. We give inference methods that allow for many covariates and heteroskedasticity. Our results are obtained using high-dimensional approximations, where the number of included covariates is allowed to grow as fast as the sample size. We find that all of the usual versions of Eicker-White heteroskedasticity consistent standard error estimators for linear models are inconsistent under these asymptotics. We then propose a new heteroskedasticity consistent standard error formula that is fully automatic and robust to both (conditional) heteroskedasticity of unknown form and the inclusion of possibly many covariates. We apply our findings to three settings: parametric linear models with many covariates, linear panel models with many fixed effects, and semiparametric semi-linear models with many technical regressors. Simulation evidence consistent with our theoretical results is also provided. The proposed methods are also illustrated with an empirical application.

    Distributed linear regression by averaging

    Distributed statistical learning problems arise commonly when dealing with large datasets. In this setup, datasets are partitioned over machines, which compute locally, and communicate short messages. Communication is often the bottleneck. In this paper, we study one-step and iterative weighted parameter averaging in statistical linear models under data parallelism. We do linear regression on each machine, send the results to a central server, and take a weighted average of the parameters. Optionally, we iterate, sending back the weighted average and doing local ridge regressions centered at it. How does this work compared to doing linear regression on the full data? Here we study the performance loss in estimation, test error, and confidence interval length in high dimensions, where the number of parameters is comparable to the training data size. We find the performance loss in one-step weighted averaging, and also give results for iterative averaging. We also find that different problems are affected differently by the distributed framework. Estimation error and confidence interval length increase a lot, while prediction error increases much less. We rely on recent results from random matrix theory, where we develop a new calculus of deterministic equivalents as a tool of broader interest. Comment: V2 adds a new section on iterative averaging methods, adds applications of the calculus of deterministic equivalents, and reorganizes the paper
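A minimal sketch of the one-step procedure described above: fit ordinary least squares on each machine's share of the data, then take a weighted average of the local estimates. The function name `distributed_ols_average` is hypothetical, and the default weights (proportional to local sample size) are an assumption made for illustration, not the weights analyzed in the paper.

```python
import numpy as np

def distributed_ols_average(X, y, n_machines, weights=None):
    """One-step weighted averaging of local OLS estimates.

    Rows are split into n_machines contiguous blocks to mimic data parallelism.
    """
    X_parts = np.array_split(X, n_machines)
    y_parts = np.array_split(y, n_machines)
    # Local least squares fit on each block.
    local = [np.linalg.lstsq(Xk, yk, rcond=None)[0]
             for Xk, yk in zip(X_parts, y_parts)]
    if weights is None:
        # Assumption: weight each machine by its share of the observations.
        sizes = np.array([len(yk) for yk in y_parts], dtype=float)
        weights = sizes / sizes.sum()
    return np.average(np.stack(local), axis=0, weights=weights)

# Simulated linear model with Gaussian features and noise.
rng = np.random.default_rng(0)
n, p = 2000, 20
beta = rng.normal(size=p)
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)

beta_full = np.linalg.lstsq(X, y, rcond=None)[0]       # regression on the full data
beta_avg = distributed_ols_average(X, y, n_machines=4)  # averaged local regressions

err_full = np.sum((beta_full - beta) ** 2)
err_avg = np.sum((beta_avg - beta) ** 2)
```

Comparing `err_avg` with `err_full` across different values of `n_machines` and `p` gives a rough empirical view of the estimation loss the abstract discusses.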

    Advances in forecast evaluation

    This paper surveys recent developments in the evaluation of point forecasts. Taking West’s (2006) survey as a starting point, we briefly cover the state of the literature as of the time of West’s writing. We then focus on recent developments, including advancements in the evaluation of forecasts at the population level (based on true, unknown model coefficients), the evaluation of forecasts in the finite sample (based on estimated model coefficients), and the evaluation of conditional versus unconditional forecasts. We present original results in a few subject areas: the optimization of power in determining the split of a sample into in-sample and out-of-sample portions; whether the accuracy of inference in evaluation of multistep forecasts can be improved with the judicious choice of HAC estimator (it can); and the extension of West’s (1996) theory results for population-level, unconditional forecast evaluation to the case of conditional forecast evaluation. Keywords: Forecasting; Time-series analysis

    Surprises in High-Dimensional Ridgeless Least Squares Interpolation

    Interpolators -- estimators that achieve zero training error -- have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. In this paper, we study minimum $\ell_2$ norm ("ridgeless") interpolation in high-dimensional least squares regression. We consider two different models for the feature distribution: a linear model, where the feature vectors $x_i \in \mathbb{R}^p$ are obtained by applying a linear transform to a vector of i.i.d. entries, $x_i = \Sigma^{1/2} z_i$ (with $z_i \in \mathbb{R}^p$); and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network, $x_i = \varphi(W z_i)$ (with $z_i \in \mathbb{R}^d$, $W \in \mathbb{R}^{p \times d}$ a matrix of i.i.d. entries, and $\varphi$ an activation function acting componentwise on $W z_i$). We recover -- in a precise quantitative way -- several phenomena that have been observed in large-scale neural networks and kernel machines, including the "double descent" behavior of the prediction risk, and the potential benefits of overparametrization. Comment: 68 pages; 16 figures. This revision contains non-asymptotic version of earlier results, and results for general coefficient
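As a small illustration of minimum $\ell_2$ norm interpolation, the sketch below uses the Moore-Penrose pseudoinverse, which returns the least squares solution of smallest Euclidean norm. The setup is an assumption chosen for illustration: isotropic Gaussian features (the linear model with $\Sigma = I$), $p > n$, and arbitrary noise and signal scales. The function name `min_norm_least_squares` is hypothetical.

```python
import numpy as np

def min_norm_least_squares(X, y):
    """Least squares solution of smallest Euclidean norm, via the pseudoinverse."""
    return np.linalg.pinv(X) @ y

# Overparametrized setting: more features than observations.
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.normal(size=(n, p))                 # isotropic features (Sigma = I)
beta = rng.normal(size=p) / np.sqrt(p)
y = X @ beta + 0.1 * rng.normal(size=n)

beta_hat = min_norm_least_squares(X, y)
train_residual = np.linalg.norm(X @ beta_hat - y)   # numerically zero: an interpolator

# Any other interpolator differs from beta_hat by a null-space vector of X
# and therefore has a larger norm.
v = rng.normal(size=p)
v_null = v - np.linalg.pinv(X) @ (X @ v)    # projection of v onto the null space of X
other_interpolator = beta_hat + v_null
```

Because `X` has full row rank here, `beta_hat` fits the training data exactly, and `other_interpolator` does too but with a strictly larger norm.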

    Nonparametric estimation of scalar diffusions based on low frequency data

    We study the problem of estimating the coefficients of a diffusion (X_t,t\geq 0); the estimation is based on discrete data X_{n\Delta},n=0,1,...,N. The sampling frequency \Delta^{-1} is constant, and asymptotics are taken as the number N of observations tends to infinity. We prove that the problem of estimating both the diffusion coefficient (the volatility) and the drift in a nonparametric setting is ill-posed: the minimax rates of convergence for Sobolev constraints and squared-error loss coincide with those of a first- and a second-order linear inverse problem, respectively. To ensure ergodicity and limit technical difficulties, we restrict ourselves to scalar diffusions living on a compact interval with reflecting boundary conditions. Our approach is based on the spectral analysis of the associated Markov semigroup. A rate-optimal estimation of the coefficients is obtained via the nonparametric estimation of an eigenvalue-eigenfunction pair of the transition operator of the discrete time Markov chain (X_{n\Delta},n=0,1,...,N) in a suitable Sobolev norm, together with an estimation of its invariant density. Comment: Published at http://dx.doi.org/10.1214/009053604000000797 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Estimating Euler equations

    In this paper we consider conditions under which the estimation of a log-linearized Euler equation for consumption yields consistent estimates of preference parameters. When utility is isoelastic and a sample covering a long time period is available, consistent estimates are obtained from the log-linearized Euler equation when the innovations to the conditional variance of consumption growth are uncorrelated with the instruments typically used in estimation. We perform a Monte Carlo experiment, consisting of solving and simulating a simple life cycle model under uncertainty, and show that in most situations, the estimates obtained from the log-linearized equation are not systematically biased. This is true even when we introduce heteroscedasticity in the process generating income. The only exception is when discount rates are very high (e.g. 47% per year). This problem arises because consumers are nearly always close to the maximum borrowing limit: the estimation bias is unrelated to the linearization and estimates using nonlinear GMM are as bad. Across all our situations, estimation using a log-linearized Euler equation does better than nonlinear GMM despite the absence of measurement error. Finally, we plot life cycle profiles for the variance of consumption growth, which, except when the discount factor is very high, is remarkably flat. This implies that claims that demographic variables in log-linearized Euler equations capture changes in the variance of consumption growth are unwarranted.