A blocking and regularization approach to high dimensional realized covariance estimation
We introduce a regularization and blocking estimator for well-conditioned high-dimensional daily covariances using high-frequency data. Using the Barndorff-Nielsen, Hansen, Lunde, and Shephard (2008a) kernel estimator, we estimate the covariance matrix block-wise and regularize it. A data-driven grouping of assets of similar trading frequency reduces the data loss caused by refresh-time sampling. In an extensive simulation study mimicking the empirical features of the S&P 1500 universe, we show that the 'RnB' estimator yields efficiency gains and outperforms competing kernel estimators across varying liquidity settings, noise-to-signal ratios, and dimensions. An empirical application to forecasting daily covariances of the S&P 500 index confirms the simulation results.
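A minimal numpy sketch of the blocking-and-regularization idea described above: assets are split into liquidity groups, each covariance block is estimated from data synchronized only over the assets it involves, and the assembled matrix is regularized by flooring its eigenvalues. The function names, the plain sample covariance standing in for the realized kernel estimator, and eigenvalue flooring as the regularization step are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def eigenvalue_floor(cov, floor=1e-4):
    """Regularize a symmetric matrix by flooring its eigenvalues so the
    result is well-conditioned (one common choice; the paper's exact
    regularization scheme may differ)."""
    vals, vecs = np.linalg.eigh(cov)
    return vecs @ np.diag(np.maximum(vals, floor)) @ vecs.T

def blocked_covariance(returns, groups, floor=1e-4):
    """Assemble a covariance matrix block-wise over liquidity groups, then
    regularize it.

    returns : (T, N) array of synchronized returns; a stand-in for the
              refresh-time-sampled high-frequency returns in the paper,
              where each block would use its own refresh-time grid.
    groups  : list of integer index arrays, one per liquidity group.
    """
    N = returns.shape[1]
    cov = np.zeros((N, N))
    for a, gi in enumerate(groups):
        # Diagonal block: covariance within one liquidity group.
        cov[np.ix_(gi, gi)] = np.cov(returns[:, gi], rowvar=False)
        for gj in groups[a + 1:]:
            # Off-diagonal block: cross-covariance between two groups,
            # estimated from data synchronized only over those assets.
            cols = np.concatenate([gi, gj])
            block = np.cov(returns[:, cols], rowvar=False)
            cov[np.ix_(gi, gj)] = block[:len(gi), len(gi):]
            cov[np.ix_(gj, gi)] = block[len(gi):, :len(gi)]
    return eigenvalue_floor(cov, floor)
```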
Estimation with Norm Regularization
Analysis of non-asymptotic estimation error and structured statistical
recovery based on norm regularized regression, such as Lasso, needs to consider
four aspects: the norm, the loss function, the design matrix, and the noise
model. This paper presents generalizations of such estimation error analysis on
all four aspects compared to the existing literature. We characterize the
restricted error set where the estimation error vector lies, establish
relations between error sets for the constrained and regularized problems, and
present an estimation error bound applicable to any norm. Precise
characterizations of the bound are presented for isotropic as well as
anisotropic subGaussian design matrices, subGaussian noise models, and convex
loss functions, including least squares and generalized linear models. Generic
chaining and associated results play an important role in the analysis. A key
result from the analysis is that the sample complexity of all such estimators
depends on the Gaussian width of a spherical cap corresponding to the
restricted error set. Further, once the number of samples crosses the
required sample complexity, the estimation error decreases as c/√n, where c depends on the Gaussian width of the unit norm ball.
Comment: Fixed technical issues. Generalized some results.
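A schematic rendering, in standard notation (assumed here, not taken verbatim from the paper), of the objects the abstract refers to: the generic norm-regularized estimator and the shape of the resulting error bound.

```latex
% Norm-regularized estimator: loss \mathcal{L}, norm R, regularization level \lambda_n.
\[
  \hat{\theta} \in \operatorname*{arg\,min}_{\theta \in \mathbb{R}^p}
    \Bigl\{ \mathcal{L}(\theta; Z_1^n) + \lambda_n\, R(\theta) \Bigr\}
\]
% Once the sample size n exceeds a threshold set by the Gaussian width w(A) of a
% spherical cap A of the restricted error set, the error obeys a bound of the form
\[
  \|\hat{\theta} - \theta^*\|_2 \;\lesssim\; \frac{c}{\sqrt{n}},
  \qquad c \text{ governed by the Gaussian width of the unit ball of } R.
\]
```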
Asymptotic Analysis of MAP Estimation via the Replica Method and Applications to Compressed Sensing
The replica method is a non-rigorous but well-known technique from
statistical physics used in the asymptotic analysis of large, random, nonlinear
problems. This paper applies the replica method, under the assumption of
replica symmetry, to study estimators that are maximum a posteriori (MAP) under
a postulated prior distribution. It is shown that with random linear
measurements and Gaussian noise, the replica-symmetric prediction of the
asymptotic behavior of the postulated MAP estimate of an n-dimensional vector
"decouples" as n scalar postulated MAP estimators. The result is based on
applying a hardening argument to the replica analysis of postulated posterior
mean estimators of Tanaka and of Guo and Verdu.
The replica-symmetric postulated MAP analysis can be readily applied to many
estimators used in compressed sensing, including basis pursuit, lasso, linear
estimation with thresholding, and zero norm-regularized estimation. In the case
of lasso estimation the scalar estimator reduces to a soft-thresholding
operator, and for zero norm-regularized estimation it reduces to a
hard-thresholding operator. Among other benefits, the replica method provides a
computationally-tractable method for precisely predicting various performance
metrics including mean-squared error and sparsity pattern recovery probability.
Comment: 22 pages; added details on the replica symmetry assumption.
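The two scalar operators named above, as a short Python sketch. The threshold level t is a placeholder here; in the replica-symmetric analysis it is determined by the effective noise level of the decoupled scalar channel.

```python
import numpy as np

def soft_threshold(r, t):
    """Scalar soft-thresholding: the form the decoupled lasso estimator takes.
    Shrinks each entry toward zero by t, zeroing anything with magnitude <= t."""
    return np.sign(r) * np.maximum(np.abs(r) - t, 0.0)

def hard_threshold(r, t):
    """Scalar hard-thresholding: the form the decoupled zero-norm-regularized
    estimator takes. Keeps an entry unchanged only if its magnitude exceeds t."""
    return np.where(np.abs(r) > t, r, 0.0)

# Example: shrinkage versus keep-or-kill behavior on the same inputs.
r = np.array([-2.0, -0.3, 0.1, 0.8, 3.0])
print(soft_threshold(r, 0.5))
print(hard_threshold(r, 0.5))
```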
High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity
Although the standard formulations of prediction problems involve
fully-observed and noiseless data drawn in an i.i.d. manner, many applications
involve noisy and/or missing data, possibly involving dependence, as well. We
study these issues in the context of high-dimensional sparse linear regression,
and propose novel estimators for the cases of noisy, missing and/or dependent
data. Many standard approaches to noisy or missing data, such as those using
the EM algorithm, lead to optimization problems that are inherently nonconvex,
and it is difficult to establish theoretical guarantees on practical
algorithms. While our approach also involves optimizing nonconvex programs, we
are able to both analyze the statistical error associated with any global
optimum, and more surprisingly, to prove that a simple algorithm based on
projected gradient descent will converge in polynomial time to a small
neighborhood of the set of all global minimizers. On the statistical side, we
provide nonasymptotic bounds that hold with high probability for the cases of
noisy, missing and/or dependent data. On the computational side, we prove that
under the same types of conditions required for statistical consistency, the
projected gradient descent algorithm is guaranteed to converge at a geometric
rate to a near-global minimizer. We illustrate these theoretical predictions
with simulations, showing close agreement with the predicted scalings.
Comment: Published at http://dx.doi.org/10.1214/12-AOS1018 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
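A minimal numpy sketch of the kind of nonconvex program and algorithm described above, for the missing-data case. It assumes entries are missing completely at random with a known observation probability rho, with zeros standing in for missing values; unbiased surrogates for the covariance and cross-moment are plugged into a quadratic loss, which is then minimized by projected gradient descent over an l1-ball. The function names, the MCAR assumption, and the fixed step size are illustrative; the paper's estimators and step-size conditions are more general.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection onto the l1-ball of the given radius
    (sorting-based algorithm)."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    ks = np.arange(1, len(u) + 1)
    idx = np.nonzero(u * ks > css - radius)[0][-1]
    tau = (css[idx] - radius) / (idx + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def missing_data_regression(Z, y, rho, radius, step=0.05, iters=1000):
    """Projected gradient descent on a corrected quadratic surrogate.

    Z      : (n, p) design with missing entries replaced by 0 (zero-mean covariates).
    rho    : known probability that an entry is observed (MCAR assumption).
    radius : l1-ball constraint radius.

    Gamma and gamma are unbiased surrogates for the covariance of the full
    covariates and for their cross-moment with y; Gamma may be indefinite,
    which is what makes the program nonconvex.
    """
    n, p = Z.shape
    ZtZ = Z.T @ Z
    Gamma = ZtZ / (n * rho ** 2)
    np.fill_diagonal(Gamma, np.diag(ZtZ) / (n * rho))
    gamma = Z.T @ y / (n * rho)
    theta = np.zeros(p)
    for _ in range(iters):
        grad = Gamma @ theta - gamma
        theta = project_l1_ball(theta - step * grad, radius)
    return theta
```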
Fast global convergence of gradient methods for high-dimensional statistical recovery
Many statistical M-estimators are based on convex optimization problems
formed by the combination of a data-dependent loss function with a norm-based
regularizer. We analyze the convergence rates of projected gradient and
composite gradient methods for solving such problems, working within a
high-dimensional framework that allows the data dimension to grow with
(and possibly exceed) the sample size. This high-dimensional
structure precludes the usual global assumptions---namely, strong convexity and
smoothness conditions---that underlie much of classical optimization analysis.
We define appropriately restricted versions of these conditions, and show that
they are satisfied with high probability for various statistical models. Under
these conditions, our theory guarantees that projected gradient descent has a
globally geometric rate of convergence up to the statistical precision
of the model, meaning the typical distance between the true unknown parameter
θ* and an optimal solution θ̂. This result is substantially
sharper than previous convergence results, which yielded sublinear convergence,
or linear convergence only up to the noise level. Our analysis applies to a
wide range of M-estimators and statistical models, including sparse linear
regression using Lasso (ℓ1-regularized regression); group Lasso for block
sparsity; log-linear models with regularization; low-rank matrix recovery using
nuclear norm regularization; and matrix decomposition. Overall, our analysis
reveals interesting connections between statistical precision and computational
efficiency in high-dimensional estimation.
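As a concrete instance of the composite gradient methods analyzed above, here is a minimal ISTA-style sketch for the Lasso case: a gradient step on the smooth least-squares loss followed by soft-thresholding, the proximal map of the l1 penalty. Under the restricted strong convexity and smoothness conditions the abstract refers to, iterates of this type contract geometrically until they reach the statistical precision of the model; the step-size choice and iteration count here are illustrative.

```python
import numpy as np

def ista_lasso(X, y, lam, iters=300):
    """Composite gradient (ISTA) for the Lasso:
    minimize (1/2n) * ||y - X theta||_2^2 + lam * ||theta||_1."""
    n, p = X.shape
    # Step size 1/L, with L an upper bound on the Lipschitz constant of the gradient.
    L = np.linalg.norm(X, 2) ** 2 / n
    theta = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / n                             # gradient of the smooth loss
        z = theta - grad / L                                         # gradient step
        theta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)    # prox of (lam/L) * ||.||_1
    return theta
```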