Power spectrum and correlation function errors: Poisson vs. Gaussian shot noise
Poisson distributed shot noise is normally considered in the Gaussian limit
in cosmology. However, if the shot noise is large enough and the correlation
function/power spectrum conspires, the Gaussian approximation mis-estimates the
errors and their covariance significantly. The power spectrum, even for
initially Gaussian densities, acquires cross-correlations which can be large,
while the change in the correlation function error matrix is diagonal except at
zero separation. Two and three dimensional power law correlation function and
power spectrum examples are given. These corrections appear to have a large
effect when applied to galaxy clusters, e.g. for SZ selected galaxy clusters in
2 dimensions. This can increase the error estimates for cosmological parameter
estimation and consequently affect survey strategies, as the corrections are
minimized for surveys which are deep and narrow rather than wide and shallow.
In addition, a rewriting of the error matrix for the power spectrum/correlation
function is given which eliminates most of the Bessel function dependence (in
two dimensions) and all of it (in three dimensions), which makes the
calculation of the error matrix more tractable. This applies even when the shot
noise is in the (usual) Gaussian limit.
Comment: 22 pages, 4 figures; 3 equations corrected and figures updated, results unchanged.
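The key non-Gaussian effect can be checked numerically in a toy setting (this is an illustrative sketch, not the paper's calculation): for Poisson counts N with mean lam, the variance of the squared fluctuation (N - lam)^2 is lam + 2*lam^2, whereas a Gaussian variable of the same variance gives only 2*lam^2. The relative excess, 1/(2*lam), is large exactly when the shot noise is large.

```python
# Illustrative check (not the paper's calculation): compare the variance of
# the squared fluctuation (N - lam)^2 for Poisson counts against the
# Gaussian approximation with matched variance.
import math
import random

def poisson(lam, rng):
    """Knuth's algorithm for a single Poisson draw (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def mc_variance(lam, n=200_000, seed=0):
    """Monte Carlo estimate of Var[(N - lam)^2] for N ~ Poisson(lam)."""
    rng = random.Random(seed)
    samples = [(poisson(lam, rng) - lam) ** 2 for _ in range(n)]
    mean = sum(samples) / n
    return sum((s - mean) ** 2 for s in samples) / n

lam = 2.0
mc = mc_variance(lam)               # Monte Carlo Var[(N - lam)^2]
poisson_pred = lam + 2 * lam ** 2   # exact Poisson fourth-moment result
gauss_pred = 2 * lam ** 2           # Gaussian approximation
print(mc, poisson_pred, gauss_pred)
```

The Monte Carlo estimate lands near the Poisson prediction (10 for lam = 2) rather than the Gaussian one (8), showing the Gaussian limit underestimates the error when mean counts are low.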
Estimation and Inference about Heterogeneous Treatment Effects in High-Dimensional Dynamic Panels
This paper provides estimation and inference methods for a large number of
heterogeneous treatment effects in a panel data setting with many potential
controls. We assume that heterogeneous treatment is the result of a
low-dimensional base treatment interacting with many heterogeneity-relevant
controls, but only a small number of these interactions have a non-zero
heterogeneous effect relative to the average. The method has two stages. First,
we use modern machine learning techniques to estimate the expectation functions
of the outcome and base treatment given controls and take the residuals of each
variable. Second, we estimate the treatment effect by l1-regularized regression
(i.e., Lasso) of the outcome residuals on the base treatment residuals
interacted with the controls. We debias this estimator to conduct pointwise
inference about a single coefficient of treatment effect vector and
simultaneous inference about the whole vector. To account for the unobserved
unit effects inherent in panel data, we use an extension of correlated random
effects approach of Mundlak (1978) and Chamberlain (1982) to a high-dimensional
setting. As an empirical application, we estimate a large number of
heterogeneous demand elasticities based on a novel dataset from a major
European food distributor.
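The two stages can be sketched on simulated data (a toy version only: the paper's estimator also debiases the Lasso and handles panel unit effects via the Mundlak/Chamberlain device, both omitted here, and simple OLS stands in for the machine learning learners in stage one):

```python
# Toy sketch of the two-stage idea: (1) residualize outcome y and base
# treatment d on controls; (2) l1-penalized (Lasso) regression of the outcome
# residuals on the treatment residuals interacted with the controls.
import random

def solve3(a, b):
    """Gauss-Jordan elimination for a 3x3 system a @ x = b."""
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def ols_residuals(y, cols):
    """Residuals from OLS of y on two control columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [c[i] for c in cols] for i in range(n)]
    a = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(3)]
         for j in range(3)]
    b = [sum(X[i][j] * y[i] for i in range(n)) for j in range(3)]
    beta = solve3(a, b)
    return [y[i] - sum(beta[j] * X[i][j] for j in range(3)) for i in range(n)]

def lasso(X, y, lam, sweeps=200):
    """Coordinate-descent Lasso via soft-thresholding (no intercept)."""
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = sum(X[i][j] * (y[i] - sum(X[i][k] * beta[k]
                                            for k in range(p) if k != j))
                      for i in range(n)) / n
            zj = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = (max(abs(rho) - lam, 0.0)
                       * (1.0 if rho > 0 else -1.0)) / zj
    return beta

rng = random.Random(1)
n = 400
x1 = [rng.uniform(-1, 1) for _ in range(n)]
x2 = [rng.uniform(-1, 1) for _ in range(n)]
d = [x1[i] + rng.gauss(0, 1) for i in range(n)]          # base treatment
y = [1.0 * d[i] + 0.8 * d[i] * x1[i] + 2.0 * x2[i]       # only x1 modifies
     + rng.gauss(0, 0.5) for i in range(n)]              # the effect

# Stage 1: partial out the controls (OLS standing in for ML learners).
y_res = ols_residuals(y, [x1, x2])
d_res = ols_residuals(d, [x1, x2])

# Stage 2: Lasso of outcome residuals on treatment residual + interactions.
Z = [[d_res[i], d_res[i] * x1[i], d_res[i] * x2[i]] for i in range(n)]
theta = lasso(Z, y_res, lam=0.05)
print(theta)  # base effect and x1-interaction kept; x2-interaction shrunk
```

The Lasso keeps the base effect and the genuine x1 interaction (with the usual shrinkage bias, which is what the debiasing step in the paper corrects) while the irrelevant x2 interaction is shrunk to zero.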
Inference for High-Dimensional Sparse Econometric Models
This article is about estimation and inference methods for high dimensional
sparse (HDS) regression models in econometrics. High dimensional sparse models
arise in situations where many regressors (or series terms) are available and
the regression function is well-approximated by a parsimonious, yet unknown set
of regressors. The latter condition makes it possible to estimate the entire
regression function effectively by searching for approximately the right set of
regressors. We discuss methods for identifying this set of regressors and
estimating their coefficients based on l1-penalization and describe key
theoretical results. In order to capture realistic practical situations, we
expressly allow for imperfect selection of regressors and study the impact of
this imperfect selection on estimation and inference results. We focus the main
part of the article on the use of HDS models and methods in the instrumental
variables model and the partially linear model. We present a set of novel
inference results for these models and illustrate their use with applications
to returns to schooling and growth regression.
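The selection mechanism behind l1-penalization is easiest to see in the simplest possible setting, the orthonormal design, where the Lasso solution reduces to exact soft-thresholding of the least-squares coefficients (a toy illustration only; the article's IV and partially linear estimators are considerably more elaborate):

```python
# With an orthonormal design, the Lasso solution is soft-thresholding of the
# least-squares coefficients: pure-noise coordinates are zeroed out while
# the few large signal coordinates survive (shrunk toward zero).
import math
import random

def soft_threshold(z, lam):
    return math.copysign(max(abs(z) - lam, 0.0), z)

rng = random.Random(42)
p, n_signal, signal = 200, 5, 8.0
theta = [signal] * n_signal + [0.0] * (p - n_signal)  # sparse truth
obs = [t + rng.gauss(0, 1) for t in theta]            # noisy LS coefficients

lam = math.sqrt(2 * math.log(p))                      # universal threshold
est = [soft_threshold(z, lam) for z in obs]

selected = [j for j, e in enumerate(est) if e != 0.0]
print(selected)
```

With the universal threshold, all five signal coordinates are retained and at most a handful of the 195 noise coordinates slip through, which is the "approximately the right set of regressors" behavior the abstract describes; imperfect selection of exactly this kind is what the article's theory accommodates.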
Transposable regularized covariance models with an application to missing data imputation
Missing data estimation is an important challenge with high-dimensional data
arranged in the form of a matrix. Typically this data matrix is transposable,
meaning that either the rows, columns or both can be treated as features. To
model transposable data, we present a modification of the matrix-variate
normal, the mean-restricted matrix-variate normal, in which the rows and
columns each have a separate mean vector and covariance matrix. By placing
additive penalties on the inverse covariance matrices of the rows and columns,
these so-called transposable regularized covariance models allow for maximum
likelihood estimation of the mean and nonsingular covariance matrices. Using
these models, we formulate EM-type algorithms for missing data imputation in
both the multivariate and transposable frameworks. We present theoretical
results exploiting the structure of our transposable models that allow these
models and imputation methods to be applied to high-dimensional data.
Simulations and results on microarray data and the Netflix data show that these
imputation techniques often outperform existing methods and offer a greater
degree of flexibility.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS314 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
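The EM mechanics underlying such imputation can be sketched in the plain multivariate case (a minimal sketch only: bivariate normal data with y missing at random; the paper's mean-restricted matrix-variate model with penalized row and column covariances is far more general):

```python
# Minimal EM-type imputation sketch: bivariate normal (x, y), some y missing
# at random. E-step: fill each missing y with its conditional mean and add
# the conditional variance to the second moments. M-step: update moments.
import random

rng = random.Random(7)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y_full = [0.8 * xi + rng.gauss(0, 0.6) for xi in x]
missing = [rng.random() < 0.3 for _ in range(n)]      # ~30% of y missing
y = [None if m else yi for yi, m in zip(y_full, missing)]

# Initialize the moments from complete cases.
obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
mx = sum(x) / n
my = sum(yi for _, yi in obs) / len(obs)
sxx = sum((xi - mx) ** 2 for xi in x) / n
syy = sum((yi - my) ** 2 for _, yi in obs) / len(obs)
sxy = sum((xi - mx) * (yi - my) for xi, yi in obs) / len(obs)

for _ in range(50):                                   # EM iterations
    # E-step: conditional mean and variance of each missing y given x.
    slope = sxy / sxx
    cvar = syy - sxy * slope                          # Var(y | x)
    yhat = [yi if yi is not None else my + slope * (xi - mx)
            for xi, yi in zip(x, y)]
    # M-step: update moments; missing entries add cvar to second moments.
    my = sum(yhat) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, yhat)) / n
    syy = (sum((yi - my) ** 2 for yi in yhat) + cvar * sum(missing)) / n
print(my, sxy, syy)  # should approach the MLE: ~0, ~0.8, ~1.0
```

Plugging in the conditional mean alone would understate the variance of y; adding the conditional variance cvar for each imputed entry is what makes this a proper EM update rather than naive regression imputation.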