Model Selection Principles in Misspecified Models
Model selection is of fundamental importance to high dimensional modeling
featured in many contemporary applications. Classical principles of model
selection include the Kullback-Leibler divergence principle and the Bayesian
principle, which lead to the Akaike information criterion and Bayesian
information criterion when models are correctly specified. Yet model
misspecification is unavoidable when we have no knowledge of the true model or
when we have the correct family of distributions but miss some true predictor.
In this paper, we propose a family of semi-Bayesian principles for model
selection in misspecified models, which combine the strengths of the two
well-known principles. We derive asymptotic expansions of the semi-Bayesian
principles in misspecified generalized linear models, which give the new
semi-Bayesian information criteria (SIC). A specific form of SIC admits a
natural decomposition into the negative maximum quasi-log-likelihood, a penalty
on model dimensionality, and a penalty that directly reflects model
misspecification.
Numerical studies demonstrate the advantage of the newly proposed SIC
methodology for model selection in both correctly specified and misspecified
models.
Comment: 25 pages, 6 tables
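As a rough orienting sketch (not the paper's exact formulas), the classical
criteria and the schematic structure of the SIC described above can be written as

    AIC(M) = -2\,\hat{\ell}_n(M) + 2\,|M|,
    BIC(M) = -2\,\hat{\ell}_n(M) + |M|\,\log n,
    SIC(M) \approx -\hat{Q}_n(M) + \gamma_1(|M|, n) + \gamma_2(M),

where \hat{\ell}_n(M) is the maximum log-likelihood of candidate model M,
\hat{Q}_n(M) its maximum quasi-log-likelihood, |M| its dimension, and
\gamma_1, \gamma_2 are placeholders for the dimensionality and misspecification
penalties whose precise forms are derived in the paper.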
Asymptotic analysis of the role of spatial sampling for covariance parameter estimation of Gaussian processes
Covariance parameter estimation of Gaussian processes is analyzed in an
asymptotic framework. The spatial sampling is a randomly perturbed regular grid
and its deviation from the perfect regular grid is controlled by a single
scalar regularity parameter. Consistency and asymptotic normality are proved
for the Maximum Likelihood and Cross Validation estimators of the covariance
parameters. The asymptotic covariance matrices of the covariance parameter
estimators are deterministic functions of the regularity parameter. By means of
an exhaustive study of the asymptotic covariance matrices, it is shown that the
estimation is improved when the regular grid is strongly perturbed. Hence, an
asymptotic confirmation is given to the commonly accepted fact that using
groups of observation points with small spacing is beneficial to covariance
function estimation. Finally, the prediction error, using a consistent
estimator of the covariance parameters, is analyzed in detail.
Comment: 47 pages. A supplementary material (pdf) is available in the arXiv source.
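A minimal numerical sketch of the setup described above, assuming an exponential
covariance and a uniformly perturbed regular grid; the function names and the
size of the perturbation are illustrative choices, not the paper's:

    import numpy as np
    from scipy.optimize import minimize
    from scipy.spatial.distance import cdist

    rng = np.random.default_rng(0)

    def perturbed_grid(m, eps):
        # m x m regular grid on [0, m)^2, each point shifted by at most eps/2
        g = np.stack(np.meshgrid(np.arange(m), np.arange(m)), -1)
        g = g.reshape(-1, 2).astype(float)
        return g + rng.uniform(-eps / 2, eps / 2, size=g.shape)

    def exp_cov(X, sigma2, rho):
        # exponential covariance sigma2 * exp(-d / rho), small nugget for stability
        return sigma2 * np.exp(-cdist(X, X) / rho) + 1e-8 * np.eye(len(X))

    def neg_log_lik(log_params, X, y):
        # Gaussian negative log-likelihood, up to an additive constant
        sigma2, rho = np.exp(log_params)  # log-parameterisation keeps both positive
        L = np.linalg.cholesky(exp_cov(X, sigma2, rho))
        alpha = np.linalg.solve(L, y)
        return 0.5 * (alpha @ alpha) + np.log(np.diag(L)).sum()

    X = perturbed_grid(m=15, eps=0.8)  # eps plays the role of the regularity parameter
    y = np.linalg.cholesky(exp_cov(X, 1.0, 2.0)) @ rng.standard_normal(len(X))
    fit = minimize(neg_log_lik, x0=np.log([0.5, 1.0]), args=(X, y), method="Nelder-Mead")
    print("ML estimates (sigma2, rho):", np.exp(fit.x))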
Distributed linear regression by averaging
Distributed statistical learning problems arise commonly when dealing with
large datasets. In this setup, datasets are partitioned over machines, which
compute locally, and communicate short messages. Communication is often the
bottleneck. In this paper, we study one-step and iterative weighted parameter
averaging in statistical linear models under data parallelism. We do linear
regression on each machine, send the results to a central server, and take a
weighted average of the parameters. Optionally, we iterate, sending back the
weighted average and doing local ridge regressions centered at it. How does
this compare to doing linear regression on the full data? Here we study
the performance loss in estimation, test error, and confidence interval length
in high dimensions, where the number of parameters is comparable to the
training data size. We characterize the performance loss of one-step weighted
averaging, and also give results for iterative averaging. We find that
different problems are affected differently by the distributed framework.
Estimation error and confidence interval length increase substantially, while
prediction error increases much less. We rely on recent results from random
matrix theory, where we develop a new calculus of deterministic equivalents as
a tool of broader interest.
Comment: V2 adds a new section on iterative averaging methods, adds applications of the calculus of deterministic equivalents, and reorganizes the paper.
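A minimal sketch of the one-step weighted-averaging scheme described above; the
weighting by local sample size and the problem dimensions are illustrative
assumptions, not the paper's exact setup:

    import numpy as np

    rng = np.random.default_rng(1)
    n, p, k = 6000, 50, 4  # total samples, parameters, machines
    X = rng.standard_normal((n, p))
    beta = rng.standard_normal(p)
    y = X @ beta + rng.standard_normal(n)

    def ols(Xi, yi):
        return np.linalg.lstsq(Xi, yi, rcond=None)[0]

    # local OLS on each machine, then a weighted average at the central server
    parts = np.array_split(np.arange(n), k)
    local = [ols(X[idx], y[idx]) for idx in parts]
    w = np.array([len(idx) for idx in parts], dtype=float)
    w /= w.sum()  # weights proportional to local sample size
    beta_avg = sum(wi * bi for wi, bi in zip(w, local))

    beta_full = ols(X, y)  # full-data OLS for comparison
    print("distributed vs full estimation error:",
          np.linalg.norm(beta_avg - beta), np.linalg.norm(beta_full - beta))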
Panel Data Tests Of PPP: A Critical Overview
This paper reviews recent developments in the analysis of non-stationary panels, focusing on empirical applications of panel unit root and cointegration tests in the context of PPP. It highlights various drawbacks of existing methods. First, unit root tests suffer from severe size distortions in the presence of negative moving average errors. Second, the common demeaning procedure to correct for the bias resulting from homogeneous cross-sectional dependence is not effective; more worryingly, it introduces cross-correlation when it is not already present. Third, standard corrections for the case of heterogeneous cross-sectional dependence do not generally produce consistent estimators. Fourth, if there is between-group correlation in the innovations, the SURE estimator is affected by similar problems to FGLS methods, and does not necessarily outperform OLS. Finally, cointegration between different groups in the panel could also be a source of size distortions. We offer some empirical guidelines to deal with these problems, but conclude that panel methods are unlikely to solve the PPP puzzle.
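A small Monte Carlo sketch of the first drawback noted above (over-rejection of
a true unit root under negative moving-average errors); the sample size, lag
choice, and MA coefficient are illustrative, not taken from the paper:

    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    rng = np.random.default_rng(2)
    T, reps, theta = 100, 500, -0.8  # negative MA(1) coefficient in the errors

    rejections = 0
    for _ in range(reps):
        e = rng.standard_normal(T + 1)
        u = e[1:] + theta * e[:-1]   # MA(1) innovations
        y = np.cumsum(u)             # true unit root, so H0 is true
        pval = adfuller(y, maxlag=1, regression="c", autolag=None)[1]
        rejections += pval < 0.05
    print("empirical size at nominal 5%:", rejections / reps)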
Foundational principles for large scale inference: Illustrations through correlation mining
When can reliable inference be drawn in the "Big Data" context? This paper
presents a framework for answering this fundamental question in the context of
correlation mining, with implications for general large scale inference. In
large scale data applications like genomics, connectomics, and eco-informatics
the dataset is often variable-rich but sample-starved: a regime where the
number of acquired samples (statistical replicates) is far fewer than the
number of observed variables (genes, neurons, voxels, or chemical
constituents). Much recent work has focused on understanding the computational
complexity of proposed methods for "Big Data." Sample complexity, however, has
received relatively less attention, especially in the setting where the sample
size is fixed and the dimension grows without bound. To
address this gap, we develop a unified statistical framework that explicitly
quantifies the sample complexity of various inferential tasks. Sampling regimes
can be divided into several categories: 1) the classical asymptotic regime
where the variable dimension is fixed and the sample size goes to infinity; 2)
the mixed asymptotic regime where both variable dimension and sample size go to
infinity at comparable rates; 3) the purely high dimensional asymptotic regime
where the variable dimension goes to infinity and the sample size is fixed.
Each regime has its niche, but only the last regime applies to exa-scale data
dimensions. We illustrate this high dimensional framework for the problem of
correlation mining, where it is the matrix of pairwise and partial correlations
among the variables that is of interest. We demonstrate various regimes of
correlation mining based on the unifying perspective of high dimensional
learning rates and sample complexity for different structured covariance models
and different inference tasks.
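A toy illustration of the sample-starved, purely high dimensional regime
discussed above: with the number of samples n held fixed and the number of
variables p growing, even mutually independent variables produce large spurious
sample correlations (the particular values of n and p are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 20  # fixed number of samples (statistical replicates)
    for p in (50, 500, 2000):  # growing number of observed variables
        X = rng.standard_normal((n, p))   # truly uncorrelated variables
        R = np.corrcoef(X, rowvar=False)  # p x p sample correlation matrix
        off_diag = np.abs(R[np.triu_indices(p, k=1)])
        print(f"p={p:5d}  max |sample correlation| = {off_diag.max():.2f}")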