Statistical and Computational Tradeoffs in Stochastic Composite Likelihood
Maximum likelihood estimators are often of limited practical use due to the
intensive computation they require. We propose a family of alternative
estimators that maximize a stochastic variation of the composite likelihood
function. Each of the estimators resolves the computation-accuracy tradeoff
differently, and taken together they span a continuous spectrum of
computation-accuracy tradeoff resolutions. We prove the consistency of the
estimators and provide formulas for their asymptotic variance, statistical
robustness, and computational complexity. We discuss experimental results in
the context of Boltzmann machines and conditional random fields. The
theoretical and experimental studies demonstrate the effectiveness of the
estimators when the computational resources are insufficient. They also
demonstrate that in some cases reduced computational complexity is associated
with robustness, thereby increasing statistical accuracy.
Comment: 30 pages, 97 figures, 2 authors
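To make the idea of a stochastic composite likelihood concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the simplest component choice, single-site conditionals p(x_j | x_-j) of a fully visible Boltzmann machine with states in {0, 1}, and it assumes each component is included independently with a selection probability. The names `lam` (selection probabilities), `beta` (component weights), and both function names are hypothetical.

```python
import math
import random


def conditional_log_prob(x, j, W, b):
    """log p(x_j | x_{-j}) for a Boltzmann machine with states in {0, 1}.

    Assumes a symmetric weight matrix W and bias vector b, so that
    p(x_j = 1 | rest) = sigmoid(b[j] + sum_{k != j} W[j][k] * x[k]).
    """
    field = b[j] + sum(W[j][k] * x[k] for k in range(len(x)) if k != j)
    # p(x_j = 0 | rest) = sigmoid(-field), so flip the sign for x_j = 0.
    signed = field if x[j] == 1 else -field
    # Numerically stable log-sigmoid.
    if signed >= 0:
        return -math.log1p(math.exp(-signed))
    return signed - math.log1p(math.exp(signed))


def stochastic_composite_loglik(data, W, b, lam, beta, rng):
    """Average weighted log-conditional over randomly selected components.

    For every sample and every component j, the term is included with
    probability lam[j] (an assumed Bernoulli selection) and weighted by beta[j].
    Smaller lam[j] means fewer terms to evaluate, hence less computation.
    """
    total = 0.0
    for x in data:
        for j in range(len(x)):
            if rng.random() < lam[j]:
                total += beta[j] * conditional_log_prob(x, j, W, b)
    return total / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[0, 1, 1], [1, 0, 0], [1, 1, 0]]
    W = [[0.0, 0.5, 0.0], [0.5, 0.0, -0.3], [0.0, -0.3, 0.0]]
    b = [0.1, 0.0, -0.2]
    # lam = 1 for all components recovers the full pseudo-likelihood objective.
    print(stochastic_composite_loglik(data, W, b, [1.0] * 3, [1.0] * 3, rng))
    # lam = 0.5 evaluates roughly half of the terms.
    print(stochastic_composite_loglik(data, W, b, [0.5] * 3, [1.0] * 3, rng))
```

With all selection probabilities set to 1 the objective reduces to the ordinary pseudo-likelihood. Lowering them trades accuracy for computation, which is the spectrum the abstract describes.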
Predictive hypothesis identification
While statistics focusses on hypothesis testing and on estimating (properties
of) the true sampling distribution, in machine learning the performance of
learning algorithms on future data is the primary issue. In this paper we bridge
the gap with a general principle (PHI) that identifies hypotheses with best
predictive performance. This includes predictive point and interval estimation,
simple and composite hypothesis testing, (mixture) model selection, and
others as special cases. For concrete instantiations we will recover well-known
methods, variations thereof, and new ones. PHI nicely justifies, reconciles,
and blends (a reparametrization invariant variation of) MAP, ML, MDL, and
moment estimation. One particular feature of PHI is that it can genuinely
deal with nested hypotheses.
Hidden Gibbs random fields model selection using Block Likelihood Information Criterion
Performing model selection between Gibbs random fields is a very challenging
task. Indeed, due to the Markovian dependence structure, the normalizing
constant of the fields cannot be computed using standard analytical or
numerical methods. Furthermore, such unobserved fields cannot be integrated out
and the likelihood evaluation is a doubly intractable problem. This is a
central obstacle to picking the model that best fits the observed data. We introduce a
new approximate version of the Bayesian Information Criterion. We partition the
lattice into contiguous rectangular blocks and we approximate the probability
measure of the hidden Gibbs field by the product of some Gibbs distributions
over the blocks. On that basis, we estimate the likelihood and derive the Block
Likelihood Information Criterion (BLIC) that answers model choice questions
such as the selection of the dependency structure or the number of latent
states. We study the performance of BLIC on those questions. In addition, we
present a comparison with ABC algorithms to point out that the novel criterion
offers a better trade-off between time efficiency and reliable results.
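The block approximation described in this abstract can be illustrated with a small sketch. It is deliberately simplified and is not the BLIC procedure itself: it assumes a fully observed Potts field (no hidden layer), a 4-neighbour lattice, a single interaction parameter `beta`, and blocks small enough that each block's normalizing constant can be computed by brute-force enumeration. All function names are hypothetical, and the final criterion uses the standard BIC form, -2 log L + d log n, with the block likelihood plugged in.

```python
import itertools
import math


def partition_blocks(height, width, bh, bw):
    """Split a height x width lattice into rectangular blocks of at most bh x bw."""
    blocks = []
    for r0 in range(0, height, bh):
        for c0 in range(0, width, bw):
            cells = [(r, c)
                     for r in range(r0, min(r0 + bh, height))
                     for c in range(c0, min(c0 + bw, width))]
            blocks.append(cells)
    return blocks


def block_edges(cells):
    """4-neighbour edges whose two endpoints both lie inside the block."""
    inside = set(cells)
    return [((r, c), nb)
            for (r, c) in cells
            for nb in ((r + 1, c), (r, c + 1))
            if nb in inside]


def block_log_prob(field, cells, beta, n_states):
    """Exact log-probability of one block under a Potts model restricted to it."""
    edges = block_edges(cells)
    index = {cell: i for i, cell in enumerate(cells)}

    def score(config):
        # beta times the number of neighbouring pairs sharing a state.
        return beta * sum(config[index[a]] == config[index[b]] for a, b in edges)

    # Brute-force normalizing constant: feasible only for small blocks.
    log_z = math.log(sum(
        math.exp(score(cfg))
        for cfg in itertools.product(range(n_states), repeat=len(cells))))
    observed = tuple(field[r][c] for r, c in cells)
    return score(observed) - log_z


def block_log_likelihood(field, beta, n_states, bh, bw):
    """Product-over-blocks approximation, returned as a sum of log-probabilities."""
    blocks = partition_blocks(len(field), len(field[0]), bh, bw)
    return sum(block_log_prob(field, cells, beta, n_states) for cells in blocks)


def bic_style_criterion(log_lik, n_params, n_sites):
    """Standard BIC form (lower is better) applied to the block likelihood."""
    return -2.0 * log_lik + n_params * math.log(n_sites)


if __name__ == "__main__":
    field = [[0, 0, 1, 1],
             [0, 0, 1, 1],
             [1, 1, 0, 0],
             [1, 1, 0, 0]]
    ll = block_log_likelihood(field, beta=0.8, n_states=2, bh=2, bw=2)
    print(ll, bic_style_criterion(ll, n_params=1, n_sites=16))
```

Interactions across block boundaries are dropped, which is what makes each factor tractable. Larger blocks keep more of the dependence structure at a higher enumeration cost.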