Efficient Localized Inference for Large Graphical Models
We propose a new localized inference algorithm for answering marginalization
queries in large graphical models with the correlation decay property. Given a
query variable and a large graphical model, we define a much smaller model in a
local region around the query variable in the target model so that the marginal
distribution of the query variable can be accurately approximated. We introduce
two approximation error bounds based on Dobrushin's comparison theorem and
apply our bounds to derive a greedy expansion algorithm that efficiently guides
the selection of neighbor nodes for localized inference. We verify our
theoretical bounds on various datasets and demonstrate that our localized
inference algorithm can provide fast and accurate approximation for large
graphical models.
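
The idea of answering a marginal query from a small region grown around the query variable can be illustrated with a minimal sketch. This is not the paper's algorithm: the greedy score used here (largest absolute coupling crossing the region boundary) is an assumed stand-in for the bound-based selection rule, and the function name `local_marginal`, the Ising parameterization, and the `edges`/`fields` dictionaries are hypothetical choices made for illustration only.

```python
import itertools
import math


def local_marginal(query, edges, fields, max_size):
    """Approximate P(x_query = +1) in an Ising model from a small local region.

    edges:  dict mapping frozenset({i, j}) -> coupling J_ij (no self-loops)
    fields: dict mapping node -> external field h_i (missing nodes count as 0)
    """
    region = [query]
    while len(region) < max_size:
        inside = set(region)
        # Edges with exactly one endpoint inside the region, paired with
        # the endpoint that lies outside.
        frontier = [(abs(w), next(iter(e - inside)))
                    for e, w in edges.items() if len(e & inside) == 1]
        if not frontier:
            break
        # Assumed greedy score: add the node with the strongest coupling
        # into the region (a stand-in for a bound-driven criterion).
        region.append(max(frontier)[1])

    inside = set(region)
    local_edges = [(tuple(e), w) for e, w in edges.items() if e <= inside]

    # Exact marginal in the local model by brute-force enumeration.
    total = 0.0
    positive = 0.0
    for spins in itertools.product((-1, 1), repeat=len(region)):
        x = dict(zip(region, spins))
        energy = sum(fields.get(i, 0.0) * x[i] for i in region)
        energy += sum(w * x[i] * x[j] for (i, j), w in local_edges)
        weight = math.exp(energy)
        total += weight
        if x[query] == 1:
            positive += weight
    return positive / total, region
```

Enumeration costs time exponential in `max_size`, so the sketch is only meaningful when the local region is kept small, which is the setting the abstract describes.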
Maximum Likelihood Learning With Arbitrary Treewidth via Fast-Mixing Parameter Sets
Inference is typically intractable in high-treewidth undirected graphical
models, making maximum likelihood learning a challenge. One way to overcome
this is to restrict parameters to a tractable set, most typically the set of
tree-structured parameters. This paper explores an alternative notion of a
tractable set, namely a set of "fast-mixing parameters" where Markov chain
Monte Carlo (MCMC) inference can be guaranteed to quickly converge to the
stationary distribution. While it is common in practice to approximate the
likelihood gradient using samples obtained from MCMC, such procedures lack
theoretical guarantees. This paper proves that for any exponential family with
bounded sufficient statistics (not just graphical models), when parameters are
constrained to a fast-mixing set, gradient descent with gradients approximated
by sampling will approximate the maximum likelihood solution inside the set
with high probability. When unregularized, finding a solution epsilon-accurate
in log-likelihood requires a total amount of effort cubic in 1/epsilon,
disregarding logarithmic factors. When ridge-regularized, strong convexity
allows a solution epsilon-accurate in parameter distance with effort quadratic
in 1/epsilon. Both of these provide a fully polynomial-time randomized
approximation scheme.
Comment: Advances in Neural Information Processing Systems 201
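
The procedure described in this abstract, projected gradient steps on the likelihood with the model expectation estimated by MCMC, can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's method: the model is a pairwise Ising model with edge statistics x_i * x_j, the sampler is plain Gibbs sampling, and the constraint set is a simple box |theta_e| <= `bound` used as a hypothetical stand-in for the fast-mixing set, whose actual definition is not reproduced here. The names `gibbs_edge_moments` and `fit` are invented for this sketch.

```python
import math
import random


def gibbs_edge_moments(theta, n_nodes, n_sweeps, rng):
    """Estimate E[x_i * x_j] for every edge by Gibbs sampling.

    theta: dict mapping (i, j) -> coupling; p(x) is proportional to
           exp(sum over edges of theta_ij * x_i * x_j), with x_i in {-1, +1}.
    """
    x = [rng.choice((-1, 1)) for _ in range(n_nodes)]
    sums = {e: 0.0 for e in theta}
    for _ in range(n_sweeps):
        for i in range(n_nodes):
            # Local field on node i from its neighbours.
            field = sum(w * x[b if a == i else a]
                        for (a, b), w in theta.items() if i in (a, b))
            p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))
            x[i] = 1 if rng.random() < p_plus else -1
        for (a, b) in theta:
            sums[(a, b)] += x[a] * x[b]
    return {e: s / n_sweeps for e, s in sums.items()}


def fit(data_moments, n_nodes, steps, lr, bound, seed=0):
    """Projected gradient ascent on the likelihood with sampled gradients."""
    rng = random.Random(seed)
    theta = {e: 0.0 for e in data_moments}
    for _ in range(steps):
        model = gibbs_edge_moments(theta, n_nodes, 200, rng)
        for e in theta:
            # Likelihood gradient: data moment minus (sampled) model moment.
            step = theta[e] + lr * (data_moments[e] - model[e])
            # Projection onto the assumed box constraint.
            theta[e] = max(-bound, min(bound, step))
    return theta
```

For example, `fit({(0, 1): 0.9}, n_nodes=2, steps=50, lr=0.5, bound=0.5)` fits a single coupling whose unconstrained optimum lies outside the box, so the iterate is pushed to the boundary of the constraint set.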