Maximum Likelihood Learning With Arbitrary Treewidth via Fast-Mixing Parameter Sets
Inference is typically intractable in high-treewidth undirected graphical
models, making maximum likelihood learning a challenge. One way to overcome
this is to restrict parameters to a tractable set, most typically the set of
tree-structured parameters. This paper explores an alternative notion of a
tractable set, namely a set of "fast-mixing parameters" where Markov chain
Monte Carlo (MCMC) inference can be guaranteed to quickly converge to the
stationary distribution. While it is common in practice to approximate the
likelihood gradient using samples obtained from MCMC, such procedures lack
theoretical guarantees. This paper proves that for any exponential family with
bounded sufficient statistics (not just graphical models), when parameters are
constrained to a fast-mixing set, gradient descent with gradients approximated
by sampling will approximate the maximum likelihood solution inside the set
with high probability. In the unregularized case, finding a solution
epsilon-accurate in log-likelihood requires total effort cubic in 1/epsilon,
disregarding logarithmic factors. With ridge regularization, strong convexity
allows a solution epsilon-accurate in parameter distance with effort quadratic
in 1/epsilon. Both results constitute a fully polynomial-time randomized
approximation scheme.
Comment: Advances in Neural Information Processing Systems 201
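To make the procedure concrete, here is a minimal sketch (in Python) of the learning loop the abstract describes: projected gradient ascent on an exponential-family log-likelihood, with the model's expected sufficient statistics estimated from MCMC samples. The sampler, the statistics, and the projection onto the fast-mixing set are hypothetical placeholders, not the paper's actual constructions.

```python
import numpy as np

def mle_with_mcmc_gradients(data_stats, sample_mcmc, project_fast_mixing,
                            dim, step_size=0.1, n_iters=1000, n_samples=100):
    """Projected gradient ascent on an exponential-family log-likelihood.

    data_stats: empirical mean of the sufficient statistics, shape (dim,).
    sample_mcmc(theta, n): statistics of n approximate MCMC samples, (n, dim).
    project_fast_mixing(theta): projection onto the fast-mixing parameter set.
    """
    theta = np.zeros(dim)
    for _ in range(n_iters):
        # Monte Carlo estimate of the model's expected sufficient statistics.
        model_stats = sample_mcmc(theta, n_samples).mean(axis=0)
        # Exponential-family likelihood gradient: data stats minus model stats.
        grad = data_stats - model_stats
        # Ascent step, then project so the chain is guaranteed to mix quickly.
        theta = project_fast_mixing(theta + step_size * grad)
    return theta
```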
Heuristic Ranking in Tightly Coupled Probabilistic Description Logics
The Semantic Web effort has been steadily gaining traction in recent years.
In particular, Web search companies have recently realized that their products
need to evolve towards richer semantic search capabilities.
Description logics (DLs) have been adopted as the formal underpinnings for
Semantic Web languages used in describing ontologies. Reasoning under
uncertainty has recently taken a leading role in this arena, given the nature
of data found on the Web. In this paper, we present a probabilistic extension of
the DL EL++ (which underlies the OWL 2 EL profile) using Markov logic networks
(MLNs) as probabilistic semantics. This extension is tightly coupled, meaning
that probabilistic annotations in formulas can refer to objects in the
ontology. We show that, even though the tightly coupled nature of our language
means that many basic operations are data-intractable, we can leverage a
sublanguage of MLNs that allows us to rank the atomic consequences of an ontology
relative to their probability values (called ranking queries) even when these
values are not fully computed. We present an anytime algorithm to answer
ranking queries, and provide an upper bound on the error that it incurs, as
well as a criterion to decide when results are guaranteed to be correct.
Comment: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty
in Artificial Intelligence (UAI 2012)
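The anytime idea can be illustrated with a small sketch: maintain interval bounds on each atom's probability, tighten them incrementally, and rank by interval midpoints; the widest remaining interval bounds the error, and disjoint adjacent intervals certify correctness. The refine() oracle below is an assumed stand-in for the paper's MLN-sublanguage machinery, not its actual algorithm.

```python
def anytime_rank(atoms, refine, budget):
    """atoms: list of atomic consequences.
    refine(atom) -> (lo, hi): tightened probability bounds for one atom.
    budget: number of refinement steps to spend."""
    bounds = {a: (0.0, 1.0) for a in atoms}
    for _ in range(budget):
        # Refine the atom with the widest interval first (most uncertain).
        widest = max(bounds, key=lambda a: bounds[a][1] - bounds[a][0])
        bounds[widest] = refine(widest)
    # Rank by interval midpoint, highest probability first.
    ranking = sorted(atoms, key=lambda a: -(bounds[a][0] + bounds[a][1]) / 2)
    # The widest remaining interval drives the worst-case error.
    max_err = max(hi - lo for lo, hi in bounds.values())
    # The ranking is certified correct when adjacent intervals are disjoint.
    certified = all(bounds[a][0] >= bounds[b][1]
                    for a, b in zip(ranking, ranking[1:]))
    return ranking, max_err, certified
```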
Bethe Projections for Non-Local Inference
Many inference problems in structured prediction are naturally solved by
augmenting a tractable dependency structure with complex, non-local auxiliary
objectives. This includes the mean field family of variational inference
algorithms, soft- or hard-constrained inference using Lagrangian relaxation or
linear programming, collective graphical models, and forms of semi-supervised
learning such as posterior regularization. We present a method to
discriminatively learn broad families of inference objectives, capturing
powerful non-local statistics of the latent variables, while maintaining
tractable and provably fast inference using non-Euclidean projected gradient
descent with a distance-generating function given by the Bethe entropy. We
demonstrate the performance and flexibility of our method by (1) extracting
structured citations from research papers by learning soft global constraints,
(2) achieving state-of-the-art results on a widely-used handwriting recognition
task using a novel learned non-convex inference procedure, and (3) providing a
fast and highly scalable algorithm for the challenging problem of inference in
a collective graphical model applied to bird migration.
Comment: minor bug fix to appendix; appeared in UAI 201
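As a rough illustration of the optimization scheme named above, the sketch below runs mirror descent with a plain negative-entropy distance-generating function on a probability simplex, which yields multiplicative (exponentiated-gradient) updates; the paper's method instead uses the Bethe entropy over locally consistent marginals. All names and the toy objective are illustrative assumptions.

```python
import numpy as np

def entropy_mirror_descent(grad_fn, dim, step_size=0.5, n_iters=200):
    """Minimize a convex objective over the simplex via mirror descent.
    grad_fn(mu): gradient of the objective at marginals mu."""
    mu = np.full(dim, 1.0 / dim)          # start at the uniform distribution
    for _ in range(n_iters):
        g = grad_fn(mu)
        mu = mu * np.exp(-step_size * g)  # mirror step in the dual (log) space
        mu /= mu.sum()                    # Bregman projection onto the simplex
    return mu

# Toy non-local objective: <c, mu> plus the negative entropy of mu,
# whose gradient is c + log(mu) + 1.
c = np.array([1.0, 2.0, 0.5])
mu_star = entropy_mirror_descent(lambda mu: c + np.log(mu) + 1.0, dim=3)
```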
Modeling networks of spiking neurons as interacting processes with memory of variable length
We consider a new class of non-Markovian processes with a countable number of
interacting components, both in discrete and continuous time. Each component is
represented by a point process indicating if it has a spike or not at a given
time. The system evolves as follows. For each component, the rate (in
continuous time) or the probability (in discrete time) of having a spike
depends on the entire time evolution of the system since the last spike time of
the component. In discrete time, this class of systems extends in a
non-trivial way both Spitzer's interacting particle systems, which are
Markovian, and Rissanen's stochastic chains with memory of variable length,
which have a finite state space. In continuous time, they can be seen as a
kind of variable-length-memory version, in Rissanen's sense, of the class of
self-exciting point processes also called "Hawkes processes", but with
infinitely many components. These features make this class a good candidate
to describe the
time evolution of networks of spiking neurons. In this article we present a
critical reader's guide to recent papers dealing with this class of models,
both in discrete and in continuous time. We briefly sketch results concerning
perfect simulation and existence issues, decorrelation between successive
interspike intervals, the long-time behavior of finite non-excited systems,
and propagation of chaos in mean-field systems.
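For intuition, a toy discrete-time simulation in the spirit of this model class is sketched below: each neuron's spike probability depends on the input it has accumulated since its own last spike, and spiking resets that memory (the variable-length feature). The weights, logistic rate function, and offset are illustrative choices, not taken from the papers under review.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_steps = 5, 200
W = rng.normal(scale=0.5, size=(n_neurons, n_neurons))  # synaptic weights
np.fill_diagonal(W, 0.0)                                # no self-coupling

def phi(u):
    return 1.0 / (1.0 + np.exp(-u))  # potential -> spiking probability

potential = np.zeros(n_neurons)      # input accumulated since the last spike
spikes = np.zeros((n_steps, n_neurons), dtype=int)
for t in range(1, n_steps):
    prob = phi(potential - 2.0)      # -2.0 is a baseline excitability offset
    spikes[t] = rng.random(n_neurons) < prob
    # A neuron that spikes forgets its past: memory resets at the last spike.
    potential[spikes[t] == 1] = 0.0
    # The others integrate the new spikes of their presynaptic neighbours.
    potential += (W @ spikes[t]) * (1 - spikes[t])
```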
A Stochastic Grammar of Images
This exploratory paper quests for a stochastic and context-sensitive grammar of images. The grammar should achieve the following four objectives and thus serve as a unified framework of representation, learning, and recognition for a large number of object categories.

(i) The grammar represents both the hierarchical decompositions from scenes to objects, parts, primitives, and pixels, by terminal and non-terminal nodes, and the contexts for spatial and functional relations, by horizontal links between the nodes. It formulates each object category as the set of all possible valid configurations produced by the grammar.

(ii) The grammar is embodied in a simple And-Or graph representation where each Or-node points to alternative sub-configurations and each And-node is decomposed into a number of components. This representation supports recursive top-down/bottom-up procedures for image parsing under the Bayesian framework and makes it convenient to scale up in complexity. Given an input image, the image parsing task constructs a most probable parse graph on the fly as the output interpretation; this parse graph is a subgraph of the And-Or graph obtained after making a choice at each Or-node.

(iii) A probabilistic model is defined on this And-Or graph representation to account for the natural occurrence frequency of objects and parts as well as their relations. This model is learned from a relatively small training set per category and then sampled to synthesize a large number of configurations to cover novel object instances in the test set. This generalization capability is mostly missing in discriminative machine learning methods and can substantially improve recognition performance in experiments.

(iv) To fill the well-known semantic gap between symbols and raw signals, the grammar includes a series of visual dictionaries and organizes them through graph composition. At the bottom level, the dictionary is a set of image primitives, each having a number of anchor points with open bonds to link with other primitives. These primitives can be combined to form larger and larger graph structures for parts and objects. The ambiguities in inferring local primitives are resolved through top-down computation using larger structures. Finally, these primitives form a primal sketch representation which generates the input image with every pixel explained.

The proposed grammar integrates three prominent representations in the literature: stochastic grammars for composition, Markov (or graphical) models for contexts, and sparse coding with primitives (wavelets). It also combines the structure-based and appearance-based methods in the vision literature. Finally, the paper presents three case studies to illustrate the proposed grammar.
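A bare-bones sketch of the And-Or graph representation described in (ii) may help: Or-nodes choose among alternative sub-configurations, And-nodes decompose into parts, and a parse graph is the subgraph left after fixing one choice at every Or-node. All class and field names are illustrative, not the paper's notation.

```python
class AndNode:
    def __init__(self, label, parts):
        self.label = label
        self.parts = parts  # child nodes: all components must be realized

class OrNode:
    def __init__(self, label, alternatives):
        self.label = label
        self.alternatives = alternatives  # child nodes: exactly one is chosen

class Terminal:
    def __init__(self, label):
        self.label = label  # image primitive at the bottom level

def parse_graph(node, choose):
    """Extract one parse graph by resolving every Or-node with choose()."""
    if isinstance(node, Terminal):
        return node.label
    if isinstance(node, OrNode):
        return {node.label: parse_graph(choose(node), choose)}
    return {node.label: [parse_graph(p, choose) for p in node.parts]}

# Toy grammar: a "face" decomposes into eyes and a mouth; the mouth is
# either open or closed (an Or-node choice).
mouth = OrNode("mouth", [Terminal("open"), Terminal("closed")])
face = AndNode("face", [Terminal("left-eye"), Terminal("right-eye"), mouth])
print(parse_graph(face, choose=lambda n: n.alternatives[0]))
```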