6,098 research outputs found

    Maximum Likelihood Learning With Arbitrary Treewidth via Fast-Mixing Parameter Sets

    Full text link
    Inference is typically intractable in high-treewidth undirected graphical models, making maximum likelihood learning a challenge. One way to overcome this is to restrict parameters to a tractable set, most typically the set of tree-structured parameters. This paper explores an alternative notion of a tractable set, namely a set of "fast-mixing parameters" where Markov chain Monte Carlo (MCMC) inference can be guaranteed to quickly converge to the stationary distribution. While it is common in practice to approximate the likelihood gradient using samples obtained from MCMC, such procedures lack theoretical guarantees. This paper proves that for any exponential family with bounded sufficient statistics, (not just graphical models) when parameters are constrained to a fast-mixing set, gradient descent with gradients approximated by sampling will approximate the maximum likelihood solution inside the set with high-probability. When unregularized, to find a solution epsilon-accurate in log-likelihood requires a total amount of effort cubic in 1/epsilon, disregarding logarithmic factors. When ridge-regularized, strong convexity allows a solution epsilon-accurate in parameter distance with effort quadratic in 1/epsilon. Both of these provide of a fully-polynomial time randomized approximation scheme.Comment: Advances in Neural Information Processing Systems 201

    Heuristic Ranking in Tightly Coupled Probabilistic Description Logics

    Full text link
    The Semantic Web effort has steadily been gaining traction in the recent years. In particular,Web search companies are recently realizing that their products need to evolve towards having richer semantic search capabilities. Description logics (DLs) have been adopted as the formal underpinnings for Semantic Web languages used in describing ontologies. Reasoning under uncertainty has recently taken a leading role in this arena, given the nature of data found on theWeb. In this paper, we present a probabilistic extension of the DL EL++ (which underlies the OWL2 EL profile) using Markov logic networks (MLNs) as probabilistic semantics. This extension is tightly coupled, meaning that probabilistic annotations in formulas can refer to objects in the ontology. We show that, even though the tightly coupled nature of our language means that many basic operations are data-intractable, we can leverage a sublanguage of MLNs that allows to rank the atomic consequences of an ontology relative to their probability values (called ranking queries) even when these values are not fully computed. We present an anytime algorithm to answer ranking queries, and provide an upper bound on the error that it incurs, as well as a criterion to decide when results are guaranteed to be correct.Comment: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012

    Bethe Projections for Non-Local Inference

    Full text link
    Many inference problems in structured prediction are naturally solved by augmenting a tractable dependency structure with complex, non-local auxiliary objectives. This includes the mean field family of variational inference algorithms, soft- or hard-constrained inference using Lagrangian relaxation or linear programming, collective graphical models, and forms of semi-supervised learning such as posterior regularization. We present a method to discriminatively learn broad families of inference objectives, capturing powerful non-local statistics of the latent variables, while maintaining tractable and provably fast inference using non-Euclidean projected gradient descent with a distance-generating function given by the Bethe entropy. We demonstrate the performance and flexibility of our method by (1) extracting structured citations from research papers by learning soft global constraints, (2) achieving state-of-the-art results on a widely-used handwriting recognition task using a novel learned non-convex inference procedure, and (3) providing a fast and highly scalable algorithm for the challenging problem of inference in a collective graphical model applied to bird migration.Comment: minor bug fix to appendix. appeared in UAI 201

    Modeling networks of spiking neurons as interacting processes with memory of variable length

    Get PDF
    We consider a new class of non Markovian processes with a countable number of interacting components, both in discrete and continuous time. Each component is represented by a point process indicating if it has a spike or not at a given time. The system evolves as follows. For each component, the rate (in continuous time) or the probability (in discrete time) of having a spike depends on the entire time evolution of the system since the last spike time of the component. In discrete time this class of systems extends in a non trivial way both Spitzer's interacting particle systems, which are Markovian, and Rissanen's stochastic chains with memory of variable length which have finite state space. In continuous time they can be seen as a kind of Rissanen's variable length memory version of the class of self-exciting point processes which are also called "Hawkes processes", however with infinitely many components. These features make this class a good candidate to describe the time evolution of networks of spiking neurons. In this article we present a critical reader's guide to recent papers dealing with this class of models, both in discrete and in continuous time. We briefly sketch results concerning perfect simulation and existence issues, de-correlation between successive interspike intervals, the longtime behavior of finite non-excited systems and propagation of chaos in mean field systems
    corecore