Graphical Models and Symmetries: Loopy Belief Propagation Approaches
Whenever a person or an automated system has to reason in uncertain domains, probability theory is necessary. Probabilistic graphical models allow us to build statistical models that capture complex dependencies between random variables. Inference in these models, however, can easily become intractable. Typical ways to address this scaling issue are inference by approximate message passing, stochastic gradients, and MapReduce, among others. Exploiting the symmetries of graphical models, however, has not yet been considered for scaling statistical machine learning applications. One class of inherently symmetric graphical models is that of statistical relational models. These have recently gained traction within the machine learning and AI communities; they combine probability theory with first-order logic, thereby allowing for an efficient representation of structured relational domains. The formalisms they provide for compactly representing complex real-world domains enable us to describe large problem instances effectively. Inference within and training of graphical models, however, have not been able to keep pace with the increased representational power. This thesis tackles two major aspects of graphical models and shows that both inference and training can indeed benefit from exploiting symmetries. It first deals with efficient inference exploiting symmetries in graphical models for various query types. We introduce lifted loopy belief propagation (lifted LBP), the first lifted parallel inference approach for relational as well as propositional graphical models. Lifted LBP can effectively speed up marginal inference, but cannot straightforwardly be applied to other types of queries. Thus we also present efficient lifted algorithms for MAP inference and higher-order marginals, as well as for the efficient handling of multiple inference tasks. Then we turn to the training of graphical models and introduce the first lifted online training for relational models.
Our training procedure and the MapReduce lifting for loopy belief propagation combine lifting with traditional statistical approaches to scaling, thereby bridging the gap between statistical relational learning and traditional statistical machine learning.
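The symmetry-exploitation step behind lifted LBP can be illustrated with a small sketch. The following toy color-passing routine (a generic illustration with made-up names and graph, not the thesis code) groups nodes whose local neighborhoods look identical; lifted LBP then passes messages once per group instead of once per node:

```python
# A toy sketch of the "color passing" symmetry detection used by lifted
# belief propagation approaches to compress a graphical model.
# (Hypothetical toy graph and function names, not the thesis code.)

def color_passing(neighbors, rounds=5):
    # Start with all nodes in one group (color 0).
    colors = {v: 0 for v in neighbors}
    for _ in range(rounds):
        # A node's signature is its own color plus the sorted multiset
        # of its neighbors' colors; identical signatures stay merged.
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in neighbors[v])))
                for v in neighbors}
        table = {}
        colors = {v: table.setdefault(sig, len(table))
                  for v, sig in sigs.items()}
    return colors

# Chain a - b - c: the endpoints a and c are indistinguishable, so they
# end up in one supernode and their messages need to be computed only once.
colors = color_passing({'a': ['b'], 'b': ['a', 'c'], 'c': ['b']})
assert colors['a'] == colors['c'] and colors['a'] != colors['b']
```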
Probabilistic Inference with Generating Functions for Population Dynamics of Unmarked Individuals
Modeling the interactions of different population dynamics (e.g. reproduction, migration) within a population is a challenging problem that underlies numerous ecological research questions. Powerful, interpretable models for population dynamics are key to developing intervention tactics, allocating limited conservation resources, and predicting the impact of uncertain environmental forces on a population. Fortunately, probabilistic graphical models provide a robust mechanistic framework for these kinds of problems. However, in the relatively common case where individuals in the population are unmarked (i.e. indistinguishable from one another), models of the population dynamics naturally contain a deceptively challenging statistical feature: discrete latent variables with unbounded/countably infinite support. Unfortunately, existing exact inference algorithms for discrete distributions are applicable only to finite distributions, and while approximate inference algorithms exist for countably infinite discrete distributions, they are generally unreliable and inefficient. In this work, we develop the first known general-purpose polynomial-time exact inference algorithms for this class of models, using a novel representation based on probability generating functions. These methods are flexible, easy to use, and significantly faster than existing approximate solutions. We also introduce a novel approximation scheme based on this technique that allows it to scale gracefully to populations well beyond the computational limits of any previously known exact or approximate general-purpose inference algorithm for population dynamics. Finally, we conduct an ecological case study on historical data, demonstrating the downstream impact of these advances in a large-scale population monitoring setting.
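To make the generating-function idea concrete, here is a minimal truncated-series sketch (not the exact symbolic algorithm of the paper; all names are illustrative). The latent population size N ~ Poisson(lam) has PGF F(s) = exp(lam(s-1)), and detecting each individual independently with probability p corresponds to substituting 1 - p + p*s into F:

```python
import math

# Truncated power-series sketch of inference with probability generating
# functions (the paper's algorithms manipulate PGFs exactly; this toy
# version truncates the series, and all names are illustrative).

def poisson_pgf_coeffs(lam, trunc):
    # Coefficients of F(s) = exp(lam*(s-1)), i.e. the pmf of N ~ Poisson(lam).
    return [math.exp(-lam) * lam**n / math.factorial(n) for n in range(trunc)]

def thin(coeffs, p):
    # Binomial thinning (detect each individual with prob p) corresponds
    # to the substitution G(s) = F(1 - p + p*s); in coefficient form:
    # P(Y=k) = sum_n P(N=n) * C(n,k) * p^k * (1-p)^(n-k).
    out = [0.0] * len(coeffs)
    for n, a in enumerate(coeffs):
        for k in range(n + 1):
            out[k] += a * math.comb(n, k) * p**k * (1 - p)**(n - k)
    return out

lam, p = 3.0, 0.4
obs = thin(poisson_pgf_coeffs(lam, trunc=60), p)
# Thinning a Poisson(lam) yields Poisson(lam*p); check the first terms:
for k in range(5):
    exact = math.exp(-lam * p) * (lam * p)**k / math.factorial(k)
    assert abs(obs[k] - exact) < 1e-9
```

The exact methods in the paper avoid the truncation entirely by working with the closed-form PGF; the coefficient view above only shows what the substitution computes.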
Graphical models beyond standard settings: lifted decimation, labeling, and counting
With increasing complexity and growing problem sizes in AI and Machine Learning, inference and learning remain major issues in Probabilistic Graphical Models (PGMs). On the other hand, many problems are specified in such a way that symmetries arise from the underlying model structure. Exploiting these symmetries during inference, which is referred to as "lifted inference", has led to significant efficiency gains. This thesis provides several enhanced versions of known algorithms that prove to be liftable as well, thereby applying lifting in "non-standard" settings. In doing so, it extends the understanding of the applicability of lifted inference and of lifting in general. Among various other experiments, it is shown how lifted inference, in combination with an innovative Web-based data-harvesting pipeline, is used to label author-paper pairs with geographic information in online bibliographies. This results in a large-scale transnational bibliography containing affiliation information over time for roughly one million authors. Analyzing this dataset reveals the importance of understanding count data. Although counting is done literally everywhere, mainstream PGMs have largely neglected count data. In the case where the ranges of the random variables are defined over the natural numbers, crude approximations to the true distribution are often made via discretization or a Gaussian assumption. To handle count data, Poisson Dependency Networks (PDNs) are introduced, which present a new class of non-standard PGMs that naturally handle count data.
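The local building block of a PDN, a Poisson conditional for each count variable, can be sketched as a small Poisson regression fitted by gradient ascent on the log-likelihood (an illustrative sketch with made-up data and names, not the learning procedure from the thesis):

```python
import math
import random

# Sketch of one PDN building block: a count variable regressed on its
# parents with log E[y | x] = w . x, fitted by gradient ascent on the
# Poisson log-likelihood. (Made-up data; not the thesis procedure.)

def fit_poisson_reg(X, y, lr=0.1, iters=1000):
    w = [0.0] * len(X[0])
    for _ in range(iters):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            mu = math.exp(sum(wj * xj for wj, xj in zip(w, xi)))
            for j, xj in enumerate(xi):
                grad[j] += (yi - mu) * xj     # d/dw_j of the log-likelihood
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def sample_poisson(mu, rng):
    # Knuth's inversion sampler, fine for small rates.
    L, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p < L:
            return k
        k += 1

rng = random.Random(0)
true_w = [0.5, 1.0]                      # intercept and one parent weight
X = [[1.0, rng.random()] for _ in range(500)]
y = [sample_poisson(math.exp(true_w[0] + true_w[1] * x[1]), rng) for x in X]
w = fit_poisson_reg(X, y)
assert abs(w[0] - 0.5) < 0.3 and abs(w[1] - 1.0) < 0.5
```

A full PDN fits one such conditional per variable, with the other count variables as parents.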
Fast high dimensional approximation via random embeddings
In the big data era, dimension reduction techniques have been a key tool in making high dimensional geometric problems tractable. This thesis focuses on two such problems: hashing and parameter estimation. We study locality sensitive hashing (LSH), a framework for randomized hashing that efficiently solves an approximate version of nearest neighbor search. We propose an efficient and provably optimal hash function for LSH that builds on a simple existing hash function called cross-polytope LSH. In the context of parameter estimation, we focus on regression, for which the well-known LASSO requires precise knowledge of the unknown noise variance. We provide an estimator for this noise variance, valid when the signal is sparse, that is consistent and faster than a single iteration of LASSO. Finally, we discuss notions of distance between probability distributions for the purposes of quantization and propose a distance measure based on the Rényi divergence that achieves both large- and small-scale bounds.
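The cross-polytope hash that the proposed scheme builds on is easy to sketch: rotate the input randomly, then hash to the nearest signed standard basis vector. Below is a minimal illustration, using a Gaussian matrix as a stand-in for the faster pseudo-random rotations used in practice:

```python
import numpy as np

# Minimal sketch of cross-polytope LSH (the simple hash the proposed
# scheme builds on): rotate the input by a random Gaussian matrix, then
# hash to the closest signed standard basis vector.

def cross_polytope_hash(x, G):
    y = G @ x
    i = int(np.argmax(np.abs(y)))          # closest axis
    return (i, 1 if y[i] >= 0 else -1)     # signed basis vector +/- e_i

rng = np.random.default_rng(0)
d = 16
G = rng.standard_normal((d, d))
x = rng.standard_normal(d)
x_near = x + 1e-6 * rng.standard_normal(d)  # a very close neighbor

# Nearby points almost surely fall into the same hash bucket:
assert cross_polytope_hash(x, G) == cross_polytope_hash(x_near, G)
```

Distant points, by contrast, land in the same bucket only with probability roughly 1/(2d) per table, which is what makes the family locality sensitive.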
Probabilistic Programming Concepts
A multitude of different probabilistic programming languages exists today,
all extending a traditional programming language with primitives to support
modeling of complex, structured probability distributions. Each of these
languages employs its own probabilistic primitives, and comes with a particular
syntax, semantics and inference procedure. This makes it hard to understand the
underlying programming concepts and appreciate the differences between the
different languages. To obtain a better understanding of probabilistic
programming, we identify a number of core programming concepts underlying the
primitives used by various probabilistic languages, discuss the execution
mechanisms that they require, and use these to position state-of-the-art
probabilistic languages and their implementations. While doing so, we focus on
probabilistic extensions of logic programming languages such as Prolog, which
have been developed for more than 20 years.
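Two of the core concepts such surveys identify, sampling and conditioning ("observe"), can be sketched in a few lines of likelihood weighting. This is a generic illustration with a made-up model and names; real languages such as ProbLog or Church provide these as language primitives with proper semantics and far richer inference:

```python
import random

# A generic sketch of two core probabilistic-programming concepts,
# sampling and conditioning ("observe"), via likelihood weighting.
# (Illustrative model and names only.)

def expectation(model, n=20000, seed=0):
    random.seed(seed)
    total_w, total_wv = 0.0, 0.0
    for _ in range(n):
        weight, value = model()          # one weighted execution trace
        total_w += weight
        total_wv += weight * value
    return total_wv / total_w            # posterior expectation

def burglary_model():
    burglary = random.random() < 0.1     # sample from the prior
    earthquake = random.random() < 0.2
    alarm = burglary or earthquake
    weight = 1.0 if alarm else 0.0       # observe: alarm == True
    return weight, 1.0 if burglary else 0.0

post = expectation(burglary_model)
# Exact answer: P(burglary | alarm) = 0.1 / 0.28, about 0.357
assert abs(post - 0.1 / 0.28) < 0.03
```

The paper's point is precisely that each real language packages these two concepts (and others, such as probabilistic clauses or memoization) with its own syntax, semantics, and inference mechanism.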
Tree-valued Feller diffusion
We consider the evolution of the genealogy of the population currently alive
in a Feller branching diffusion model. In contrast to the approach via labeled
trees in the continuum random tree world, the genealogies are modeled as
equivalence classes of ultrametric measure spaces, the elements of the space
$\mathbb{U}$. This space is Polish and has a rich semigroup structure for the
genealogy. We focus on the evolution of the genealogy in time and the large
time asymptotics conditioned both on survival up to the present time and on
survival forever. We prove existence, uniqueness and the Feller property of
solutions of the martingale problem for this genealogy-valued, i.e.
$\mathbb{U}$-valued, Feller diffusion. We give the precise relation to the
time-inhomogeneous $\mathbb{U}_1$-valued Fleming-Viot process. Uniqueness is
shown via Feynman-Kac duality with the distance-matrix-augmented Kingman
coalescent. Using a semigroup operation on $\mathbb{U}$, called concatenation,
together with the branching property, we obtain a Lévy-Khintchine formula for
the $\mathbb{U}$-valued Feller diffusion and determine explicitly the Lévy
measure on $\mathbb{U}$. From this we obtain, for the decomposition into
depth-$t$ subfamilies, a representation of the process as a concatenation of
a Cox point process of genealogies of single-ancestor subfamilies.
Furthermore, we identify the $\mathbb{U}$-valued process conditioned to
survive until a finite time $T$. We study long time asymptotics, such as the
generalized quasi-equilibrium and the Kolmogorov-Yaglom limit law on the
level of ultrametric measure spaces. We also obtain various representations
of the long time limits. (Comment: 93 pages.)
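The total mass underlying these genealogies is the classical Feller branching diffusion dZ_t = b Z_t dt + sqrt(c Z_t) dW_t. A crude Euler-Maruyama sketch of this mass process is given below; the paper's actual object is the genealogy living on top of it, which this simulation does not capture:

```python
import math
import random

# Crude Euler-Maruyama sketch of the Feller branching diffusion
# dZ_t = b*Z_t dt + sqrt(c*Z_t) dW_t, the total-mass process of the
# model. (Illustration only; the genealogical structure is not simulated.)

def feller_path(z0, b, c, T, dt=1e-3, seed=0):
    rng = random.Random(seed)
    z = z0
    for _ in range(int(T / dt)):
        z += b * z * dt + math.sqrt(max(c * z, 0.0) * dt) * rng.gauss(0.0, 1.0)
        z = max(z, 0.0)                  # 0 is absorbing (extinction)
    return z

# In the critical case b = 0 the mean is preserved: E[Z_T] = z0.
vals = [feller_path(1.0, b=0.0, c=1.0, T=1.0, seed=s) for s in range(400)]
mean = sum(vals) / len(vals)
assert abs(mean - 1.0) < 0.2
```

Conditioning such paths on survival, as the paper does for the genealogy, corresponds to discarding the trajectories absorbed at 0.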