Inference in Probabilistic Logic Programs using Weighted CNF's
Probabilistic logic programs are logic programs in which some of the facts
are annotated with probabilities. Several classical probabilistic inference
tasks (such as MAP and computing marginals) have not yet received a lot of
attention for this formalism. The contribution of this paper is that we develop
efficient inference algorithms for these tasks. Our approach is based on converting the probabilistic logic program, together with the query and evidence, into a weighted CNF formula. This allows us to reduce the inference tasks to well-studied problems such as weighted model counting, which we solve with state-of-the-art methods. We consider multiple methods for the conversion of
the programs as well as for inference on the weighted CNF. The resulting
approach is evaluated experimentally and shown to improve upon the
state-of-the-art in probabilistic logic programming.
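To make the reduction concrete, here is a minimal weighted model counter applied to a toy encoding of a small probabilistic program; the clausal encoding, weights, and function names are illustrative only and are not taken from the paper, whose actual conversion and solvers are far more sophisticated.

```python
from itertools import product

def weighted_model_count(variables, clauses, weight):
    """Sum the weights of all assignments that satisfy every clause.

    clauses: list of clauses; each clause is a list of (var, polarity) literals.
    weight:  dict mapping (var, truth value) -> weight of that literal.
    """
    total = 0.0
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[v] == pol for v, pol in clause) for clause in clauses):
            w = 1.0
            for v in variables:
                w *= weight[(v, assignment[v])]
            total += w
    return total

# Toy program: 0.3::burglary. 0.2::earthquake. alarm :- burglary. alarm :- earthquake.
# Clausal encoding of "alarm iff burglary or earthquake", with evidence alarm = true.
variables = ["burglary", "earthquake", "alarm"]
clauses = [
    [("burglary", False), ("alarm", True)],                         # burglary -> alarm
    [("earthquake", False), ("alarm", True)],                       # earthquake -> alarm
    [("alarm", False), ("burglary", True), ("earthquake", True)],   # alarm -> burglary v earthquake
    [("alarm", True)],                                              # evidence: alarm
]
weight = {
    ("burglary", True): 0.3, ("burglary", False): 0.7,
    ("earthquake", True): 0.2, ("earthquake", False): 0.8,
    ("alarm", True): 1.0, ("alarm", False): 1.0,
}
print(weighted_model_count(variables, clauses, weight))  # P(alarm) = 0.44
```

The enumeration above is exponential in the number of variables; the point of reducing inference to weighted model counting is precisely that specialized counters avoid this blow-up on structured formulas.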
A Study of the Learnability of Relational Properties: Model Counting Meets Machine Learning (MCML)
This paper introduces the MCML approach for empirically studying the
learnability of relational properties that can be expressed in the well-known
software design language Alloy. A key novelty of MCML is the quantification of the performance of, and the semantic differences among, trained machine learning (ML)
models, specifically decision trees, with respect to entire (bounded) input
spaces, and not just for given training and test datasets (as is the common
practice). MCML reduces the quantification problems to the classic complexity
theory problem of model counting, and employs state-of-the-art model counters.
The results show that relatively simple ML models can achieve surprisingly high
performance (accuracy and F1-score) when evaluated in the common setting of
using training and test datasets - even when the training dataset is much
smaller than the test dataset - indicating the seeming simplicity of learning
relational properties. However, MCML metrics based on model counting show that
the performance can degrade substantially when tested against the entire
(bounded) input space, indicating the high complexity of precisely learning
these properties, and the usefulness of model counting in quantifying the true
performance.
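As a rough illustration of what evaluating a model over an entire bounded input space means, the sketch below computes whole-space accuracy and F1-score by exhaustive enumeration; the property, the learned rule, and the function names are hypothetical, and enumeration merely stands in for the model counters that MCML actually employs.

```python
from itertools import product

def whole_space_metrics(predict, ground_truth, num_bits):
    """Compare a classifier against a ground-truth property over every input
    in a bounded space of num_bits boolean features."""
    tp = tn = fp = fn = 0
    for x in product([0, 1], repeat=num_bits):
        actual, predicted = ground_truth(x), predict(x)
        if actual and predicted:
            tp += 1
        elif not actual and not predicted:
            tn += 1
        elif predicted:
            fp += 1
        else:
            fn += 1
    accuracy = (tp + tn) / 2 ** num_bits
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

# Hypothetical example: the property "at least two of three bits are set",
# approximated by a learned rule that only checks the first two bits.
truth = lambda x: sum(x) >= 2
learned = lambda x: x[0] == 1 and x[1] == 1
print(whole_space_metrics(learned, truth, num_bits=3))  # (0.75, 0.667)
```

On a favourable test sample such a rule can look nearly perfect, while the whole-space metrics expose the inputs on which it systematically fails; this is the gap the MCML metrics are designed to reveal.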
Improving the Efficiency of Gibbs Sampling for Probabilistic Logical Models by Means of Program Specialization
There is currently considerable interest in probabilistic logical models. A popular algorithm for approximate probabilistic inference with such models is Gibbs sampling. From a computational perspective, Gibbs sampling boils down to repeatedly executing certain queries on a knowledge base composed of a static part and a dynamic part. The larger the static part, the more redundancy there is in these repeated calls. This is problematic since inefficient Gibbs sampling yields poor approximations.
We show how to apply program specialization to make Gibbs sampling more efficient. Concretely, we develop an algorithm that specializes the definitions of the query predicates with respect to the static part of the knowledge base. In experiments on real-world benchmarks we obtain speedups of up to an order of magnitude.
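The sketch below shows the repeated-conditional-query pattern of a Gibbs sampler for two binary variables; the joint distribution and function names are invented for illustration and are not from the paper. In the probabilistic logical setting, each conditional would instead be answered by a query against the knowledge base (static rules plus the current sample), and specialization precomputes the part of that query that depends only on the static rules.

```python
import random

JOINT = {  # P(a, b) for a toy model
    (0, 0): 0.3, (0, 1): 0.2,
    (1, 0): 0.1, (1, 1): 0.4,
}

def conditional(var_index, other_value):
    """P(var = 1 | other variable = other_value), derived from the toy joint."""
    if var_index == 0:  # P(a = 1 | b)
        num = JOINT[(1, other_value)]
        den = JOINT[(0, other_value)] + JOINT[(1, other_value)]
    else:               # P(b = 1 | a)
        num = JOINT[(other_value, 1)]
        den = JOINT[(other_value, 0)] + JOINT[(other_value, 1)]
    return num / den

def gibbs(num_samples, burn_in=1000):
    a, b = 0, 0
    count_a = 0
    for i in range(burn_in + num_samples):
        # Each iteration re-evaluates the conditionals: this is the repeated,
        # partially redundant work that program specialization targets.
        a = 1 if random.random() < conditional(0, b) else 0
        b = 1 if random.random() < conditional(1, a) else 0
        if i >= burn_in:
            count_a += a
    return count_a / num_samples

print(gibbs(100_000))  # approximates P(a = 1) = 0.5
```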
Learning Directed Probabilistic Logical Models from Relational Data (Het leren van gerichte probabilistisch-logische modellen uit relationele gegevens)
Machine learning is concerned with algorithms that allow computers to learn models from data. An important application of machine learning is knowledge discovery, which is the automated extraction of useful patterns from data. Two kinds of models that have received special attention are probabilistic models and logical models (models using elements of logic programming or first-order logic). The advantage of the former is the ability to model stochastic or noisy data, the advantage of the latter is the ability to handle relational data. There is a growing interest in combining these advantages, by using so-called probabilistic logical models.
In this dissertation we focus on directed probabilistic logical models. We introduce Logical Bayesian Networks (LBNs), a formalism for representing such models, and compare it to related formalisms. The most important difference from other formalisms is that in LBNs we quantify probabilistic dependencies using logical probability trees (instead of conditional probability tables and combining rules). This has the advantage that context-specific independencies can be captured.
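As a hypothetical illustration of such a probability tree (the predicates and numbers below are invented, not taken from the dissertation), note how one branch never tests the course's difficulty; that omission is exactly a context-specific independence that a full conditional probability table could not express compactly.

```python
def p_passes(student, course, kb):
    """Toy logical probability tree for P(passes(Student, Course) = true).
    Internal nodes test logical conditions on the knowledge base; leaves hold
    probabilities."""
    if kb["studies_hard"](student):
        if kb["difficult"](course):
            return 0.6
        return 0.9
    return 0.3  # same leaf whether the course is easy or difficult:
                # given a student who does not study hard, passing is
                # modelled as independent of the course's difficulty

kb = {
    "studies_hard": lambda s: s in {"an"},
    "difficult": lambda c: c in {"logic"},
}
print(p_passes("an", "logic", kb))  # 0.6
```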
The main part of this dissertation is concerned with the development of algorithms for learning LBNs from relational data. Since probability trees are a central component of LBNs, we need an accurate and efficient learning algorithm for probability trees. For this reason we first perform an extensive experimental comparison of several such algorithms, using relational data as well as attribute-value (non-relational) data.
We introduce two algorithms for learning non-recursive LBNs. The first algorithm is based on searching over directed acyclic graphs and is relatively close to existing learning algorithms for formalisms related to LBNs. The second algorithm is based on searching over orderings. Experiments on relational data show that the two algorithms are comparable in terms of the quality of the learned models and that searching over orderings is significantly more efficient.
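The following sketch conveys the general idea behind searching over orderings, assuming a generic local score function; it exhaustively enumerates the orderings of a toy problem and is not the dissertation's algorithm, which searches the ordering space heuristically.

```python
from itertools import combinations, permutations

def best_parents(var, candidates, score, max_parents=2):
    """For a fixed ordering, the best parent set of each variable can be chosen
    independently among its predecessors; `score` is an assumed local scoring
    function (e.g. a penalized log-likelihood)."""
    best = (score(var, ()), ())
    for k in range(1, max_parents + 1):
        for parents in combinations(candidates, k):
            s = score(var, parents)
            if s > best[0]:
                best = (s, parents)
    return best

def search_over_orderings(variables, score, max_parents=2):
    """Score every ordering (feasible only for toy problems) and return the
    best structure found: a map from each variable to its parent set."""
    best_total, best_structure = float("-inf"), None
    for ordering in permutations(variables):
        total, structure = 0.0, {}
        for i, var in enumerate(ordering):
            s, parents = best_parents(var, ordering[:i], score, max_parents)
            total += s
            structure[var] = parents
        if total > best_total:
            best_total, best_structure = total, structure
    return best_structure

# Toy score favouring the (hypothetical) dependency intelligence -> grade.
def toy_score(var, parents):
    return 1.0 if var == "grade" and "intelligence" in parents else 0.0

print(search_over_orderings(["intelligence", "grade"], toy_score))
```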
As a next step we show how the above algorithms can be used for learning recursive LBNs under a simplifying assumption. We also introduce an algorithm for learning recursive LBNs that does not require this assumption; we do this by generalizing the algorithm for searching over orderings. Experiments on relational data show that the new algorithm can indeed learn useful recursive dependencies, but that for learning non-recursive dependencies the original algorithm is superior.
- …