38,398 research outputs found
Efficient prediction of relational structure and its application to natural language processing
Many tasks in Natural Language Processing (NLP) require us to predict a relational
structure over entities. For example, in Semantic Role Labelling we try to predict the
āsemantic roleā relation between a predicate verb and its argument constituents. Often
NLP tasks not only involve related entities but also relations that are stochastically
correlated. For instance, in Semantic Role Labelling the roles of different constituents
are correlated: we cannot assign the agent role to one constituent if we have already
assigned this role to another.
Statistical Relational Learning (also known as First Order Probabilistic Logic) allows
us to capture the aforementioned nature of NLP tasks because it is based on the
notions of entities, relations and stochastic correlations between relationships. It is
therefore often straightforward to formulate an NLP task using a First Order probabilistic
language such as Markov Logic. However, the generality of this approach
comes at a price: the process of finding the relational structure with highest probability,
also known as maximum a posteriori (MAP) inference, is often inefficient, if not
intractable.
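For readers unfamiliar with the formalism: a Markov Logic network attaches a weight $w_i$ to each first-order formula, and the probability of a world $y$ is

    $P(y) \;=\; \frac{1}{Z}\,\exp\Big(\sum_i w_i\, n_i(y)\Big),$

where $n_i(y)$ counts the true groundings of formula $i$ in $y$ and $Z$ normalises; MAP inference seeks $\arg\max_y P(y)$. A hypothetical Markov Logic fragment for Semantic Role Labelling (the predicate names and weight are illustrative only, not taken from this work) pairs a soft preference with the hard at-most-one-agent constraint mentioned above:

    $1.5:\;\; \mathit{headPos}(c, \mathrm{NN}) \wedge \mathit{precedes}(c, v) \Rightarrow \mathit{role}(v, c, \mathrm{Agent})$
    $\infty:\;\; \mathit{role}(v, c_1, \mathrm{Agent}) \wedge c_1 \neq c_2 \Rightarrow \neg\,\mathit{role}(v, c_2, \mathrm{Agent})$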
In this work we seek to improve the efficiency of MAP inference for Statistical
Relational Learning. We propose a meta-algorithm, namely Cutting Plane Inference
(CPI), that iteratively solves small subproblems of the original problem using any
existing MAP technique and inspects parts of the problem that are not yet included in
the current subproblem but could potentially lead to an improved solution. Our hypothesis
is that this algorithm can dramatically improve the efficiency of existing methods
while remaining at least as accurate.
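Read as pseudocode, the loop is compact; the following Python sketch assumes two hypothetical interfaces, a separation routine `separate` and a base MAP solver `solve`, standing in for whatever concrete methods are plugged in:

    # A minimal sketch of the CPI meta-algorithm (interfaces assumed,
    # not the thesis's actual code):
    #   separate(solution) -> set of ground formulas violated by the solution
    #   solve(network)     -> MAP assignment of the partial ground network
    def cutting_plane_inference(separate, solve, max_iters=100):
        network = set()           # ground formulas instantiated so far
        solution = frozenset()    # current assignment (start: all atoms false)
        for _ in range(max_iters):
            violated = separate(solution) - network
            if not violated:      # nothing new is violated: a MAP state
                break
            network |= violated   # add the new "cutting planes" ...
            solution = solve(network)  # ... and re-solve the subproblem
        return solution

Because each call to `solve` sees only the instantiated subnetwork, the base method never has to ground the full problem, which is where the efficiency gain comes from.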
We frame the algorithm in Markov Logic, a language that combines First Order
Logic and Markov Networks. Our hypothesis is evaluated using two tasks: Semantic
Role Labelling and Entity Resolution. It is shown that the proposed algorithm improves
the efficiency of two existing methods by two orders of magnitude and leads an
approximate method to more probable solutions. We also show that CPI, at convergence,
is guaranteed to be at least as accurate as the method used within its inner
loop.
Another core contribution of this work is a theoretical and empirical analysis of the
boundary conditions of Cutting Plane Inference. We describe cases when Cutting Plane
Inference will definitely be difficult (because it instantiates large networks or needs
many iterations) and when it will be easy (because it instantiates small networks and
needs only a few iterations).
Interpreting Embedding Models of Knowledge Bases: A Pedagogical Approach
Knowledge bases are employed in a variety of applications from natural
language processing to semantic web search; alas, in practice their usefulness
is hurt by their incompleteness. Embedding models attain state-of-the-art
accuracy in knowledge base completion, but their predictions are notoriously
hard to interpret. In this paper, we adapt "pedagogical approaches" (from the
literature on neural networks) so as to interpret embedding models by
extracting weighted Horn rules from them. We show how pedagogical approaches
have to be adapted to take on the large-scale relational aspects of knowledge
bases, and we show their strengths and weaknesses experimentally.
Comment: presented at the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), Stockholm, Sweden
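A minimal sketch of the pedagogical idea, with an assumed oracle interface and a deliberately tiny rule template (a single-atom Horn body; the paper's actual extraction is richer): the trained embedding model labels candidate facts, and the weight of a rule head(x, y) <= body(x, y) is estimated from how often the model's belief in the body carries over to the head.

    import itertools

    # Hypothetical interface: score(rel, subj, obj) -> plausibility in [0, 1],
    # as produced by a trained (black-box) embedding model.
    def rule_confidence(score, entities, body_rel, head_rel, thresh=0.5):
        support = hits = 0
        for x, y in itertools.product(entities, repeat=2):
            if score(body_rel, x, y) > thresh:      # model believes the body
                support += 1
                if score(head_rel, x, y) > thresh:  # ... and also the head
                    hits += 1
        return hits / support if support else 0.0   # confidence = rule weight

Ranking candidate (body, head) pairs by this confidence yields weighted Horn rules that summarise what the black-box model has learned.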
kLog: A Language for Logical and Relational Learning with Kernels
We introduce kLog, a novel approach to statistical relational learning.
Unlike standard approaches, kLog does not represent a probability distribution
directly. It is rather a language to perform kernel-based learning on
expressive logical and relational representations. kLog allows users to specify
learning problems declaratively. It builds on simple but powerful concepts:
learning from interpretations, entity/relationship data modeling, logic
programming, and deductive databases. Access by the kernel to the rich
representation is mediated by a technique we call graphicalization: the
relational representation is first transformed into a graph, in particular
a grounded entity/relationship diagram. Subsequently, a choice of graph kernel
defines the feature space. kLog supports mixed numerical and symbolic data, as
well as background knowledge in the form of Prolog or Datalog programs as in
inductive logic programming systems. The kLog framework can be applied to
tackle the same range of tasks that has made statistical relational learning so
popular, including classification, regression, multitask learning, and
collective classification. We also report about empirical comparisons, showing
that kLog can be either more accurate, or much faster at the same level of
accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at
http://klog.dinfo.unifi.it along with tutorials.
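Graphicalization is easy to picture with a toy example (a rough sketch only; kLog itself is embedded in Prolog, not Python): entity and relationship instances become labelled nodes, each relationship node is linked to its argument entities, and any graph kernel over the result defines the feature space.

    import networkx as nx

    def graphicalize(entities, relationships):
        """Toy grounded E/R graph: entities are (id, type) pairs,
        relationships are (type, [arg_ids]) tuples."""
        g = nx.Graph()
        for eid, etype in entities:
            g.add_node(eid, label=etype)
        for i, (rtype, args) in enumerate(relationships):
            rid = f"{rtype}_{i}"          # one node per relationship tuple
            g.add_node(rid, label=rtype)
            for a in args:
                g.add_edge(rid, a)        # link it to its argument entities
        return g

    g = graphicalize([("w1", "word"), ("w2", "word")],
                     [("next", ["w1", "w2"])])

The choice of kernel (e.g. a Weisfeiler-Lehman style subtree kernel) then determines which graph substructures act as features.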
End-to-End Differentiable Proving
We introduce neural networks for end-to-end differentiable proving of queries
to knowledge bases by operating on dense vector representations of symbols.
These neural networks are constructed recursively by taking inspiration from
the backward chaining algorithm as used in Prolog. Specifically, we replace
symbolic unification with a differentiable computation on vector
representations of symbols using a radial basis function kernel, thereby
combining symbolic reasoning with learning subsymbolic vector representations.
By using gradient descent, the resulting neural network can be trained to infer
facts from a given incomplete knowledge base. It learns to (i) place
representations of similar symbols in close proximity in a vector space, (ii)
make use of such similarities to prove queries, (iii) induce logical rules, and
(iv) use provided and induced logical rules for multi-hop reasoning. We
demonstrate that this architecture outperforms ComplEx, a state-of-the-art
neural link prediction model, on three out of four benchmark knowledge bases
while at the same time inducing interpretable function-free first-order logic
rules.
Comment: NIPS 2017 camera-ready
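The core substitution is easy to sketch: where Prolog's unification of two symbols either succeeds or fails, the differentiable prover compares their embeddings with an RBF kernel and returns a graded success score. The snippet below uses the standard Gaussian form exp(-||u - v||^2 / (2 mu^2)) with made-up vectors; the paper's exact kernel and its recursive aggregation of proof scores are more involved.

    import numpy as np

    def soft_unify(u, v, mu=1.0):
        # Graded stand-in for symbolic unification: symbols "unify" to the
        # degree that their embeddings are close in vector space.
        return np.exp(-np.sum((u - v) ** 2) / (2.0 * mu ** 2))

    grandpa = np.array([0.90, 0.10, 0.30])
    grandfather = np.array([0.85, 0.15, 0.28])
    print(soft_unify(grandpa, grandfather))   # near 1: similar relations unify
    print(soft_unify(grandpa, np.zeros(3)))   # much lower: dissimilar symbols

Because the score is differentiable, gradient descent can pull together the vectors of symbols that should unify (e.g. grandpaOf and grandfatherOf) during training.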
- …