End-to-end neuro-symbolic learning of logic-based inference

Abstract

Artificial Intelligence has long taken the human mind as a point of inspiration. One remarkable feat of the human brain is its ability to seamlessly reconcile low-level sensory inputs such as vision with high-level abstract reasoning over symbols that denote objects and rules. Inspired by this, neuro-symbolic computing attempts to bring together advances in connectionist architectures such as artificial neural networks with the principled symbolic inference of logic-based systems. How this integration between the two branches of research can be achieved remains an open question. In this thesis, we tackle neuro-symbolic inference in an end-to-end differentiable fashion from three angles: learning to perform symbolic deduction and manipulation over logic programs, learning to recognise and leverage variables through unification across data points, and inducing symbolic rules directly from non-symbolic inputs such as images. We begin by proposing a novel neural network model, Iterative Memory Attention (IMA), to ascertain the level of symbolic deduction and manipulation neural networks can achieve over logic programs of increasing complexity. We demonstrate that our approach outperforms existing neural network models and analyse the vector representations learnt by our model, observing that the principal components of the continuous real-valued embedding space align with constructs of logic programs such as the arity of predicates and the types of rules. We then focus on a key component of symbolic inference: variables. Humans leverage variables in everyday reasoning to construct high-level abstract rules such as “if someone went somewhere then they are there” instead of mentioning specific people or places. We present a novel end-to-end differentiable neural network architecture, the Unification Network, that is capable of recognising which symbols can act as variables through the application of soft unification.
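The idea of soft unification described above can be illustrated with a minimal sketch. The gating formulation, names (`soft_unify`, `emb`) and toy embeddings below are illustrative assumptions for exposition, not the thesis's exact architecture: a learned "variableness" gate decides how much a symbol's embedding is replaced by the embedding of the symbol it unifies against.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embeddings for two constants and one candidate variable symbol.
# Dimensions and values are illustrative only.
emb = {
    "alice": rng.normal(size=4),
    "bob": rng.normal(size=4),
    "X": rng.normal(size=4),
}

def soft_unify(invariant_sym, concrete_sym, gate):
    """Blend an invariant's symbol embedding with the embedding of the
    symbol it unifies against, weighted by a 'variableness' gate.
    gate >> 0 means the symbol behaves like a variable and takes on the
    matched symbol's representation; gate << 0 keeps it as a constant."""
    g = 1.0 / (1.0 + np.exp(-gate))  # sigmoid keeps the blend weight in (0, 1)
    return g * emb[concrete_sym] + (1.0 - g) * emb[invariant_sym]

# With a strongly positive gate, "X" is effectively rebound to "alice";
# the whole operation stays differentiable in the gate and the embeddings.
unified = soft_unify("X", "alice", gate=8.0)
```

Because the blend is a convex combination, the model can smoothly interpolate during training between treating a symbol as a constant and treating it as a variable, which is what makes unification learnable end to end.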
The by-products of the model are invariants that capture a common underlying principle present in the dataset. Unification Networks exhibit better data efficiency and generalisation to unseen examples than models that do not utilise soft unification. Finally, we turn to the question: how can a neural network learn symbolic rules directly from visual inputs in a coherent manner? We bridge the gap between continuous vector representations and discrete symbolic reasoning by presenting a fully differentiable layer for deep learning architectures called the Semi-symbolic Layer. When stacked within a larger model, Semi-symbolic Layers are able to learn complete logic programs along with continuous representations of image patches directly from pixel-level input in an end-to-end fashion. The resulting model holistically learns objects, relations between them and logical rules. By pruning and thresholding the weights of the Semi-symbolic Layers, we can extract the exact symbolic relations and rules used to reason about the tasks and verify them using symbolic inference engines. Using two datasets, we demonstrate that our approach scales better than existing state-of-the-art symbolic rule learning systems and outperforms previous deep relational neural network architectures.
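To make concrete how a single differentiable unit can carry logical semantics, here is one plausible formulation of a semi-symbolic unit; the specific bias construction and the function name `semi_symbolic` are assumptions for illustration, not necessarily the exact layer used in the thesis. Inputs live in [-1, 1] with +1 read as true and -1 as false; a bias derived from the weight magnitudes shifts a weighted sum so that its sign implements a soft conjunction or disjunction, and tanh keeps the output differentiable and back in [-1, 1].

```python
import numpy as np

def semi_symbolic(x, w, beta=1.0, conjunction=True):
    """One semi-symbolic unit over inputs in [-1, 1].

    delta = max|w| - sum|w| is non-positive; adding beta*delta as a bias
    makes the pre-activation positive only when every weighted input is
    true (conjunction), while subtracting it makes the pre-activation
    positive when at least one weighted input is true (disjunction)."""
    delta = np.max(np.abs(w)) - np.sum(np.abs(w))
    bias = beta * delta if conjunction else -beta * delta
    return np.tanh(np.dot(w, x) + bias)

w = np.array([1.0, 1.0])
and_true = semi_symbolic(np.array([1.0, 1.0]), w)     # both true  -> positive
and_false = semi_symbolic(np.array([1.0, -1.0]), w)   # one false  -> negative
or_true = semi_symbolic(np.array([1.0, -1.0]), w, conjunction=False)
or_false = semi_symbolic(np.array([-1.0, -1.0]), w, conjunction=False)
```

Because the unit is just a biased linear map followed by tanh, its weights can be pruned and thresholded after training, which is what allows a discrete rule to be read off from the learned parameters.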
