TensorLog: A Differentiable Deductive Database
Large knowledge bases (KBs) are useful in many tasks, but it is unclear how
to integrate this sort of knowledge into "deep" gradient-based learning
systems. To address this problem, we describe a probabilistic deductive
database, called TensorLog, in which reasoning uses a differentiable process.
In TensorLog, each clause in a logical theory is first converted into a certain
type of factor graph. Then, for each type of query to the factor graph, the
message-passing steps required to perform belief propagation (BP) are
"unrolled" into a function, which is differentiable. We show that these
functions can be composed recursively to perform inference in non-trivial
logical theories containing multiple interrelated clauses and predicates. Both
compilation and inference in TensorLog are efficient: compilation is linear in
theory size and proof depth, and inference is linear in database size and the
number of message-passing steps used in BP. We also present experimental
results with TensorLog and discuss its relationship to other first-order
probabilistic logics.
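To make the compilation idea concrete, here is a minimal NumPy sketch (my own illustration, not the TensorLog code; the entities and facts are invented) of how a single chain-like clause such as uncle(X, Y) :- brother(X, Z), parent(Z, Y) reduces, after unrolling BP on its factor graph, to a chain of differentiable matrix-vector products:

```python
import numpy as np

entities = ["alice", "bob", "carol"]
idx = {e: i for i, e in enumerate(entities)}
n = len(entities)

def relation_matrix(pairs):
    # Encode a binary predicate as an n x n adjacency matrix.
    M = np.zeros((n, n))
    for s, o in pairs:
        M[idx[s], idx[o]] = 1.0
    return M

M_brother = relation_matrix([("bob", "alice")])    # brother(bob, alice)
M_parent = relation_matrix([("alice", "carol")])   # parent(alice, carol)

def query_uncle(x):
    # uncle(X, Y) :- brother(X, Z), parent(Z, Y): unrolled BP on this chain
    # clause reduces to two (sparse) matrix-vector products.
    v = np.zeros(n)
    v[idx[x]] = 1.0
    return v @ M_brother @ M_parent

print(query_uncle("bob"))   # all mass lands on "carol"
```

Because the query is just a composition of linear maps and (in general) nonlinearities, gradients with respect to fact confidences flow through it like any other neural layer.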
SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver
Integrating logical reasoning within deep learning architectures has been a
major goal of modern AI systems. In this paper, we propose a new direction
toward this goal by introducing a differentiable (smoothed) maximum
satisfiability (MAXSAT) solver that can be integrated into the loop of larger
deep learning systems. Our (approximate) solver is based upon a fast coordinate
descent approach to solving the semidefinite program (SDP) associated with the
MAXSAT problem. We show how to analytically differentiate through the solution
to this SDP and efficiently solve the associated backward pass. We demonstrate
that by integrating this solver into end-to-end learning systems, we can learn
the logical structure of challenging problems in a minimally supervised
fashion. In particular, we show that we can learn the parity function using
single-bit supervision (a traditionally hard task for deep networks) and learn
how to play 9x9 Sudoku solely from examples. We also solve a "visual Sudoku"
problem that maps images of Sudoku puzzles to their associated logical
solutions by combining our MAXSAT solver with a traditional convolutional
architecture. Our approach thus shows promise in integrating logical structures
within deep learning.
Comment: Accepted at ICML'19. The code can be found at https://github.com/locuslab/satne
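As a rough sketch of the forward pass (an assumption-laden paraphrase, not the released SATNet code), the low-rank SDP relaxation can be solved by a "mixing" coordinate-descent scheme: each variable i is relaxed to a unit vector v_i, and the objective <S^T S, V^T V> is minimized one column at a time with a closed-form update:

```python
import numpy as np

def mixing_method(S, k=8, iters=50, seed=0):
    # S: (m_clauses x n_vars) clause weight matrix; V: (k x n_vars) unit columns.
    rng = np.random.default_rng(seed)
    m, n = S.shape
    V = rng.standard_normal((k, n))
    V /= np.linalg.norm(V, axis=0, keepdims=True)
    C = S.T @ S                                   # SDP cost matrix
    for _ in range(iters):
        for i in range(n):
            g = V @ C[:, i] - C[i, i] * V[:, i]   # gradient w.r.t. v_i (off-diagonal part)
            nrm = np.linalg.norm(g)
            if nrm > 1e-12:
                V[:, i] = -g / nrm                # closed-form block update
    return V

# toy usage: two soft clauses over three variables, signs encode literals
S = np.array([[1.0, -1.0, 0.0],
              [0.0,  1.0, 1.0]])
V = mixing_method(S)
print(np.round(V.T @ V, 2))   # relaxed "correlations" between variables
```

Randomized rounding of the columns of V against a hyperplane then recovers a discrete assignment; the paper's contribution is differentiating analytically through this fixed point so the clause matrix S itself can be learned.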
Neural Logic Machines
We propose the Neural Logic Machine (NLM), a neural-symbolic architecture for
both inductive learning and logic reasoning. NLMs exploit the power of both
neural networks (as function approximators) and logic programming (as a
symbolic processor for objects with properties, relations, logic connectives,
and quantifiers). After being trained on small-scale tasks (such as sorting
short arrays), NLMs can recover lifted rules, and generalize to large-scale
tasks (such as sorting longer arrays). In our experiments, NLMs achieve perfect
generalization in a number of tasks, from relational reasoning tasks on the
family tree and general graphs, to decision making tasks including sorting
arrays, finding shortest paths, and playing the blocks world. Most of these
tasks are hard to accomplish for neural networks or inductive logic programming
alone.
Comment: ICLR 2019. Project page: https://sites.google.com/view/neural-logic-machine
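A simplified single NLM-style layer can be sketched as follows (illustrative only, with invented shapes; the actual architecture also permutes arguments and handles higher arities): unary predicate tensors are expanded to binary ones, binary tensors are reduced back to unary ones by existential quantification (a max), and learned transforms mix the results.

```python
import numpy as np

def nlm_layer(unary, binary, W1, W2):
    # unary: (n, p1) per-object predicates; binary: (n, n, p2) per-pair predicates.
    n, p1 = unary.shape
    expand = np.broadcast_to(unary[None, :, :], (n, n, p1))  # unary -> binary ("expand")
    reduce_ = binary.max(axis=1)                             # binary -> unary ("reduce", exists)
    new_unary = np.tanh(np.concatenate([unary, reduce_], axis=-1) @ W1)
    new_binary = np.tanh(np.concatenate([binary, expand], axis=-1) @ W2)
    return new_unary, new_binary

# toy usage
n, p1, p2, h = 4, 2, 3, 8
rng = np.random.default_rng(0)
unary, binary = rng.random((n, p1)), rng.random((n, n, p2))
W1, W2 = rng.standard_normal((p1 + p2, h)), rng.standard_normal((p2 + p1, h))
u2, b2 = nlm_layer(unary, binary, W1, W2)
print(u2.shape, b2.shape)   # (4, 8) (4, 4, 8)
```

Because the weights act on predicate channels rather than on particular objects, the same layer applies to domains of any size, which is what lets the learned rules generalize from short arrays to longer ones.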
A Semantic Loss Function for Deep Learning with Symbolic Knowledge
This paper develops a novel methodology for using symbolic knowledge in deep
learning. From first principles, we derive a semantic loss function that
bridges between neural output vectors and logical constraints. This loss
function captures how close the neural network is to satisfying the constraints
on its output. An experimental evaluation shows that it effectively guides the
learner to achieve (near-)state-of-the-art results on semi-supervised
multi-class classification. Moreover, it significantly increases the ability of
the neural network to predict structured objects, such as rankings and paths.
These discrete concepts are tremendously difficult to learn, and benefit from a
tight integration of deep learning and symbolic reasoning methods.
Comment: This version appears in the Proceedings of the 35th International Conference on Machine Learning (ICML 2018).
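As a worked example of the definition (my own, with made-up probabilities), the semantic loss of a probability vector p under the "exactly-one" constraint is the negative log of the total probability mass that the network assigns to satisfying assignments, i.e. -log sum_i p_i * prod_{j != i} (1 - p_j):

```python
import numpy as np

def semantic_loss_exactly_one(p):
    # Probability mass on assignments with exactly one true variable.
    p = np.asarray(p, dtype=float)
    sat_mass = sum(p[i] * np.prod(np.delete(1.0 - p, i)) for i in range(len(p)))
    return -np.log(sat_mass)

print(semantic_loss_exactly_one([0.9, 0.05, 0.05]))  # ~0.20: nearly one-hot, low loss
print(semantic_loss_exactly_one([0.5, 0.5, 0.5]))    # ~0.98: ambiguous, higher loss
```

The loss is differentiable in p, so it can simply be added to the usual training objective to push unlabeled outputs toward constraint-satisfying predictions.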
Differentiable Representations For Multihop Inference Rules
We present efficient differentiable implementations of second-order multi-hop
reasoning using a large symbolic knowledge base (KB). We introduce a new
operation which can be used to compositionally construct second-order multi-hop
templates in a neural model, and evaluate a number of alternative
implementations, with different time and memory trade-offs. These techniques
scale to KBs with millions of entities and tens of millions of triples, and
lead to simple models with competitive performance on several learning tasks
requiring multi-hop reasoning.
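A hedged sketch of the underlying "relation following" primitive (not the paper's implementation; the tiny KB below is invented): the KB is stored as one matrix per relation, and a soft attention vector over relations selects, differentiably, which relation to follow at each hop.

```python
import numpy as np

n_entities = 5
relations = {"born_in": 0, "capital_of": 1}
M = np.zeros((len(relations), n_entities, n_entities))  # one matrix per relation
M[relations["born_in"], 0, 3] = 1.0        # entity 0 born_in entity 3
M[relations["capital_of"], 3, 4] = 1.0     # entity 3 capital_of entity 4

def follow(x, rel_attention):
    # x: distribution over entities; rel_attention: soft weights over relations.
    mixed = np.tensordot(rel_attention, M, axes=1)   # attention-weighted relation matrix
    return x @ mixed

x0 = np.eye(n_entities)[0]
hop1 = follow(x0, np.array([1.0, 0.0]))    # follow born_in
hop2 = follow(hop1, np.array([0.0, 1.0]))  # then capital_of
print(hop2)                                # mass on entity 4
```

At real KB scale the relation matrices are sparse, and the paper's "second-order" operations additionally treat the choice of relation itself as the output of a learned model.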
Lifted Relational Neural Networks
We propose a method combining relational-logic representations with neural
network learning. A general lifted architecture, possibly reflecting some
background domain knowledge, is described through relational rules which may be
handcrafted or learned. The relational rule-set serves as a template for
unfolding possibly deep neural networks whose structures also reflect the
structures of given training or testing relational examples. Different networks
corresponding to different examples share their weights, which co-evolve during
training by stochastic gradient descent. The framework allows for
hierarchical relational modeling constructs and learning of latent relational
concepts through shared hidden-layer weights corresponding to the rules.
Discovery of notable relational concepts and experiments on 78 relational
learning benchmarks demonstrate favorable performance of the method.
Comment: Expanded section on weight learning, added explanation of relationship to convolutional neural networks
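The template idea can be illustrated with a toy unfolding of a single weighted rule headPred(X) :- rel(X, Y), bodyPred(Y) (my own construction, not the authors' code): each grounding of the body becomes a node, groundings are aggregated over Y, and a shared rule weight feeds the head neuron.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unfold_rule(rel, body, w_rule):
    # rel: (n, n) 0/1 matrix for rel(X, Y); body: (n,) soft truth of bodyPred(Y).
    groundings = rel * body[None, :]       # value of each grounded body, one per pair (X, Y)
    aggregated = groundings.max(axis=1)    # aggregate over the groundings of Y for each X
    return sigmoid(w_rule * aggregated)    # weighted head neuron headPred(X)

# toy usage: three objects, rel given as an adjacency matrix
rel = np.array([[0, 1, 1],
                [0, 0, 1],
                [0, 0, 0]], dtype=float)
body = np.array([0.2, 0.9, 0.4])
print(unfold_rule(rel, body, w_rule=3.0))
```

Because the rule weight w_rule is shared across all examples whose networks were unfolded from the same template, gradient updates on one example affect the rule everywhere it is used.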
GXNOR-Net: Training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework
There is a pressing need for an architecture that subsumes networks with
discrete (binary or ternary) weights and activations under a unified framework
achieving both higher performance and less overhead. To this end, two fundamental issues are yet to be addressed. The
first one is how to implement the back propagation when neuronal activations
are discrete. The second one is how to remove the full-precision hidden weights
in the training phase to break the bottlenecks of memory/computation
consumption. To address the first issue, we present a multi-step neuronal
activation discretization method and a derivative approximation technique that
enable implementing the backpropagation algorithm on discrete DNNs. For the
second issue, we propose a discrete state transition (DST) methodology
to constrain the weights in a discrete space without saving the hidden weights.
In this way, we build a unified framework that subsumes binary and
ternary networks as its special cases, under which a heuristic training algorithm is
provided at https://github.com/AcrossV/Gated-XNOR. More
particularly, we find that when both the weights and activations become ternary
values, the DNNs can be reduced to sparse binary networks, termed gated XNOR
networks (GXNOR-Nets), since only the event of a non-zero weight coinciding with a non-zero
activation enables the control gate to trigger the XNOR logic operations of the
original binary networks. This promises event-driven hardware designs for
efficient mobile intelligence. We achieve competitive performance compared with
state-of-the-art algorithms. Furthermore, the computational sparsity and the
number of states in the discrete space can be flexibly modified to make it
suitable for various hardware platforms.
Comment: 11 pages, 13 figures
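A minimal sketch of the two ingredients the abstract names, under my own simplifying assumptions (this is not the released Gated-XNOR code): a ternary quantizer for the discrete forward pass and a rectangular surrogate derivative for the backward pass.

```python
import numpy as np

def ternarize(x, delta=0.5):
    # Forward: map values to {-1, 0, +1} with a dead zone of half-width delta.
    return np.sign(x) * (np.abs(x) > delta)

def ternarize_grad(x, width=1.0):
    # Backward: the true derivative is zero almost everywhere, so approximate
    # it with a rectangular window so gradients can flow through the quantizer.
    return (np.abs(x) <= width).astype(float)

x = np.array([-1.3, -0.2, 0.4, 0.9])
print(ternarize(x))        # [-1. -0.  0.  1.]
print(ternarize_grad(x))   # [0. 1. 1. 1.]
```

The DST methodology goes further by updating the discrete weight states directly (with probabilistic jumps between states), which is what removes the need to keep full-precision shadow weights in memory.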
Simple, Distributed, and Accelerated Probabilistic Programming
We describe a simple, low-level approach for embedding probabilistic
programming in a deep learning ecosystem. In particular, we distill
probabilistic programming down to a single abstraction---the random variable.
Our lightweight implementation in TensorFlow enables numerous applications: a
model-parallel variational auto-encoder (VAE) with 2nd-generation tensor
processing units (TPUv2s); a data-parallel autoregressive model (Image
Transformer) with TPUv2s; and a multi-GPU No-U-Turn Sampler (NUTS). For both a
state-of-the-art VAE on 64x64 ImageNet and Image Transformer on 256x256
CelebA-HQ, our approach achieves an optimal linear speedup from 1 to 256 TPUv2
chips. With NUTS, we see a 100x speedup on GPUs over Stan and 37x over PyMC3.
Comment: Appears in Neural Information Processing Systems, 2018. Code available at http://bit.ly/2JpFip
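A pared-down illustration of the random-variable abstraction (plain NumPy of my own, not the actual TensorFlow implementation): constructing a random variable draws a sample and records how to score it, so a model is just a Python function that builds random variables, and tracers can intercept construction to do inference.

```python
import numpy as np

rng = np.random.default_rng(0)

class RandomVariable:
    # The single abstraction: a value sampled at construction plus a density.
    def __init__(self, name, sample_fn, log_prob_fn):
        self.name = name
        self.value = sample_fn(rng)
        self.log_prob = log_prob_fn

def Normal(name, loc, scale):
    return RandomVariable(
        name,
        sample_fn=lambda r: r.normal(loc, scale),
        log_prob_fn=lambda x: -0.5 * ((x - loc) / scale) ** 2
                              - np.log(scale * np.sqrt(2.0 * np.pi)),
    )

def model():
    z = Normal("z", 0.0, 1.0)             # latent variable
    x = Normal("x", 2.0 * z.value, 0.5)   # observation model
    return z, x

z, x = model()
print(z.value, x.value, z.log_prob(z.value) + x.log_prob(x.value))  # joint log-density
```

Everything else (variational inference, MCMC, distribution over accelerators) is layered on top of functions that manipulate these random variables.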
Neural Query Language: A Knowledge Base Query Language for Tensorflow
Large knowledge bases (KBs) are useful for many AI tasks, but are difficult
to integrate into modern gradient-based learning systems. Here we describe a
framework for accessing a soft symbolic database using only differentiable
operators. For example, this framework makes it easy to write
neural models that adjust confidences associated with facts in a soft KB;
incorporate prior knowledge in the form of hand-coded KB access rules; or learn
to instantiate query templates using information extracted from text. NQL can
work well with KBs with millions of tuples and hundreds of thousands of
entities on a single GPU.
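A hedged sketch of the kind of differentiable KB operators involved (my own toy example, not the NQL library): entity sets are weighted vectors, relations are matrices, and queries compose via vector-matrix products ("follow"), addition ("or"), and elementwise products ("and").

```python
import numpy as np

entity = {"usa": 0, "canada": 1, "ottawa": 2, "washington": 3}
n = len(entity)

capital_of = np.zeros((n, n))                         # relation as a matrix
capital_of[entity["ottawa"], entity["canada"]] = 1.0
capital_of[entity["washington"], entity["usa"]] = 1.0

def one_hot(name):
    v = np.zeros(n)
    v[entity[name]] = 1.0
    return v

cities = one_hot("ottawa") + one_hot("washington")    # "or" of two singleton sets
countries = cities @ capital_of                       # follow capital_of
print(countries)                                      # mass on canada and usa
print(countries * one_hot("canada"))                  # "and" with another set
```

Because every operator is linear algebra, fact confidences (the matrix entries) and query templates can be trained end to end with gradient descent.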
Learning Relational Representations with Auto-encoding Logic Programs
Deep learning methods capable of handling relational data have proliferated
over recent years. In contrast to traditional relational learning methods
that leverage first-order logic for representing such data, these deep learning
methods aim at re-representing symbolic relational data in Euclidean spaces.
They offer better scalability, but can only numerically approximate relational
structures and are less flexible in terms of reasoning tasks supported. This
paper introduces a novel framework for relational representation learning that
combines the best of both worlds. This framework, inspired by the auto-encoding
principle, uses first-order logic as a data representation language, and the
mapping between the original and latent representation is done by means of
logic programs instead of neural networks. We show how learning can be cast as
a constraint optimisation problem for which existing solvers can be used. The
use of logic as a representation language makes the proposed framework more
accurate (as the representation is exact, rather than approximate), more
flexible, and more interpretable than deep learning methods. We experimentally
show that these latent representations are indeed beneficial in relational
learning tasks.
Comment: 8 pages, 4 figures, paper + supplement, published at IJCAI