Neural Natural Language Inference Models Enhanced with External Knowledge
Modeling natural language inference is a very challenging task. With the
availability of large annotated data, it has recently become feasible to train
complex models such as neural-network-based inference models, which have been
shown to achieve state-of-the-art performance. Although relatively large
annotated datasets exist, can machines learn all knowledge needed to perform
natural language inference (NLI) from these data? If not, how can
neural-network-based NLI models benefit from external knowledge and how to
build NLI models to leverage it? In this paper, we enrich the state-of-the-art
neural natural language inference models with external knowledge. We
demonstrate that the proposed models improve neural NLI models to achieve the
state-of-the-art performance on the SNLI and MultiNLI datasets.
Comment: Accepted by ACL 201
Enabling scalable stochastic gradient-based inference for Gaussian processes by employing the Unbiased LInear System SolvEr (ULISSE)
In applications of Gaussian processes where quantification of uncertainty is
of primary interest, it is necessary to accurately characterize the posterior
distribution over covariance parameters. This paper proposes an adaptation of
the Stochastic Gradient Langevin Dynamics algorithm to draw samples from the
posterior distribution over covariance parameters with negligible bias and
without the need to compute the marginal likelihood. In Gaussian process
regression, this has the enormous advantage that stochastic gradients can be
computed by solving linear systems only. A novel unbiased linear systems solver
based on parallelizable covariance matrix-vector products is developed to
accelerate the unbiased estimation of gradients. The results demonstrate the
possibility to enable scalable and exact (in a Monte Carlo sense)
quantification of uncertainty in Gaussian processes without imposing any
special structure on the covariance or reducing the number of input vectors.
Comment: 10 pages - paper accepted at ICML 201
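The claim that stochastic gradients need only linear solves can be made concrete with the standard Gaussian process marginal-likelihood gradient. The derivation below is a sketch in assumed notation that the abstract does not fix: K is the covariance matrix, y the targets, theta_i a covariance parameter, r_j random probe vectors with identity covariance, and epsilon_t the Langevin step size. It shows the generic identities, not the paper's specific solver.

```latex
% Gradient of the GP log marginal likelihood w.r.t. a covariance parameter
\begin{align*}
\frac{\partial}{\partial \theta_i} \log p(y \mid \theta)
  &= \frac{1}{2}\, y^{\top} K^{-1} \frac{\partial K}{\partial \theta_i} K^{-1} y
   - \frac{1}{2}\, \operatorname{tr}\!\left( K^{-1} \frac{\partial K}{\partial \theta_i} \right) \\
% Unbiased stochastic trace estimate, with E[r_j r_j^T] = I
\operatorname{tr}\!\left( K^{-1} \frac{\partial K}{\partial \theta_i} \right)
  &\approx \frac{1}{N_r} \sum_{j=1}^{N_r} r_j^{\top} K^{-1} \frac{\partial K}{\partial \theta_i}\, r_j \\
% Stochastic Gradient Langevin Dynamics step using the gradient estimate \hat{g}
\theta_{t+1}
  &= \theta_t + \frac{\epsilon_t}{2}\, \hat{g}(\theta_t) + \eta_t,
  \qquad \eta_t \sim \mathcal{N}(0, \epsilon_t I)
\end{align*}
```

Every term involves K only through the solves K^{-1} y and K^{-1} r_j, so neither the log-determinant nor the marginal likelihood itself has to be evaluated. Here \hat{g} stands for an unbiased estimate of the log-posterior gradient, that is, the expression above plus the gradient of the log prior.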
End-to-End Differentiable Proving
We introduce neural networks for end-to-end differentiable proving of queries
to knowledge bases by operating on dense vector representations of symbols.
These neural networks are constructed recursively by taking inspiration from
the backward chaining algorithm as used in Prolog. Specifically, we replace
symbolic unification with a differentiable computation on vector
representations of symbols using a radial basis function kernel, thereby
combining symbolic reasoning with learning subsymbolic vector representations.
By using gradient descent, the resulting neural network can be trained to infer
facts from a given incomplete knowledge base. It learns to (i) place
representations of similar symbols in close proximity in a vector space, (ii)
make use of such similarities to prove queries, (iii) induce logical rules, and
(iv) use provided and induced logical rules for multi-hop reasoning. We
demonstrate that this architecture outperforms ComplEx, a state-of-the-art
neural link prediction model, on three out of four benchmark knowledge bases
while at the same time inducing interpretable function-free first-order logic
rules.
Comment: NIPS 2017 camera-ready, NIPS 201
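The idea of replacing symbolic unification with a kernel on symbol embeddings can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: the Gaussian form of the kernel, the bandwidth `mu`, the hand-picked embeddings, and the min/max aggregation over a flat list of facts are all assumptions made for illustration, and no rule induction or backward chaining is shown.

```python
import math

def rbf_unify(a, b, mu=1.0):
    """Soft unification score in (0, 1]: 1.0 for identical vectors,
    decaying with squared distance (assumed Gaussian RBF form)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * mu ** 2))

def prove(query, facts, emb, mu=1.0):
    """Score a (predicate, subject, object) query against every fact.
    A single proof is as strong as its weakest unification (min);
    the query score is the best proof found (max)."""
    def score(fact):
        return min(rbf_unify(emb[q], emb[f], mu) for q, f in zip(query, fact))
    return max(score(fact) for fact in facts)

# Hypothetical 2-d embeddings; in a trained model these would be learned.
emb = {
    "grandfatherOf": [1.0, 0.0],
    "grandpaOf": [0.9, 0.1],    # close to grandfatherOf
    "locatedIn": [-1.0, 0.5],   # far from grandfatherOf
    "abe": [0.0, 1.0],
    "bart": [0.5, -0.5],
}
facts = [("grandpaOf", "abe", "bart"), ("locatedIn", "abe", "bart")]

# The query predicate never appears in the facts, yet it is provable
# with a high score because its embedding is near that of grandpaOf.
query_score = prove(("grandfatherOf", "abe", "bart"), facts, emb)
```

Because every step is differentiable in the embeddings, a score like `query_score` could in principle be pushed toward 1 for known facts by gradient descent, which is what places similar symbols close together.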
Bridging Symbolic and Sub-Symbolic AI: Towards Cooperative Transfer Learning in Multi-Agent Systems
Cooperation and knowledge sharing are of paramount importance in the evolution of an intelligent species. Knowledge sharing requires a set of symbols with a shared interpretation, enabling effective communication supporting cooperation. The engineering of intelligent systems may then benefit from the distribution of knowledge among multiple components capable of cooperation and symbolic knowledge sharing. Accordingly, in this paper, we propose a roadmap for the exploitation of knowledge representation and sharing to foster higher degrees of artificial intelligence. We do so by envisioning intelligent systems as composed of multiple agents capable of cooperative (transfer) learning, Co(T)L for short. In CoL, agents can improve their local (sub-symbolic) knowledge by exchanging (symbolic) information with one another. In CoTL, agents can also learn new tasks autonomously by sharing information about similar tasks. Along this line, we motivate the introduction of Co(T)L and discuss benefits and feasibility
Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes
I argue that data becomes temporarily interesting by itself to some
self-improving, but computationally limited, subjective observer once he learns
to predict or compress the data in a better way, thus making it subjectively
simpler and more beautiful. Curiosity is the desire to create or discover more
non-random, non-arbitrary, regular data that is novel and surprising not in the
traditional sense of Boltzmann and Shannon but in the sense that it allows for
compression progress because its regularity was not yet known. This drive
maximizes interestingness, the first derivative of subjective beauty or
compressibility, that is, the steepness of the learning curve. It motivates
exploring infants, pure mathematicians, composers, artists, dancers, comedians,
yourself, and (since 1990) artificial systems.
Comment: 35 pages, 3 figures, based on KES 2008 keynote and ALT 2007 / DS 2007 joint invited lectur
Better Exploration with Optimistic Actor-Critic
Actor-critic methods, a type of model-free Reinforcement Learning, have been
successfully applied to challenging tasks in continuous control, often
achieving state-of-the-art performance. However, wide-scale adoption of these
methods in real-world domains is made difficult by their poor sample
efficiency. We address this problem both theoretically and empirically. On the
theoretical side, we identify two phenomena preventing efficient exploration in
existing state-of-the-art algorithms such as Soft Actor Critic. First,
combining a greedy actor update with a pessimistic estimate of the critic leads
to the avoidance of actions that the agent does not know about, a phenomenon we
call pessimistic underexploration. Second, current algorithms are directionally
uninformed, sampling actions with equal probability in opposite directions from
the current mean. This is wasteful, since we typically need actions taken along
certain directions much more than others. To address both of these phenomena,
we introduce a new algorithm, Optimistic Actor Critic, which approximates a
lower and upper confidence bound on the state-action value function. This
allows us to apply the principle of optimism in the face of uncertainty to
perform directed exploration using the upper bound while still using the lower
bound to avoid overestimation. We evaluate OAC in several challenging
continuous control tasks, achieving state-of-the-art sample efficiency.
Comment: 20 pages (including supplement
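The contrast between a pessimistic lower bound and an optimistic upper bound can be sketched with two critics. This is an illustrative simplification under stated assumptions, not the paper's algorithm: the bounds are taken as the mean of two critic estimates minus or plus a multiple of their spread, the multipliers `beta_lb` and `beta_ub` and the toy critics are hypothetical, and exploration picks from a small discrete candidate set rather than shifting a continuous policy.

```python
def confidence_bounds(q1, q2, beta_lb, beta_ub):
    """Lower and upper bounds on a state-action value from two critic estimates.
    The spread is the population standard deviation of the two values."""
    mean = (q1 + q2) / 2.0
    spread = abs(q1 - q2) / 2.0
    return mean - beta_lb * spread, mean + beta_ub * spread

def pick_action(candidates, critic_a, critic_b, beta_lb, beta_ub, optimistic):
    """Choose the candidate maximizing the upper bound (optimistic)
    or the lower bound (pessimistic, greedy)."""
    index = 1 if optimistic else 0
    return max(
        candidates,
        key=lambda a: confidence_bounds(critic_a(a), critic_b(a), beta_lb, beta_ub)[index],
    )

# Hypothetical critics that agree at a = 0 and disagree more as |a| grows.
critic_a = lambda a: -a * a
critic_b = lambda a: -a * a + a

candidates = [-1.0, 0.0, 1.0]
greedy = pick_action(candidates, critic_a, critic_b, 1.0, 2.0, optimistic=False)
explore = pick_action(candidates, critic_a, critic_b, 1.0, 2.0, optimistic=True)
print(greedy, explore)  # prints: 0.0 1.0
```

With `beta_lb = 1.0` the lower bound equals the minimum of the two critics, so the greedy choice stays at the well-understood action 0.0, which is the pessimistic underexploration described above. The upper bound instead favors 1.0, where the critics disagree and the value could be higher.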
On learning history based policies for controlling Markov decision processes
Reinforcement learning (RL) folklore suggests that history-based function approximation methods, such as
recurrent neural nets or history-based state abstraction, perform better than
their memory-less counterparts, due to the fact that function approximation in
Markov decision processes (MDPs) can be viewed as inducing a partially
observable MDP. However, there has been little formal analysis of such
history-based algorithms, as most existing frameworks focus exclusively on
memory-less features. In this paper, we introduce a theoretical framework for
studying the behaviour of RL algorithms that learn to control an MDP using
history-based feature abstraction mappings. Furthermore, we use this framework
to design a practical RL algorithm and we numerically evaluate its
effectiveness on a set of continuous control tasks
Exploring the Promise and Limits of Real-Time Recurrent Learning
Real-time recurrent learning (RTRL) for sequence-processing recurrent neural
networks (RNNs) offers certain conceptual advantages over backpropagation
through time (BPTT). RTRL requires neither caching past activations nor
truncating context, and enables online learning. However, RTRL's time and space
complexity make it impractical. To overcome this problem, most recent work on
RTRL focuses on approximation theories, while experiments are often limited to
diagnostic settings. Here we explore the practical promise of RTRL in more
realistic settings. We study actor-critic methods that combine RTRL and policy
gradients, and test them in several subsets of DMLab-30, ProcGen, and
Atari-2600 environments. On DMLab memory tasks, our system trained on fewer
than 1.2 B environmental frames is competitive with or outperforms well-known
IMPALA and R2D2 baselines trained on 10 B frames. To scale to such challenging
tasks, we focus on certain well-known neural architectures with element-wise
recurrence, allowing for tractable RTRL without approximation. We also discuss
rarely addressed limitations of RTRL in real-world applications, such as its
complexity in the multi-layer case
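Why element-wise recurrence keeps RTRL tractable can be seen in the smallest possible case. The sketch below is an assumption-laden toy, not the architecture studied in the paper: a single unit h_t = tanh(w * h_{t-1} + u * x_t) with a squared-error loss, where the names `w`, `u`, `s_w`, and `s_u` are hypothetical. Because each unit depends only on its own previous state, the sensitivity of the state to each parameter is a single number carried forward in time, so no past activations are stored.

```python
import math

def rtrl_gradients(w, u, xs, ys):
    """Online exact gradients of sum_t 0.5 * (h_t - y_t)^2 for one recurrent unit."""
    h = 0.0            # hidden state
    s_w = s_u = 0.0    # sensitivities dh_t/dw and dh_t/du, updated forward in time
    g_w = g_u = 0.0    # accumulated loss gradients
    loss = 0.0
    for x, y in zip(xs, ys):
        h_prev = h
        h = math.tanh(w * h_prev + u * x)
        d = 1.0 - h * h                  # derivative of tanh at the pre-activation
        # Chain rule through the recurrence, using the previous sensitivities
        s_w = d * (h_prev + w * s_w)
        s_u = d * (x + w * s_u)
        err = h - y
        loss += 0.5 * err * err
        g_w += err * s_w                 # gradient available immediately at step t
        g_u += err * s_u
    return loss, g_w, g_u

xs = [0.5, -1.0, 0.25, 0.8]
ys = [0.1, -0.2, 0.0, 0.3]
loss, g_w, g_u = rtrl_gradients(0.5, -0.3, xs, ys)
```

For a layer of n such units the sensitivities cost O(n) per parameter group per step, whereas a fully connected recurrence would require tracking how every state depends on every weight. Stacking layers reintroduces cross-unit dependencies, which relates to the multi-layer limitation noted above.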