4,512 research outputs found
On solving Ordinary Differential Equations using Gaussian Processes
We describe a set of Gaussian Process-based approaches that can be used to
solve non-linear Ordinary Differential Equations. We suggest an explicit
probabilistic solver and two implicit methods, one analogous to Picard
iteration and the other to gradient matching. All methods have greater accuracy
than previously suggested Gaussian Process approaches. We also suggest a
general approach that can yield error estimates from any standard ODE solver.
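To give a flavour of the gradient-matching idea, here is a minimal sketch (not the paper's exact method): a GP is fitted to noisy observations of the toy ODE dx/dt = theta * x, the kernel's derivative yields the posterior mean of dx/dt, and theta is estimated by matching that derivative to the ODE right-hand side. The kernel, length-scale, and toy ODE are illustrative assumptions.

```python
# Gradient-matching sketch with a GP: all settings are toy assumptions.
import numpy as np

rng = np.random.default_rng(0)
theta_true, ell, noise = -0.7, 0.5, 0.05
t = np.linspace(0, 3, 30)
y = np.exp(theta_true * t) + noise * rng.normal(size=t.size)

def rbf(a, b):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell ** 2))

K = rbf(t, t) + (noise ** 2 + 1e-8) * np.eye(t.size)
alpha = np.linalg.solve(K, y)

mean = rbf(t, t) @ alpha                      # GP posterior mean of x(t)
# derivative of the RBF kernel w.r.t. its first argument gives the
# posterior mean of dx/dt at the observation times
dK = -(t[:, None] - t[None, :]) / ell ** 2 * rbf(t, t)
dmean = dK @ alpha

# match the GP derivative to the ODE right-hand side: least squares in theta
theta_hat = (dmean @ mean) / (mean @ mean)
print(theta_hat)  # should be close to theta_true
```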
Common Sense is Not Common. So How Can A Leader Make Good Decisions?
In education today there is a move away from top-down leadership toward a more inclusive, shared, or participative leadership model. This model includes shared decision-making, which has the potential to empower and radically change any organization willing to take the risk of implementing this type of leadership. This article combines another important aspect of leadership with the concept of shared decision-making: servant leadership. With a servant-leader at the helm and shared decision-making in place, a school has the potential to grow in sync with the needs and desires of its stakeholders.
Generative Neural Machine Translation
We introduce Generative Neural Machine Translation (GNMT), a latent variable
architecture which is designed to model the semantics of the source and target
sentences. We modify an encoder-decoder translation model by adding a latent
variable as a language-agnostic representation which is encouraged to learn the
meaning of the sentence. GNMT achieves competitive BLEU scores on pure
translation tasks, and is superior when there are missing words in the source
sentence. We augment the model to facilitate multilingual translation and
semi-supervised learning without adding parameters. This framework
significantly reduces overfitting when there is limited paired data available,
and is effective for translating between pairs of languages not seen during
training.
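For intuition, a plausible training objective for such a model (a hedged reconstruction, not necessarily the paper's exact factorisation) is a variational lower bound in which the latent z generates the source x and, together with x, the target y; missing source words can then be imputed by inference over z.

```latex
\log p(x, y) \;\ge\;
  \mathbb{E}_{q(z \mid x, y)}\!\left[ \log p(x \mid z) + \log p(y \mid x, z) \right]
  \;-\; \mathrm{KL}\!\left( q(z \mid x, y) \,\|\, p(z) \right)
```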
Bayesian Conditional Cointegration
Cointegration is an important topic in time-series analysis, and describes a
relationship between two series in which a linear combination of them is
stationary. Classically, the test for cointegration is based on a two-stage
process in which the linear relation between the series is first estimated by
Ordinary Least Squares, and a unit-root test is then performed on the residuals. A
well-known deficiency of this classical approach is that it can lead to
erroneous conclusions about the presence of cointegration. As an alternative,
we present a framework for estimating whether cointegration exists using
Bayesian inference which is empirically superior to the classical approach.
Finally, we apply our technique to model segmented cointegration, in which
cointegration may exist only for a limited time. In contrast to previous
approaches, our model places no restriction on the number of possible
cointegration segments.
Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012).
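For reference, the classical two-stage procedure the abstract critiques looks roughly like the sketch below (synthetic data; the OLS step omits an intercept for brevity).

```python
# Classical two-stage cointegration test: OLS fit, then an ADF unit-root
# test on the residuals. Data here is synthetic and purely illustrative.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
n = 500
common = np.cumsum(rng.normal(size=n))             # shared random walk
x = common + rng.normal(scale=0.5, size=n)
y = 2.0 * common + rng.normal(scale=0.5, size=n)   # cointegrated with x

# stage 1: estimate the linear relation y = beta * x + resid by OLS
beta = (x @ y) / (x @ x)
resid = y - beta * x

# stage 2: unit-root test on the residuals; a small p-value rejects a unit
# root, i.e. the residuals look stationary and cointegration is inferred
stat, pvalue, *_ = adfuller(resid)
print(f"ADF statistic {stat:.2f}, p-value {pvalue:.3f}")
```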
Generating Sentences Using a Dynamic Canvas
We introduce the Attentive Unsupervised Text (W)riter (AUTR), a word-level
generative model for natural language. It uses a recurrent neural network
with a dynamic attention and canvas memory mechanism to iteratively construct
sentences. By viewing the state of the memory at intermediate stages and where
the model is placing its attention, we gain insight into how it constructs
sentences. We demonstrate that AUTR learns a meaningful latent representation
for each sentence, and achieves competitive log-likelihood lower bounds whilst
being computationally efficient. It is effective at generating and
reconstructing sentences, as well as imputing missing words.
Comment: AAAI 2018.
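The dynamic-canvas idea can be caricatured in a few lines; the sketch below is purely illustrative (the dimensions, parameterisation, and update rule are assumptions, not the AUTR architecture): an RNN state repeatedly emits an attention distribution over word slots and a write vector, and the canvas accumulates the attention-weighted writes.

```python
# Hypothetical dynamic-canvas writer: each step writes an attention-weighted
# vector into per-word canvas slots. All parameters are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
T, D, H, steps = 8, 16, 32, 5          # sentence length, word dim, RNN dim, write steps
Wa = rng.normal(0, 0.1, (H, T))        # attention head (assumed parameters)
Ww = rng.normal(0, 0.1, (H, D))        # write head
Wh = rng.normal(0, 0.1, (H, H))        # recurrent weights

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

canvas = np.zeros((T, D))              # one slot per output word
h = np.zeros(H)
for _ in range(steps):
    h = np.tanh(Wh @ h + 0.1)          # toy recurrence (no input, for brevity)
    attn = softmax(h @ Wa)             # where to write: distribution over slots
    write = h @ Ww                     # what to write
    canvas += np.outer(attn, write)    # attention-weighted update of the canvas
# after `steps` iterations, each row of `canvas` would be decoded into a word
```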
Thinking Fast and Slow with Deep Learning and Tree Search
Sequential decision making problems, such as structured prediction, robotic
control, and game playing, require a combination of planning policies and
generalisation of those plans. In this paper, we present Expert Iteration
(ExIt), a novel reinforcement learning algorithm which decomposes the problem
into separate planning and generalisation tasks. Planning new policies is
performed by tree search, while a deep neural network generalises those plans.
Subsequently, tree search is improved by using the neural network policy to
guide search, increasing the strength of new plans. In contrast, standard deep
Reinforcement Learning algorithms rely on a neural network not only to
generalise plans, but to discover them too. We show that ExIt outperforms
REINFORCE for training a neural network to play the board game Hex, and our
final tree search agent, trained tabula rasa, defeats MoHex 1.0, the most
recent Olympiad Champion player to be publicly released.
Comment: v1 to v2: added a value function in MCTS; changed some MCTS hyper-parameters; repeated experiments, with improved accuracy and errors shown (note the reduction in effect size for the tpt/cat experiment); results from a longer training run, including changes in expert strength in training; comparison to MoHex. v3: clarify independence of ExIt and AG0. v4: see appendix.
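To make the planning/generalisation split concrete, here is a self-contained toy sketch (a one-shot move-selection "game", not Hex; the rollout expert and tabular apprentice are stand-ins for tree search and the deep network):

```python
# Toy Expert Iteration: the "expert" improves on the apprentice by rollouts,
# and the "apprentice" is trained to imitate the expert's improved policy.
import numpy as np

rng = np.random.default_rng(0)
n_moves = 5
hidden_reward = rng.normal(size=n_moves)          # unknown to the agent
apprentice = np.ones(n_moves) / n_moves           # uniform prior policy

def expert(policy, n_rollouts=200):
    """Rollout-based policy improvement seeded by the apprentice prior."""
    scores = np.zeros(n_moves)
    counts = np.zeros(n_moves)
    for _ in range(n_rollouts):
        m = rng.choice(n_moves, p=policy)         # sample moves from the prior
        scores[m] += hidden_reward[m] + rng.normal(scale=0.5)
        counts[m] += 1
    value = np.where(counts > 0, scores / np.maximum(counts, 1), -np.inf)
    target = np.exp(value - value.max())          # softmax over estimated values
    return target / target.sum()                  # expert's improved policy

for _ in range(10):                               # imitation learning loop
    target = expert(apprentice)
    apprentice = 0.5 * apprentice + 0.5 * target  # move apprentice toward expert
print(apprentice.argmax(), hidden_reward.argmax())  # should usually agree
```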
Practical Gauss-Newton Optimisation for Deep Learning
We present an efficient block-diagonal approximation to the Gauss-Newton
matrix for feedforward neural networks. Our resulting algorithm is
competitive against state-of-the-art first-order optimisation methods, with
sometimes significant improvement in optimisation performance. Unlike
first-order methods, for which hyperparameter tuning of the optimisation
parameters is often a laborious process, our approach can provide good
performance even when used with default settings. A side result of our work
is that for piecewise linear transfer functions, the network objective
function can have no differentiable local maxima, which may partially explain
why such transfer functions facilitate effective optimisation.
Comment: ICML 2017.
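As a reminder of the underlying construction (full-matrix and tiny here, whereas the paper's contribution is a layer-wise block-diagonal approximation for deep networks), a damped Gauss-Newton step for a one-layer least-squares model:

```python
# Damped Gauss-Newton on a toy nonlinear least-squares problem: for
# residuals r(w), the matrix J^T J stands in for the Hessian.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = np.tanh(X @ w_true) + 0.05 * rng.normal(size=100)

w = np.zeros(3)
damping = 1e-3
for _ in range(20):
    pred = np.tanh(X @ w)
    r = pred - y                                  # residuals
    J = (1 - pred ** 2)[:, None] * X              # Jacobian of predictions w.r.t. w
    G = J.T @ J + damping * np.eye(3)             # damped Gauss-Newton matrix
    w -= np.linalg.solve(G, J.T @ r)              # Gauss-Newton step
print(w)  # should be close to w_true
```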
Online Structured Laplace Approximations For Overcoming Catastrophic Forgetting
We introduce the Kronecker factored online Laplace approximation for
overcoming catastrophic forgetting in neural networks. The method is grounded
in a Bayesian online learning framework, where we recursively approximate the
posterior after every task with a Gaussian, leading to a quadratic penalty on
changes to the weights. The Laplace approximation requires calculating the
Hessian around a mode, which is typically intractable for modern architectures.
In order to make our method scalable, we leverage recent block-diagonal
Kronecker factored approximations to the curvature. Our algorithm achieves over
90% test accuracy across a sequence of 50 instantiations of the permuted MNIST
dataset, substantially outperforming related methods for overcoming
catastrophic forgetting.
Comment: 13 pages, 6 figures.
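The recursive quadratic penalty can be sketched with a diagonal curvature approximation (the paper uses Kronecker-factored blocks; diagonal is a deliberate simplification here, and all names are illustrative):

```python
# Laplace-style penalty for continual learning, with a *diagonal* Hessian
# approximation as a simplification of the Kronecker-factored version.
import numpy as np

def penalised_loss(w, w_prev, h_diag, task_loss, lam=1.0):
    """New-task loss plus the quadratic penalty anchored at the previous
    task's mode w_prev, weighted by the (diagonal) curvature h_diag."""
    quad = 0.5 * lam * np.sum(h_diag * (w - w_prev) ** 2)
    return task_loss(w) + quad

# toy usage: the first weight has high curvature, so moving it is costly
w_prev = np.array([1.0, -1.0])
h_diag = np.array([10.0, 0.1])
print(penalised_loss(np.zeros(2), w_prev, h_diag, lambda w: np.sum(w ** 2)))
```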