Exact Combinatorial Optimization with Graph Convolutional Neural Networks
Combinatorial optimization problems are typically tackled by the
branch-and-bound paradigm. We propose a new graph convolutional neural network
model for learning branch-and-bound variable selection policies, which
leverages the natural variable-constraint bipartite graph representation of
mixed-integer linear programs. We train our model via imitation learning from
the strong branching expert rule, and demonstrate on a series of hard problems
that our approach produces policies that improve upon state-of-the-art
machine-learning methods for branching and generalize to instances
significantly larger than seen during training. Moreover, we improve for the
first time over expert-designed branching rules implemented in a
state-of-the-art solver on large problems. Code for reproducing all the
experiments can be found at https://github.com/ds4dm/learn2branch. Comment: Accepted paper at the NeurIPS 2019 conference.
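The variable-constraint bipartite graph the abstract describes can be sketched in a few lines: variables and constraints are the two node sets, and the nonzero coefficients of the constraint matrix are the edges. The following is a toy illustration of one round of bipartite message passing, not the paper's actual architecture; the scalar node features, weight shapes, and ReLU half-convolution here are illustrative assumptions.

```python
import numpy as np

# A tiny MILP: minimize c^T x subject to A x <= b, x integer.
# Variables and constraints form a bipartite graph whose edges are
# the nonzero coefficients of A.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])   # 2 constraints x 3 variables
b = np.array([4.0, 6.0])
c = np.array([1.0, -1.0, 2.0])

# Initial node features: one scalar per node for simplicity.
var_feat = c.reshape(-1, 1)       # (3, 1)
con_feat = b.reshape(-1, 1)       # (2, 1)

def half_conv(adj, src_feat, dst_feat, W):
    """One direction of a bipartite graph convolution: each destination
    node aggregates its neighbours' features, weighted by the edge
    coefficients, then applies a linear map and a ReLU."""
    msg = adj @ src_feat                          # weighted neighbour sum
    return np.maximum(0.0, (dst_feat + msg) @ W)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(1, 4))
W2 = rng.normal(size=(4, 4))

# Variable -> constraint pass, then constraint -> variable pass.
con_hidden = half_conv(A, var_feat, con_feat, W1)            # (2, 4)
var_hidden = half_conv(A.T, con_hidden, var_feat @ W1, W2)   # (3, 4)

# A final per-variable score could rank branching candidates.
scores = var_hidden.sum(axis=1)                              # (3,)
```

In the paper's setting, such per-variable scores would be trained by imitation to match the strong-branching expert's ranking; here the weights are random and the scores are only shape-correct placeholders.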
Going in circles is the way forward: the role of recurrence in visual inference
Biological visual systems exhibit abundant recurrent connectivity.
State-of-the-art neural network models for visual recognition, by contrast,
rely heavily or exclusively on feedforward computation. Any finite-time
recurrent neural network (RNN) can be unrolled along time to yield an
equivalent feedforward neural network (FNN). This important insight suggests
that computational neuroscientists may not need to engage recurrent
computation, and that computer-vision engineers may be limiting themselves to a
special case of FNN if they build recurrent models. Here we argue, to the
contrary, that FNNs are a special case of RNNs and that computational
neuroscientists and engineers should engage recurrence to understand how brains
and machines can (1) achieve greater and more flexible computational depth, (2)
compress complex computations into limited hardware, (3) integrate priors and
priorities into visual inference through expectation and attention, (4) exploit
sequential dependencies in their data for better inference and prediction, and
(5) leverage the power of iterative computation.
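The unrolling equivalence the abstract opens with can be made concrete: running a recurrent network for T steps computes the same function as a feedforward stack of T layers that all share the same weights. A minimal sketch (the tanh cell and sizes are arbitrary choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
W_in = rng.normal(size=(4, 3))   # input -> hidden weights
W_rec = rng.normal(size=(4, 4))  # hidden -> hidden (recurrent) weights

def rnn(xs, h0):
    """Run the recurrent network for len(xs) time steps."""
    h = h0
    for x in xs:
        h = np.tanh(W_in @ x + W_rec @ h)
    return h

def unrolled_fnn(xs, h0):
    """The same computation as a feedforward stack: one layer per
    time step, every layer sharing the same weights."""
    layers = [lambda h, x=x: np.tanh(W_in @ x + W_rec @ h) for x in xs]
    h = h0
    for layer in layers:
        h = layer(h)
    return h

xs = [rng.normal(size=3) for _ in range(5)]
h0 = np.zeros(4)
assert np.allclose(rnn(xs, h0), unrolled_fnn(xs, h0))
```

The weight sharing is the point of the abstract's argument: the unrolled FNN is a special, highly constrained feedforward network, which is why the authors frame FNNs, rather than RNNs, as the restricted case.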
Selective algorithms for large-scale classification and structured learning
The desired output in many machine learning tasks is a structured object, such as a tree, a clustering, or a
sequence. Learning accurate prediction models for such problems requires training on large amounts
of data, making use of expressive features and performing global inference that simultaneously
assigns values to all interrelated nodes in the structure. All these contribute to significant scalability
problems. In this thesis, we describe a collection of results that address several aspects of these
problems – by carefully selecting and caching samples, structures, or latent items.
Our results lead to efficient learning algorithms for large-scale binary classification models,
structured prediction models, and online clustering models, which, in turn, support reductions in
problem size, improvements in training and evaluation speed, and improved performance. We have
used our algorithms to learn expressive models from large amounts of annotated data and achieve
state-of-the-art performance on several natural language processing tasks.
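The sample-selection idea behind the abstract can be illustrated with a generic margin-based selective perceptron; this is a hedged sketch of the principle, not the thesis's actual algorithms, and the data, margin, and update rule below are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic linearly separable data, a stand-in for a large training set.
true_w = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(500, 3))
y = np.sign(X @ true_w)
y[y == 0] = 1.0

w = np.zeros(3)
selected = 0
margin = 0.5

for x, label in zip(X, y):
    score = w @ x
    # Selection rule: only examine examples the current model is unsure
    # about (small margin) or gets wrong; confidently-correct examples
    # are skipped entirely, saving computation.
    if abs(score) < margin or np.sign(score) != label:
        selected += 1
        if label * score <= 0:          # misclassified: perceptron update
            w += label * x

accuracy = np.mean(np.sign(X @ w) == y)
```

After one pass, most confidently-classified examples were never touched, which is the scalability lever: the model trains on a small informative subset rather than on every example.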