Combating catastrophic forgetting with developmental compression
Generally intelligent agents exhibit successful behavior across problems in
several settings. Endemic in approaches to realize such intelligence in
machines is catastrophic forgetting: sequential learning corrupts knowledge
obtained earlier in the sequence, or tasks antagonistically compete for system
resources. Methods for obviating catastrophic forgetting have sought to
identify and preserve features of the system necessary to solve one problem
when learning to solve another, or to enforce modularity such that minimally
overlapping sub-functions contain task specific knowledge. While successful,
both approaches scale poorly because they require larger architectures as the
number of training instances grows, causing different parts of the system to
specialize for separate subsets of the data. Here we present a method for
addressing catastrophic forgetting called developmental compression. It
exploits the mild impacts of developmental mutations to lessen adverse changes
to previously-evolved capabilities and 'compresses' specialized neural networks
into a generalized one. In the absence of domain knowledge, developmental
compression produces systems that avoid overt specialization, alleviating the
need to engineer a bespoke system for every task permutation and suggesting
better scalability than existing approaches. We validate this method on a robot
control problem and hope to extend this approach to other machine learning
domains in the future.
Multi-task Deep Reinforcement Learning with PopArt
The reinforcement learning community has made great strides in designing
algorithms capable of exceeding human performance on specific tasks. These
algorithms are mostly trained one task at a time, with each new task requiring
a brand new agent instance to be trained. This means the learning algorithm is general,
but each solution is not; each agent can only solve the one task it was trained
on. In this work, we study the problem of learning to master not one but
multiple sequential-decision tasks at once. A general issue in multi-task
learning is that a balance must be found between the needs of multiple tasks
competing for the limited resources of a single learning system. Many learning
algorithms can get distracted by certain tasks in the set of tasks to solve.
Such tasks appear more salient to the learning process, for instance because of
the density or magnitude of the in-task rewards. This causes the algorithm to
focus on those salient tasks at the expense of generality. We propose to
automatically adapt the contribution of each task to the agent's updates, so
that all tasks have a similar impact on the learning dynamics. This resulted in
state-of-the-art performance on learning to play all games in a set of 57
diverse Atari games. Excitingly, our method learned a single trained policy,
with a single set of weights, that exceeds median human performance. To our
knowledge, this was the first time a single agent surpassed human-level
performance on this multi-task domain. The same approach also demonstrated
state-of-the-art performance on a set of 30 tasks in the 3D reinforcement
learning platform DeepMind Lab.
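
The abstract does not spell out how each task's contribution is adapted. As a minimal sketch, assuming the PopArt scheme named in the title (targets are normalized with running per-task statistics, and the final layer is rescaled so that unnormalized predictions are preserved), a normalizer for one task might look like the following. The class name, the scalar final layer, and the step size beta are illustrative assumptions, not the paper's implementation.

```python
import math


class PopArtNormalizer:
    """Running target statistics for one task, with output-preserving rescaling.

    Illustrative only: the final layer is reduced to a scalar weight and bias.
    """

    def __init__(self, beta=0.1):
        self.beta = beta  # step size for the running statistics (assumed value)
        self.mu = 0.0     # running mean of the targets
        self.nu = 1.0     # running second moment of the targets
        self.w = 1.0      # weight of the (scalar) final linear layer
        self.b = 0.0      # bias of the final linear layer

    @property
    def sigma(self):
        # Standard deviation from the first two moments, clamped for safety.
        return math.sqrt(max(self.nu - self.mu ** 2, 1e-8))

    def normalized_output(self, h):
        return self.w * h + self.b

    def unnormalized_output(self, h):
        return self.sigma * self.normalized_output(h) + self.mu

    def normalize(self, target):
        # Targets on this scale have comparable magnitude across tasks.
        return (target - self.mu) / self.sigma

    def update(self, target):
        old_mu, old_sigma = self.mu, self.sigma
        # Adaptively rescale targets: move the statistics toward the new target.
        self.mu = (1 - self.beta) * self.mu + self.beta * target
        self.nu = (1 - self.beta) * self.nu + self.beta * target ** 2
        new_sigma = self.sigma
        # Preserve outputs: undo the effect of the new statistics on predictions.
        self.w = self.w * old_sigma / new_sigma
        self.b = (old_sigma * self.b + old_mu - self.mu) / new_sigma
```

With one such normalizer per task, a task with large or dense rewards produces normalized targets on roughly the same scale as any other task, while calling update leaves unnormalized_output unchanged for any fixed input h.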
Teaching Machines to Read and Comprehend
Teaching machines to read natural language documents remains an elusive
challenge. Machine reading systems can be tested on their ability to answer
questions posed on the contents of documents that they have seen, but until now
large scale training and test datasets have been missing for this type of
evaluation. In this work we define a new methodology that resolves this
bottleneck and provides large scale supervised reading comprehension data. This
allows us to develop a class of attention based deep neural networks that learn
to read real documents and answer complex questions with minimal prior
knowledge of language structure.
Comment: Appears in: Advances in Neural Information Processing Systems 28
(NIPS 2015). 14 pages, 13 figures
Neural Packet Classification
Packet classification is a fundamental problem in computer networking. This
problem exposes a hard tradeoff between the computation and state complexity,
which makes it particularly challenging. To navigate this tradeoff, existing
solutions rely on complex hand-tuned heuristics, which are brittle and hard to
optimize. In this paper, we propose a deep reinforcement learning (RL) approach
to solve the packet classification problem. There are several characteristics
that make this problem a good fit for Deep RL. First, many of the existing
solutions iteratively build a decision tree by splitting nodes in the
tree. Second, the effects of these actions (e.g., splitting nodes) can only be
evaluated once the tree has been fully built. These two characteristics
are naturally captured by the ability of RL to take actions that have sparse
and delayed rewards. Third, it is computationally efficient to generate data
traces and evaluate decision trees, which alleviates the notoriously high sample
complexity problem of Deep RL algorithms. Our solution, NeuroCuts, uses
succinct representations to encode the state and action space, and efficiently
explores candidate decision trees to optimize for a global objective. It
produces compact decision trees optimized for a specific set of rules and a
given performance metric, such as classification time, memory footprint, or a
combination of the two. Evaluation on ClassBench shows that NeuroCuts
outperforms existing hand-crafted algorithms in classification time by 18% at
the median, and reduces both time and memory footprint by up to 3x.
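
The sparse, delayed reward described above can be illustrated with a toy example. This is a minimal sketch under strong simplifying assumptions: rules are half-open ranges on a single dimension, an action is the number of equal-width cuts applied to a node, and the reward is the negative tree depth (a stand-in for classification time). The function names and the fixed policies are hypothetical and are not taken from NeuroCuts.

```python
def cut(node_range, rules, k):
    """Split a node's range into k equal-width children and route rules to them."""
    lo, hi = node_range
    k = max(2, min(k, hi - lo))  # keep every child at least one unit wide
    width = (hi - lo) // k
    children = []
    for i in range(k):
        c_lo = lo + i * width
        c_hi = hi if i == k - 1 else c_lo + width
        # A rule belongs to a child if their half-open ranges overlap.
        child_rules = [r for r in rules if r[0] < c_hi and c_lo < r[1]]
        children.append(((c_lo, c_hi), child_rules))
    return children


def tree_depth(node_range, rules, policy, leaf_size=1, depth=0):
    """Build the tree by applying the policy at every node; return its depth."""
    lo, hi = node_range
    if len(rules) <= leaf_size or hi - lo <= 1:
        return depth  # leaf: few enough rules, or the range cannot be cut further
    k = policy(node_range, rules)
    return max(tree_depth(r, rs, policy, leaf_size, depth + 1)
               for r, rs in cut(node_range, rules, k))


def episode_reward(node_range, rules, policy):
    """The reward is only known once the whole tree has been built."""
    return -tree_depth(node_range, rules, policy)


# Four hypothetical rules covering the range [0, 16).
rules = [(0, 4), (4, 8), (8, 12), (12, 16)]
binary_reward = episode_reward((0, 16), rules, lambda rng, rs: 2)
four_way_reward = episode_reward((0, 16), rules, lambda rng, rs: 4)
```

In this toy setting the four-way policy yields a shallower tree than the binary one, but neither policy receives any signal until its tree is complete, which is the property that makes the problem resemble an RL task with delayed rewards.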
Qd-tree: Learning Data Layouts for Big Data Analytics
Corporations today collect data at an unprecedented and accelerating scale,
making the need to run queries on large datasets increasingly important.
Technologies such as columnar block-based data organization and compression
have become standard practice in most commercial database systems. However, the
problem of best assigning records to data blocks on storage is still open. For
example, today's systems usually partition data by arrival time into row
groups, or range/hash partition the data based on selected fields. For a given
workload, however, such techniques are unable to optimize for the important
metric of the number of blocks accessed by a query. This metric directly
relates to the I/O cost, and therefore performance, of most analytical queries.
Further, they are unable to exploit additional available storage to drive this
metric down further.
In this paper, we propose a new framework called a query-data routing tree,
or qd-tree, to address this problem, and propose two algorithms for its
construction based on greedy and deep reinforcement learning techniques.
Experiments over benchmark and real workloads show that a qd-tree can provide
physical speedups of more than an order of magnitude compared to current
blocking schemes, and can reach within 2X of the lower bound for data skipping
based on selectivity, while providing complete semantic descriptions of created
blocks.
Comment: ACM SIGMOD 202
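
The blocks-accessed metric can be made concrete with a small example. The sketch below assumes simple min/max block metadata and inclusive range predicates. It compares an arrival-time layout with a layout produced by a single cut on one column, and it is not the qd-tree construction algorithm itself. The record fields, the cut value, and the function names are hypothetical.

```python
def block_ranges(block, columns):
    """Per-column (min, max) metadata describing one block of records."""
    return {c: (min(r[c] for r in block), max(r[c] for r in block)) for c in columns}


def blocks_accessed(blocks, query):
    """Count blocks whose min/max ranges overlap every predicate in the query.

    query maps a column name to an inclusive (lo, hi) range.
    """
    count = 0
    for block in blocks:
        ranges = block_ranges(block, query.keys())
        if all(ranges[c][0] <= hi and lo <= ranges[c][1]
               for c, (lo, hi) in query.items()):
            count += 1
    return count


# Hypothetical records, listed in arrival order.
prices = [5, 90, 10, 80, 20, 70, 30, 60]
records = [{"ts": i, "price": p} for i, p in enumerate(prices)]

# Layout A: partition by arrival time into two blocks of four records.
by_arrival = [records[:4], records[4:]]

# Layout B: route records with a single cut on price, as a tree node might.
by_price = [[r for r in records if r["price"] < 50],
            [r for r in records if r["price"] >= 50]]

query = {"price": (0, 25)}
arrival_cost = blocks_accessed(by_arrival, query)  # both blocks overlap the range
price_cost = blocks_accessed(by_price, query)      # only the low-price block does
```

Both layouts store the same records in the same number of blocks, yet the query touches two blocks under the arrival-time layout and one under the price cut, which is the kind of difference a workload-aware layout aims to exploit.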
Google Research Football: A Novel Reinforcement Learning Environment
Recent progress in the field of reinforcement learning has been accelerated
by virtual learning environments such as video games, where novel algorithms
and ideas can be quickly tested in a safe and reproducible manner. We introduce
the Google Research Football Environment, a new reinforcement learning
environment where agents are trained to play football in an advanced,
physics-based 3D simulator. The resulting environment is challenging, easy to
use and customize, and it is available under a permissive open-source license.
In addition, it provides support for multiplayer and multi-agent experiments.
We propose three full-game scenarios of varying difficulty with the Football
Benchmarks and report baseline results for three commonly used reinforcement
learning algorithms (IMPALA, PPO, and Ape-X DQN). We also provide a diverse set of
simpler scenarios with the Football Academy and showcase several promising
research directions.