Search CORE

12 research outputs found

Combating catastrophic forgetting with developmental compression

Author: Bongard J.
Dellaert F.
Espeholt Lasse
Fernando Chrisantha
Szubert Marcin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/04/2018

Generally intelligent agents exhibit successful behavior across problems in several settings. Endemic in approaches to realize such intelligence in machines is catastrophic forgetting: sequential learning corrupts knowledge obtained earlier in the sequence, or tasks antagonistically compete for system resources. Methods for obviating catastrophic forgetting have sought to identify and preserve features of the system necessary to solve one problem when learning to solve another, or to enforce modularity such that minimally overlapping sub-functions contain task specific knowledge. While successful, both approaches scale poorly because they require larger architectures as the number of training instances grows, causing different parts of the system to specialize for separate subsets of the data. Here we present a method for addressing catastrophic forgetting called developmental compression. It exploits the mild impacts of developmental mutations to lessen adverse changes to previously-evolved capabilities and 'compresses' specialized neural networks into a generalized one. In the absence of domain knowledge, developmental compression produces systems that avoid overt specialization, alleviating the need to engineer a bespoke system for every task permutation and suggesting better scalability than existing approaches. We validate this method on a robot control problem and hope to extend this approach to other machine learning domains in the future.

arXiv.org e-Print Archive

Crossref

Multi-task Deep Reinforcement Learning with PopArt

Author: Czarnecki Wojciech
Espeholt Lasse
Hessel Matteo
Schmitt Simon
Soyer Hubert
van Hasselt Hado
Publication date: 12/09/2018

The reinforcement learning community has made great strides in designing algorithms capable of exceeding human performance on specific tasks. These algorithms are mostly trained one task at a time, with each new task requiring a brand new agent instance to be trained. This means the learning algorithm is general, but each solution is not; each agent can only solve the one task it was trained on. In this work, we study the problem of learning to master not one but multiple sequential-decision tasks at once. A general issue in multi-task learning is that a balance must be found between the needs of multiple tasks competing for the limited resources of a single learning system. Many learning algorithms can get distracted by certain tasks in the set of tasks to solve. Such tasks appear more salient to the learning process, for instance because of the density or magnitude of the in-task rewards. This causes the algorithm to focus on those salient tasks at the expense of generality. We propose to automatically adapt the contribution of each task to the agent's updates, so that all tasks have a similar impact on the learning dynamics. This resulted in state-of-the-art performance on learning to play all games in a set of 57 diverse Atari games. Excitingly, our method learned a single trained policy, with a single set of weights, that exceeds median human performance. To our knowledge, this was the first time a single agent surpassed human-level performance on this multi-task domain. The same approach also demonstrated state-of-the-art performance on a set of 30 tasks in the 3D reinforcement learning platform DeepMind Lab.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Teaching Machines to Read and Comprehend

Author: Blunsom Phil
Espeholt Lasse
Grefenstette Edward
Hermann Karl Moritz
Kay Will
Kočiský Tomáš
Suleyman Mustafa
Publication date: 19/11/2015

Teaching machines to read natural language documents remains an elusive challenge. Machine reading systems can be tested on their ability to answer questions posed on the contents of documents that they have seen, but until now large scale training and test datasets have been missing for this type of evaluation. In this work we define a new methodology that resolves this bottleneck and provides large scale supervised reading comprehension data. This allows us to develop a class of attention based deep neural networks that learn to read real documents and answer complex questions with minimal prior knowledge of language structure. Comment: Appears in: Advances in Neural Information Processing Systems 28 (NIPS 2015). 14 pages, 13 figures

arXiv.org e-Print Archive

Oxford University Research Archive

Neural Packet Classification

Author: Espeholt Lasse
Gauci Jason
Kolmogorov AN
Qu Yun R.
Valadarsky Asaf
Xiong Zheng
Publication date: 26/02/2019

Packet classification is a fundamental problem in computer networking. This problem exposes a hard tradeoff between computation and state complexity, which makes it particularly challenging. To navigate this tradeoff, existing solutions rely on complex hand-tuned heuristics, which are brittle and hard to optimize. In this paper, we propose a deep reinforcement learning (RL) approach to solve the packet classification problem. There are several characteristics that make this problem a good fit for Deep RL. First, many existing solutions iteratively build a decision tree by splitting nodes in the tree. Second, the effects of these actions (e.g., splitting nodes) can only be evaluated once we are done with building the tree. These two characteristics are naturally captured by the ability of RL to take actions that have sparse and delayed rewards. Third, it is computationally efficient to generate data traces and evaluate decision trees, which alleviates the notoriously high sample complexity problem of Deep RL algorithms. Our solution, NeuroCuts, uses succinct representations to encode state and action space, and efficiently explores candidate decision trees to optimize for a global objective. It produces compact decision trees optimized for a specific set of rules and a given performance metric, such as classification time, memory footprint, or a combination of the two. Evaluation on ClassBench shows that NeuroCuts outperforms existing hand-crafted algorithms in classification time by 18% at the median, and reduces both time and memory footprint by up to 3x.

arXiv.org e-Print Archive

Crossref

Qd-tree: Learning Data Layouts for Big Data Analytics

Author: Agrawal Sanjay
Bruno Nicolas
Espeholt Lasse
Idreos Stratos
Liang Eric
Marcus Ryan
Moritz Philipp
Sun Liwen
Theo
Zilio Daniel C
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/04/2020

Corporations today collect data at an unprecedented and accelerating scale, making the need to run queries on large datasets increasingly important. Technologies such as columnar block-based data organization and compression have become standard practice in most commercial database systems. However, the problem of best assigning records to data blocks on storage is still open. For example, today's systems usually partition data by arrival time into row groups, or range/hash partition the data based on selected fields. For a given workload, however, such techniques are unable to optimize for the important metric of the number of blocks accessed by a query. This metric directly relates to the I/O cost, and therefore performance, of most analytical queries. Further, they are unable to exploit additional available storage to drive this metric down further. In this paper, we propose a new framework called a query-data routing tree, or qd-tree, to address this problem, and propose two algorithms for their construction based on greedy and deep reinforcement learning techniques. Experiments over benchmark and real workloads show that a qd-tree can provide physical speedups of more than an order of magnitude compared to current blocking schemes, and can reach within 2X of the lower bound for data skipping based on selectivity, while providing complete semantic descriptions of created blocks. Comment: ACM SIGMOD 202

arXiv.org e-Print Archive

Crossref

Google Research Football: A Novel Reinforcement Learning Environment

Author: Bachem Olivier
Bousquet Olivier
Espeholt Lasse
Gelly Sylvain
Kurach Karol
Michalski Marcin
Raichuk Anton
Riquelme Carlos
Stańczyk Piotr
Vincent Damien
Zając Michał
Publication date: 01/01/2020

Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner. We introduce the Google Research Football Environment, a new reinforcement learning environment where agents are trained to play football in an advanced, physics-based 3D simulator. The resulting environment is challenging, easy to use and customize, and it is available under a permissive open-source license. In addition, it provides support for multiplayer and multi-agent experiments. We propose three full-game scenarios of varying difficulty with the Football Benchmarks and report baseline results for three commonly used reinforcement learning algorithms (IMPALA, PPO, and Ape-X DQN). We also provide a diverse set of simpler scenarios with the Football Academy and showcase several promising research directions.

arXiv.org e-Print Archive

Jagiellonian University Repository

Association for the Advancement of Artificial Intelligence: AAAI Publications

Google research football : a novel reinforcement learning environment

Author: Bachem Olivier
Bousquet Olivier
Espeholt Lasse
Gelly Sylvain
Kurach Karol
Michalski Marcin
Raichuk Anton
Riquelme Carlos
Stańczyk Piotr
Vincent Damien
Zając Michał
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 01/01/2020

Jagiellonian University Repository

Association for the Advancement of Artificial Intelligence: AAAI Publications