Learning with Latent Language

Authors: Andreas Jacob; Klein Dan; Levine Sergey
Publication venue
Publication date: 01/11/2017
Field of study

The named concepts and compositional operators present in natural language provide a rich source of information about the kinds of abstractions humans use to navigate the world. Can this linguistic background knowledge improve the generality and efficiency of learned classifiers and control policies? This paper aims to show that using the space of natural language strings as a parameter space is an effective way to capture natural task structure. In a pretraining phase, we learn a language interpretation model that transforms inputs (e.g. images) into outputs (e.g. labels) given natural language descriptions. To learn a new concept (e.g. a classifier), we search directly in the space of descriptions to minimize the interpreter's loss on training examples. Crucially, our models do not require language data to learn these concepts: language is used only in pretraining to impose structure on subsequent learning. Results on image classification, text editing, and reinforcement learning show that, in all settings, models with a linguistic parameterization outperform those without.
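The idea of learning a concept by searching over descriptions can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `interpret` is a hand-written stand-in for the pretrained interpretation model, `CANDIDATES` is a small enumerable description space, and the search is exhaustive, whereas the paper's models are learned and its search procedure is not specified in this abstract.

```python
import string


def interpret(description, word):
    """Hypothetical stand-in for a pretrained interpreter.

    Maps (description, input) to a boolean label. A real interpreter
    would be a learned model; this one parses three fixed templates.
    """
    kind, arg = description.rsplit(" ", 1)
    if kind == "starts with":
        return word.startswith(arg)
    if kind == "ends with":
        return word.endswith(arg)
    if kind == "contains":
        return arg in word
    raise ValueError(f"unknown description: {description!r}")


# Assumed description space: every template paired with every letter.
CANDIDATES = [
    f"{kind} {letter}"
    for kind in ("starts with", "ends with", "contains")
    for letter in string.ascii_lowercase
]


def loss(description, examples):
    """Number of training examples the interpreter labels incorrectly."""
    return sum(interpret(description, x) != y for x, y in examples)


def learn_concept(examples):
    """Pick the description that minimizes the interpreter's loss.

    The interpreter stays fixed; the description string is the only
    'parameter' being fitted, and no language data is used here.
    """
    return min(CANDIDATES, key=lambda d: loss(d, examples))


train = [("cats", True), ("dogs", True), ("cat", False), ("sun", False)]
concept = learn_concept(train)
prediction = interpret(concept, "birds")
```

On these four examples the only zero-loss description is "ends with s", so the learned concept labels "birds" as positive. The point of the sketch is the division of labor: the interpreter is frozen, and learning a new concept reduces to choosing a string.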

arXiv.org e-Print Archive

Crossref

Evolutionary neural architecture search for deep learning

Author: Liang Jason Zhi
Publication venue
Publication date: 11/04/2019
Field of study

Deep neural networks (DNNs) have produced state-of-the-art results in many benchmarks and problem domains. However, the success of DNNs depends on the proper configuration of their architecture and hyperparameters. DNNs are often not used to their full potential because it is difficult to determine what architectures and hyperparameters should be used. While several approaches have been proposed, the computational complexity of searching large design spaces makes them impractical for large modern DNNs. This dissertation introduces an efficient evolutionary algorithm (EA) for simultaneous optimization of DNN architecture and hyperparameters. It builds upon extensive past research on evolutionary optimization of neural network structure. Various improvements to the core algorithm are introduced, including: (1) discovering DNN architectures of arbitrary complexity; (2) generating modular, repetitive modules commonly seen in state-of-the-art DNNs; (3) extending to the multitask learning and multiobjective optimization domains; (4) maximizing performance and reducing wasted computation through asynchronous evaluations. Experimental results in image classification, image captioning, and multialphabet character recognition show that the approach is able to evolve networks that are competitive with, or even exceed, hand-designed networks. Thus, the method enables an automated and streamlined process to optimize DNN architectures for a given problem and can be widely applied to solve harder tasks.

Computer Science

Texas ScholarWorks

Asynchronous Evolution of Deep Neural Network Architectures

Authors: Liang Jason; Miikkulainen Risto; Shahrzad Hormoz
Publication venue
Publication date: 08/08/2023
Field of study

Many evolutionary algorithms (EAs) take advantage of parallel evaluation of candidates. However, if evaluation times vary significantly, many worker nodes (i.e., compute clients) are idle much of the time, waiting for the next generation to be created. Evolutionary neural architecture search (ENAS), a class of EAs that optimizes the architecture and hyperparameters of deep neural networks, is particularly vulnerable to this issue. This paper proposes a generic asynchronous evaluation strategy (AES) that is then adapted to work with ENAS. AES increases throughput by maintaining a queue of up to $K$ individuals ready to be sent to the workers for evaluation and proceeding to the next generation as soon as $M \ll K$ individuals have been evaluated by the workers. A suitable value for $M$ is determined experimentally, balancing diversity and efficiency. To showcase the generality and power of AES, it was first evaluated in 11-bit multiplexer design (a single-population verifiable discovery task) and then scaled up to ENAS for image captioning (a multi-population open-ended-optimization task). In both problems, a multifold performance improvement was observed, suggesting that AES is a promising method for parallelizing the evolution of complex systems with long and variable evaluation times, such as those in ENAS.
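The queue-based scheme described above (a queue of up to K individuals, with a new generation created as soon as M of them have been evaluated) can be sketched as a small discrete-event simulation. This is an illustrative sketch only, not the authors' implementation: the toy `fitness` function, the uniform evaluation times, the Gaussian mutation, the elite pool of size M, and all default parameter values are assumptions, and real workers are replaced by a simulated clock so the run is deterministic.

```python
import heapq
import random
from collections import deque


def fitness(x):
    """Hypothetical toy objective with its optimum at x = 3."""
    return -(x - 3.0) ** 2


def run_aes(K=16, M=4, workers=4, generations=25, seed=0):
    """Simulate asynchronous evaluation; assumes K >= workers + M."""
    rng = random.Random(seed)
    # Up to K individuals wait here until a worker becomes free.
    queue = deque(rng.uniform(-10.0, 10.0) for _ in range(K))
    running = []  # min-heap of (finish_time, ticket, individual)
    elites = []   # best (fitness, individual) pairs seen so far
    clock, ticket, evaluated = 0.0, 0, 0
    best_history = []

    def dispatch():
        # Give queued individuals to idle workers. Evaluation time
        # varies per individual (assumed uniform in [1, 10]).
        nonlocal ticket
        while queue and len(running) < workers:
            individual = queue.popleft()
            finish = clock + rng.uniform(1.0, 10.0)
            heapq.heappush(running, (finish, ticket, individual))
            ticket += 1

    for _generation in range(generations):
        results = []
        dispatch()
        # Proceed as soon as M results are in; slower evaluations keep
        # running and are collected in a later generation.
        while len(results) < M:
            clock, _ticket, individual = heapq.heappop(running)
            results.append((fitness(individual), individual))
            dispatch()  # the freed worker immediately gets new work
        evaluated += len(results)
        elites = sorted(elites + results, reverse=True)[:M]
        best_history.append(elites[0][0])
        # Refill the queue to K with mutated copies of the elites.
        while len(queue) < K:
            parent = rng.choice(elites)[1]
            queue.append(parent + rng.gauss(0.0, 1.0))
    return best_history, evaluated, clock


history, evaluated, clock = run_aes()
```

Because the queue is refilled to K after every M results and workers pull from it the moment they finish, no simulated worker waits at a generation boundary. Raising M toward K moves the sketch back toward synchronous, generation-by-generation evaluation.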

arXiv.org e-Print Archive