Search CORE

32,954 research outputs found

Grounding Language for Transfer in Deep Reinforcement Learning

Author: Barzilay Regina
Jaakkola Tommi
Narasimhan Karthik
Publication venue
Publication date: 01/12/2018
Field of study

In this paper, we explore the utilization of natural language to drive transfer for reinforcement learning (RL). Despite the wide-spread application of deep RL techniques, learning generalized policy representations that work across domains remains a challenging problem. We demonstrate that textual descriptions of environments provide a compact intermediate channel to facilitate effective policy transfer. Specifically, by learning to ground the meaning of text to the dynamics of the environment such as transitions and rewards, an autonomous agent can effectively bootstrap policy learning on a new domain given its description. We employ a model-based RL approach consisting of a differentiable planning module, a model-free component and a factorized state representation to effectively use entity descriptions. Our model outperforms prior work on both transfer and multi-task scenarios in a variety of different environments. For instance, we achieve up to 14% and 11.5% absolute improvement over previously existing models in terms of average and initial rewards, respectively.Comment: JAIR 201

arXiv.org e-Print Archive

Princeton University Open Access Repository

DSpace@MIT

A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images

Author: Akagunduz Erdem
Ulku Irem
Publication venue
Publication date: 14/05/2020
Field of study

Semantic segmentation is the pixel-wise labelling of an image. Since the problem is defined at the pixel level, determining image class labels only is not acceptable, but localising them at the original image pixel resolution is necessary. Boosted by the extraordinary ability of convolutional neural networks (CNN) in creating semantic, high level and hierarchical image features; excessive numbers of deep learning-based 2D semantic segmentation approaches have been proposed within the last decade. In this survey, we mainly focus on the recent scientific developments in semantic segmentation, specifically on deep learning-based methods using 2D images. We started with an analysis of the public image sets and leaderboards for 2D semantic segmantation, with an overview of the techniques employed in performance evaluation. In examining the evolution of the field, we chronologically categorised the approaches into three main periods, namely pre-and early deep learning era, the fully convolutional era, and the post-FCN era. We technically analysed the solutions put forward in terms of solving the fundamental problems of the field, such as fine-grained localisation and scale invariance. Before drawing our conclusions, we present a table of methods from all mentioned eras, with a brief summary of each approach that explains their contribution to the field. We conclude the survey by discussing the current challenges of the field and to what extent they have been solved.Comment: Updated with new studie

arXiv.org e-Print Archive

Directory of Open Access Journals

Evolving stochastic learning algorithm based on Tsallis entropic index

Author: Anastasiadis A.D.
Magoulas George D.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/12/2005
Field of study

In this paper, inspired from our previous algorithm, which was based on the theory of Tsallis statistical mechanics, we develop a new evolving stochastic learning algorithm for neural networks. The new algorithm combines deterministic and stochastic search steps by employing a different adaptive stepsize for each network weight, and applies a form of noise that is characterized by the nonextensive entropic index q, regulated by a weight decay term. The behavior of the learning algorithm can be made more stochastic or deterministic depending on the trade off between the temperature T and the q values. This is achieved by introducing a formula that defines a time-dependent relationship between these two important learning parameters. Our experimental study verifies that there are indeed improvements in the convergence speed of this new evolving stochastic learning algorithm, which makes learning faster than using the original Hybrid Learning Scheme (HLS). In addition, experiments are conducted to explore the influence of the entropic index q and temperature T on the convergence speed and stability of the proposed method

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Birkbeck Institutional Research Online

Research Papers in Economics

SPIDA: Abstracting and generalizing layout design cases

Author: Duffy A.H.D.
Lee B.S.
Manfaat D.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/1998
Field of study

Abstraction and generalization of layout design cases generate new knowledge that is more widely applicable to use than specific design cases. The abstraction and generalization of design cases into hierarchical levels of abstractions provide the designer with the flexibility to apply any level of abstract and generalized knowledge for a new layout design problem. Existing case-based layout learning (CBLL) systems abstract and generalize cases into single levels of abstractions, but not into a hierarchy. In this paper, we propose a new approach, termed customized viewpoint - spatial (CV-S), which supports the generalization and abstraction of spatial layouts into hierarchies along with a supporting system, SPIDA (SPatial Intelligent Design Assistant)

Crossref

University of Strathclyde Institutional Repository

Robust Speech Recognition Using Generative Adversarial Networks

Author: Gaur Yashesh
Jun Heewoo
Satheesh Sanjeev
Sriram Anuroop
Publication venue
Publication date: 05/11/2017
Field of study

This paper describes a general, scalable, end-to-end framework that uses the generative adversarial network (GAN) objective to enable robust speech recognition. Encoders trained with the proposed approach enjoy improved invariance by learning to map noisy audio to the same embedding space as that of clean audio. Unlike previous methods, the new framework does not rely on domain expertise or simplifying assumptions as are often needed in signal processing, and directly encourages robustness in a data-driven way. We show the new approach improves simulated far-field speech recognition of vanilla sequence-to-sequence models without specialized front-ends or preprocessing

arXiv.org e-Print Archive

Crossref