49,534 research outputs found

Differentiable Algorithm Networks for Composable Robot Learning

Authors: Hsu David; Kaelbling Leslie Pack; Karkus Peter; Lee Wee Sun; Lozano-Perez Tomas; Ma Xiao
Publication date: 28/05/2019

This paper introduces the Differentiable Algorithm Network (DAN), a composable architecture for robot learning systems. A DAN is composed of neural network modules, each encoding a differentiable robot algorithm and an associated model; and it is trained end-to-end from data. DAN combines the strengths of model-driven modular system design and data-driven end-to-end learning. The algorithms and models act as structural assumptions to reduce the data requirements for learning; end-to-end learning allows the modules to adapt to one another and compensate for imperfect models and algorithms, in order to achieve the best overall system performance. We illustrate the DAN methodology through a case study on a simulated robot system, which learns to navigate in complex 3-D environments with only local visual observations and an image of a partially correct 2-D floor map. Comment: RSS 2019 camera ready. Video is available at https://youtu.be/4jcYlTSJF4

arXiv.org e-Print Archive

DSpace@MIT

Credit assignment in multiple goal embodied visuomotor behavior

Authors: Ballard Dana H.; Rothkopf Constantin A.
Publication date: 01/01/2010

The intrinsic complexity of the brain can lead one to set aside issues related to its relationships with the body, but the field of embodied cognition emphasizes that understanding brain function at the system level requires one to address the role of the brain-body interface. It has only recently been appreciated that this interface performs huge amounts of computation that does not have to be repeated by the brain, and thus affords the brain great simplifications in its representations. In effect, the brain’s abstract states can refer to coded representations of the world created by the body. But even if the brain can communicate with the world through abstractions, the severe speed limitations in its neural circuitry mean that vast amounts of indexing must be performed during development so that appropriate behavioral responses can be rapidly accessed. One way this could happen would be if the brain used a decomposition whereby behavioral primitives could be quickly accessed and combined. This realization motivates our study of independent sensorimotor task solvers, which we call modules, in directing behavior. The issue we focus on herein is how an embodied agent can learn to calibrate such individual visuomotor modules while pursuing multiple goals. The biologically plausible standard for module programming is that of reinforcement given during exploration of the environment. However, this formulation contains a substantial issue when sensorimotor modules are used in combination: the credit for their overall performance must be divided amongst them. We show that this problem can be solved and that diverse task combinations are beneficial in learning and not a complication, as usually assumed. Our simulations show that fast algorithms are available that allot credit correctly and are insensitive to measurement noise.
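
The credit-division problem described in this abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes that only a single global reward (the sum of the active modules' unknown individual rewards) is observed, and that each active module nudges its own estimate by a shared error term. The names `update_credit` and `demo`, the step size `beta`, and the toy reward values are all hypothetical.

```python
from itertools import combinations


def update_credit(estimates, active, global_reward, beta=0.1):
    """Move each active module's reward estimate toward its share.

    Assumption (not from the source): the observed global reward is the
    sum of the individual rewards of the currently active modules.
    """
    # Error between what was observed and what the active modules
    # jointly expected, computed from the old estimates.
    error = global_reward - sum(estimates[i] for i in active)
    for i in active:
        estimates[i] += beta * error
    return estimates


def demo(true_rewards=(1.0, 2.0, 3.0), cycles=300):
    """Run modules in varying pairs and return the learned estimates."""
    estimates = [0.0] * len(true_rewards)
    pairs = list(combinations(range(len(true_rewards)), 2))
    for _ in range(cycles):
        for active in pairs:
            # Only the summed reward is visible, never per-module rewards.
            observed = sum(true_rewards[i] for i in active)
            update_credit(estimates, active, observed)
    return estimates


if __name__ == "__main__":
    print(demo())
```

In this toy setting no module is ever observed alone, yet the estimates approach the individual rewards because the modules appear in different combinations. That is consistent with the abstract's remark that diverse task combinations help learning rather than complicate it.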

Crossref

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

Hochschulschriftenserver - Universität Frankfurt am Main

Memory Augmented Control Networks

Authors: Atanasov Nikolay; Karydis Konstantinos; Khan Arbaaz; Kumar Vijay; Lee Daniel D.; Zhang Clark
Publication date: 27/12/2017

Planning problems in partially observable environments cannot be solved directly with convolutional networks and require some form of memory. But even memory networks with sophisticated addressing schemes are unable to learn intelligent reasoning satisfactorily due to the complexity of simultaneously learning to access memory and plan. To mitigate these challenges, we introduce the Memory Augmented Control Network (MACN). The proposed network architecture consists of three main parts. The first part uses convolutions to extract features and the second part uses a neural network-based planning module to pre-plan in the environment. The third part uses a network controller that learns to store those specific instances of past information that are necessary for planning. The performance of the network is evaluated in discrete grid world environments for path planning in the presence of simple and complex obstacles. We show that our network learns to plan and can generalize to new environments.

arXiv.org e-Print Archive

eScholarship - University of California

Smart Finite State Devices: A Modeling Framework for Demand Response Technologies

Authors: Ananyev Maxim; Backhaus Scott; Chertkov Michael; Turitsyn Konstantin
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2011

We introduce and analyze Markov Decision Process (MDP) machines to model individual devices which are expected to participate in future demand-response markets on distribution grids. We differentiate devices into the following four types: (a) optional loads that can be shed, e.g. light dimming; (b) deferrable loads that can be delayed, e.g. dishwashers; (c) controllable loads with inertia, e.g. thermostatically-controlled loads, whose task is to maintain an auxiliary characteristic (temperature) within pre-defined margins; and (d) storage devices that can alternate between charging and generating. Our analysis of the devices seeks to find their optimal price-taking control strategy under a given stochastic model of the distribution market. Comment: 8 pages, 8 figures, submitted IEEE CDC 201
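
As a rough illustration of what an optimal price-taking strategy for a deferrable load (type b) could look like, the sketch below uses backward induction on a small finite-horizon MDP. It is not the paper's model: it assumes a load that must run exactly once before a deadline, and a market price that follows a known Markov chain. The function name `deferrable_load_policy` and the parameters `prices`, `transition`, and `horizon` are hypothetical.

```python
def deferrable_load_policy(prices, transition, horizon):
    """Backward induction for a load that must run once by the deadline.

    prices[s]         -- cost of running while the market is in state s
    transition[s][s2] -- probability of moving from state s to state s2
    horizon           -- number of decision periods (the last is the deadline)

    Returns (value, run_now): value[t][s] is the minimal expected cost
    when the load is still pending at period t in state s, and
    run_now[t][s] says whether running immediately is optimal.
    """
    n = len(prices)
    value = [[0.0] * n for _ in range(horizon)]
    run_now = [[True] * n for _ in range(horizon)]

    # At the deadline the load can no longer be deferred.
    for s in range(n):
        value[horizon - 1][s] = prices[s]

    # Earlier periods: compare running now with the expected cost of waiting.
    for t in range(horizon - 2, -1, -1):
        for s in range(n):
            wait_cost = sum(
                transition[s][s2] * value[t + 1][s2] for s2 in range(n)
            )
            if prices[s] <= wait_cost:
                value[t][s] = prices[s]
            else:
                value[t][s] = wait_cost
                run_now[t][s] = False
    return value, run_now


if __name__ == "__main__":
    # Two market states (cheap, expensive), equally likely next period.
    value, run_now = deferrable_load_policy(
        prices=[1.0, 3.0],
        transition=[[0.5, 0.5], [0.5, 0.5]],
        horizon=2,
    )
    print(value[0], run_now[0])
```

With these toy numbers the device runs immediately when the price is low and waits when it is high, since the expected cost of waiting (2.0) lies between the two prices.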

arXiv.org e-Print Archive

CiteSeerX

Crossref

DSpace@MIT