Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing

Authors: A Liniger, B Paden, C Urmson, CW Anderson, D Dolgov, D Wierstra, DQ Mayne, E Frazzoli, HT Siegelmann, J Xu, P Falcone, R Tedrake, T Schouwenaars
Publication date: 02/08/2018

Within the context of autonomous driving, a model-based reinforcement learning algorithm is proposed for the design of neural network-parameterized controllers. Classical model-based control methods, which include sampling- and lattice-based algorithms and model predictive control, suffer from a trade-off between model complexity and the computational burden of solving expensive optimization or search problems online at every short sampling time. To circumvent this trade-off, a two-step procedure is motivated: first, a controller is learned during offline training based on an arbitrarily complicated mathematical system model; then the trained controller is evaluated online by fast feedforward evaluation. The contribution of this paper is a simple gradient-free and model-based algorithm for deep reinforcement learning using task separation with hill climbing (TSHC). In particular, the paper advocates (i) simultaneous training on separate deterministic tasks, with the purpose of encoding many motion primitives in a neural network, and (ii) the use of maximally sparse rewards in combination with virtual velocity constraints (VVCs) in setpoint proximity.
Comment: 10 pages, 6 figures, 1 table
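The gradient-free idea named in the abstract can be illustrated with a generic hill-climbing loop that scores one parameter vector on several deterministic tasks at once. This is only a minimal sketch under stated assumptions, not the paper's TSHC algorithm: the helper names (`make_task`, `total_reward`, `hill_climb`), the two-parameter linear "controller", and the dense squared-error reward are all hypothetical stand-ins chosen to keep the example short (the abstract advocates maximally sparse rewards and a neural network instead).

```python
import random


def make_task(x, y):
    """Build one deterministic toy task (hypothetical): the reward is
    higher the closer the linear output w*x + b gets to the target y."""
    def reward(params):
        w, b = params
        return -(w * x + b - y) ** 2
    return reward


def total_reward(params, tasks):
    """Score a single parameter vector on all tasks simultaneously."""
    return sum(task(params) for task in tasks)


def hill_climb(tasks, n_params, iters=500, sigma=0.2, seed=0):
    """Generic hill climbing: perturb the parameters with Gaussian noise
    and keep the candidate only if the summed reward strictly improves.
    No gradients are computed anywhere."""
    rng = random.Random(seed)
    best = [0.0] * n_params
    best_score = total_reward(best, tasks)
    for _ in range(iters):
        candidate = [p + rng.gauss(0.0, sigma) for p in best]
        score = total_reward(candidate, tasks)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

A possible usage is `hill_climb([make_task(0.0, 1.0), make_task(1.0, 3.0), make_task(2.0, 5.0)], n_params=2)`. Because a candidate is accepted only when the summed reward improves, the returned score can never be worse than the starting score.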

arXiv.org e-Print Archive

Crossref

Norm Optimal Iterative Learning Control with Application to Problems in Accelerator based Free Electron Lasers and Rehabilitation Robotics

Authors: Freeman C T, Kichhoff S, Lewin P L, Lichtenberg G, Owens D H, Rogers E, Schmidt C, Werner H
Publication date: 01/01/2010

This paper gives an overview of the theoretical basis of the norm optimal approach to iterative learning control, followed by results from more recent work that has experimentally benchmarked the performance that can be achieved. The remainder of the paper then describes its application to a physical process and a novel application in stroke rehabilitation.
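For orientation, the norm optimal update is commonly written as follows in the standard lifted (trial-domain) formulation. This is a textbook form given as a sketch, not taken from the paper itself; the symbols $G$ (lifted plant matrix), $r$ (reference), $u_k$ and $e_k$ (input and tracking error on trial $k$), and the weights $Q$, $R$ are assumptions introduced here.

```latex
% Tracking error on trial k
e_k = r - G u_k

% Cost minimised when choosing the input for trial k+1
J_{k+1}(u_{k+1}) = \| e_{k+1} \|_Q^2 + \| u_{k+1} - u_k \|_R^2

% Minimiser: the input update from trial k to trial k+1
u_{k+1} = u_k + \left( G^{\top} Q G + R \right)^{-1} G^{\top} Q \, e_k
```

A larger $R$ penalises changes between trials and slows learning, while a larger $Q$ weights error reduction more heavily.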

DESY Publication Database

Southampton (e-Prints Soton)

Crossref

DESY

REPOSIT

Robust nonlinear control of vectored thrust aircraft

Authors: Doyle John C., Morris John, Murray Richard

An interdisciplinary program in robust control for nonlinear systems with applications to a variety of engineering problems is outlined. Major emphasis will be placed on flight control, with both experimental and analytical studies. This program builds on recent results in control theory for stability, stabilization, robust stability, robust performance, synthesis, and model reduction in a unified framework using Linear Fractional Transformations (LFTs), Linear Matrix Inequalities (LMIs), and the structured singular value μ. Most of these advances have been accomplished by the Caltech controls group independently or in collaboration with researchers in other institutions. These recent results offer a new and remarkably unified framework for all aspects of robust control, but what is particularly important for this program is that they also have important implications for system identification and control of nonlinear systems. This combines well with Caltech's expertise in nonlinear control theory, both in geometric methods and in methods for systems with constraints and saturations.

NASA Technical Reports Server

Distributed Estimation with Information-Seeking Control in Agent Network

Authors: Fröhle Markus, Hlawatsch Franz, Meyer Florian, Wymeersch Henk
Publication date: 01/01/2015

We introduce a distributed, cooperative framework and method for Bayesian estimation and control in decentralized agent networks. Our framework combines joint estimation of time-varying global and local states with information-seeking control optimizing the behavior of the agents. It is suited to nonlinear and non-Gaussian problems and, in particular, to location-aware networks. For cooperative estimation, a combination of belief propagation message passing and consensus is used. For cooperative control, the negative posterior joint entropy of all states is maximized via gradient ascent. The estimation layer provides the control layer with probabilistic information in the form of sample representations of probability distributions. Simulation results demonstrate intelligent behavior of the agents and excellent estimation performance for a simultaneous self-localization and target tracking problem. In a cooperative localization scenario with only one anchor, mobile agents can localize themselves after a short time with an accuracy that is higher than the accuracy of the performed distance measurements.
Comment: 17 pages, 10 figures

arXiv.org e-Print Archive

Chalmers Research

Chalmers Publication Library

New control strategies for neuroprosthetic systems

Authors: Abbas James J., Crago Patrick E., Kantor Carole, Lan Ning, Veltink Peter H.
Publication venue: United States Department of Veterans Affairs
Publication date: 01/01/1996

The availability of techniques to artificially excite paralyzed muscles opens enormous potential for restoring both upper and lower extremity movements with neuroprostheses. Neuroprostheses must stimulate muscle, and control and regulate the artificial movements produced. Control methods to accomplish these tasks include feedforward (open-loop), feedback, and adaptive control. Feedforward control requires a great deal of information about the biomechanical behavior of the limb. For the upper extremity, an artificial motor program was developed to provide such movement program input to a neuroprosthesis. In lower extremity control, one group achieved their best results by attempting to meet naturally perceived gait objectives rather than to follow an exact joint angle trajectory. Adaptive feedforward control, as implemented in the cycle-to-cycle controller, gave good compensation for the gradual decrease in performance observed with open-loop control. A neural network controller was able to customize stimulation parameters in order to generate a desired output trajectory in a given individual and to maintain tracking performance in the presence of muscle fatigue. The authors believe that practical FNS control systems must exhibit many of these features of neurophysiological systems.

University of Twente Research Information