2 research outputs found

Q-learning for POMDP: An application to learning locomotion gaits

Authors: Prashant G. Mehta, Amirhossein Taghvaei, Tixian Wang
Publication date: 30/09/2019

This paper presents a Q-learning framework for learning optimal locomotion gaits in robotic systems modeled as coupled rigid bodies. Inspired by the prevalence of periodic gaits in bio-locomotion, an open-loop periodic input is assumed to (say) affect a nominal gait. The learning problem is to learn a new (modified) gait by using only partial noisy measurements of the state. The objective of learning is to maximize a given reward modeled as an objective function in optimal control settings. The proposed control architecture has three main components: (i) phase modeling of the dynamics by a single phase variable; (ii) a coupled oscillator feedback particle filter to represent the posterior distribution of the phase conditioned on the sensory measurements; and (iii) a Q-learning algorithm to learn the approximate optimal control law. The architecture is illustrated with the aid of a planar two-body system. The performance of the learning is demonstrated in a simulation environment.

Comment: 8 pages, 6 figures, 58th IEEE Conference on Decision and Control

arXiv.org e-Print Archive

Pattern generation and the control of nonlinear systems

Author: Roger W. Brockett

Abstract: Many important engineering systems accomplish their purpose using cyclic processes whose characteristics are under feedback control. Examples involving thermodynamic cycles and electromechanical energy conversion processes are particularly noteworthy. Likewise, cyclic processes are prevalent in nature and the idea of a pattern generator is widely used to rationalize mechanisms used for orchestrating movements such as those involved in locomotion and respiration. In this paper, we develop a linkage between the use of cyclic processes and the control of nonholonomic systems, emphasizing the problem of achieving stable regulation. The discussion brings to the fore characteristic phenomena that distinguish the regulation problem for such strongly nonlinear systems from the more commonly studied linear feedback regulators. Finally, we compare this approach to controlling nonholonomic systems to another approach based on the idea of an open-loop approximate inverse as discussed in the literature.

Index Terms: Inverse systems, Lie brackets, nonlinear control, pattern generation, regulation, stabilization.

CiteSeerX