2 research outputs found
Q-learning for POMDP: An application to learning locomotion gaits
This paper presents a Q-learning framework for learning optimal locomotion
gaits in robotic systems modeled as coupled rigid bodies. Inspired by the
prevalence of periodic gaits in bio-locomotion, an open-loop periodic input is
assumed to (say) affect a nominal gait. The learning problem is to learn a new
(modified) gait by using only partial noisy measurements of the state. The
objective of learning is to maximize a given reward modeled as an objective
function in optimal control settings. The proposed control architecture has
three main components: (i) Phase modeling of dynamics by a single phase
variable; (ii) A coupled oscillator feedback particle filter to represent the
posterior distribution of the phase conditioned on the sensory measurements;
and (iii) A Q-learning algorithm to learn the approximate optimal control law.
The architecture is illustrated with the aid of a planar two-body system. The
performance of the learning is demonstrated in a simulation environment.
Comment: 8 pages, 6 figures, 58th IEEE Conference on Decision and Control
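As a rough illustration of component (iii), the sketch below shows one tabular Q-learning update over a discretized phase variable. It is a simplified stand-in, not the paper's method: the abstract describes an approximate control law learned from a particle-filter posterior, whereas this sketch assumes a single point estimate of the phase. The names `discretize_phase` and `q_update`, the 8 phase bins, the 3 candidate actions, and the values of `alpha` and `gamma` are all hypothetical.

```python
import numpy as np


def discretize_phase(theta, n_bins):
    """Map a phase angle in radians to a bin index in [0, n_bins)."""
    frac = (theta % (2.0 * np.pi)) / (2.0 * np.pi)
    return int(frac * n_bins) % n_bins


def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning step to Q in place and return it."""
    # Temporal-difference target: immediate reward plus discounted best next value.
    td_target = reward + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q


# Hypothetical setup: 8 phase bins, 3 candidate modifications of the periodic input.
Q = np.zeros((8, 3))
s = discretize_phase(0.1, 8)       # phase estimate before the step
s_next = discretize_phase(0.9, 8)  # phase estimate after the step
Q = q_update(Q, s, a=1, reward=1.0, s_next=s_next)
```

In a fuller treatment the point estimate of the phase would be replaced by the filter's posterior, and the table by a function approximator.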
Pattern generation and the control of nonlinear systems
Abstract: Many important engineering systems accomplish their purpose using
cyclic processes whose characteristics are under feedback control. Examples
involving thermodynamic cycles and electromechanical energy conversion
processes are particularly noteworthy. Likewise, cyclic processes are prevalent
in nature and the idea of a pattern generator is widely used to rationalize
mechanisms used for orchestrating movements such as those involved in
locomotion and respiration. In this paper, we develop a linkage between the use
of cyclic processes and the control of nonholonomic systems, emphasizing the
problem of achieving stable regulation. The discussion brings to the fore
characteristic phenomena that distinguish the regulation problem for such
strongly nonlinear systems from the more commonly studied linear feedback
regulators. Finally, we compare this approach to controlling nonholonomic
systems to another approach based on the idea of an open-loop approximate
inverse as discussed in the literature.
Index Terms: Inverse systems, Lie brackets, nonlinear control, pattern
generation, regulation, stabilization.