3 research outputs found

    Switching between Limit Cycles in a Model of Running Using Exponentially Stabilizing Discrete Control Lyapunov Function

    This paper considers the problem of switching between two periodic motions, also known as limit cycles, to create agile running motions. For each limit cycle, we use a control Lyapunov function to estimate the region of attraction at the apex of the flight phase. We switch controllers at the apex only if the current state of the robot is within the region of attraction of the subsequent limit cycle. If the regions of attraction of two limit cycles do not intersect, we construct additional intermediate limit cycles until the regions of attraction of sequential limit cycles overlap sufficiently. Additionally, we impose an exponential convergence condition on the control Lyapunov function that allows us to transition rapidly between limit cycles. Using this approach, we demonstrate switching between 5 limit cycles in about 5 steps, with the speed changing from 2 m/s to 5 m/s.
    Comment: 6 pages, 4 figures. To appear in the IEEE American Control Conference (ACC) 201
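    The apex switching rule described in the abstract can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: `clf_value` (the control Lyapunov function for a given cycle) and `roa_level` (the sublevel-set value bounding each estimated region of attraction) are assumed names.

    ```python
    def try_switch(x_apex, current, target, clf_value, roa_level):
        """Switch to the target limit cycle only if the apex state lies
        inside its estimated region of attraction, modeled here as a
        sublevel set of that cycle's control Lyapunov function."""
        # clf_value(cycle, x) -> V(x) for that cycle (assumed interface)
        if clf_value(target, x_apex) <= roa_level[target]:
            return target  # apex state is in the target's region of attraction
        return current     # otherwise remain on the current limit cycle
    ```

    When the target's region of attraction is out of reach, the paper's strategy is to route through intermediate limit cycles whose regions of attraction overlap, applying the same check at each apex.
    
    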

    Mesh Based Analysis of Low Fractal Dimension Reinforcement Learning Policies

    In previous work, using a process we call meshing, the reachable state spaces of various continuous and hybrid systems were approximated as a discrete set of states, which can then be synthesized into a Markov chain. One application of this approach has been to analyze locomotion policies obtained by reinforcement learning, as a step towards making empirical guarantees about the stability properties of the resulting system. In a separate line of research, we introduced a modified reward function for on-policy reinforcement learning algorithms that utilizes a "fractal dimension" of rollout trajectories. This reward was shown to encourage policies that induce individual trajectories which can be more compactly represented as a discrete mesh. In this work we combine these two threads of research by building meshes of the reachable state space of a system subject to disturbances and controlled by policies obtained with the modified reward. Our analysis shows that the modified policies do produce much smaller reachable meshes. This shows that agents trained with the fractal dimension reward transfer their desirable quality of having a more compact state space to a setting with external disturbances. The results also suggest that the previous work using mesh-based tools to analyze RL policies may be extended to higher-dimensional systems or to higher-resolution meshes than would otherwise have been possible.
    Comment: ICRA 202
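    A minimal sketch of the meshing idea, under the assumption that meshing snaps visited continuous states to uniform grid cells and that the Markov chain is estimated from observed cell-to-cell transitions (the paper's actual mesh construction may differ):

    ```python
    import numpy as np

    def mesh_transition_matrix(trajectories, cell_size):
        """Approximate the reachable state space as a discrete mesh by
        snapping states to grid cells, then estimate a Markov chain from
        the transition counts between cells."""
        def cell(x):
            return tuple(np.floor(np.asarray(x, dtype=float) / cell_size).astype(int))

        # The mesh: every grid cell visited by any trajectory.
        cells = sorted({cell(x) for traj in trajectories for x in traj})
        index = {c: i for i, c in enumerate(cells)}

        # Count transitions between consecutive states in each rollout.
        counts = np.zeros((len(cells), len(cells)))
        for traj in trajectories:
            for a, b in zip(traj[:-1], traj[1:]):
                counts[index[cell(a)], index[cell(b)]] += 1

        # Row-normalize to obtain transition probabilities.
        rows = counts.sum(axis=1, keepdims=True)
        probs = np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)
        return cells, probs
    ```

    Under this reading, a policy that keeps trajectories on a lower-dimensional subset visits fewer cells, so the resulting mesh (and transition matrix) is smaller, which is what the paper reports for the modified-reward policies.
    
    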

    Explicitly Encouraging Low Fractional Dimensional Trajectories Via Reinforcement Learning

    A key limitation in using various modern methods of machine learning to develop feedback control policies is the lack of appropriate methodologies to analyze their long-term dynamics, in terms of making any sort of guarantees (even statistical ones) about robustness. The central reasons for this are the so-called curse of dimensionality, combined with the black-box nature of the resulting control policies themselves. This paper aims at the first of these issues. Although the full state space of a system may be quite high-dimensional, it is a common feature of most model-based control methods that the resulting closed-loop systems exhibit dominant dynamics that are rapidly driven to some lower-dimensional subspace within. In this work we argue that the dimensionality of this subspace is captured by tools from fractal geometry, namely various notions of a fractional dimension. We then show that the dimensionality of trajectories induced by model-free reinforcement learning agents can be influenced by adding a post-processing function to the agent's reward signal. We verify that the dimensionality reduction is robust to noise being added to the system, and show that the modified agents are in fact more robust to noise and push disturbances in general for the systems we examined.
    Comment: Presented at CORL 202
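    One common notion of fractional dimension is the box-counting dimension, which can be estimated from a rollout trajectory and subtracted from the return as a shaping term. The sketch below is an illustrative assumption about how such a post-processing function might look; the paper's exact reward modification is not specified in this abstract, and `shaped_return` and its `weight` parameter are hypothetical names.

    ```python
    import numpy as np

    def box_counting_dimension(points, scales):
        """Estimate the box-counting ('fractal') dimension of a trajectory
        by counting occupied grid cells at several scales and fitting the
        slope of log N(s) against log(1/s)."""
        points = np.asarray(points, dtype=float)
        counts = []
        for s in scales:
            occupied = {tuple(np.floor(p / s).astype(int)) for p in points}
            counts.append(len(occupied))
        slope, _ = np.polyfit(np.log(1.0 / np.asarray(scales)),
                              np.log(counts), 1)
        return slope

    def shaped_return(raw_return, trajectory, scales, weight=0.1):
        """Illustrative post-processing: penalize rollouts whose
        trajectories have a higher estimated fractional dimension."""
        return raw_return - weight * box_counting_dimension(trajectory, scales)
    ```

    Intuitively, a trajectory confined to a curve has dimension near 1 regardless of the ambient state dimension, so penalizing the estimated dimension encourages policies whose closed-loop dynamics collapse onto a low-dimensional subset.
    
    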