Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers
We propose to address quadrupedal locomotion tasks using Reinforcement
Learning (RL) with a Transformer-based model that learns to combine
proprioceptive information and high-dimensional depth sensor inputs. While
learning-based locomotion has made great advances using RL, most methods still
rely on domain randomization for training blind agents that generalize to
challenging terrains. Our key insight is that proprioceptive states only offer
contact measurements for immediate reaction, whereas an agent equipped with
visual sensory observations can learn to proactively maneuver through environments with
obstacles and uneven terrain by anticipating changes in the environment many
steps ahead. In this paper, we introduce LocoTransformer, an end-to-end RL
method for quadrupedal locomotion that leverages a Transformer-based model for
fusing proprioceptive states and visual observations. We evaluate our method in
challenging simulated environments with different obstacles and uneven terrain.
We show that our method obtains significant improvements over policies with
only proprioceptive state inputs, and that Transformer-based models further
improve generalization across environments. Our project page with videos is at
https://RchalYang.github.io/LocoTransformer.
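The fusion idea in the abstract above can be illustrated with a minimal sketch. It is not the LocoTransformer architecture: the single attention layer, the token dimensions, the random weights, and the names `init_params` and `fuse_tokens` are all assumptions chosen for illustration. The proprioceptive state becomes one token, each spatial cell of a (hypothetical) depth feature map becomes one token, and self-attention lets every token attend to both modalities.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(proprio_dim, visual_dim, model_dim, rng):
    # Random weights stand in for learned ones (assumption for illustration).
    scale = 0.1
    return {
        "W_prop": scale * rng.normal(size=(proprio_dim, model_dim)),
        "W_vis": scale * rng.normal(size=(visual_dim, model_dim)),
        "W_q": scale * rng.normal(size=(model_dim, model_dim)),
        "W_k": scale * rng.normal(size=(model_dim, model_dim)),
        "W_v": scale * rng.normal(size=(model_dim, model_dim)),
    }

def fuse_tokens(proprio, depth_feats, params):
    """Fuse one proprioceptive vector (P,) with N visual feature vectors (N, C).

    Returns the fused feature (2 * model_dim,) and the attention matrix.
    """
    prop_tok = proprio @ params["W_prop"]              # (D,)
    vis_tok = depth_feats @ params["W_vis"]            # (N, D)
    tokens = np.vstack([prop_tok[None, :], vis_tok])   # (1 + N, D)

    # Single-head self-attention over tokens from both modalities.
    q = tokens @ params["W_q"]
    k = tokens @ params["W_k"]
    v = tokens @ params["W_v"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))     # (1 + N, 1 + N)
    out = tokens + attn @ v                            # residual connection

    # Pool each modality separately, then concatenate.
    fused = np.concatenate([out[0], out[1:].mean(axis=0)])
    return fused, attn
```

In an RL setting, a vector like `fused` would typically feed a policy head that outputs joint targets; that part is omitted here.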
Trajectory Optimization for Legged Robots With Slipping Motions
The dynamics of legged systems are characterized by under-actuation, instability, and contact state switching. We present a trajectory optimization method for generating physically consistent motions under these conditions. By integrating a custom solver for hard contact forces in the system dynamics model, the optimal control algorithm has the authority to freely transition between open, closed, and sliding contact states along the trajectory. Our method can discover stepping motions without a predefined contact schedule. Moreover, the optimizer makes use of slipping contacts if a no-slip condition is too restrictive for the task at hand. Additionally, we show that new behaviors like skating over slippery surfaces emerge automatically, which would not be possible with classical methods that assume stationary contact points. Experiments in simulation and on hardware confirm the physical consistency of the generated trajectories. Our solver achieves iteration rates of 40 Hz for a 1 s horizon and is therefore fast enough to run in a receding-horizon setting.
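The three contact states named in the abstract above (open, closed, sliding) can be made concrete with a small sketch. It is not the paper's contact solver: it assumes a one-dimensional tangential direction and a Coulomb friction model with coefficient `mu`, and the function `resolve_contact` and its parameters are hypothetical names introduced for illustration.

```python
import math
from enum import Enum

class ContactState(Enum):
    OPEN = "open"        # foot is off the ground, no contact force
    CLOSED = "closed"    # foot is in contact and sticking
    SLIDING = "sliding"  # foot is in contact and slipping

def resolve_contact(gap, normal_force, tangential_velocity,
                    required_force, mu, tol=1e-9):
    """Return (state, tangential friction force) for one contact point.

    required_force is the tangential force that would be needed to keep
    the contact point stationary (assumed to be known to the caller).
    """
    if gap > tol or normal_force <= tol:
        return ContactState.OPEN, 0.0

    limit = mu * normal_force  # Coulomb friction limit (assumed model)

    if abs(tangential_velocity) > tol:
        # Already slipping: friction opposes the sliding direction.
        return ContactState.SLIDING, -math.copysign(limit, tangential_velocity)

    if abs(required_force) <= limit:
        # Friction can hold the contact point in place.
        return ContactState.CLOSED, required_force

    # Sticking would need more force than friction can supply.
    return ContactState.SLIDING, math.copysign(limit, required_force)
```

For example, with `mu = 0.5` and a normal force of 10 N, a stationary foot that needs 3 N to stay in place remains closed, while one that needs 8 N starts to slide with the friction force saturated at 5 N.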