33,304 research outputs found

    Goal Set Inverse Optimal Control and Iterative Re-planning for Predicting Human Reaching Motions in Shared Workspaces

    Full text link
    To enable safe and efficient human-robot collaboration in shared workspaces it is important for the robot to predict how a human will move when performing a task. While predicting human motion for tasks not known a priori is very challenging, we argue that single-arm reaching motions for known tasks in collaborative settings (which are especially relevant for manufacturing) are indeed predictable. Two hypotheses underlie our approach for predicting such motions: First, that the trajectory the human performs is optimal with respect to an unknown cost function, and second, that human adaptation to their partner's motion can be captured well through iterative re-planning with the above cost function. The key to our approach is thus to learn a cost function which "explains" the motion of the human. To do this, we gather example trajectories from pairs of participants performing a collaborative assembly task using motion capture. We then use Inverse Optimal Control to learn a cost function from these trajectories. Finally, we predict reaching motions from the human's current configuration to a task-space goal region by iteratively re-planning a trajectory using the learned cost function. Our planning algorithm is based on the trajectory optimizer STOMP, it plans for a 23 DoF human kinematic model and accounts for the presence of a moving collaborator and obstacles in the environment. Our results suggest that in most cases, our method outperforms baseline methods when predicting motions. We also show that our method outperforms baselines for predicting human motion when a human and a robot share the workspace.Comment: 12 pages, Accepted for publication IEEE Transaction on Robotics 201

    Skeleton-Based Human Action Recognition with Global Context-Aware Attention LSTM Networks

    Full text link
    Human action recognition in 3D skeleton sequences has attracted a lot of research attention. Recently, Long Short-Term Memory (LSTM) networks have shown promising performance in this task due to their strengths in modeling the dependencies and dynamics in sequential data. As not all skeletal joints are informative for action recognition, and the irrelevant joints often bring noise which can degrade the performance, we need to pay more attention to the informative ones. However, the original LSTM network does not have explicit attention ability. In this paper, we propose a new class of LSTM network, Global Context-Aware Attention LSTM (GCA-LSTM), for skeleton based action recognition. This network is capable of selectively focusing on the informative joints in each frame of each skeleton sequence by using a global context memory cell. To further improve the attention capability of our network, we also introduce a recurrent attention mechanism, with which the attention performance of the network can be enhanced progressively. Moreover, we propose a stepwise training scheme in order to train our network effectively. Our approach achieves state-of-the-art performance on five challenging benchmark datasets for skeleton based action recognition

    Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting

    Full text link
    Multi-person pose forecasting remains a challenging problem, especially in modeling fine-grained human body interaction in complex crowd scenarios. Existing methods typically represent the whole pose sequence as a temporal series, yet overlook interactive influences among people based on skeletal body parts. In this paper, we propose a novel Trajectory-Aware Body Interaction Transformer (TBIFormer) for multi-person pose forecasting via effectively modeling body part interactions. Specifically, we construct a Temporal Body Partition Module that transforms all the pose sequences into a Multi-Person Body-Part sequence to retain spatial and temporal information based on body semantics. Then, we devise a Social Body Interaction Self-Attention (SBI-MSA) module, utilizing the transformed sequence to learn body part dynamics for inter- and intra-individual interactions. Furthermore, different from prior Euclidean distance-based spatial encodings, we present a novel and efficient Trajectory-Aware Relative Position Encoding for SBI-MSA to offer discriminative spatial information and additional interactive clues. On both short- and long-term horizons, we empirically evaluate our framework on CMU-Mocap, MuPoTS-3D as well as synthesized datasets (6 ~ 10 persons), and demonstrate that our method greatly outperforms the state-of-the-art methods. Code will be made publicly available upon acceptance.Comment: Accepted by CVPR2023, 8 pages, 6 figures. arXiv admin note: text overlap with arXiv:2208.0922
    • …
    corecore