Goal Set Inverse Optimal Control and Iterative Re-planning for Predicting Human Reaching Motions in Shared Workspaces
To enable safe and efficient human-robot collaboration in shared workspaces
it is important for the robot to predict how a human will move when performing
a task. While predicting human motion for tasks not known a priori is very
challenging, we argue that single-arm reaching motions for known tasks in
collaborative settings (which are especially relevant for manufacturing) are
indeed predictable. Two hypotheses underlie our approach for predicting such
motions: First, that the trajectory the human performs is optimal with respect
to an unknown cost function, and second, that human adaptation to their
partner's motion can be captured well through iterative re-planning with the
above cost function. The key to our approach is thus to learn a cost function
which "explains" the motion of the human. To do this, we gather example
trajectories from pairs of participants performing a collaborative assembly
task using motion capture. We then use Inverse Optimal Control to learn a cost
function from these trajectories. Finally, we predict reaching motions from the
human's current configuration to a task-space goal region by iteratively
re-planning a trajectory using the learned cost function. Our planning
algorithm is based on the trajectory optimizer STOMP; it plans for a 23-DoF
human kinematic model and accounts for the presence of a moving collaborator
and obstacles in the environment. Our results suggest that in most cases, our
method outperforms baseline methods when predicting motions. We also show that
our method outperforms baselines for predicting human motion when a human and a
robot share the workspace.
Comment: 12 pages, accepted for publication in IEEE Transactions on Robotics 201
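The prediction loop described above (plan with a learned cost, execute a step, re-plan from the new configuration) can be sketched in miniature. This is an illustrative 2-D toy, not the paper's method: the cost features (smoothness, point-obstacle repulsion), their weights, and the plain gradient-descent optimizer all stand in for the learned IOC cost and the STOMP-based planner over a 23-DoF model.

```python
import numpy as np

def plan_trajectory(start, goal, weights, obstacle,
                    n_waypoints=10, n_iters=100, lr=0.05):
    """Toy stand-in for the STOMP-style optimizer: gradient descent on a
    weighted sum of hand-picked cost features (the real method learns the
    feature weights via Inverse Optimal Control)."""
    traj = np.linspace(start, goal, n_waypoints)  # straight-line init
    for _ in range(n_iters):
        grad = np.zeros_like(traj)
        # Smoothness feature: penalize finite-difference acceleration.
        grad[1:-1] += weights[0] * (2.0 * traj[1:-1] - traj[:-2] - traj[2:])
        # Obstacle feature: inverse-distance repulsion from a point obstacle.
        diff = traj[1:-1] - obstacle
        dist = np.linalg.norm(diff, axis=1, keepdims=True) + 1e-6
        grad[1:-1] -= weights[1] * diff / dist**3
        traj[1:-1] -= lr * grad[1:-1]  # endpoints stay fixed
    return traj

def predict_with_replanning(start, goal, weights, obstacle, steps=5):
    """Iterative re-planning: execute one waypoint, then re-plan from the
    new configuration (mimicking adaptation to a changing scene)."""
    current = np.asarray(start, dtype=float)
    executed = [current.copy()]
    for _ in range(steps):
        traj = plan_trajectory(current, np.asarray(goal, dtype=float),
                               weights, obstacle)
        current = traj[1]  # take one step along the plan, then re-plan
        executed.append(current.copy())
    return np.array(executed)

path = predict_with_replanning([0.0, 0.0], [1.0, 1.0],
                               weights=[1.0, 0.02],
                               obstacle=np.array([0.5, 0.4]))
```

Re-planning from the updated configuration at every step is what lets the prediction react to a moving collaborator: the cost landscape can change between planning cycles without invalidating the approach.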
Skeleton-Based Human Action Recognition with Global Context-Aware Attention LSTM Networks
Human action recognition in 3D skeleton sequences has attracted a lot of
research attention. Recently, Long Short-Term Memory (LSTM) networks have shown
promising performance in this task due to their strengths in modeling the
dependencies and dynamics in sequential data. As not all skeletal joints are
informative for action recognition, and the irrelevant joints often bring noise
which can degrade the performance, we need to pay more attention to the
informative ones. However, the original LSTM network does not have explicit
attention ability. In this paper, we propose a new class of LSTM network,
Global Context-Aware Attention LSTM (GCA-LSTM), for skeleton-based action
recognition. This network is capable of selectively focusing on the informative
joints in each frame of each skeleton sequence by using a global context memory
cell. To further improve the attention capability of our network, we also
introduce a recurrent attention mechanism, with which the attention performance
of the network can be enhanced progressively. Moreover, we propose a stepwise
training scheme in order to train our network effectively. Our approach
achieves state-of-the-art performance on five challenging benchmark datasets
for skeleton-based action recognition.
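The core idea above, weighting joints by their relevance to a global context representation and refining that context recurrently, can be sketched as follows. This is a hypothetical simplification: the actual GCA-LSTM uses LSTM cells to produce the joint features and the global context memory cell, whereas here both are reduced to plain vectors.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_context_attention(joint_feats, n_refinements=2):
    """Illustrative sketch of global context-aware attention over joints:
    score each joint against a global context vector, attend, and refine
    the context recurrently (the recurrent attention mechanism)."""
    # Initialize the global context as the mean over all joints.
    context = joint_feats.mean(axis=0)
    for _ in range(n_refinements):
        scores = joint_feats @ context   # relevance of each joint
        weights = softmax(scores)        # attention over joints
        context = weights @ joint_feats  # refined global context
    return weights, context

rng = np.random.default_rng(0)
feats = rng.normal(size=(25, 8))  # e.g. 25 joints, 8-dim features per joint
w, ctx = global_context_attention(feats)
```

Each refinement pass re-scores the joints against a context that already emphasizes informative joints, which is the progressive enhancement of attention performance the abstract refers to.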
Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting
Multi-person pose forecasting remains a challenging problem, especially in
modeling fine-grained human body interaction in complex crowd scenarios.
Existing methods typically represent the whole pose sequence as a temporal
series, yet overlook interactive influences among people based on skeletal body
parts. In this paper, we propose a novel Trajectory-Aware Body Interaction
Transformer (TBIFormer) for multi-person pose forecasting via effectively
modeling body part interactions. Specifically, we construct a Temporal Body
Partition Module that transforms all the pose sequences into a Multi-Person
Body-Part sequence to retain spatial and temporal information based on body
semantics. Then, we devise a Social Body Interaction Self-Attention (SBI-MSA)
module, utilizing the transformed sequence to learn body part dynamics for
inter- and intra-individual interactions. Furthermore, different from prior
Euclidean distance-based spatial encodings, we present a novel and efficient
Trajectory-Aware Relative Position Encoding for SBI-MSA to offer discriminative
spatial information and additional interactive clues. On both short- and
long-term horizons, we empirically evaluate our framework on CMU-Mocap,
MuPoTS-3D as well as synthesized datasets (6 to 10 persons), and demonstrate
that our method greatly outperforms the state-of-the-art methods. Code will be
made publicly available upon acceptance.
Comment: Accepted by CVPR 2023, 8 pages, 6 figures. arXiv admin note: text overlap with arXiv:2208.0922
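The attention mechanism described above, self-attention over (person, body-part) tokens with a trajectory-aware bias in place of a plain Euclidean-distance spatial encoding, can be sketched in a minimal form. Everything here is an assumption-laden stand-in: single-head attention, a displacement-difference norm as the trajectory-aware term, and random inputs, whereas the paper's SBI-MSA is multi-head with a learned encoding.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def body_part_attention(tokens, trajectories, scale=1.0):
    """Hypothetical sketch of the SBI-MSA idea: dot-product self-attention
    over body-part tokens, biased so that tokens whose trajectories move
    similarly attend to each other more (rather than biasing by plain
    Euclidean distance between positions)."""
    n, d = tokens.shape
    # Content term: standard scaled dot-product attention logits.
    logits = tokens @ tokens.T / np.sqrt(d)
    # Trajectory-aware bias: compare per-token displacement sequences.
    # trajectories has shape (n_tokens, n_frames, 2).
    disp = np.diff(trajectories, axis=1).reshape(n, -1)
    traj_dist = np.linalg.norm(disp[:, None, :] - disp[None, :, :], axis=-1)
    attn = softmax(logits - scale * traj_dist, axis=-1)
    return attn @ tokens, attn

rng = np.random.default_rng(1)
tokens = rng.normal(size=(6, 8))    # e.g. 2 persons x 3 body parts, 8-dim
trajs = rng.normal(size=(6, 5, 2))  # 5-frame 2-D trajectory per token
out, attn = body_part_attention(tokens, trajs)
```

Using displacement (motion) differences rather than raw positions is what makes the bias trajectory-aware: two body parts that are far apart but moving in concert still receive a strong interactive clue.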