80 research outputs found
Hierarchical Consistent Contrastive Learning for Skeleton-Based Action Recognition with Growing Augmentations
Contrastive learning has been proven beneficial for self-supervised
skeleton-based action recognition. Most contrastive learning methods utilize
carefully designed augmentations to generate different movement patterns of
skeletons for the same semantics. However, applying strong augmentations,
which distort the images/skeletons' structures and cause semantic loss, remains
an open problem because they make training unstable. In this paper, we
investigate the potential of adopting strong augmentations and propose a
general hierarchical consistent contrastive learning framework (HiCLR) for
skeleton-based action recognition. Specifically, we first design a gradual
growing augmentation policy to generate multiple ordered positive pairs, which
guide the learned representations toward consistency across different
views. Then, an asymmetric loss is proposed to enforce the hierarchical
consistency via a directional clustering operation in the feature space,
pulling the representations from strongly augmented views closer to those from
weakly augmented views for better generalizability. Meanwhile, we propose and
evaluate three kinds of strong augmentations for 3D skeletons to demonstrate
the effectiveness of our method. Extensive experiments show that HiCLR
outperforms the state-of-the-art methods notably on three large-scale datasets,
i.e., NTU60, NTU120, and PKUMMD.
Comment: Accepted by AAAI 2023. Project page:
https://jhang2020.github.io/Projects/HiCLR/HiCLR.htm
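The asymmetric, hierarchical consistency idea in the abstract above can be illustrated with a small sketch. This is not the authors' loss: it assumes each view is compared against a hypothetical memory bank of embeddings, that the weaker view's similarity distribution serves as a fixed target, and that cross-entropy is the divergence used. All function names, the bank, and the temperature are assumptions made for illustration.

```python
import math

def softmax(scores, temperature):
    """Numerically stable softmax over temperature-scaled scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def similarity_distribution(embedding, bank, temperature=0.5):
    """Distribution of an embedding's dot-product similarity over a bank."""
    scores = [sum(a * b for a, b in zip(embedding, key)) for key in bank]
    return softmax(scores, temperature)

def directional_loss(target_dist, pred_dist, eps=1e-12):
    """Cross-entropy H(target, pred). Only `pred` is meant to be trained;
    in an autograd framework the target would be detached (assumption)."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_dist, pred_dist))

def hierarchical_consistency_loss(ordered_views, bank, temperature=0.5):
    """Views are ordered weakest to strongest; each stronger view is pulled
    toward the distribution of the next weaker one, never the reverse."""
    dists = [similarity_distribution(v, bank, temperature) for v in ordered_views]
    return sum(directional_loss(dists[i - 1], dists[i]) for i in range(1, len(dists)))

# Hypothetical two-entry bank and three views of increasing augmentation strength.
bank = [[1.0, 0.0], [0.0, 1.0]]
views = [[1.0, 0.0], [0.9, 0.1], [0.6, 0.4]]
loss = hierarchical_consistency_loss(views, bank)
```

The direction matters: swapping target and prediction would pull the clean, weakly augmented representation toward the distorted one, which is the opposite of what the abstract describes.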
Prompted Contrast with Masked Motion Modeling: Towards Versatile 3D Action Representation Learning
Self-supervised learning has proved effective for skeleton-based human action
understanding, which is an important yet challenging topic. Previous works
mainly rely on contrastive learning or masked motion modeling paradigm to model
the skeleton relations. However, the sequence-level and joint-level
representation learning cannot be effectively and simultaneously handled by
these methods. As a result, the learned representations fail to generalize to
different downstream tasks. Moreover, combining these two paradigms in a naive
manner leaves the synergy between them untapped and can lead to interference in
training. To address these problems, we propose Prompted Contrast with Masked
Motion Modeling, PCM, for versatile 3D action representation
learning. Our method integrates the contrastive learning and masked prediction
tasks in a mutually beneficial manner, which substantially boosts the
generalization capacity for various downstream tasks. Specifically, masked
prediction provides novel training views for contrastive learning, which in
turn guides the masked prediction training with high-level semantic
information. Moreover, we propose a dual-prompted multi-task pretraining
strategy, which further improves model representations by reducing the
interference caused by learning the two different pretext tasks. Extensive
experiments on five downstream tasks under three large-scale datasets are
conducted, demonstrating the superior generalization capacity of PCM
compared to the state-of-the-art works. Our project is publicly available at:
https://jhang2020.github.io/Projects/PCM3/PCM3.html .
Comment: Accepted by ACM Multimedia 202
A deep learning framework based on Koopman operator for data-driven modeling of vehicle dynamics
Autonomous vehicles and driving technologies have received notable attention
in the past decades. In autonomous driving systems, information about vehicle
dynamics is required in most cases for designing motion planning and control
algorithms. However, identifying a global model of vehicle dynamics is
nontrivial due to strong non-linearity and uncertainty. Many efforts have
resorted to machine learning techniques for building data-driven models, but
these may suffer from poor interpretability and result in a complex nonlinear
representation. In this
paper, we propose a deep learning framework relying on an interpretable Koopman
operator to build a data-driven predictor of the vehicle dynamics. The main
idea is to use the Koopman operator for representing the nonlinear dynamics in
a linear lifted feature space. The approach results in a global model that
integrates the dynamics in both longitudinal and lateral directions. As the
core contribution, we propose a deep learning-based extended dynamic mode
decomposition (Deep EDMD) algorithm to learn a finite approximation of the
Koopman operator. Different from other machine learning-based approaches, deep
neural networks play the role of learning feature representations for EDMD in
the framework of the Koopman operator. Simulation results in a high-fidelity
CarSim environment are reported, which show the capability of the Deep EDMD
approach in multi-step prediction of vehicle dynamics at a wide operating
range. Also, the proposed approach outperforms the EDMD method, the multi-layer
perceptron (MLP) method, and the Extreme Learning Machines-based EDMD
(ELM-EDMD) method in terms of modeling performance. Finally, we design a linear
MPC with Deep EDMD (DE-MPC) for realizing reference tracking and test the
controller in the CarSim environment.
Comment: 12 pages, 10 figures, 1 table, and 2 algorithm
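To make the EDMD step in the abstract above concrete, here is a minimal sketch of plain EDMD on a toy system. It is not the Deep EDMD algorithm: the lifting dictionary `lift` is hand-picked rather than learned by a neural network, and the two-state system in `step` is a hypothetical example chosen because it is exactly linear in the lifted space. With snapshot pairs (x, y), the finite Koopman approximation K solves K G = A, where G sums psi(x) psi(x)^T and A sums psi(y) psi(x)^T.

```python
def lift(x):
    """Hand-picked dictionary psi(x) = [x1, x2, x1^2] (an assumption)."""
    x1, x2 = x
    return [x1, x2, x1 * x1]

def step(x, lam=0.9, mu=0.5, c=0.3):
    """Toy nonlinear map, not a vehicle model."""
    x1, x2 = x
    return (lam * x1, mu * x2 + c * x1 * x1)

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def edmd(pairs):
    """Least-squares Koopman matrix K with psi(y) ~= K psi(x)."""
    d = len(lift(pairs[0][0]))
    g = [[0.0] * d for _ in range(d)]
    a = [[0.0] * d for _ in range(d)]
    for x, y in pairs:
        px, py = lift(x), lift(y)
        for i in range(d):
            for j in range(d):
                g[i][j] += px[i] * px[j]
                a[i][j] += py[i] * px[j]
    # K G = A is equivalent to G K^T = A^T because G is symmetric.
    a_t = [[a[j][i] for j in range(d)] for i in range(d)]
    k_t = solve(g, a_t)
    return [[k_t[j][i] for j in range(d)] for i in range(d)]

def predict(k, x0, steps):
    """Multi-step prediction: iterate the linear model in the lifted space."""
    z = lift(x0)
    for _ in range(steps):
        z = [sum(k[i][j] * z[j] for j in range(len(z))) for i in range(len(k))]
    return (z[0], z[1])

pairs = [((x1, x2), step((x1, x2)))
         for x1 in (-1.0, -0.5, 0.5, 1.0) for x2 in (-1.0, 0.0, 1.0)]
K = edmd(pairs)
```

Because the prediction model is linear in the lifted state, it can be handed to a linear MPC, which is the motivation the abstract gives for DE-MPC.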
S5Mars: Semi-Supervised Learning for Mars Semantic Segmentation
Deep learning has become a powerful tool for Mars exploration. Mars terrain
semantic segmentation is an important Martian vision task, which is the base of
rover autonomous planning and safe driving. However, there is a lack of
sufficient detailed and high-confidence data annotations, which are exactly
required by most deep learning methods to obtain a good model. To address this
problem, we propose our solution from the perspective of joint data and method
design. We first present a new dataset S5Mars for Semi-SuperviSed learning on
Mars Semantic Segmentation, which contains 6K high-resolution images and is
sparsely annotated based on confidence, ensuring the high quality of labels.
Then to learn from this sparse data, we propose a semi-supervised learning
(SSL) framework for Mars image semantic segmentation, to learn representations
from limited labeled data. Different from the existing SSL methods which are
mostly targeted at the Earth image data, our method takes into account Mars
data characteristics. Specifically, we first investigate the impact of current
widely used natural image augmentations on Mars images. Based on the analysis,
we then propose two novel and effective augmentations for SSL of Mars
segmentation, AugIN and SAM-Mix, which serve as strong augmentations to boost
the model performance. Meanwhile, to fully leverage the unlabeled data, we
introduce a soft-to-hard consistency learning strategy, learning from different
targets based on prediction confidence. Experimental results show that our
method can outperform state-of-the-art SSL approaches remarkably. Our proposed
dataset is available at https://jhang2020.github.io/S5Mars.github.io/
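One possible reading of "learning from different targets based on prediction confidence" is sketched below. The abstract does not give the rule, so everything here is an assumption: a fixed confidence threshold, a one-hot (hard) pseudo-label for confident pixels, the full teacher distribution (soft target) for the rest, and cross-entropy as the consistency loss. The function names are hypothetical.

```python
import math

def consistency_target(teacher_probs, threshold=0.9):
    """Hard one-hot target if the teacher is confident, else the soft distribution."""
    confidence = max(teacher_probs)
    if confidence >= threshold:
        label = teacher_probs.index(confidence)
        return [1.0 if i == label else 0.0 for i in range(len(teacher_probs))]
    return list(teacher_probs)

def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))

def consistency_loss(teacher_pixels, student_pixels, threshold=0.9):
    """Mean per-pixel loss; teacher sees the weak view, student the strong view."""
    losses = [
        cross_entropy(consistency_target(t, threshold), s)
        for t, s in zip(teacher_pixels, student_pixels)
    ]
    return sum(losses) / len(losses)

# Two hypothetical pixels with three classes each.
teacher = [[0.95, 0.03, 0.02], [0.50, 0.30, 0.20]]
student = [[0.80, 0.10, 0.10], [0.40, 0.40, 0.20]]
loss = consistency_loss(teacher, student)
```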
Learning-based Predictive Control for Nonlinear Systems with Unknown Dynamics Subject to Safety Constraints
Model predictive control (MPC) has been widely employed as an effective
method for model-based constrained control. For systems with unknown dynamics,
reinforcement learning (RL) and adaptive dynamic programming (ADP) have
received notable attention to solve the adaptive optimal control problems.
Recently, works on the use of RL in the framework of MPC have emerged, which
can enhance the ability of MPC for data-driven control. However, the safety
under state constraints and the closed-loop robustness are difficult to verify
due to approximation errors of RL with function approximation
structures. To address this problem, we propose a data-driven robust MPC
solution based on incremental RL, called data-driven robust learning-based
predictive control (dr-LPC), for perturbed unknown nonlinear systems subject to
safety constraints. A data-driven robust MPC (dr-MPC) is first formulated
with a learned predictor. The incremental Dual Heuristic Programming (DHP)
algorithm using an actor-critic architecture is then utilized to solve the
online optimization problem of dr-MPC. In each prediction horizon, the actor
and critic learn time-varying laws for approximating the optimal control policy
and costate respectively, which is different from classical MPCs. The state and
control constraints are enforced in the learning process via building a
Hamilton-Jacobi-Bellman (HJB) equation and a regularized actor-critic learning
structure using logarithmic barrier functions. The closed-loop robustness and
safety of the dr-LPC are proven under function approximation errors. Simulation
results on two control examples have been reported, which show that the dr-LPC
can outperform the DHP and dr-MPC in terms of state regulation, and its average
computational time is much smaller than that with the dr-MPC in both examples.
Comment: The paper has been submitted to an IEEE Journal for possible
publicatio
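The role of logarithmic barrier functions in enforcing constraints can be shown with a minimal sketch. It does not reproduce the paper's HJB formulation or its regularized actor-critic structure: the quadratic stage cost, the scalar state and control, the box bounds, and the weight `mu` are all assumptions chosen for illustration.

```python
import math

def log_barrier(value, lower, upper):
    """Barrier for lower < value < upper; infinite outside the open interval."""
    if not (lower < value < upper):
        return math.inf
    return -math.log(upper - value) - math.log(value - lower)

def barrier_stage_cost(x, u, x_bounds, u_bounds, mu=0.1):
    """Quadratic stage cost plus weighted barriers on state and control."""
    quadratic = x * x + u * u
    return quadratic + mu * (log_barrier(x, *x_bounds) + log_barrier(u, *u_bounds))

# The cost grows without bound as the state approaches its limit.
interior = barrier_stage_cost(0.0, 0.0, (-1.0, 1.0), (-2.0, 2.0))
near_edge = barrier_stage_cost(0.99, 0.0, (-1.0, 1.0), (-2.0, 2.0))
violated = barrier_stage_cost(1.5, 0.0, (-1.0, 1.0), (-2.0, 2.0))
```

Because the penalty is smooth inside the feasible set, a gradient-based learner is pushed away from the boundary before a constraint is violated, instead of being corrected afterwards.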
Overestimation of thermal emittance in solenoid scans due to coupled transverse motion
The solenoid scan is a widely used method for the in-situ measurement of the
thermal emittance in a photocathode gun. The popularity of this method is due
to its simplicity and convenience since all rf photocathode guns are equipped
with an emittance compensation solenoid. This paper shows that the solenoid
scan measurement overestimates the thermal emittance in the ordinary
measurement configuration due to a weak quadrupole field (present in either the
rf gun or gun solenoid) followed by a rotation in the solenoid. This coupled
transverse dynamics aberration introduces a correlation between the beam's
horizontal and vertical motion leading to an increase in the measured 2D
transverse emittance, thus the overestimation of the thermal emittance. This
effect was systematically studied using both analytic expressions and numerical
simulations. These studies were experimentally verified using an L-band
1.6-cell rf photocathode gun with a cesium telluride cathode, which shows a
thermal emittance overestimation of 35% with an rms laser spot size of 2.7 mm.
The paper concludes by showing that the accuracy of the solenoid scan can be
improved by using a quadrupole magnet corrector, consisting of a pair of normal
and skew quadrupole magnets.
Comment: 12 pages, 13 figure
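The effect described above, that x-y coupling inflates the measured 2D emittance, can be checked numerically with a small sketch. It uses a thin skew quadrupole kick as a simplified stand-in for the paper's quadrupole field followed by solenoid rotation, an initially uncoupled 4D beam matrix in coordinates (x, x', y, y'), and illustrative beam sizes and kick strength. None of these numbers come from the paper.

```python
import math

def matmul(a, b):
    """Plain matrix product for small lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def transport(sigma, m):
    """Propagate a beam matrix through a linear map: sigma -> M sigma M^T."""
    return matmul(matmul(m, sigma), transpose(m))

def projected_emittance(sigma, plane=0):
    """2D rms emittance sqrt(<q^2><q'^2> - <q q'>^2); plane 0 is x, 1 is y."""
    i = 2 * plane
    return math.sqrt(sigma[i][i] * sigma[i + 1][i + 1] - sigma[i][i + 1] ** 2)

def thin_skew_quad(k):
    """Thin-lens skew quadrupole: x' += k*y and y' += k*x."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, k, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [k, 0.0, 0.0, 1.0]]

# Uncoupled round beam: 1 mm rms size, 0.1 mrad rms divergence (illustrative).
sigma0 = [[1e-6, 0.0, 0.0, 0.0],
          [0.0, 1e-8, 0.0, 0.0],
          [0.0, 0.0, 1e-6, 0.0],
          [0.0, 0.0, 0.0, 1e-8]]
sigma1 = transport(sigma0, thin_skew_quad(0.05))

eps_before = projected_emittance(sigma0)
eps_after = projected_emittance(sigma1)
ratio = eps_after / eps_before
```

The kick adds a y-dependent term to x', so the x divergence spread grows while the x size is unchanged, and the projected x emittance rises even though the map is symplectic. This is the sense in which a 2D measurement overestimates the emittance of a coupled beam.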