Adaptive Tracking of a Single-Rigid-Body Character in Various Environments
Since the introduction of DeepMimic [Peng et al. 2018], subsequent research
has focused on expanding the repertoire of simulated motions across various
scenarios. In this study, we propose an alternative approach to this goal: a
deep reinforcement learning method based on the simulation of a
single-rigid-body character. Using the centroidal dynamics model (CDM) to
express the full-body character as a single rigid body (SRB) and training a
policy to track a reference motion, we can obtain a policy that is capable of
adapting to various unobserved environmental changes and controller transitions
without requiring any additional learning. Owing to the reduced dimensionality
of the state and action spaces, the learning process is sample-efficient. The final
full-body motion is kinematically generated in a physically plausible way,
based on the state of the simulated SRB character. The SRB simulation is
formulated as a quadratic programming (QP) problem, and the policy outputs an
action that allows the SRB character to follow the reference motion. We
demonstrate that our policy, efficiently trained within 30 minutes on an
ultraportable laptop, can cope with environments that were not experienced
during learning, such as running on uneven terrain or pushing a box, as well
as with transitions between learned policies, without any additional
learning.
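The tracking step described in the abstract (a QP whose solution drives the SRB toward the reference motion) can be sketched in miniature. This is not the authors' formulation but a hedged illustration, assuming a point-mass CoM model, a hypothetical PD tracking law, and a box-constrained force; for a diagonal objective the QP's solution is the unconstrained optimum clipped to the bounds:

```python
import numpy as np

def pd_desired_accel(pos, vel, pos_ref, vel_ref, kp=100.0, kd=20.0):
    """PD tracking law: desired CoM acceleration toward the reference."""
    return kp * (pos_ref - pos) + kd * (vel_ref - vel)

def solve_qp_force(a_des, mass, gravity, f_max):
    """Tiny QP: min_f ||f/m + g - a_des||^2  s.t. |f_i| <= f_max.
    With a diagonal objective, the box-constrained solution is just the
    unconstrained minimizer clipped to the bounds."""
    f_unc = mass * (a_des - gravity)      # unconstrained minimizer
    return np.clip(f_unc, -f_max, f_max)

# One control step of an SRB character (all numbers illustrative).
mass = 60.0
g = np.array([0.0, 0.0, -9.81])
pos, vel = np.array([0.0, 0.0, 0.9]), np.zeros(3)
pos_ref, vel_ref = np.array([0.1, 0.0, 0.95]), np.array([1.0, 0.0, 0.0])

a_des = pd_desired_accel(pos, vel, pos_ref, vel_ref)
f = solve_qp_force(a_des, mass, g, f_max=1500.0)
a = f / mass + g                          # resulting CoM acceleration
```

Here the horizontal force saturates at the bound, so the achieved acceleration falls short of the desired one in that axis while matching it vertically.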
CAT: Collaborative Adversarial Training
Adversarial training can improve the robustness of neural networks. Previous
methods focus on a single adversarial training strategy and do not consider
the distinct properties of models trained by different strategies. By
revisiting the previous methods, we find that different adversarial training
methods yield distinct robustness on individual sample instances. For
example, a sample instance can be
correctly classified by a model trained using standard adversarial training
(AT) but not by a model trained using TRADES, and vice versa. Based on this
observation, we propose a collaborative adversarial training framework to
improve the robustness of neural networks. Specifically, we use different
adversarial training methods to train robust models and let the models
exchange knowledge with one another during training. Collaborative Adversarial
Training (CAT) can improve both robustness and accuracy. Extensive experiments
on various networks and datasets validate the effectiveness of our method. CAT
achieves state-of-the-art adversarial robustness without using any additional
data on CIFAR-10 under the Auto-Attack benchmark. Code is available at
https://github.com/liuxingbin/CAT.
Comment: Tech report
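The abstract does not spell out how the models "exchange knowledge," so the following is only a generic mutual-distillation sketch, not CAT's actual mechanism: each model's loss on adversarial examples adds a KL term pulling it toward its peer's predictions. The names `collaborative_loss` and `lam` are hypothetical, and the logits are illustrative:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(p, y):
    """Mean negative log-likelihood of the true labels."""
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

def kl_div(p, q):
    """Mean KL(p || q) over the batch."""
    return (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(-1).mean()

def collaborative_loss(logits_self, logits_peer, y, lam=1.0):
    """One model's loss: its own adversarial CE plus a distillation
    term pulling it toward the peer model's predictions."""
    p_self, p_peer = softmax(logits_self), softmax(logits_peer)
    return cross_entropy(p_self, y) + lam * kl_div(p_peer, p_self)

# Illustrative logits on the same adversarial batch from two models
# trained with different strategies (e.g. AT vs. TRADES).
y = np.array([0, 1])
logits_at = np.array([[2.0, 0.5, -1.0], [0.2, 1.5, 0.0]])
logits_trades = np.array([[1.5, 0.8, -0.5], [0.1, 1.8, 0.3]])

loss_at = collaborative_loss(logits_at, logits_trades, y)
loss_trades = collaborative_loss(logits_trades, logits_at, y)
```

Because KL is non-negative and zero only when the two models agree, the collaborative term adds pressure exactly on the instances where the strategies disagree, which is the observation motivating the framework.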
UniWorld: Autonomous Driving Pre-training via World Models
In this paper, we draw inspiration from Alberto Elfes' pioneering work in
1989, where he introduced the concept of the occupancy grid as World Models for
robots. We imbue the robot with a spatial-temporal world model, termed
UniWorld, to perceive its surroundings and predict the future behavior of other
participants. UniWorld first predicts 4D geometric occupancy as a world model
in its foundational stage and is subsequently fine-tuned on
downstream tasks. UniWorld can estimate missing information concerning the
world state and predict plausible future states of the world. Besides,
UniWorld's pre-training process is label-free, enabling the utilization of
massive amounts of image-LiDAR pairs to build a Foundational Model. The proposed
unified pre-training framework demonstrates promising results in key tasks such
as motion prediction, multi-camera 3D object detection, and surrounding
semantic scene completion. When compared to monocular pre-training methods on
the nuScenes dataset, UniWorld shows a significant improvement of about 1.5% in
IoU for motion prediction, 2.0% in mAP and 2.0% in NDS for multi-camera 3D
object detection, as well as a 3% increase in mIoU for surrounding semantic
scene completion. Adopting our unified pre-training method reduces 3D training
annotation costs by 25%, offering significant practical value for the
implementation of real-world autonomous driving. Codes are
publicly available at https://github.com/chaytonmin/UniWorld.
Comment: 8 pages, 5 figures. arXiv admin note: substantial text overlap with
arXiv:2305.1882
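The Elfes-style occupancy grid that motivates this work is easy to construct from a LiDAR sweep, which is also why the pre-training can be label-free. The sketch below is a simplified static 3D binary voxelization, not UniWorld's learned 4D occupancy prediction; the grid extent, cell size, and point cloud are all illustrative assumptions:

```python
import numpy as np

def voxelize(points, grid_shape=(20, 20, 8), cell=0.5,
             origin=(-5.0, -5.0, -1.0)):
    """Bin 3D points into a binary occupancy grid (Elfes-style world model).
    Points outside the grid extent are discarded."""
    idx = np.floor((points - np.asarray(origin)) / cell).astype(int)
    grid = np.zeros(grid_shape, dtype=bool)
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    ii = idx[inside]
    grid[ii[:, 0], ii[:, 1], ii[:, 2]] = True
    return grid

# A toy LiDAR sweep: scattered ground returns plus one vertical obstacle.
rng = np.random.default_rng(0)
ground = np.column_stack([rng.uniform(-5, 5, 200),
                          rng.uniform(-5, 5, 200),
                          np.full(200, -0.9)])
obstacle = np.array([[2.0, 1.0, 0.3], [2.1, 1.0, 0.8]])
grid = voxelize(np.vstack([ground, obstacle]))
```

No annotation is needed to build such targets, which is the property the paper exploits: raw image-LiDAR pairs suffice to supervise the foundational stage.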
LSGAN-AT: enhancing malware detector robustness against adversarial examples
Adversarial Malware Example (AME)-based adversarial training can effectively
enhance the robustness of Machine Learning (ML)-based malware detectors
against AMEs. AME quality is a key factor in the robustness enhancement. The
Generative Adversarial Network (GAN) is one approach to AME generation, but
existing GAN-based AME generation methods suffer from inadequate
optimization, mode collapse, and training instability. In this paper, we
propose a novel approach (denoted LSGAN-AT) to enhance ML-based malware
detector robustness against adversarial examples, which includes an LSGAN
module and an AT module. The LSGAN module generates more effective and
smoother AMEs by utilizing brand-new network structures and a Least Square
(LS) loss to optimize boundary samples. The AT module performs adversarial
training using the AMEs generated by LSGAN to produce an ML-based Robust
Malware Detector (RMD). Extensive experimental results validate the superior
transferability of the generated AMEs in attacking 6 ML detectors, and the
transferability of the RMD in resisting the MalGAN black-box attack. The
results also verify the recognition rate of the generated RMD on AMEs.
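The Least Square loss mentioned above is the standard LSGAN objective: the discriminator regresses real samples toward one target value and fakes toward another, so gradients do not saturate for samples far on the correct side of the boundary, which is what yields smoother generated examples. This sketch shows only that loss with the common coding a=0, b=1, c=1, not the LSGAN-AT architecture; the score vectors are illustrative:

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake, a=0.0, b=1.0):
    """Least-squares discriminator loss: push D(x) -> b, D(G(z)) -> a."""
    return 0.5 * np.mean((d_real - b) ** 2) + 0.5 * np.mean((d_fake - a) ** 2)

def lsgan_g_loss(d_fake, c=1.0):
    """Least-squares generator loss: push D(G(z)) -> c."""
    return 0.5 * np.mean((d_fake - c) ** 2)

d_real = np.array([0.9, 0.8, 1.1])   # discriminator scores on real malware
d_fake = np.array([0.2, -0.1, 0.3])  # scores on generated examples

ld = lsgan_d_loss(d_real, d_fake)
lg = lsgan_g_loss(d_fake)
```

Unlike the sigmoid cross-entropy of the original GAN, this quadratic penalty keeps pulling well-classified-but-distant samples toward the decision boundary, which helps against the mode collapse and instability the abstract cites.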
Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?
Centralized Training with Decentralized Execution (CTDE) has recently emerged
as a popular framework for cooperative Multi-Agent Reinforcement Learning
(MARL), where agents can use additional global state information to guide
training in a centralized way and make their own decisions only based on
decentralized local policies. Despite the encouraging results achieved, CTDE
makes an independence assumption on agent policies, which prevents agents from
adopting global cooperative information from one another during centralized
training. Therefore, we argue that existing CTDE methods cannot fully utilize
global information for training, leading to an inefficient joint-policy
exploration and even suboptimal results. In this paper, we introduce a novel
Centralized Advising and Decentralized Pruning (CADP) framework for multi-agent
reinforcement learning, which not only enables effective message exchange
among agents during training but also guarantees independent policies for
execution. First, CADP endows agents with an explicit communication channel
to seek and take advice from other agents for more centralized training. To
further ensure decentralized execution, we propose a smooth model pruning
mechanism that progressively closes off agent communication without
degrading agent cooperation capability. Empirical evaluations on
StarCraft II micromanagement and Google Research Football benchmarks
demonstrate that the proposed framework achieves superior performance compared
with the state-of-the-art counterparts. Our code will be made publicly
available.
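The advise-then-prune idea can be illustrated with a toy attention scheme: during training each agent attends to all agents' features, and an annealed self-bias gradually suppresses the off-diagonal (peer) attention until execution needs no communication at all. The actual CADP pruning mechanism is surely more involved; every name and number below is a hypothetical stand-in:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def advise(obs, scores, beta):
    """Each agent aggregates all agents' features via attention.
    As beta anneals from 0 to large, off-diagonal (peer) attention is
    progressively suppressed until each agent attends only to itself,
    i.e. communication is pruned away for decentralized execution."""
    n = len(obs)
    biased = scores + beta * np.eye(n)   # bias attention toward self
    attn = softmax(biased, axis=1)
    return attn @ obs, attn

obs = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # toy agent features
scores = np.zeros((3, 3))                              # learned attention logits

fused_train, attn_train = advise(obs, scores, beta=0.0)   # centralized training
fused_exec, attn_exec = advise(obs, scores, beta=50.0)    # after full pruning
```

With beta = 0 every agent mixes in its peers' features (centralized advising); with beta large the attention matrix collapses to the identity, so each agent acts on its own observation alone, matching the CTDE execution constraint.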