Adaptive Tracking of a Single-Rigid-Body Character in Various Environments
Since the introduction of DeepMimic [Peng et al. 2018], subsequent research
has focused on expanding the repertoire of simulated motions across various
scenarios. In this study, we propose an alternative approach to this goal: a
deep reinforcement learning method based on the simulation of a
single-rigid-body character. Using the centroidal dynamics model (CDM) to
express the full-body character as a single rigid body (SRB) and training a
policy to track a reference motion, we can obtain a policy that is capable of
adapting to various unobserved environmental changes and controller transitions
without requiring any additional learning. Owing to the reduced dimensionality
of the state and action spaces, the learning process is sample-efficient. The final
full-body motion is kinematically generated in a physically plausible way,
based on the state of the simulated SRB character. The SRB simulation is
formulated as a quadratic programming (QP) problem, and the policy outputs an
action that allows the SRB character to follow the reference motion. We
demonstrate that our policy, efficiently trained within 30 minutes on an
ultraportable laptop, can cope with environments that were not experienced
during learning, such as running on uneven terrain or pushing a box, as well
as with transitions between learned policies, without any additional
learning.
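The tracking step described in the abstract (a QP whose solution drives the SRB toward the reference motion) can be sketched in miniature. This is not the authors' formulation but a hedged illustration, assuming a point-mass CoM model, a hypothetical PD tracking law, and a box-constrained force; for a diagonal objective the QP's solution is the unconstrained optimum clipped to the bounds:

```python
import numpy as np

def pd_desired_accel(pos, vel, pos_ref, vel_ref, kp=100.0, kd=20.0):
    """PD tracking law: desired CoM acceleration toward the reference."""
    return kp * (pos_ref - pos) + kd * (vel_ref - vel)

def solve_qp_force(a_des, mass, gravity, f_max):
    """Tiny QP: min_f ||f/m + g - a_des||^2  s.t. |f_i| <= f_max.
    With a diagonal objective, the box-constrained solution is just the
    unconstrained minimizer clipped to the bounds."""
    f_unc = mass * (a_des - gravity)      # unconstrained minimizer
    return np.clip(f_unc, -f_max, f_max)

# One control step of an SRB character (all numbers illustrative).
mass = 60.0
g = np.array([0.0, 0.0, -9.81])
pos, vel = np.array([0.0, 0.0, 0.9]), np.zeros(3)
pos_ref, vel_ref = np.array([0.1, 0.0, 0.95]), np.array([1.0, 0.0, 0.0])

a_des = pd_desired_accel(pos, vel, pos_ref, vel_ref)
f = solve_qp_force(a_des, mass, g, f_max=1500.0)
a = f / mass + g                          # resulting CoM acceleration
```

Here the horizontal force saturates at the bound, so the achieved acceleration falls short of the desired one in that axis while matching it vertically.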
CAT: Collaborative Adversarial Training
Adversarial training can improve the robustness of neural networks. Previous
methods focus on a single adversarial training strategy and do not consider
the distinct properties of models trained by different strategies. By
revisiting the previous methods, we find that different adversarial training
methods yield distinct robustness on individual sample instances. For
example, a sample instance can be
correctly classified by a model trained using standard adversarial training
(AT) but not by a model trained using TRADES, and vice versa. Based on this
observation, we propose a collaborative adversarial training framework to
improve the robustness of neural networks. Specifically, we use different
adversarial training methods to train robust models and let the models
exchange knowledge with one another during training. Collaborative Adversarial
Training (CAT) can improve both robustness and accuracy. Extensive experiments
on various networks and datasets validate the effectiveness of our method. CAT
achieves state-of-the-art adversarial robustness without using any additional
data on CIFAR-10 under the Auto-Attack benchmark. Code is available at
https://github.com/liuxingbin/CAT.
Comment: Tech report
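The abstract does not spell out how the models "exchange knowledge," so the following is only a generic mutual-distillation sketch, not CAT's actual mechanism: each model's loss on adversarial examples adds a KL term pulling it toward its peer's predictions. The names `collaborative_loss` and `lam` are hypothetical, and the logits are illustrative:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(p, y):
    """Mean negative log-likelihood of the true labels."""
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

def kl_div(p, q):
    """Mean KL(p || q) over the batch."""
    return (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(-1).mean()

def collaborative_loss(logits_self, logits_peer, y, lam=1.0):
    """One model's loss: its own adversarial CE plus a distillation
    term pulling it toward the peer model's predictions."""
    p_self, p_peer = softmax(logits_self), softmax(logits_peer)
    return cross_entropy(p_self, y) + lam * kl_div(p_peer, p_self)

# Illustrative logits on the same adversarial batch from two models
# trained with different strategies (e.g. AT vs. TRADES).
y = np.array([0, 1])
logits_at = np.array([[2.0, 0.5, -1.0], [0.2, 1.5, 0.0]])
logits_trades = np.array([[1.5, 0.8, -0.5], [0.1, 1.8, 0.3]])

loss_at = collaborative_loss(logits_at, logits_trades, y)
loss_trades = collaborative_loss(logits_trades, logits_at, y)
```

Because KL is non-negative and zero only when the two models agree, the collaborative term adds pressure exactly on the instances where the strategies disagree, which is the observation motivating the framework.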
UniWorld: Autonomous Driving Pre-training via World Models
In this paper, we draw inspiration from Alberto Elfes' pioneering work in
1989, where he introduced the concept of the occupancy grid as World Models for
robots. We imbue the robot with a spatial-temporal world model, termed
UniWorld, to perceive its surroundings and predict the future behavior of other
participants. UniWorld first predicts 4D geometric occupancy as a world model
in its foundational stage and is subsequently fine-tuned on
downstream tasks. UniWorld can estimate missing information concerning the
world state and predict plausible future states of the world. Besides,
UniWorld's pre-training process is label-free, enabling the utilization of
massive amounts of image-LiDAR pairs to build a Foundational Model. The proposed
unified pre-training framework demonstrates promising results in key tasks such
as motion prediction, multi-camera 3D object detection, and surrounding
semantic scene completion. When compared to monocular pre-training methods on
the nuScenes dataset, UniWorld shows a significant improvement of about 1.5% in
IoU for motion prediction, 2.0% in mAP and 2.0% in NDS for multi-camera 3D
object detection, as well as a 3% increase in mIoU for surrounding semantic
scene completion. Adopting our unified pre-training method reduces 3D training
annotation costs by 25%, offering significant practical value for the
implementation of real-world autonomous driving. Codes are
publicly available at https://github.com/chaytonmin/UniWorld.
Comment: 8 pages, 5 figures. arXiv admin note: substantial text overlap with
arXiv:2305.1882
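The Elfes-style occupancy grid that motivates this work is easy to construct from a LiDAR sweep, which is also why the pre-training can be label-free. The sketch below is a simplified static 3D binary voxelization, not UniWorld's learned 4D occupancy prediction; the grid extent, cell size, and point cloud are all illustrative assumptions:

```python
import numpy as np

def voxelize(points, grid_shape=(20, 20, 8), cell=0.5,
             origin=(-5.0, -5.0, -1.0)):
    """Bin 3D points into a binary occupancy grid (Elfes-style world model).
    Points outside the grid extent are discarded."""
    idx = np.floor((points - np.asarray(origin)) / cell).astype(int)
    grid = np.zeros(grid_shape, dtype=bool)
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    ii = idx[inside]
    grid[ii[:, 0], ii[:, 1], ii[:, 2]] = True
    return grid

# A toy LiDAR sweep: scattered ground returns plus one vertical obstacle.
rng = np.random.default_rng(0)
ground = np.column_stack([rng.uniform(-5, 5, 200),
                          rng.uniform(-5, 5, 200),
                          np.full(200, -0.9)])
obstacle = np.array([[2.0, 1.0, 0.3], [2.1, 1.0, 0.8]])
grid = voxelize(np.vstack([ground, obstacle]))
```

No annotation is needed to build such targets, which is the property the paper exploits: raw image-LiDAR pairs suffice to supervise the foundational stage.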
LSGAN-AT: enhancing malware detector robustness against adversarial examples
Adversarial Malware Example (AME)-based adversarial training can effectively
enhance the robustness of Machine Learning (ML)-based malware detectors
against AMEs. AME quality is a key factor in the robustness enhancement. The
Generative Adversarial Network (GAN) is one approach to AME generation, but
existing GAN-based AME generation methods suffer from inadequate
optimization, mode collapse, and training instability. In this paper, we
propose a novel approach (denoted LSGAN-AT) to enhance ML-based malware
detector robustness against adversarial examples, which includes an LSGAN
module and an AT module. The LSGAN module generates more effective and
smoother AMEs by utilizing brand-new network structures and a Least Square
(LS) loss to optimize boundary samples. The AT module performs adversarial
training using the AMEs generated by LSGAN to produce an ML-based Robust
Malware Detector (RMD). Extensive experimental results validate the superior
transferability of the generated AMEs in attacking 6 ML detectors, and the
transferability of the RMD in resisting the MalGAN black-box attack. The
results also verify the recognition rate of the generated RMD on AMEs.
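The Least Square loss mentioned above is the standard LSGAN objective: the discriminator regresses real samples toward one target value and fakes toward another, so gradients do not saturate for samples far on the correct side of the boundary, which is what yields smoother generated examples. This sketch shows only that loss with the common coding a=0, b=1, c=1, not the LSGAN-AT architecture; the score vectors are illustrative:

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake, a=0.0, b=1.0):
    """Least-squares discriminator loss: push D(x) -> b, D(G(z)) -> a."""
    return 0.5 * np.mean((d_real - b) ** 2) + 0.5 * np.mean((d_fake - a) ** 2)

def lsgan_g_loss(d_fake, c=1.0):
    """Least-squares generator loss: push D(G(z)) -> c."""
    return 0.5 * np.mean((d_fake - c) ** 2)

d_real = np.array([0.9, 0.8, 1.1])   # discriminator scores on real malware
d_fake = np.array([0.2, -0.1, 0.3])  # scores on generated examples

ld = lsgan_d_loss(d_real, d_fake)
lg = lsgan_g_loss(d_fake)
```

Unlike the sigmoid cross-entropy of the original GAN, this quadratic penalty keeps pulling well-classified-but-distant samples toward the decision boundary, which helps against the mode collapse and instability the abstract cites.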
Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?
Centralized Training with Decentralized Execution (CTDE) has recently emerged
as a popular framework for cooperative Multi-Agent Reinforcement Learning
(MARL), where agents can use additional global state information to guide
training in a centralized way and make their own decisions only based on
decentralized local policies. Despite the encouraging results achieved, CTDE
makes an independence assumption on agent policies, which prevents agents from
adopting global cooperative information from one another during centralized
training. Therefore, we argue that existing CTDE methods cannot fully utilize
global information for training, leading to an inefficient joint-policy
exploration and even suboptimal results. In this paper, we introduce a novel
Centralized Advising and Decentralized Pruning (CADP) framework for multi-agent
reinforcement learning, which not only enables effective message exchange
among agents during training but also guarantees independent policies for
execution. First, CADP endows agents with an explicit communication channel
to seek and take advice from other agents for more centralized training. To
further ensure decentralized execution, we propose a smooth model pruning
mechanism that progressively closes off agent communication without
degrading agent cooperation capability. Empirical evaluations on
StarCraft II micromanagement and Google Research Football benchmarks
demonstrate that the proposed framework achieves superior performance compared
with the state-of-the-art counterparts. Our code will be made publicly
available.
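The advise-then-prune idea can be illustrated with a toy attention scheme: during training each agent attends to all agents' features, and an annealed self-bias gradually suppresses the off-diagonal (peer) attention until execution needs no communication at all. The actual CADP pruning mechanism is surely more involved; every name and number below is a hypothetical stand-in:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def advise(obs, scores, beta):
    """Each agent aggregates all agents' features via attention.
    As beta anneals from 0 to large, off-diagonal (peer) attention is
    progressively suppressed until each agent attends only to itself,
    i.e. communication is pruned away for decentralized execution."""
    n = len(obs)
    biased = scores + beta * np.eye(n)   # bias attention toward self
    attn = softmax(biased, axis=1)
    return attn @ obs, attn

obs = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # toy agent features
scores = np.zeros((3, 3))                              # learned attention logits

fused_train, attn_train = advise(obs, scores, beta=0.0)   # centralized training
fused_exec, attn_exec = advise(obs, scores, beta=50.0)    # after full pruning
```

With beta = 0 every agent mixes in its peers' features (centralized advising); with beta large the attention matrix collapses to the identity, so each agent acts on its own observation alone, matching the CTDE execution constraint.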