
25 research outputs found

Advancing Vision Transformers with Group-Mix Attention

Author: Ding Xiaohan
Ge Chongjian
Luo Ping
Song Yibing
Tong Zhan
Wang Jiangliu
Yuan Li
Publication date: 25/11/2023

Vision Transformers (ViTs) have been shown to enhance visual recognition through modeling long-range dependencies with multi-head self-attention (MHSA), which is typically formulated as Query-Key-Value computation. However, the attention map generated from the Query and Key captures only token-to-token correlations at one single granularity. In this paper, we argue that self-attention should have a more comprehensive mechanism to capture correlations among tokens and groups (i.e., multiple adjacent tokens) for higher representational capacity. Therefore, we propose Group-Mix Attention (GMA) as an advanced replacement for traditional self-attention, which can simultaneously capture token-to-token, token-to-group, and group-to-group correlations with various group sizes. To this end, GMA splits the Query, Key, and Value into segments uniformly and performs different group aggregations to generate group proxies. The attention map is computed based on the mixtures of tokens and group proxies and used to re-combine the tokens and groups in Value. Based on GMA, we introduce a powerful backbone, namely GroupMixFormer, which achieves state-of-the-art performance in image classification, object detection, and semantic segmentation with fewer parameters than existing models. For instance, GroupMixFormer-L (with 70.3M parameters and 384^2 input) attains 86.2% Top-1 accuracy on ImageNet-1K without external data, while GroupMixFormer-B (with 45.8M parameters) attains 51.2% mIoU on ADE20K.

arXiv.org e-Print Archive

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

Author: Chen Junsong
Ge Chongjian
Hong Lanqing
Li Zhenguo
Lu Huchuan
Luo Ping
Wang Zhongdao
Xie Enze
Publication date: 19/04/2023

Perception systems in modern autonomous driving vehicles typically take inputs from complementary multi-modal sensors, e.g., LiDAR and cameras. However, in real-world applications, sensor corruptions and failures lead to inferior performances, thus compromising autonomous safety. In this paper, we propose a robust framework, called MetaBEV, to address extreme real-world environments involving overall six sensor corruptions and two extreme sensor-missing situations. In MetaBEV, signals from multiple sensors are first processed by modal-specific encoders. Subsequently, a set of dense BEV queries are initialized, termed meta-BEV. These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities. The updated BEV representations are further leveraged for multiple 3D prediction tasks. Additionally, we introduce a new M2oE structure to alleviate the performance drop on distinct tasks in multi-task joint learning. Finally, MetaBEV is evaluated on the nuScenes dataset with 3D object detection and BEV map segmentation tasks. Experiments show MetaBEV outperforms prior arts by a large margin on both full and corrupted modalities. For instance, when the LiDAR signal is missing, MetaBEV improves 35.5% detection NDS and 17.7% segmentation mIoU upon the vanilla BEVFusion model; and when the camera signal is absent, MetaBEV still achieves 69.2% NDS and 53.7% mIoU, which is even higher than previous works that perform on full-modalities. Moreover, MetaBEV performs fairly against previous methods in both canonical perception and multi-task learning settings, refreshing state-of-the-art nuScenes BEV map segmentation with 70.4% mIoU.

Comment: Project page: https://chongjiange.github.io/metabev.htm

arXiv.org e-Print Archive

DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving

Author: Chen Junsong
Ge Chongjian
Ji Wenxuan
Kim Sukmin
Li Zhenguo
Luo Ping
Wang Tianqi
Xie Enze
Publication date: 03/04/2023

Safety is the primary priority of autonomous driving. Nevertheless, no published dataset currently supports the direct and explainable safety evaluation for autonomous driving. In this work, we propose DeepAccident, a large-scale dataset generated via a realistic simulator containing diverse accident scenarios that frequently occur in real-world driving. The proposed DeepAccident dataset contains 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset with 40k annotated samples. In addition, we propose a new task, end-to-end motion and accident prediction, based on the proposed dataset, which can be used to directly evaluate the accident prediction ability for different autonomous driving algorithms. Furthermore, for each scenario, we set four vehicles along with one infrastructure to record data, thus providing diverse viewpoints for accident scenarios and enabling V2X (vehicle-to-everything) research on perception and prediction tasks. Finally, we present a baseline V2X model named V2XFormer that demonstrates superior performance for motion and accident prediction and 3D object detection compared to the single-vehicle model.

arXiv.org e-Print Archive

Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training

Author: Abbeel Pieter
Ge Chongjian
James Stephen
Jiao Jianbo
Liu Yun-hui
Song Yibing
Tong Zhan
Wang Jiangliu
Publication date: 25/09/2023

This work aims to improve unsupervised audio-visual pre-training. Inspired by the efficacy of data augmentation in visual contrastive learning, we propose a novel speed co-augmentation method that randomly changes the playback speeds of both audio and video data. Despite its simplicity, the speed co-augmentation method possesses two compelling attributes: (1) it increases the diversity of audio-visual pairs and doubles the size of negative pairs, resulting in a significant enhancement in the learned representations, and (2) it changes the strict correlation between audio-visual pairs but introduces a partial relationship between the augmented pairs, which is modeled by our proposed SoftInfoNCE loss to further boost the performance. Experimental results show that the proposed method significantly improves the learned representations when compared to vanilla audio-visual contrastive learning.

Comment: Published at the CVPR 2023 Sight and Sound worksho

arXiv.org e-Print Archive

AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation

Author: Bai Haotian
Ge Chongjian
Ji Yuanfeng
Li Zhen
Luo Ping
Ma Wanling
Wan Xiang
Yang Jie
Zhang Lingyan
Zhang Ruimao
Zhu Ye
Publication date: 01/09/2022

Despite the considerable progress in automatic abdominal multi-organ segmentation from CT/MRI scans in recent years, a comprehensive evaluation of the models' capabilities is hampered by the lack of a large-scale benchmark from diverse clinical scenarios. Constrained by the high cost of collecting and labeling 3D medical data, most of the deep learning models to date are driven by datasets with a limited number of organs of interest or samples, which still limits the power of modern deep models and makes it difficult to provide a fully comprehensive and fair estimate of various methods. To mitigate the limitations, we present AMOS, a large-scale, diverse, clinical dataset for abdominal organ segmentation. AMOS provides 500 CT and 100 MRI scans collected from multi-center, multi-vendor, multi-modality, multi-phase, multi-disease patients, each with voxel-level annotations of 15 abdominal organs, providing challenging examples and test-bed for studying robust segmentation algorithms under diverse targets and scenarios. We further benchmark several state-of-the-art medical segmentation models to evaluate the status of the existing methods on this new challenging dataset. We have made our datasets, benchmark servers, and baselines publicly available, and hope to inspire future research. Information can be found at https://amos22.grand-challenge.org

arXiv.org e-Print Archive

Dynamic doping and Cottrell atmosphere optimize the thermoelectric performance of n-type PbTe

Author: Abdellaoui Lamya
Berkels Benjamin
Cojocaru-Mirédin Oana
Doberstein Christian
Ge Bangzhi
Qiao Guanjun
Scheu Christina
Wuttig Matthias
Yu Yuan
Zhang Siyuan
Zhang Xiangzhao
Zhou Chongjian
Publication venue: Elsevier BV
Publication date: 20/03/2022

High thermoelectric energy conversion efficiency requires a large figure-of-merit, zT, over a broad temperature range. To achieve this, we optimize the carrier concentrations of n-type PbTe from room temperature up to hot-end temperatures by co-doping Bi and Ag. Bi is an efficient n-type dopant in PbTe, often leading to excessive carrier concentration at room temperature. As revealed by density functional theory calculations, the formation of Bi and Ag defect complexes is exploited to optimize the room temperature carrier concentration. At elevated temperatures, we demonstrate the dynamic dissolution of Ag2Te precipitates in PbTe in situ by heating in a scanning transmission electron microscope. The release of n-type Ag interstitials with increasing temperature fulfills the requirement of higher carrier concentrations at the hot end. Moreover, as characterized by atom probe tomography, Ag atoms aggregate along parallel dislocation arrays to form Cottrell atmospheres. This results in enhanced phonon scattering and leads to a low lattice thermal conductivity. As a result of the synergy of dynamic doping and phonon scattering at decorated dislocations, an average zT of 1.0 is achieved in n-type Bi/Ag-codoped PbTe between 400 and 825 K. Introducing dopants with temperature-dependent solubility and strong interaction with dislocation cores enables simultaneous optimization of the average power factor and thermal conductivity, providing a new concept to exploit in the field of thermoelectrics.

arXiv.org e-Print Archive

MPG.PuRe

Associations of risk factor burden and genetic predisposition with the 10-year risk of atrial fibrillation: observations from a large prospective study of 348,904 participants

Author: Bingheim Elizabeth
Cai Miao
Chen Ge
Gao Yanhui
Li Haitao
Lin Hualiang
Lip Gregory YH
Qian Zhengmin Min
Vaughn Michael GG
Wang ChongJian
Wang Xiaojie
Zhang Junguo
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2023

Background: Understanding the effects of risk factor burden and genetic predisposition on the long-term risk of atrial fibrillation (AF) is important to improve public health initiatives. However, the 10-year risk of AF considering risk factor burden and genetic predisposition is unknown. Methods: A total of 348,904 genetically unrelated participants without AF at baseline from the UK were categorized into three groups: index ages 45 years (n = 84,206), 55 years (n = 117,520), and 65 years (n = 147,178). Optimal, borderline, or elevated risk factor burden was determined by body mass index, blood pressure, diabetes mellitus, alcohol consumption, smoking status, and history of myocardial infarction or heart failure. Genetic predisposition was estimated using the polygenic risk score (PRS), constructed using 165 predefined genetic risk variants. The combined effects of risk factor burden and PRS on the risk of incident AF in 10 years were estimated for each index age. Fine and Gray models were developed to predict the 10-year risk of AF. Results: The overall 10-year risk of AF was 0.67% (95% CI: 0.61-0.73%) for index age 45 years, 2.05% (95% CI: 1.96-2.13%) for index age 55 years, and 6.34% (95% CI: 6.21-6.46%) for index age 65 years, respectively. An optimal risk factor burden was associated with later AF onset regardless of genetic predisposition and sex (P …). Conclusions: Risk factor burden together with a genetic predisposition is associated with the 10-year risk of AF. Our results may be helpful in selecting high-risk individuals for primary prevention of AF and facilitating subsequent health interventions.

University of Liverpool Repository

VBN

Application of X-Ray Inspection for Ultra High Voltage Gas-Insulated Switchgear

Author: Chongjian Ge
Jiachen Wang
Weidong Ding
Yishu Liu
Zhongbo Zheng
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref