11 research outputs found

    A Study of Unsupervised Evaluation Metrics for Practical and Automatic Domain Adaptation

    Unsupervised domain adaptation (UDA) methods facilitate the transfer of models to target domains without labels. However, these methods still require a labeled target validation set for hyper-parameter tuning and model selection. In this paper, we aim to find an evaluation metric capable of assessing the quality of a transferred model without access to target validation labels. We begin with a metric based on the mutual information of the model predictions. Through empirical analysis, we identify three prevalent issues with this metric: 1) It does not account for the source structure. 2) It can be easily attacked. 3) It fails to detect negative transfer caused by the over-alignment of source and target features. To address the first two issues, we incorporate source accuracy into the metric and employ a new MLP classifier that is held out during training, significantly improving the result. To tackle the final issue, we integrate this enhanced metric with data augmentation, resulting in a novel unsupervised UDA metric called the Augmentation Consistency Metric (ACM). Additionally, we empirically demonstrate the shortcomings of previous experiment settings and conduct large-scale experiments to validate the effectiveness of our proposed metric. Furthermore, we employ our metric to automatically search for the optimal hyper-parameter set, achieving superior performance compared to manually tuned sets across four common benchmarks. Code will be available soon.
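
The abstract does not give the formula for the mutual-information metric it starts from. A common form, assumed here, is the entropy of the average prediction minus the average per-sample entropy. The sketch below is only an illustration of that assumed form; the function names and the toy predictions are hypothetical and not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def prediction_mutual_information(predictions):
    """Assumed form: H(mean prediction) - mean(H(prediction)).

    `predictions` is a list of per-sample class-probability vectors.
    High values need predictions that are confident AND spread over classes.
    """
    n = len(predictions)
    num_classes = len(predictions[0])
    # Marginal class distribution over the whole unlabeled target set.
    marginal = [sum(row[c] for row in predictions) / n for c in range(num_classes)]
    # Average uncertainty of the individual predictions.
    mean_conditional = sum(entropy(row) for row in predictions) / n
    return entropy(marginal) - mean_conditional


# Hypothetical toy predictions for a two-class problem.
confident_and_diverse = [[1.0, 0.0], [0.0, 1.0]]
uncertain = [[0.5, 0.5], [0.5, 0.5]]
collapsed = [[1.0, 0.0], [1.0, 0.0]]

mi_diverse = prediction_mutual_information(confident_and_diverse)
mi_uncertain = prediction_mutual_information(uncertain)
mi_collapsed = prediction_mutual_information(collapsed)
```

Under this assumed form, the score uses target predictions only, which is consistent with the first issue listed in the abstract: nothing in it refers to the source data.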

    SelFLoc: Selective Feature Fusion for Large-scale Point Cloud-based Place Recognition

    Point cloud-based place recognition is crucial for mobile robots and autonomous vehicles, especially when the global positioning sensor is not accessible. LiDAR points are scattered on the surfaces of objects and buildings, which have strong shape priors along different axes. To enhance message passing along particular axes, the Stacked Asymmetric Convolution Block (SACB) is designed, which is one of the main contributions of this paper. Comprehensive experiments demonstrate that asymmetric convolution and the corresponding strategies employed by SACB contribute to a more effective representation of point cloud features. On this basis, the Selective Feature Fusion Block (SFFB), formed by stacking point- and channel-wise gating layers in a predefined sequence, is proposed to selectively boost salient local features in certain key regions and to align the features before the fusion phase. SACBs and SFFBs are combined to construct a robust and accurate architecture for point cloud-based place recognition, termed SelFLoc. Comparative experimental results show that SelFLoc achieves state-of-the-art (SOTA) performance on the Oxford benchmark and three in-house benchmarks, with an improvement of 1.6 percentage points in mean average recall@1.

    M³CS: Multi-Target Masked Point Modeling with Learnable Codebook and Siamese Decoders

    Masked point modeling has become a promising scheme of self-supervised pre-training for point clouds. Existing methods reconstruct either the original points or related features as the objective of pre-training. However, considering the diversity of downstream tasks, the model needs both low- and high-level representation modeling capabilities to capture geometric details and semantic contexts during pre-training. To this end, M³CS is proposed to equip the model with these abilities. Specifically, with a masked point cloud as input, M³CS introduces two decoders to predict masked representations and the original points simultaneously. Because an extra decoder doubles the parameters of the decoding process and may lead to overfitting, we propose siamese decoders to keep the number of learnable parameters unchanged. Further, we propose an online codebook that projects continuous tokens into discrete ones before reconstructing masked points. In this way, we force the decoder to work through combinations of tokens rather than memorizing each token. Comprehensive experiments show that M³CS achieves superior performance on both classification and segmentation tasks, outperforming existing methods.
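
To make the codebook step concrete, the sketch below shows one way continuous tokens can be projected onto discrete ones: a nearest-neighbor lookup against a table of code vectors. The abstract does not say how the projection or the online update is done, so the lookup rule, the function names, and the toy values are all assumptions for illustration, and the codebook update is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(tokens, codebook):
    """Return, for each continuous token, the index of its nearest code vector.

    Nearest-neighbor assignment is an assumed rule, not the paper's stated one.
    """
    indices = []
    for token in tokens:
        distances = [squared_distance(token, code) for code in codebook]
        indices.append(distances.index(min(distances)))
    return indices


# Hypothetical 2-D codebook with three entries and three continuous tokens.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
tokens = [[0.1, -0.2], [0.9, 1.2], [0.2, 1.9]]

indices = quantize(tokens, codebook)
# Discrete tokens passed on to the decoder in place of the continuous ones.
quantized = [codebook[i] for i in indices]
```

After this step the decoder only ever sees entries from the finite codebook, which matches the stated goal of working through combinations of tokens.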

    CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention

    While features of different scales are perceptually important to visual inputs, existing vision transformers do not yet take advantage of them explicitly. To this end, we first propose a cross-scale vision transformer, CrossFormer. It introduces a cross-scale embedding layer (CEL) and a long-short distance attention (LSDA). On the one hand, CEL blends each token with multiple patches of different scales, providing the self-attention module itself with cross-scale features. On the other hand, LSDA splits the self-attention module into a short-distance one and a long-distance counterpart, which not only reduces the computational burden but also keeps both small-scale and large-scale features in the tokens. Moreover, through experiments on CrossFormer, we observe two further issues that affect vision transformers' performance, i.e., the enlarging self-attention maps and amplitude explosion. Thus, we further propose a progressive group size (PGS) paradigm and an amplitude cooling layer (ACL) to alleviate the two issues, respectively. CrossFormer incorporating PGS and ACL is called CrossFormer++. Extensive experiments show that CrossFormer++ outperforms the other vision transformers on image classification, object detection, instance segmentation, and semantic segmentation tasks. The code will be available at: https://github.com/cheerss/CrossFormer. Comment: 16 pages, 7 figures
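
The split into short- and long-distance attention can be illustrated on a 1-D token sequence. The sketch assumes that the short-distance branch groups adjacent tokens and the long-distance branch groups tokens sampled at a fixed interval, with attention computed only inside each group. The abstract states only that the module is split in two, so the grouping rule, the function names, and the sizes used here are assumptions.

```python
def short_distance_groups(num_tokens, group_size):
    """Group adjacent tokens: [0..G-1], [G..2G-1], ... (assumes G divides N)."""
    return [list(range(start, start + group_size))
            for start in range(0, num_tokens, group_size)]


def long_distance_groups(num_tokens, interval):
    """Group tokens sampled with a fixed stride: [0, I, 2I, ...], [1, I+1, ...]."""
    return [list(range(offset, num_tokens, interval)) for offset in range(interval)]


def attention_pairs(groups):
    """Number of query-key pairs when attention stays inside each group."""
    return sum(len(group) ** 2 for group in groups)


# Hypothetical sizes: 8 tokens, groups of 4 neighbors, sampling interval 2.
num_tokens = 8
sda = short_distance_groups(num_tokens, group_size=4)
lda = long_distance_groups(num_tokens, interval=2)

full_pairs = num_tokens ** 2        # unrestricted self-attention
sda_pairs = attention_pairs(sda)    # one short-distance layer
lda_pairs = attention_pairs(lda)    # one long-distance layer
```

In this toy setting each grouped layer computes half as many query-key pairs as full attention, while the long-distance groups still connect tokens from opposite ends of the sequence.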

    UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

    In the context of autonomous driving, the significance of effective feature learning is widely acknowledged. While conventional 3D self-supervised pre-training methods have shown widespread success, most methods follow the ideas originally designed for 2D images. In this paper, we present UniPAD, a novel self-supervised learning paradigm applying 3D volumetric differentiable rendering. UniPAD implicitly encodes 3D space, facilitating the reconstruction of continuous 3D shape structures and the intricate appearance characteristics of their 2D projections. The flexibility of our method enables seamless integration into both 2D and 3D frameworks, enabling a more holistic comprehension of the scenes. We demonstrate the feasibility and effectiveness of UniPAD by conducting extensive experiments on various downstream 3D tasks. Our method significantly improves lidar-, camera-, and lidar-camera-based baselines by 9.1, 7.7, and 6.9 NDS, respectively. Notably, our pre-training pipeline achieves 73.2 NDS for 3D object detection and 79.4 mIoU for 3D semantic segmentation on the nuScenes validation set, achieving state-of-the-art results in comparison with previous methods. The code will be available at https://github.com/Nightmare-n/UniPAD. Comment: CVPR202

    CVR-LSE: Compact Vectorization Representation of Local Static Environments for Unmanned Ground Vehicles

    To meet the requirements of general static obstacle detection, this paper proposes a compact vectorization representation approach of local static environments for unmanned ground vehicles. First, by fusing LiDAR and IMU data, high-frequency pose information is obtained. Then, through two-dimensional (2D) obstacle point generation, a process for maintaining a fixed-size grid map is proposed. Finally, the local static environment is described via multiple convex polygons, which is realized through double threshold-based boundary simplification and convex polygon segmentation. Our proposed approach has been applied in a practical driverless project in a park, and the qualitative experimental results on typical scenes verify its effectiveness and robustness. In addition, the quantitative evaluation shows that it represents the local static environment with fewer points (decreased by about 60%) than traditional grid map-based methods. Furthermore, the running time (15 ms) shows that the proposed approach can be used for real-time local static environment perception. The corresponding code can be accessed at https://github.com/ghm0819/cvr_lse

    Suppressed polysulfide shuttling and improved Li+ transport in Li-S batteries enabled by NbN modified PP separator

    No full text
    The practical application of lithium-sulfur batteries is limited by rapid capacity fading and early battery failure caused by the shuttling of soluble polysulfides. Thus, the key to improving the performance of a lithium-sulfur battery is to suppress the shuttling effect. Herein, we report a high-performance NbN modified PP separator for lithium-sulfur battery applications. The NbN functionalized layer of the separator can strongly interact with soluble polysulfide species, as demonstrated by both experimental investigation and theoretical verification. As a result of the high polysulfide affinity, the composite separator effectively prevents the shuttling of soluble polysulfides through the separator. In addition, the high electrolyte wettability of the NbN modification layer lowers the resistance of Li+ transport through the separator. Owing to these favorable features, a lithium-sulfur battery assembled with the multifunctional NbN modified separator exhibits excellent cycling performance, with a reversible capacity of 554.6 mAh g−1 after 200 cycles at 0.2 C. The significantly improved rate performance of the battery based on the modified separator is also demonstrated. Our results show that NbN is a promising material for the modification of separators in lithium-sulfur batteries and that separator engineering is an effective strategy for improving their performance.

    Multifunctional Polypropylene Separator via Cooperative Modification and Its Application in the Lithium–Sulfur Battery

    No full text
    The continuous shuttling of dissolved polysulfides between the electrodes is the primary cause of the rapid decay of lithium–sulfur batteries. Modulation of the separator–electrolyte interface through separator modification is a promising strategy to inhibit polysulfide shuttling. In this work, we develop a graphene oxide and ferrocene comodified polypropylene separator with multifunctionality at the separator–electrolyte interface. The graphene oxide on the functionalized separator can physically adsorb the polysulfides, while the ferrocene component effectively facilitates the conversion of the adsorbed polysulfides. Owing to the combination of these beneficial functionalities, the separator delivers excellent battery performance, with a high reversible capacity of 409 mAh g−1 after 500 cycles at 0.2 C. We anticipate that the combinatorial separator functionalization proposed herein is an effective approach for improving the performance of lithium–sulfur batteries.