54 research outputs found

    Sparse4D v3: Advancing End-to-End 3D Detection and Tracking

    In autonomous driving perception systems, 3D detection and tracking are the two fundamental tasks. This paper delves deeper into this field, building upon the Sparse4D framework. We introduce two auxiliary training tasks (Temporal Instance Denoising and Quality Estimation) and propose decoupled attention to make structural improvements, leading to significant enhancements in detection performance. Additionally, we extend the detector into a tracker using a straightforward approach that assigns instance ID during inference, further highlighting the advantages of query-based algorithms. Extensive experiments conducted on the nuScenes benchmark validate the effectiveness of the proposed improvements. With ResNet50 as the backbone, we witnessed enhancements of 3.0%, 2.2%, and 7.6% in mAP, NDS, and AMOTA, achieving 46.9%, 56.1%, and 49.0%, respectively. Our best model achieved 71.9% NDS and 67.7% AMOTA on the nuScenes test set. Code will be released at https://github.com/linxuewu/Sparse4D

    Sparse4D: Multi-view 3D Object Detection with Sparse Spatial-Temporal Fusion

    Bird's-eye-view (BEV) based methods have recently made great progress in the multi-view 3D detection task. Compared with BEV-based methods, sparse-based methods lag behind in performance but still have many non-negligible merits. To push sparse 3D detection further, in this work we introduce a novel method, named Sparse4D, which iteratively refines anchor boxes by sparsely sampling and fusing spatial-temporal features. (1) Sparse 4D Sampling: for each 3D anchor, we assign multiple 4D keypoints, which are then projected to multi-view/scale/timestamp image features to sample the corresponding features; (2) Hierarchy Feature Fusion: we hierarchically fuse the sampled features of different views/scales, different timestamps and different keypoints to generate a high-quality instance feature. In this way, Sparse4D can efficiently and effectively achieve 3D detection without relying on dense view transformation or global attention, and is friendlier to deployment on edge devices. Furthermore, we introduce an instance-level depth reweight module to alleviate the ill-posed issue in 3D-to-2D projection. In experiments, our method outperforms all sparse-based methods and most BEV-based methods on the detection task in the nuScenes dataset.
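
The sampling step described in this abstract (keypoints placed around a 3D anchor, then projected into an image to look up features) can be illustrated with a minimal sketch. It assumes a plain pinhole camera and points already expressed in the camera frame; the function names, the choice of seven fixed keypoints, and the intrinsics `fx, fy, cx, cy` are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


def keypoints_for_anchor(center: Point3D, size: Point3D) -> List[Point3D]:
    """Hypothetical fixed keypoints: the anchor center plus its six face centers."""
    x, y, z = center
    w, l, h = size
    return [
        (x, y, z),
        (x + w / 2, y, z), (x - w / 2, y, z),
        (x, y + l / 2, z), (x, y - l / 2, z),
        (x, y, z + h / 2), (x, y, z - h / 2),
    ]


def project_keypoint(point: Point3D, fx: float, fy: float,
                     cx: float, cy: float) -> Optional[Pixel]:
    """Pinhole projection of a camera-frame point; None if it is behind the camera."""
    x, y, z = point
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def project_anchor(center: Point3D, size: Point3D, fx: float, fy: float,
                   cx: float, cy: float) -> List[Pixel]:
    """Project every visible keypoint of one anchor into pixel coordinates."""
    pixels = []
    for kp in keypoints_for_anchor(center, size):
        uv = project_keypoint(kp, fx, fy, cx, cy)
        if uv is not None:
            pixels.append(uv)
    return pixels
```

In a full detector the returned pixel locations would index into multi-view, multi-scale and multi-timestamp feature maps; that lookup and the subsequent fusion are omitted here.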

    Accelerating Large Batch Training via Gradient Signal to Noise Ratio (GSNR)

    As models for natural language processing (NLP), computer vision (CV) and recommendation systems (RS) require surging computation, a large number of GPUs/TPUs are run in parallel with a large batch (LB) to improve training throughput. However, training such LB tasks often suffers from a large generalization gap and degraded final precision, which limits how far the batch size can be enlarged. In this work, we develop the variance reduced gradient descent technique (VRGD) based on the gradient signal to noise ratio (GSNR) and apply it to popular optimizers such as SGD/Adam/LARS/LAMB. We carry out a theoretical analysis of convergence rate to explain its fast training dynamics, and a generalization analysis to demonstrate its smaller generalization gap on LB training. Comprehensive experiments demonstrate that VRGD can accelerate training (1∼2×), narrow the generalization gap and improve final accuracy. We push the batch size limit of BERT pretraining up to 128k/64k and DLRM to 512k without noticeable accuracy loss. We improve ImageNet Top-1 accuracy at 96k by 0.52pp over LARS. The generalization gap of BERT and ImageNet training is significantly reduced by over 65%. Comment: 25 pages, 5 figures
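
The abstract does not define GSNR. A commonly used per-parameter definition is the squared mean of the gradient divided by its variance across samples or mini-batches, and the sketch below computes that quantity. The function name `gsnr`, the `eps` stabilizer, and the list-of-lists input layout are assumptions made for illustration; how VRGD uses the ratio inside each optimizer is not stated in the abstract and is not shown here.

```python
from statistics import fmean, pvariance
from typing import List, Sequence


def gsnr(per_batch_grads: Sequence[Sequence[float]], eps: float = 1e-12) -> List[float]:
    """Per-parameter GSNR: mean(g)^2 / var(g), taken across mini-batches.

    per_batch_grads[i][j] is the gradient of parameter j measured on mini-batch i.
    eps is a hypothetical stabilizer that avoids division by zero.
    """
    ratios = []
    for column in zip(*per_batch_grads):  # iterate over parameters
        values = list(column)
        mean = fmean(values)
        variance = pvariance(values, mu=mean)
        ratios.append(mean * mean / (variance + eps))
    return ratios


# Two mini-batches, two parameters: the first parameter has a consistent
# gradient direction (high GSNR), the second has gradients that cancel out.
grads = [[1.0, 1.0],
         [3.0, -1.0]]
ratios = gsnr(grads)
```

A parameter whose gradients agree across mini-batches gets a large ratio, while one whose gradients mostly cancel gets a ratio near zero.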

    Transcriptome analysis of the adenoma–carcinoma sequences identifies novel biomarkers associated with development of canine colorectal cancer

    The concept of adenoma-to-cancer transformation in human colorectal cancer (CRC) is widely accepted. However, the relationship between transcriptome features and adenoma to carcinoma transformation in canines is not clear. We collected transcriptome data from 8 normal colon tissues, 4 adenoma tissues, and 15 cancer tissues. Differential analysis was unable to determine the dynamic changes of genes but revealed that PFKFB3 may play a key role in this process. Enrichment analysis explained metabolic dysregulation, immunosuppression, and typical cancer pathways in canine colorectal tumors. MFuzz generated specific dynamic expression patterns of five differentially expressed genes (DEGs). Weighted correlation network analysis showed that DEGs in cluster 3 were associated with malignant tissues, revealing the key role of inflammatory and immune pathways in canine CRC, and the S100A protein family was also found to be involved in the malignant transformation of canine colorectal tumors. By comparing strategies between humans and dogs, we found five novel markers that may be drivers of CRC. Among them, GTBP4 showed excellent diagnostic and prognostic ability. This study was the first systematic exploration of transformation in canine CRC, complemented the molecular characteristics of the development and progression of canine CRC, and provided new potential biomarkers and comparative oncologic evidence for biomarker studies in human colorectal cancer

    Proteins Identified from Saliva and Salivary Glands of the Chinese Gall Aphid Schlechtendalia chinensis

    Aphid saliva plays an essential role in the interaction between aphids and their host plants. Several aphid salivary proteins have been identified, but none from galling aphids. Here, the salivary proteins from the Chinese gall aphid, Schlechtendalia chinensis, are analyzed via LC-MS/MS. A total of 31 proteins are identified directly from saliva collected via an artificial diet, and 141 proteins are identified from extracts derived from dissected salivary glands. Among these identified proteins, 17 are found in both collected saliva and dissected salivary glands. In comparison with salivary proteins from ten other free-living Hemipterans, the most striking feature of the salivary proteins from S. chinensis is the high proportion of proteins with binding activity, including DNA-, protein-, ATP-, and iron-binding proteins. These proteins may be involved in gall formation. These results provide a framework for future research to elucidate the molecular basis for gall induction by galling aphids.

    Heterogeneous Knowledge Fusion: A Novel Approach for Personalized Recommendation via LLM

    The analysis and mining of heterogeneous user behavior are of paramount importance in recommendation systems. However, the conventional approach of incorporating various types of heterogeneous behavior into recommendation models leads to feature sparsity and knowledge fragmentation issues. To address this challenge, we propose a novel approach for personalized recommendation via a Large Language Model (LLM), extracting and fusing heterogeneous knowledge from heterogeneous user behavior information. In addition, by combining heterogeneous knowledge and recommendation tasks, instruction tuning is performed on the LLM for personalized recommendations. The experimental results demonstrate that our method can effectively integrate heterogeneous user behavior and significantly improve recommendation performance. Comment: Accepted at RecSys 202

    Multicystic Changes of Juvenile Nasopharyngeal Angiofibroma: The First Case Report in the Literature

    Multicystic changes of juvenile nasopharyngeal angiofibroma: the first case report in the literature. Otolaryngologists, pathologists, and radiologists should pay attention to this infrequent occurrence.

    Preparation of N-, O-, and S-tri-doped biochar through one-pot pyrolysis of poplar and urea formaldehyde and its enhanced removal of tetracycline from wastewater

    In this study, biochar was prepared via hybrid doping of N, O, and S by applying one-pot pyrolysis of poplar wood and S-containing urea formaldehyde at 900 °C. Different doping ratios were adopted, and the contents of O, N, and S were in the ranges of 2.78–5.56%, 2.16–4.92%, and 1.42–4.98%, respectively. This hybrid doping significantly enhanced the efficiency of the removal of tetracycline (40 mg/L) from wastewater to 71.84% in comparison with that attained by using normal poplar biochar (29.45%). The adsorption kinetics and isotherms indicated that the adsorption process was favorable and was dominated by chemisorption instead of physisorption; the dominant adsorption process may be justified by the existence of abundant functional groups. The adsorption capacity was barely related to the surface area (R² = 0.478), while it was closely related to the concentration of graphitic N (R² = 0.985) because graphitic N enhanced the π–π interactions. The adsorption capacity was also highly related to the proportion of oxidized N and oxidized S owing to hydrogen bonding, which may have overlapped with the contribution of O-containing functional groups. This study presents a simple hybrid doping method for biochar modification and provides fundamental insights into the specific effects of O-, N- and S-containing functional groups on the performance of biochar for tetracycline removal.

    CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion

    Multi-modality (MM) image fusion aims to render fused images that maintain the merits of different modalities, e.g., functional highlight and detailed textures. To tackle the challenge of modeling cross-modality features and decomposing desirable modality-specific and modality-shared features, we propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network. Firstly, CDDFuse uses Restormer blocks to extract cross-modality shallow features. We then introduce a dual-branch Transformer-CNN feature extractor with Lite Transformer (LT) blocks leveraging long-range attention to handle low-frequency global features and Invertible Neural Networks (INN) blocks focusing on extracting high-frequency local information. A correlation-driven loss is further proposed to make the low-frequency features correlated and the high-frequency features uncorrelated based on the embedded information. Then, the LT-based global fusion and INN-based local fusion layers output the fused image. Extensive experiments demonstrate that our CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion. We also show that CDDFuse can boost the performance in downstream infrared-visible semantic segmentation and object detection in a unified benchmark. The code is available at https://github.com/Zhaozixiang1228/MMIF-CDDFuse. Comment: Accepted by CVPR 202
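
The idea of a correlation-driven loss, rewarding correlated low-frequency features and uncorrelated high-frequency features across two modalities, can be sketched with a Pearson correlation over flattened feature vectors. The ratio form below and the constant `eps = 1.01` are assumptions chosen so the denominator stays positive; the abstract does not give the exact loss, so this should not be read as the authors' formulation.

```python
from math import sqrt
from typing import Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length feature vectors (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = sqrt(var_a * var_b)
    return cov / denom if denom > 0.0 else 0.0


def decomposition_loss(low_a: Sequence[float], low_b: Sequence[float],
                       high_a: Sequence[float], high_b: Sequence[float],
                       eps: float = 1.01) -> float:
    """Hypothetical loss: small when the high-frequency features are uncorrelated
    and the low-frequency features are correlated. Because correlation >= -1,
    eps > 1 keeps the denominator positive."""
    return pearson(high_a, high_b) ** 2 / (pearson(low_a, low_b) + eps)


# Low-frequency features agree across modalities in both cases; only the
# high-frequency features differ.
good = decomposition_loss([1, 2, 3], [2, 4, 6], [1, 2, 3], [1, -2, 1])
bad = decomposition_loss([1, 2, 3], [2, 4, 6], [1, 2, 3], [1, 2, 3])
```

Here `good` is lower than `bad`, because in the second case the high-frequency features of the two modalities are fully correlated.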