10 research outputs found

    MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report

    Autonomous driving without high-definition (HD) maps demands a higher level of active scene understanding. In this competition, the organizers provided the multi-perspective camera images and standard-definition (SD) maps to explore the boundaries of scene reasoning capabilities. We found that most existing algorithms construct Bird's Eye View (BEV) features from these multi-perspective images and use multi-task heads to delineate road centerlines, boundary lines, pedestrian crossings, and other areas. However, these algorithms perform poorly at the far end of roads and struggle when the primary subject in the image is occluded. Therefore, in this competition, we not only used multi-perspective images as input but also incorporated SD maps to address this issue. We employed map encoder pre-training to enhance the network's geometric encoding capabilities and utilized YOLOX to improve traffic element detection precision. Additionally, for area detection, we innovatively introduced LDTR and auxiliary tasks to achieve higher precision. As a result, our final OLUS score is 0.58

    Multi-source Distilling Domain Adaptation

    Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and unlabeled target domain, which motivates the research on domain adaptation (DA). Conventional DA methods usually assume that the labeled data is sampled from a single source distribution. However, in practice, labeled data may be collected from multiple sources, while naive application of the single-source DA algorithms may lead to suboptimal solutions. In this paper, we propose a novel multi-source distilling domain adaptation (MDDA) network, which not only considers the different distances among multiple sources and the target, but also investigates the different similarities of the source samples to the target ones. Specifically, the proposed MDDA includes four stages: (1) pre-train the source classifiers separately using the training data from each source; (2) adversarially map the target into the feature space of each source respectively by minimizing the empirical Wasserstein distance between source and target; (3) select the source training samples that are closer to the target to fine-tune the source classifiers; and (4) classify each encoded target feature by the corresponding source classifier, and aggregate the different predictions using the respective domain weight, which corresponds to the discrepancy between each source and target. Extensive experiments are conducted on public DA benchmarks, and the results demonstrate that the proposed MDDA significantly outperforms the state-of-the-art approaches. Our source code is released at: https://github.com/daoyuan98/MDDA. Comment: Accepted by AAAI 202
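Stage (4) of the MDDA abstract above, weighted aggregation of per-source predictions, can be sketched roughly as follows. The abstract only says that each domain weight "corresponds to the discrepancy" between a source and the target, so the inverse-distance weighting used here is an illustrative assumption rather than the paper's formula, and the names `domain_weights` and `aggregate_predictions` are hypothetical.

```python
def domain_weights(distances):
    """Turn per-source discrepancies into weights that sum to 1.

    Assumption (not from the abstract): a source that is closer to the
    target (smaller distance) gets a proportionally larger weight.
    """
    if any(d <= 0 for d in distances):
        raise ValueError("distances must be strictly positive")
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def aggregate_predictions(predictions, weights):
    """Weighted average of class-probability vectors, one vector per source."""
    n_classes = len(predictions[0])
    return [
        sum(w * p[c] for w, p in zip(weights, predictions))
        for c in range(n_classes)
    ]


# Two hypothetical sources; the first is three times closer to the target.
weights = domain_weights([1.0, 3.0])
combined = aggregate_predictions([[0.9, 0.1], [0.2, 0.8]], weights)
predicted_class = max(range(len(combined)), key=combined.__getitem__)
```

With these made-up numbers the closer source dominates the vote, which is the behaviour the abstract describes qualitatively.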

    LDTR: Transformer-based Lane Detection with Anchor-chain Representation

    Despite recent advances in lane detection methods, scenarios with limited or no visual cues for lanes due to factors such as lighting conditions and occlusion remain challenging and crucial for automated driving. Moreover, current lane representations require complex post-processing and struggle with specific instances. Inspired by the DETR architecture, we propose LDTR, a transformer-based model to address these issues. Lanes are modeled with a novel anchor-chain, regarding a lane as a whole from the beginning, which enables LDTR to handle special lanes inherently. To enhance lane instance perception, LDTR incorporates a novel multi-referenced deformable attention module to distribute attention around the object. Additionally, LDTR incorporates two line IoU algorithms to improve convergence efficiency and employs a Gaussian heatmap auxiliary branch to enhance model representation capability during training. To evaluate lane detection models, we rely on Fréchet distance, parameterized F1-score, and additional synthetic metrics. Experimental results demonstrate that LDTR achieves state-of-the-art performance on well-known datasets. Comment: Accepted by CVM 2024 and CVMJ. 16 pages, 14 figure
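The LDTR abstract above names Fréchet distance as one evaluation metric but does not say which variant is used. As a minimal sketch, assuming the common discrete Fréchet distance between two polylines (the function name `discrete_frechet` and the sample lanes are hypothetical):

```python
from math import hypot


def discrete_frechet(p, q):
    """Discrete Frechet distance between two polylines of (x, y) points."""
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both polylines need at least one point")
    # ca[i][j] holds the coupling distance between prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = hypot(p[i][0] - q[j][0], p[i][1] - q[j][1])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]


# A predicted lane offset by one unit from a straight ground-truth lane.
ground_truth = [(0, 0), (1, 0), (2, 0)]
predicted = [(0, 1), (1, 1), (2, 1)]
offset_distance = discrete_frechet(ground_truth, predicted)
```

Unlike a pointwise average, this measure respects point ordering along each lane, so a single large deviation anywhere along the curve determines the score.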

    An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos

    Emotion recognition in user-generated videos plays an important role in human-centered computing. Existing methods mainly employ a traditional two-stage shallow pipeline, i.e. extracting visual and/or audio features and training classifiers. In this paper, we propose to recognize video emotions in an end-to-end manner based on convolutional neural networks (CNNs). Specifically, we develop a deep Visual-Audio Attention Network (VAANet), a novel architecture that integrates spatial, channel-wise, and temporal attentions into a visual 3D CNN and temporal attentions into an audio 2D CNN. Further, we design a special classification loss, i.e. polarity-consistent cross-entropy loss, based on the polarity-emotion hierarchy constraint to guide the attention generation. Extensive experiments conducted on the challenging VideoEmotion-8 and Ekman-6 datasets demonstrate that the proposed VAANet outperforms the state-of-the-art approaches for video emotion recognition. Our source code is released at: https://github.com/maysonma/VAANet

    Augmented Multi-Component Recurrent Graph Convolutional Network for Traffic Flow Forecasting

    Due to the periodic and dynamic changes of traffic flow and the spatial–temporal coupling interaction of complex road networks, traffic flow forecasting is highly challenging and rarely yields satisfactory prediction results. In this paper, we propose a novel methodology named the Augmented Multi-component Recurrent Graph Convolutional Network (AM-RGCN) for traffic flow forecasting by addressing the problems above. We first introduce the augmented multi-component module to the traffic forecasting model to tackle the problem of periodic temporal shift emerging in traffic series. Then, we propose an encoder–decoder architecture for spatial–temporal prediction. Specifically, we propose the Temporal Correlation Learner (TCL) which incorporates one-dimensional convolution into LSTM to utilize the intrinsic temporal characteristics of traffic flow. Moreover, we combine TCL with the graph convolutional network to handle the spatial–temporal coupling interaction of the road network. Similarly, the decoder consists of TCL and convolutional neural networks to obtain high-dimensional representations from multi-step predictions based on spatial–temporal sequences. Extensive experiments on two real-world road traffic datasets, PEMSD4 and PEMSD8, demonstrate that our AM-RGCN achieves the best results

    The prognosis and metabolite changes of NSCLC patients receiving first‐line immunotherapy combined chemotherapy in different M1c categories according to 9th edition of TNM classification

    Background: The 9th edition of the TNM Classification for lung cancer delineates M1c into two subcategories: M1c1 (Multiple extrathoracic lesions within a single organ system) and M1c2 (Multiple extrathoracic lesions involving multiple organ systems). Existing research indicates that patients with lung cancer in stage M1c1 exhibit superior overall survival compared to those in stage M1c2. The primary frontline therapy for patients with advanced non‐small cell lung cancer (NSCLC), lacking driver gene mutations, involves the use of immune checkpoint inhibitors (ICIs) combined with chemotherapy. Nevertheless, a dearth of evidence exists regarding potential survival disparities between NSCLC patients with M1c1 and M1c2 undergoing first‐line immune‐chemotherapy, and reliable biomarkers for predicting treatment outcomes are elusive. Serum metabolic profiles may elucidate distinct prognostic mechanisms, necessitating the identification of divergent metabolites in M1c1 and M1c2 undergoing combination therapy. This study seeks to scrutinize survival discrepancies between various metastatic patterns (M1c1 and M1c2) and pinpoint metabolites associated with treatment outcomes in NSCLC patients undergoing first‐line ICIs combined with chemotherapy. Method: In this study, 33 NSCLC patients lacking driver gene mutations diagnosed with M1c1, and 22 similarly diagnosed with M1c2 according to the 9th edition of TNM Classification, were enrolled. These patients received first‐line PD‐1 inhibitor plus chemotherapy. The relationship between metastatic patterns and progression‐free survival (PFS) in patients undergoing combination therapy was analyzed using univariate and multivariate Cox regression models. Serum samples were obtained from all patients before treatment initiation for untargeted metabolomics analysis, aiming to identify differential metabolites. Results: In the univariate analysis of PFS, NSCLC patients in M1c1 receiving first‐line PD‐1 inhibitor plus chemotherapy exhibited an extended PFS (HR = 0.49, 95% CI, 0.27–0.88, p = 0.017). In multivariate PFS analyses, these M1c1 patients receiving first‐line PD‐1 inhibitor plus chemotherapy also demonstrated prolonged PFS (HR = 0.45, 95% CI, 0.22–0.92, p = 0.028). The serum metabolic profiles of M1c1 and M1c2 undergoing first‐line PD‐1 inhibitors plus chemotherapy displayed notable distinctions. In comparison to M1c1 patients, M1c2 patients exhibited alterations in various pathways pretreatment, including platelet activation, linoleic acid metabolism, and the VEGF signaling pathway. Diminished levels of lipid‐associated metabolites (diacylglycerol, sphingomyelin) were correlated with adverse outcomes. Conclusion: NSCLC patients in M1c1, devoid of driver gene mutations, receiving first‐line PD‐1 inhibitors combined with chemotherapy, experienced superior outcomes compared to M1c2 patients. Moreover, metabolomic profiles strongly correlated with the prognosis of these patients, and M1c2 patients with unfavorable outcomes manifested distinct changes in metabolic pathways before treatment. These changes predominantly involved alterations in lipid metabolism, such as decreased diacylglycerol and sphingomyelin, which may impact tumor migration and invasion

    Association of metabolomics with PD-1 inhibitor plus chemotherapy outcomes in patients with advanced non-small-cell lung cancer

    Background: Combining immune checkpoint inhibitors (ICIs) with chemotherapy has become a standard treatment for patients with non-small cell lung cancer (NSCLC) lacking driver gene mutations. Reliable biomarkers are essential for predicting treatment outcomes. Emerging evidence from various cancers suggests that early assessment of serum metabolites could serve as valuable biomarkers for predicting outcomes. This study aims to identify metabolites linked to treatment outcomes in patients with advanced NSCLC undergoing first-line or second-line therapy with programmed cell death 1 (PD-1) inhibitors plus chemotherapy. Method: 200 patients with advanced NSCLC receiving either first-line or second-line PD-1 inhibitor plus chemotherapy, and 50 patients undergoing first-line chemotherapy were enrolled in this study. The 200 patients receiving combination therapy were divided into a Discovery set (n=50) and a Validation set (n=150). These sets were further categorized into responder and non-responder groups based on progression-free survival (PFS) criteria (PFS≥12 and PFS<12 months). Serum samples were collected from all patients before treatment initiation for untargeted metabolomics analysis, with the goal of identifying and validating biomarkers that can predict the efficacy of immunotherapy plus chemotherapy. Additionally, the validated metabolites were grouped into high and low categories based on their medians, and their relationship with PFS was analyzed using Cox regression models in patients receiving combination therapy. Results: After the impact of chemotherapy was accounted for, two significant differential metabolites were identified in both the Discovery and Validation sets: N-(3-Indolylacetyl)-L-alanine and methomyl (VIP>1 and p<0.05). Notably, upregulation of both metabolites was observed in the group with a poorer prognosis. In the univariate analysis of PFS, lower levels of N-(3-Indolylacetyl)-L-alanine were associated with longer PFS (HR=0.59, 95% CI, 0.41 to 0.84, p=0.003), and a prolonged PFS was also indicated by lower levels of methomyl (HR=0.67, 95% CI, 0.47 to 0.96, p=0.029). In multivariate analyses of PFS, lower levels of N-(3-Indolylacetyl)-L-alanine were significantly associated with a longer PFS (HR=0.60, 95% CI, 0.37 to 0.98, p=0.041). Conclusion: Improved outcomes were associated with lower levels of N-(3-Indolylacetyl)-L-alanine in patients with stage IIIB-IV NSCLC lacking driver gene mutations, who underwent first-line or second-line therapy with PD-1 inhibitors combined with chemotherapy. Further exploration of the potential predictive value of pretreatment detection of N-(3-Indolylacetyl)-L-alanine in peripheral blood for the efficacy of combination therapy is warranted. Statement: The combination of ICIs and chemotherapy has established itself as the new standard of care for first-line or second-line treatment in patients with advanced NSCLC lacking oncogenic driver alterations. Therefore, identifying biomarkers that can predict the efficacy and prognosis of immunotherapy plus chemotherapy is of paramount importance. Currently, the only validated predictive biomarker is programmed cell death ligand-1 (PD-L1), but its predictive value is not absolute. Our study suggests that the detection of N-(3-Indolylacetyl)-L-alanine in patient serum with untargeted metabolomics prior to combined therapy may predict the efficacy of treatment. Compared with detecting PD-L1 expression, the advantage of our biomarker is that it is more convenient, more dynamic, and seems to work synergistically with PD-L1 expression