187 research outputs found

    Learnable Graph Matching: A Practical Paradigm for Data Association

    Data association is at the core of many computer vision tasks, e.g., multiple object tracking, image matching, and point cloud registration. Existing methods usually solve the data association problem by network flow optimization, bipartite matching, or end-to-end learning directly. Despite their popularity, we find some defects in the current solutions: they mostly ignore the intra-view context information; moreover, they either train deep association models end-to-end and hardly exploit the advantages of optimization-based assignment methods, or only use an off-the-shelf neural network to extract features. In this paper, we propose a general learnable graph matching method to address these issues. Specifically, we model the intra-view relationships as an undirected graph. Data association then turns into a general graph matching problem between graphs. Furthermore, to make optimization end-to-end differentiable, we relax the original graph matching problem into continuous quadratic programming and then incorporate training into a deep graph neural network with KKT conditions and the implicit function theorem. In the MOT task, our method achieves state-of-the-art performance on several MOT datasets. For image matching, our method outperforms state-of-the-art methods with half the training data and iterations on a popular indoor dataset, ScanNet. Code will be available at https://github.com/jiaweihe1996/GMTracker. Comment: Submitted to TPAMI on Mar 21, 2022. arXiv admin note: substantial text overlap with arXiv:2103.1617

    3D Video Object Detection with Learnable Object-Centric Global Optimization

    We explore long-term temporal visual correspondence-based optimization for 3D video object detection in this work. Visual correspondence refers to one-to-one mappings for pixels across multiple images. Correspondence-based optimization is the cornerstone for 3D scene reconstruction but is less studied in 3D video object detection, because moving objects violate multi-view geometry constraints and are treated as outliers during scene reconstruction. We address this issue by treating objects as first-class citizens during correspondence-based optimization. In this work, we propose BA-Det, an end-to-end optimizable object detector with object-centric temporal correspondence learning and featuremetric object bundle adjustment. Empirically, we verify the effectiveness and efficiency of BA-Det for multiple baseline 3D detectors under various setups. Our BA-Det achieves SOTA performance on the large-scale Waymo Open Dataset (WOD) with only marginal computation cost. Our code is available at https://github.com/jiaweihe1996/BA-Det. Comment: CVPR202

    A complete characterization of split digraphs with a strong arc decomposition

    A strong arc decomposition of a (multi-)digraph $D(V, A)$ is a partition of its arc set $A$ into two disjoint arc sets $A_1$ and $A_2$ such that both of the spanning subdigraphs $D(V, A_1)$ and $D(V, A_2)$ are strong. In this paper, we fully characterize all split digraphs that do not have a strong decomposition. This resolves two problems proposed by Bang-Jensen and Wang and contributes to a series of efforts aimed at addressing this problem for specific graph classes. This work continues the research on semicomplete composition [Bang-Jensen, Gutin and Yeo, J. Graph Theory, 2020]; on locally semicomplete digraphs [Bang-Jensen and Huang, J. Combin. Theory Ser. B, 2010]; on a type of tournaments [Bang-Jensen and Yeo, Combinatorica, 2004]. Comment: 34 page

    RRSR:Reciprocal Reference-based Image Super-Resolution with Progressive Feature Alignment and Selection

    Reference-based image super-resolution (RefSR) is a promising SR branch and has shown great potential in overcoming the limitations of single image super-resolution. While previous state-of-the-art RefSR methods mainly focus on improving the efficacy and robustness of reference feature transfer, it is generally overlooked that a well-reconstructed SR image should, when used as a reference, enable better SR reconstruction for similar LR images. Therefore, in this work, we propose a reciprocal learning framework that leverages this fact to reinforce the learning of a RefSR network. In addition, we design a progressive feature alignment and selection module to further improve the RefSR task. The newly proposed module aligns reference-input images at multi-scale feature spaces and performs reference-aware feature selection in a progressive manner, so that more precise reference features can be transferred into the input features and the network capability is enhanced. Our reciprocal learning paradigm is model-agnostic and can be applied to arbitrary RefSR models. We empirically show that multiple recent state-of-the-art RefSR models can be consistently improved with our reciprocal learning paradigm. Furthermore, our proposed model together with the reciprocal learning strategy sets new state-of-the-art performances on multiple benchmarks. Comment: 8 figures, 17 page

    Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving

    In autonomous driving, predicting future events in advance and evaluating the foreseeable risks empowers autonomous vehicles to better plan their actions, enhancing safety and efficiency on the road. To this end, we propose Drive-WM, the first driving world model compatible with existing end-to-end planning models. Through a joint spatial-temporal modeling facilitated by view factorization, our model generates high-fidelity multiview videos in driving scenes. Building on its powerful generation ability, we showcase the potential of applying the world model for safe driving planning for the first time. Particularly, our Drive-WM enables driving into multiple futures based on distinct driving maneuvers, and determines the optimal trajectory according to the image-based rewards. Evaluation on real-world driving datasets verifies that our method could generate high-quality, consistent, and controllable multiview videos, opening up possibilities for real-world simulations and safe planning. Comment: Project page: https://drive-wm.github.io. Code: https://github.com/BraveGroup/Drive-W

    BMAD: Benchmarks for Medical Anomaly Detection

    Anomaly detection (AD) is a fundamental research problem in machine learning and computer vision, with practical applications in industrial inspection, video surveillance, and medical diagnosis. In medical imaging, AD is especially vital for detecting and diagnosing anomalies that may indicate rare diseases or conditions. However, there is no universal and fair benchmark for evaluating AD methods on medical images, which hinders the development of more generalized and robust AD methods in this specific domain. To bridge this gap, we introduce a comprehensive evaluation benchmark for assessing anomaly detection methods on medical images. This benchmark encompasses six reorganized datasets from five medical domains (i.e., brain MRI, liver CT, retinal OCT, chest X-ray, and digital histopathology) and three key evaluation metrics, and includes a total of fourteen state-of-the-art AD algorithms. This standardized and well-curated medical benchmark, together with its well-structured codebase, enables comprehensive comparisons among recently proposed anomaly detection methods. It will help the community conduct fair comparisons and advance the field of AD on medical imaging. More information on BMAD is available in our GitHub repository: https://github.com/DorisBao/BMA

    Enhancing End-to-End Autonomous Driving with Latent World Model

    End-to-end autonomous driving has garnered widespread attention. Current end-to-end approaches largely rely on the supervision from perception tasks such as detection, tracking, and map segmentation to aid in learning scene representations. However, these methods require extensive annotations, hindering the data scalability. To address this challenge, we propose a novel self-supervised method to enhance end-to-end driving without the need for costly labels. Specifically, our framework LAW uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame. The predicted latent features are supervised by the actually observed features in the future. This supervision jointly optimizes the latent feature learning and action prediction, which greatly enhances the driving performance. As a result, our approach achieves state-of-the-art performance in both open-loop and closed-loop benchmarks without costly annotations.

    The p38 MAPK Inhibitor SB203580 Abrogates Tumor Necrosis Factor-Induced Proliferative Expansion of Mouse CD4+Foxp3+ Regulatory T Cells

    There is now compelling evidence that tumor necrosis factor (TNF) preferentially activates and expands CD4+Foxp3+ regulatory T cells (Tregs) through TNF receptor type II (TNFR2). However, it remains unclear which signal transduction pathway(s) of TNFR2 is required for the stimulation of Tregs. Previously, it was shown that the interaction of TNF–TNFR2 resulted in the activation of a number of signaling pathways, including p38 MAPK and NF-κB, in T cells. We thus examined the role of p38 MAPK and NF-κB in TNF-mediated activation of Tregs by using specific small molecule inhibitors. The results show that treatment with the specific p38 MAPK inhibitor SB203580, rather than NF-κB inhibitors (Sulfasalazine and Bay 11-7082), abrogated TNF-induced expansion of Tregs in vitro. Furthermore, upregulation of TNFR2 and Foxp3 expression in Tregs by TNF was also markedly inhibited by SB203580. The proliferative expansion and the upregulation of TNFR2 expression on Tregs in LPS-treated mice were mediated by TNF–TNFR2 interaction, as shown by our previous study. The expansion of Tregs in LPS-treated mice was also markedly inhibited by in vivo treatment with SB203580. Taken together, our data clearly indicate that the activation of p38 MAPK is attributable to TNF/TNFR2-mediated activation and proliferative expansion of Tregs. Our results also suggest that targeting of p38 MAPK by a pharmacological agent may represent a novel strategy to up- or downregulate Treg activity for therapeutic purposes.