Learnable Graph Matching: A Practical Paradigm for Data Association
Data association is at the core of many computer vision tasks, e.g., multiple
object tracking, image matching, and point cloud registration. Existing methods
usually solve the data association problem by network flow optimization,
bipartite matching, or end-to-end learning directly. Despite their popularity,
we find some defects of the current solutions: they mostly ignore the
intra-view context information; besides, they either train deep association
models in an end-to-end way and hardly utilize the advantage of
optimization-based assignment methods, or only use an off-the-shelf neural
network to extract features. In this paper, we propose a general learnable
graph matching method to address these issues. Specifically, we model the
intra-view relationships as an undirected graph. Then data association turns
into a general graph matching problem between graphs. Furthermore, to make
optimization end-to-end differentiable, we relax the original graph matching
problem into a continuous quadratic program and then incorporate training
into a deep graph neural network with the KKT conditions and the implicit function
theorem. On the MOT task, our method achieves state-of-the-art performance on
several MOT datasets. For image matching, our method outperforms
state-of-the-art methods with half the training data and iterations on a popular
indoor dataset, ScanNet. Code will be available at
https://github.com/jiaweihe1996/GMTracker.
Comment: Submitted to TPAMI on Mar 21, 2022. arXiv admin note: substantial
text overlap with arXiv:2103.1617
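As a rough illustration of the discrete graph matching problem that the abstract says is relaxed into a quadratic program, the sketch below scores a node assignment by unary similarity plus edge agreement and finds the best assignment by brute force. It is not the authors' method: the function names `matching_score` and `best_matching`, the toy graphs, and the unary weights are all assumptions made for this example, and exhaustive search stands in for the differentiable solver.

```python
from itertools import permutations

def matching_score(adj1, adj2, perm, unary):
    """Quadratic matching objective (hypothetical helper).

    perm[i] is the node of graph 2 assigned to node i of graph 1.
    The score adds unary node similarities and a pairwise term that
    rewards edges of graph 1 landing on edges of graph 2.
    """
    n = len(perm)
    score = sum(unary[i][perm[i]] for i in range(n))
    for i in range(n):
        for j in range(n):
            if i != j:
                score += adj1[i][j] * adj2[perm[i]][perm[j]]
    return score

def best_matching(adj1, adj2, unary):
    """Exhaustive search over one-to-one assignments (tiny graphs only)."""
    n = len(adj1)
    return max(permutations(range(n)),
               key=lambda p: matching_score(adj1, adj2, p, unary))

# Toy data (assumed): graph 1 is the path 0-1-2; graph 2 is the same path
# with its nodes relabelled so that node 0 is the middle node.
adj1 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
adj2 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
# A single unary hint that node 0 of graph 1 resembles node 2 of graph 2.
unary = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]

best = best_matching(adj1, adj2, unary)
print(best, matching_score(adj1, adj2, best, unary))
```

The pairwise term is what carries the intra-view context: two assignments with the same unary score are separated by how well they preserve edges.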
3D Video Object Detection with Learnable Object-Centric Global Optimization
We explore long-term temporal visual correspondence-based optimization for 3D
video object detection in this work. Visual correspondence refers to one-to-one
mappings for pixels across multiple images. Correspondence-based optimization
is the cornerstone for 3D scene reconstruction but is less studied in 3D video
object detection, because moving objects violate multi-view geometry
constraints and are treated as outliers during scene reconstruction. We address
this issue by treating objects as first-class citizens during
correspondence-based optimization. In this work, we propose BA-Det, an
end-to-end optimizable object detector with object-centric temporal
correspondence learning and featuremetric object bundle adjustment.
Empirically, we verify the effectiveness and efficiency of BA-Det for multiple
baseline 3D detectors under various setups. Our BA-Det achieves SOTA
performance on the large-scale Waymo Open Dataset (WOD) with only marginal
computation cost. Our code is available at
https://github.com/jiaweihe1996/BA-Det.
Comment: CVPR202
A complete characterization of split digraphs with a strong arc decomposition
A strong arc decomposition of a (multi-)digraph $D = (V, A)$ is a
partition of its arc set $A$ into two disjoint arc sets $A_1$ and $A_2$ such
that both of the spanning subdigraphs $D_1 = (V, A_1)$ and $D_2 = (V, A_2)$ are strong.
In this paper, we fully characterize all split digraphs that do not have a
strong decomposition. This resolves two problems proposed by Bang-Jensen and
Wang and contributes to a series of efforts aimed at addressing this problem
for specific graph classes. This work continues the research on semicomplete
composition [Bang-Jensen, Gutin and Yeo, J. Graph Theory, 2020]; on locally
semicomplete digraphs [Bang-Jensen and Huang, J. Combin. Theory Ser. B, 2010];
on a type of tournaments [Bang-Jensen and Yeo, Combinatorica, 2004].
Comment: 34 page
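The definition of a strong arc decomposition can be checked directly on small examples. The sketch below is only an illustration of the definition, not of the paper's characterization: the helper names `is_strong` and `is_strong_arc_decomposition` are hypothetical, vertices are assumed to be numbered 0..n-1 with n >= 1, and arcs are selected by index so that parallel arcs of a multi-digraph stay distinct.

```python
def is_strong(n, arcs):
    """True if the digraph on vertices 0..n-1 with these arcs is strongly connected."""
    def reaches_all(adj):
        # Depth-first search from vertex 0.
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return len(seen) == n

    fwd = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for u, v in arcs:
        fwd[u].append(v)
        rev[v].append(u)
    # Strong iff vertex 0 reaches every vertex and every vertex reaches vertex 0.
    return reaches_all(fwd) and reaches_all(rev)

def is_strong_arc_decomposition(n, arcs, first_part):
    """first_part is a set of indices into arcs; the remaining arcs form the second part."""
    a1 = [a for i, a in enumerate(arcs) if i in first_part]
    a2 = [a for i, a in enumerate(arcs) if i not in first_part]
    return is_strong(n, a1) and is_strong(n, a2)

# Toy example (assumed): the complete digraph on 3 vertices splits into
# a directed 3-cycle and its reverse, both of which are strong.
arcs = [(0, 1), (1, 2), (2, 0), (1, 0), (2, 1), (0, 2)]
print(is_strong_arc_decomposition(3, arcs, {0, 1, 2}))  # cycle vs. reverse cycle
print(is_strong_arc_decomposition(3, arcs, {0, 1, 3}))  # vertex 2 has no out-arc in part 1
```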
RRSR: Reciprocal Reference-based Image Super-Resolution with Progressive Feature Alignment and Selection
Reference-based image super-resolution (RefSR) is a promising SR branch and
has shown great potential in overcoming the limitations of single image
super-resolution. While previous state-of-the-art RefSR methods mainly focus on
improving the efficacy and robustness of reference feature transfer, it is
generally overlooked that a well-reconstructed SR image should enable better SR
reconstruction for similar LR images when it is used as a reference. Therefore,
in this work, we propose a reciprocal learning framework that can appropriately
leverage such a fact to reinforce the learning of a RefSR network. Besides, we
deliberately design a progressive feature alignment and selection module for
further improving the RefSR task. The newly proposed module aligns
reference-input images at multi-scale feature spaces and performs
reference-aware feature selection in a progressive manner, thus more precise
reference features can be transferred into the input features and the network
capability is enhanced. Our reciprocal learning paradigm is model-agnostic and
it can be applied to arbitrary RefSR models. We empirically show that multiple
recent state-of-the-art RefSR models can be consistently improved with our
reciprocal learning paradigm. Furthermore, our proposed model together with the
reciprocal learning strategy sets new state-of-the-art performances on multiple
benchmarks.
Comment: 8 figures, 17 page
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving
In autonomous driving, predicting future events in advance and evaluating the
foreseeable risks empowers autonomous vehicles to better plan their actions,
enhancing safety and efficiency on the road. To this end, we propose Drive-WM,
the first driving world model compatible with existing end-to-end planning
models. Through a joint spatial-temporal modeling facilitated by view
factorization, our model generates high-fidelity multiview videos in driving
scenes. Building on its powerful generation ability, we showcase the potential
of applying the world model for safe driving planning for the first time.
Particularly, our Drive-WM enables driving into multiple futures based on
distinct driving maneuvers, and determines the optimal trajectory according to
the image-based rewards. Evaluation on real-world driving datasets verifies
that our method could generate high-quality, consistent, and controllable
multiview videos, opening up possibilities for real-world simulations and safe
planning.
Comment: Project page: https://drive-wm.github.io. Code:
https://github.com/BraveGroup/Drive-W
BMAD: Benchmarks for Medical Anomaly Detection
Anomaly detection (AD) is a fundamental research problem in machine learning
and computer vision, with practical applications in industrial inspection,
video surveillance, and medical diagnosis. In medical imaging, AD is especially
vital for detecting and diagnosing anomalies that may indicate rare diseases or
conditions. However, there is a lack of a universal and fair benchmark for
evaluating AD methods on medical images, which hinders the development of more
generalized and robust AD methods in this specific domain. To bridge this gap,
we introduce a comprehensive evaluation benchmark for assessing anomaly
detection methods on medical images. This benchmark encompasses six reorganized
datasets from five medical domains (i.e. brain MRI, liver CT, retinal OCT,
chest X-ray, and digital histopathology) and three key evaluation metrics, and
includes a total of fourteen state-of-the-art AD algorithms. This standardized
and well-curated medical benchmark with a well-structured codebase enables
comprehensive comparisons among recently proposed anomaly detection methods. It
will help the community conduct fair comparisons and advance the
field of AD on medical imaging. More information on BMAD is available in our
GitHub repository: https://github.com/DorisBao/BMA
Enhancing End-to-End Autonomous Driving with Latent World Model
End-to-end autonomous driving has garnered widespread attention. Current
end-to-end approaches largely rely on the supervision from perception tasks
such as detection, tracking, and map segmentation to aid in learning scene
representations. However, these methods require extensive annotations,
hindering the data scalability. To address this challenge, we propose a novel
self-supervised method to enhance end-to-end driving without the need for
costly labels. Specifically, our framework LAW uses a LAtent World
model to predict future latent features based on the predicted ego actions and
the latent feature of the current frame. The predicted latent features are
supervised by the actually observed features in the future. This supervision
jointly optimizes the latent feature learning and action prediction, which
greatly enhances the driving performance. As a result, our approach achieves
state-of-the-art performance in both open-loop and closed-loop benchmarks
without costly annotations
The p38 MAPK Inhibitor SB203580 Abrogates Tumor Necrosis Factor-Induced Proliferative Expansion of Mouse CD4+Foxp3+ Regulatory T Cells
There is now compelling evidence that tumor necrosis factor (TNF) preferentially activates and expands CD4+Foxp3+ regulatory T cells (Tregs) through TNF receptor type II (TNFR2). However, it remains unclear which signal transduction pathway(s) of TNFR2 is required for the stimulation of Tregs. Previously, it was shown that the TNF–TNFR2 interaction resulted in the activation of a number of signaling pathways, including p38 MAPK and NF-κB, in T cells. We thus examined the role of p38 MAPK and NF-κB in TNF-mediated activation of Tregs by using specific small molecule inhibitors. The results show that treatment with the specific p38 MAPK inhibitor SB203580, rather than NF-κB inhibitors (Sulfasalazine and Bay 11-7082), abrogated TNF-induced expansion of Tregs in vitro. Furthermore, upregulation of TNFR2 and Foxp3 expression in Tregs by TNF was also markedly inhibited by SB203580. The proliferative expansion and the upregulation of TNFR2 expression on Tregs in LPS-treated mice were mediated by TNF–TNFR2 interaction, as shown by our previous study. The expansion of Tregs in LPS-treated mice was also markedly inhibited by in vivo treatment with SB203580. Taken together, our data clearly indicate that the activation of p38 MAPK is attributable to TNF/TNFR2-mediated activation and proliferative expansion of Tregs. Our results also suggest that targeting of p38 MAPK by a pharmacological agent may represent a novel strategy to up- or downregulate Treg activity for therapeutic purposes