Robust Multi-Object Tracking: A Labeled Random Finite Set Approach
The labeled random finite set based generalized multi-Bernoulli filter is a tractable analytic solution for the multi-object tracking problem. The robustness of this filter depends on the filter having certain knowledge of the multi-object system. This dissertation presents techniques for robust tracking, built upon the labeled random finite set framework, for cases where complete information about the system is unavailable.
You Only Need Two Detectors to Achieve Multi-Modal 3D Multi-Object Tracking
Firstly, a new multi-object tracking framework is proposed in this paper
based on multi-modal fusion. By integrating object detection and multi-object
tracking into the same model, this framework avoids the complex data
association process in the classical tracking-by-detection (TBD) paradigm, and requires no additional
training. Secondly, confidence of historical trajectory regression is explored,
possible states of a trajectory in the current frame (weak object or strong
object) are analyzed and a confidence fusion module is designed to guide
non-maximum suppression of trajectory and detection for ordered association.
Finally, extensive experiments are conducted on the KITTI and Waymo datasets.
The results show that the proposed method can achieve robust tracking by using
only two modal detectors and it is more accurate than many of the latest TBD
paradigm-based multi-modal tracking methods. The source codes of the proposed
method are available at https://github.com/wangxiyang2022/YONTD-MOT
Comment: 10 pages, 9 figures
Robust Distributed Fusion with Labeled Random Finite Sets
This paper considers the problem of the distributed fusion of multi-object
posteriors in the labeled random finite set filtering framework, using the
Generalized Covariance Intersection (GCI) method. Our analysis shows that GCI
fusion with labeled multi-object densities strongly relies on label
consistencies between local multi-object posteriors at different sensor nodes,
and hence suffers from a severe performance degradation when perfect label
consistencies are violated. Moreover, we mathematically analyze this phenomenon
from the perspective of the Principle of Minimum Discrimination Information and the
so-called yes-object probability. Inspired by the analysis, we propose a novel
and general solution for the distributed fusion with labeled multi-object
densities that is robust to label inconsistencies between sensors.
Specifically, the labeled multi-object posteriors are firstly marginalized to
their unlabeled posteriors, which are then fused using the GCI method. We also
introduce a principled method to construct the labeled fused density and
produce tracks formally. Based on the developed theoretical framework, we
present tractable algorithms for the family of generalized labeled
multi-Bernoulli (GLMB) filters including δ-GLMB, marginalized
δ-GLMB and labeled multi-Bernoulli filters. The robustness and
efficiency of the proposed distributed fusion algorithm are demonstrated in
challenging tracking scenarios via numerical experiments.
Comment: 17 pages, 23 figures
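As a rough illustration of the GCI rule named in the abstract above, the sketch below fuses two scalar Gaussian densities by taking their normalised weighted geometric mean, which for Gaussians reduces to the familiar covariance-intersection update. It covers the single-object Gaussian case only and is not the labeled multi-object algorithm of the paper; the function name `gci_fuse_gaussian` and the weight parameter `w` are assumptions made for this example.

```python
def gci_fuse_gaussian(m1, v1, m2, v2, w=0.5):
    """Fuse scalar Gaussians N(m1, v1) and N(m2, v2) with GCI weight w.

    The normalised geometric mean p1^w * p2^(1 - w) of two Gaussians is
    again Gaussian; its precision is the weighted sum of the input
    precisions. (Hypothetical helper, not taken from the paper.)
    """
    if not 0.0 < w < 1.0:
        raise ValueError("w must lie strictly between 0 and 1")
    if v1 <= 0.0 or v2 <= 0.0:
        raise ValueError("variances must be positive")
    precision = w / v1 + (1.0 - w) / v2
    var = 1.0 / precision
    mean = var * (w * m1 / v1 + (1.0 - w) * m2 / v2)
    return mean, var


# Two sensors report the same object with equal uncertainty.
mean, var = gci_fuse_gaussian(0.0, 1.0, 2.0, 1.0, w=0.5)
```

With equal weights and equal variances the fused mean lies midway between the two estimates and the variance is not reduced, which reflects the conservative nature of this fusion rule when the correlation between sensors is unknown.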
Adaptive detection and tracking using multimodal information
This thesis describes work on fusing data from multiple sources of information, and focuses on two main areas: adaptive detection and adaptive object tracking in automated vision scenarios. The work on adaptive object detection explores a new paradigm in dynamic parameter selection, by selecting thresholds for object detection to maximise agreement between pairs of sources. Object tracking, a complementary technique to object detection, is also explored in a multi-source context, and an efficient framework for robust tracking, termed the Spatiogram Bank tracker, is proposed as a means to overcome the difficulties of traditional histogram tracking. As well as performing theoretical analysis of the proposed methods, specific example applications are given for both the detection and the tracking aspects, using thermal infrared and visible spectrum video data, as well as other multi-modal information sources.
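The idea of selecting detection thresholds to maximise agreement between a pair of sources can be sketched as a small grid search. This is only an illustration under stated assumptions: agreement is measured here as the Jaccard overlap of the two thresholded masks (the thesis may use a different measure), and the names `jaccard`, `select_thresholds`, and the toy score lists are hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard overlap of two equal-length boolean masks; 0.0 if both are empty."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0


def select_thresholds(scores_a, scores_b, candidates_a, candidates_b):
    """Return (threshold_a, threshold_b, agreement) with the highest agreement.

    Ties are resolved in favour of the first pair encountered.
    """
    best = (None, None, -1.0)
    for ta, tb in product(candidates_a, candidates_b):
        mask_a = [s >= ta for s in scores_a]
        mask_b = [s >= tb for s in scores_b]
        agreement = jaccard(mask_a, mask_b)
        if agreement > best[2]:
            best = (ta, tb, agreement)
    return best


# Toy per-pixel detection scores from two hypothetical sources.
thermal = [0.9, 0.8, 0.2, 0.1, 0.7]
visible = [0.6, 0.7, 0.5, 0.1, 0.8]
ta, tb, agreement = select_thresholds(
    thermal, visible, [0.3, 0.5, 0.75], [0.3, 0.55, 0.75]
)
```

Jaccard overlap is used rather than plain per-pixel agreement because the latter is trivially maximised by thresholds so high that both sources detect nothing.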
Multi-Object Tracking: A Computer Vision Paradigm
This paper delves into advancements and hurdles encountered in multi-object tracking, a critical aspect of computer vision, with a special emphasis on 'referring understanding.' This technique integrates natural language queries into multi-object tracking tasks, thus broadening the scope for practical applications. The innovative referring multi-object tracking (RMOT) approach emerges as a promising solution in this regard. The effectiveness of RMOT was tested using the Refer-KITTI dataset, a dataset specializing in traffic scenes. The evaluation revealed RMOT's ability to handle a diverse range of referent objects, its robust temporal dynamics, and a high level of adaptability. While the paper acknowledges the significant strides made with this approach, it also illuminates a few inherent limitations and new challenges such as multi-object prediction and cross-frame association. In addressing these issues, the paper attempts to retrain an end-to-end differentiable framework for RMOT, building on the latest DETR framework, suggesting promising prospects for future advancements in this domain. The ultimate goal of this paper is to refine the RMOT model further, promote a more profound understanding of the computer vision landscape, and underscore the technology's potential for future research and applications.
Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking
This work proposes an end-to-end multi-camera 3D multi-object tracking (MOT)
framework. It emphasizes spatio-temporal continuity and integrates both past
and future reasoning for tracked objects. Thus, we name it "Past-and-Future
reasoning for Tracking" (PF-Track). Specifically, our method adapts the
"tracking by attention" framework and represents tracked instances coherently
over time with object queries. To explicitly use historical cues, our "Past
Reasoning" module learns to refine the tracks and enhance the object features
by cross-attending to queries from previous frames and other objects. The
"Future Reasoning" module digests historical information and predicts robust
future trajectories. In the case of long-term occlusions, our method maintains
the object positions and enables re-association by integrating motion
predictions. On the nuScenes dataset, our method improves AMOTA by a large
margin and remarkably reduces ID-Switches by 90% compared to prior approaches,
which is an order of magnitude less. The code and models are made available at
https://github.com/TRI-ML/PF-Track.
Comment: CVPR 2023 Camera Ready, 15 pages, 8 figures