Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism
In this paper, we propose a CNN-based framework for online MOT. The
framework exploits the merits of single object trackers in adapting appearance
models and searching for the target in the next frame. Naively applying a
single object tracker to MOT, however, suffers from computational inefficiency
and from drift caused by occlusion. Our framework achieves computational
efficiency by sharing features and using ROI-Pooling to obtain individual
features for each target, while online-learned target-specific CNN layers
adapt the appearance model of each target. We further introduce a
spatial-temporal attention mechanism (STAM) to handle the drift caused by
occlusion and interaction among targets. A visibility map of each target is
learned and used to infer a spatial attention map, which is then applied to
weight the features. In addition, the occlusion status estimated from the
visibility map controls the online updating process through a weighted loss on
training samples with different occlusion statuses in different frames; this
can be regarded as a temporal attention mechanism. The proposed algorithm
achieves 34.3% and 46.0% MOTA on the challenging MOT15 and MOT16 benchmarks,
respectively.
Comment: Accepted at the International Conference on Computer Vision (ICCV) 201
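The spatial weighting of ROI features by a visibility-derived attention map, and the visibility-controlled loss weighting, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation; all function names are hypothetical:

```python
import numpy as np

def spatial_attention_pool(roi_feat, visibility):
    """Weight ROI features with a spatial attention map inferred from a
    visibility map, then pool. roi_feat: (C, H, W); visibility: (H, W) in [0, 1]."""
    # Normalise the visibility map into an attention map summing to 1.
    attn = visibility / (visibility.sum() + 1e-8)
    # Weight each channel spatially and pool over the ROI.
    weighted = roi_feat * attn[None, :, :]
    return weighted.sum(axis=(1, 2))  # (C,) attended feature vector

def temporal_loss_weight(occlusion_score):
    """Temporal-attention analogue: heavily occluded training samples
    contribute less to the online update (illustrative weighting)."""
    return 1.0 - occlusion_score
```

A fully visible target (visibility ≈ 1 everywhere) reduces to plain average pooling, while occluded regions are suppressed before pooling.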
Detection Recovery in Online Multi-Object Tracking with Sparse Graph Tracker
In existing joint detection and tracking methods, pairwise relational
features are used to match previous tracklets to current detections. However,
these features may not be discriminative enough for a tracker to identify a
target among a large number of detections, and selecting only high-scored
detections for tracking can miss detections whose confidence scores are low.
In the online setting, this results in disconnected tracklets that cannot be
recovered. To address this, we present Sparse Graph Tracker (SGT), a novel
online graph tracker that uses higher-order relational features, made more
discriminative by aggregating the features of neighboring detections and their
relations. SGT converts video data into a graph in which detections, their
connections, and the relational features of two connected nodes are
represented by nodes, edges, and edge features, respectively. The strong edge
features allow SGT to track targets using tracking candidates selected from
the top-K scored detections with a large K. As a result, even low-scored
detections can be tracked and missed detections recovered. Robustness to the
choice of K is demonstrated through extensive experiments. On the MOT16/17/20
and HiEve Challenge benchmarks, SGT outperforms state-of-the-art trackers at
real-time inference speed; in particular, it shows a large MOTA improvement on
MOT20 and the HiEve Challenge. Code is available at
https://github.com/HYUNJS/SGT.
Comment: Accepted to WACV 2023; fix figure
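The graph construction over top-K scored detections can be sketched as below; this is a simplified illustration (2-D positions plus a score, with a toy relational edge feature), not SGT's actual feature design, and all names are assumptions:

```python
import numpy as np

def build_sparse_graph(tracklets, detections, k=3):
    """Connect each tracklet node to the top-K scored detections and attach a
    simple relational edge feature (position offset + score difference).
    tracklets/detections: arrays of rows (x, y, score)."""
    edges, edge_feats = [], []
    for i, t in enumerate(tracklets):
        # Keep the top-K detections by score, so even low-scored detections
        # can enter the graph as tracking candidates when K is large.
        order = np.argsort(-detections[:, 2])[:k]
        for j in order:
            d = detections[j]
            edges.append((i, j))
            edge_feats.append([d[0] - t[0], d[1] - t[1], d[2] - t[2]])
    return edges, np.array(edge_feats)
```

With a large K the candidate set includes low-confidence detections, which is what lets a learned edge scorer recover detections a score threshold would have dropped.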
Deconvolutional networks for point-cloud vehicle detection and tracking in driving scenarios
Vehicle detection and tracking is a core ingredient for developing autonomous driving applications in urban scenarios. Recent image-based Deep Learning (DL) techniques are obtaining breakthrough results in these perception tasks. However, DL research has not yet advanced much towards processing 3D point clouds from lidar range-finders. These sensors are very common in autonomous vehicles since, despite not providing as semantically rich information as images, their performance is more robust under harsh weather conditions than vision sensors. In this paper we present a full vehicle detection and tracking system that works with 3D lidar information only. Our detection step uses a Convolutional Neural Network (CNN) that receives as input a featured representation of the 3D information provided by a Velodyne HDL-64 sensor and returns a per-point classification of whether it belongs to a vehicle or not. The classified point cloud is then geometrically processed to generate observations for a multi-object tracking system implemented via a number of Multi-Hypothesis Extended Kalman Filters (MH-EKF) that estimate the position and velocity of the surrounding vehicles. The system is thoroughly evaluated on the KITTI tracking dataset, and we show the performance boost provided by our CNN-based vehicle detector over a standard geometric approach. Our lidar-based approach uses only about 4% of the data needed for an image-based detector while achieving similarly competitive results.
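The position-and-velocity estimation step can be sketched with a single predict/update cycle of a constant-velocity Kalman filter; this is a linear simplification of the paper's Multi-Hypothesis EKF, and all parameter values are illustrative assumptions:

```python
import numpy as np

def cv_kalman_step(x, P, z, dt=0.1, q=1.0, r=0.5):
    """One predict/update step of a constant-velocity Kalman filter over the
    state (px, py, vx, vy), given a position observation z = (px, py)."""
    F = np.array([[1., 0., dt, 0.],   # constant-velocity motion model
                  [0., 1., 0., dt],
                  [0., 0., 1., 0.],
                  [0., 0., 0., 1.]])
    H = np.array([[1., 0., 0., 0.],   # we observe position only
                  [0., 1., 0., 0.]])
    Q = q * np.eye(4)                 # process noise
    R = r * np.eye(2)                 # observation noise
    # Predict.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the position observation.
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P
```

A multi-hypothesis extension would run several such filters per target under different data-association hypotheses and keep the most likely ones.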
Learning Non-Uniform Hypergraph for Multi-Object Tracking
The majority of Multi-Object Tracking (MOT) algorithms based on the
tracking-by-detection scheme do not use higher-order dependencies among
objects or tracklets, which makes them less effective in handling complex
scenarios. In this work, we present a new near-online MOT algorithm based on a
non-uniform hypergraph, which can model different degrees of dependencies
among tracklets within a unified objective. The nodes in the hypergraph
correspond to tracklets, and hyperedges of different degrees encode various
kinds of dependencies among them. Rather than setting the weights of
hyperedges of different degrees empirically, they are learned automatically
using the structural support vector machine (SSVM) algorithm. Experiments on
several challenging datasets (PETS09, the ParkingLot sequence, SubwayFace, and
the MOT16 benchmark) demonstrate that our method achieves favorable
performance against state-of-the-art MOT methods.
Comment: 11 pages, 4 figures, accepted by AAAI 201
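The idea of one objective combining hyperedges of different degrees through per-degree weights can be sketched as a weighted sum of hyperedge potentials; in the paper the weights are learned by SSVM, whereas here they are simply given, and all names and values are illustrative:

```python
def hypergraph_score(hyperedges, w):
    """Score a tracklet labeling on a non-uniform hypergraph as a weighted sum
    of hyperedge potentials, with one learned weight per hyperedge degree.
    hyperedges: list of (node_tuple, potential); w: degree -> weight."""
    total = 0.0
    for nodes, potential in hyperedges:
        degree = len(nodes)          # non-uniform: degrees may differ
        total += w[degree] * potential
    return total

# A degree-2 (pairwise) and a degree-3 hyperedge with example potentials.
edges = [((0, 1), 0.8), ((1, 2, 3), 0.5)]
w = {2: 1.0, 3: 0.4}                 # per-degree weights (SSVM would fit these)
```

Setting all weights for degrees above 2 to zero recovers an ordinary pairwise (graph-based) objective, which is the special case most tracking-by-detection methods use.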
Protein Tracking by CNN-Based Candidate Pruning and Two-Step Linking with Bayesian Network
Protein trafficking plays a vital role in understanding many biological
processes and diseases. Automated tracking of protein vesicles is challenging
due to their erratic behaviour, changing appearance, and visual clutter. In
this paper we present a novel tracking approach that uses a two-step linking
process exploiting a probabilistic graphical model to predict tracklet
linkage. The vesicles are initially detected with the help of a candidate
selection process, where candidates are identified by a multi-scale
spot-enhancing filter and then pruned and selected by a lightweight
convolutional neural network. At the linking stage, tracklets are formed based
on distance and detection assignment, implemented via a combinatorial
optimization algorithm. Each tracklet is described by a number of parameters
used to evaluate the probability of tracklet connection through inference over
a Bayesian network. Tracking results are presented for confocal fluorescence
microscopy data of protein trafficking in epithelial cells. The proposed
method achieves a root mean square error (RMSE) of 1.39 for vesicle
localisation and a score of 0.7 for the degree of track matching with ground
truth. The presented method is also evaluated against the state-of-the-art
"TrackMate" framework.
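The first linking step, assigning detections to tracklet endpoints, can be sketched with a greedy nearest-neighbour assignment; this is a simple stand-in for the paper's combinatorial optimization, and the function name and distance threshold are illustrative assumptions:

```python
import numpy as np

def link_detections(tracklet_ends, detections, max_dist=5.0):
    """Greedily assign each detection to the nearest tracklet endpoint within
    max_dist. tracklet_ends: (T, 2) last positions; detections: (D, 2).
    Returns {tracklet_index: detection_index}."""
    # Pairwise Euclidean distances between endpoints and detections.
    dists = np.linalg.norm(tracklet_ends[:, None, :] - detections[None, :, :],
                           axis=2)
    links, used = {}, set()
    # Process candidate pairs in order of increasing distance, taking each
    # tracklet and detection at most once.
    for i, j in sorted(np.ndindex(*dists.shape), key=lambda ij: dists[ij]):
        if i in links or j in used or dists[i, j] > max_dist:
            continue
        links[i] = j
        used.add(j)
    return links
```

The second step, deciding whether two tracklets belong to the same track, would then score each candidate pair of tracklets with the Bayesian network described above.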