2,318 research outputs found
Traffic-Aware Multi-Camera Tracking of Vehicles Based on ReID and Camera Link Model
Multi-target multi-camera tracking (MTMCT), i.e., tracking multiple targets
across multiple cameras, is a crucial technique for smart city applications. In
this paper, we propose an effective and reliable MTMCT framework for vehicles,
which consists of a traffic-aware single camera tracking (TSCT) algorithm, a
trajectory-based camera link model (CLM) for vehicle re-identification (ReID),
and a hierarchical clustering algorithm to obtain the cross camera vehicle
trajectories. First, the TSCT, which jointly considers vehicle appearance,
geometric features, and some common traffic scenarios, is proposed to track the
vehicles in each camera separately. Second, the trajectory-based CLM is adopted
to facilitate the relationship between each pair of adjacently connected
cameras and add spatio-temporal constraints for the subsequent vehicle ReID
with temporal attention. Third, the hierarchical clustering algorithm is used
to merge the vehicle trajectories among all the cameras to obtain the final
MTMCT results. Our proposed MTMCT is evaluated on the CityFlow dataset and
achieves a new state-of-the-art performance with IDF1 of 74.93%.Comment: Accepted by ACM International Conference on Multimedia 202
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects
Tracking humans that are interacting with the other subjects or environment
remains unsolved in visual tracking, because the visibility of the human of
interests in videos is unknown and might vary over time. In particular, it is
still difficult for state-of-the-art human trackers to recover complete human
trajectories in crowded scenes with frequent human interactions. In this work,
we consider the visibility status of a subject as a fluent variable, whose
change is mostly attributed to the subject's interaction with the surrounding,
e.g., crossing behind another object, entering a building, or getting into a
vehicle, etc. We introduce a Causal And-Or Graph (C-AOG) to represent the
causal-effect relations between an object's visibility fluent and its
activities, and develop a probabilistic graph model to jointly reason the
visibility fluent change (e.g., from visible to invisible) and track humans in
videos. We formulate this joint task as an iterative search of a feasible
causal graph structure that enables fast search algorithm, e.g., dynamic
programming method. We apply the proposed method on challenging video sequences
to evaluate its capabilities of estimating visibility fluent changes of
subjects and tracking subjects of interests over time. Results with comparisons
demonstrate that our method outperforms the alternative trackers and can
recover complete trajectories of humans in complicated scenarios with frequent
human interactions.Comment: accepted by CVPR 201
A Comprehensive Survey on Deep-Learning-based Vehicle Re-Identification: Models, Data Sets and Challenges
Vehicle re-identification (ReID) endeavors to associate vehicle images
collected from a distributed network of cameras spanning diverse traffic
environments. This task assumes paramount importance within the spectrum of
vehicle-centric technologies, playing a pivotal role in deploying Intelligent
Transportation Systems (ITS) and advancing smart city initiatives. Rapid
advancements in deep learning have significantly propelled the evolution of
vehicle ReID technologies in recent years. Consequently, undertaking a
comprehensive survey of methodologies centered on deep learning for vehicle
re-identification has become imperative and inescapable. This paper extensively
explores deep learning techniques applied to vehicle ReID. It outlines the
categorization of these methods, encompassing supervised and unsupervised
approaches, delves into existing research within these categories, introduces
datasets and evaluation criteria, and delineates forthcoming challenges and
potential research directions. This comprehensive assessment examines the
landscape of deep learning in vehicle ReID and establishes a foundation and
starting point for future works. It aims to serve as a complete reference by
highlighting challenges and emerging trends, fostering advancements and
applications in vehicle ReID utilizing deep learning models
- …