411 research outputs found
Unsupervised Person Re-identification by Deep Learning Tracklet Association
Most existing person re-identification (re-id) methods rely on supervised model
learning on per-camera-pair manually labelled pairwise training data. This
leads to poor scalability in practical re-id deployment due to the lack of
exhaustive identity labelling of image positive and negative pairs for every
camera pair. In this work, we address this problem by proposing an unsupervised
re-id deep learning approach capable of incrementally discovering and
exploiting the underlying re-id discriminative information from automatically
generated person tracklet data from videos in an end-to-end model optimisation.
We formulate a Tracklet Association Unsupervised Deep Learning (TAUDL)
framework characterised by jointly learning per-camera (within-camera) tracklet
association (labelling) and cross-camera tracklet correlation by maximising the
discovery of most likely tracklet relationships across camera views. Extensive
experiments demonstrate the superiority of the proposed TAUDL model over the
state-of-the-art unsupervised and domain adaptation re-id methods using six
person re-id benchmarking datasets.
Comment: ECCV 2018 Ora
An equalised global graphical model-based approach for multi-camera object tracking
Non-overlapping multi-camera visual object tracking typically consists of two
steps: single camera object tracking and inter-camera object tracking. Most
tracking methods focus on single camera object tracking, which happens in the
same scene, while for real surveillance scenes, inter-camera object tracking is
needed and single camera tracking methods cannot work effectively. In this
paper, we try to improve the overall multi-camera object tracking performance
by a global graph model with an improved similarity metric. Our method treats
the similarities of single camera tracking and inter-camera tracking
differently and obtains the optimization in a global graph model. The results
show that our method can work better even in the condition of poor single
camera object tracking.
Comment: 13 pages, 17 figure
Tracking Persons-of-Interest via Unsupervised Representation Adaptation
Multi-face tracking in unconstrained videos is a challenging problem as faces
of one person often appear drastically different in multiple shots due to
significant variations in scale, pose, expression, illumination, and make-up.
Existing multi-target tracking methods often use low-level features which are
not sufficiently discriminative for identifying faces with such large
appearance variations. In this paper, we tackle this problem by learning
discriminative, video-specific face representations using convolutional neural
networks (CNNs). Unlike existing CNN-based approaches which are only trained on
large-scale face image datasets offline, we use the contextual constraints to
generate a large number of training samples for a given video, and further
adapt the pre-trained face CNN to specific videos using discovered training
samples. Using these training samples, we optimize the embedding space so that
the Euclidean distances correspond to a measure of semantic face similarity via
minimizing a triplet loss function. With the learned discriminative features,
we apply the hierarchical clustering algorithm to link tracklets across
multiple shots to generate trajectories. We extensively evaluate the proposed
algorithm on two sets of TV sitcoms and YouTube music videos, analyze the
contribution of each component, and demonstrate significant performance
improvement over existing techniques.
Comment: Project page: http://vllab1.ucmerced.edu/~szhang/FaceTracking
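The triplet loss mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the squared Euclidean distance, and the margin value of 1.0 are assumptions chosen for illustration, and plain Python lists stand in for CNN embeddings.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on embeddings (margin is an assumed value).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` in squared distance.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings: two faces of the same person and one of another.
anchor, positive = [0.0, 0.0], [0.0, 1.0]
well_separated = triplet_loss(anchor, positive, [3.0, 0.0])
too_close = triplet_loss(anchor, positive, [1.0, 0.0])
```

Minimising this quantity over many triplets pushes same-identity embeddings together and different-identity embeddings apart, which is what lets Euclidean distance act as a similarity measure.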
Real-Time Visual Tracking and Identification for a Team of Homogeneous Humanoid Robots
The use of a team of humanoid robots to collaborate in completing a task is
an increasingly important field of research. One of the challenges in achieving
collaboration is mutual identification and tracking of the robots. This work
presents a real-time vision-based approach to the detection and tracking of
robots of known appearance, based on the images captured by a stationary robot.
A Histogram of Oriented Gradients descriptor is used to detect the robots and
the robot headings are estimated by a multiclass classifier. The tracked robots
report their own heading estimate from magnetometer readings. For tracking, a
cost function based on position and heading is applied to each of the
tracklets, and a globally optimal labeling of the detected robots is found
using the Hungarian algorithm. The complete identification and tracking system
was tested using two igus Humanoid Open Platform robots on a soccer field. We
expect that a similar system can be used with other humanoid robots, such as
Nao and DARwIn-OP.
Comment: 20th RoboCup International Symposium, Leipzig, Germany, 201
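The labeling step described in the abstract above, a cost based on position and heading followed by a globally optimal assignment, might look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's method: the cost weighting `w_heading`, the tuple layout `(x, y, heading_degrees)`, and all function names are hypothetical, and a brute-force search over permutations is swapped in for the Hungarian algorithm. For small inputs it finds the same minimum-cost assignment, only less efficiently.

```python
import math
from itertools import permutations


def heading_diff(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def cost(track, detection, w_heading=0.01):
    """Assumed cost: position distance plus weighted heading difference."""
    (tx, ty, th), (dx, dy, dh) = track, detection
    return math.hypot(tx - dx, ty - dy) + w_heading * heading_diff(th, dh)


def best_assignment(tracks, detections):
    """Exhaustively find the minimum-total-cost labeling.

    Returns a list where entry i is the detection index given to track i.
    Requires len(detections) >= len(tracks); brute force replaces the
    Hungarian algorithm here and is only practical for a few robots.
    """
    best, best_total = None, math.inf
    for perm in permutations(range(len(detections)), len(tracks)):
        total = sum(cost(tracks[i], detections[j]) for i, j in enumerate(perm))
        if total < best_total:
            best, best_total = list(perm), total
    return best


# Two hypothetical tracked robots and two detections listed in swapped order.
tracks = [(0.0, 0.0, 0.0), (5.0, 5.0, 180.0)]
detections = [(5.2, 5.1, 170.0), (0.1, 0.0, 350.0)]
assignment = best_assignment(tracks, detections)
```

With more than a handful of robots, a polynomial-time solver for the linear assignment problem would be the practical choice.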
Addressing Ambiguity in Multi-target Tracking by Hierarchical Strategy
This paper presents a novel hierarchical approach for the simultaneous
tracking of multiple targets in a video. We use a network flow approach to link
detections at the low level and tracklets at the high level. At each step of the
hierarchy, the confidence of candidates is measured by using a new scoring
system, ConfRank, that considers the quality and the quantity of its
neighborhood. The output of the first stage is a collection of safe tracklets
and unlinked high-confidence detections. For each individual detection, we
determine whether it belongs to an existing tracklet or starts a new one. We
show that our framework recovers missed detections and reduces identity
switches. The proposed tracker is referred to as TVOD for multi-target tracking
using the visual tracker and generic object detector. We achieve competitive
results with fewer identity switches on several datasets compared to the
state of the art.
Comment: 5 pages, Accepted in International Conference of Image Processing,
201
Who did What at Where and When: Simultaneous Multi-Person Tracking and Activity Recognition
We present a bootstrapping framework to simultaneously improve multi-person
tracking and activity recognition at individual, interaction and social group
activity levels. The inference consists of identifying trajectories of all
pedestrian actors, individual activities, pairwise interactions, and collective
activities, given the observed pedestrian detections. Our method uses a
graphical model to represent and solve the joint tracking and recognition
problems in multiple stages: (1) activity-aware tracking, (2) joint interaction
recognition and occlusion recovery, and (3) collective activity recognition. We
solve the where and when problem with visual tracking, as well as the who and
what problem with recognition. High-order correlations among the visible and
occluded individuals, pairwise interactions, groups, and activities are then
solved using a hypergraph formulation within the Bayesian framework.
Experiments on several benchmarks show the advantages of our approach over
state-of-the-art methods.
Machine Learning Methods for Data Association in Multi-Object Tracking
Data association is a key step within the multi-object tracking pipeline that
is notoriously challenging due to its combinatorial nature. A popular and
general way to formulate data association is as the NP-hard multidimensional
assignment problem (MDAP). Over the last few years, data-driven approaches to
assignment have become increasingly prevalent as these techniques have started
to mature. We focus this survey solely on learning algorithms for the
assignment step of multi-object tracking, and we attempt to unify various
methods by highlighting their connections to linear assignment as well as to
the MDAP. First, we review probabilistic and end-to-end optimization approaches
to data association, followed by methods that learn association affinities from
data. We then compare the performance of the methods presented in this survey,
and conclude by discussing future research directions.
Comment: Accepted for publication in ACM Computing Survey
Spatial-Temporal Relation Networks for Multi-Object Tracking
Recent progress in multiple object tracking (MOT) has shown that a robust
similarity score is key to the success of trackers. A good similarity score is
expected to reflect multiple cues, e.g. appearance, location, and topology,
over a long period of time. However, these cues are heterogeneous, making them
hard to combine in a unified network. As a result, existing methods usually
encode them in separate networks or require a complex training approach. In
this paper, we present a unified framework for similarity measurement which
could simultaneously encode various cues and perform reasoning across both
spatial and temporal domains. We also study the feature representation of a
tracklet-object pair in depth, showing that a proper design of the pair features
can substantially benefit trackers. The resulting approach is named spatial-temporal
relation networks (STRN). It runs in a feed-forward way and can be trained in
an end-to-end manner. The state-of-the-art accuracy was achieved on all of the
MOT15-17 benchmarks using public detection and online settings.
MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking
In the recent past, the computer vision community has developed centralized
benchmarks for the performance evaluation of a variety of tasks, including
generic object and pedestrian detection, 3D reconstruction, optical flow,
single-object short-term tracking, and stereo estimation. Despite potential
pitfalls of such benchmarks, they have proved to be extremely helpful to
advance the state of the art in the respective area. Interestingly, there has
been rather limited work on the standardization of quantitative benchmarks for
multiple target tracking. One of the few exceptions is the well-known PETS
dataset, targeted primarily at surveillance applications. Despite being widely
used, it is often applied inconsistently, for example by using different
subsets of the available data, different ways of training the models, or
differing evaluation scripts. This paper describes our work toward a novel
multiple object tracking benchmark aimed to address such issues. We discuss the
challenges of creating such a framework, collecting existing and new data,
gathering state-of-the-art methods to be tested on the datasets, and finally
creating a unified evaluation system. With MOTChallenge we aim to pave the way
toward a unified evaluation framework for a more meaningful quantification of
multi-target tracking.