Object-Aware Tracking and Mapping
Reasoning about geometric properties of digital cameras and optical physics enabled
researchers to build methods that localise cameras in 3D space from a video
stream, while – often simultaneously – constructing a model of the environment.
Related techniques have evolved substantially since the 1980s, leading to increasingly
accurate estimations. Traditionally, however, the quality of results is strongly
affected by the presence of moving objects, incomplete data, or difficult surfaces
– i.e. surfaces that are not Lambertian or lack texture. One insight of this work is
that these problems can be addressed by going beyond geometrical and optical constraints,
in favour of object level and semantic constraints. Incorporating specific
types of prior knowledge in the inference process, such as motion or shape priors,
leads to approaches with distinct advantages and disadvantages.
After introducing relevant concepts in Chapter 1 and Chapter 2, methods for building
object-centric maps in dynamic environments using motion priors are investigated
in Chapter 5. Chapter 6 addresses the same problem as Chapter 5, but presents
an approach which relies on semantic priors rather than motion cues. To fully exploit
semantic information, Chapter 7 discusses the conditioning of shape representations
on prior knowledge and the practical application to monocular, object-aware
reconstruction systems.
Detection-aware multi-object tracking evaluation
Master Universitario en Deep Learning for Audio and Video Signal Processing
Multi-Object Tracking (MOT) is an active topic in the computer vision field. It is a
complex task that requires a detector, to identify objects, and a tracker, to follow
them. It is useful for self-driving, surveillance and robot vision, among others, where
research teams and companies are trying to improve their models. In order to determine
which model performs better, they are scored using tracking metrics.
In this thesis we experiment with detection-aware MOT metrics by using correlation matrices. By analyzing the results, we find that tracking metrics suffer from
certain issues that prevent them from correctly reflecting tracking performance. The
performance of the detector is relevant when scoring tracking models. The problem
observed is that tracking metrics weigh the elements that evaluate detection
performance differently. Thus, improving one aspect of the detector that carries a high weight in the MOT
metric will significantly improve the tracker’s score without necessarily reflecting the
amount of effort done by the tracker. That is, trackers are not evaluated in a balanced
way.
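The imbalance described above can be illustrated with a short sketch. The abstract does not say which metrics were analysed; MOTA from the CLEAR MOT family is assumed here only as a widely used example, and all counts are hypothetical. MOTA sums false negatives, false positives and identity switches with equal weight per error, so in practice the far more numerous detection errors dominate the score.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Hypothetical error counts for one sequence with 1000 ground-truth boxes.
baseline = mota(false_negatives=300, false_positives=100, id_switches=50, num_gt=1000)

# Same tracker, better detector: half as many missed objects.
better_detector = mota(false_negatives=150, false_positives=100, id_switches=50, num_gt=1000)

# Same detector, better tracker: half as many identity switches.
better_tracker = mota(false_negatives=300, false_positives=100, id_switches=25, num_gt=1000)

print(f"baseline        {baseline:.3f}")
print(f"better detector {better_detector:.3f}")
print(f"better tracker  {better_tracker:.3f}")
```

With these made-up numbers, halving the detector's misses moves the score much further than halving the tracker's identity switches, which is the kind of effect the thesis argues a tracker-focused metric should avoid.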
In order to solve this issue with the tracker scoring, we present a new multi-object
tracking metric, based on the effort done by the tracker given a certain set of detections.
This effort is calculated based on the improvement of the bounding boxes over the ones
given by the detector and on the precision with which the tracker keeps track of the objects in a sequence.
The metric has been tested on two widely employed datasets and proves reliable
when scoring trackers. It also does not suffer from the problem presented above.
Uncertainty-aware Unsupervised Multi-Object Tracking
Without manually annotated identities, unsupervised multi-object trackers
struggle to learn reliable feature embeddings. This makes the
similarity-based inter-frame association stage error-prone as well, which gives
rise to an uncertainty problem. The frame-by-frame accumulated uncertainty prevents
trackers from learning a feature embedding that is consistent against time variation.
To avoid this uncertainty problem, recent self-supervised techniques have been
adopted, but they fail to capture temporal relations. The inter-frame
uncertainty still exists. In fact, this paper argues that although the
uncertainty problem is inevitable, it is possible to leverage the uncertainty
itself to improve the learned consistency in turn. Specifically, an
uncertainty-based metric is developed to verify and rectify the risky
associations. The resulting accurate pseudo-tracklets boost the learning of
feature consistency. Accurate tracklets can also incorporate temporal
information into spatial transformation. This paper proposes a tracklet-guided
augmentation strategy to simulate tracklets' motion, which adopts a
hierarchical uncertainty-based sampling mechanism for hard sample mining. The
final unsupervised MOT framework, namely U2MOT, is shown to be effective on
the MOT-Challenge and VisDrone-MOT benchmarks. U2MOT achieves SOTA performance
among the published supervised and unsupervised trackers.
Comment: Accepted by International Conference on Computer Vision (ICCV) 202
Learning Background-Aware Correlation Filters for Visual Tracking
Correlation Filters (CFs) have recently demonstrated excellent performance in
terms of rapidly tracking objects under challenging photometric and geometric
variations. The strength of the approach comes from its ability to efficiently
learn - "on the fly" - how the object is changing over time. A fundamental
drawback to CFs, however, is that the background of the object is not
modelled over time, which can lead to suboptimal results. In this paper we
propose a Background-Aware CF that can model how both the foreground and
background of the object vary over time. Our approach, like conventional CFs,
is extremely computationally efficient - and extensive experiments over
multiple tracking benchmarks demonstrate the superior accuracy and real-time
performance of our method compared to the state-of-the-art trackers including
those based on a deep learning paradigm.
Instance Flow Based Online Multiple Object Tracking
We present a method to perform online Multiple Object Tracking (MOT) of known
object categories in monocular video data. Current Tracking-by-Detection MOT
approaches build on top of 2D bounding box detections. In contrast, we exploit
state-of-the-art instance aware semantic segmentation techniques to compute 2D
shape representations of target objects in each frame. We predict position and
shape of segmented instances in subsequent frames by exploiting optical flow
cues. We define an affinity matrix between instances of subsequent frames which
reflects locality and visual similarity. The instance association is solved by
applying the Hungarian method. We evaluate different configurations of our
algorithm using the MOT 2D 2015 train dataset. The evaluation shows that our
tracking approach is able to track objects with high relative motions. In
addition, we provide results of our approach on the MOT 2D 2015 test set for
comparison with previous works. We achieve a MOTA score of 32.1.
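The association step described in this abstract can be sketched roughly as follows. This is not the authors' implementation: the mask IoU and histogram-intersection terms, the `weight` parameter, and the dictionary layout are all assumptions chosen for illustration, and the previous-frame masks are assumed to have already been warped by optical flow. A brute-force search over permutations stands in for the Hungarian method; it returns the same optimal assignment on small matrices but does not scale.

```python
from itertools import permutations


def mask_iou(mask_a, mask_b):
    """Locality term: IoU of two masks given as sets of (x, y) pixels."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


def histogram_intersection(hist_a, hist_b):
    """Visual-similarity term for two normalised colour histograms."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def affinity_matrix(prev, curr, weight=0.5):
    """Rows: instances predicted from the previous frame; columns: current instances."""
    return [
        [
            weight * mask_iou(p["mask"], c["mask"])
            + (1.0 - weight) * histogram_intersection(p["hist"], c["hist"])
            for c in curr
        ]
        for p in prev
    ]


def best_assignment(affinity):
    """Exhaustive optimal assignment; assumes a non-empty matrix with rows <= columns."""
    n_rows, n_cols = len(affinity), len(affinity[0])
    best = max(
        permutations(range(n_cols), n_rows),
        key=lambda cols: sum(affinity[r][c] for r, c in enumerate(cols)),
    )
    return list(enumerate(best))


# Hypothetical toy data: two instances per frame, listed in a different order.
prev = [
    {"mask": {(0, 0), (0, 1), (1, 0), (1, 1)}, "hist": [0.8, 0.2]},
    {"mask": {(5, 5), (5, 6), (6, 5), (6, 6)}, "hist": [0.1, 0.9]},
]
curr = [
    {"mask": {(5, 6), (6, 6), (5, 7), (6, 7)}, "hist": [0.2, 0.8]},
    {"mask": {(0, 1), (1, 1), (0, 2), (1, 2)}, "hist": [0.7, 0.3]},
]

matches = best_assignment(affinity_matrix(prev, curr))
print(matches)  # [(0, 1), (1, 0)]
```

For realistic numbers of instances, a polynomial-time solver such as an actual Hungarian implementation would replace the permutation search.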