93,596 research outputs found
Online Domain Adaptation for Multi-Object Tracking
Automatically detecting, labeling, and tracking objects in videos depends
first and foremost on accurate category-level object detectors. These might,
however, not always be available in practice, as acquiring high-quality large
scale labeled training datasets is either too costly or impractical for all
possible real-world application scenarios. A scalable solution consists in
re-using object detectors pre-trained on generic datasets. This work is the
first to investigate the problem of on-line domain adaptation of object
detectors for causal multi-object tracking (MOT). We propose to alleviate the
dataset bias by adapting detectors from category to instances, and back: (i) we
jointly learn all target models by adapting them from the pre-trained one, and
(ii) we also adapt the pre-trained model on-line. We introduce an on-line
multi-task learning algorithm to efficiently share parameters and reduce drift,
while gradually improving recall. Our approach is applicable to any linear
object detector, and we evaluate both cheap "mini-Fisher Vectors" and expensive
"off-the-shelf" ConvNet features. We quantitatively measure the benefit of our
domain adaptation strategy on the KITTI tracking benchmark and on a new dataset
(PASCAL-to-KITTI) we introduce to study the domain mismatch problem in MOT.Comment: To appear at BMVC 201
A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain
Detecting camouflaged moving foreground objects has been known to be
difficult due to the similarity between the foreground objects and the
background. Conventional methods cannot distinguish the foreground from
background due to the small differences between them and thus suffer from
under-detection of the camouflaged foreground objects. In this paper, we
present a fusion framework to address this problem in the wavelet domain. We
first show that the small differences in the image domain can be highlighted in
certain wavelet bands. Then the likelihood of each wavelet coefficient being
foreground is estimated by formulating foreground and background models for
each wavelet band. The proposed framework effectively aggregates the
likelihoods from different wavelet bands based on the characteristics of the
wavelet transform. Experimental results demonstrated that the proposed method
significantly outperformed existing methods in detecting camouflaged foreground
objects. Specifically, the average F-measure for the proposed algorithm was
0.87, compared to 0.71 to 0.8 for the other state-of-the-art methods.Comment: 13 pages, accepted by IEEE TI
On the Design and Analysis of Multiple View Descriptors
We propose an extension of popular descriptors based on gradient orientation
histograms (HOG, computed in a single image) to multiple views. It hinges on
interpreting HOG as a conditional density in the space of sampled images, where
the effects of nuisance factors such as viewpoint and illumination are
marginalized. However, such marginalization is performed with respect to a very
coarse approximation of the underlying distribution. Our extension leverages on
the fact that multiple views of the same scene allow separating intrinsic from
nuisance variability, and thus afford better marginalization of the latter. The
result is a descriptor that has the same complexity of single-view HOG, and can
be compared in the same manner, but exploits multiple views to better trade off
insensitivity to nuisance variability with specificity to intrinsic
variability. We also introduce a novel multi-view wide-baseline matching
dataset, consisting of a mixture of real and synthetic objects with ground
truthed camera motion and dense three-dimensional geometry
- …