4,094 research outputs found
Online Domain Adaptation for Multi-Object Tracking
Automatically detecting, labeling, and tracking objects in videos depends
first and foremost on accurate category-level object detectors. These might,
however, not always be available in practice, as acquiring high-quality large
scale labeled training datasets is either too costly or impractical for all
possible real-world application scenarios. A scalable solution consists in
re-using object detectors pre-trained on generic datasets. This work is the
first to investigate the problem of on-line domain adaptation of object
detectors for causal multi-object tracking (MOT). We propose to alleviate the
dataset bias by adapting detectors from category to instances, and back: (i) we
jointly learn all target models by adapting them from the pre-trained one, and
(ii) we also adapt the pre-trained model on-line. We introduce an on-line
multi-task learning algorithm to efficiently share parameters and reduce drift,
while gradually improving recall. Our approach is applicable to any linear
object detector, and we evaluate both cheap "mini-Fisher Vectors" and expensive
"off-the-shelf" ConvNet features. We quantitatively measure the benefit of our
domain adaptation strategy on the KITTI tracking benchmark and on a new dataset
(PASCAL-to-KITTI) we introduce to study the domain mismatch problem in MOT.Comment: To appear at BMVC 201
WordFences: Text localization and recognition
En col·laboració amb la Universitat de Barcelona (UB) i la Universitat Rovira i Virgili (URV)In recent years, text recognition has achieved remarkable success in recognizing scanned
document text. However, word recognition in natural images is still an open problem,
which generally requires time consuming post-processing steps. We present a novel architecture
for individual word detection in scene images based on semantic segmentation.
Our contributions are twofold: the concept of WordFence, which detects border areas
surrounding each individual word and a unique pixelwise weighted softmax loss function
which penalizes background and emphasizes small text regions. WordFence ensures that
each word is detected individually, and the new loss function provides a strong training
signal to both text and word border localization. The proposed technique avoids intensive
post-processing by combining semantic word segmentation with a voting scheme
for merging segmentations of multiple scales, producing an end-to-end word detection
system. We achieve superior localization recall on common benchmark datasets - 92%
recall on ICDAR11 and ICDAR13 and 63% recall on SVT. Furthermore, end-to-end
word recognition achieves state-of-the-art 86% F-Score on ICDAR13
Contour Detection from Deep Patch-level Boundary Prediction
In this paper, we present a novel approach for contour detection with
Convolutional Neural Networks. A multi-scale CNN learning framework is designed
to automatically learn the most relevant features for contour patch detection.
Our method uses patch-level measurements to create contour maps with
overlapping patches. We show the proposed CNN is able to to detect large-scale
contours in an image efficienly. We further propose a guided filtering method
to refine the contour maps produced from large-scale contours. Experimental
results on the major contour benchmark databases demonstrate the effectiveness
of the proposed technique. We show our method can achieve good detection of
both fine-scale and large-scale contours.Comment: IEEE International Conference on Signal and Image Processing 201
- …