Deep Learning for Logo Detection: A Survey
As more and more logos are created, logo detection has gradually become a
research hotspot across many domains and tasks. Recent advances in this area
are dominated by deep learning-based solutions, where many datasets, learning
strategies, network architectures, etc. have been employed. This paper reviews
the advances in applying deep learning techniques to logo detection. First, we
give a comprehensive account of public datasets designed to facilitate
performance evaluation of logo detection algorithms, which tend to be more
diverse, more challenging, and more reflective of real life. Next, we perform
an in-depth analysis of the existing logo detection strategies and the
strengths and weaknesses of each learning strategy. Subsequently, we summarize
the applications of logo detection in various fields, from intelligent
transportation and brand monitoring to copyright and trademark compliance.
Finally, we analyze the potential challenges and present the future directions
for the development of logo detection to complete this survey.
Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark
Most existing works on few-shot object detection (FSOD) focus on a setting
where both pre-training and few-shot learning datasets are from a similar
domain. However, few-shot algorithms are important in multiple domains; hence
evaluation needs to reflect the broad applications. We propose a Multi-dOmain
Few-Shot Object Detection (MoFSOD) benchmark consisting of 10 datasets from a
wide range of domains to evaluate FSOD algorithms. We comprehensively analyze
the impacts of freezing layers, different architectures, and different
pre-training datasets on FSOD performance. Our empirical results show several
key factors that have not been explored in previous works: 1) contrary to
previous belief, on a multi-domain benchmark, fine-tuning (FT) is a strong
baseline for FSOD, performing on par with or better than the state-of-the-art (SOTA)
algorithms; 2) utilizing FT as the baseline allows us to explore multiple
architectures, and we found them to have a significant impact on down-stream
few-shot tasks, even with similar pre-training performances; 3) by decoupling
pre-training and few-shot learning, MoFSOD allows us to explore the impact of
different pre-training datasets, and the right choice can boost the performance
of the down-stream tasks significantly. Based on these findings, we list
possible avenues of investigation for improving FSOD performance and propose
two simple modifications to existing algorithms that lead to SOTA performance
on the MoFSOD benchmark. The code is available at
https://github.com/amazon-research/few-shot-object-detection-benchmark.
Comment: Accepted at ECCV 202
An online algorithm for separating sparse and low-dimensional signal sequences from their sum, and its applications in video processing
In signal processing, "low-rank + sparse" is an important assumption when separating two signals from their sum. Many applications, e.g., video foreground/background separation, are well formulated under this assumption. In this work, under the "low-rank + sparse" assumption, we design and evaluate an online algorithm, called practical recursive projected compressive sensing (prac-ReProCS), for recovering a time sequence of sparse vectors S_t and a time sequence of dense vectors L_t from their sum, M_t = S_t + L_t, when the L_t lie in a slowly changing low-dimensional subspace of the full space.
In the first part of this work (Chapters 1-5), we study and discuss the prac-ReProCS algorithm, the practical version of the original ReProCS algorithm. We apply prac-ReProCS to a key application, video layering, where the goal is to separate a video sequence on the fly into a slowly changing background sequence and a sparse foreground sequence that consists of one or more moving regions/objects. Via experiments we show that prac-ReProCS performs significantly better than other state-of-the-art robust PCA methods when applied to video foreground-background separation.
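The layering model described above, in which each frame is the sum of a sparse foreground and a low-dimensional background, can be illustrated with a toy example. The sketch below is not prac-ReProCS: it swaps in a much simpler per-pixel temporal median, and it assumes a completely static background (a one-dimensional subspace) and a foreground that covers each pixel in fewer than half of the frames. The function names, the threshold parameter, and the demo data are hypothetical.

```python
from statistics import median


def separate_static_background(frames, threshold=1.0):
    """Split each frame M_t into a background L_t and a sparse foreground S_t.

    Simplifying assumptions (not those of prac-ReProCS): the background is
    the same in every frame, and each pixel is covered by the foreground in
    fewer than half of the frames, so the temporal median recovers it.
    """
    n_pixels = len(frames[0])
    # Per-pixel median over time serves as the background estimate.
    background = [median(frame[i] for frame in frames) for i in range(n_pixels)]

    foreground = []
    for frame in frames:
        residual = [m - b for m, b in zip(frame, background)]
        # Keep only large residuals, so the foreground estimate stays sparse.
        foreground.append([r if abs(r) > threshold else 0.0 for r in residual])
    return background, foreground


def make_demo_frames():
    """Build five 4-pixel frames: a fixed background plus one moving bright pixel."""
    background = [10.0, 20.0, 30.0, 40.0]
    frames = []
    for t in range(5):
        frame = list(background)
        frame[t % 4] += 100.0  # hypothetical moving object
        frames.append(frame)
    return frames


if __name__ == "__main__":
    bg, fg = separate_static_background(make_demo_frames())
    print("background:", bg)
    for t, s in enumerate(fg):
        print("frame", t, "foreground:", s)
```

A batch median like this needs all frames at once and fails as soon as the background drifts, which is the situation an online method that tracks a slowly changing subspace is meant to handle.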
In the second part of this work (Chapter 6), we study the problem of video denoising. We apply prac-ReProCS to video denoising as a preprocessing step. We develop a novel approach to video denoising based on the idea that many noisy or corrupted videos can be split into three parts: the "low-rank layer", the "sparse layer", and a residual that is small and bounded. We show through extensive experiments that layering-then-denoising is effective, especially for long videos with small-sized images that are corrupted by general large-variance noise or by large sparse noise, e.g., salt-and-pepper noise.
In the last part of this work (Chapter 7), we discuss an independent problem, logo detection, and propose a future research direction in which prac-ReProCS can be combined with deep learning solutions.