
    Deep Learning for Logo Detection: A Survey

    As logos are increasingly created, logo detection has gradually become a research hotspot across many domains and tasks. Recent advances in this area are dominated by deep learning-based solutions, in which many datasets, learning strategies, network architectures, etc. have been employed. This paper reviews the advances in applying deep learning techniques to logo detection. First, we give a comprehensive account of the public datasets designed to facilitate performance evaluation of logo detection algorithms, which tend to be more diverse, more challenging, and more reflective of real life. Next, we perform an in-depth analysis of the existing logo detection strategies and of the strengths and weaknesses of each learning strategy. Subsequently, we summarize the applications of logo detection in various fields, from intelligent transportation and brand monitoring to copyright and trademark compliance. Finally, we analyze the potential challenges and present future directions for the development of logo detection to complete this survey.

    Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark

    Most existing works on few-shot object detection (FSOD) focus on a setting where both the pre-training and few-shot learning datasets are from a similar domain. However, few-shot algorithms are important in multiple domains; hence evaluation needs to reflect the broad applications. We propose a Multi-dOmain Few-Shot Object Detection (MoFSOD) benchmark consisting of 10 datasets from a wide range of domains to evaluate FSOD algorithms. We comprehensively analyze the impacts of freezing layers, different architectures, and different pre-training datasets on FSOD performance. Our empirical results show several key factors that have not been explored in previous works: 1) contrary to previous belief, on a multi-domain benchmark, fine-tuning (FT) is a strong baseline for FSOD, performing on par with or better than the state-of-the-art (SOTA) algorithms; 2) utilizing FT as the baseline allows us to explore multiple architectures, and we found them to have a significant impact on down-stream few-shot tasks, even with similar pre-training performances; 3) by decoupling pre-training and few-shot learning, MoFSOD allows us to explore the impact of different pre-training datasets, and the right choice can boost the performance of the down-stream tasks significantly. Based on these findings, we list possible avenues of investigation for improving FSOD performance and propose two simple modifications to existing algorithms that lead to SOTA performance on the MoFSOD benchmark. The code is available at https://github.com/amazon-research/few-shot-object-detection-benchmark. Comment: Accepted at ECCV 202

    An online algorithm for separating sparse and low-dimensional signal sequences from their sum, and its applications in video processing

    In signal processing, "low-rank + sparse" is an important assumption when separating two signals from their sum. Many applications, e.g., video foreground/background separation, are well formulated under this assumption. In this work, under the "low-rank + sparse" assumption, we design and evaluate an online algorithm, called practical recursive projected compressive sensing (prac-ReProCS), for recovering a time sequence of sparse vectors St and a time sequence of dense vectors Lt from their sum, Mt = St + Lt, when the Lt's lie in a slowly changing low-dimensional subspace of the full space. In the first part of this work (Chapters 1-5), we study and discuss the prac-ReProCS algorithm, the practical version of the original ReProCS algorithm. We apply prac-ReProCS to a key application, video layering, where the goal is to separate a video sequence on-the-fly into a slowly changing background sequence and a sparse foreground sequence that consists of one or more moving regions/objects. Via experiments we show that prac-ReProCS performs significantly better than other state-of-the-art robust PCA methods when applied to video foreground-background separation. In the second part of this work (Chapter 6), we study the problem of video denoising. We apply prac-ReProCS to video denoising as a preprocessing step. We develop a novel approach to video denoising based on the idea that many noisy or corrupted videos can be split into three parts: the "low-rank layer", the "sparse layer", and a residual that is small and bounded. Using extensive experiments, we show that layering-then-denoising is effective, especially for long videos with small-sized images that are corrupted by general large-variance noise or by large sparse noise, e.g., salt-and-pepper noise. In the last part of this work (Chapter 7), we discuss an independent problem called logo detection and propose a future research direction where prac-ReProCS can be combined with deep learning solutions.
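
    The model Mt = St + Lt in the abstract above can be illustrated with a small sketch. This is not the prac-ReProCS algorithm itself: it assumes the low-dimensional subspace is already known and fixed (a hypothetical orthonormal basis P), and it replaces the sparse-recovery step with simple thresholding followed by least squares. The function name separate_frame, the threshold, and all sizes are assumptions made for illustration only.

```python
import numpy as np

def separate_frame(m, P, thresh):
    """Split m = s + l, assuming l lies in span(P) and P has orthonormal columns.

    Simplified illustration only: the subspace is assumed known, and the
    support of s is estimated by thresholding rather than sparse recovery.
    """
    # Project onto the orthogonal complement of span(P); this removes l.
    y = m - P @ (P.T @ m)
    # Crude support estimate: entries of the projection that are large.
    support = np.flatnonzero(np.abs(y) > thresh)
    s = np.zeros_like(m)
    if support.size:
        # Columns of (I - P P^T) restricted to the estimated support.
        Phi_T = np.eye(len(m))[:, support] - P @ P[support, :].T
        # Least-squares estimate of the nonzero entries of s.
        s[support], *_ = np.linalg.lstsq(Phi_T, y, rcond=None)
    l = m - s
    return s, l

# Synthetic demo (all sizes and values are made up for illustration).
rng = np.random.default_rng(0)
n, r = 500, 3
P, _ = np.linalg.qr(rng.standard_normal((n, r)))  # orthonormal basis, n x r
l_true = P @ rng.standard_normal(r)               # dense, low-dimensional part
s_true = np.zeros(n)
s_true[rng.choice(n, size=5, replace=False)] = 10.0  # sparse part
m = s_true + l_true

s_hat, l_hat = separate_frame(m, P, thresh=5.0)
print(np.abs(s_hat - s_true).max())
```

    In a video setting, m would be one vectorized frame, l the background, and s the moving foreground; a practical method would also have to estimate and update the subspace over time, which this sketch leaves out.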