Estimating how often and for how long logos appear in videos is an important
and challenging task in the advertising industry, where it is used to gauge the
impact of ad purchases. Since logos occupy only a small area of each video
frame, the popular
methods of image retrieval can fail. This paper develops an algorithm called
Video Logo Retrieval (VLR), an image-to-video retrieval algorithm that uses
the spatial distribution of local image descriptors to measure the distance
between the query image (the logo) and a collection of video images.
VLR uses local features to overcome the weakness of global feature-based models
such as convolutional neural networks (CNNs). VLR is also flexible and
does not require training once a few hyper-parameters are set. The performance
of VLR is evaluated on two challenging open benchmark tasks (SoccerNet and
Stanford I2V) and compared with other state-of-the-art logo retrieval or
detection algorithms. Overall, VLR shows significantly higher accuracy than
the existing methods.

Comment: Accepted by ICIP 20. Contact author: Bochen Guan ([email protected]
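
The general idea of scoring video frames against a query logo with local descriptors, and then turning per-frame scores into frequency and duration estimates, can be illustrated with a minimal sketch. This is not the VLR algorithm: it uses plain nearest-neighbour matching with a ratio test and ignores the spatial distribution of descriptors that VLR relies on. The function names, the ratio of 0.75, and the threshold of 0.5 are all assumptions made for illustration, and descriptors are taken to be precomputed tuples of floats.

```python
import math

def nearest_two(desc, candidates):
    """Return the two smallest Euclidean distances from desc to candidates."""
    dists = sorted(math.dist(desc, c) for c in candidates)
    second = dists[1] if len(dists) > 1 else math.inf
    return dists[0], second

def frame_distance(query_descs, frame_descs, ratio=0.75):
    """Hypothetical distance in [0, 1]: the fraction of query descriptors
    that have no confident match in the frame (not the VLR distance)."""
    if not query_descs or not frame_descs:
        return 1.0
    matched = 0
    for q in query_descs:
        best, second = nearest_two(q, frame_descs)
        if best < ratio * second:  # ratio test: best match clearly beats runner-up
            matched += 1
    return 1.0 - matched / len(query_descs)

def logo_appearances(distances, fps, threshold=0.5):
    """Group consecutive frames with distance below threshold into
    (start_seconds, duration_seconds) runs."""
    runs, start = [], None
    # A trailing sentinel closes a run that lasts until the final frame.
    for i, d in enumerate(list(distances) + [math.inf]):
        if d < threshold and start is None:
            start = i
        elif d >= threshold and start is not None:
            runs.append((start / fps, (i - start) / fps))
            start = None
    return runs
```

Under these assumptions, the number of runs returned by `logo_appearances` gives an appearance count, and the sum of the run durations gives the total on-screen time for the logo.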