What Catches the Eye? Visualizing and Understanding Deep Saliency Models
Deep convolutional neural networks have demonstrated high performance for fixation prediction in recent years. How they achieve this, however, is less explored, and they remain black-box models. Here, we attempt to shed light on the internal structure of deep saliency models and study what features they extract for fixation prediction. Specifically, we use a simple yet powerful architecture, consisting of only one CNN and a single-resolution input, combined with a new loss function for pixel-wise fixation prediction during free viewing of natural scenes. We show that our simple method is on par with or better than complicated state-of-the-art saliency models. Furthermore, we propose a method, related to saliency model evaluation metrics, to visualize deep models for fixation prediction. Our method reveals the inner representations of deep models for fixation prediction and provides evidence that saliency, as experienced by humans, is likely to involve high-level semantic knowledge in addition to low-level perceptual cues. Our results can be useful for measuring the gap between current saliency models and the human inter-observer model and for building new models to close this gap.
Engineering and Physical Sciences Research Council (EPSRC)
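The visualization method above builds on standard saliency evaluation metrics. As a hedged illustration only (not the authors' exact procedure), one widely used metric of this family, Normalized Scanpath Saliency (NSS), scores a predicted map by its standardized values at human fixation locations:

```python
import numpy as np

def nss(saliency_map, fixation_map):
    """Normalized Scanpath Saliency: the mean of the standardized
    (zero-mean, unit-variance) saliency values at fixated pixels.
    Higher is better; chance level is approximately 0."""
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-8)
    return s[fixation_map.astype(bool)].mean()
```

A model that assigns high saliency exactly where humans fixate receives a large positive NSS, while a uniform or unrelated map scores near zero.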
Multi-scale Interactive Network for Salient Object Detection
Deep-learning-based salient object detection methods have made great progress.
However, the variable scales and unknown categories of salient objects remain
great challenges. These are closely related to the utilization of
multi-level and multi-scale features. In this paper, we propose aggregate
interaction modules to integrate features from adjacent levels; these introduce
less noise because only small up-/down-sampling rates are used.
To obtain more efficient multi-scale features from the integrated features,
self-interaction modules are embedded in each decoder unit. In addition, the
class-imbalance issue caused by scale variation weakens the effect of the
binary cross-entropy loss and results in spatially inconsistent predictions.
Therefore, we exploit the consistency-enhanced loss to highlight the
fore-/back-ground difference and preserve intra-class consistency.
Experimental results on five benchmark datasets demonstrate that the proposed
method without any post-processing performs favorably against 23
state-of-the-art approaches. The source code will be publicly available at
https://github.com/lartpang/MINet.
Comment: Accepted by CVPR 2020
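The consistency-enhanced loss is specified in the paper and the linked source code; as a rough sketch only, a Dice-style region term of the kind it resembles (this exact formulation is our assumption, not a quotation of the paper) penalizes soft false positives and false negatives over the whole map, which discourages spatially inconsistent fore-/back-ground predictions:

```python
import numpy as np

def region_consistency_loss(pred, gt, eps=1e-8):
    """Sketch of a region-level loss: `pred` holds probabilities in
    [0, 1], `gt` is the binary ground truth. Unlike per-pixel BCE, the
    loss is computed over region statistics, so it is less sensitive
    to the fore-/back-ground class imbalance caused by scale variation."""
    tp = (pred * gt).sum()         # soft true positives
    fp = (pred * (1 - gt)).sum()   # soft false positives
    fn = ((1 - pred) * gt).sum()   # soft false negatives
    return (fp + fn) / (tp + fp + fn + eps)
```

The loss is 0 for a perfect prediction and approaches 1 as the prediction and ground truth stop overlapping, independent of how small the salient object is.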
The effect of downsampling-upsampling strategy on foreground detection algorithms
Publisher's Bespoke License
Definitive version available at the indicated DOI.
Molina-Cabello, M. A., Garcia-Gonzalez, J., Luque-Baena, R. M., & López-Rubio, E. (2020). The effect of downsampling–upsampling strategy on foreground detection algorithms. Artificial Intelligence Review, 53, 4935-4965.
In video surveillance systems with stationary cameras, the first phase, moving-object detection, is crucial for correctly modelling the behavior of these objects, and it is also the most complex in terms of execution time. Many algorithms provide a reliable and adequate segmentation mask, achieving real-time rates for reduced image sizes. However, given the increased performance of camera hardware, applying these methods to sequences with higher resolutions (from 640x480 to 1920x1080) no longer runs in real time, compromising their use in real video surveillance systems. In this paper we propose a methodology to reduce the computational requirements of the algorithms, consisting of a reduction of the input frame and, subsequently, an interpolation of the segmentation mask of each method to recover the original frame size. In addition, the viability of this meta-model is analyzed together with the different selected algorithms, evaluating the quality of the resulting segmentation and its gain in computation time.
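The reduce-then-interpolate meta-model described above can be sketched as follows. The nearest-neighbour resizing and the `detector` interface are illustrative assumptions for a minimal, self-contained example; a real system would use a proper interpolation method and one of the foreground detection algorithms evaluated in the paper:

```python
import numpy as np

def downsample(frame, factor):
    """Nearest-neighbour downsampling by an integer factor (stand-in
    for the interpolation a production system would use)."""
    return frame[::factor, ::factor]

def upsample_mask(mask, factor):
    """Nearest-neighbour upsampling of a binary segmentation mask
    back to the original frame resolution."""
    return np.repeat(np.repeat(mask, factor, axis=0), factor, axis=1)

def detect_foreground_reduced(frame, background, detector, factor=2):
    """Meta-model: run a (hypothetical) foreground `detector` on the
    reduced frame, then interpolate its mask to full size. Detection
    cost scales with pixel count, hence roughly by 1/factor**2."""
    small_mask = detector(downsample(frame, factor),
                          downsample(background, factor))
    return upsample_mask(small_mask, factor)

# Usage with a toy thresholding detector (illustrative only):
detector = lambda f, b: (np.abs(f.astype(int) - b.astype(int)) > 50).astype(np.uint8)
frame = np.zeros((8, 8), np.uint8); frame[2:6, 2:6] = 255  # moving object
background = np.zeros((8, 8), np.uint8)
mask = detect_foreground_reduced(frame, background, detector, factor=2)
```

With a downsampling factor of 2, the detector processes a quarter of the pixels, which is the source of the computation-time gain the paper quantifies against the loss in segmentation quality.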