
    What Catches the Eye? Visualizing and Understanding Deep Saliency Models

    Deep convolutional neural networks have demonstrated high performance for fixation prediction in recent years. How they achieve this, however, is less explored, and they remain black-box models. Here, we attempt to shed light on the internal structure of deep saliency models and study what features they extract for fixation prediction. Specifically, we use a simple yet powerful architecture, consisting of only one CNN and a single-resolution input, combined with a new loss function for pixel-wise fixation prediction during free viewing of natural scenes. We show that our simple method is on par with or better than state-of-the-art, more complicated saliency models. Furthermore, we propose a method, related to saliency model evaluation metrics, to visualize deep models for fixation prediction. Our method reveals the inner representations of deep models for fixation prediction and provides evidence that saliency, as experienced by humans, is likely to involve high-level semantic knowledge in addition to low-level perceptual cues. Our results can be useful for measuring the gap between current saliency models and the human inter-observer model and for building new models to close this gap.
    Funding: Engineering and Physical Sciences Research Council (EPSRC)

    Multi-scale Interactive Network for Salient Object Detection

    Deep-learning-based salient object detection methods have achieved great progress. However, the variable scale and unknown category of salient objects remain persistent challenges, closely related to how multi-level and multi-scale features are utilized. In this paper, we propose aggregate interaction modules to integrate features from adjacent levels, which introduce less noise because only small up-/down-sampling rates are used. To obtain more efficient multi-scale features from the integrated features, self-interaction modules are embedded in each decoder unit. Besides, the class imbalance caused by scale variation weakens the effect of the binary cross-entropy loss and results in spatially inconsistent predictions. We therefore exploit a consistency-enhanced loss to highlight the foreground/background difference and preserve intra-class consistency. Experimental results on five benchmark datasets demonstrate that the proposed method, without any post-processing, performs favorably against 23 state-of-the-art approaches. The source code will be publicly available at https://github.com/lartpang/MINet.
    Comment: Accepted by CVPR 202
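The class-imbalance effect on binary cross entropy mentioned in this abstract can be illustrated with a generic inverse-frequency-weighted BCE. This is only a sketch of the general rebalancing idea, not the paper's consistency-enhanced loss; the function name and weighting scheme are illustrative assumptions:

```python
import numpy as np

def weighted_bce(pred, target, eps=1e-7):
    # Binary cross entropy with per-image class weighting (illustrative,
    # NOT the consistency-enhanced loss from the paper).
    # When the salient object is small, plain BCE is dominated by the
    # background term; weighting each class by the opposite class's
    # frequency rebalances the two contributions.
    pred = np.clip(pred, eps, 1 - eps)
    fg = target.mean()              # fraction of foreground pixels
    w_fg, w_bg = 1.0 - fg, fg       # rarer class receives the larger weight
    loss = -(w_fg * target * np.log(pred)
             + w_bg * (1 - target) * np.log(1 - pred))
    return loss.mean()
```

With this weighting, a prediction that misses a small salient object is penalized far more than plain BCE would penalize it, which is one common way to counteract the scale-variation imbalance the abstract describes.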

    The effect of downsampling-upsampling strategy on foreground detection algorithms

    Publisher's Bespoke License. Definitive version available at the indicated DOI. Molina-Cabello, M. A., Garcia-Gonzalez, J., Luque-Baena, R. M., & López-Rubio, E. (2020). The effect of downsampling–upsampling strategy on foreground detection algorithms. Artificial Intelligence Review, 53, 4935-4965.
    In video surveillance systems with stationary cameras, the first phase, moving-object detection, is crucial for correctly modelling the behavior of these objects, and it is also the most complex in terms of execution time. Many algorithms provide a reliable and adequate segmentation mask, achieving real-time rates for reduced image sizes. However, as camera hardware performance has increased, applying these methods to sequences at higher resolutions (from 640x480 to 1920x1080) no longer runs in real time, compromising their use in real video surveillance systems. In this paper we propose a methodology to reduce the computational requirements of the algorithms, consisting of downsampling the input frame and, subsequently, interpolating the segmentation mask of each method to recover the original frame size. In addition, the viability of this meta-model is analyzed together with the different selected algorithms, evaluating the quality of the resulting segmentation and the gain in computation time.
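The meta-model described above (downsample the frame, run the detector at low resolution, interpolate the mask back to full size) can be sketched as follows. The toy frame-differencing detector, the helper names, and the nearest-neighbour interpolation are illustrative assumptions, not the paper's actual algorithms or interpolation choice:

```python
import numpy as np

def nn_resize(img, out_h, out_w):
    # Nearest-neighbour resize via index mapping (stand-in for a library
    # resize such as cv2.resize).
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def detect_downsampled(frame, background, factor=4, thresh=30):
    # Illustrative pipeline: shrink the inputs, run a cheap
    # frame-differencing "detector" on the small images, then upsample
    # the binary mask back to the original frame size.
    h, w = frame.shape
    small_f = nn_resize(frame, h // factor, w // factor)
    small_b = nn_resize(background, h // factor, w // factor)
    small_mask = np.abs(small_f.astype(int) - small_b.astype(int)) > thresh
    return nn_resize(small_mask, h, w)  # recover original resolution
```

The detector here only runs on 1/factor² of the original pixels, which is the source of the computation-time gain the paper evaluates; the cost is a coarser mask boundary after interpolation.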