20,989 research outputs found

    SE2Net: Siamese Edge-Enhancement Network for Salient Object Detection

    Full text link
    Deep convolutional neural network significantly boosted the capability of salient object detection in handling large variations of scenes and object appearances. However, convolution operations seek to generate strong responses on individual pixels, while lack the ability to maintain the spatial structure of objects. Moreover, the down-sampling operations, such as pooling and striding, lose spatial details of the salient objects. In this paper, we propose a simple yet effective Siamese Edge-Enhancement Network (SE2Net) to preserve the edge structure for salient object detection. Specifically, a novel multi-stage siamese network is built to aggregate the low-level and high-level features, and parallelly estimate the salient maps of edges and regions. As a result, the predicted regions become more accurate by enhancing the responses at edges, and the predicted edges become more semantic by suppressing the false positives in background. After the refined salient maps of edges and regions are produced by the SE2Net, an edge-guided inference algorithm is designed to further improve the resulting salient masks along the predicted edges. Extensive experiments on several benchmark datasets have been conducted, which show that our method is superior than the state-of-the-art approaches

    Edge-guided Non-local Fully Convolutional Network for Salient Object Detection

    Full text link
    Fully Convolutional Neural Network (FCN) has been widely applied to salient object detection recently by virtue of high-level semantic feature extraction, but existing FCN based methods still suffer from continuous striding and pooling operations leading to loss of spatial structure and blurred edges. To maintain the clear edge structure of salient objects, we propose a novel Edge-guided Non-local FCN (ENFNet) to perform edge guided feature learning for accurate salient object detection. In a specific, we extract hierarchical global and local information in FCN to incorporate non-local features for effective feature representations. To preserve good boundaries of salient objects, we propose a guidance block to embed edge prior knowledge into hierarchical feature maps. The guidance block not only performs feature-wise manipulation but also spatial-wise transformation for effective edge embeddings. Our model is trained on the MSRA-B dataset and tested on five popular benchmark datasets. Comparing with the state-of-the-art methods, the proposed method achieves the best performance on all datasets.Comment: 10 pages, 6 figure

    OGNet: Salient Object Detection with Output-guided Attention Module

    Full text link
    Attention mechanisms are widely used in salient object detection models based on deep learning, which can effectively promote the extraction and utilization of useful information by neural networks. However, most of the existing attention modules used in salient object detection are input with the processed feature map itself, which easily leads to the problem of `blind overconfidence'. In this paper, instead of applying the widely used self-attention module, we present an output-guided attention module built with multi-scale outputs to overcome the problem of `blind overconfidence'. We also construct a new loss function, the intractable area F-measure loss function, which is based on the F-measure of the hard-to-handle area to improve the detection effect of the model in the edge areas and confusing areas of an image. Extensive experiments and abundant ablation studies are conducted to evaluate the effect of our methods and to explore the most suitable structure for the model. Tests on several data sets show that our model performs very well, even though it is very lightweight.Comment: submitted to IEEE Transactions on Circuits and Systems for Video Technolog

    Deep Edge-Aware Saliency Detection

    Full text link
    There has been profound progress in visual saliency thanks to the deep learning architectures, however, there still exist three major challenges that hinder the detection performance for scenes with complex compositions, multiple salient objects, and salient objects of diverse scales. In particular, output maps of the existing methods remain low in spatial resolution causing blurred edges due to the stride and pooling operations, networks often neglect descriptive statistical and handcrafted priors that have potential to complement saliency detection results, and deep features at different layers stay mainly desolate waiting to be effectively fused to handle multi-scale salient objects. In this paper, we tackle these issues by a new fully convolutional neural network that jointly learns salient edges and saliency labels in an end-to-end fashion. Our framework first employs convolutional layers that reformulate the detection task as a dense labeling problem, then integrates handcrafted saliency features in a hierarchical manner into lower and higher levels of the deep network to leverage available information for multi-scale response, and finally refines the saliency map through dilated convolutions by imposing context. In this way, the salient edge priors are efficiently incorporated and the output resolution is significantly improved while keeping the memory requirements low, leading to cleaner and sharper object boundaries. Extensive experimental analyses on ten benchmarks demonstrate that our framework achieves consistently superior performance and attains robustness for complex scenes in comparison to the very recent state-of-the-art approaches.Comment: 13 pages, 11 figure

    Salient Object Detection in the Deep Learning Era: An In-Depth Survey

    Full text link
    As an essential problem in computer vision, salient object detection (SOD) has attracted an increasing amount of research attention over the years. Recent advances in SOD are predominantly led by deep learning-based solutions (named deep SOD). To enable in-depth understanding of deep SOD, in this paper, we provide a comprehensive survey covering various aspects, ranging from algorithm taxonomy to unsolved issues. In particular, we first review deep SOD algorithms from different perspectives, including network architecture, level of supervision, learning paradigm, and object-/instance-level detection. Following that, we summarize and analyze existing SOD datasets and evaluation metrics. Then, we benchmark a large group of representative SOD models, and provide detailed analyses of the comparison results. Moreover, we study the performance of SOD algorithms under different attribute settings, which has not been thoroughly explored previously, by constructing a novel SOD dataset with rich attribute annotations covering various salient object types, challenging factors, and scene categories. We further analyze, for the first time in the field, the robustness of SOD models to random input perturbations and adversarial attacks. We also look into the generalization and difficulty of existing SOD datasets. Finally, we discuss several open issues of SOD and outline future research directions.Comment: Published on IEEE TPAMI. All the saliency prediction maps, our constructed dataset with annotations, and codes for evaluation are publicly available at \url{https://github.com/wenguanwang/SODsurvey

    A novel graph structure for salient object detection based on divergence background and compact foreground

    Full text link
    In this paper, we propose an efficient and discriminative model for salient object detection. Our method is carried out in a stepwise mechanism based on both divergence background and compact foreground cues. In order to effectively enhance the distinction between nodes along object boundaries and the similarity among object regions, a graph is constructed by introducing the concept of virtual node. To remove incorrect outputs, a scheme for selecting background seeds and a method for generating compactness foreground regions are introduced, respectively. Different from prior methods, we calculate the saliency value of each node based on the relationship between the corresponding node and the virtual node. In order to achieve significant performance improvement consistently, we propose an Extended Manifold Ranking (EMR) algorithm, which subtly combines suppressed / active nodes and mid-level information. Extensive experimental results demonstrate that the proposed algorithm performs favorably against the state-of-art saliency detection methods in terms of different evaluation metrics on several benchmark datasets.Comment: 22 pages,16 figures, 2 table

    Selectivity or Invariance: Boundary-aware Salient Object Detection

    Full text link
    Typically, a salient object detection (SOD) model faces opposite requirements in processing object interiors and boundaries. The features of interiors should be invariant to strong appearance change so as to pop-out the salient object as a whole, while the features of boundaries should be selective to slight appearance change to distinguish salient objects and background. To address this selectivity-invariance dilemma, we propose a novel boundary-aware network with successive dilation for image-based SOD. In this network, the feature selectivity at boundaries is enhanced by incorporating a boundary localization stream, while the feature invariance at interiors is guaranteed with a complex interior perception stream. Moreover, a transition compensation stream is adopted to amend the probable failures in transitional regions between interiors and boundaries. In particular, an integrated successive dilation module is proposed to enhance the feature invariance at interiors and transitional regions. Extensive experiments on six datasets show that the proposed approach outperforms 16 state-of-the-art methods

    Review of Visual Saliency Detection with Comprehensive Information

    Full text link
    Visual saliency detection model simulates the human visual system to perceive the scene, and has been widely used in many vision tasks. With the acquisition technology development, more comprehensive information, such as depth cue, inter-image correspondence, or temporal relationship, is available to extend image saliency detection to RGBD saliency detection, co-saliency detection, or video saliency detection. RGBD saliency detection model focuses on extracting the salient regions from RGBD images by combining the depth information. Co-saliency detection model introduces the inter-image correspondence constraint to discover the common salient object in an image group. The goal of video saliency detection model is to locate the motion-related salient object in video sequences, which considers the motion cue and spatiotemporal constraint jointly. In this paper, we review different types of saliency detection algorithms, summarize the important issues of the existing methods, and discuss the existent problems and future works. Moreover, the evaluation datasets and quantitative measurements are briefly introduced, and the experimental analysis and discission are conducted to provide a holistic overview of different saliency detection methods.Comment: 18 pages, 11 figures, 7 tables, Accepted by IEEE Transactions on Circuits and Systems for Video Technology 2018, https://rmcong.github.io

    Saliency-Guided Perceptual Grouping Using Motion Cues in Region-Based Artificial Visual Attention

    Full text link
    Region-based artificial attention constitutes a framework for bio-inspired attentional processes on an intermediate abstraction level for the use in computer vision and mobile robotics. Segmentation algorithms produce regions of coherently colored pixels. These serve as proto-objects on which the attentional processes determine image portions of relevance. A single region---which not necessarily represents a full object---constitutes the focus of attention. For many post-attentional tasks, however, such as identifying or tracking objects, single segments are not sufficient. Here, we present a saliency-guided approach that groups regions that potentially belong to the same object based on proximity and similarity of motion. We compare our results to object selection by thresholding saliency maps and a further attention-guided strategy

    SAC-Net: Spatial Attenuation Context for Salient Object Detection

    Full text link
    This paper presents a new deep neural network design for salient object detection by maximizing the integration of local and global image context within, around, and beyond the salient objects. Our key idea is to adaptively propagate and aggregate the image context features with variable attenuation over the entire feature maps. To achieve this, we design the spatial attenuation context (SAC) module to recurrently translate and aggregate the context features independently with different attenuation factors and then to attentively learn the weights to adaptively integrate the aggregated context features. By further embedding the module to process individual layers in a deep network, namely SAC-Net, we can train the network end-to-end and optimize the context features for detecting salient objects. Compared with 29 state-of-the-art methods, experimental results show that our method performs favorably over all the others on six common benchmark data, both quantitatively and visually
    corecore