An Iterative Co-Saliency Framework for RGBD Images
As a newly emerging and significant topic in the computer vision community,
co-saliency detection aims at discovering the common salient objects in
multiple related images. The existing methods often generate the co-saliency
map through a direct forward pipeline which is based on the designed cues or
initialization, but lack the refinement-cycle scheme. Moreover, they mainly
focus on RGB image and ignore the depth information for RGBD images. In this
paper, we propose an iterative RGBD co-saliency framework, which utilizes the
existing single saliency maps as the initialization, and generates the final
RGBD co-saliency map by using a refinement-cycle model. Three schemes are
employed in the proposed RGBD co-saliency framework, which include the addition
scheme, deletion scheme, and iteration scheme. The addition scheme is used to
highlight the salient regions based on intra-image depth propagation and
saliency propagation, while the deletion scheme filters the saliency regions
and removes the non-common salient regions based on an inter-image constraint.
The iteration scheme is proposed to obtain a more homogeneous and consistent
co-saliency map. Furthermore, a novel descriptor, named depth shape prior, is
proposed in the addition scheme to introduce depth information and enhance the
identification of co-salient objects. The proposed method can effectively
exploit any existing 2D saliency model to work well in RGBD co-saliency
scenarios. The experiments on two RGBD co-saliency datasets demonstrate the
effectiveness of our proposed framework.
Comment: 13 pages, 13 figures. Accepted by IEEE Transactions on Cybernetics
2017. Project URL: https://rmcong.github.io/proj_RGBD_cosal_tcyb.htm
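The refinement-cycle described above can be illustrated with a minimal sketch. All function names, thresholds, and the 1-D "maps" below are hypothetical and purely illustrative, not the paper's code: an addition step boosts regions supported by a depth prior, a deletion step suppresses regions that are not salient across the image group, and the two alternate until the maps stabilize.

```python
# Illustrative sketch of the addition/deletion/iteration idea (not the paper's
# implementation). Saliency maps are flattened to 1-D lists of values in [0, 1].

def addition_step(saliency, depth_prior):
    # Boost each region in proportion to its depth-prior support, clamped to 1.
    return [min(1.0, s + 0.5 * d * s) for s, d in zip(saliency, depth_prior)]

def deletion_step(saliency, group_mean, thresh=0.3):
    # Suppress regions whose saliency is not shared across the image group.
    return [s if g >= thresh else 0.0 for s, g in zip(saliency, group_mean)]

def cosaliency_iterate(maps, depth_priors, iters=5, tol=1e-4):
    for _ in range(iters):
        # Inter-image constraint: mean saliency per region across all images.
        group_mean = [sum(col) / len(col) for col in zip(*maps)]
        new_maps = []
        for m, d in zip(maps, depth_priors):
            m = addition_step(m, d)    # intra-image propagation (addition)
            m = deletion_step(m, group_mean)  # inter-image filtering (deletion)
            new_maps.append(m)
        delta = max(abs(a - b) for m0, m1 in zip(maps, new_maps)
                    for a, b in zip(m0, m1))
        maps = new_maps
        if delta < tol:  # iteration scheme: stop once maps are consistent
            break
    return maps
```

Under this toy model, regions that are both depth-supported and shared across images converge to high saliency, while image-specific responses are driven to zero.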
DISC: Deep Image Saliency Computing via Progressive Representation Learning
Salient object detection has received increasing attention as an important
component or step in several pattern recognition and image processing tasks.
Although a variety of powerful saliency models have been proposed,
they usually involve heavy feature (or model) engineering based on priors (or
assumptions) about the properties of objects and backgrounds. Inspired by the
effectiveness of recently developed feature learning, we provide a novel Deep
Image Saliency Computing (DISC) framework for fine-grained image saliency
computing. In particular, we model the image saliency from both the coarse- and
fine-level observations, and utilize the deep convolutional neural network
(CNN) to learn the saliency representation in a progressive manner.
Specifically, our saliency model is built upon two stacked CNNs. The first CNN
generates a coarse-level saliency map by taking the overall image as the input,
roughly identifying salient regions in the global context. Furthermore, we
integrate superpixel-based local context information in the first CNN to refine
the coarse-level saliency map. Guided by the coarse saliency map, the second
CNN focuses on the local context to produce a fine-grained and accurate
saliency map while preserving object details. For a test image, the two CNNs
collaboratively conduct the saliency computing in one shot. Our DISC framework
is capable of uniformly highlighting the objects of interest against complex
backgrounds while preserving object details well. Extensive experiments on
several standard benchmarks suggest that DISC outperforms other
state-of-the-art methods and also generalizes well across datasets without
additional training. The executable version of DISC is available online:
http://vision.sysu.edu.cn/projects/DISC.
Comment: This manuscript is the accepted version for IEEE Transactions on
Neural Networks and Learning Systems (T-NNLS), 201
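The coarse-to-fine composition described above can be sketched in miniature. The functions below are hypothetical stand-ins, not DISC's CNNs: a "coarse" stage scores each pixel against a global statistic, and a "fine" stage re-scores pixels from local contrast, gated by the coarse estimate, mirroring how the second CNN is guided by the first CNN's coarse saliency map.

```python
# Toy coarse-to-fine saliency composition (illustrative only, not DISC's CNNs).
# Images are 2-D lists of grayscale intensities in [0, 255].

def coarse_stage(image):
    # Global pass: score each pixel by its deviation from the image-wide mean.
    mean = sum(sum(row) for row in image) / (len(image) * len(image[0]))
    return [[min(1.0, abs(p - mean) / 128.0) for p in row] for row in image]

def fine_stage(image, coarse, radius=1):
    # Local pass: contrast against a small neighborhood, gated by the coarse map.
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neigh = [image[j][i]
                     for j in range(max(0, y - radius), min(h, y + radius + 1))
                     for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local = min(1.0, abs(image[y][x] - sum(neigh) / len(neigh)) / 128.0)
            row.append(local * coarse[y][x])  # coarse estimate gates fine detail
        out.append(row)
    return out
```

The gating product is the key design point: fine-level detail only survives where the coarse global pass already found evidence of saliency, which is the same guidance relationship the abstract describes between the two stacked CNNs.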
Video Saliency Detection Using Object Proposals
In this paper, we introduce a novel approach to identify salient object
regions in videos via object proposals. The core idea is to solve the saliency
detection problem by ranking and selecting the salient proposals based on
object-level saliency cues. Object proposals offer a more complete and
high-level representation, which naturally caters to the needs of salient
object detection. As well as introducing this novel solution for video salient
object detection, we reorganize various discriminative saliency cues and
traditional saliency assumptions on object proposals. With object candidates,
a proposal ranking and voting scheme, based on various object-level saliency
cues, is designed to screen out non-salient parts, select salient object
regions, and infer an initial saliency estimate. Then a saliency optimization
process that considers temporal consistency and appearance differences between
salient and non-salient regions is used to refine the initial saliency
estimates. Our experiments on public datasets (SegTrackV2, Freiburg-Berkeley
Motion Segmentation Dataset, and Densely Annotated Video Segmentation) validate
the effectiveness of this approach, and the proposed method produces
significant improvements over state-of-the-art algorithms.
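The ranking-and-voting scheme can be sketched as follows. The cue names, weights, and proposal format below are all illustrative assumptions, not the paper's actual cues: each proposal gets a score from weighted object-level cues, and the top-ranked proposals vote on the pixels they cover to form an initial saliency estimate.

```python
# Toy proposal ranking-and-voting scheme (cue names are hypothetical).
# Each proposal is a dict with a "box" (x0, y0, x1, y1) and per-cue scores.

def rank_proposals(proposals, cue_weights):
    # Rank proposals by a weighted sum of their object-level saliency cues.
    def score(p):
        return sum(cue_weights[c] * p[c] for c in cue_weights)
    return sorted(proposals, key=score, reverse=True)

def vote_map(proposals, shape, top_k=3):
    # Top-ranked proposals vote on the pixels they cover; normalize to [0, 1].
    h, w = shape
    votes = [[0.0] * w for _ in range(h)]
    for p in proposals[:top_k]:
        x0, y0, x1, y1 = p["box"]
        for y in range(y0, y1):
            for x in range(x0, x1):
                votes[y][x] += 1.0
    peak = max(max(row) for row in votes) or 1.0  # avoid division by zero
    return [[v / peak for v in row] for row in votes]
```

Screening out non-salient parts falls out of the `top_k` cut: low-ranked proposals never vote, so regions covered only by weak candidates stay at zero in the initial estimate, before any temporal refinement.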
Pornographic Image Recognition via Weighted Multiple Instance Learning
In the era of the Internet, recognizing pornographic images is of great
significance for protecting children's physical and mental health. However,
this task is very challenging, as the key pornographic contents (e.g., breasts
and private parts) in an image often lie in small local regions. In this
paper, we model each image as a bag of regions, and follow a multiple instance
learning (MIL) approach to train a generic region-based recognition model.
Specifically, we take into account the region's degree of pornography, and make
three main contributions. First, we show that based on very few annotations of
the key pornographic contents in a training image, we can generate a bag of
properly sized regions, among which the potential positive regions usually
contain useful contexts that can aid recognition. Second, we present a simple
quantitative measure of a region's degree of pornography, which can be used to
weigh the importance of different regions in a positive image. Third, we
formulate the recognition task as a weighted MIL problem under the
convolutional neural network framework, with a bag probability function
introduced to combine the importance of different regions. Experiments on our
newly collected large scale dataset demonstrate the effectiveness of the
proposed method, which achieves a 97.52% true positive rate at a 1% false
positive rate, tested on 100K pornographic images and 100K normal images.
Comment: 9 pages, 3 figures
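One common way to realize a weighted-MIL "bag probability function" like the one described above is a weighted noisy-OR over region (instance) probabilities, so that regions with a higher pornography-degree weight contribute more to the bag-level score. This is a generic illustration under that assumption, not the paper's exact formulation.

```python
# Weighted noisy-OR bag probability (a generic weighted-MIL combination rule,
# not necessarily the paper's exact function). Weights are normalized to sum 1.

def bag_probability(instance_probs, weights):
    # P(bag positive) = 1 - prod_i (1 - p_i) ** (w_i / sum_j w_j)
    total = sum(weights)
    prod = 1.0
    for p, w in zip(instance_probs, weights):
        prod *= (1.0 - p) ** (w / total)
    return 1.0 - prod
```

Two properties make this form attractive for the task: a single high-probability, high-weight region (e.g., one clearly pornographic patch) is enough to push the bag score toward 1, while zero-weight regions drop out of the product entirely.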