LooseCut: Interactive Image Segmentation with Loosely Bounded Boxes
One popular approach to interactively segment the foreground object of
interest from an image is to annotate a bounding box that covers the foreground
object. Then, a binary labeling is performed to achieve a refined segmentation.
One major issue of the existing algorithms for such interactive image
segmentation is their preference for an input bounding box that tightly encloses
the foreground object. This increases the annotation burden, and prevents these
algorithms from utilizing automatically detected bounding boxes. In this paper,
we develop a new LooseCut algorithm that can handle cases where the input
bounding box only loosely covers the foreground object. We propose a new Markov
Random Field (MRF) model for segmentation with loosely bounded boxes,
including a global similarity constraint to better distinguish the foreground
and background, and an additional energy term to encourage consistent labeling
of similar-appearance pixels. This MRF model is then solved by an iterated
max-flow algorithm. In the experiments, we evaluate LooseCut on three
publicly available image datasets, and compare its performance against several
state-of-the-art interactive image segmentation algorithms. We also show that
LooseCut can be used to enhance the performance of unsupervised video
segmentation and image saliency detection.
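The abstract does not spell out the energy, but a rough sketch of what a loosely-bounded-box MRF objective with a label-consistency term could look like is given below. All names, weights, and the exact form of each term are hypothetical, and the paper's global similarity constraint and iterated max-flow solver are not reproduced here.

```python
import numpy as np

def loosecut_energy(labels, unary, pairs, pair_w, clusters,
                    beta=0.3, gamma=0.1):
    # Hypothetical sketch of a LooseCut-style MRF energy.
    # labels   : (N,) 0/1 label per superpixel (1 = foreground)
    # unary    : (N, 2) data costs for background/foreground
    # pairs    : (M, 2) index pairs of neighbouring superpixels
    # pair_w   : (M,) contrast-sensitive smoothness weights
    # clusters : (N,) appearance-cluster id per superpixel
    n = len(labels)
    # Standard data term: cost of the chosen label for each superpixel.
    e_data = unary[np.arange(n), labels].sum()
    # Contrast-sensitive Potts smoothness term over neighbouring pairs.
    cut = labels[pairs[:, 0]] != labels[pairs[:, 1]]
    e_smooth = (pair_w * cut).sum()
    # Label-consistency term: within each appearance cluster, penalise the
    # minority label so similar-appearance pixels receive the same label.
    e_consist = 0.0
    for c in np.unique(clusters):
        fg = labels[clusters == c].sum()
        e_consist += min(fg, (clusters == c).sum() - fg)
    return e_data + beta * e_smooth + gamma * e_consist
```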
Image Co-segmentation via Multi-scale Local Shape Transfer
Image co-segmentation is a challenging task in computer vision that aims to
segment all pixels of the objects from a predefined semantic category. In
real-world cases, however, common foreground objects often vary greatly in
appearance, making their global shapes highly inconsistent across images and
difficult to be segmented. To address this problem, this paper proposes a novel
co-segmentation approach that transfers patch-level local object shapes which
appear more consistent across different images. In our framework, a multi-scale
patch neighbourhood system is first generated using proposal flow on arbitrary
image pairs, and then refined by Locally Linear Embedding. Based on the
patch relationships, we propose an efficient algorithm to jointly segment the
objects in each image while transferring their local shapes across different
images. Extensive experiments demonstrate that the proposed method can robustly
and effectively segment common objects from an image set. On the iCoseg, MSRC,
and Coseg-Rep datasets, the proposed approach performs comparably to or better
than the state of the art, while on the more challenging Fashionista benchmark
our method achieves significant improvements.
Comment: An extension of our previous study.
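The Locally Linear Embedding refinement step is a standard construction; a minimal sketch of computing LLE reconstruction weights for one patch against its proposal-flow matches (names and the regularisation scheme are illustrative, not the paper's exact code) could read:

```python
import numpy as np

def lle_weights(x, neighbours, reg=1e-3):
    # x          : (d,) descriptor of the query patch
    # neighbours : (k, d) descriptors of its matches (e.g. from proposal flow)
    # Solves  min ||x - sum_j w_j * n_j||^2  subject to  sum_j w_j = 1.
    k = neighbours.shape[0]
    z = neighbours - x                                  # query at the origin
    c = z @ z.T                                         # (k, k) local Gram matrix
    c += reg * (np.trace(c) + 1e-12) * np.eye(k) / k    # regularise for stability
    w = np.linalg.solve(c, np.ones(k))
    return w / w.sum()                                  # sum-to-one constraint
```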
Co-salient Object Detection Based on Deep Saliency Networks and Seed Propagation over an Integrated Graph
This paper presents a co-salient object detection method to find common
salient regions in a set of images. We utilize deep saliency networks to
transfer co-saliency prior knowledge and better capture high-level semantic
information, and the resulting initial co-saliency maps are enhanced by seed
propagation steps over an integrated graph. We train the deep saliency networks
in a supervised manner to avoid online weakly supervised learning, and exploit
them not only to extract high-level features but also to produce both
intra- and inter-image saliency maps. Through a refinement step, the initial
co-saliency maps can uniformly highlight co-salient regions and locate accurate
object boundaries. To handle input image groups inconsistent in size, we
propose to pool multi-regional descriptors including both within-segment and
within-group information. In addition, the integrated multilayer graph is
constructed to find the regions that the previous steps may not detect by seed
propagation with low-level descriptors. In this work, we utilize the
complementary strengths of high- and low-level information together with several
learning-based steps. Our experiments have demonstrated that the proposed
approach outperforms comparable co-saliency detection methods on widely used
public databases and can also be directly applied to co-segmentation tasks.
Comment: 13 pages, 10 figures, 3 tables.
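As an illustration of graph-based seed propagation, a generic manifold-ranking-style diffusion over an affinity matrix (a common formulation, not necessarily the paper's exact update rule) might look like:

```python
import numpy as np

def propagate_seeds(W, seeds, alpha=0.99):
    # W     : (N, N) symmetric non-negative affinity matrix over regions
    # seeds : (N,) initial co-saliency scores from the deep-network stage
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]    # normalised affinity
    # Closed form of the diffusion  f = alpha * S f + (1 - alpha) * seeds.
    f = np.linalg.solve(np.eye(len(seeds)) - alpha * S, (1 - alpha) * seeds)
    return (f - f.min()) / (f.max() - f.min() + 1e-12)   # rescale to [0, 1]
```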
HSCS: Hierarchical Sparsity Based Co-saliency Detection for RGBD Images
Co-saliency detection aims to discover common and salient objects in an image
group containing more than two relevant images. Moreover, depth information has
been demonstrated to be effective for many computer vision tasks. In this
paper, we propose a novel co-saliency detection method for RGBD images based on
hierarchical sparsity reconstruction and energy function refinement. With the
assistance of the intra saliency map, the inter-image correspondence is
formulated as a hierarchical sparsity reconstruction framework. The global
sparsity reconstruction model with a ranking scheme focuses on capturing the
global characteristics among the whole image group through a common foreground
dictionary. The pairwise sparsity reconstruction model aims to explore the
corresponding relationship between pairwise images through a set of pairwise
dictionaries. In order to improve the intra-image smoothness and inter-image
consistency, an energy function refinement model is proposed, which includes a
unary data term, a spatial smoothness term, and a holistic consistency term.
Experiments on two RGBD co-saliency detection benchmarks demonstrate that the
proposed method outperforms the state-of-the-art algorithms both qualitatively
and quantitatively.Comment: 11 pages, 5 figures, Accepted by IEEE Transactions on Multimedia,
https://rmcong.github.io
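A toy version of dictionary-based co-saliency scoring, using a simple orthogonal-matching-pursuit reconstruction against a common foreground dictionary (all names hypothetical; the paper's hierarchical and pairwise models are not reproduced here), could look like:

```python
import numpy as np

def sparsity_saliency(features, fg_dict, n_nonzero=5):
    # features : (N, d) region descriptors of one image
    # fg_dict  : (K, d) common foreground dictionary pooled across the group
    D = fg_dict / (np.linalg.norm(fg_dict, axis=1, keepdims=True) + 1e-12)
    scores = np.empty(len(features))
    for i, x in enumerate(features):
        residual, support = x.astype(float).copy(), []
        for _ in range(min(n_nonzero, len(D))):
            # Greedily pick the atom most correlated with the residual,
            # then refit on the whole support (orthogonal matching pursuit).
            support.append(int(np.argmax(np.abs(D @ residual))))
            coef, *_ = np.linalg.lstsq(D[support].T, x, rcond=None)
            residual = x - D[support].T @ coef
        # A small residual means the region is well explained by the
        # common foreground, hence a high co-saliency score.
        scores[i] = np.exp(-np.linalg.norm(residual) ** 2)
    return scores
```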
Dominant Sets for "Constrained" Image Segmentation
Image segmentation has come a long way since the early days of computer
vision, and still remains a challenging task. Modern variations of the
classical (purely bottom-up) approach involve, e.g., some form of user
assistance (interactive segmentation) or ask for the simultaneous segmentation
of two or more images (co-segmentation). At an abstract level, all these
variants can be thought of as "constrained" versions of the original
formulation, whereby the segmentation process is guided by some external source
of information. In this paper, we propose a new approach to tackle this kind
of problem in a unified way. Our work is based on some properties of a family of
quadratic optimization problems related to dominant sets, a well-known
graph-theoretic notion of a cluster which generalizes the concept of a maximal
clique to edge-weighted graphs. In particular, we show that by properly
controlling a regularization parameter which determines the structure and the
scale of the underlying problem, we are in a position to extract groups of
dominant-set clusters that are constrained to contain predefined elements.
Specifically, we shall focus on interactive segmentation and co-segmentation (in
both the unsupervised and the interactive versions). The proposed algorithm can
deal naturally with several types of constraints and input modalities, including
scribbles, sloppy contours, and bounding boxes, and is able to robustly handle
noisy annotations on the part of the user. Experiments on standard benchmark
datasets show the effectiveness of our approach as compared to state-of-the-art
algorithms on a variety of natural images under several input conditions and
constraints.
Comment: arXiv admin note: text overlap with arXiv:1608.0064
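Dominant sets are classically extracted with replicator dynamics over the affinity matrix; a minimal sketch, including a hypothetical helper for the regularised (constrained) matrix, is given below. The exact parameterisation used in the paper may differ.

```python
import numpy as np

def constrained_affinity(A, seed_idx, alpha):
    # Regularised matrix for the constrained setting: subtract alpha on the
    # diagonal entries of the non-seed vertices (illustrative form).
    B = A.astype(float).copy()
    mask = np.ones(A.shape[0], dtype=bool)
    mask[seed_idx] = False
    B[mask, mask] -= alpha
    return B

def replicator_dynamics(A, iters=1000, tol=1e-8):
    # A : (N, N) symmetric non-negative affinity matrix with zero diagonal.
    # The support of the fixed point approximates a dominant set. Note: the
    # plain dynamics assume non-negative payoffs; if regularisation makes
    # entries negative, shift A by a constant first.
    x = np.full(A.shape[0], 1.0 / A.shape[0])
    for _ in range(iters):
        y = x * (A @ x)
        s = y.sum()
        if s <= 0:
            break
        y /= s
        if np.abs(y - x).sum() < tol:
            return y
        x = y
    return x
```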
A Review of Co-saliency Detection Technique: Fundamentals, Applications, and Challenges
Co-saliency detection is a newly emerging and rapidly growing research area
in the computer vision community. As a novel branch of visual saliency, co-saliency
detection refers to the discovery of common and salient foregrounds from two or
more relevant images, and can be widely used in many computer vision tasks. The
existing co-saliency detection algorithms mainly consist of three components:
extracting effective features to represent the image regions, exploring the
informative cues or factors to characterize co-saliency, and designing
effective computational frameworks to formulate co-saliency. Although numerous
methods have been developed, the literature is still lacking a deep review and
evaluation of co-saliency detection techniques. In this paper, we aim at
providing a comprehensive review of the fundamentals, challenges, and
applications of co-saliency detection. Specifically, we provide an overview of
some related computer vision works, review the history of co-saliency
detection, summarize and categorize the major algorithms in this research area,
discuss some open issues in this area, present the potential applications of
co-saliency detection, and finally point out some unsolved challenges and
promising future works. We expect this review to be beneficial to both junior
and senior researchers in this field, and to give insights to researchers in
other related areas regarding the utility of co-saliency detection algorithms.
Comment: 28 pages, 12 figures, 3 tables.
Review of Visual Saliency Detection with Comprehensive Information
Visual saliency detection models simulate the human visual system to perceive
the scene, and have been widely used in many vision tasks. With the development
of acquisition technology, more comprehensive information, such as depth cues,
inter-image correspondence, or temporal relationships, has become available to
extend image saliency detection to RGBD saliency detection, co-saliency
detection, or video saliency detection. RGBD saliency detection models focus on
extracting the salient regions from RGBD images by incorporating depth
information. Co-saliency detection models introduce an inter-image
correspondence constraint to discover the common salient object in an image
group. The goal of video saliency detection models is to locate the
motion-related salient object in video sequences, considering the motion cue
and spatiotemporal
constraint jointly. In this paper, we review different types of saliency
detection algorithms, summarize the important issues of the existing methods,
and discuss open problems and future work. Moreover, the evaluation
datasets and quantitative measurements are briefly introduced, and the
experimental analysis and discussion are conducted to provide a holistic
overview of different saliency detection methods.
Comment: 18 pages, 11 figures, 7 tables. Accepted by IEEE Transactions on
Circuits and Systems for Video Technology, 2018. https://rmcong.github.io
Enhancing Underexposed Photos using Perceptually Bidirectional Similarity
Although remarkable progress has been made, existing methods for enhancing
underexposed photos tend to produce visually unpleasing results due to the
existence of visual artifacts (e.g., color distortion, loss of details and
uneven exposure). We observed that this is because they fail to ensure the
perceptual consistency of visual information between the source underexposed
image and its enhanced output. To obtain high-quality results free of these
artifacts, we present a novel underexposed photo enhancement approach that is
able to maintain the perceptual consistency. We achieve this by proposing an
effective criterion, referred to as perceptually bidirectional similarity,
which explicitly describes how to ensure the perceptual consistency.
Particularly, we adopt the Retinex theory and cast the enhancement problem as a
constrained illumination estimation optimization, where we formulate
perceptually bidirectional similarity as constraints on illumination and solve
for the illumination which can recover the desired artifact-free enhancement
results. In addition, we describe a video enhancement framework that adopts the
presented illumination estimation for handling underexposed videos. To this
end, a probabilistic approach is introduced to propagate illuminations of
sampled keyframes to the entire video by tackling a Bayesian Maximum A
Posteriori problem. Extensive experiments demonstrate the superiority of our
method over state-of-the-art methods.
Comment: Accepted to IEEE Transactions on Multimedia (TMM).
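For intuition only, a crude Retinex-style enhancement that estimates a smooth illumination map and gamma-adjusts it might look like the sketch below; the paper instead solves a constrained illumination estimation with perceptually bidirectional similarity constraints, which this does not reproduce.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def enhance_underexposed(img, sigma=15.0, gamma=0.6, eps=1e-3):
    # img : (H, W, 3) float RGB image in [0, 1]
    # Initial illumination: per-pixel channel maximum, spatially smoothed.
    L = gaussian_filter(img.max(axis=2), sigma)
    L = np.clip(L, eps, 1.0)
    # Under the Retinex model I = R * L, brighten by swapping L for a
    # gamma-adjusted illumination, so dark regions are lifted while
    # bright regions stay bounded.
    out = img / L[..., None] * (L ** gamma)[..., None]
    return np.clip(out, 0.0, 1.0)
```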
Uncertainty-Aware Unsupervised Domain Adaptation in Object Detection
Unsupervised domain adaptive object detection aims to adapt detectors from a
labelled source domain to an unlabelled target domain. Most existing works take
a two-stage strategy that first generates region proposals and then detects
objects of interest, where adversarial learning is widely adopted to mitigate
the inter-domain discrepancy in both stages. However, adversarial learning may
impair the alignment of well-aligned samples as it merely aligns the global
distributions across domains. To address this issue, we design an
uncertainty-aware domain adaptation network (UaDAN) that introduces conditional
adversarial learning to align well-aligned and poorly-aligned samples
separately in different manners. Specifically, we design an uncertainty metric
that assesses the alignment of each sample and adjusts the strength of
adversarial learning for well-aligned and poorly-aligned samples adaptively. In
addition, we exploit the uncertainty metric to achieve curriculum learning that
first performs easier image-level alignment and then more difficult
instance-level alignment progressively. Extensive experiments over four
challenging domain adaptive object detection datasets show that UaDAN achieves
superior performance as compared with state-of-the-art methods.
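A minimal sketch of entropy-based uncertainty weighting of a domain-adversarial loss follows; the concrete metric and conditioning scheme in UaDAN may differ, and all names here are illustrative.

```python
import numpy as np

def uncertainty_weighted_adv_loss(domain_probs, det_probs):
    # domain_probs : (N,) discriminator probability that a sample is source
    # det_probs    : (N, C) detector class posteriors per sample
    # Entropy of the detector output as an alignment proxy: confident
    # (well-aligned) samples have low entropy.
    ent = -(det_probs * np.log(det_probs + 1e-12)).sum(axis=1)
    w = ent / np.log(det_probs.shape[1])      # normalise to [0, 1]
    # Cross-entropy domain loss, scaled so poorly-aligned (uncertain)
    # samples receive stronger adversarial gradients.
    ce = -np.log(domain_probs + 1e-12)
    return float((w * ce).mean())
```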
Co-Saliency Detection with Co-Attention Fully Convolutional Network
Co-saliency detection aims to detect common salient objects from a group of
relevant images. Some attempts have been made with the Fully Convolutional
Network (FCN) framework and have achieved satisfactory detection results.
However, due to stacked convolution layers and pooling operations, boundary
details tend to be lost. In addition, existing models often utilize the
extracted features without discrimination, leading to redundancy in
representation, since not all features are helpful to the final prediction and
some even introduce distraction. In this paper, we propose an FCN framework
with an embedded co-attention module, called Co-Attention FCN (CA-FCN).
Specifically, the co-attention
module is plugged into the high-level convolution layers of FCN, which can
assign larger attention weights on the common salient objects and smaller ones
on the background and uncommon distractors to boost final detection
performance. Extensive experiments on three popular co-saliency benchmark
datasets demonstrate the superiority of the proposed CA-FCN, which outperforms
state-of-the-art methods in most cases. Besides, the effectiveness of our new
co-attention module is also validated with ablation studies.
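For intuition, a toy co-attention pass that amplifies locations resembling a group-common descriptor might look like the sketch below; the paper's module is learned end-to-end inside the FCN, so this hand-crafted version is only illustrative.

```python
import numpy as np

def co_attention(feats):
    # feats : list of (C, H, W) high-level feature maps, one per image.
    # Group descriptor: average-pooled feature over all images.
    g = np.mean([f.mean(axis=(1, 2)) for f in feats], axis=0)
    g = g / (np.linalg.norm(g) + 1e-12)
    out = []
    for f in feats:
        C, H, W = f.shape
        flat = f.reshape(C, -1)
        norms = np.linalg.norm(flat, axis=0, keepdims=True) + 1e-12
        att = g @ (flat / norms)            # cosine similarity per location
        att = 1.0 / (1.0 + np.exp(-att))    # squash to (0, 1)
        # Locations resembling the group-common appearance are amplified;
        # background and uncommon distractors are suppressed.
        out.append((flat * att).reshape(C, H, W))
    return out
```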