Adaptive Graph Convolutional Network with Attention Graph Clustering for Co-saliency Detection
Co-saliency detection aims to discover the common and salient foregrounds
from a group of relevant images. For this task, we present a novel adaptive
graph convolutional network with attention graph clustering (GCAGC). Three
major contributions have been made, and are experimentally shown to have
substantial practical merits. First, we propose a graph convolutional network
design to extract information cues to characterize the intra- and inter-image
correspondence. Second, we develop an attention graph clustering algorithm to
discriminate the common objects from all the salient foreground objects in an
unsupervised fashion. Third, we present a unified framework with
encoder-decoder structure to jointly train and optimize the graph convolutional
network, attention graph clustering, and co-saliency detection decoder in an
end-to-end manner. We evaluate our proposed GCAGC method on three co-saliency
detection benchmark datasets (iCoseg, Cosal2015, and COCO-SEG). Our GCAGC method
obtains significant improvements over the state of the art on most of them.
Comment: CVPR202
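The graph-convolution step the abstract describes can be sketched as one propagation pass over region nodes whose edges encode feature affinity. The layer sizes, the correlation-based affinity, and the toy data below are illustrative assumptions, not the paper's actual GCAGC design:

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph-convolution step: symmetric-normalized adjacency, then ReLU."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt     # D^{-1/2} (A+I) D^{-1/2}
    return np.maximum(A_norm @ H @ W, 0.0)       # ReLU activation

# Toy example: 4 region nodes (e.g. pooled CNN features across 2 images),
# with a dense adjacency built from pairwise feature affinity.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 8))                  # node features
A = np.abs(np.corrcoef(H))                       # affinity as adjacency
W = rng.standard_normal((8, 4))                  # weights (fixed here, learned in practice)
H_out = gcn_layer(H, A, W)
print(H_out.shape)  # (4, 4)
```

Stacking such layers lets information flow both within and across images, which is the intra-/inter-image correspondence idea.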
GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector
In this paper, we present a novel end-to-end group collaborative learning
network, termed GCoNet+, which can effectively and efficiently (250 fps)
identify co-salient objects in natural scenes. The proposed GCoNet+ achieves
the new state-of-the-art performance for co-salient object detection (CoSOD)
through mining consensus representations based on the following two essential
criteria: 1) intra-group compactness to better formulate the consistency among
co-salient objects by capturing their inherent shared attributes using our
novel group affinity module (GAM); 2) inter-group separability to effectively
suppress the influence of noisy objects on the output by introducing our new
group collaborating module (GCM) conditioning on the inconsistent consensus. To
further improve the accuracy, we design a series of simple yet effective
components as follows: i) a recurrent auxiliary classification module (RACM)
promoting the model learning at the semantic level; ii) a confidence
enhancement module (CEM) helping the model to improve the quality of the final
predictions; and iii) a group-based symmetric triplet (GST) loss guiding the
model to learn more discriminative features. Extensive experiments on three
challenging benchmarks, i.e., CoCA, CoSOD3k, and CoSal2015, demonstrate that
our GCoNet+ outperforms 12 existing cutting-edge models. Code has been
released at https://github.com/ZhengPeng7/GCoNet_plus.
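A minimal sketch of the consensus-mining idea behind a group affinity module, assuming a simple mean-feature consensus and softmax attention (the real GAM is considerably more elaborate; all names and sizes here are hypothetical):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def group_affinity(F):
    """Weight each image's feature by its agreement with the group consensus.

    Toy stand-in for consensus mining: images whose global features agree
    with the group mean get higher attention weight.
    """
    consensus = F.mean(axis=0)          # crude group consensus
    scores = F @ consensus              # similarity of each image to consensus
    w = softmax(scores)                 # attention weights over the group
    refined = (w[:, None] * F).sum(axis=0)
    return w, refined

rng = np.random.default_rng(1)
F = rng.standard_normal((5, 16))        # 5 images in a group, 16-d features
w, refined = group_affinity(F)
print(refined.shape)  # (16,)
```

The inter-group separability criterion would additionally contrast this consensus against features from other groups, which is omitted here.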
Multi-modal gated recurrent units for image description
Using a natural language sentence to describe the content of an image is a
challenging but very important task. It is challenging because a description
must not only capture objects contained in the image and the relationships
among them, but also be relevant and grammatically correct. In this paper, we
propose a multi-modal embedding model based on gated recurrent units (GRUs)
that can generate a variable-length description for a given image. In the
training step,
we apply the convolutional neural network (CNN) to extract the image feature.
Then the feature is fed into the multi-modal GRU together with the
corresponding sentence representations, and the multi-modal GRU learns the
inter-modal relations between image and sentence. In the testing step, when
an image is fed into our multi-modal GRU model, a sentence describing the
image content is generated. The experimental results demonstrate that our
multi-modal GRU model achieves state-of-the-art performance on the Flickr8K,
Flickr30K, and MS COCO datasets.
Comment: 25 pages, 7 figures, 6 tables, magazin
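The multi-modal GRU step can be illustrated with a plain NumPy GRU cell whose input concatenates a word embedding with the CNN image feature. The dimensions and random weights below are purely illustrative, not the paper's configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, P):
    """One GRU step on a multi-modal input x = [word embedding ; image feature]."""
    z = sigmoid(P["Wz"] @ x + P["Uz"] @ h)            # update gate
    r = sigmoid(P["Wr"] @ x + P["Ur"] @ h)            # reset gate
    h_tilde = np.tanh(P["Wh"] @ x + P["Uh"] @ (r * h))
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(2)
d_x, d_h = 12, 8                                      # input dim = word (8) + image (4)
P = {k: rng.standard_normal((d_h, d_x if k[0] == "W" else d_h)) * 0.1
     for k in ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}
word = rng.standard_normal(8)                         # hypothetical word embedding
img = rng.standard_normal(4)                          # hypothetical CNN image feature
h = gru_step(np.concatenate([word, img]), np.zeros(d_h), P)
print(h.shape)  # (8,)
```

At generation time this step would be repeated, feeding each predicted word's embedding (plus the image feature) back in until an end token is produced.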
Substantial Phase Exploration for Intuiting Covid using form Expedient with Variance Sensor
This article focuses on implementing wireless sensors for monitoring the exact distance between two individuals and for checking whether everyone has sanitized their hands, in order to stop the spread of Corona Virus Disease (COVID). The method is realized through an objective function that maximizes inter-node distance and node energy while minimizing the cost of implementation. In addition, the proposed model is integrated with a variance detector, termed the Controlled Incongruity Algorithm (CIA). This variance detector senses values and reports them to an online monitoring system named ThingSpeak; the sensed values are visualized through simulation in MATLAB. The loss produced by the sensors is also found to be low when the CIA is implemented. To validate the efficiency of the proposed method, it has been compared with prevailing methods; the results show better performance, with the proposed method improving by 76.8% over outcomes reported in the existing literature.
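The objective of maximizing distance and node energy while minimizing cost can be sketched as a weighted score over candidate sensor placements. The weights and candidate values below are hypothetical, not taken from the paper:

```python
import numpy as np

def node_objective(distance, energy, cost, w=(0.4, 0.4, 0.2)):
    """Weighted objective: reward distance and residual energy, penalize cost.

    Inputs are assumed normalized to [0, 1]; the weights are illustrative.
    """
    w1, w2, w3 = w
    return w1 * distance + w2 * energy - w3 * cost

# Pick the candidate placement with the best trade-off.
candidates = [
    {"distance": 0.9, "energy": 0.8, "cost": 0.7},
    {"distance": 0.6, "energy": 0.9, "cost": 0.2},
    {"distance": 0.3, "energy": 0.5, "cost": 0.1},
]
scores = [node_objective(**c) for c in candidates]
best = int(np.argmax(scores))
print(best)  # 1
```

A real deployment would optimize this objective over node positions subject to coverage constraints; here the second candidate wins because its lower cost outweighs its shorter distance.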
Backtracking Spatial Pyramid Pooling (SPP)-based Image Classifier for Weakly Supervised Top-down Salient Object Detection
Top-down saliency models produce a probability map that peaks at target
locations specified by a task/goal such as object detection. They are usually
trained in a fully supervised setting involving pixel-level annotations of
objects. We propose a weakly supervised top-down saliency framework using only
binary labels that indicate the presence/absence of an object in an image.
First, the probabilistic contribution of each image region to the confidence of
a CNN-based image classifier is computed through a backtracking strategy to
produce top-down saliency. From a set of saliency maps of an image produced by
fast bottom-up saliency approaches, we select the best saliency map suitable
for the top-down task. The selected bottom-up saliency map is combined with the
top-down saliency map. Features having high combined saliency are used to train
a linear SVM classifier to estimate feature saliency. This is integrated with
combined saliency and further refined through a multi-scale
superpixel-averaging of saliency map. We evaluate the performance of the
proposed weakly supervised top-down saliency and achieve comparable performance
with fully supervised approaches. Experiments are carried out on seven
challenging datasets and quantitative results are compared with 40 closely
related approaches across 4 different applications.
Comment: 14 pages, 7 figure
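The final stages described above can be sketched as combining the top-down map with the selected bottom-up map and then averaging within pre-computed superpixels. The combination rule, blend weight, and label map here are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def combine_saliency(top_down, bottom_up, alpha=0.5):
    """Blend a top-down map with the selected bottom-up map (toy rule)."""
    s = alpha * top_down + (1 - alpha) * top_down * bottom_up
    return s / (s.max() + 1e-8)          # normalize to [0, 1]

def superpixel_average(s, labels):
    """Average saliency inside each (pre-computed) superpixel region."""
    out = np.zeros_like(s)
    for lab in np.unique(labels):
        mask = labels == lab
        out[mask] = s[mask].mean()
    return out

rng = np.random.default_rng(3)
td = rng.random((6, 6))                  # toy top-down saliency map
bu = rng.random((6, 6))                  # toy bottom-up saliency map
labels = np.repeat(np.arange(3), 12).reshape(6, 6)   # 3 fake superpixels
refined = superpixel_average(combine_saliency(td, bu), labels)
print(refined.shape)  # (6, 6)
```

The multi-scale refinement in the paper would repeat the averaging over superpixel segmentations at several scales and fuse the results.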
Visual Tracking by Sampling in Part Space
In this paper, we present a novel part-based visual tracking method from the perspective of probability sampling. Specifically, we represent the target by a part space with two online learned probabilities to capture the structure of the target. The proposal distribution memorizes the historical performance of different parts, and it is used for the first round of part selection. The acceptance probability validates the specific tracking stability of each part in a frame, and it determines whether to accept its vote or to reject it. By doing this, we transform the complex online part selection problem into a probability learning one, which is easier to tackle. The observation model of each part is constructed by an improved supervised descent method and is learned in an incremental manner. Experimental results on two benchmarks demonstrate the competitive performance of our tracker against state-of-the-art methods.
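The two-stage part selection can be sketched as sampling part indices from the proposal distribution and then keeping each draw with its part's acceptance probability. The probabilities below are made up for illustration and are not learned as in the paper:

```python
import numpy as np

def select_parts(proposal, acceptance, n_draw, rng):
    """Two-stage part selection: sample parts from the proposal distribution,
    then keep each draw with its part's per-frame acceptance probability."""
    idx = rng.choice(len(proposal), size=n_draw, p=proposal)   # round 1
    keep = rng.random(n_draw) < acceptance[idx]                # accept/reject
    return idx[keep]

rng = np.random.default_rng(4)
proposal = np.array([0.5, 0.3, 0.15, 0.05])   # historical part performance
acceptance = np.array([0.9, 0.8, 0.3, 0.1])   # per-frame stability
voters = select_parts(proposal, acceptance, n_draw=1000, rng=rng)
counts = np.bincount(voters, minlength=4)
print(counts.argmax())  # 0
```

Parts that are both historically reliable and stable in the current frame dominate the vote, which mirrors how the tracker filters out unstable parts online.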