Structure-Feature based Graph Self-adaptive Pooling
Various methods to deal with graph data have been proposed in recent years.
However, most of these methods focus on graph feature aggregation rather than
graph pooling. Besides, the existing top-k selection graph pooling methods have
a few problems. First, to construct the pooled graph topology, current top-k
selection methods evaluate node importance from only a single perspective,
which is simplistic and lacks objectivity. Second, the feature information of
unselected nodes is directly lost during the pooling process, which inevitably
leads to a massive loss of graph feature information. To solve these problems
mentioned above, we propose a novel graph self-adaptive pooling method with the
following objectives: (1) to construct a reasonable pooled graph topology,
structure and feature information of the graph are considered simultaneously,
which adds accuracy and objectivity to node selection; and (2) to ensure that
the pooled nodes retain sufficient graph information, node feature information
is aggregated before the unimportant nodes are discarded; thus, the selected
nodes carry information from their neighbors, which preserves the contribution
of the unselected nodes' features. Experimental results on
four different datasets demonstrate that our method is effective in graph
classification and outperforms state-of-the-art graph pooling methods.
Comment: 7 pages, 4 figures, The Web Conference 202
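The two-stage idea described above (score nodes from both a structural and a feature view, then aggregate neighbor features before dropping low-scoring nodes) can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact method: the degree-plus-feature-magnitude score and the one-hop aggregation rule are assumptions made for clarity.

```python
import numpy as np

def self_adaptive_pool(adj, feat, ratio=0.5):
    """Illustrative top-k graph pooling sketch.

    Combines a structure view (node degree) and a feature view
    (feature magnitude) to score nodes, and aggregates one-hop
    neighbor features BEFORE discarding unselected nodes, so the
    kept nodes retain information from the dropped ones.
    The scoring formula here is an assumption, not the paper's.
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1)                    # structure view: degree
    fmag = np.abs(feat).sum(axis=1)          # feature view: magnitude
    score = deg / deg.max() + fmag / fmag.max()  # combined importance
    # Aggregate neighbor features so selected nodes carry
    # information from nodes about to be dropped.
    agg = feat + adj @ feat
    k = max(1, int(ratio * n))
    idx = np.argsort(-score)[:k]             # keep top-k nodes
    return adj[np.ix_(idx, idx)], agg[idx], idx
```

On a 4-node path graph with `ratio=0.5`, this keeps the two highest-scoring nodes and returns the induced 2x2 adjacency together with their aggregated features.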
Deep Adaptive Feature Embedding with Local Sample Distributions for Person Re-identification
Person re-identification (re-id) aims to match pedestrians observed by
disjoint camera views. It has attracted increasing attention in computer vision
owing to its importance in surveillance systems. To combat the major challenge of
cross-view visual variations, deep embedding approaches have been proposed that
learn a compact feature space from images such that Euclidean distances
correspond to the cross-view similarity metric. However, the global Euclidean
distance cannot faithfully characterize the ideal similarity in a complex
visual feature space because features of pedestrian images exhibit unknown
distributions due to large variations in poses, illumination and occlusion.
Moreover, intra-personal training samples within a local range can reliably
guide deep embedding against uncontrolled variations, yet this local structure
cannot be captured by a global Euclidean distance. In this paper, we study the problem of
person re-id by proposing a novel sampling to mine suitable \textit{positives}
(i.e. intra-class) within a local range to improve the deep embedding in the
context of large intra-class variations. Our method is capable of learning a
deep similarity metric adaptive to local sample structure by minimizing each
sample's local distances while propagating through the relationship between
samples to attain the whole intra-class minimization. To this end, a novel
objective function is proposed to jointly optimize similarity metric learning,
local positive mining and robust deep embedding. This yields local
discriminations by selecting local-ranged positive samples, and the learned
features are robust to dramatic intra-class variations. Experiments on
benchmarks show state-of-the-art results achieved by our method.
Comment: Published in Pattern Recognition
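The core sampling idea of the abstract (mine intra-class positives within a local range of each anchor rather than using all positives globally) can be sketched as below. The function name and the fixed-radius rule are illustrative assumptions; the paper's mining criterion and joint objective are more involved.

```python
import numpy as np

def mine_local_positives(emb, labels, anchor, radius):
    """Illustrative local positive mining sketch.

    Selects same-class (intra-class) samples that lie within a local
    Euclidean radius of the anchor embedding, excluding the anchor
    itself. The fixed-radius criterion is an assumption made for
    clarity, not the paper's exact rule.
    """
    dists = np.linalg.norm(emb - emb[anchor], axis=1)
    mask = (labels == labels[anchor]) & (dists <= radius)
    mask[anchor] = False          # the anchor is not its own positive
    return np.where(mask)[0]
```

In a training loop, such locally mined positives would drive a metric-learning loss that minimizes each sample's local distances, instead of forcing all globally scattered intra-class samples together at once.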
SANet: Structure-Aware Network for Visual Tracking
Convolutional neural network (CNN) has drawn increasing interest in visual
tracking owing to its powerful feature-extraction capability. Most existing
CNN-based trackers treat tracking as a classification problem. However, these
trackers are sensitive to similar distractors because their CNN models mainly
focus on inter-class classification. To address this problem, we use the
object's self-structure information to distinguish it from distractors.
Specifically, we utilize recurrent neural network (RNN) to model object
structure, and incorporate it into CNN to improve its robustness to similar
distractors. Considering that convolutional layers at different levels
characterize the object from different perspectives, we use a separate RNN to
model object structure at each level. Extensive experiments
on three benchmarks, OTB100, TC-128 and VOT2015, show that the proposed
algorithm outperforms other methods. Code is released at
http://www.dabi.temple.edu/~hbling/code/SANet/SANet.html.
Comment: In CVPR Deep Vision Workshop, 201
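The multi-level structure modeling described in the abstract (one RNN per convolutional level, scanning the level's feature map to encode object self-structure) can be sketched as follows. This is a hedged sketch: the plain tanh RNN, the row-major spatial traversal, and the function names are assumptions, not the SANet architecture's exact details.

```python
import numpy as np

def rnn_scan(seq, Wx, Wh, b):
    """Simple tanh RNN scan over a sequence of feature vectors."""
    h = np.zeros(Wh.shape[0])
    for x in seq:
        h = np.tanh(Wx @ x + Wh @ h + b)
    return h

def structure_encode(feature_maps, rnn_params):
    """Illustrative multi-level structure encoding sketch.

    Runs a separate RNN over the spatial positions of each
    convolutional level's feature map (shape: channels x H x W),
    yielding one structure code per level. The traversal order and
    RNN form are assumptions made for clarity.
    """
    codes = []
    for fmap, (Wx, Wh, b) in zip(feature_maps, rnn_params):
        c, h, w = fmap.shape
        seq = fmap.reshape(c, h * w).T   # spatial positions as a sequence
        codes.append(rnn_scan(seq, Wx, Wh, b))
    return codes
```

Each level's code summarizes the spatial layout of that level's features, so combining the per-level codes with the CNN's classification branch gives the tracker structure cues that inter-class classification alone misses.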