SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data
Data mixing augmentation has proved effective in training deep models. Recent
methods mix labels mainly based on the mixture proportion of image pixels. As
the main discriminative information of a fine-grained image usually resides in
subtle regions, methods along this line are prone to heavy label noise in
fine-grained recognition. In this paper we propose a novel scheme, termed
Semantically Proportional Mixing (SnapMix), which exploits class activation maps
(CAMs) to reduce label noise when augmenting fine-grained data. SnapMix
generates the target label for a mixed image by estimating its intrinsic
semantic composition, which allows asymmetric mixing operations and ensures
semantic correspondence between synthetic images and target labels. Experiments
show that our method consistently outperforms existing mixing-based approaches
on various datasets and under different network depths. Furthermore, by
incorporating the mid-level features, the proposed SnapMix achieves top-level
performance, demonstrating its potential to serve as a solid baseline for
fine-grained recognition. Our code is available at
https://github.com/Shaoli-Huang/SnapMix.git.
Comment: Accepted by AAAI202
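The idea of labeling a mixed image by its semantic composition can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes each CAM is a non-negative 2D array that is normalized to sum to 1, and that a box cut from image B is pasted over a (possibly different) box in image A. The function name, the box format, and the normalization step are all assumptions made for this example.

```python
import numpy as np

def semantic_label_weights(cam_a, cam_b, box_a, box_b):
    """Hypothetical helper: label weights for a cut-and-paste mix.

    cam_a, cam_b: non-negative 2D class activation maps (assumed input).
    box_a: region of image A that gets overwritten, as (y0, y1, x0, x1).
    box_b: region of image B that gets pasted in, same format.
    Boxes are half-open index ranges and may differ in position and size.
    """
    # Normalize each CAM so its entries sum to 1 (assumed normalization).
    share_a = cam_a / cam_a.sum()
    share_b = cam_b / cam_b.sum()

    ya0, ya1, xa0, xa1 = box_a
    yb0, yb1, xb0, xb1 = box_b

    # Weight for A's label: semantic share of A that survives the paste.
    w_a = 1.0 - share_a[ya0:ya1, xa0:xa1].sum()
    # Weight for B's label: semantic share of B carried by the pasted patch.
    w_b = share_b[yb0:yb1, xb0:xb1].sum()
    return float(w_a), float(w_b)
```

Because each weight is computed from its own image's map, the two weights need not sum to 1, unlike a label mixed by pixel-area ratio.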
Mining Point Cloud Local Structures by Kernel Correlation and Graph Pooling
Unlike on images, semantic learning on 3D point clouds using a deep network
is challenging due to the naturally unordered data structure. Among existing
works, PointNet has achieved promising results by directly learning on point
sets. However, it does not take full advantage of a point's local neighborhood,
which contains fine-grained structural information that turns out to be helpful
for semantic learning. In this regard, we present two new operations
to improve PointNet with a more efficient exploitation of local structures. The
first one focuses on local 3D geometric structures. In analogy to a convolution
kernel for images, we define a point-set kernel as a set of learnable 3D points
that jointly respond to a set of neighboring data points according to their
geometric affinities measured by kernel correlation, adapted from a similar
technique for point cloud registration. The second one exploits local
high-dimensional feature structures by recursive feature aggregation on a
nearest-neighbor-graph computed from 3D positions. Experiments show that our
network can efficiently capture local information and robustly achieve better
performance on major datasets. Our code is available at
http://www.merl.com/research/license#KCNet
Comment: Accepted in CVPR'18. *indicates equal contribution
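The point-set kernel described in the abstract above can be illustrated with a small sketch. It is not the released KCNet code: it assumes a Gaussian affinity between each kernel point and each neighbor offset (neighbor position minus the center point), averaged over the neighbors. The function name, the `sigma` parameter, and the averaging are assumptions made for this example.

```python
import numpy as np

def kernel_correlation(kernel_pts, center, neighbors, sigma=1.0):
    """Hypothetical helper: response of one point-set kernel at one point.

    kernel_pts: (M, 3) array of kernel points (learnable in a real network).
    center:     (3,) array, the point being described.
    neighbors:  (N, 3) array of its neighboring points.
    sigma:      Gaussian width (assumed hyperparameter).
    """
    # Express neighbors relative to the center point.
    offsets = neighbors - center                            # (N, 3)
    # Pairwise differences between kernel points and neighbor offsets.
    diff = kernel_pts[:, None, :] - offsets[None, :, :]     # (M, N, 3)
    sq_dist = (diff ** 2).sum(axis=-1)                      # (M, N)
    # Sum Gaussian affinities over all pairs, average over neighbors.
    affinity = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return float(affinity.sum() / len(neighbors))
```

The response is large when the local neighborhood has a shape similar to the kernel's point layout and falls toward zero when the two are far apart.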