Learnable Pooling Regions for Image Classification
Biologically inspired pooling, from the early HMAX model to Spatial Pyramid Matching, has played an important role in visual recognition pipelines. By grouping local codes, spatial pooling gives these methods a degree of robustness to translation and deformation while preserving important spatial information. Despite the predominance of this approach in current recognition systems, there has been little progress in fully adapting the pooling strategy to the task at hand. This paper proposes a model for learning a task-dependent pooling scheme, which includes previously proposed hand-crafted pooling schemes as a particular instantiation. We investigate the role of different regularization terms and show that the smooth regularization term is crucial to achieving strong performance with the presented architecture. Finally, we propose an efficient and parallel method to train the model. Our experiments show improved performance over hand-crafted pooling schemes on the CIFAR-10 and CIFAR-100 datasets, in particular improving the state of the art to 56.29% on the latter.
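As a rough illustration of the idea in the abstract above, the sketch below treats each pooling region as a weight map over the spatial grid of local codes, so that a hand-crafted 2x2 grid becomes one fixed choice of weights and a smoothness penalty can be computed on the maps. The abstract does not give the paper's actual parametrization or regularizer, so the weighted-sum form, the squared-difference penalty, and the names `weighted_pool`, `smoothness_penalty`, and `quadrant_weights` are all assumptions made for illustration.

```python
import numpy as np

def weighted_pool(codes, weights):
    """Pool local codes with per-region weight maps (assumed linear form).

    codes:   (H, W, K) array of local codes on an H x W spatial grid.
    weights: (R, H, W) array, one weight map per pooling region.
    Returns an (R, K) array of pooled features.
    """
    return np.einsum("rhw,hwk->rk", weights, codes)

def smoothness_penalty(weights):
    """Sum of squared differences between spatially adjacent weights
    (an assumed stand-in for the smooth regularization term)."""
    dy = np.diff(weights, axis=1)
    dx = np.diff(weights, axis=2)
    return float((dy ** 2).sum() + (dx ** 2).sum())

def quadrant_weights(H, W):
    """Hand-crafted 2x2 average pooling written as fixed weight maps."""
    hh, hw = H // 2, W // 2
    w = np.zeros((4, H, W))
    w[0, :hh, :hw] = 1.0
    w[1, :hh, hw:] = 1.0
    w[2, hh:, :hw] = 1.0
    w[3, hh:, hw:] = 1.0
    # Normalize each map so that pooling is an average over its region.
    return w / w.sum(axis=(1, 2), keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    codes = rng.random((4, 6, 5))      # toy grid of 5-dimensional codes
    w = quadrant_weights(4, 6)
    print(weighted_pool(codes, w).shape)
    print(smoothness_penalty(w))
```

In a learned variant, the weight maps would be free parameters trained jointly with the classifier, with the penalty added to the loss; the fixed quadrant maps here only show that a hand-crafted scheme fits the same form.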
Mining Point Cloud Local Structures by Kernel Correlation and Graph Pooling
Unlike on images, semantic learning on 3D point clouds using a deep network
is challenging due to the naturally unordered data structure. Among existing
works, PointNet has achieved promising results by directly learning on point
sets. However, it does not take full advantage of a point's local neighborhood
that contains fine-grained structural information which turns out to be helpful
towards better semantic learning. In this regard, we present two new operations
to improve PointNet with a more efficient exploitation of local structures. The
first one focuses on local 3D geometric structures. In analogy to a convolution
kernel for images, we define a point-set kernel as a set of learnable 3D points
that jointly respond to a set of neighboring data points according to their
geometric affinities measured by kernel correlation, adapted from a similar
technique for point cloud registration. The second one exploits local
high-dimensional feature structures by recursive feature aggregation on a
nearest-neighbor-graph computed from 3D positions. Experiments show that our
network can efficiently capture local information and robustly achieve better
performances on major datasets. Our code is available at
http://www.merl.com/research/license#KCNet
Comment: Accepted in CVPR'18. *indicates equal contribution
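The two operations described in this abstract can be sketched roughly as follows. This is not the authors' released code: the Gaussian kernel, the bandwidth `sigma`, the normalization by neighborhood size, the use of max as the aggregation function, and the names `knn_indices`, `kernel_correlation`, and `graph_max_pool` are all assumptions chosen for illustration.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point (self excluded).

    points: (N, 3) array of 3D positions. Returns an (N, k) integer array.
    """
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)       # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]

def kernel_correlation(points, nbrs, kernel_pts, sigma=0.1):
    """Response of one point-set kernel at every point.

    kernel_pts: (M, 3) array of kernel points (learnable in a real model).
    Each neighbor offset is compared to each kernel point with a Gaussian
    kernel (an assumption), and the sum is averaged over the neighborhood.
    Returns an (N,) array of responses.
    """
    offsets = points[nbrs] - points[:, None, :]                    # (N, k, 3)
    diff = offsets[:, :, None, :] - kernel_pts[None, None, :, :]   # (N, k, M, 3)
    d2 = (diff ** 2).sum(-1)                                       # (N, k, M)
    return np.exp(-d2 / (2.0 * sigma ** 2)).sum(axis=(1, 2)) / nbrs.shape[1]

def graph_max_pool(features, nbrs):
    """Aggregate features over each point and its graph neighbors by max.

    features: (N, C) array. Returns an (N, C) array.
    """
    stacked = np.concatenate([features[:, None, :], features[nbrs]], axis=1)
    return stacked.max(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((32, 3))
    nbrs = knn_indices(pts, k=8)
    kernel = rng.normal(scale=0.1, size=(4, 3))   # random stand-in for learned points
    print(kernel_correlation(pts, nbrs, kernel).shape)
    feats = rng.random((32, 16))
    print(graph_max_pool(graph_max_pool(feats, nbrs), nbrs).shape)
```

Applying `graph_max_pool` repeatedly, as in the usage lines, spreads information over progressively larger neighborhoods of the graph, which is one way to read the "recursive feature aggregation" mentioned in the abstract.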