434 research outputs found
Efficient deformable filter banks
This article describes efficient schemes for computing a large number of differently scaled and oriented filtered versions of an image. We generalize the well-known steerable/scalable (“deformable”) filter bank structure by imposing X-Y separability on the basis filters. The resulting systems, designed by an iterative projections technique, achieve a substantial reduction in computational cost. To reduce the memory requirement, we adopt a multirate implementation. Because the sampling rate is altered inside the structure, the result is not shift invariant. We introduce a design criterion for multirate deformable structures that jointly controls the approximation error and the shift variance.
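The cost saving from X-Y separability can be illustrated with a minimal sketch. It is not the article's filter bank design or its iterative projections method; it only shows the general fact that a 2-D kernel formed as the outer product of a column filter and a row filter can be applied as two 1-D passes. The function names, the example kernel, and the test image are all hypothetical, chosen only for illustration.

```python
def correlate2d_valid(image, kernel):
    """Direct 2-D correlation over the 'valid' region (lists of lists)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(w - kw + 1)
        ]
        for i in range(h - kh + 1)
    ]


def separable_correlate2d_valid(image, col, row):
    """Same result for kernel = outer(col, row), using two 1-D passes."""
    # Horizontal pass: a 1 x kw kernel built from the row filter.
    tmp = correlate2d_valid(image, [list(row)])
    # Vertical pass: a kh x 1 kernel built from the column filter.
    return correlate2d_valid(tmp, [[c] for c in col])


# Hypothetical test data: a small integer-valued image and a separable kernel.
image = [[float((3 * i + 5 * j) % 7) for j in range(6)] for i in range(5)]
col = [1.0, 2.0, 1.0]    # vertical smoothing (assumed example)
row = [-1.0, 0.0, 1.0]   # horizontal difference (assumed example)
kernel = [[c * r for r in row] for c in col]

direct = correlate2d_valid(image, kernel)
separable = separable_correlate2d_valid(image, col, row)
max_diff = max(abs(d - s)
               for d_row, s_row in zip(direct, separable)
               for d, s in zip(d_row, s_row))

# Multiplications per output sample: kh * kw direct vs. kh + kw separable.
mults_direct = len(col) * len(row)
mults_separable = len(col) + len(row)
```

For this 3 x 3 kernel the direct form needs 9 multiplications per output sample and the separable form needs 6; the gap widens as the kernel grows.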
Improving the Resolution of CNN Feature Maps Efficiently with Multisampling
We describe a new class of subsampling techniques for CNNs, termed
multisampling, that significantly increases the amount of information kept by
feature maps through subsampling layers. One version of our method, which we
call checkered subsampling, significantly improves the accuracy of
state-of-the-art architectures such as DenseNet and ResNet without any
additional parameters and, remarkably, improves the accuracy of certain
pretrained ImageNet models without any training or fine-tuning. We glean new
insight into the nature of data augmentations and demonstrate, for the first
time, that coarse feature maps are significantly bottlenecking the performance
of neural networks in image classification. Comment: Preprint
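The abstract does not specify the sampling pattern, so the following is only a sketch under an assumption: that checkered subsampling keeps two diagonally opposite positions from every 2 x 2 block and stores them as two separate submaps, where ordinary stride-2 subsampling keeps one position. The function names and the toy feature map are hypothetical.

```python
def strided_subsample(fmap):
    """Ordinary stride-2 subsampling: keep the top-left of each 2 x 2 block."""
    return [row[0::2] for row in fmap[0::2]]


def checkered_subsample(fmap):
    """Assumed checkered pattern: keep top-left and bottom-right of each block.

    Returns two submaps, each the size of the stride-2 output.
    """
    top_left = [row[0::2] for row in fmap[0::2]]
    bottom_right = [row[1::2] for row in fmap[1::2]]
    return top_left, bottom_right


# Toy 4 x 4 feature map with values 0..15 in row-major order.
fmap = [[4 * i + j for j in range(4)] for i in range(4)]

strided = strided_subsample(fmap)
submap_a, submap_b = checkered_subsample(fmap)

kept_strided = sum(len(r) for r in strided)
kept_checkered = sum(len(r) for r in submap_a) + sum(len(r) for r in submap_b)
```

Under this assumption the checkered variant retains 8 of the 16 input values where stride-2 subsampling retains 4, while each submap still has the reduced spatial size.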
SFNet: Learning Object-aware Semantic Correspondence
We address the problem of semantic correspondence, that is, establishing a
dense flow field between images depicting different instances of the same
object or scene category. We propose to use images annotated with binary
foreground masks and subjected to synthetic geometric deformations to train a
convolutional neural network (CNN) for this task. Using these masks as part of
the supervisory signal offers a good compromise between semantic flow methods,
where the amount of training data is limited by the cost of manually selecting
point correspondences, and semantic alignment ones, where the regression of a
single global geometric transformation between images may be sensitive to
image-specific details such as background clutter. We propose a new CNN
architecture, dubbed SFNet, which implements this idea. It leverages a new and
differentiable version of the argmax function for end-to-end training, with a
loss that combines mask and flow consistency with smoothness terms.
Experimental results demonstrate the effectiveness of our approach, which
significantly outperforms the state of the art on standard benchmarks. Comment: CVPR 2019 oral paper
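To make the idea of a differentiable argmax concrete, here is a minimal sketch of the standard soft-argmax, which replaces the hard index with a softmax-weighted average of indices. The abstract does not describe the authors' variant, so this is the generic technique and not SFNet's formulation; the function name, the temperature parameter `beta`, and the sample scores are assumptions made for illustration.

```python
import math


def soft_argmax(scores, beta=10.0):
    """Softmax-weighted expected index; larger beta approaches the hard argmax."""
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (s - m)) for s in scores]
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total


# Hypothetical matching scores along one dimension.
scores = [0.1, 0.2, 3.0, 0.3]

hard = max(range(len(scores)), key=scores.__getitem__)  # non-differentiable
soft = soft_argmax(scores, beta=10.0)                   # smooth in the scores
uniform = soft_argmax(scores, beta=0.0)                 # degenerates to the mean index
```

With a sharp peak and a large `beta` the soft value lies very close to the hard index, while `beta = 0` weights every position equally. Because the output is a smooth function of the scores, gradients can flow through it during end-to-end training.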