434 research outputs found

    Efficient deformable filter banks

    This article describes efficient schemes for the computation of a large number of differently scaled/oriented filtered versions of an image. We generalize the well-known steerable/scalable (“deformable”) filter bank structure by imposing X-Y separability on the basis filters. The resulting systems, designed by an iterative projections technique, achieve substantial reduction of the computational cost. To reduce the memory requirement, we adopt a multirate implementation. Due to the inner sampling rate alteration, the resulting structure is not shift invariant. We introduce a design criterion for multirate deformable structures that jointly controls the approximation error and the shift variance.
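The cost saving from X-Y separability can be illustrated with a minimal sketch. It is not the authors' iterative-projections design; it only assumes a basis filter that factors as an outer product of a column filter and a row filter, and the function names `conv2d_valid` and `separable_valid` are hypothetical. A K x K filter then needs about 2K multiplications per output pixel instead of K*K.

```python
def conv2d_valid(image, kernel):
    """Direct 2-D 'valid' correlation on nested lists (K*K multiplies per pixel)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def separable_valid(image, col, row):
    """Same result when kernel[a][b] == col[a] * row[b], using two 1-D passes."""
    # Horizontal pass: filter every image row with the 1-D row filter.
    tmp = [[sum(r[j + b] * row[b] for b in range(len(row)))
            for j in range(len(r) - len(row) + 1)]
           for r in image]
    # Vertical pass: filter every column of the intermediate result.
    return [[sum(tmp[i + a][j] * col[a] for a in range(len(col)))
             for j in range(len(tmp[0]))]
            for i in range(len(tmp) - len(col) + 1)]


# Illustrative data (assumed, not from the article): a Sobel-like separable kernel.
col = [1, 2, 1]
row = [1, 0, -1]
kernel = [[c * r for r in row] for c in col]
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 8, 7, 6],
         [5, 4, 3, 2]]
direct = conv2d_valid(image, kernel)
two_pass = separable_valid(image, col, row)
```

Both routes give the same output on integer data; only the operation count differs.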

    Improving the Resolution of CNN Feature Maps Efficiently with Multisampling

    We describe a new class of subsampling techniques for CNNs, termed multisampling, that significantly increases the amount of information kept by feature maps through subsampling layers. One version of our method, which we call checkered subsampling, significantly improves the accuracy of state-of-the-art architectures such as DenseNet and ResNet without any additional parameters and, remarkably, improves the accuracy of certain pretrained ImageNet models without any training or fine-tuning. We glean new insight into the nature of data augmentations and demonstrate, for the first time, that coarse feature maps are significantly bottlenecking the performance of neural networks in image classification. Comment: Preprint

    Aerial Imagery Pixel-level Segmentation

    SFNet: Learning Object-aware Semantic Correspondence

    We address the problem of semantic correspondence, that is, establishing a dense flow field between images depicting different instances of the same object or scene category. We propose to use images annotated with binary foreground masks and subjected to synthetic geometric deformations to train a convolutional neural network (CNN) for this task. Using these masks as part of the supervisory signal offers a good compromise between semantic flow methods, where the amount of training data is limited by the cost of manually selecting point correspondences, and semantic alignment ones, where the regression of a single global geometric transformation between images may be sensitive to image-specific details such as background clutter. We propose a new CNN architecture, dubbed SFNet, which implements this idea. It leverages a new and differentiable version of the argmax function for end-to-end training, with a loss that combines mask and flow consistency with smoothness terms. Experimental results demonstrate the effectiveness of our approach, which significantly outperforms the state of the art on standard benchmarks. Comment: CVPR 2019 oral paper
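The abstract does not specify how SFNet makes argmax differentiable, so the following is only a sketch of the generic soft-argmax idea, not the paper's variant. It replaces the hard index of the maximum with a softmax-weighted average of indices; the function name `soft_argmax` and the temperature parameter `beta` are assumptions made for illustration.

```python
import math


def soft_argmax(scores, beta=10.0):
    """Softmax-weighted mean of indices: a smooth stand-in for argmax.

    A larger beta (assumed temperature parameter) sharpens the softmax,
    so the result moves closer to the hard argmax index.
    """
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (s - m)) for s in scores]
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total


# Illustrative matching scores (assumed values): the peak is at index 2.
scores = [0.1, 0.2, 5.0, 0.3]
sharp = soft_argmax(scores, beta=50.0)   # very close to the hard argmax
flat = soft_argmax([1.0, 1.0, 1.0])      # uniform scores give the mean index
```

Because the output is a smooth function of the scores, gradients can flow through it, which is what makes this kind of operator usable for end-to-end training.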