Compare More Nuanced: Pairwise Alignment Bilinear Network For Few-shot Fine-grained Learning
The recognition ability of human beings is developed in a progressive way.
Usually, children learn to discriminate various objects from coarse to
fine-grained with limited supervision. Inspired by this learning process, we
propose a simple yet effective model for Few-Shot Fine-Grained (FSFG)
recognition, which tries to tackle the challenging fine-grained recognition
task using meta-learning. The proposed method, named Pairwise Alignment
Bilinear Network (PABN), is an end-to-end deep neural network. Unlike
traditional deep bilinear networks for fine-grained classification, which adopt
self-bilinear pooling to capture the subtle features of images, the
proposed model uses a novel pairwise bilinear pooling to compare the nuanced
differences between base images and query images for learning a deep distance
metric. In order to match base image features with query image features, we
design feature alignment losses before the proposed pairwise bilinear pooling.
Experimental results on four fine-grained classification datasets and one generic
few-shot dataset demonstrate that the proposed model outperforms both the
state-of-the-art few-shot fine-grained and general few-shot methods.
Comment: ICME 2019 Ora
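The contrast between self-bilinear and pairwise bilinear pooling drawn in this
abstract can be illustrated with a minimal NumPy sketch. It is not the authors'
implementation: the function names, the feature-map shapes, the averaging over
spatial positions, and the mean-squared-error stand-in for the alignment loss
are all assumptions made for illustration.

```python
import numpy as np

def bilinear_pool(feat_a, feat_b):
    """Bilinear pooling of two feature maps of shape (C, H, W).

    Averages the outer product of the two channel vectors over all
    spatial positions and flattens it to a (C_a * C_b,) descriptor.
    Passing the same map twice gives self-bilinear pooling.
    """
    ca, h, w = feat_a.shape
    cb = feat_b.shape[0]
    a = feat_a.reshape(ca, h * w)
    b = feat_b.reshape(cb, h * w)
    pooled = a @ b.T / (h * w)  # (C_a, C_b) matrix of channel interactions
    return pooled.reshape(-1)

def alignment_loss(feat_a, feat_b):
    """Hypothetical stand-in for an alignment loss: mean squared difference."""
    return float(np.mean((feat_a - feat_b) ** 2))

# Random arrays stand in for encoder outputs of a base and a query image.
rng = np.random.default_rng(0)
base = rng.standard_normal((8, 4, 4))
query = rng.standard_normal((8, 4, 4))

self_desc = bilinear_pool(base, base)   # one image compared with itself
pair_desc = bilinear_pool(base, query)  # base image compared with query image
```

In a metric-learning setup, a descriptor such as `pair_desc` would be fed to a
small comparison network that scores how likely the two images are to share a
class. That downstream step is omitted here.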
BiRA-Net: Bilinear Attention Net for Diabetic Retinopathy Grading
Diabetic retinopathy (DR) is a common retinal disease that leads to
blindness. For diagnosis purposes, DR image grading aims to provide automatic
DR grade classification, which is not addressed in conventional research
methods of binary DR image classification. Small objects in the eye images,
like lesions and microaneurysms, are essential to DR grading in medical
imaging, but they could easily be influenced by other objects. To address these
challenges, we propose a new deep learning architecture, called BiRA-Net, which
combines an attention model for feature extraction with a bilinear model for
fine-grained classification. Furthermore, to account for the distance between
different DR grades, we propose a new loss function,
called grading loss, which leads to improved training convergence of the
proposed approach. Experimental results are provided to demonstrate the
superior performance of the proposed approach.
Comment: Accepted at ICIP 201
The Devil is in the Decoder: Classification, Regression and GANs
Many machine vision applications, such as semantic segmentation and depth
prediction, require predictions for every pixel of the input image. Models for
such problems usually consist of encoders which decrease spatial resolution
while learning a high-dimensional representation, followed by decoders that
recover the original input resolution and produce low-dimensional
predictions. While encoders have been studied rigorously, relatively few
studies address the decoder side. This paper presents an extensive comparison
of a variety of decoders for a variety of pixel-wise tasks ranging from
classification and regression to synthesis. Our contributions are: (1) Decoders
matter: we observe significant variance in results between different types of
decoders on various problems. (2) We introduce new residual-like connections
for decoders. (3) We introduce a novel decoder: bilinear additive upsampling.
(4) We explore prediction artifacts
- …
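Contribution (3) above names bilinear additive upsampling without describing
it. The sketch below shows one common reading of the idea, stated here as an
assumption rather than taken from this abstract: upsample each channel
bilinearly, then sum every group of consecutive channels so that the channel
count shrinks as the resolution grows. The function names, the factor of 2, the
group size of 4, and the align-corners interpolation are illustrative choices.

```python
import numpy as np

def upsample_linear(x, axis, factor):
    """Linear interpolation along one axis (end points are preserved)."""
    n = x.shape[axis]
    pos = np.linspace(0.0, n - 1, n * factor)  # sample positions in source coords
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, n - 1)
    frac = pos - lo
    shape = [1] * x.ndim
    shape[axis] = -1
    frac = frac.reshape(shape)                 # broadcast weights along `axis`
    return np.take(x, lo, axis=axis) * (1 - frac) + np.take(x, hi, axis=axis) * frac

def bilinear_additive_upsample(x, factor=2, group=4):
    """Upsample a (C, H, W) map bilinearly, then add each `group` channels."""
    c, h, w = x.shape
    if c % group != 0:
        raise ValueError("channel count must be divisible by group size")
    up = upsample_linear(upsample_linear(x, 1, factor), 2, factor)
    return up.reshape(c // group, group, h * factor, w * factor).sum(axis=1)

# A constant input makes the behaviour easy to check by hand.
x = np.ones((8, 3, 3))
out = bilinear_additive_upsample(x)  # 8 channels at 3x3 become 2 channels at 6x6
```

The layer has no learnable parameters, so in a decoder it would typically be
followed by a convolution.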