ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering
We propose a novel attention-based deep learning architecture for the visual
question answering (VQA) task. Given an image and an image-related natural
language question, VQA generates a natural language answer to the question.
Generating the correct answers requires the model's attention to focus on the
regions corresponding to the question, because different questions inquire
about the attributes of different image regions. We introduce an
attention-based configurable convolutional neural network (ABC-CNN) to learn such
question-guided attention. ABC-CNN determines an attention map for an
image-question pair by convolving the image feature map with configurable
convolutional kernels derived from the question's semantics. We evaluate the
ABC-CNN architecture on three benchmark VQA datasets: Toronto COCO-QA, DAQUAR,
and the VQA dataset. The ABC-CNN model achieves significant improvements over
state-of-the-art methods on these datasets. The question-guided attention
generated by ABC-CNN is also shown to reflect the regions that are highly
relevant to the questions.
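The attention step described in this abstract can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function name `attention_map`, the single-channel feature map, the zero padding, and the spatial softmax are all assumptions, and the question-derived kernel is simply passed in rather than computed from question semantics.

```python
import math

def attention_map(feature_map, kernel):
    """Slide a (hypothetical) question-derived kernel over a 2-D feature map
    and normalise the responses into an attention map that sums to 1.

    As in most deep learning libraries, the "convolution" here is a
    cross-correlation (the kernel is not flipped).
    """
    h, w = len(feature_map), len(feature_map[0])
    k = len(kernel)          # assumed square kernel with odd side length
    pad = k // 2
    scores = []
    for i in range(h):
        row = []
        for j in range(w):
            s = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:  # zero padding outside the map
                        s += feature_map[y][x] * kernel[di][dj]
            row.append(s)
        scores.append(row)
    # Spatial softmax (an assumption); subtract the max for numerical stability.
    m = max(max(r) for r in scores)
    exps = [[math.exp(v - m) for v in r] for r in scores]
    z = sum(sum(r) for r in exps)
    return [[v / z for v in r] for r in exps]
```

In this sketch, a different question would yield a different kernel and therefore a different attention map over the same image features.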
Quality Aware Network for Set to Set Recognition
This paper targets the problem of set-to-set recognition, which learns a
metric between two image sets. Images in each set belong to the same identity.
Since images in a set can be complementary, they hopefully lead to higher
accuracy in practical applications. However, the quality of each sample cannot
be guaranteed, and samples with poor quality will hurt the metric. In this
paper, the quality aware network (QAN) is proposed to confront this problem,
where the quality of each sample can be automatically learned although such
information is not explicitly provided in the training stage. The network has
two branches, where the first branch extracts an appearance feature embedding for
each sample and the other branch predicts a quality score for each sample.
Features and quality scores of all samples in a set are then aggregated to
generate the final feature embedding. We show that the two branches can be
trained in an end-to-end manner given only the set-level identity annotation.
Analysis on gradient spread of this mechanism indicates that the quality
learned by the network is beneficial to set-to-set recognition and simplifies
the distribution that the network needs to fit. Experiments on both face
verification and person re-identification show advantages of the proposed QAN.
The source code and network structure can be downloaded at
https://github.com/sciencefans/Quality-Aware-Network.
Comment: Accepted at CVPR 201
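One plausible reading of the aggregation step in this abstract is a quality-weighted average of the per-sample features. The sketch below assumes that reading; the function name `aggregate_set` and the normalisation by the sum of scores are illustrative choices, and the feature and quality values would in practice come from the two network branches.

```python
def aggregate_set(features, quality_scores):
    """Combine per-sample feature vectors into one set-level embedding,
    weighting each sample by its (assumed non-negative) quality score."""
    if len(features) != len(quality_scores):
        raise ValueError("need exactly one quality score per feature vector")
    total = sum(quality_scores)
    if total <= 0:
        raise ValueError("quality scores must sum to a positive value")
    dim = len(features[0])
    # Weighted average: low-quality samples contribute less to the result.
    return [
        sum(q * f[d] for f, q in zip(features, quality_scores)) / total
        for d in range(dim)
    ]
```

For example, with two samples whose scores are 3 and 1, the first sample contributes three quarters of the final embedding.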
Fine-grained ship image recognition based on BCNN with inception and AM-Softmax
The fine-grained ship image recognition task aims to identify various classes of ships. However, small inter-class differences, large intra-class differences, and a lack of training samples make the task difficult. To improve the accuracy of fine-grained ship image recognition, we design a network based on the bilinear convolutional neural network (BCNN) with Inception and additive margin Softmax (AM-Softmax). This network improves the BCNN in two respects. First, introducing Inception branches into the BCNN enhances its ability to extract comprehensive features from ships. Second, by adding a margin to the decision boundary, the AM-Softmax function enlarges inter-class differences and reduces intra-class differences. In addition, as there are few publicly available datasets for fine-grained ship image recognition, we construct a Ship-43 dataset containing 47,300 ship images belonging to 43 categories. Experimental results on the constructed Ship-43 dataset demonstrate that our method effectively improves the accuracy of ship image recognition, which is 4.08% higher than that of the BCNN model. Moreover, comparison results on three other public fine-grained datasets (Cub, Cars, and Aircraft) further validate the effectiveness of the proposed method.
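The margin idea behind AM-Softmax can be illustrated for a single sample as below. This is a sketch of the standard additive margin formulation, where the target-class cosine is reduced by a margin `m` before scaling by `s`; the default values `s=30.0` and `m=0.35` are assumptions for illustration and are not taken from this abstract.

```python
import math

def am_softmax_loss(cosines, target, s=30.0, m=0.35):
    """Additive margin softmax loss for one sample.

    cosines: cosine similarity between the feature and each class weight.
    target:  index of the true class.
    s, m:    scale and margin (illustrative defaults, not from the paper above).
    """
    # Subtract the margin only from the true-class cosine, then scale all logits.
    logits = [s * (c - m) if j == target else s * c
              for j, c in enumerate(cosines)]
    # Numerically stable log-sum-exp.
    mx = max(logits)
    log_z = mx + math.log(sum(math.exp(v - mx) for v in logits))
    # Cross-entropy: -log softmax(logits)[target].
    return log_z - logits[target]
```

With `m=0` this reduces to ordinary scaled softmax cross-entropy; a positive margin makes the loss larger for the same cosines, which pushes the true-class cosine to exceed the others by at least the margin.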
- …