15,970 research outputs found
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Trained on large datasets, deep learning (DL) can accurately classify videos into hundreds of diverse classes. However, video data is expensive to annotate. Zero-shot learning (ZSL) proposes one solution to this problem. ZSL trains a model once, and generalizes to new tasks whose classes are not present in the training dataset. We propose the first end-to-end algorithm for ZSL in video classification. Our training procedure builds on insights from recent video classification literature and uses a trainable 3D CNN to learn the visual features. This is in contrast to previous video ZSL methods, which use pretrained feature extractors. We also extend the current benchmarking paradigm: Previous techniques aim to make the test task unknown at training time but fall short of this goal. We encourage domain shift across training and test data and disallow tailoring a ZSL model to a specific test dataset. We outperform the state-of-the-art by a wide margin. Our code, evaluation procedure and model weights are available at this http URL
Fine-Grained Object Recognition and Zero-Shot Learning in Remote Sensing Imagery
Fine-grained object recognition that aims to identify the type of an object
among a large number of subcategories is an emerging application with the
increasing resolution that exposes new details in image data. Traditional fully
supervised algorithms fail to handle this problem where there is low
between-class variance and high within-class variance for the classes of
interest with small sample sizes. We study an even more extreme scenario named
zero-shot learning (ZSL) in which no training example exists for some of the
classes. ZSL aims to build a recognition model for new unseen categories by
relating them to seen classes that were previously learned. We establish this
relation by learning a compatibility function between image features extracted
via a convolutional neural network and auxiliary information that describes the
semantics of the classes of interest by using training samples from the seen
classes. Then, we show how knowledge transfer can be performed for the unseen
classes by maximizing this function during inference. We introduce a new data
set that contains 40 different types of street trees in 1-ft spatial resolution
aerial data, and evaluate the performance of this model with manually annotated
attributes, a natural language model, and a scientific taxonomy as auxiliary
information. The experiments show that the proposed model achieves 14.3%
recognition accuracy for the classes with no training examples, which is
significantly better than a random guess accuracy of 6.3% for 16 test classes,
and three other ZSL algorithms.Comment: G. Sumbul, R. G. Cinbis, S. Aksoy, "Fine-Grained Object Recognition
and Zero-Shot Learning in Remote Sensing Imagery", IEEE Transactions on
Geoscience and Remote Sensing (TGRS), in press, 201
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
Learning models for semantic classification of insufficient plantar pressure images
Establishing a reliable and stable model to predict a target by using insufficient labeled samples is feasible and
effective, particularly, for a sensor-generated data-set. This paper has been inspired with insufficient data-set
learning algorithms, such as metric-based, prototype networks and meta-learning, and therefore we propose
an insufficient data-set transfer model learning method. Firstly, two basic models for transfer learning are
introduced. A classification system and calculation criteria are then subsequently introduced. Secondly, a dataset
of plantar pressure for comfort shoe design is acquired and preprocessed through foot scan system; and by
using a pre-trained convolution neural network employing AlexNet and convolution neural network (CNN)-
based transfer modeling, the classification accuracy of the plantar pressure images is over 93.5%. Finally,
the proposed method has been compared to the current classifiers VGG, ResNet, AlexNet and pre-trained
CNN. Also, our work is compared with known-scaling and shifting (SS) and unknown-plain slot (PS) partition
methods on the public test databases: SUN, CUB, AWA1, AWA2, and aPY with indices of precision (tr, ts, H)
and time (training and evaluation). The proposed method for the plantar pressure classification task shows high
performance in most indices when comparing with other methods. The transfer learning-based method can be
applied to other insufficient data-sets of sensor imaging fields
- …