Action recognition using deep learning
PhD
In this thesis we study deep learning architectures for the problem of human action
recognition in image sequences, i.e. the problem of automatically recognizing what
people are doing in a given video. As unlabeled video data is easily accessible these
days, we first explore models that can learn meaningful representations of sequences
without actually having to know what is happening in the sequences at hand. More
specifically, we first explore the convolutional restricted Boltzmann machine (RBM)
and show how a stack of convolutional RBMs can be used to learn and extract features
from sequences in an unsupervised way. Using the classical Fisher vector pipeline
to encode the extracted features, we apply them to the task of action classification.
We move on to feature extraction using larger, deep convolutional neural networks
and propose a novel architecture which expresses the processing steps of the classical
Fisher vector pipeline as network layers. In contrast to other methods where these
steps are performed consecutively and the corresponding parameters are learned in
an unsupervised manner, defining them as a single neural network allows us to refine
the whole model discriminatively in an end-to-end fashion. We show that our
method achieves significant improvements over the classical Fisher vector
extraction chain and performs comparably to other convolutional networks,
while largely reducing the number of required trainable parameters. Finally,
we explore how the proposed architecture can be modified into a hybrid network that
combines the benefits of both unsupervised and supervised training methods, resulting
in a model that learns a semi-supervised Fisher vector descriptor of the input data.
We evaluate the proposed model on image classification and action recognition problems
and show how the model's classification performance improves as the amount of
unlabeled data increases during training.
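The Fisher vector encoding step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a diagonal-covariance Gaussian mixture model whose parameters are already given, and it computes only the first-order (mean) gradient statistics; the function name `fisher_vector_means` and its arguments are hypothetical, and the thesis itself may use a fuller formulation (for example, variance gradients and normalization).

```python
import numpy as np

def fisher_vector_means(X, weights, means, variances):
    """First-order Fisher vector of descriptors X under a diagonal GMM.

    X:         (N, D) local descriptors
    weights:   (K,)   mixture weights, summing to 1
    means:     (K, D) component means
    variances: (K, D) diagonal variances
    Returns a vector of length K * D.
    """
    N = X.shape[0]
    diff = X[:, None, :] - means[None, :, :]            # (N, K, D)

    # Log-density of each descriptor under each Gaussian component.
    log_prob = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                   # (N, K)

    # Posterior responsibilities, computed stably in log space.
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)
    gamma = np.exp(log_post)
    gamma /= gamma.sum(axis=1, keepdims=True)           # (N, K)

    # Gradient with respect to the means, scaled by 1 / (N * sqrt(w_k)).
    grad = (gamma[:, :, None] * diff / np.sqrt(variances)).sum(axis=0)
    grad /= N * np.sqrt(weights)[:, None]
    return grad.ravel()
```

In a classical pipeline the GMM would be fitted without labels and a separate classifier trained on the resulting vectors; expressing the same computation as network layers is what makes the end-to-end refinement described above possible.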
Weakly-Supervised Neural Text Classification
Deep neural networks are gaining increasing popularity for the classic text
classification task, due to their strong expressive power and reduced need
for feature engineering. Despite such attractiveness, neural text
classification models suffer from the lack of training data in many real-world
applications. Although many semi-supervised and weakly-supervised text
classification models exist, they cannot be easily applied to deep neural
models and support only limited types of supervision. In this paper, we
propose a weakly-supervised method that addresses the lack of training data in
neural text classification. Our method consists of two modules: (1) a
pseudo-document generator that leverages seed information to generate
pseudo-labeled documents for model pre-training, and (2) a self-training module
that bootstraps on real unlabeled data for model refinement. Our method has the
flexibility to handle different types of weak supervision and can be easily
integrated into existing deep neural models for text classification. We have
performed extensive experiments on three real-world datasets from different
domains. The results demonstrate that our proposed method achieves inspiring
performance without requiring excessive training data and outperforms baseline
methods significantly.
Comment: CIKM 2018 Full Paper
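The self-training idea in the abstract above (bootstrapping on unlabeled data after an initial model is fitted from seed information) can be sketched generically. This is not the paper's method: it swaps in a nearest-centroid classifier over fixed feature vectors and a simple confidence threshold, and every name here (`self_train`, `threshold`, `rounds`) is an assumption made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_centroids(X, y, n_classes):
    # One centroid per class; every class needs at least one seed example.
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_proba(X, centroids):
    # Class scores are negative squared distances to each centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return softmax(-d2)

def self_train(X_seed, y_seed, X_unlabeled, n_classes, threshold=0.9, rounds=5):
    """Refine a seed-trained classifier with confident pseudo-labels."""
    centroids = fit_centroids(X_seed, y_seed, n_classes)
    for _ in range(rounds):
        proba = predict_proba(X_unlabeled, centroids)
        confident = proba.max(axis=1) >= threshold
        pseudo = proba.argmax(axis=1)
        # Refit on the seed data plus the confidently pseudo-labeled points.
        X_all = np.vstack([X_seed, X_unlabeled[confident]])
        y_all = np.concatenate([y_seed, pseudo[confident]])
        centroids = fit_centroids(X_all, y_all, n_classes)
    return centroids
```

The seed examples are kept in every round so that each class always has at least one labeled point, which keeps the centroid update well defined even when no unlabeled point passes the threshold.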
Unsupervised spectral sub-feature learning for hyperspectral image classification
Spectral pixel classification is one of the principal techniques used in hyperspectral image (HSI) analysis. In this article, we propose an unsupervised feature learning method for classification of hyperspectral images. The proposed method learns a dictionary of sub-feature basis representations from the spectral domain, which allows effective use of the correlated spectral data. The learned dictionary is then used in encoding convolutional samples from the hyperspectral input pixels to an expanded but sparse feature space. Expanded hyperspectral feature representations enable linear separation between object classes present in an image. To evaluate the proposed method, we performed experiments on several commonly used HSI data sets acquired at different locations and by different sensors. Our experimental results show that the proposed method outperforms other pixel-wise classification methods that make use of unsupervised feature extraction approaches. Additionally, even though our approach does not use any prior knowledge or labelled training data to learn features, it yields either advantageous or comparable results in terms of classification accuracy with respect to recent semi-supervised methods.
Detection of Review Abuse via Semi-Supervised Binary Multi-Target Tensor Decomposition
Product reviews and ratings on e-commerce websites provide customers with
detailed insights about various aspects of the product such as quality,
usefulness, etc. Since they influence customers' buying decisions, product
reviews have become a fertile ground for abuse by sellers (colluding with
reviewers) to promote their own products or to tarnish the reputation of
competitors' products. In this paper, our focus is on detecting such abusive
entities (both sellers and reviewers) by applying tensor decomposition on the
product reviews data. While tensor decomposition is mostly unsupervised, we
formulate our problem as a semi-supervised binary multi-target tensor
decomposition, to take advantage of currently known abusive entities. We
empirically show that our multi-target semi-supervised model achieves higher
precision and recall in detecting abusive entities as compared to unsupervised
techniques. Finally, we show that our proposed stochastic partial natural
gradient inference for our model empirically achieves faster convergence than
stochastic gradient and Online-EM with sufficient statistics.
Comment: Accepted to the 25th ACM SIGKDD Conference on Knowledge Discovery and
Data Mining, 2019. Contains supplementary material. arXiv admin note: text
overlap with arXiv:1804.0383