Cross-convolutional-layer Pooling for Image Recognition
Recent studies have shown that a Deep Convolutional Neural Network (DCNN)
pretrained on a large image dataset can be used as a universal image
descriptor, and that doing so leads to impressive performance for a variety of
image classification tasks. Most of these studies adopt activations from a
single DCNN layer, usually the fully-connected layer, as the image
representation. In this paper, we propose a novel way to extract image
representations from two consecutive convolutional layers: one layer is
utilized for local feature extraction and the other serves as guidance to pool
the extracted features. By taking different viewpoints of convolutional layers,
we further develop two schemes to realize this idea. The first one directly
uses convolutional layers from a DCNN. The second one applies the pretrained
CNN on densely sampled image regions and treats the fully-connected activations
of each image region as convolutional feature activations. We then train
another convolutional layer on top of that as the pooling-guidance
convolutional layer. By applying our method to three popular visual
classification tasks, we find that the first scheme tends to perform better on
applications that require strong discrimination of subtle object patterns
within small regions, while the second excels in cases that require
discrimination of category-level patterns. Overall, the proposed method achieves superior
performance over existing ways of extracting image representations from a DCNN.
Comment: Fixed typos. Journal extension of arXiv:1411.7466. Accepted to IEEE
Transactions on Pattern Analysis and Machine Intelligence
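
As a rough illustration of the first scheme, the sketch below pools local features from one convolutional layer under the guidance of a deeper layer's channels. It assumes a VGG-16 backbone, illustrative layer indices, and bilinear alignment of the two layers' spatial grids; the abstract specifies none of these.

import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

# Hypothetical layer indices into VGG-16's feature stack; the abstract does
# not name the backbone or the layer pair.
LAYER_A, LAYER_B = 21, 28

def cross_layer_pooling(image):
    """Pool local features from one conv layer using the channels of a
    deeper conv layer as spatial pooling weights (the first scheme)."""
    backbone = vgg16(weights=VGG16_Weights.DEFAULT).features.eval()
    feats, x = {}, image.unsqueeze(0)  # image: (3, H, W) float tensor
    with torch.no_grad():
        for idx, layer in enumerate(backbone):
            x = layer(x)
            if idx in (LAYER_A, LAYER_B):
                feats[idx] = x
    local = feats[LAYER_A]   # (1, Da, Ha, Wa): local feature extractor
    guide = feats[LAYER_B]   # (1, Db, Hb, Wb): pooling guidance
    # Align spatial sizes by interpolation; the paper instead aligns local
    # features with the receptive fields of the guidance layer's units.
    guide = F.interpolate(guide, size=local.shape[-2:], mode="bilinear",
                          align_corners=False).clamp(min=0)
    # Each guidance channel k yields one pooled vector:
    # P_k = sum_{h,w} guide[k, h, w] * local[:, h, w]
    pooled = torch.einsum("bkhw,bdhw->bkd", guide, local)
    return pooled.flatten(1)  # (1, Db * Da) image representation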
ECG Arrhythmia Classification Using Transfer Learning from 2-Dimensional Deep CNN Features
Due to the recent advances in the area of deep learning, it has been
demonstrated that a deep neural network, trained on a huge amount of data, can
recognize cardiac arrhythmias better than cardiologists. Moreover, feature
extraction was traditionally considered an integral part of ECG pattern
recognition; however, recent findings have shown that deep neural networks can
carry out feature extraction directly from the data itself. Using deep neural
networks for their accuracy and feature extraction nevertheless requires a
high volume of training data, which is not pragmatic for independent studies.
To rise to this challenge, in this work the
identification and classification of four ECG patterns are studied from a
transfer learning perspective, transferring knowledge learned from the image
classification domain to the ECG signal classification domain. It is
demonstrated that feature maps learned in a deep neural network trained on
large amounts of generic input images can be used as general descriptors for
the ECG signal spectrograms, yielding features that enable classification of
arrhythmias. Overall, an accuracy of 97.23 percent is achieved in classifying
nearly 7,000 instances with ten-fold cross-validation.
Comment: Accepted and presented at the IEEE Biomedical Circuits and Systems
Conference (BioCAS), 17th-19th October 2018 in Ohio, US
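
A minimal sketch of the transfer-learning pipeline the abstract describes: an ECG segment becomes a spectrogram image, which a CNN pretrained on generic images turns into a feature vector for downstream arrhythmia classification. The ResNet-18 backbone, the 360 Hz sampling rate, and the spectrogram settings here are assumptions, not details from the paper.

import numpy as np
import torch
from scipy.signal import spectrogram
from torchvision.models import resnet18, ResNet18_Weights

def ecg_to_features(ecg_segment, fs=360):
    """Convert a 1-D ECG segment into a spectrogram image and describe it
    with a CNN pretrained on generic images (ImageNet)."""
    # 1. Short-time Fourier spectrogram; window sizes are illustrative.
    _, _, sxx = spectrogram(ecg_segment, fs=fs, nperseg=64, noverlap=32)
    sxx = np.log1p(sxx)                              # compress dynamic range
    sxx = (sxx - sxx.min()) / (sxx.max() - sxx.min() + 1e-8)
    # 2. Replicate to 3 channels and resize to the CNN's expected input.
    img = torch.tensor(sxx, dtype=torch.float32).unsqueeze(0).repeat(3, 1, 1)
    img = torch.nn.functional.interpolate(img.unsqueeze(0), size=(224, 224),
                                          mode="bilinear", align_corners=False)
    # 3. Use the pretrained CNN, minus its classifier, as a fixed descriptor;
    # the resulting 512-D vectors can feed any off-the-shelf classifier.
    backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()                # drop the ImageNet head
    backbone.eval()
    with torch.no_grad():
        return backbone(img).squeeze(0)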
Smile detection in the wild based on transfer learning
Smile detection from unconstrained facial images is a specialized and
challenging problem. As one of the most informative expressions, smiles convey
basic underlying emotions, such as happiness and satisfaction, enabling
multiple applications, e.g., human behavior analysis and interactive
control. Compared to the size of databases for face recognition, far less
labeled data is available for training smile detection systems. To leverage the
large amount of labeled data from face recognition datasets and to alleviate
overfitting on smile detection, an efficient transfer learning-based smile
detection approach is proposed in this paper. Unlike previous works, which
either use hand-engineered features or train deep convolutional networks from
scratch, a well-trained deep face recognition model is explored and fine-tuned
for smile detection in the wild. Three different models are built as a result
of fine-tuning the face recognition model with different inputs, including
aligned, unaligned and grayscale images generated from the GENKI-4K dataset.
Experiments show that the proposed approach achieves new state-of-the-art
performance. The robustness of the model to noise and blur artifacts is also
evaluated in this paper.
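
The sketch below illustrates the fine-tuning recipe in outline: take a pretrained model, swap its head for a binary smile classifier, and retrain with a small learning rate. A ResNet-50 stands in for the paper's face recognition model, whose architecture the abstract does not name.

import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

def build_smile_detector():
    # A ResNet-50 stands in for the (unnamed) face recognition backbone.
    model = resnet50(weights=ResNet50_Weights.DEFAULT)
    # Replace the recognition head with a binary smile / non-smile classifier.
    model.fc = nn.Linear(model.fc.in_features, 2)
    return model

def fine_tune(model, loader, epochs=5, lr=1e-4):
    """Fine-tune all layers with a small learning rate so the pretrained
    features are adapted rather than overwritten."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:   # e.g., GENKI-4K crops and 0/1 labels
            opt.zero_grad()
            loss_fn(model(images), labels).backward()
            opt.step()
    return model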