482,163 research outputs found

    A multi-task learning CNN for image steganalysis

    Get PDF
    Convolutional neural network (CNN) based image steganalysis are increasingly popular because of their superiority in accuracy. The most straightforward way to employ CNN for image steganalysis is to learn a CNN-based classifier to distinguish whether secret messages have been embedded into an image. However, it is difficult to learn such a classifier because of the weak stego signals and the limited useful information. To address this issue, in this paper, a multi-task learning CNN is proposed. In addition to the typical use of CNN, learning a CNN-based classifier for the whole image, our multi-task CNN is learned with an auxiliary task of the pixel binary classification, estimating whether each pixel in an image has been modified due to steganography. To the best of our knowledge, we are the first to employ CNN to perform the pixel-level classification of such type. Experimental results have justified the effectiveness and efficiency of the proposed multi-task learning CNN

    Short-segment heart sound classification using an ensemble of deep convolutional neural networks

    Get PDF
    This paper proposes a framework based on deep convolutional neural networks (CNNs) for automatic heart sound classification using short-segments of individual heart beats. We design a 1D-CNN that directly learns features from raw heart-sound signals, and a 2D-CNN that takes inputs of two- dimensional time-frequency feature maps based on Mel-frequency cepstral coefficients (MFCC). We further develop a time-frequency CNN ensemble (TF-ECNN) combining the 1D-CNN and 2D-CNN based on score-level fusion of the class probabilities. On the large PhysioNet CinC challenge 2016 database, the proposed CNN models outperformed traditional classifiers based on support vector machine and hidden Markov models with various hand-crafted time- and frequency-domain features. Best classification scores with 89.22% accuracy and 89.94% sensitivity were achieved by the ECNN, and 91.55% specificity and 88.82% modified accuracy by the 2D-CNN alone on the test set.Comment: 8 pages, 1 figure, conferenc

    When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition

    Get PDF
    Deep learning, in particular Convolutional Neural Network (CNN), has achieved promising results in face recognition recently. However, it remains an open question: why CNNs work well and how to design a 'good' architecture. The existing works tend to focus on reporting CNN architectures that work well for face recognition rather than investigate the reason. In this work, we conduct an extensive evaluation of CNN-based face recognition systems (CNN-FRS) on a common ground to make our work easily reproducible. Specifically, we use public database LFW (Labeled Faces in the Wild) to train CNNs, unlike most existing CNNs trained on private databases. We propose three CNN architectures which are the first reported architectures trained using LFW data. This paper quantitatively compares the architectures of CNNs and evaluate the effect of different implementation choices. We identify several useful properties of CNN-FRS. For instance, the dimensionality of the learned features can be significantly reduced without adverse effect on face recognition accuracy. In addition, traditional metric learning method exploiting CNN-learned features is evaluated. Experiments show two crucial factors to good CNN-FRS performance are the fusion of multiple CNNs and metric learning. To make our work reproducible, source code and models will be made publicly available.Comment: 7 pages, 4 figures, 7 table

    Genetic CNN

    Full text link
    The deep Convolutional Neural Network (CNN) is the state-of-the-art solution for large-scale visual recognition. Following basic principles such as increasing the depth and constructing highway connections, researchers have manually designed a lot of fixed network structures and verified their effectiveness. In this paper, we discuss the possibility of learning deep network structures automatically. Note that the number of possible network structures increases exponentially with the number of layers in the network, which inspires us to adopt the genetic algorithm to efficiently traverse this large search space. We first propose an encoding method to represent each network structure in a fixed-length binary string, and initialize the genetic algorithm by generating a set of randomized individuals. In each generation, we define standard genetic operations, e.g., selection, mutation and crossover, to eliminate weak individuals and then generate more competitive ones. The competitiveness of each individual is defined as its recognition accuracy, which is obtained via training the network from scratch and evaluating it on a validation set. We run the genetic process on two small datasets, i.e., MNIST and CIFAR10, demonstrating its ability to evolve and find high-quality structures which are little studied before. These structures are also transferrable to the large-scale ILSVRC2012 dataset.Comment: Submitted to CVPR 2017 (10 pages, 5 figures

    Multi-scale Orderless Pooling of Deep Convolutional Activation Features

    Full text link
    Deep convolutional neural networks (CNN) have shown their promise as a universal representation for recognition. However, global CNN activations lack geometric invariance, which limits their robustness for classification and matching of highly variable scenes. To improve the invariance of CNN activations without degrading their discriminative power, this paper presents a simple but effective scheme called multi-scale orderless pooling (MOP-CNN). This scheme extracts CNN activations for local patches at multiple scale levels, performs orderless VLAD pooling of these activations at each level separately, and concatenates the result. The resulting MOP-CNN representation can be used as a generic feature for either supervised or unsupervised recognition tasks, from image classification to instance-level retrieval; it consistently outperforms global CNN activations without requiring any joint training of prediction layers for a particular target dataset. In absolute terms, it achieves state-of-the-art results on the challenging SUN397 and MIT Indoor Scenes classification datasets, and competitive results on ILSVRC2012/2013 classification and INRIA Holidays retrieval datasets

    One-to-many face recognition with bilinear CNNs

    Full text link
    The recent explosive growth in convolutional neural network (CNN) research has produced a variety of new architectures for deep learning. One intriguing new architecture is the bilinear CNN (B-CNN), which has shown dramatic performance gains on certain fine-grained recognition problems [15]. We apply this new CNN to the challenging new face recognition benchmark, the IARPA Janus Benchmark A (IJB-A) [12]. It features faces from a large number of identities in challenging real-world conditions. Because the face images were not identified automatically using a computerized face detection system, it does not have the bias inherent in such a database. We demonstrate the performance of the B-CNN model beginning from an AlexNet-style network pre-trained on ImageNet. We then show results for fine-tuning using a moderate-sized and public external database, FaceScrub [17]. We also present results with additional fine-tuning on the limited training data provided by the protocol. In each case, the fine-tuned bilinear model shows substantial improvements over the standard CNN. Finally, we demonstrate how a standard CNN pre-trained on a large face database, the recently released VGG-Face model [20], can be converted into a B-CNN without any additional feature training. This B-CNN improves upon the CNN performance on the IJB-A benchmark, achieving 89.5% rank-1 recall.Comment: Published version at WACV 201
    corecore