167 research outputs found

    Deep learning for facial emotion recognition

    The ability to perceive and interpret human emotions is an essential aspect of daily life. The recent success of deep learning (DL) has made automated emotion recognition possible by classifying affective modalities into a given emotional state. Accordingly, DL has set several state-of-the-art benchmarks on static affective corpora collected in controlled environments. Yet one of the main limitations of DL-based intelligent systems is their inability to generalize to data with nonuniform conditions. For instance, when dealing with images in a real-life scenario, where extraneous variables such as natural or artificial lighting are subject to constant change, the resulting shifts in the data distribution commonly lead to poor classification performance. These and other constraints, such as the lack of realistic data, changes in facial pose, and high data complexity and dimensionality, increase the difficulty of designing DL models for emotion recognition in unconstrained environments. This thesis investigates the development of deep artificial neural network learning algorithms for emotion recognition, with specific attention to illumination and facial pose invariance. Moreover, this research examines the development of illumination- and rotation-invariant face detection architectures based on deep reinforcement learning. The contributions and novelty of this thesis are presented in the form of several deep learning pose- and illumination-invariant architectures that offer state-of-the-art classification performance on data with nonuniform conditions. Furthermore, a novel deep reinforcement learning architecture for illumination- and rotation-invariant face detection is also presented. The originality of this work is derived from a variety of novel deep learning paradigms designed for the training of such architectures.

    InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

    This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound on the mutual information objective that can be optimized efficiently, and show that our training procedure can be interpreted as a variation of the Wake-Sleep algorithm. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing fully supervised methods.
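
    To make the mutual-information term concrete, here is a minimal PyTorch sketch of how it is commonly optimized for a categorical latent code; the `generator` and `q_net` modules, their input/output shapes, and the default dimensions are illustrative assumptions rather than the paper's exact architecture. For a categorical code c, the variational lower bound reduces, up to the constant entropy H(c), to a cross-entropy between the sampled code and Q's prediction from the generated sample.

```python
# Minimal sketch of the InfoGAN mutual-information surrogate (assumed PyTorch setup;
# `generator` and `q_net` are hypothetical nn.Module instances).
import torch
import torch.nn.functional as F

def info_loss(generator, q_net, batch_size, noise_dim=62, code_dim=10):
    """Cross-entropy surrogate for the lower bound on I(c; G(z, c))."""
    z = torch.randn(batch_size, noise_dim)                # incompressible noise
    c = torch.randint(0, code_dim, (batch_size,))         # categorical latent code
    c_onehot = F.one_hot(c, num_classes=code_dim).float()
    x_fake = generator(torch.cat([z, c_onehot], dim=1))   # G(z, c)
    logits = q_net(x_fake)                                # Q's logits for c given the sample
    # Minimizing this cross-entropy maximizes E[log Q(c | G(z, c))],
    # which lower-bounds the mutual information up to the constant H(c).
    return F.cross_entropy(logits, c)
```

    In the full objective this term is weighted by a hyperparameter and combined with the standard generator and discriminator losses, so the generator is rewarded for producing samples from which the latent code can be recovered.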

    A survey on generative adversarial networks for imbalance problems in computer vision tasks

    Development of any computer vision application starts with acquiring images and data, followed by preprocessing and pattern recognition steps to perform a task. When the acquired images are highly imbalanced and not adequate, the desired task may not be achievable. Unfortunately, the occurrence of imbalance problems in acquired image datasets for complex real-world problems such as anomaly detection, emotion recognition, medical image analysis, fraud detection, metallic surface defect detection, and disaster prediction is inevitable. The performance of computer vision algorithms can deteriorate significantly when the training dataset is imbalanced. In recent years, Generative Adversarial Networks (GANs) have gained immense attention from researchers across a variety of application domains due to their capability to model complex real-world image data. Notably, GANs can not only be used to generate synthetic images; their adversarial learning principle has also shown good potential for restoring balance in imbalanced datasets. In this paper, we examine the most recent developments of GAN-based techniques for addressing imbalance problems in image data. The real-world challenges and implementations of synthetic image generation based on GANs are extensively covered in this survey. Our survey first introduces various imbalance problems in computer vision tasks and their existing solutions, and then examines key concepts such as deep generative image models and GANs. After that, we propose a taxonomy that summarizes GAN-based techniques for addressing imbalance problems in computer vision tasks into three major categories: (1) image-level imbalances in classification, (2) object-level imbalances in object detection, and (3) pixel-level imbalances in segmentation tasks. We elaborate on the imbalance problems of each group and provide GAN-based solutions for each. Readers will understand how GAN-based techniques can handle imbalance problems and boost the performance of computer vision algorithms.
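
    As a concrete illustration of the rebalancing idea described above, the sketch below oversamples a minority class with an already-trained class-conditional generator; the generator `g`, its noise dimension, and the call signature `g(z, labels)` are assumptions for illustration, not an API defined by the survey.

```python
# Minimal sketch of GAN-based oversampling for an imbalanced image dataset
# (assumed PyTorch; `g` is a hypothetical trained conditional generator).
import torch

def oversample_minority(g, minority_label, n_needed, noise_dim=100, batch_size=64):
    """Draw synthetic images of one class from a trained conditional generator."""
    g.eval()
    chunks, n_done = [], 0
    with torch.no_grad():
        while n_done < n_needed:
            z = torch.randn(batch_size, noise_dim)                           # latent noise
            y = torch.full((batch_size,), minority_label, dtype=torch.long)  # fixed class label
            chunks.append(g(z, y))                                           # conditional samples G(z, y)
            n_done += batch_size
    # The returned images, labeled `minority_label`, are appended to the training
    # set so that class frequencies become closer to uniform.
    return torch.cat(chunks)[:n_needed]
```

    This corresponds to the image-level classification category of the survey's taxonomy, where synthetic minority-class samples are used to rebalance the training distribution before (or while) training a classifier.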