20 research outputs found

    MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Face Images

    Get PDF
    This paper is aimed at creating extremely small and fast convolutional neural networks (CNN) for the problem of facial expression recognition (FER) from frontal face images. To this end, we employed the popular knowledge distillation (KD) method and identified two major shortcomings with its use: 1) a fine-grained grid search is needed for tuning the temperature hyperparameter and 2) to find the optimal size-accuracy balance, one needs to search for the final network size (or the compression rate). On the other hand, KD is proved to be useful for model compression for the FER problem, and we discovered that its effects gets more and more significant with the decreasing model size. In addition, we hypothesized that translation invariance achieved using max-pooling layers would not be useful for the FER problem as the expressions are sensitive to small, pixel-wise changes around the eye and the mouth. However, we have found an intriguing improvement on generalization when max-pooling is used. We conducted experiments on two widely-used FER datasets, CK+ and Oulu-CASIA. Our smallest model (MicroExpNet), obtained using knowledge distillation, is less than 1MB in size and works at 1851 frames per second on an Intel i7 CPU. Despite being less accurate than the state-of-the-art, MicroExpNet still provides significant insights for designing a microarchitecture for the FER problem.Comment: International Conference on Image Processing Theory, Tools and Applications (IPTA) 2019 camera ready version. Codes are available at: https://github.com/cuguilke/microexpne

    A physiological signal database of children with different special needs for stress recognition

    Get PDF
    This study presents a new dataset AKTIVES for evaluating the methods for stress detection and game reaction using physiological signals. We collected data from 25 children with obstetric brachial plexus injury, dyslexia, and intellectual disabilities, and typically developed children during game therapy. A wristband was used to record physiological data (blood volume pulse (BVP), electrodermal activity (EDA), and skin temperature (ST)). Furthermore, the facial expressions of children were recorded. Three experts watched the children's videos, and physiological data is labeled "Stress/No Stress" and "Reaction/No Reaction", according to the videos. The technical validation supported high-quality signals and showed consistency between the experts.Scientific and Technological Research Council of Turkey Technology and Innovation Funding Programmes Directorat

    SRMAE: Masked Image Modeling for Scale-Invariant Deep Representations

    Full text link
    Due to the prevalence of scale variance in nature images, we propose to use image scale as a self-supervised signal for Masked Image Modeling (MIM). Our method involves selecting random patches from the input image and downsampling them to a low-resolution format. Our framework utilizes the latest advances in super-resolution (SR) to design the prediction head, which reconstructs the input from low-resolution clues and other patches. After 400 epochs of pre-training, our Super Resolution Masked Autoencoders (SRMAE) get an accuracy of 82.1% on the ImageNet-1K task. Image scale signal also allows our SRMAE to capture scale invariance representation. For the very low resolution (VLR) recognition task, our model achieves the best performance, surpassing DeriveNet by 1.3%. Our method also achieves an accuracy of 74.84% on the task of recognizing low-resolution facial expressions, surpassing the current state-of-the-art FMD by 9.48%
    corecore