Multi-Dataset Multi-Domain Multi-Task Network for Facial Expression Recognition and Age and Gender Estimation
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2019. 8. Cho, Nam Ik.
The convolutional neural network (CNN) works very well in many computer vision tasks, including face-related problems. However, for age estimation and facial expression recognition (FER), the accuracy of the CNN is still not good enough for real-world problems. The CNN does not seem to capture the subtle differences in the thickness and amount of wrinkles on the face,
which are essential features for age estimation and FER. Also, real-world face images vary widely in rotation and illumination, and the CNN is not robust at finding rotated objects when the training data do not cover every possible variation.
Moreover, Multi-Task Learning (MTL) based methods can be very helpful for real-time visual understanding of a dynamic scene, as they can perform several different perceptual tasks simultaneously and efficiently. Typical MTL methods require a dataset that contains the labels for all tasks together. However, as the target task becomes multi-faceted and more complicated, an unduly large dataset with stronger labels is sometimes required. Hence, the cost of generating the desired labeled data for complicated learning tasks is often an obstacle, especially for multi-task learning.
Therefore, to alleviate these problems, we first propose a few methods to improve single-task baseline performance using Gabor filters and capsule-based networks. We then propose a new semi-supervised learning method for face-related tasks based on Multi-Task Learning (MTL) and data distillation.
1. Introduction
1.1 Motivation
1.2 Background
1.2.1 Age and Gender Estimation
1.2.2 Facial Expression Recognition (FER)
1.2.3 Capsule Networks (CapsNet)
1.2.4 Semi-Supervised Learning
1.2.5 Multi-Task Learning
1.2.6 Knowledge and Data Distillation
1.2.7 Domain Adaptation
1.3 Datasets
2. GF-CapsNet: Using Gabor Jet and Capsule Networks for Face-Related Tasks
2.1 Feeding CNN with Hand-Crafted Features
2.1.1 Preparation of Input
2.1.2 Age and Gender Estimation using the Gabor Responses
2.2 GF-CapsNet
2.2.1 Modification of CapsNet
3. Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Framework to Solve Face-Related Tasks
3.1 MTL learning
3.2 Data Distillation
4. Experiments and Results
4.1 Experiments on GF-CNN and GF-CapsNet
4.2 GF-CNN Result
4.2.1 GF-CapsNet Results
4.3 Experiment on Distill-2MD-MTL
4.3.1 Semi-Supervised MTL
4.3.2 Cross Datasets Cross-Domain Evaluation
5. Conclusion
Abstract (In Korean)
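The abstract above notes that typical MTL methods need a dataset carrying the labels for every task. One common workaround when several datasets each provide only some of the labels is to mask out the missing ones when summing the per-task losses. The sketch below illustrates that idea in plain Python; it is not taken from the thesis, and the task names, the `MISSING` marker, and the function names are assumptions made for illustration.

```python
import math

MISSING = None  # hypothetical marker: this sample has no label for this task


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably via log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def masked_multitask_loss(logits_by_task, labels_by_task):
    """Sum of per-task mean losses, skipping samples whose label is missing."""
    total = 0.0
    for task, batch_logits in logits_by_task.items():
        losses = [
            cross_entropy(z, y)
            for z, y in zip(batch_logits, labels_by_task[task])
            if y is not MISSING
        ]
        if losses:  # a task with no labels in this batch contributes nothing
            total += sum(losses) / len(losses)
    return total


# Two samples drawn from a hypothetical dataset that only has gender labels.
logits = {
    "gender": [[0.0, 0.0], [2.0, -1.0]],
    "expression": [[0.0] * 7, [0.0] * 7],
}
labels = {
    "gender": [0, MISSING],
    "expression": [MISSING, MISSING],
}
loss = masked_multitask_loss(logits, labels)
```

Only the first sample's gender label contributes here, so the expression head receives no gradient signal from this batch. In a semi-supervised setting, the missing entries could later be filled with pseudo-labels predicted by a trained model.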
Feature Representation Learning with Adaptive Displacement Generation and Transformer Fusion for Micro-Expression Recognition
Micro-expressions are spontaneous, rapid and subtle facial movements that can
neither be forged nor suppressed. They are very important nonverbal
communication cues, but they are transient and of low intensity, and thus
difficult to recognize. Recently, deep learning based methods have been
developed for micro-expression (ME) recognition using feature extraction and
fusion techniques. However, feature learning and feature fusion tailored to
the ME characteristics still lack further study. To address these
issues, we propose a novel framework Feature Representation Learning with
adaptive Displacement Generation and Transformer fusion (FRL-DGT), in which a
convolutional Displacement Generation Module (DGM) with self-supervised
learning is used to extract dynamic features from onset/apex frames targeted to
the subsequent ME recognition task, and a well-designed Transformer Fusion
mechanism composed of three Transformer-based fusion modules (local, global
fusions based on AU regions and full-face fusion) is applied to extract the
multi-level informative features after DGM for the final ME prediction.
Extensive experiments with solid leave-one-subject-out (LOSO) evaluation
demonstrate the superiority of our proposed FRL-DGT over
state-of-the-art methods.
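The leave-one-subject-out (LOSO) protocol mentioned above holds out all samples of one subject for testing and trains on the rest, repeating this once per subject. A minimal sketch of how such folds could be generated follows. The function name and the toy subject IDs are assumptions for illustration and do not come from the paper.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = sorted(
            i
            for sid, idxs in by_subject.items()
            if sid != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


# Toy example: five samples recorded from three subjects.
subjects = ["s1", "s2", "s1", "s3", "s2"]
folds = list(leave_one_subject_out(subjects))
```

Because no subject appears in both the training and test indices of a fold, the reported accuracy reflects generalization to unseen people, not memorized identities.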