47,239 research outputs found

    Multi-Dataset Multi-Domain Multi-Task Network for Facial Expression Recognition and Age and Gender Estimation

    Thesis (Master's), Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2019. Cho, Nam Ik.

    The convolutional neural network (CNN) works very well in many computer vision tasks, including face-related problems. However, for age estimation and facial expression recognition (FER), the accuracy provided by the CNN is still not good enough for real-world problems. The CNN does not seem to capture the subtle differences in the thickness and amount of wrinkles on the face, which are essential features for age estimation and FER. Also, face images in the real world vary widely due to face rotation and illumination, and the CNN is not robust in finding rotated objects when not every possible variation is in the training data.

    Moreover, multi-task learning (MTL) based methods can be very helpful for real-time visual understanding of a dynamic scene, as they can perform several different perceptual tasks simultaneously and efficiently. In typical MTL methods, we need to construct a dataset that contains all the labels for the different tasks together. However, as the target task becomes multi-faceted and more complicated, an unduly large dataset with stronger labels is sometimes required. Hence, the cost of generating the desired labeled data for complicated learning tasks is often an obstacle, especially for multi-task learning.

    To alleviate these problems, we first propose a few methods to improve single-task baseline performance using Gabor filters and capsule-based networks. We then propose a new semi-supervised learning method for face-related tasks based on multi-task learning (MTL) and data distillation.

    1. Introduction
        1.1 Motivation
        1.2 Background
            1.2.1 Age and Gender Estimation
            1.2.2 Facial Expression Recognition (FER)
            1.2.3 Capsule Networks (CapsNet)
            1.2.4 Semi-Supervised Learning
            1.2.5 Multi-Task Learning
            1.2.6 Knowledge and Data Distillation
            1.2.7 Domain Adaptation
        1.3 Datasets
    2. GF-CapsNet: Using Gabor Jet and Capsule Networks for Face-Related Tasks
        2.1 Feeding CNN with Hand-Crafted Features
            2.1.1 Preparation of Input
            2.1.2 Age and Gender Estimation using the Gabor Responses
        2.2 GF-CapsNet
            2.2.1 Modification of CapsNet
    3. Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Framework to Solve Face-Related Tasks
        3.1 MTL Learning
        3.2 Data Distillation
    4. Experiments and Results
        4.1 Experiments on GF-CNN and GF-CapsNet
        4.2 GF-CNN Result
            4.2.1 GF-CapsNet Results
        4.3 Experiment on Distill-2MD-MTL
            4.3.1 Semi-Supervised MTL
            4.3.2 Cross Datasets Cross-Domain Evaluation
    5. Conclusion
    Abstract (In Korean)
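The abstract above proposes feeding Gabor filter responses to a CNN so that fine texture such as wrinkles is easier to pick up. As a rough illustration only, the sketch below builds a small bank of real-valued Gabor kernels at several orientations and applies one to an image. The function names (`gabor_kernel`, `gabor_bank`, `filter_valid`) and all parameter values are assumptions made for this example; the listing does not give the thesis's actual filter settings or input layout.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor filter on a size x size grid (size must be odd).

    All parameter values are illustrative, not taken from the thesis.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the sampling grid by theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope times a cosine carrier along the rotated x axis.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size=9, wavelength=4.0, sigma=2.0, n_orientations=4):
    """Kernels at evenly spaced orientations in [0, pi)."""
    return [
        gabor_kernel(size, wavelength, k * math.pi / n_orientations, sigma)
        for k in range(n_orientations)
    ]


def filter_valid(image, kernel):
    """'Valid' 2D cross-correlation of a grayscale image (list of rows)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]
```

One plausible use, again an assumption rather than the author's pipeline, is to compute `[filter_valid(face, k) for k in gabor_bank()]` and stack the resulting response maps as extra input channels next to the raw face image.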

    Feature Representation Learning with Adaptive Displacement Generation and Transformer Fusion for Micro-Expression Recognition

    Micro-expressions are spontaneous, rapid and subtle facial movements that can neither be forged nor suppressed. They are very important nonverbal communication clues, but are transient and of low intensity, and thus difficult to recognize. Recently, deep learning based methods have been developed for micro-expression (ME) recognition using feature extraction and fusion techniques; however, targeted feature learning and efficient feature fusion tailored to ME characteristics still lack further study. To address these issues, we propose a novel framework, Feature Representation Learning with adaptive Displacement Generation and Transformer fusion (FRL-DGT), in which a convolutional Displacement Generation Module (DGM) with self-supervised learning is used to extract dynamic features from onset/apex frames targeted to the subsequent ME recognition task, and a well-designed Transformer Fusion mechanism composed of three Transformer-based fusion modules (local and global fusion based on AU regions, and full-face fusion) is applied to extract the multi-level informative features after DGM for the final ME prediction. Extensive experiments with solid leave-one-subject-out (LOSO) evaluation results have demonstrated the superiority of our proposed FRL-DGT to state-of-the-art methods.
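The leave-one-subject-out (LOSO) protocol mentioned in this abstract holds out all samples of one subject per fold, so a model is never tested on a person it saw during training. A minimal sketch of how such folds can be generated is shown below; the helper name `loso_splits` and the toy subject IDs are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        # Train on every sample that belongs to any other subject.
        train_idx = sorted(
            i
            for subject, indices in by_subject.items()
            if subject != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx


# Toy example: five samples from three subjects.
folds = list(loso_splits(["s1", "s2", "s1", "s3", "s2"]))
```

Each sample lands in exactly one test fold, and results are usually aggregated over all folds; which metric is aggregated is left open here because the listing does not state it.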