Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives
Over the past few years, adversarial training has become an extremely active
research topic and has been successfully applied to various Artificial
Intelligence (AI) domains. Because adversarial training is a potentially
crucial technique for developing the next generation of emotional AI systems,
we herein provide a comprehensive overview of the application of adversarial training to affective
computing and sentiment analysis. Various representative adversarial training
algorithms are explained and discussed accordingly, aimed at tackling diverse
challenges associated with emotional AI systems. Further, we highlight a range
of potential future research directions. We expect that this overview will help
facilitate the development of adversarial training for affective computing and
sentiment analysis in both the academic and industrial communities.
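As a concrete illustration of the kind of perturbation that adversarial training builds on, here is a minimal fast-gradient-sign sketch for a simple logistic model. This is a simplification for illustration only (the surveyed systems use deep networks); the function names are ours, not from the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_example(x, y, w, eps=0.1):
    # For a logistic model p(y|x) = sigmoid(y * w.x), move x by eps in the
    # sign of the loss gradient, i.e. the direction that increases the loss.
    margin = y * np.dot(w, x)
    grad_x = -y * sigmoid(-margin) * w  # d(-log sigmoid(margin)) / dx
    return x + eps * np.sign(grad_x)
```

Adversarial training then mixes such perturbed inputs back into the training set so the model learns to classify them correctly.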
Meta Transfer Learning for Facial Emotion Recognition
The use of deep learning techniques for automatic facial expression
recognition has recently attracted great interest, but the developed models
still struggle to generalize well due to the lack of large emotion datasets for
deep learning. To overcome this problem, in this paper, we propose utilizing a
novel transfer learning approach relying on PathNet and investigate how
knowledge can be accumulated within a given dataset and how the knowledge
captured from one emotion dataset can be transferred into another in order to
improve the overall performance. To evaluate the robustness of our system, we
have conducted various sets of experiments on two emotion datasets: SAVEE and
eNTERFACE. The experimental results demonstrate that our proposed system leads
to improved emotion recognition performance and performs significantly
better than recent state-of-the-art schemes that adopt
fine-tuning/pre-trained approaches.
Towards a General Model of Knowledge for Facial Analysis by Multi-Source Transfer Learning
This paper takes a step toward general models of knowledge for facial analysis by addressing the question of multi-source transfer learning. More precisely, the proposed approach consists of two successive training steps: the first applies a combination operator to define a common embedding for the multiple sources, materialized by different existing trained models. The proposed operator relies on an auto-encoder, trained on a large dataset, that is efficient in terms of both compression ratio and transfer learning performance. In a second step, we exploit a distillation approach to obtain a lightweight student model that mimics the collection of fused existing models. This model outperforms its teacher on novel tasks, achieving results on par with state-of-the-art methods on 15 facial analysis tasks (and domains) at an affordable training cost. Moreover, this student has 75 times fewer parameters than the original teacher and can be applied to a variety of novel face-related tasks.
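The distillation step described above can be sketched minimally: the student is trained to match the teacher's softened output distribution. The function names and the temperature value below are illustrative, not from the paper:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; a higher T yields softer distributions.
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy between the teacher's softened outputs (the targets)
    # and the student's softened outputs: the core distillation term.
    targets = softmax(teacher_logits, T)
    log_student = np.log(softmax(student_logits, T))
    return float(-(targets * log_student).sum(axis=-1).mean())
```

The loss is minimized exactly when the student's distribution matches the teacher's, which is what lets the lightweight student absorb the fused teachers' behavior.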
Multi-Dataset Multi-Domain Multi-Task Network for Facial Expression Recognition, Age, and Gender Estimation
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Information Engineering, August 2019. Advisor: Cho, Nam Ik.
The convolutional neural network (CNN) works very well in many computer vision tasks, including face-related problems. However, in the case of age estimation and facial expression recognition (FER), the accuracy provided by the CNN is still not good enough for real-world problems. The CNN does not readily capture the subtle differences in the thickness and amount of wrinkles on the face,
which are essential features for age estimation and FER. Also, face images in the real world show large variations due to face rotation and illumination, and the CNN is not robust at finding rotated objects when not every possible variation appears in the training data.
Moreover, Multi-Task Learning (MTL) based methods can be very helpful for achieving real-time visual understanding of a dynamic scene, as they are able to perform several different perceptual tasks simultaneously and efficiently. Typical MTL methods require constructing a dataset that contains all the labels for the different tasks together. However, as the target task becomes multi-faceted and more complicated, an unduly large dataset with stronger labels is sometimes required. Hence, the cost of generating the desired labeled data for complicated learning tasks is often an obstacle, especially for multi-task learning.
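A minimal sketch of the shared-backbone multi-task setup this paragraph describes (the layer sizes and task names are illustrative, not the thesis architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared backbone and one small head per task: every forward pass
# reuses the same features, so the tasks are solved jointly.
W_shared = rng.normal(size=(16, 8))
heads = {"age": rng.normal(size=(8, 1)),         # regression head
         "expression": rng.normal(size=(8, 7))}  # 7 expression classes

def forward(x, task):
    h = np.maximum(x @ W_shared, 0.0)  # shared ReLU features
    return h @ heads[task]             # task-specific projection

batch = rng.normal(size=(4, 16))       # a batch of 4 face embeddings
```

Because only the heads are task-specific, training requires labels for each task, which is where the labeling-cost problem above comes from.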
Therefore, to alleviate these problems, we first propose methods that improve single-task baseline performance using Gabor filters and capsule-based networks. We then propose a new semi-supervised learning method for face-related tasks based on Multi-Task Learning (MTL) and data distillation.
1 INTRODUCTION
1.1 Motivation
1.2 Background
1.2.1 Age and Gender Estimation
1.2.2 Facial Expression Recognition (FER)
1.2.3 Capsule Networks (CapsNet)
1.2.4 Semi-Supervised Learning
1.2.5 Multi-Task Learning
1.2.6 Knowledge and Data Distillation
1.2.7 Domain Adaptation
1.3 Datasets
2. GF-CapsNet: Using Gabor Jet and Capsule Networks for Face-Related Tasks
2.1 Feeding CNN with Hand-Crafted Features
2.1.1 Preparation of Input
2.1.2 Age and Gender Estimation using the Gabor Responses
2.2 GF-CapsNet
2.2.1 Modification of CapsNet
3. Distill-2MD-MTL: Data Distillation Based on a Multi-Dataset Multi-Domain Multi-Task Framework to Solve Face-Related Tasks
3.1 MTL Learning
3.2 Data Distillation
4. Experiments and Results
4.1 Experiments on GF-CNN and GF-CapsNet
4.2 GF-CNN Results
4.2.1 GF-CapsNet Results
4.3 Experiment on Distill-2MD-MTL
4.3.1 Semi-Supervised MTL
4.3.2 Cross-Dataset Cross-Domain Evaluation
5. Conclusion
Abstract (In Korean)
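The data-distillation step the thesis abstract describes, ensembling a teacher's predictions over simple transforms of unlabeled images to produce pseudo-labels for semi-supervised training, can be sketched as follows (the transforms, threshold, and helper names are illustrative assumptions, not the thesis implementation):

```python
import numpy as np

def pseudo_label(teacher, images, threshold=0.7):
    # Data distillation: run the teacher on simple transforms of each
    # unlabeled image, average the predicted class distributions, and
    # keep only confident predictions as pseudo-labels for the student.
    transforms = (lambda im: im,                              # identity
                  lambda im: im[:, ::-1],                     # horizontal flip
                  lambda im: np.clip(im * 1.1, 0.0, 1.0))     # brightness shift
    labels = []
    for im in images:
        probs = np.mean([teacher(t(im)) for t in transforms], axis=0)
        labels.append(int(np.argmax(probs)) if probs.max() >= threshold else None)
    return labels
```

The confident pseudo-labels are then mixed with the real labels when training the multi-task student, which reduces the labeled-data cost discussed in the abstract.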
- โฆ