2,647 research outputs found

Affect recognition & generation in-the-wild

Author: Kollias Dimitrios
Publication venue: Computing, Imperial College London
Publication date: 01/01/2021

Affect recognition based on a subject’s facial expressions has been a topic of major research in the attempt to build machines that can understand the way subjects feel, act and react. In the past, because large amounts of data captured in real-life situations were unavailable, research mainly focused on controlled environments. Recently, however, social media and platforms have been widely used. Moreover, deep learning has emerged as a means to solve visual analysis and recognition problems. This Ph.D. thesis exploits these advances and makes significant contributions to affect analysis and recognition in-the-wild. We tackle affect analysis and recognition as a dual knowledge generation problem: i) we create new, large and rich in-the-wild databases and ii) we design and train novel deep neural architectures that are able to analyse affect over these databases and to successfully generalise their performance on other datasets. First, we present the creation of the Aff-Wild database, annotated according to valence-arousal, and an end-to-end CNN-RNN architecture, AffWildNet. Then we use AffWildNet as a robust prior for dimensional and categorical affect recognition and extend it by extracting low-/mid-/high-level latent information and analysing this via multiple RNNs. Additionally, we propose a novel loss function for DNN-based categorical affect recognition. Next, we generate Aff-Wild2, the first database containing annotations for all main behavior tasks: estimating valence-arousal, classifying into basic expressions, and detecting action units. We develop multi-task and multi-modal extensions of AffWildNet by fusing these tasks and propose a novel holistic approach that utilises all existing databases with non-overlapping annotations and couples them through co-annotation and distribution matching. Finally, we present an approach for facial affect synthesis in terms of either valence-arousal or basic expressions. We generate an image with a given affect, or a sequence of images with evolving affect, by annotating a 4-D database and utilising a 3-D morphable model.

Open Access

Spiral - Imperial College Digital Repository

MasqueArray: Automatic makeup selector/applicator

Author: Jeamsinkul Chujit
Publication venue: RIT Scholar Works
Publication date: 11/11/1998

Discusses the design of a computer that selects and applies makeup.

RIT Scholar Works

DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation

Authors: Li Zhifeng, Su Guinan, Yang Yanwu
Publication date: 12/11/2023

In recent years, audio-driven 3D facial animation has gained significant attention, particularly in applications such as virtual reality, gaming, and video conferencing. However, accurately modeling the intricate and subtle dynamics of facial expressions remains a challenge. Most existing studies treat facial animation as a single regression problem, an approach that often fails to capture the intrinsic inter-modal relationship between speech signals and 3D facial animation and overlooks their inherent consistency. Moreover, because 3D audio-visual datasets are scarce, approaches trained on small samples generalize poorly, which degrades performance. To address these issues, we propose a cross-modal dual-learning framework, termed DualTalker, which aims to improve data usage efficiency and to model cross-modal dependencies. The framework is trained jointly on the primary task (audio-driven facial animation) and its dual task (lip reading), and the two tasks share common audio/motion encoder components. This joint training uses data more efficiently by leveraging information from both tasks and explicitly capitalizing on the complementary relationship between facial motion and audio to improve performance. Furthermore, we introduce an auxiliary cross-modal consistency loss to mitigate the potential over-smoothing underlying the cross-modal complementary representations, enhancing the mapping of subtle facial expression dynamics. Through extensive experiments and a perceptual user study conducted on the VOCA and BIWI datasets, we demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. We have made our code and video demonstrations available at https://github.com/sabrina-su/iadf.git

arXiv.org e-Print Archive

NodKit: a framework for implementing additional gestures on the iOS platform, utilizing head recognition and tracking technology

Author: Θεοδωρίδης Αθανάσιος Θ.
Publication date: 01/01/2013

Note: supplementary material is available in a separate file

Facial Expression Recognition in the Presence of Head Motion

Authors: Fadi Dornaika, Franck Davoine
Publication venue: IntechOpen
Publication date: 01/05/2008