Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks
We present a new loss function, namely Wing loss, for robust facial landmark localisation with Convolutional Neural Networks (CNNs). We first compare and analyse different loss functions, including L2, L1 and smooth L1. The analysis of these loss functions suggests that, for the training of a CNN-based localisation model, more attention should be paid to small and medium range errors. To this end, we design a piece-wise loss function. The new loss amplifies the impact of errors in the interval (-w, w) by switching from the L1 loss to a modified logarithm function. To address the problem of under-representation of samples with large out-of-plane head rotations in the training set, we propose a simple but effective boosting strategy, referred to as pose-based data balancing. In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them with random image rotation, bounding box translation and other data augmentation approaches. Last, the proposed approach is extended to create a two-stage framework for robust facial landmark localisation. The experimental results obtained on AFLW and 300W demonstrate the merits of the Wing loss function, and prove the superiority of the proposed method over state-of-the-art approaches.
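The piece-wise definition described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the log branch is applied inside (-w, w), the linear L1 branch outside, and a constant C joins the two pieces continuously. The default values for w and epsilon are illustrative assumptions.

```python
import numpy as np

def wing_loss(errors, w=10.0, epsilon=2.0):
    """Wing loss sketch: amplifies small and medium range errors.

    For |x| < w:  w * ln(1 + |x| / epsilon)
    Otherwise:    |x| - C, with C chosen so the two pieces meet at |x| = w.
    """
    x = np.abs(np.asarray(errors, dtype=float))
    C = w - w * np.log(1.0 + w / epsilon)  # continuity constant
    return np.where(x < w, w * np.log(1.0 + x / epsilon), x - C)
```

Compared with plain L1, the logarithmic branch has a steeper gradient near zero, which is what shifts the training focus toward small and medium errors.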
The Heat is On: Thermal Facial Landmark Tracking
Facial landmark tracking for thermal imagery requires locating important
regions of subjects' faces in thermal images, which omit lighting and
shading but show the temperatures of their subjects. The fluctuations of
heat in particular places reflect physiological changes such as blood flow
and perspiration, which can be used to remotely gauge states such as
anxiety and excitement. Past work in this domain has been limited to a
narrow set of architectures and techniques. This work goes further by
evaluating a comprehensive suite of models with different components, such
as residual connections, channel- and feature-wise attention, and the
practice of ensembling components of the network to work in parallel. The
best model integrated convolutional and residual layers followed by a
channel-wise self-attention layer, requiring fewer than 100K parameters
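The abstract names a channel-wise self-attention layer but does not specify its exact form; one common formulation is squeeze-and-excitation style gating, sketched below in NumPy as an illustrative assumption. Each channel is pooled to a scalar, passed through a small bottleneck, and the resulting sigmoid gate rescales that channel.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Illustrative channel-wise attention (squeeze-and-excitation style).

    feature_map: array of shape (C, H, W)
    w1: (C, C // r) reduction weights; w2: (C // r, C) expansion weights
    """
    c = feature_map.shape[0]
    squeezed = feature_map.reshape(c, -1).mean(axis=1)  # global average pool -> (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)             # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))        # sigmoid gates in (0, 1)
    return feature_map * gates[:, None, None]           # rescale each channel
```

Because the attention parameters scale with the channel count rather than the spatial resolution, a layer like this adds very few parameters, consistent with the sub-100K budget mentioned above.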