Integrated Face Analytics Networks through Cross-Dataset Hybrid Training
Face analytics benefits many multimedia applications. It consists of a number
of tasks, such as facial emotion recognition and face parsing, but most
existing approaches treat these tasks independently, which limits their
deployment in real scenarios. In this paper, we propose an integrated Face
Analytics Network (iFAN), which performs multiple face analytics tasks jointly
using a novel, carefully designed network architecture that facilitates
informative interaction among the tasks. The proposed integrated network
explicitly models the interactions between tasks so that their correlations
can be fully exploited to boost performance. In
addition, to address the lack of datasets with comprehensive training data
for all tasks, we propose a novel cross-dataset hybrid training strategy. It
allows "plug-in and play" of multiple datasets annotated for different tasks
without requiring a common dataset that is fully labeled for all the tasks.
We experimentally show that the proposed iFAN achieves
state-of-the-art performance on multiple face analytics tasks using a single
integrated model. Specifically, iFAN achieves an overall F-score of 91.15% on
the Helen dataset for face parsing, a normalized mean error of 5.81% on the
MTFL dataset for facial landmark localization and an accuracy of 45.73% on the
BNU dataset for emotion recognition with a single model.
Comment: 10 page
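The abstract does not spell out how datasets annotated for different tasks are combined. One common way to train on partially labeled data is to mask each task's loss so that it is computed only on samples carrying a label for that task. The sketch below illustrates that idea only; the function name `hybrid_loss`, the task names, and the per-task averaging are assumptions for illustration, not the authors' implementation.

```python
from typing import Dict, List


def hybrid_loss(per_task_losses: Dict[str, List[float]],
                has_label: Dict[str, List[bool]]) -> float:
    """Sum per-task mean losses, averaging each task over labeled samples only.

    per_task_losses[task][i] is the loss of sample i on that task.
    has_label[task][i] says whether sample i is annotated for that task.
    (Hypothetical interface, assumed for illustration.)
    """
    total = 0.0
    for task, losses in per_task_losses.items():
        # Keep only the losses of samples that are annotated for this task.
        labeled = [loss for loss, ok in zip(losses, has_label[task]) if ok]
        if labeled:  # a task with no labels in this batch contributes nothing
            total += sum(labeled) / len(labeled)
    return total


# A mixed batch: sample 0 comes from a parsing dataset, sample 1 from a
# landmark dataset. Losses on unlabeled entries are ignored.
batch_losses = {"parsing": [0.5, 9.0], "landmarks": [7.0, 0.25]}
batch_labels = {"parsing": [True, False], "landmarks": [False, True]}
print(hybrid_loss(batch_losses, batch_labels))
```

With this kind of masking, a batch can mix samples from several datasets, and no sample needs labels for every task.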
Dynamic Face Video Segmentation via Reinforcement Learning
For real-time semantic video segmentation, most recent works utilised a
dynamic framework with a key scheduler to make online key/non-key decisions.
Some works used a fixed key scheduling policy, while others proposed adaptive
key scheduling methods based on heuristic strategies, both of which may lead to
suboptimal global performance. To overcome this limitation, we model the online
key decision process in dynamic video segmentation as a deep reinforcement
learning problem and learn an efficient and effective scheduling policy from
expert information about decision history and from the process of maximising
global return. Moreover, we study the application of dynamic video segmentation
to face videos, a field that has not been investigated before. Evaluating on
the 300VW dataset, we show that our reinforcement key scheduler outperforms
various baselines in terms of both effective key selections and running speed.
Further results on the Cityscapes dataset
demonstrate that our proposed method can also generalise to other scenarios. To
the best of our knowledge, this is the first work to use reinforcement learning
for online key-frame decision in dynamic video segmentation, and also the first
work on its application to face videos.
Comment: CVPR 2020. 300VW with segmentation labels is available at:
https://github.com/mapleandfire/300VW-Mas
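The online key/non-key decision loop described in the abstract can be sketched as below. This is a toy illustration only: frames are stand-in scalars, and `threshold_policy` is a hypothetical fixed heuristic used as a placeholder where the paper learns the policy with deep reinforcement learning. All names here are assumptions.

```python
from typing import Callable, List, Optional, Sequence


def run_dynamic_segmentation(
    frames: Sequence[float],
    is_key: Callable[[float, float], bool],
) -> List[str]:
    """Mark each frame 'key' or 'non-key' with an online scheduler.

    In a real system, a key frame would be processed by the full segmentation
    network, and a non-key frame would reuse features from the last key frame.
    """
    decisions: List[str] = []
    last_key: Optional[float] = None
    for frame in frames:
        # The first frame must be a key frame: nothing to propagate from yet.
        if last_key is None or is_key(last_key, frame):
            last_key = frame
            decisions.append("key")
        else:
            decisions.append("non-key")
    return decisions


def threshold_policy(last_key: float, frame: float) -> bool:
    """Placeholder heuristic: re-key when the frame drifts far from the last key."""
    return abs(frame - last_key) > 0.5


print(run_dynamic_segmentation([0.0, 0.2, 0.9, 1.0, 1.6], threshold_policy))
```

Swapping `threshold_policy` for a learned policy leaves the loop unchanged, since the scheduler is passed in as a function.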