Learnable PINs: Cross-Modal Embeddings for Person Identity

Authors: Samuel Albanie, Arsha Nagrani, Andrew Zisserman
Publication date: 01/01/2018

We propose and investigate an identity-sensitive joint embedding of face and voice. Such an embedding enables cross-modal retrieval from voice to face and from face to voice. We make the following four contributions: first, we show that the embedding can be learnt from videos of talking faces, without requiring any identity labels, using a form of cross-modal self-supervision; second, we develop a curriculum learning schedule for hard negative mining targeted to this task, which is essential for learning to proceed successfully; third, we demonstrate and evaluate cross-modal retrieval for identities unseen and unheard during training over a number of scenarios and establish a benchmark for this novel task; finally, we show an application of the joint embedding to automatically retrieving and labelling characters in TV dramas.

Comment: To appear in ECCV 201

arXiv.org e-Print Archive

Oxford University Research Archive

Implicit fusion by joint audiovisual training for emotion recognition in mono modality

Authors: Jing Han, Zhao Ren, Björn Schuller, Zixing Zhang
Publication date: 01/01/2019

A paper in ICASSP 201

OPUS Augsburg

Crossref

ZENODO