Seeing voices and hearing voices: learning discriminative embeddings using cross-modal self-supervision
The goal of this work is to train discriminative cross-modal embeddings
without access to manually annotated data. Recent advances in self-supervised
learning have shown that effective representations can be learnt from natural
cross-modal synchrony. We build on earlier work to train embeddings that are
more discriminative for uni-modal downstream tasks. To this end, we propose a
novel training strategy that not only optimises metrics across modalities, but
also enforces intra-class feature separation within each of the modalities. The
effectiveness of the method is demonstrated on two downstream tasks: lip
reading using the features trained on audio-visual synchronisation, and speaker
recognition using the features trained for cross-modal biometric matching. The
proposed method outperforms state-of-the-art self-supervised baselines by a
significant margin. Comment: Under submission as a conference paper
Disentangled Speech Embeddings using Cross-modal Self-supervision
The objective of this paper is to learn representations of speaker identity
without access to manually annotated data. To do so, we develop a
self-supervised learning objective that exploits the natural cross-modal
synchrony between faces and audio in video. The key idea behind our approach is
to tease apart--without annotation--the representations of linguistic content
and speaker identity. We construct a two-stream architecture which: (1) shares
low-level features common to both representations; and (2) provides a natural
mechanism for explicitly disentangling these factors, offering the potential
for greater generalisation to novel combinations of content and identity and
ultimately producing speaker identity representations that are more robust. We
train our method on a large-scale audio-visual dataset of talking heads 'in the
wild', and demonstrate its efficacy by evaluating the learned speaker
representations for standard speaker recognition performance. Comment: ICASSP 2020. The first three authors contributed equally to this work
Automatic face recognition of video sequences using self-eigenfaces
The objective of this work is to provide an efficient face recognition scheme useful for video indexing applications. In
particular we are addressing the following problem: given a set of known images and given a video sequence to be
indexed, find where the corresponding persons appear in the sequence. Conventional face detection schemes are not
well suited for this application and alternate and more efficient schemes have to be developed. In this paper we have
modified our original generic eigenface-based recognition scheme presented in [1] by introducing the concept of self-eigenfaces.
The resulting scheme is very efficient at finding specific face images and coping with the different face
conditions present in a video sequence. The main and final objective is to develop a tool to be used in the MPEG-7
standardization effort to help video indexing activities. Good results have been obtained using the video test sequences
used in the MPEG-7 evaluation group. Peer Reviewed. Postprint (published version)
Interventions for adjustment, impaired self-awareness and empathy
No abstract available
You can go your own way: effectiveness of participant-driven versus experimenter-driven processing strategies in memory training and transfer
Cognitive training programs that instruct specific strategies frequently show limited transfer. Open-ended approaches can achieve greater transfer, but may fail to benefit many older adults due to age deficits in self-initiated processing. We examined whether a compromise that encourages effort at encoding without an experimenter-prescribed strategy might yield better results. Older adults completed memory training under conditions that either (1) mandated a specific strategy to increase deep, associative encoding, (2) attempted to suppress such encoding by mandating rote rehearsal, or (3) encouraged time and effort toward encoding but allowed for strategy choice. The experimenter-enforced associative encoding strategy succeeded in creating integrated representations of studied items, but training-task progress was related to pre-existing ability. Independent of condition assignment, self-reported deep encoding was associated with positive training and transfer effects, suggesting that the most beneficial outcomes occur when environmental support guiding effort is provided but participants generate their own strategies.