    Seeing voices and hearing voices: learning discriminative embeddings using cross-modal self-supervision

    The goal of this work is to train discriminative cross-modal embeddings without access to manually annotated data. Recent advances in self-supervised learning have shown that effective representations can be learnt from natural cross-modal synchrony. We build on earlier work to train embeddings that are more discriminative for uni-modal downstream tasks. To this end, we propose a novel training strategy that not only optimises metrics across modalities, but also enforces intra-class feature separation within each of the modalities. The effectiveness of the method is demonstrated on two downstream tasks: lip reading using the features trained on audio-visual synchronisation, and speaker recognition using the features trained for cross-modal biometric matching. The proposed method outperforms state-of-the-art self-supervised baselines by a significant margin.
    Comment: Under submission as a conference paper.

    Disentangled Speech Embeddings using Cross-modal Self-supervision

    The objective of this paper is to learn representations of speaker identity without access to manually annotated data. To do so, we develop a self-supervised learning objective that exploits the natural cross-modal synchrony between faces and audio in video. The key idea behind our approach is to tease apart, without annotation, the representations of linguistic content and speaker identity. We construct a two-stream architecture which: (1) shares low-level features common to both representations; and (2) provides a natural mechanism for explicitly disentangling these factors, offering the potential for greater generalisation to novel combinations of content and identity and ultimately producing speaker identity representations that are more robust. We train our method on a large-scale audio-visual dataset of talking heads 'in the wild', and demonstrate its efficacy by evaluating the learned speaker representations for standard speaker recognition performance.
    Comment: ICASSP 2020. The first three authors contributed equally to this work.

    Automatic face recognition of video sequences using self-eigenfaces

    The objective of this work is to provide an efficient face recognition scheme useful for video indexing applications. In particular, we address the following problem: given a set of known images and a video sequence to be indexed, find where the corresponding persons appear in the sequence. Conventional face detection schemes are not well suited to this application, and alternative, more efficient schemes have to be developed. In this paper we modify our original generic eigenface-based recognition scheme presented in [1] by introducing the concept of self-eigenfaces. The resulting scheme is very efficient at finding specific face images and at coping with the different face conditions present in a video sequence. The main and final objective is to develop a tool to be used in the MPEG-7 standardization effort to help video indexing activities. Good results have been obtained using the video test sequences used in the MPEG-7 evaluation group.
    Peer reviewed. Postprint (published version).

    Interventions for adjustment, impaired self-awareness and empathy

    No abstract available

    You can go your own way: effectiveness of participant-driven versus experimenter-driven processing strategies in memory training and transfer

    Cognitive training programs that instruct specific strategies frequently show limited transfer. Open-ended approaches can achieve greater transfer, but may fail to benefit many older adults due to age deficits in self-initiated processing. We examined whether a compromise that encourages effort at encoding without an experimenter-prescribed strategy might yield better results. Older adults completed memory training under conditions that either (1) mandated a specific strategy to increase deep, associative encoding, (2) attempted to suppress such encoding by mandating rote rehearsal, or (3) encouraged time and effort toward encoding but allowed for strategy choice. The experimenter-enforced associative encoding strategy succeeded in creating integrated representations of studied items, but training-task progress was related to pre-existing ability. Independent of condition assignment, self-reported deep encoding was associated with positive training and transfer effects, suggesting that the most beneficial outcomes occur when environmental support guiding effort is provided but participants generate their own strategies.