
    Audio feedback design: principles and emerging practice

    This paper considers the design of audio feedback as experienced in several faculties of a UK university and as identified in the literature. Several adaptable models are presented, including: 'personal tutor monologue' recorded at the PC by the tutor as part of the marking process; 'personal feedback conversations', recorded by the tutor or student(s) in the lab or studio to capture project discussions or studio 'crits'; 'broadcast feedback' targeted at large groups; 'peer audio feedback', in which students learn as they assess each other's work; 'tutor conversations', a 'common room conversation' approach designed to model critical thinking; and 'personal audio interventions', targeted at individuals to address emerging issues. The methods are introduced and evaluated according to their potential to formatively affect learning. Audio feedback design factors are outlined and practical recommendations are offered. The paper concludes that the use of audio feedback can promote a culture of dialogic engagement

    AUTOMATIC DUBBING OF VIDEOS WITH MULTIPLE SPEAKERS

    A machine-learning model is described that automatically converts the audio streams of audio-visual content from a source language to a destination language. In response to determining that an audio stream should be translated, a machine-learning-based dubbing model is invoked for the specific destination language. Where there are multiple speakers, voice embedding techniques are used to match dubbed audio streams to the corresponding speakers. The sentiment in the original speaker’s voice is preserved by training the model with a targeted data set in the destination language.
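    The abstract does not say how voice embeddings are compared. As a minimal sketch, assuming each speaker and each dubbed segment is already represented by a fixed-length embedding vector, and assuming cosine similarity as the matching score (both are illustrative assumptions, and the names `cosine_similarity` and `match_speaker` are hypothetical), the matching step could look like this:

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero vectors")
    return dot / (norm_a * norm_b)


def match_speaker(segment_embedding, speaker_embeddings):
    """Return the name of the speaker whose embedding is closest to the segment.

    speaker_embeddings is a hypothetical dict mapping speaker name -> vector.
    Cosine similarity is an assumed scoring choice, not taken from the source.
    """
    return max(
        speaker_embeddings,
        key=lambda name: cosine_similarity(
            segment_embedding, speaker_embeddings[name]
        ),
    )


# Toy three-dimensional embeddings, invented for illustration only.
speakers = {"speaker_a": [1.0, 0.0, 0.0], "speaker_b": [0.0, 1.0, 0.0]}
print(match_speaker([0.9, 0.1, 0.0], speakers))
```

    Real voice embeddings would come from a trained speaker encoder and have far more dimensions; only the nearest-embedding lookup is shown here.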

    Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

    We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's implementation DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.

    Adversarial Black-Box Attacks on Automatic Speech Recognition Systems using Multi-Objective Evolutionary Optimization

    Fooling deep neural networks with adversarial input has exposed a significant vulnerability in the current state-of-the-art systems in multiple domains. Both black-box and white-box approaches have been used to either replicate the model itself or to craft examples which cause the model to fail. In this work, we propose a framework which uses multi-objective evolutionary optimization to perform both targeted and un-targeted black-box attacks on Automatic Speech Recognition (ASR) systems. We apply this framework to two ASR systems, DeepSpeech and Kaldi-ASR, and it increases the Word Error Rates (WER) of these systems by up to 980%, indicating the potency of our approach. During both un-targeted and targeted attacks, the adversarial samples maintain a high acoustic similarity of 0.98 and 0.97 with the original audio. Comment: Published in Interspeech 201
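    For readers unfamiliar with the metric, WER is conventionally the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The sketch below shows that standard definition. The helper `relative_increase_percent` reflects one possible reading of "increases by up to 980%" (a relative increase over a baseline WER), which is an assumption; the abstract does not state how the percentage was computed, and the numbers used are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_increase_percent(baseline: float, attacked: float) -> float:
    """Relative change from baseline to attacked, in percent (assumed reading)."""
    return (attacked - baseline) / baseline * 100


# One substituted word out of three reference words.
print(word_error_rate("the cat sat", "the dog sat"))
# Hypothetical baseline and attacked WER values, chosen only to illustrate.
print(relative_increase_percent(0.05, 0.54))
```

    Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.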