Search CORE

200,587 research outputs found

Audio feedback design: principles and emerging practice

Author: Andrew Middleton
Anne Nortcliffe
Publication venue: 'Inderscience Publishers'
Publication date: 01/01/2010

This paper considers the design of audio feedback as experienced in several faculties of a UK university and as identified in the literature. Several adaptable models are presented, including: 'personal tutor monologue', recorded at the PC by the tutor as part of the marking process; 'personal feedback conversations', recorded by the tutor or student(s) in the lab or studio to capture project discussions or studio 'crits'; 'broadcast feedback', targeted at large groups; 'peer audio feedback', in which students learn as they assess each other's work; 'tutor conversations', a 'common room conversation' approach designed to model critical thinking; and 'personal audio interventions', targeted at individuals to address emerging issues. The methods are introduced and evaluated according to their potential to formatively affect learning. Audio feedback design factors are outlined and practical recommendations are offered. The paper concludes that the use of audio feedback can promote a culture of dialogic engagement.

Crossref

Sheffield Hallam University Research Archive

AUTOMATIC DUBBING OF VIDEOS WITH MULTIPLE SPEAKERS

Author: Barekatain Mohammadamin
Feuz Sandro
Publication venue: Technical Disclosure Commons
Publication date: 14/12/2018

A machine-learning model that automatically converts the audio streams of audio-visual content from a source language to a destination language is described. In response to determining that an audio stream should be translated, a machine-learning-based dubbing model is invoked for a specific destination language. In the case of multiple speakers, voice embedding techniques are used to match dubbed audio streams to the corresponding speakers. The sentiment in the original speaker's voice is preserved by training the model with a targeted data set in the destination language.

Technical Disclosure Commons

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Author: Carlini Nicholas
Wagner David
Publication date: 29/03/2018

We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's implementation of DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.

arXiv.org e-Print Archive

Crossref

Adversarial Black-Box Attacks on Automatic Speech Recognition Systems using Multi-Objective Evolutionary Optimization

Author: Aralikatte Rahul
Khare Shreya
Mani Senthil
Publication date: 03/07/2019

Fooling deep neural networks with adversarial input has exposed a significant vulnerability in the current state-of-the-art systems in multiple domains. Both black-box and white-box approaches have been used to either replicate the model itself or to craft examples which cause the model to fail. In this work, we propose a framework which uses multi-objective evolutionary optimization to perform both targeted and un-targeted black-box attacks on Automatic Speech Recognition (ASR) systems. We apply this framework to two ASR systems, DeepSpeech and Kaldi-ASR, increasing the Word Error Rates (WER) of these systems by up to 980%, indicating the potency of our approach. During both un-targeted and targeted attacks, the adversarial samples maintain a high acoustic similarity of 0.98 and 0.97 with the original audio. Comment: Published in Interspeech 201
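
The abstract above reports its results as Word Error Rate. As a reader's aid, here is a minimal sketch of the standard word-level WER definition (word edit distance divided by the number of reference words). It is not the authors' evaluation code: the function name `word_error_rate` and the example sentences are illustrative assumptions, and real toolkits usually normalize case and punctuation before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (name and whitespace tokenization are assumptions).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, not data from the paper.
    print(word_error_rate("open the door", "open a door"))
    print(word_error_rate("yes", "no no no no"))
```

Because inserted words count as errors, WER is not capped at 100%: the second call scores one substitution plus three insertions against a one-word reference.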

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System