97 research outputs found
Robust excitation-based features for Automatic Speech Recognition
In this paper we investigate the use of noise-robust features characterizing the speech excitation signal as complementary to the commonly used vocal-tract-based features for automatic speech recognition (ASR). The features are tested in a state-of-the-art Deep Neural Network (DNN) based hybrid acoustic model for speech recognition. The suggested excitation features expand the set of excitation features previously considered for ASR, with the expectation that they help better discriminate the broad phonetic classes (e.g., fricatives, nasals, vowels). Relative improvements in the word error rate are observed in the AMI meeting transcription system, with greater gains (about 5%) when PLP features are combined with the suggested excitation features. For Aurora 4, significant improvements are observed as well. Combining the suggested excitation features with filter banks, a word error rate of 9.96% is achieved. This is the author accepted manuscript. The final version is available from IEEE via http://dx.doi.org/10.1109/ICASSP.2015.717885
RescueSpeech: A German Corpus for Speech Recognition in Search and Rescue Domain
Despite recent advancements in speech recognition, there are still difficulties in accurately transcribing conversational and emotional speech in noisy and reverberant acoustic environments. This poses a particular challenge in the search and rescue (SAR) domain, where transcribing conversations among rescue team members is crucial to support real-time decision-making. The scarcity of speech data and the associated background noise in SAR scenarios make it difficult to deploy robust speech recognition systems. To address this issue, we have created and made publicly available a German speech dataset called RescueSpeech. This dataset includes real speech recordings from simulated rescue exercises. Additionally, we have released competitive training recipes and pre-trained models. Our study highlights that the performance attained by state-of-the-art methods in this challenging scenario is still far from reaching an acceptable level.
An Investigation into Speaker Informed DNN Front-end for LVCSR
Deep Neural Networks (DNNs) have become a standard method in many ASR tasks. Recently there has been considerable interest in "informed training" of DNNs, where the DNN input is augmented with auxiliary codes such as i-vectors, speaker codes, and speaker separation bottleneck (SSBN) features. This paper compares different speaker-informed DNN training methods in an LVCSR task. We discuss the mathematical equivalence between speaker-informed DNN training and "bias adaptation", which uses speaker-dependent biases, and give a detailed analysis of influential factors such as the dimension, discrimination, and stability of auxiliary codes. The analysis is supported by experiments on a meeting recognition task using a bottleneck-feature-based system. Results show that i-vector based adaptation is also effective in bottleneck-feature-based systems (not just hybrid systems). However, all tested methods show poor generalisation to unseen speakers. We introduce a system based on speaker classification followed by speaker adaptation of biases, which yields performance equivalent to an i-vector based system, with a 10.4% relative improvement over the baseline on seen speakers. The new approach can serve as a fast alternative, especially for short utterances.
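The equivalence between input augmentation and bias adaptation noted in this abstract follows from linearity of the first layer: concatenating an auxiliary code onto the input is the same as adding a speaker-dependent bias formed from that code. A minimal NumPy sketch (all dimensions and variable names here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d_x, d_c, d_h = 5, 3, 4          # acoustic dim, auxiliary-code dim, hidden dim
x = rng.normal(size=d_x)         # acoustic feature vector
c = rng.normal(size=d_c)         # auxiliary speaker code (e.g., an i-vector)

W = rng.normal(size=(d_h, d_x + d_c))  # first-layer weights over [x; c]
b = rng.normal(size=d_h)               # shared first-layer bias

# Informed-training view: feed the concatenated input through the layer.
h_informed = W @ np.concatenate([x, c]) + b

# Bias-adaptation view: split W and fold W_c @ c into a speaker-dependent bias.
W_x, W_c = W[:, :d_x], W[:, d_x:]
speaker_bias = W_c @ c
h_bias = W_x @ x + (b + speaker_bias)

print(np.allclose(h_informed, h_bias))  # True
```

The two pre-activations agree, which is why adapting only per-speaker biases can stand in for feeding auxiliary codes at the input.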
- …