
    Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks

    Recently, there has been growth in providers of speech transcription services, enabling others to leverage technology they would not normally be able to use. As a result, speech-enabled solutions have become commonplace. Their success critically relies on the quality, accuracy, and reliability of the underlying speech transcription systems. Those black box systems, however, offer limited means for quality control, as only word sequences are typically available. This paper examines this limited resource scenario for confidence estimation, a measure commonly used to assess transcription reliability. In particular, it explores what other sources of word and sub-word level information available in the transcription process could be used to improve confidence scores. To encode all such information, this paper extends lattice recurrent neural networks to handle sub-words. Experimental results using the IARPA OpenKWS 2016 evaluation system show that the use of additional information yields significant gains in confidence estimation accuracy. The implementation for this model can be found online. Comment: 5 pages, 8 figures, ICASSP submission
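    To make the lattice recurrent network idea concrete, below is a minimal sketch of per-arc confidence estimation over a word lattice, assuming PyTorch. Arcs are visited in topological order; the hidden states arriving at a node are combined before being propagated along each outgoing arc together with that arc's word embedding and a small vector of per-arc features (e.g. posterior, duration). The mean-pooling combination, the feature set, and all layer sizes are illustrative assumptions, and the sub-word extension described in the abstract is not shown.

    # Minimal lattice RNN confidence sketch (assumptions: PyTorch, mean-pooled
    # node states, GRU cell, per-arc features of fixed dimension).
    import torch
    import torch.nn as nn

    class LatticeRNNConfidence(nn.Module):
        def __init__(self, vocab_size, feat_dim=3, embed_dim=128, hidden_dim=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            # Recurrent cell applied along each arc of the lattice.
            self.cell = nn.GRUCell(embed_dim + feat_dim, hidden_dim)
            self.score = nn.Linear(hidden_dim, 1)
            self.hidden_dim = hidden_dim

        def forward(self, arcs, num_nodes):
            # arcs: list of (start_node, end_node, word_id, features), sorted so
            #       that all arcs into a node appear before any arc out of it;
            #       features is a float tensor of shape (feat_dim,).
            # Node 0 is assumed to be the lattice start node.
            node_states = [[] for _ in range(num_nodes)]
            node_states[0].append(torch.zeros(self.hidden_dim))
            confidences = []
            for start, end, word_id, feats in arcs:
                # Combine (mean-pool) all hidden states that reached the start node.
                h_in = torch.stack(node_states[start]).mean(dim=0)
                x = torch.cat([self.embed(torch.tensor(word_id)), feats])
                h_out = self.cell(x.unsqueeze(0), h_in.unsqueeze(0)).squeeze(0)
                node_states[end].append(h_out)
                # Per-arc confidence score in [0, 1].
                confidences.append(torch.sigmoid(self.score(h_out)))
            return torch.stack(confidences).squeeze(-1)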

    Future word contexts in neural network language models

    Recently, bidirectional recurrent network language models (bi-RNNLMs) have been shown to outperform standard, unidirectional, recurrent neural network language models (uni-RNNLMs) on a range of speech recognition tasks. This indicates that future word context information beyond the word history can be useful. However, bi-RNNLMs pose a number of challenges as they make use of the complete previous and future word context information. This impacts both training efficiency and their use within a lattice rescoring framework. In this paper these issues are addressed by proposing a novel neural network structure, succeeding word RNNLMs (su-RNNLMs). Instead of using a recurrent unit to capture the complete future word context, a feedforward unit is used to model a finite number of succeeding (future) words. This model can be trained much more efficiently than bi-RNNLMs and can also be used for lattice rescoring. Experimental results on a meeting transcription task (AMI) show that the proposed model consistently outperforms uni-RNNLMs and yields only a slight degradation compared to bi-RNNLMs in N-best rescoring. Additionally, performance improvements can be obtained using lattice rescoring and subsequent confusion network decoding.
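    Below is a minimal sketch of the succeeding-word idea, assuming PyTorch: the word history is modelled with a recurrent unit, as in a uni-RNNLM, while a fixed window of k succeeding words is modelled with a feedforward unit. The layer sizes, the LSTM choice for the history, and the concatenation used to combine the two states are illustrative assumptions rather than the paper's exact configuration.

    # Minimal su-RNNLM sketch (assumptions: PyTorch, LSTM history encoder,
    # single feedforward layer over k future-word embeddings, concatenation).
    import torch
    import torch.nn as nn

    class SuRNNLM(nn.Module):
        def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, k_future=3):
            super().__init__()
            self.k_future = k_future
            self.embed = nn.Embedding(vocab_size, embed_dim)
            # Recurrent unit over the complete word history (as in a uni-RNNLM).
            self.history_rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            # Feedforward unit over a finite number of succeeding words.
            self.future_ff = nn.Sequential(
                nn.Linear(k_future * embed_dim, hidden_dim),
                nn.Tanh(),
            )
            self.output = nn.Linear(2 * hidden_dim, vocab_size)

        def forward(self, history_ids, future_ids):
            # history_ids: (batch, t) words w_1 .. w_t
            # future_ids:  (batch, k) words w_{t+2} .. w_{t+1+k}, padded if short
            hist_emb = self.embed(history_ids)            # (batch, t, E)
            _, (h_n, _) = self.history_rnn(hist_emb)      # h_n: (1, batch, H)
            hist_state = h_n[-1]                          # (batch, H)
            fut_state = self.future_ff(self.embed(future_ids).flatten(1))
            combined = torch.cat([hist_state, fut_state], dim=-1)
            return self.output(combined)                  # logits for w_{t+1}

    Because the future context is a fixed-size window rather than a full backward recurrence, the model can be trained on shuffled sentence fragments and applied within lattice rescoring, which is the efficiency argument made in the abstract.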