12 research outputs found

Speech analysis using very low-dimensional bottleneck features and phone-class dependent neural networks

Author: Bai Linxue
Publication date: 01/07/2018

The first part of this thesis focuses on very low-dimensional bottleneck features (BNFs), extracted from deep neural networks (DNNs), for speech analysis and recognition. Very low-dimensional BNFs are analysed in terms of their capability to represent speech and their suitability for modelling speech dynamics. Nine-dimensional BNFs obtained from a phone discrimination DNN are shown to give phone recognition accuracy comparable to that of 39-dimensional MFCCs, and on average 34% higher phone recognition accuracy than formant-based features of the same dimensionality. They also preserve trajectory continuity well and thus hold promise for modelling speech dynamics. Visualisations and interpretations of the BNFs are presented, with phonetically motivated studies of the strategies that DNNs employ to create these features. The relationships between BNF representations resulting from different initialisations of DNNs are explored. The second part of this thesis considers BNFs from the perspective of feature extraction. It is motivated by the observation that different types of speech sounds lend themselves to different acoustic analyses, and that the mapping from spectra-in-context to phone posterior probabilities implemented by the DNN is a continuous approximation to a discontinuous function. This suggests that it may be advantageous to replace the single DNN with a set of phone-class-dependent DNNs. In this case, the appropriate mathematical structure is a manifold. It is shown that this approach leads to significant improvements in frame-level phone classification accuracy.

University of Birmingham Research Archive, E-theses Repository

Exploring how phone classification neural networks learn phonetic information by visualising and interpreting bottleneck features

Author: Bai Linxue
Jancovic Peter
Russell Martin
Weber Philip
Publication venue: ISCA
Publication date: 03/09/2018

Crossref

University of Birmingham Research Portal

Phone recognition using a non-linear manifold with broad phone class dependent DNNs

Author: Bai Linxue
Jancovic Peter
Qian Mengjie
Russell Martin
Publication venue: International Speech Communication Association
Publication date: 03/09/2018

Crossref

University of Birmingham Research Portal

Phone classification using a non-linear manifold with broad phone class dependent DNNs

Author: Bai Linxue
Houghton Stephen
Jancovic Peter
Russell Martin
Weber Philip
Publication venue: International Speech Communication Association
Publication date: 20/08/2017

Crossref

University of Birmingham Research Portal

Analysis of a Low-Dimensional Bottleneck Neural Network Representation of Speech for Modelling Speech Dynamics

Author: Bai Linxue
Jancovic Peter
Russell Martin
Weber Phil
Publication date: 01/01/2015

University of Birmingham Research Portal

Analysis of a Low-Dimensional Bottleneck Neural Network Representation of Speech for Modelling Speech Dynamics

Author: Bai Linxue
Jancovic Peter
Russell Martin
Weber Phil
Publication date: 01/09/2015

University of Birmingham Research Portal

Analysis of a Low-Dimensional Bottleneck Neural Network Representation of Speech for Modelling Speech Dynamics

Author: Bai Linxue
Jancovic Peter
Russell Martin
Weber Phil
Publication venue: ISCA
Publication date: 01/01/2015

University of Birmingham Research Portal

Analysis of a Low-Dimensional Bottleneck Neural Network Representation of Speech for Modelling Speech Dynamics

Author: Bai Linxue
Jancovic Peter
Russell Martin
Weber Phil
Publication venue: ISCA

University of Birmingham Research Portal

Consonant Recognition with Continuous-State Hidden Markov Models and Perceptually-Motivated Features

Author: Bai Linxue
Houghton Stephen
Jancovic Peter
Russell Martin
Weber Phil
Publication venue: ISCA
Publication date: 01/01/2015

University of Birmingham Research Portal

Interpretation of low dimensional neural network bottleneck features in terms of human perception and production

Author: Bai Linxue
Houghton Stephen
Jancovic Peter
Russell Martin
Weber Phil
Publication venue: International Speech Communication Association
Publication date: 10/06/2016

Crossref

University of Birmingham Research Portal