
497 research outputs found

Acoustic-Phonetic Approaches for Improving Segment-Based Speech Recognition for Large Vocabulary Continuous Speech

Author: Likitsupin Krerksak
Punyabukkana Proadpran
Suchato Atiwong
Wutiwiwatchai Chai
Publication venue: 'Faculty of Engineering, Chulalongkorn University'
Publication date: 18/05/2016

Segment-based speech recognition has been shown to be a competitive alternative to the state-of-the-art HMM-based techniques. Its accuracy relies heavily on the quality of the segment graph from which the recognizer searches for the most likely recognition hypotheses. In order to increase the inclusion rate of actual segments in the graph, it is important to recover segments possibly missed by the segment-based segmentation algorithm. One aspect of this research focuses on determining the segments that are missing due to missed detection of segment boundaries. Acoustic discontinuities, together with manner-distinctive features, are utilized to recover the missing segments. Another aspect of improvement to our segment-based framework tackles the restriction of having a limited amount of training speech data, which prevents the use of more complex covariance matrices for the acoustic models. Feature dimensionality reduction in the form of Principal Component Analysis (PCA) is applied to enable the training of full covariance matrices, and it results in improved segment-based phoneme recognition. Furthermore, to benefit from the fact that the segment-based approach allows the integration of phonetic knowledge, we incorporate the probability of each segment being a sound unit of a specific manner of articulation into the scoring of the segment graphs. Our experiment shows that, with the proposed improvements, our segment-based framework increases the phoneme recognition accuracy by approximately 25% of the accuracy obtained from the baseline segment-based speech recognition.

Engineering Journal (Faculty of Engineering, Chulalongkorn University, Bangkok)

Contributions of cochlea-scaled entropy and consonant-vowel boundaries to prediction of speech intelligibility in noise

Author: Chen F
Loizou PC
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2012


PubMed Central

HKU Scholars Hub

Perception of allophonic cues to English word boundaries by Polish learners: Approximant devoicing in English

Author: Balas Anna
Rojczyk Arkadiusz
Schwartz Geoffrey
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2016

The study investigates the perception of devoicing of English /w, r, j, l/ after /p, t, k/ as a word-boundary cue by Polish listeners. Polish does not devoice sonorants following voiceless stops in word-initial positions. As a result, Polish learners are not made sensitive to sonorant devoicing as a segmentation cue. Higher-proficiency and lower-proficiency Polish learners of English participated in a task in which they recognised phrases such as buy train vs. bite rain or pie plot vs. pipe lot. The analysis of accuracy scores revealed that successful segmentation was only above chance level, indicating that the sonorant voicing/devoicing cue was largely unattended to in identifying the boundary location. Moreover, higher proficiency did not lead to more successful segmentation. The analysis of reaction times showed an unclear pattern in which higher-proficiency listeners segmented the test phrases faster but not more accurately than lower-proficiency listeners. Finally, #CS sequences were recognised more accurately than C#S sequences, which was taken to suggest that the listeners may have had some limited knowledge that devoiced sonorants appear only in word-initial positions, but that they treated voiced sonorants as equal candidates for word-final and word-initial positions.

Biblioteka Nauki - repozytorium artykułów

Repozytorium Uniwersytetu Śląskiego RE-BUŚ

Repozytorium Uniwersytetu Łódzkiego (University of Lodz Repository)

An Improved GA Based Modified Dynamic Neural Network for Cantonese-Digit Speech Recognition

Author: F.H.F. Leung
H.H.C. Iu
H.K. Lam
K.F. Leung
S.H. Ling
Publication venue: 'IntechOpen'
Publication date: 06/01/2007

Author name used in this publication: F. H. F. Leung
2007-2008 > Academic research: refereed > Chapter in an edited book (author)

IntechOpen

The Hong Kong Polytechnic University Pao Yue-kong Library

Crossref

PolyU Institutional Repository

Automatic prosodic analysis for computer aided pronunciation teaching

Author: Bagshaw Paul Christopher
Publication venue: The University of Edinburgh
Publication date: 01/01/1994

Correct pronunciation of spoken language requires the appropriate modulation of acoustic characteristics of speech to convey linguistic information at a suprasegmental level. Such prosodic modulation is a key aspect of spoken language and is an important component of foreign language learning, for purposes of both comprehension and intelligibility. Computer aided pronunciation teaching involves automatic analysis of the speech of a non-native talker in order to provide a diagnosis of the learner's performance in comparison with the speech of a native talker. This thesis describes research undertaken to automatically analyse the prosodic aspects of speech for computer aided pronunciation teaching. It is necessary to describe the suprasegmental composition of a learner's speech in order to characterise significant deviations from a native-like prosody, and to offer some kind of corrective diagnosis. Phonological theories of prosody aim to describe the suprasegmental composition of speech...

CiteSeerX

Edinburgh Research Archive

Acoustic-phonetic constraints in continuous speech recognition: a case study using the digit vocabulary.

Author: Chen Francine Robina.
Publication venue: Massachusetts Institute of Technology.
Publication date: 01/01/1985

Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1985. Includes bibliographical references (leaves 155-159). This electronic version was scanned from a copy of the thesis on file at the Speech Communication Group. The certified thesis is available in the Institute Archives and Special Collections. Vinton-Hayes Fellowship. DARPA, monitored through the Office of Naval Research. System Development Foundation. Ph.D.

DSpace@MIT

Acoustic-Phonetic Features for the Automatic Classification of Stop Consonants

Author: Ali Ahmed M. Abdelatty
Mueller Paul
Van der Spiegel Jan
Publication venue: ScholarlyCommons
Publication date: 01/11/2001

In this paper, the acoustic–phonetic characteristics of American English stop consonants are investigated. Features studied in the literature are evaluated for their information content and new features are proposed. A statistically guided, knowledge-based, acoustic–phonetic system for the automatic classification of stops, in speaker-independent continuous speech, is proposed. The system uses a new auditory-based front-end processing and incorporates new algorithms for the extraction and manipulation of the acoustic–phonetic features that proved to be rich in their information content. Recognition experiments are performed using hard decision algorithms on stops extracted from the continuous speech of 60 speakers in the TIMIT database (not used in the design process) from seven different dialects of American English. An accuracy of 96% is obtained for voicing detection, 90% for place of articulation detection and 86% for the overall classification of stops.

ScholarlyCommons@Penn

Investigating potential acoustic correlates of sonority: Intensity vs. periodic energy

Author: Schröer Tobias Reinhold
Publication date: 10/08/2020

This empirical study examines possible acoustic correlates of sonority. The results indicate that periodic energy (in particular its sum) is a more reliable cue to sonority than intensity.

Kölner UniversitätsPublikationsServer