20 research outputs found

Towards an automatic speech recognition system for use by deaf students in lectures

Author: Collingham Russell James
Publication date: 01/01/1994

According to the Royal National Institute for Deaf People, there are nearly 7.5 million hearing-impaired people in Great Britain. Human-operated machine transcription systems, such as Palantype, achieve low word error rates in real time. The disadvantage is that they are very expensive to use because of the difficulty in training operators, making them impractical for everyday use in higher education. Existing automatic speech recognition systems also achieve low word error rates, the disadvantages being that they work for read speech in a restricted domain. Moving a system to a new domain requires a large amount of relevant data for training acoustic and language models. The adopted solution makes use of an existing continuous speech phoneme recognition system as a front-end to a word recognition sub-system. The sub-system generates a lattice of word hypotheses using dynamic programming, with robust parameter estimation obtained using evolutionary programming. Sentence hypotheses are obtained by parsing the word lattice using a beam search and contributing knowledge consisting of anti-grammar rules, which check the syntactic incorrectness of word sequences, and word frequency information. On an unseen spontaneous lecture taken from the Lund Corpus and using a dictionary containing 2637 words, the system achieved 81.5% words correct with 15% simulated phoneme error, and 73.1% words correct with 25% simulated phoneme error. The system was also evaluated on 113 Wall Street Journal sentences.
The achievements of the work are a domain-independent method, using the anti-grammar, to reduce the word lattice search space whilst allowing normal spontaneous English to be spoken; a system designed to allow integration with new sources of knowledge, such as semantics or prosody, providing a test-bench for determining the impact of different knowledge upon word lattice parsing without the need for the underlying speech recognition hardware; and the robustness of the word lattice generation using parameters that withstand changes in vocabulary and domain.

Durham e-Theses

Subword lexical modelling for speech recognition

Author: Lau Raymond, 1971-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1998

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 155-160). By Raymond Lau. Ph.D.

DSpace@MIT

Danish activities concerning noise in the environment (A)

Author: Ingerslev Fritz
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/1982

Online Research Database In Technology

2020 NASA Technology Taxonomy

Author: Miranda David

This document is an update (new photos used) of the PDF version of the 2020 NASA Technology Taxonomy that will be available to download on the OCT Public Website. The updated 2020 NASA Technology Taxonomy, or "technology dictionary", uses a technology-discipline-based approach that realigns like technologies independent of their application within the NASA mission portfolio. This tool is meant to serve as a common technology-discipline-based communication tool across the agency and with its partners in other government agencies, academia, industry, and across the world.

NASA Technical Reports Server

Heterogeneous acoustic measurements and multiple classifiers for speech recognition

Author: Halberstadt Andrew K. (Andrew King), 1970-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1999

Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999. Includes bibliographical references (p. 165-173). By Andrew K. Halberstadt. Ph.D.

DSpace@MIT

Cyber-Human Systems, Space Technologies, and Threats

Author: Carter Candice M.; Diebold Carter; Drew Jerry V., II; Farcot Max; Hood John-Paul; Jackson Mark J.; Johnson Peter D.; Joseph Siny; Kahn Saeed; Lonstein Wayne D.; McCreight Robert; Muehlfelder Trevor W.; Mumm Hans C.; Nichols Randall K.; Ryan Juole J.C.H.; Sincavage Suzanne M.; Solfer William; Toebes John
Publication venue: 'New Prairie Press'
Publication date: 15/08/2023

CYBER-HUMAN SYSTEMS, SPACE TECHNOLOGIES, AND THREATS is our eighth textbook in a series covering the world of UASs / CUAS / UUVs / SPACE. Other textbooks in our series are Space Systems Emerging Technologies and Operations; Drone Delivery of CBNRECy – DEW Weapons: Emerging Threats of Mini-Weapons of Mass Destruction and Disruption (WMDD); Disruptive Technologies with applications in Airline, Marine, Defense Industries; Unmanned Vehicle Systems & Operations On Air, Sea, Land; Counter Unmanned Aircraft Systems Technologies and Operations; Unmanned Aircraft Systems in the Cyber Domain: Protecting USA’s Advanced Air Assets, 2nd edition; and Unmanned Aircraft Systems (UAS) in the Cyber Domain Protecting USA’s Advanced Air Assets, 1st edition. Our previous seven titles have received considerable global recognition in the field. (Nichols & Carter, 2022) (Nichols, et al., 2021) (Nichols R. K., et al., 2020) (Nichols R. , et al., 2020) (Nichols R. , et al., 2019) (Nichols R. K., 2018) (Nichols R. K., et al., 2022)

Kansas State University

Prediction of room acoustical parameters (A)

Author: Gade Anders Christian
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/1991

Crossref

Online Research Database In Technology

Esprit. European Strategic Programme for Research and Development in Information Technology. Progress and results 1990/91. EUR 13583 EN

Towards a unified framework for sub-lexical and supra-lexical linguistic modeling

Author: Mou Xiaolong, 1973-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2002

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 171-178). Conversational interfaces have received much attention as a promising natural communication channel between humans and computers. A typical conversational interface consists of three major systems: speech understanding, dialog management and spoken language generation. In such a conversational interface, speech recognition as the front-end of speech understanding remains to be one of the fundamental challenges for establishing robust and effective human/computer communications. On the one hand, the speech recognition component in a conversational interface lives in a rich system environment. Diverse sources of knowledge are available and can potentially be beneficial to its robustness and accuracy. For example, the natural language understanding component can provide linguistic knowledge in syntax and semantics that helps constrain the recognition search space. On the other hand, the speech recognition component also faces the challenge of spontaneous speech, and it is important to address the casualness of speech using the knowledge sources available. For example, sub-lexical linguistic information would be very useful in providing linguistic support for previously unseen words, and dynamic reliability modeling may help improve recognition robustness for poorly articulated speech. In this thesis, we mainly focused on the integration of knowledge sources within the speech understanding system of a conversational interface. More specifically, we studied the formalization and integration of hierarchical linguistic knowledge at both the sub-lexical level and the supra-lexical level, and proposed a unified framework for integrating hierarchical linguistic knowledge in speech recognition using layered finite-state transducers (FSTs).
Within the proposed framework, we developed context-dependent hierarchical linguistic models at both sub-lexical and supra-lexical levels. FSTs were designed and constructed to encode both structure and probability constraints provided by the hierarchical linguistic models. We also studied empirically the feasibility and effectiveness of integrating hierarchical linguistic knowledge into speech recognition using the proposed framework. We found that, at the sub-lexical level, hierarchical linguistic modeling is effective in providing generic sub-word structure and probability constraints. Since such constraints are not restricted to a fixed system vocabulary, they can help the recognizer correctly identify previously unseen words. Together with the unknown word support from natural language understanding, a conversational interface would be able to deal with unknown words better, and can possibly incorporate them into the active recognition vocabulary on-the-fly. At the supra-lexical level, experimental results showed that the shallow parsing model built within the proposed layered FST framework with top-level n-gram probabilities and phrase-level context-dependent probabilities was able to reduce recognition errors, compared to a class n-gram model of the same order. However, we also found that its application can be limited by the complexity of the composed FSTs. This suggests that, with a much more complex grammar at the supra-lexical level, a proper tradeoff between tight knowledge integration and system complexity becomes more important ... By Xiaolong Mou. Ph.D.

CiteSeerX

DSpace@MIT