Voice recognition system for Massey University Smarthouse : a thesis presented in partial fulfilment of the requirements for the degree of Master of Engineering in Information Engineering at Massey University
The concept of a smarthouse aims to integrate technology into houses to a level where most daily tasks are automated, providing comfort, safety and entertainment to the house residents. The concept is mainly aimed at the elderly population, to improve their quality of life. In order to maintain a natural medium of communication, the house employs a speech recognition system capable of analysing spoken language and extracting commands from it. This project focuses on the development and evaluation of a Windows application, written in a high-level programming language, which incorporates speech recognition technology by utilising a commercial speech recognition engine. The speech recognition system acts as a hub within the Smarthouse, receiving user commands and delegating them to different switching and control systems. Initial trials were built using Dragon NaturallySpeaking as the recognition engine. However, that proved inappropriate for use in the Smarthouse project, as it is speaker dependent and requires each user to train it with his/her own voice. The application now utilises the Microsoft Speech Application Programming Interface (SAPI), a software layer which sits between applications and speech engines, together with the Microsoft Speech Recognition Engine, which is freely distributed with some Microsoft products. Although Dragon NaturallySpeaking offers better recognition for dictation, the Microsoft engine can be optimised using a Context Free Grammar (CFG) to give enhanced recognition in the intended application. The application is designed to be speaker independent and can handle continuous speech. It connects to a database-oriented expert system to carry out full conversations with the users. Audible prompts and confirmations are achieved through speech synthesis using any SAPI-compliant text-to-speech engine. Other developments focused on designing a telephony system using the Microsoft Telephony Application Programming Interface (TAPI).
This allows the house to be controlled remotely: residents will be able to call their house from anywhere in the world and, regardless of their location, the house will respond to and fulfil their commands.
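The hub role described above, receiving a recognized command phrase and delegating it to the appropriate switching or control subsystem, can be sketched as a simple dispatch table. This is a minimal illustration, not the thesis's actual design; the subsystem names and command phrases are invented assumptions, and in the real system the phrases would come from the CFG-constrained recognizer rather than plain strings.

```python
# Minimal sketch of a Smarthouse-style command hub. A recognized phrase
# (assumed already produced by the speech engine) is routed to a subsystem
# handler; unknown phrases would trigger a spoken re-prompt. All names
# here are illustrative, not from the thesis.

def make_hub():
    log = []  # records actions taken, standing in for real device control

    def lights(action):
        log.append(f"lights:{action}")

    def heating(action):
        log.append(f"heating:{action}")

    # Map recognized command phrases to (handler, argument) pairs.
    routes = {
        "turn on the lights": (lights, "on"),
        "turn off the lights": (lights, "off"),
        "set heating to warm": (heating, "warm"),
    }

    def dispatch(phrase):
        handler, arg = routes.get(phrase, (None, None))
        if handler is None:
            return "not understood"  # would trigger an audible confirmation prompt
        handler(arg)
        return "ok"

    return dispatch, log
```

A constrained grammar makes this kind of exact-phrase routing workable: the recognizer can only emit phrases the grammar licenses, so the dispatch table covers the whole command space.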
Design and development of automatic speech recognition system for Tamil language using CMU Sphinx 4
This paper presents the design and development of a speech recognition system for the Tamil language. The system is based on the CMU Sphinx 4 open-source automatic speech recognition (ASR) engine developed by Carnegie Mellon University, and is adapted to speaker-specific, continuous speech. One of its main components is a core Tamil speech recognizer that can be trained with field-specific data. The target domain is the accent spoken by illiterate Tamil speakers from the Eastern area of Sri Lanka. A phonetically rich and balanced sentence text corpus was developed and recorded in a controlled environment to build a speaker-specific speech corpus. Using this speech corpus, the system was trained and tested with speaker-specific data (the same words uttered by the same speaker) and speaker-independent data (different words uttered by different speakers). The system currently gives a satisfactory peak performance of 39.5% Word Error Rate (WER) for speaker-specific data and an unsatisfactory rate for speaker-independent data; the former is comparable with the best word error rates of most continuous-speech recognition systems available for any language.
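Word Error Rate, the metric quoted above, is the word-level edit distance (substitutions, insertions, and deletions) between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. A minimal implementation of the standard dynamic-programming computation:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("turn on the light", "turn off the light")` is 0.25: one substitution over four reference words. Note that WER can exceed 1.0 when the hypothesis contains many insertions.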
Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean
A new tightly coupled speech and natural language integration model is presented for a TDNN-based continuous, possibly large-vocabulary speech recognition system for Korean. Unlike the popular n-best techniques developed mainly for integrating HMM-based speech recognition and natural language processing at the word level, which is clearly inadequate for morphologically complex agglutinative languages, our model constructs a spoken language system based on a morpheme-level integration of speech and language. With this integration scheme, the spoken Korean processing engine (SKOPE) is designed and implemented using a TDNN-based diphone recognition module integrated with Viterbi-based lexical decoding and symbolic phonological/morphological co-analysis. Our experimental results show that speaker-dependent continuous eojeol (Korean word) recognition and integrated morphological analysis can be achieved with an over 80.6% success rate directly from speech input for middle-level vocabularies.
Comment: LaTeX source with a4 style, 15 pages, to be published in the Computer Processing of Oriental Languages journal
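The abstract couples a TDNN diphone recognizer with Viterbi-based lexical decoding. The generic Viterbi recursion underlying such decoding — find the single most probable hidden state sequence given a sequence of observation scores — can be sketched on a toy HMM. The states, probabilities, and observations below are invented for illustration and are not from the paper, where the "observations" would be diphone scores and the states lexicon entries.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable state sequence for obs under a discrete HMM (log domain)."""
    # v[t][s] = log-probability of the best state sequence ending in s at time t
    v = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        v.append({})
        back.append({})
        for s in states:
            best_prev = max(states,
                            key=lambda p: v[t - 1][p] + math.log(trans_p[p][s]))
            v[t][s] = (v[t - 1][best_prev] + math.log(trans_p[best_prev][s])
                       + math.log(emit_p[s][obs[t]]))
            back[t][s] = best_prev
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: v[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))
```

The log domain avoids numerical underflow on long observation sequences, which matters when decoding whole utterances rather than toy inputs.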
Robust classification with context-sensitive features
This paper addresses the problem of classifying observations when features are context-sensitive, especially when the testing set involves a context that is different from the training set. The paper begins with a precise definition of the problem, after which general strategies are presented for enhancing the performance of classification algorithms on this type of problem. These strategies are tested on three domains. The first domain is the diagnosis of gas turbine engines. The problem is to diagnose a faulty engine in one context, such as warm weather, when the fault has previously been seen only in another context, such as cold weather. The second domain is speech recognition. The context is given by the identity of the speaker. The problem is to recognize words spoken by a new speaker not represented in the training set. The third domain is medical prognosis. The problem is to predict whether a patient with hepatitis will live or die. The context is the age of the patient. For all three domains, exploiting context results in substantially more accurate classification.
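The abstract does not spell out the strategies, but one natural way to exploit context, sketched here as an assumption rather than as the paper's exact method, is to normalize each feature against the statistics of its own context (e.g. warm versus cold weather, or one speaker versus another) before classification, so that feature values from different contexts become comparable:

```python
# Per-context feature normalization: z-score each value relative to the
# mean and standard deviation of its own context. This is an illustrative
# strategy, not necessarily the one evaluated in the paper.

def normalize_by_context(samples):
    """samples: list of (context, value) pairs; returns per-context z-scores."""
    by_ctx = {}
    for ctx, x in samples:
        by_ctx.setdefault(ctx, []).append(x)
    stats = {}
    for ctx, xs in by_ctx.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        std = var ** 0.5
        stats[ctx] = (mean, std if std > 0 else 1.0)  # guard constant features
    return [(ctx, (x - stats[ctx][0]) / stats[ctx][1]) for ctx, x in samples]
```

After this transform, a classifier trained on one context sees the same relative feature scale in another, which is one way a fault pattern learned in cold weather can still be recognized in warm weather.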
Exploiting context when learning to classify
This paper addresses the problem of classifying observations when features are context-sensitive, specifically when the testing set involves a context that is different from the training set. The paper begins with a precise definition of the problem, after which general strategies are presented for enhancing the performance of classification algorithms on this type of problem. These strategies are tested on two domains. The first domain is the diagnosis of gas turbine engines. The problem is to diagnose a faulty engine in one context, such as warm weather, when the fault has previously been seen only in another context, such as cold weather. The second domain is speech recognition. The problem is to recognize words spoken by a new speaker not represented in the training set. For both domains, exploiting context results in substantially more accurate classification.