
Developing a corpus-based grammar model within a continuous commercial speech recognition package

Abstract

This paper reports on experiments with a commercial ‘off-the-shelf’ continuous speech recognition system, applied to the apparently restricted domain of Air Traffic Control (ATC) for light aircraft. The system is required to transcribe key sub-phrases in a transmission from ATC to a particular aircraft, with the commercial speech recognition system providing the main recognition component. After a corpus of transmissions had been developed, it became clear that key information is often interspersed with unconstrained English. Initial attempts focused on using a wildcard mechanism for the non-key sub-phrases. The mechanism, however, proved valuable only in simplistic grammars because of its overgenerative nature. The experiments showed that whilst the speech recognition system provides useful mechanisms, such as the wildcard mechanism, these tend to make over-simplistic assumptions about English grammar and dialogue structure.
