4 research outputs found

    The ESPRIT project POLYGLOT

    No full text

    A duration model for phonetic units in isolated Dutch words

    No full text
    this paper was done as part of the ESPRIT project POLYGLOT. The aim of the project is to develop a multi-lingual Speech-to-Text and Text-to-Speech system. Part of the work comprises the adaptation of a large-vocabulary isolated-word speech recognition (IWSR) system, originally developed for Italian (Billi et al., 1989), to a number of other European languages, including Dutch. The IWSR system runs on an MS-DOS PC that uses one or two special-purpose plug-in boards. The speech is picked up by a table-mounted microphone and A/D-converted with a sampling frequency of 16 kHz and 12-bit resolution. For each 10 ms frame, 20 LPC cepstrum coefficients are calculated and used to compute the acoustic distance to a set of stored prototypes. A prototype represents a phonetic unit, and because these phonetic units are not always phonemes, we will use the terms prototype and phonetic unit throughout this article. The resulting lattice of prototype labels is then submitted to a dynamic programming procedure that outputs the most likely string of prototype labels. To calculate the optimal path in the lattice, the dynamic programming algorithm uses the acoustic distances (which are stored in the lattice) and statistics on prototype frequency, prototype-pair frequency, and the duration of prototypes. The prototype string is then used for fast lexical access, to retrieve the most likely word candidates. In a later stage of the recognition process, called Fine Phonetic Analysis (FPA), a top-down algorithm is used to find the best candidate. For FPA, the statistics on the duration of prototypes are also required (Billi et al., 1989). The statistics on prototype frequency and prototype-pair frequency can be derived from large text corpora with tools that were developed in a previous ES..
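
The dynamic programming step described in the abstract can be illustrated with a small sketch. This is not the POLYGLOT implementation, which the abstract does not specify; it is a generic segmental search written under stated assumptions: the function name `decode`, the cost tables `dist` and `pair_cost`, the callback `dur_cost`, and the limit `max_dur` are all hypothetical, and every cost is assumed to be additive (for example, a negative log frequency). Prototype-frequency statistics are omitted for brevity and could be folded into `dur_cost`.

```python
import math


def decode(dist, pair_cost, dur_cost, max_dur):
    """Find the cheapest segmentation of a frame sequence into prototypes.

    dist[t][p]      acoustic distance of frame t to prototype p (assumed given)
    pair_cost[q][p] cost of prototype p following prototype q (hypothetical table)
    dur_cost(p, d)  cost of prototype p lasting d frames (hypothetical model)
    max_dur         longest segment, in frames, that is considered

    Returns a list of (prototype, duration_in_frames) pairs.
    Assumes at least one segmentation is possible within max_dur.
    """
    n_frames = len(dist)
    n_protos = len(dist[0])
    # best[t][p]: minimal cost of frames 0..t-1 when the last segment is p.
    best = [[math.inf] * n_protos for _ in range(n_frames + 1)]
    # back[t][p]: (duration of last segment, previous prototype or None).
    back = [[None] * n_protos for _ in range(n_frames + 1)]

    for t in range(1, n_frames + 1):
        for p in range(n_protos):
            acoustic = 0.0
            for d in range(1, min(max_dur, t) + 1):
                start = t - d
                acoustic += dist[start][p]  # extend the segment backwards
                segment = acoustic + dur_cost(p, d)
                if start == 0:
                    # First segment of the utterance: no predecessor.
                    if segment < best[t][p]:
                        best[t][p] = segment
                        back[t][p] = (d, None)
                    continue
                for q in range(n_protos):
                    if q == p:
                        continue  # adjacent segments carry different labels
                    cand = best[start][q] + pair_cost[q][p] + segment
                    if cand < best[t][p]:
                        best[t][p] = cand
                        back[t][p] = (d, q)

    # Trace back from the cheapest final prototype.
    p = min(range(n_protos), key=lambda k: best[n_frames][k])
    t = n_frames
    segments = []
    while t > 0:
        d, q = back[t][p]
        segments.append((p, d))
        t -= d
        p = q
    segments.reverse()
    return segments
```

With six frames in which the first three are closest to prototype 0 and the last three to prototype 1, and a duration model that prefers three-frame segments, the search returns one three-frame segment per prototype. At the 10 ms frame rate given in the abstract, each such segment would correspond to 30 ms of speech.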