23 research outputs found

    Improving speech interaction with mixed initiative dialogue Loquendo SDS- Spoken Dialog System

    The transition from system-directed to mixed-initiative human-machine dialogue has been a focus of several research institutes and industries during the past several years [4,5]. The term “mixed-initiative” was introduced in the artificial intelligence literature by Jaime Carbonell, who described a prototype that “is capable of maintaining a mixed initiative dialogue with the student, with questions asked by either side and answered by the other” [3]. Balentine and Morgan propose a similar definition: the term “mixed-initiative” identifies “a dialogue in which interactions are sometimes initiated by the users and sometimes by the machine” [1], although they note that smooth turn-taking is still difficult and not yet fully realized. The interaction style in mixed-initiative dialogues may be radically different from the question-answer sequences provided by most off-the-shelf speech tools, because users can decide the order in which they specify their request parameters, which reduces dialogue time and increases the success rate. However, the increased flexibility also brings greater complexity. Several factors affect this transition, the most important being the difficulty of modelling a user’s spontaneous linguistic behaviour and the development cost of highly flexible dialogue models. Loquendo has identified feasible solutions to some of these critical factors.

    ADAM Corpus

    The ADAM spoken corpus is a collection of 450 spoken dialogues, 200 human-human and 250 human-machine. All the dialogues are recordings and transcriptions of telephone conversations in the semantic domain of tourism and railway transportation. The audio files use the standard format for telephone signal data recommended by the SPEECHDAT3 project. Each dialogue is annotated at five levels of linguistic information: prosody, morphosyntax, syntax, semantics and pragmatics. Each level has its own annotation scheme, which provides annotation instructions, examples and criteria. The result of each annotation is an XML file that encodes the content of a dialogue at that level, according to the level's annotation scheme. The human-human dialogues are simulated telephone conversations between two experimental subjects, one playing the role of a travel agent and the other that of a caller. The human-machine dialogues were collected in the field: they are interactions between callers and the automatic telephone information service of the Italian railway company, recorded during an experimental phase of that service. Each dialogue in the ADAM corpus is represented by an orthographic transcription (physically an XML file), which in turn is linked to an audio file containing the corresponding recording. In addition, the transcription of each dialogue is associated with five XML annotation files, one for each of the five levels listed above.
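
The per-dialogue layout described in the ADAM abstract (one transcription, one linked audio file, five annotation files) can be pictured with a minimal sketch. Everything specific here is an assumption made for illustration: the class name `AdamDialogue`, the helper `build_dialogue`, the file-naming pattern and the `.audio` placeholder extension are hypothetical and do not come from the corpus documentation.

```python
from dataclasses import dataclass, field
from pathlib import Path

# The five annotation levels named in the abstract.
ANNOTATION_LEVELS = ("prosody", "morphosyntax", "syntax", "semantics", "pragmatics")


@dataclass
class AdamDialogue:
    """One corpus entry: a transcription linked to audio and per-level annotations."""
    dialogue_id: str
    kind: str                      # "human-human" or "human-machine"
    transcription: Path            # orthographic transcription (XML)
    audio: Path                    # recording linked from the transcription
    annotations: dict[str, Path] = field(default_factory=dict)

    def missing_levels(self) -> list[str]:
        """Return the annotation levels that have no file attached."""
        return [lvl for lvl in ANNOTATION_LEVELS if lvl not in self.annotations]


def build_dialogue(root: Path, dialogue_id: str, kind: str) -> AdamDialogue:
    # Hypothetical naming scheme: <id>.xml, <id>.audio, <id>.<level>.xml
    return AdamDialogue(
        dialogue_id=dialogue_id,
        kind=kind,
        transcription=root / f"{dialogue_id}.xml",
        audio=root / f"{dialogue_id}.audio",
        annotations={
            lvl: root / f"{dialogue_id}.{lvl}.xml" for lvl in ANNOTATION_LEVELS
        },
    )
```

A call such as `build_dialogue(Path("corpus"), "d001", "human-machine")` would yield an entry with all five annotation paths filled in, and `missing_levels()` gives a quick completeness check when some layers are absent.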