
    Dialogue management systems: a survey and overview

    This report provides an overview of the current issues and techniques for the modelling of dialogues using a computer. A dialogue management system can manage a dialogue between two or more agents, be they human or computer. Increasingly complex dialogues are now being modelled, allowing a range of discourse phenomena including ellipsis and anaphoric reference. Such dialogues are thought to be similar to those between two humans, and accurate modelling of these phenomena leads to human-computer dialogues that are natural and "pleasant", i.e. easy to talk to. Dialogue management can be classified into three often overlapping approaches: discourse grammars, plan-based approaches and collaborative approaches. The design of a system often begins by eliciting the language used between two humans, followed later by Wizard of Oz experiments. Special issues relating to dialogue management systems are discussed, including strategies for recovering from different types of errors and the coding of dialogue in corpora. Lastly, approaches to evaluation are briefly discussed from the qualitative and quantitative viewpoints, recognising the importance and size of this sub-field.

    Developing a corpus-based grammar model within a continuous commercial speech recognition package

    This paper is derived from experiments with a commercial 'off-the-shelf' continuous speech recognition system, applied to the apparently restricted domain of Air Traffic Control (ATC) for light aircraft. The system is required to transcribe key sub-phrases in a transmission by the ATC to a particular aircraft, with the commercial speech recognition system providing the main recognition component. After the development of a corpus of transmissions, it was realised that key information is often interspersed with unconstrained English. Initial attempts focused on using a wildcard mechanism for the non-key sub-phrases. The mechanism, however, proved to be valuable only in simplistic grammars due to its overgenerative nature. The speech recognition system showed us that whilst useful mechanisms are provided, such as the wildcard mechanism, they tend to make over-simplistic assumptions about English grammar and dialogue structure.

    Comparing linguistic interpretation schemes for English corpora

    Project AMALGAM explored a range of Part-of-Speech tagsets and phrase structure parsing schemes used in modern English corpus-based research. The PoS-tagging schemes and parsing schemes include some which have been used for hand annotation of corpora or manual post-editing of automatic taggers or parsers, and others which are unedited output of a parsing program. Project deliverables include: a detailed description of each PoS-tagging scheme, and a multi-tagged corpus; a "Corpus-neutral" tokenization scheme; a family of PoS-taggers, for 8 PoS-tagsets; a method for "PoS-tagset conversion"; a sample of texts parsed according to a range of parsing schemes: a MultiTreebank; an Internet service allowing researchers worldwide free access to the above resources, including a simple email-based method for PoS-tagging any English text with any or all PoS-tagset(s). We conclude that the range of tagging and parsing schemes in use is too varied to allow agreement on a standard, and that parser evaluation based on 'bracket-matching' is unfair to more sophisticated parsers.
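
The abstract does not say why bracket-matching evaluation is unfair to more sophisticated parsers. One plausible reading, sketched below as an assumption rather than the authors' argument, is that a parser producing finer structure than the gold annotation is penalised for every extra bracket. The function name `bracket_scores` and the toy spans are hypothetical.

```python
def bracket_scores(gold, predicted):
    """Return (precision, recall) over sets of (start, end) constituent spans."""
    gold_set, pred_set = set(gold), set(predicted)
    matched = len(gold_set & pred_set)
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    return precision, recall


# Toy data (hypothetical): a shallow gold annotation of a 5-token sentence,
# and a parser that finds every gold bracket plus one extra constituent.
gold = [(0, 5), (0, 2), (2, 5)]
detailed = [(0, 5), (0, 2), (2, 5), (3, 5)]

precision, recall = bracket_scores(gold, detailed)
print(precision, recall)
```

In this toy case the detailed parser recovers every gold bracket, yet its precision drops because the additional bracket has no counterpart in the shallower gold scheme.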

    A comparative evaluation of modern English corpus grammatical annotation schemes

    Many English Corpus Linguistics projects reported in ICAME Journal and elsewhere involve grammatical analysis or tagging of English texts (e.g. Atwell 1983, Leech et al 1983, Booth 1985, Owen 1987, Souter 1989a, O'Donoghue 1991, Belmore 1991, Kytö and Voutilainen 1995, Aarts 1996, Qiao and Huang 1998). Each new project has to review existing tagging schemes, and decide which to adopt and/or adapt. The AMALGAM project can help in this decision, by providing descriptions and analyses of a range of tagging schemes, and an internet-based service for researchers to try out the range of tagging schemes on their own data. The project AMALGAM (Automatic Mapping Among Lexico-Grammatical Annotation Models) explored a range of Part-of-Speech tagsets and phrase structure parsing schemes used in modern English corpus-based research. The PoS-tagging schemes include: Brown (Greene and Rubin 1981), LOB (Atwell 1982, Johansson et al 1986), Parts (man 1986), SEC (Taylor and Knowles 1988), POW (Souter 1989b), UPenn (Santorini 1990), LLC (Eeg-Olofsson 1991), ICE (Greenbaum 1993), and BNC (Garside 1996). The parsing schemes include some which have been used for hand annotation of corpora or manual post-editing of automatic parsers, and others which are unedited output of a parsing program. Project deliverables include:
    – a detailed description of each PoS-tagging scheme, at a comparable level of detail. This includes a list of PoS-tags with descriptions and example uses from the source Corpus. The description of the use of PoS-tags is also illustrated in a multi-tagged corpus: a set of sample texts PoS-tagged in parallel with each PoS-tagset (and proofread by experts), for comparative studies
    – an analysis of the different lexical tokenization rules used in the source Corpora, to arrive at a 'Corpus-neutral' tokenization scheme (and consequent adjustments to the PoS-tagsets in our study to accept modified tokenization)
    – an implementation of each PoS-tagset in conjunction with our standardised tokenizer, as a family of PoS-taggers, one for each PoS-tagset
    – a method for 'PoS-tagset conversion', taking a text tagged according to one PoS-tagset and outputting the text annotated with another PoS-tagset
    – a sample of texts parsed according to a range of parsing schemes: a Multi-Treebank resource for comparative studies
    – an Internet service allowing researchers worldwide free access to the above resources, including a simple email-based method for PoS-tagging any English text with any or all PoS-tagset(s).
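
To make the input and output of 'PoS-tagset conversion' concrete, here is a minimal sketch of the naive baseline: a direct tag-to-tag lookup. This is not the AMALGAM method, which the abstract does not describe. The table `TAG_MAP`, the function `convert_tags`, and the tag labels are illustrative assumptions, not taken from the documented tagsets.

```python
# Hypothetical source-to-target tag table; labels are illustrative only.
TAG_MAP = {"AT": "DT", "NN": "NN", "VBZ": "VBZ"}


def convert_tags(tagged, tag_map, unknown="UNK"):
    """Relabel (word, tag) pairs via a lookup table; unmapped tags become `unknown`."""
    return [(word, tag_map.get(tag, unknown)) for word, tag in tagged]


source = [("the", "AT"), ("dog", "NN"), ("barks", "VBZ")]
converted = convert_tags(source, TAG_MAP)
print(converted)
```

A table like this only works when tags correspond one-to-one and both schemes split the text into the same tokens. The deliverables above list a separate tokenization analysis, which suggests that the second condition does not hold across the source corpora.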

    Drinking Water Safety & Security Planning (DWSSP) Structured Follow-up Implementation Guide

    The International WaterCentre (IWC) at Griffith University, in partnership with The University of the South Pacific (USP), has prepared this Implementation Guide following pilot testing with the Department of Water Resources (DoWR) and Vanuatu Red Cross (VRC) in five villages in the Shefa province, Republic of Vanuatu. Research shows that Drinking Water Safety & Security Planning (DWSSP) has mixed results, with community Implementation Plans often not being progressed by communities due to a lack of ownership and collective action. As with community water management more generally, communities require some sort of follow-up support. This guide contains five targeted activities designed to assist communities to re-engage with their Implementation Plans. These DWSSP follow-up activities are especially designed for communities whose DWSSP Plans have stalled, and who may not have received any follow-up visits since the initial DWSSP intervention. This is not intended to be the only form of follow-up support provided to communities.

    Multi-level disambiguation grammar inferred from English Corpus, treebank, and dictionary

    It is shown that grammatical inference is applicable to natural language processing. Given the wide and complex range of structures appearing in an unrestricted natural language like English, full grammatical inference, yielding a comprehensive syntactic and semantic definition of English, is too much to hope for at present. Instead, the authors focus on techniques for resolving ambiguity by probabilistic ranking; this does not require a full formal Chomskyan grammar. They give a short overview of the different levels and methods being investigated at CCALAS for probabilistic ranking of candidates in ambiguous English input.
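
As a rough illustration of ranking candidates probabilistically without a full grammar, the sketch below scores candidate tag sequences by summed log bigram probabilities and sorts them. It is a generic technique chosen for illustration, not necessarily one of the CCALAS methods. The probability table `BIGRAM_P`, the helper names, and all numbers are hypothetical.

```python
import math

# Hypothetical tag-bigram probabilities; values are illustrative only.
BIGRAM_P = {
    ("<s>", "DET"): 0.6,
    ("DET", "NOUN"): 0.7,
    ("NOUN", "VERB"): 0.4,
    ("DET", "VERB"): 0.01,
    ("VERB", "NOUN"): 0.2,
}


def log_score(tags, bigram_p, floor=1e-6):
    """Sum log bigram probabilities along a tag sequence; unseen pairs get `floor`."""
    total, prev = 0.0, "<s>"
    for tag in tags:
        total += math.log(bigram_p.get((prev, tag), floor))
        prev = tag
    return total


def rank(candidates, bigram_p):
    """Order candidate analyses from most to least probable."""
    return sorted(candidates, key=lambda tags: log_score(tags, bigram_p), reverse=True)


# Two competing analyses of the same ambiguous three-word input.
candidates = [("DET", "VERB", "NOUN"), ("DET", "NOUN", "VERB")]
ranked = rank(candidates, BIGRAM_P)
print(ranked[0])
```

The ranking needs only local statistics estimated from a corpus, which is the sense in which no complete formal grammar is required.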