
    A generic template for the evaluation of dialogue management systems

    We present a generic template for spoken dialogue systems integrating speech recognition and synthesis with 'higher-level' natural language dialogue modelling components. The generic model is abstracted from a number of real application systems targeted at very different domains. Our research aim in developing this generic template is to investigate a new approach to the evaluation of Dialogue Management Systems (DMSs). Rather than attempting to measure accuracy/speed of output, we propose principles for evaluating the underlying theoretical linguistic model of Dialogue Management in a given system, in terms of how well it fits our generic template for Dialogue Management Systems. This is a measure of 'genericness' or 'application-independence' of a given system, which can be used to moderate accuracy/speed scores in comparisons of very unlike DMSs serving different domains. This relates to (but is orthogonal to) Dialogue Management Systems evaluation in terms of naturalness and similar measurable metrics (e.g. Dybkjaer et al. 1995, Vilnat 1996, EAGLES 1994, Fraser 1995); it follows more closely emerging qualitative evaluation techniques for NL grammatical parsing schemes (Leech et al. 1996, Atwell 1996).

    Corpus linguistics and language learning: bootstrapping linguistic knowledge and resources from text

    This submission for the award of the degree of PhD by published work must: “make a contribution to knowledge in a coherent and related subject area; demonstrate originality and independent critical ability; satisfy the examiners that it is of sufficient merit to qualify for the award of the degree of PhD.” It includes a selection of my work as a Lecturer (and later, Senior Lecturer) at Leeds University, from 1984 to the present. The overall theme of my research has been bootstrapping linguistic knowledge and resources from text. A persistent strand of interest has been unsupervised and semi-supervised machine learning of linguistic knowledge from textual sources; the attraction of this approach is that I could start with English, but go on to apply analogous techniques to other languages, in particular Arabic. This theme covers a broad range of research over more than 20 years at Leeds University which I have divided into 8 sub-topics: A: Constituent-Likelihood statistical modelling of English grammar; B: Machine Learning of grammatical patterns from a corpus; C: Detecting grammatical errors in English text; D: Evaluation of English grammatical annotation models; E: Machine Learning of semantic language models; F: Applications in English language teaching; G: Arabic corpus linguistics; H: Applications in Computing teaching and research. The first section builds on my early years as a lecturer at Leeds University, when my research was essentially a progression from my previous work at Lancaster University on the LOB Corpus Part-of-Speech Tagging project (which resulted in the Tagged LOB Corpus, a resource for Corpus Linguistics research still in use today); I investigated a range of ideas for extending and/or applying techniques related to Part-of-Speech tagging in Corpus Linguistics. 
The second section covers a range of co-authored papers representing grant-funded research projects in Corpus Linguistics; in this mode of research, I had to come up with the original ideas and guide the project, but much of the detailed implementation was down to research assistant staff. Another highly productive mode of research has been supervision of research students, leading to further jointly-authored research papers. I helped formulate the research plans, and guided and advised the students; as with research-grant projects, the detailed implementation of the research has been down to the research students. The third section includes a few of the most significant of these jointly-authored Corpus Linguistics research papers. A “standard” PhD generally includes a survey of the field to put the work in context; so as a fourth section, I include some survey papers aimed at introducing new developments in corpus linguistics to a wider audience.