
    PARADISE: A Framework for Evaluating Spoken Dialogue Agents

    This paper presents PARADISE (PARAdigm for DIalogue System Evaluation), a general framework for evaluating spoken dialogue agents. The framework decouples task requirements from an agent's dialogue behaviors, supports comparisons among dialogue strategies, enables the calculation of performance over subdialogues and whole dialogues, specifies the relative contribution of various factors to performance, and makes it possible to compare agents performing different tasks by normalizing for task complexity. Comment: 10 pages, uses aclap, psfig, lingmacros, time

    Pronominal anaphora in Basque: annotation of a real corpus

    This paper describes the process followed in the annotation of pronominal anaphora in the Eus3LB corpus of Basque. Our aim is to use this annotation as the basis for later computational treatment of our language. We present the linguistic analysis carried out, the criteria defined for the tagging and some relevant linguistic conclusions about the features of the antecedents needed to link them correctly to their anaphoric elements.

    Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation

    We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and prosodic information using hidden Markov models and decision trees. Lexical information is obtained from a speech recognizer, and prosodic features are extracted automatically from speech waveforms. We evaluate our approach on the Broadcast News corpus, using the DARPA-TDT evaluation metrics. Results show that the prosodic model alone is competitive with word-based segmentation methods. Furthermore, we achieve a significant reduction in error by combining the prosodic and word-based knowledge sources. Comment: 27 pages, 8 figure

    Measuring Agreement on Set-valued Items (MASI) for Semantic and Pragmatic Annotation

    Annotation projects dealing with complex semantic or pragmatic phenomena face the dilemma of creating annotation schemes that oversimplify the phenomena, or that capture distinctions conventional reliability metrics cannot measure adequately. The solution to the dilemma is to develop metrics that quantify the decisions that annotators are asked to make. This paper discusses MASI, a distance metric for comparing sets, and illustrates its use in quantifying the reliability of a specific dataset. Annotations of Summary Content Units (SCUs) generate models referred to as pyramids, which can be used to evaluate unseen human summaries or machine summaries. The paper presents reliability results for five pairs of pyramids created for document sets from the 2003 Document Understanding Conference (DUC). The annotators worked independently of each other. Differences between application of MASI to pyramid annotation and its previous application to co-reference annotation are discussed. In addition, it is argued that a paradigmatic reliability study should relate measures of inter-annotator agreement to independent assessments, such as significance tests of the annotated variables with respect to other phenomena. In effect, what counts as sufficiently reliable inter-annotator agreement depends on the use the annotated data will be put to.
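
    The abstract above does not spell out how MASI is computed. As a rough illustration only, the sketch below follows the commonly cited formulation, in which the Jaccard overlap of two label sets is scaled by a monotonicity weight (1 for identical sets, 2/3 when one set contains the other, 1/3 for partial overlap, 0 for disjoint sets) and the distance is one minus that product. The function name `masi_distance` and the handling of two empty sets are assumptions made for this example, not details taken from the paper.

```python
def masi_distance(a, b):
    """Return 1 - (Jaccard * monotonicity) for two sets of labels.

    Hypothetical helper for illustration; the weights follow the
    commonly cited MASI formulation, not text from the abstract.
    """
    a, b = frozenset(a), frozenset(b)
    union = a | b
    if not union:
        # Assumption: two empty sets count as identical.
        return 0.0
    inter = a & b
    jaccard = len(inter) / len(union)
    if a == b:
        monotonicity = 1.0       # identical sets
    elif a <= b or b <= a:
        monotonicity = 2 / 3     # one set contains the other
    elif inter:
        monotonicity = 1 / 3     # overlap, but neither contains the other
    else:
        monotonicity = 0.0       # disjoint sets
    return 1.0 - jaccard * monotonicity
```

    Under these assumptions, two annotators who pick {1, 2} and {1, 2, 3} are penalized less than two who pick {1, 2} and {2, 3}, because the first pair differs only by an omission.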

    Measuring the differences between human-human and human-machine dialogs

    In this paper, we assess the applicability of user simulation techniques to generate dialogs that are similar to real human-machine spoken interactions. To do so, we present the results of a comparison between three corpora acquired by means of different techniques. The first corpus was acquired with real users. A statistical user simulation technique has been applied to the same task to acquire the second corpus. In this technique, the next user answer is selected by means of a classification process that takes into account the previous dialog history, the lexical information in the clause, and the subtask of the dialog to which it contributes. Finally, a dialog simulation technique has been developed for the acquisition of the third corpus. This technique uses a random selection of the user and system turns, defining stop conditions for automatically deciding whether the simulated dialog is successful or not. We use several evaluation measures proposed in previous research to compare our three acquired corpora, and then discuss the similarities and differences with regard to these measures.

    Cue Phrase Classification Using Machine Learning

    Cue phrases may be used in a discourse sense to explicitly signal discourse structure, but also in a sentential sense to convey semantic rather than structural information. Correctly classifying cue phrases as discourse or sentential is critical in natural language processing systems that exploit discourse structure, e.g., for performing tasks such as anaphora resolution and plan recognition. This paper explores the use of machine learning for classifying cue phrases as discourse or sentential. Two machine learning programs (Cgrendel and C4.5) are used to induce classification models from sets of pre-classified cue phrases and their features in text and speech. Machine learning is shown to be an effective technique not only for automating the generation of classification models, but also for improving upon previous results. When compared to manually derived classification models already in the literature, the learned models often perform with higher accuracy and contain new linguistic insights into the data. In addition, the ability to automatically construct classification models makes it easier to comparatively analyze the utility of alternative feature representations of the data. Finally, the ease of retraining makes the learning approach more scalable and flexible than manual methods. Comment: 42 pages, uses jair.sty, theapa.bst, theapa.st

    Inter-Coder Agreement for Computational Linguistics

    This article is a survey of methods for measuring agreement among corpus annotators. It exposes the mathematics and underlying assumptions of agreement coefficients, covering Krippendorff's alpha as well as Scott's pi and Cohen's kappa; discusses the use of coefficients in several annotation tasks; and argues that weighted, alpha-like coefficients, traditionally less used than kappa-like measures in computational linguistics, may be more appropriate for many corpus annotation tasks, but that their use makes the interpretation of the value of the coefficient even harder.
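
    To make the difference between two of the coefficients named above concrete, here is a minimal sketch for two annotators labelling the same items. Both coefficients have the form (A_o - A_e) / (1 - A_e); they differ only in how the expected agreement A_e is estimated. Cohen's kappa uses each annotator's own label distribution, while Scott's pi uses a single distribution pooled over both annotators. The function names are hypothetical, and the code is an illustration of the standard textbook definitions rather than code from the survey.

```python
from collections import Counter


def observed_agreement(c1, c2):
    """Fraction of items on which the two annotators chose the same label."""
    return sum(x == y for x, y in zip(c1, c2)) / len(c1)


def cohen_kappa(c1, c2):
    """Chance-corrected agreement using per-annotator label distributions."""
    n = len(c1)
    a_o = observed_agreement(c1, c2)
    f1, f2 = Counter(c1), Counter(c2)
    # Expected agreement: product of each annotator's own label proportions.
    a_e = sum(f1[k] * f2[k] for k in f1.keys() | f2.keys()) / (n * n)
    # Note: undefined (division by zero) when a_e == 1.
    return (a_o - a_e) / (1 - a_e)


def scott_pi(c1, c2):
    """Chance-corrected agreement using one pooled label distribution."""
    n = len(c1)
    a_o = observed_agreement(c1, c2)
    pooled = Counter(c1) + Counter(c2)
    # Expected agreement: squared proportions over all 2n judgments.
    a_e = sum((count / (2 * n)) ** 2 for count in pooled.values())
    # Note: undefined (division by zero) when a_e == 1.
    return (a_o - a_e) / (1 - a_e)
```

    For the label sequences ["a", "a", "b", "b"] and ["a", "a", "b", "a"], observed agreement is 0.75 in both cases, but kappa comes out at 0.5 while pi is 7/15, because the annotators' individual label distributions differ.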

    Computer-based support for patients with limited English

    The paper describes a proposal for computer-based aids for patients with limited or no English. The paper describes the barriers to health-care experienced due to linguistic problems, then suggests some computer-based remedies incorporating a multi-engine machine translation system based on a corpus of doctor-patient interviews which provides a dialogue model for the system. The doctor's and patient's interfaces are described. Ideas from Augmentative and Alternative Communication and in particular picture-based communication are incorporated. The initial proposal will focus on Urdu- and Somali-speaking patients with respiratory problems.

    The relationship between prosody and the structure of spontaneous narratives: a perceptual study [A relação entre a prosódia e a estrutura de narrativas espontâneas: um estudo perceptual]

    The main aim of this study was to examine to what extent the structure of spontaneous oral narratives is recognized by inexperienced, lay examiners and, as a corollary, to show what role prosodic cues play in this recognition process. To this end, an experimental protocol was developed in which narratives were presented under different conditions (text only, text and audio, audio only, and filtered audio only). Forty-eight informants took part in the study. The results indicate that, even without access to lexical, syntactic, and semantic information, people consistently identify a discourse structure in spontaneous narratives, which suggests that prosody plays a relevant role in the perception of discourse structure in spontaneous oral narratives.