11 research outputs found

    A stand-off XML-TEI representation of reference annotation

    In this poster, we present an XML-TEI conformant stand-off representation of reference in discourse, building on the seminal work carried out in the MATE project (Poesio, Bruneseaux & Romary 1999) and the earlier proposal on a reference annotation framework in Salmon-Alt & Romary (2005). We make a three-way distinction between markables (the referring expressions), discourse entities (referents in the textual or extra-textual world), and links (relations that hold between referents, e.g., part-whole). Our approach differs from previous suggestions in that (i) inherent properties of the referent itself (e.g., animacy) are disentangled from the expressions used to refer to that referent, (ii) existing annotations from other layers such as morphosyntax are cleanly separated from the annotation of reference, but can be combined in queries, and (iii) our proposal is integrated into the larger structure of existing TEI-ISO standards, thereby allowing for compatibility with existing TEI-encoded corpora and for data sustainability. The workflow of adding reference annotations to an existing corpus will be demonstrated with concrete examples from ongoing work in the SFB 1252 (subprojects C01 and INF), where this representation of reference is the backbone for the annotation of (sentence) topic chains in dialogue data and for queries of topics in various grammatical constructions.
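The three-way split described in the abstract (markables, discourse entities, links) can be sketched as a stand-off TEI fragment. This is a hypothetical illustration only: the elements used (`standOff`, `spanGrp`, `fs`, `linkGrp`) exist in TEI P5, but the identifiers, attribute values, and type names here are our own assumptions, not the project's actual schema.

```xml
<!-- Base text with a pre-existing tokenisation layer -->
<text>
  <s xml:id="s1"><w xml:id="w1">The</w> <w xml:id="w2">engine</w> ...</s>
</text>

<!-- Stand-off layer pointing back into the text -->
<standOff>
  <!-- markable: the referring expression "The engine" -->
  <spanGrp type="markables">
    <span xml:id="m1" from="#w1" to="#w2"/>
  </spanGrp>
  <!-- discourse entity: the referent, carrying its inherent
       properties (e.g. animacy) separately from any expression -->
  <fs xml:id="e1" type="discourse-entity">
    <f name="animacy"><symbol value="inanimate"/></f>
  </fs>
  <!-- links: relations between markables and referents, and
       between referents (e.g. part-whole; #e2 would be another
       entity defined elsewhere) -->
  <linkGrp type="reference">
    <link type="refersTo" target="#m1 #e1"/>
    <link type="partOf" target="#e1 #e2"/>
  </linkGrp>
</standOff>
```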


    What can Neural Referential Form Selectors Learn?

    Despite achieving encouraging results, neural Referring Expression Generation models are often thought to lack transparency. We probed neural Referential Form Selection (RFS) models to find out to what extent the linguistic features influencing the RE form are learned and captured by state-of-the-art RFS models. The results of 8 probing tasks show that all the defined features were learned to some extent. The probing tasks pertaining to referential status and syntactic position exhibited the highest performance. The lowest performance was achieved by the probing models designed to predict discourse structure properties beyond the sentence level.
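The probing methodology the abstract describes can be sketched as a linear classifier trained on frozen model representations to recover a linguistic feature. Everything below is illustrative: the "embeddings" are random stand-ins for real RFS model states, and the binary label is a synthetic proxy for a feature such as referential status.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 mentions, 64-dim frozen representations, and a
# synthetic binary label (e.g. discourse-old vs discourse-new) that is
# linearly recoverable from the vectors by construction.
X = rng.normal(size=(200, 64))
w_true = rng.normal(size=64)
y = (X @ w_true > 0).astype(int)

def train_probe(X, y, lr=0.1, epochs=200):
    """Logistic-regression probe trained by gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient step on weights
        b -= lr * np.mean(p - y)                # gradient step on bias
    return w, b

w, b = train_probe(X, y)
acc = np.mean(((X @ w + b) > 0).astype(int) == y)
# High probe accuracy suggests the feature is linearly recoverable from
# the representations; chance-level accuracy suggests it is not encoded.
print(f"probe accuracy: {acc:.2f}")
```

In the paper's setting the labels would come from annotated corpora rather than being synthesized, and one probe would be trained per linguistic feature.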

    Neural referential form selection: Generalisability and interpretability

    In recent years, a range of Neural Referring Expression Generation (REG) systems have been built and they have often achieved encouraging results. However, these models are often thought to lack transparency and generality. Firstly, it is hard to understand what these neural REG models can learn and to compare their performance with existing linguistic theories. Secondly, it is unclear whether they can generalise to data in different text genres and different languages. To answer these questions, we propose to focus on a sub-task of REG: Referential Form Selection (RFS). We introduce the task of RFS and a series of neural RFS models built on state-of-the-art neural REG models. To address the issue of interpretability, we probe these RFS models using probing classifiers that consider information known to impact the human choice of Referential Forms. To address the issue of generalisability, we assess the performance of RFS models on multiple datasets in multiple genres and two different languages, namely, English and Chinese.
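The RFS task itself can be framed as a small classification problem: given features of the discourse context, pick one of a closed set of referential forms. The sketch below uses a toy count-based model; the feature names, form inventory, and data are illustrative assumptions, not the paper's actual setup.

```python
from collections import Counter, defaultdict

# Illustrative inventory of referential forms to choose between.
FORMS = ["pronoun", "proper_name", "description"]

def featurize(mention):
    # Context features of the kind known to influence human form choice,
    # e.g. referential status and syntactic position (names assumed).
    return (mention["discourse_old"], mention["is_subject"])

def train(data):
    """Map each feature combination to its most frequent gold form."""
    counts = defaultdict(Counter)
    for mention, form in data:
        counts[featurize(mention)][form] += 1
    return {feats: c.most_common(1)[0][0] for feats, c in counts.items()}

def predict(model, mention, default="description"):
    """Fall back to a full description for unseen feature combinations."""
    return model.get(featurize(mention), default)

# Tiny hand-made training set, purely for illustration.
toy = [
    ({"discourse_old": True, "is_subject": True}, "pronoun"),
    ({"discourse_old": True, "is_subject": True}, "pronoun"),
    ({"discourse_old": False, "is_subject": False}, "description"),
]
model = train(toy)
print(predict(model, {"discourse_old": True, "is_subject": True}))  # pronoun
```

The neural RFS models in the paper replace the count table with learned representations, which is what makes the probing question above interesting.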

    Non-neural Models Matter: a Re-evaluation of Neural Referring Expression Generation Systems

    In recent years, neural models have often outperformed rule-based and classic Machine Learning approaches in NLG. These classic approaches are now often disregarded, for example when new neural models are evaluated. We argue that they should not be overlooked, since, for some tasks, well-designed non-neural approaches achieve better performance than neural ones. In this paper, the task of generating referring expressions in linguistic context is used as an example. We examined two very different English datasets (WEBNLG and WSJ), and evaluated each algorithm using both automatic and human evaluations. Overall, the results of these evaluations suggest that rule-based systems with simple rule sets achieve performance on par with or better than state-of-the-art neural REG systems on both datasets. In the case of the more realistic dataset, WSJ, a machine learning-based system with well-designed linguistic features performed best. We hope that our work can encourage researchers to consider non-neural models in the future. Comment: ACL 202
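A "simple rule set" of the kind the abstract refers to can be sketched in a few lines. The rule below (pronominalise a recently mentioned referent when no competing referent intervenes) is a common recency-based heuristic and is our own illustrative assumption, not the exact rule set evaluated in the paper.

```python
def choose_form(referent, sentence_idx, last_mention, competitors):
    """Return 'pronoun', 'short_np', or 'full_np' for a mention.

    last_mention: sentence index of the referent's previous mention,
                  or None if this is a first mention.
    competitors:  True if another salient referent intervenes.
    """
    if last_mention is None:
        return "full_np"            # first mention: full description
    distance = sentence_idx - last_mention
    if distance <= 1 and not competitors:
        return "pronoun"            # recent and unambiguous: pronoun
    if distance <= 2:
        return "short_np"           # fairly recent: reduced NP
    return "full_np"                # long gap: reintroduce fully

print(choose_form("Mary", 3, 2, competitors=False))  # pronoun
```

The appeal of such systems, as the abstract argues, is that each decision is directly inspectable, which makes them a strong and transparent baseline against neural REG models.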
