    Information extraction tools and methods for understanding dialogue in a companion

    The authors' research was sponsored by the European Commission under EC grant IST-FP6-034434 (Companions). This paper discusses how Information Extraction is used to understand and manage dialogue in the EU-funded Companions project, with respect to the Senior Companion, one of two applications under development in the project. Over the last few years, research in human-computer dialogue systems has increased, and much attention has focused on applying learning methods to improve a key part of any dialogue system, namely the dialogue manager. Since the dialogue manager in all dialogue systems relies heavily on the quality of the semantic interpretation of the user's utterance, our research in the Companions project focuses on how to improve the semantic interpretation and combine it with knowledge from the Knowledge Base to increase the performance of the Dialogue Manager. Traditionally, the semantic interpretation of a user utterance is handled by a natural language understanding (NLU) module which embodies a variety of natural language processing techniques, from sentence splitting to full parsing. In this paper we discuss the use of a variety of NLU processes, and in particular Information Extraction, as a key part of the NLU module in order to improve the performance of the dialogue manager and hence the overall dialogue system. Peer reviewed.

    A Cluster Ranking Model for Full Anaphora Resolution

    Anaphora resolution (coreference) systems designed for the CONLL 2012 dataset typically cannot handle key aspects of the full anaphora resolution task such as the identification of singletons and of certain types of non-referring expressions (e.g., expletives), as these aspects are not annotated in that corpus. However, the recently released CRAC 2018 Shared Task and Phrase Detectives (PD) datasets can now be used for that purpose. In this paper, we introduce an architecture to simultaneously identify non-referring expressions (including expletives, predicative NPs, and other types) and build coreference chains, including singletons. Our cluster-ranking system uses an attention mechanism to determine the relative importance of the mentions in the same cluster. Additional classifiers are used to identify singletons and non-referring markables. Our contributions are as follows. First of all, we report the first result on the CRAC data using system mentions; our result is 5.8% better than the shared task baseline system, which used gold mentions. Our system also outperforms the best-reported system on PD by up to 5.3%. Second, we demonstrate that the availability of singleton clusters and non-referring expressions can lead to substantially improved performance on non-singleton clusters as well. Third, we show that despite our model not being designed specifically for the CONLL data, it achieves a very competitive result.

    Neural Mention Detection

    Mention detection is an important preprocessing step for annotation and interpretation in applications such as NER and coreference resolution, but few stand-alone neural models able to handle the full range of mentions have been proposed. In this work, we propose and compare three neural network-based approaches to mention detection. The first approach is based on the mention detection part of a state-of-the-art coreference resolution system; the second uses ELMO embeddings together with a bidirectional LSTM and a biaffine classifier; the third approach uses the recently introduced BERT model. Our best model (using a biaffine classifier) achieves gains of up to 1.8 percentage points on mention recall when compared with a strong baseline in a HIGH RECALL coreference annotation setting. The same model achieves improvements of up to 5.3 and 6.2 p.p. when compared with the best-reported mention detection F1 on the CONLL and CRAC coreference data sets respectively in a HIGH F1 annotation setting. We then evaluate our models for coreference resolution by using mentions predicted by our best model in state-of-the-art coreference systems. The enhanced model achieved absolute improvements of up to 1.7 and 0.7 p.p. when compared with our strong baseline systems (pipeline system and end-to-end system) respectively. For nested NER, the evaluation of our model on the GENIA corpora shows that our model matches or outperforms state-of-the-art models despite not being specifically designed for this task.

    CoSimLex: A Resource for Evaluating Graded Word Similarity in Context

    State-of-the-art natural language processing tools are built on context-dependent word embeddings, but no direct method for evaluating these representations currently exists. Standard tasks and datasets for intrinsic evaluation of embeddings are based on judgements of similarity, but ignore context; standard tasks for word sense disambiguation take account of context but do not provide continuous measures of meaning similarity. This paper describes an effort to build a new dataset, CoSimLex, intended to fill this gap. Building on the standard pairwise similarity task of SimLex-999, it provides context-dependent similarity measures; covers not only discrete differences in word sense but more subtle, graded changes in meaning; and covers not only a well-resourced language (English) but a number of less-resourced languages. We define the task and evaluation metrics, outline the dataset collection methodology, and describe the status of the dataset so far. Peer reviewed.

    Italian via email: From an online project of learning and teaching towards the development of a multi‐cultural discourse community

    This paper seeks to illustrate how the use of Internet resources (specifically email and the Web) can affect and enhance language learning and cultural understanding, modify the learning environment, reduce the barriers which time, space and societal differences may create, be a source of motivation, and redefine the role of teachers and learners. Although it is based on an on‐going project, it already provides practical evidence of some advantages email and Internet resources can bring to the language learner and to the teacher. A detailed evaluation of the language outcomes is under way, but incomplete at the time of writing. This paper is nevertheless more concerned with other variables of language learning and teaching which the author considers fundamental to reaching a successful degree of language use.

    Contextual bitext-derived paraphrases in automatic MT evaluation

    In this paper we present a novel method for deriving paraphrases during automatic MT evaluation using only the source and reference texts, which are necessary for the evaluation, and word and phrase alignment software. Using target language paraphrases produced through word and phrase alignment, a number of alternative reference sentences are constructed automatically for each candidate translation. The method produces lexical and low-level syntactic paraphrases that are relevant to the domain at hand, does not use external knowledge resources, and can be combined with a variety of automatic MT evaluation systems.