12 research outputs found

    Combination Strategies for Semantic Role Labeling

    This paper introduces and analyzes a battery of inference models for the problem of semantic role labeling: one based on constraint satisfaction, and several strategies that model the inference as a meta-learning problem using discriminative classifiers. These classifiers are developed with a rich set of novel features that encode proposition and sentence-level information. To our knowledge, this is the first work that: (a) performs a thorough analysis of learning-based inference models for semantic role labeling, and (b) compares several inference strategies in this context. We evaluate the proposed inference strategies in the framework of the CoNLL-2005 shared task using only automatically generated syntactic information. The extensive experimental evaluation and analysis indicate that all the proposed inference strategies are successful (they all outperform the current best results reported in the CoNLL-2005 evaluation exercise), but each of the proposed approaches has its advantages and disadvantages. Several important traits of a state-of-the-art SRL combination strategy emerge from this analysis: (i) individual models should be combined at the granularity of candidate arguments rather than at the granularity of complete solutions; (ii) the best combination strategy uses an inference model based on learning; and (iii) the learning-based inference benefits from max-margin classifiers and global feedback.
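
    As a rough illustration of trait (i), combining models at the granularity of candidate arguments, the sketch below pools scored candidates from several SRL models and keeps a non-overlapping set. It is not the paper's method: the tuple layout, the score-summing vote, the greedy selection, and the name `combine_candidates` are all assumptions made for this example, and the greedy step only stands in for a real constraint-satisfaction or learned inference model.

```python
from collections import defaultdict

def combine_candidates(model_outputs):
    """Pool candidate arguments from several models, then greedily keep
    the highest-voted candidates whose token spans do not overlap.

    Each candidate is (start, end, label, score) with inclusive token
    indices. This layout is hypothetical, chosen for the example.
    """
    votes = defaultdict(float)
    for output in model_outputs:
        for start, end, label, score in output:
            # Identical (span, label) candidates reinforce each other.
            votes[(start, end, label)] += score

    chosen = []
    # Visit candidates from highest to lowest total score.
    for (start, end, label), _total in sorted(votes.items(), key=lambda kv: -kv[1]):
        overlaps = any(start <= e and s <= end for s, e, _ in chosen)
        if not overlaps:  # simple non-overlap constraint on arguments
            chosen.append((start, end, label))
    return sorted(chosen)

# Three hypothetical models that disagree on the extent of A1.
model_a = [(0, 1, "A0", 0.9), (3, 5, "A1", 0.6)]
model_b = [(0, 1, "A0", 0.8), (3, 4, "A1", 0.7)]
model_c = [(3, 5, "A1", 0.5), (6, 7, "AM-TMP", 0.4)]
combined = combine_candidates([model_a, model_b, model_c])
```

    Here no single model proposes all three arguments; the combined solution takes A0 and the longer A1 span from the pooled votes and AM-TMP from the third model, which is something a combination over complete solutions could not produce.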

    Event Coreference Resolution by Iteratively Unfolding Inter-dependencies among Events

    We introduce a novel iterative approach for event coreference resolution that gradually builds event clusters by exploiting inter-dependencies among event mentions within the same chain as well as across event chains. Among event mentions in the same chain, we distinguish within-document (WD) and cross-document (CD) event coreference links by using two distinct pairwise classifiers, trained separately to capture differences in feature distributions of within- and cross-document event clusters. Our event coreference approach alternates between WD and CD clustering and combines arguments from both event clusters after every merge, continuing until no more merges can be made. It then performs further merging between event chains that are both closely related to a set of other chains of events. Experiments on the ECB+ corpus show that our model outperforms state-of-the-art methods in the joint task of WD and CD event coreference resolution. Comment: EMNLP 201
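
    The alternation between WD and CD clustering described above can be sketched as follows. This is a minimal toy under stated assumptions, not the authors' system: mentions are hypothetical `(doc_id, lemma, arguments)` tuples, `toy_score` is an invented overlap heuristic standing in for the two trained pairwise classifiers, "WD scope" is approximated as "the two clusters share a document", and the final merging step over related chains is omitted.

```python
from itertools import combinations

def docs(cluster):
    return {doc for doc, _, _ in cluster}

def pooled_args(cluster):
    # Arguments are combined across all mentions of a cluster, so every
    # merge enriches the evidence available to later merges.
    return set().union(*(args for _, _, args in cluster))

def toy_score(a, b):
    """Hypothetical stand-in for a trained pairwise classifier."""
    if not {l for _, l, _ in a} & {l for _, l, _ in b}:
        return 0.0
    args_a, args_b = pooled_args(a), pooled_args(b)
    if not args_a or not args_b:
        return 0.0
    return len(args_a & args_b) / min(len(args_a), len(args_b))

def merge_pass(clusters, in_scope, score, threshold):
    """Repeatedly merge the best-scoring in-scope pair above threshold."""
    merged_any = False
    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            if not in_scope(clusters[i], clusters[j]):
                continue
            s = score(clusters[i], clusters[j])
            if s >= threshold and (best is None or s > best[0]):
                best = (s, i, j)
        if best is None:
            return clusters, merged_any
        _, i, j = best
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        merged_any = True

def resolve(mentions, wd_score=toy_score, cd_score=toy_score, threshold=0.5):
    clusters = [frozenset([m]) for m in mentions]
    same_doc = lambda a, b: bool(docs(a) & docs(b))
    cross_doc = lambda a, b: not same_doc(a, b)
    changed = True
    while changed:  # alternate WD and CD passes until neither merges
        clusters, wd = merge_pass(clusters, same_doc, wd_score, threshold)
        clusters, cd = merge_pass(clusters, cross_doc, cd_score, threshold)
        changed = wd or cd
    return clusters

m1 = ("d1", "attack", frozenset({"rebels"}))
m2 = ("d1", "attack", frozenset({"village"}))
m3 = ("d2", "attack", frozenset({"rebels", "village"}))
m4 = ("d2", "election", frozenset({"mayor"}))
clusters = resolve([m1, m2, m3, m4])
```

    In this toy run, `m1` and `m2` share no arguments, so the first WD pass cannot link them. The CD pass merges `m1` with `m3`, and the pooled arguments then let the next WD pass attach `m2`, which is the kind of inter-dependency the abstract describes.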

    The Coreference Problem and a Model for Codifying Clinical Information

    Applied problems in the informatization of the diagnostic and treatment process are considered, and the characteristics of clinical symptoms and syndromes (nosological forms) as information objects of databases are investigated

    Automatic Semantic Role Labeling of Hungarian Business Texts Using a Dependency Parser

    In this study we present our machine learning approach, based on a rich feature space, which can automatically label semantic roles in Hungarian texts using a dependency parser. Our work focuses on the frame of corporate acquisitions and ownership changes. Our feature set comprises surface features, morphological features, and features extracted from the dependency parse. We supplemented these basic features with statistical ratios computed from them. We examined how the model performs on a single frequent target word on its own, and also on a group of target words collected into frames.

    Generalizing Cross-Document Event Coreference Resolution Across Multiple Corpora

    Cross-document event coreference resolution (CDCR) is an NLP task in which mentions of events need to be identified and clustered throughout a collection of documents. CDCR aims to benefit downstream multi-document applications, but despite recent progress on corpora and system development, downstream improvements from applying CDCR have not been shown yet. We make the observation that every CDCR system to date was developed, trained, and tested only on a single respective corpus. This raises strong concerns about their generalizability, a must-have for downstream applications where the magnitude of domains or event mentions is likely to exceed those found in a curated corpus. To investigate this assumption, we define a uniform evaluation setup involving three CDCR corpora: ECB+, the Gun Violence Corpus, and the Football Coreference Corpus (which we reannotate on token level to make our analysis possible). We compare a corpus-independent, feature-based system against a recent neural system developed for ECB+. Whilst being inferior in absolute numbers, the feature-based system shows more consistent performance across all corpora, whereas the neural system is hit-and-miss. Via model introspection, we find that the importance of event actions, event time, etc. for resolving coreference in practice varies greatly between the corpora. Additional analysis shows that several systems overfit on the structure of the ECB+ corpus. We conclude with recommendations on how to achieve generally applicable CDCR systems in the future, the most important being that evaluation on multiple CDCR corpora is strongly necessary. To facilitate future research, we release our dataset, annotation guidelines, and system implementation to the public. Comment: Accepted at CL Journal

    XI. Magyar Számítógépes Nyelvészeti Konferencia
