10 research outputs found

Identity and Granularity of Events in Text

Author: Cybulska Agata
Vossen Piek
Publication date: 13/04/2017

In this paper we describe a method to detect event descriptions in different news articles and to model the semantics of events and their components using RDF representations. We compare these descriptions to solve a cross-document event coreference task. Our component approach to event semantics defines identity and granularity of events at different levels. It performs close to state-of-the-art approaches on the cross-document event coreference task, while outperforming other works when assuming similar quality of event detection. We demonstrate how granularity and identity are interconnected and we discuss how semantic anomaly could be used to define differences between coreference, subevent and topical relations.
Comment: Invited keynote speech by Piek Vossen at Cicling 201

arXiv.org e-Print Archive

VU Research Portal

Event coreference in the news: Who, what, where and when?

Author: Cybulska Agata Katarzyna
Publication venue: s.n.
Publication date: 15/04/2021

VU Research Portal

Event Coreference Resolution by Iteratively Unfolding Inter-dependencies among Events

Author: Choubey Prafulla Kumar
Huang Ruihong
Publication date: 01/01/2017

We introduce a novel iterative approach for event coreference resolution that gradually builds event clusters by exploiting inter-dependencies among event mentions within the same chain as well as across event chains. Among event mentions in the same chain, we distinguish within-document (WD) and cross-document (CD) event coreference links by using two distinct pairwise classifiers, trained separately to capture differences in feature distributions of within- and cross-document event clusters. Our event coreference approach alternates between WD and CD clustering and combines arguments from both event clusters after every merge, continuing until no more merges can be made. It then performs further merging between event chains that are both closely related to a set of other chains of events. Experiments on the ECB+ corpus show that our model outperforms state-of-the-art methods in the joint task of WD and CD event coreference resolution.
Comment: EMNLP 201

arXiv.org e-Print Archive

Crossref

Event-based Access to Historical Italian War Memoirs

Author: Nanni Federico
Ponzetto Simone Paolo
Rovera Marco
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2021

The progressive digitization of historical archives provides new, often domain-specific, textual resources that report on facts and events which have happened in the past; among these, memoirs are a very common type of primary source. In this paper, we present an approach for extracting information from Italian historical war memoirs and turning it into structured knowledge. This is based on the semantic notions of events, participants and roles. We quantitatively evaluate each of the key steps of our approach and provide a graph-based representation of the extracted knowledge, which allows moving between a Close and a Distant Reading of the collection.
Comment: 23 pages, 6 figure

arXiv.org e-Print Archive

MAnnheim DOCument Server

Generalizing Cross-Document Event Coreference Resolution Across Multiple Corpora

Author: Bugert Michael
Gurevych Iryna
Reimers Nils
Publication date: 10/06/2021

Cross-document event coreference resolution (CDCR) is an NLP task in which mentions of events need to be identified and clustered throughout a collection of documents. CDCR aims to benefit downstream multi-document applications, but despite recent progress on corpora and system development, downstream improvements from applying CDCR have not been shown yet. We make the observation that every CDCR system to date was developed, trained, and tested only on a single respective corpus. This raises strong concerns on their generalizability -- a must-have for downstream applications where the magnitude of domains or event mentions is likely to exceed those found in a curated corpus. To investigate this assumption, we define a uniform evaluation setup involving three CDCR corpora: ECB+, the Gun Violence Corpus and the Football Coreference Corpus (which we reannotate on token level to make our analysis possible). We compare a corpus-independent, feature-based system against a recent neural system developed for ECB+. Whilst being inferior in absolute numbers, the feature-based system shows more consistent performance across all corpora whereas the neural system is hit-and-miss. Via model introspection, we find that the importance of event actions, event time, etc. for resolving coreference in practice varies greatly between the corpora. Additional analysis shows that several systems overfit on the structure of the ECB+ corpus. We conclude with recommendations on how to achieve generally applicable CDCR systems in the future -- the most important being that evaluation on multiple CDCR corpora is strongly necessary. To facilitate future research, we release our dataset, annotation guidelines, and system implementation to the public.
Comment: Accepted at CL Journa

arXiv.org e-Print Archive

TUbiblio

NewsReader: Using knowledge resources in a cross-lingual reading machine to generate more knowledge from massive streams of news

Author: Agerri Rodrigo
Aldabe Itziar
Cybulska Agata
Fokkens Antske
Laparra Egoitz
Minard Anne-Lyse
Palmero Aprosio Alessio
Rigau German
Rospocher Marco
Segers Roxane
van Erp Marieke
Vossen Piek
Publication date: 01/01/2016

In this article, we describe a system that reads news articles in four different languages and detects what happened, who is involved, where and when. This event-centric information is represented as episodic situational knowledge on individuals in an interoperable RDF format that allows for reasoning on the implications of the events. Our system covers the complete path from unstructured text to structured knowledge, for which we defined a formal model that links interpreted textual mentions of things to their representation as instances. The model forms the skeleton for interoperable interpretation across different sources and languages. The real content, however, is defined using multilingual and cross-lingual knowledge resources, both semantic and episodic. We explain how these knowledge resources are used for the processing of text and ultimately define the actual content of the episodic situational knowledge that is reported in the news. The knowledge and model in our system can be seen as an example of how the Semantic Web helps NLP. However, our systems also generate massive episodic knowledge of the same type as the Semantic Web is built on. We thus envision a cycle of knowledge acquisition and NLP improvement on a massive scale. This article reports on the details of the system but also on the performance of various high-level components. We demonstrate that our system performs at state-of-the-art level for various subtasks in the four languages of the project, and we also consider the full integration of these tasks in an overall system with the purpose of reading text. We applied our system to millions of news articles, generating billions of triples expressing formal semantic properties. This shows the capacity of the system to perform at an unprecedented scale.

Elsevier - Publisher Connector

VU Research Portal

Archivio della ricerca - Fondazione Bruno Kessler

Catalogo dei prodotti della ricerca

Open Access Repository

“Bag of Events” Approach to Event Coreference Resolution. Supervised Classification of Event Templates

Author: Cybulska A.K.
Vossen P.T.J.M.
Publication date: 01/01/2015

VU Research Portal

“Bag of Events” Approach to Event Coreference Resolution. Supervised Classification of Event Templates

Author: Cybulska A.K.
Gelbukh Alexander
Vossen P.T.J.M.