404 research outputs found

    The Development of a Temporal Information Dictionary for Social Media Analytics

    Dictionaries were used to analyze text well before social media emerged and before they were applied to sentiment analysis there. While dictionaries have been used to understand the tonality of text, it has so far not been possible to detect automatically whether that tonality refers to the present, past, or future. In this research, we develop a dictionary of time-indicating words in the form of a wordlist (T-wordlist). To test how the dictionary performs, we apply our T-wordlist to different disaster-related social media datasets. Subsequently, we will validate the wordlist and the results through a manual content analysis. So far, in this research-in-progress, we have developed a first dictionary, and we also provide some initial insight into the performance of our wordlist.
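
    As a rough illustration of the dictionary approach described in this abstract, the sketch below tags a short post as past, present, or future by counting wordlist hits. The word sets, the function names, and the tie-breaking rule are all hypothetical; the actual T-wordlist and its scoring are not given in the abstract.

```python
import re
from collections import Counter

# Hypothetical toy wordlist for illustration only; NOT the paper's T-wordlist.
T_WORDLIST = {
    "past": {"yesterday", "ago", "was", "were", "had", "happened"},
    "present": {"now", "today", "currently", "is", "are"},
    "future": {"tomorrow", "will", "soon", "upcoming", "later"},
}


def temporal_counts(text):
    """Count how many tokens of the text fall into each temporal category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for label, words in T_WORDLIST.items():
        counts[label] = sum(1 for token in tokens if token in words)
    return counts


def temporal_label(text):
    """Return the dominant category, 'mixed' on a tie, or 'none' without hits."""
    counts = temporal_counts(text)
    top = max(counts.values())
    if top == 0:
        return "none"
    winners = [label for label, n in counts.items() if n == top]
    return winners[0] if len(winners) == 1 else "mixed"


print(temporal_label("The flood will reach the city tomorrow"))
```

    A plain count like this ignores negation and context, which is one reason a manual content analysis of the kind the abstract mentions would still be needed to validate the output.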

    Explicit diversification of event aspects for temporal summarization

    During major events, such as emergencies and disasters, a large volume of information is reported on newswire and social media platforms. Temporal summarization (TS) approaches are used to automatically produce concise overviews of such events by extracting text snippets from related articles over time. Current TS approaches rely on a combination of event relevance and textual novelty for snippet selection. However, for events that span multiple days, textual novelty is often a poor criterion for selecting snippets, since many snippets are textually unique but semantically redundant or non-informative. In this article, we propose a framework for the diversification of snippets using explicit event aspects, building on recent work in search result diversification. In particular, we first propose two techniques to identify explicit aspects that a user might want to see covered in a summary for different types of event. We then extend a state-of-the-art explicit diversification framework to maximize the coverage of these aspects when selecting summary snippets for unseen events. Through experimentation over the TREC TS 2013, 2014, and 2015 datasets, we show that explicit diversification for temporal summarization significantly outperforms classical novelty-based diversification, as the use of explicit event aspects reduces the number of redundant and off-topic snippets returned, while also increasing summary timeliness.
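
    To make the idea of aspect-based snippet selection concrete, the sketch below shows a generic greedy selection in the style of explicit search result diversification: each step picks the snippet that best balances relevance against coverage of aspects not yet covered. It is an assumption that this resembles the framework the authors extend; the function name, the aspect names, the scores, and the trade-off parameter `lam` are all hypothetical.

```python
def diversify(candidates, aspect_weights, k, lam=0.5):
    """Greedily pick up to k snippet ids, trading relevance against aspect coverage.

    candidates: list of (snippet_id, relevance, {aspect: coverage in [0, 1]})
    aspect_weights: {aspect: importance weight}
    lam: 0.0 ranks by relevance only, 1.0 by aspect coverage only
    """
    selected = []
    # For each aspect, the probability that no selected snippet covers it yet.
    uncovered = {aspect: 1.0 for aspect in aspect_weights}
    remaining = list(candidates)

    while remaining and len(selected) < k:

        def score(candidate):
            _, relevance, coverage = candidate
            diversity = sum(
                weight * coverage.get(aspect, 0.0) * uncovered[aspect]
                for aspect, weight in aspect_weights.items()
            )
            return (1 - lam) * relevance + lam * diversity

        best = max(remaining, key=score)
        remaining.remove(best)
        selected.append(best[0])
        # Discount aspects that the chosen snippet already covers.
        for aspect in uncovered:
            uncovered[aspect] *= 1.0 - best[2].get(aspect, 0.0)

    return selected


# Hypothetical snippets: s1 and s2 cover the same aspect, s3 a different one.
snippets = [
    ("s1", 0.90, {"casualties": 0.9}),
    ("s2", 0.85, {"casualties": 0.9}),
    ("s3", 0.60, {"response": 0.8}),
]
weights = {"casualties": 0.5, "response": 0.5}
print(diversify(snippets, weights, k=2))
```

    With `lam=0.5` the second pick is `s3` rather than the more relevant but redundant `s2`; setting `lam=0.0` falls back to a pure relevance ranking.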

    A Survey on Event-based News Narrative Extraction

    Narratives are fundamental to our understanding of the world, providing us with a natural structure for knowledge representation over time. Computational narrative extraction is a subfield of artificial intelligence that makes heavy use of information retrieval and natural language processing techniques. Despite the importance of computational narrative extraction, relatively little scholarly work exists on synthesizing previous research and strategizing future research in the area. In particular, this article focuses on extracting news narratives from an event-centric perspective. Extracting narratives from news data has multiple applications in understanding the evolving information landscape. This survey presents an extensive study of research in the area of event-based news narrative extraction. In particular, we screened over 900 articles that yielded 54 relevant articles. These articles are synthesized and organized by representation model, extraction criteria, and evaluation approaches. Based on the reviewed studies, we identify recent trends, open challenges, and potential research lines.
    Comment: 37 pages, 3 figures, to be published in the journal ACM CSU

    Multi-document summarization using semantic discourse theories (Resumen multidocumento utilizando teorías semántico-discursivas)

    Automatic multi-document summarization aims at reducing the size of texts while preserving the important content. In this paper, we propose some methods for automatic summarization based on two semantic discourse models: Rhetorical Structure Theory (RST) and Cross-document Structure Theory (CST). These models are chosen in order to properly address the relevance of information, multi-document phenomena, and subtopical distribution in the source texts. The results show that using semantic discourse knowledge for content selection improves the informativeness of automatic summaries.

    Drafting Event Schemas using Language Models

    Past work has studied event prediction and event language modeling, sometimes mediated through structured representations of knowledge in the form of event schemas. Such schemas can lead to explainable predictions and forecasting of unseen events given incomplete information. In this work, we look at the process of creating such schemas to describe complex events. We use large language models (LLMs) to draft schemas directly in natural language, which can be further refined by human curators as necessary. Our focus is on whether we can achieve sufficient diversity and recall of key events and whether we can produce the schemas in a sufficiently descriptive style. We show that large language models are able to achieve moderate recall against schemas taken from two different datasets, with even better results when multiple prompts and multiple samples are combined. Moreover, we show that textual entailment methods can be used both for matching schemas to instances of events and for evaluating overlap between gold and predicted schemas. Our method paves the way for easier distillation of event knowledge from large language models into schemas.