2,634 research outputs found
Time Aware Knowledge Extraction for Microblog Summarization on Twitter
Microblogging services like Twitter and Facebook collect millions of user
generated content every moment about trending news, occurring events, and so
on. Nevertheless, it is really a nightmare to find information of interest
through the huge amount of available posts that are often noise and redundant.
In general, social media analytics services have caught increasing attention
from both side research and industry. Specifically, the dynamic context of
microblogging requires to manage not only meaning of information but also the
evolution of knowledge over the timeline. This work defines Time Aware
Knowledge Extraction (briefly TAKE) methodology that relies on temporal
extension of Fuzzy Formal Concept Analysis. In particular, a microblog
summarization algorithm has been defined filtering the concepts organized by
TAKE in a time-dependent hierarchy. The algorithm addresses topic-based
summarization on Twitter. Besides considering the timing of the concepts,
another distinguish feature of the proposed microblog summarization framework
is the possibility to have more or less detailed summary, according to the
user's needs, with good levels of quality and completeness as highlighted in
the experimental results.Comment: 33 pages, 10 figure
Adaptive Representations for Tracking Breaking News on Twitter
Twitter is often the most up-to-date source for finding and tracking breaking
news stories. Therefore, there is considerable interest in developing filters
for tweet streams in order to track and summarize stories. This is a
non-trivial text analytics task as tweets are short, and standard retrieval
methods often fail as stories evolve over time. In this paper we examine the
effectiveness of adaptive mechanisms for tracking and summarizing breaking news
stories. We evaluate the effectiveness of these mechanisms on a number of
recent news events for which manually curated timelines are available.
Assessments based on ROUGE metrics indicate that an adaptive approaches are
best suited for tracking evolving stories on Twitter.Comment: 8 Pag
Tracking the History and Evolution of Entities: Entity-centric Temporal Analysis of Large Social Media Archives
How did the popularity of the Greek Prime Minister evolve in 2015? How did
the predominant sentiment about him vary during that period? Were there any
controversial sub-periods? What other entities were related to him during these
periods? To answer these questions, one needs to analyze archived documents and
data about the query entities, such as old news articles or social media
archives. In particular, user-generated content posted in social networks, like
Twitter and Facebook, can be seen as a comprehensive documentation of our
society, and thus meaningful analysis methods over such archived data are of
immense value for sociologists, historians and other interested parties who
want to study the history and evolution of entities and events. To this end, in
this paper we propose an entity-centric approach to analyze social media
archives and we define measures that allow studying how entities were reflected
in social media in different time periods and under different aspects, like
popularity, attitude, controversiality, and connectedness with other entities.
A case study using a large Twitter archive of four years illustrates the
insights that can be gained by such an entity-centric and multi-aspect
analysis.Comment: This is a preprint of an article accepted for publication in the
International Journal on Digital Libraries (2018
The Development of a Temporal Information Dictionary for Social Media Analytics
Dictionaries have been used to analyze text even before the emergence of social media and the use of dictionaries for sentiment analysis there. While dictionaries have been used to understand the tonality of text, so far it has not been possible to automatically detect if the tonality refers to the present, past, or future. In this research, we develop a dictionary containing time-indicating words in a wordlist (T-wordlist). To test how the dictionary performs, we apply our T-wordlist on different disaster related social media datasets. Subsequently we will validate the wordlist and results by a manual content analysis. So far, in this research-in-progress, we were able to develop a first dictionary and will also provide some initial insight into the performance of our wordlist
Explicit diversification of event aspects for temporal summarization
During major events, such as emergencies and disasters, a large volume of information is reported on newswire and social media platforms. Temporal summarization (TS) approaches are used to automatically produce concise overviews of such events by extracting text snippets from related articles over time. Current TS approaches rely on a combination of event relevance and textual novelty for snippet selection. However, for events that span multiple days, textual novelty is often a poor criterion for selecting snippets, since many snippets are textually unique but are semantically redundant or non-informative. In this article, we propose a framework for the diversification of snippets using explicit event aspects, building on recent works in search result diversification. In particular, we first propose two techniques to identify explicit aspects that a user might want to see covered in a summary for different types of event. We then extend a state-of-the-art explicit diversification framework to maximize the coverage of these aspects when selecting summary snippets for unseen events. Through experimentation over the TREC TS 2013, 2014, and 2015 datasets, we show that explicit diversification for temporal summarization significantly outperforms classical novelty-based diversification, as the use of explicit event aspects reduces the amount of redundant and off-topic snippets returned, while also increasing summary timeliness
- …