Foreground and background text in retrieval
Our hypothesis is that certain clauses have foreground functions in text,
while other clauses have background functions and that these functions are
expressed or reflected in the syntactic structure of the clause.
Presumably these clauses will have differing utility for automatic
approaches to text understanding; a summarization system might want to
utilize background clauses to capture commonalities between numbers of
documents while an indexing system might use foreground clauses in order to
capture specific characteristics of a certain document.
One for All: Neural Joint Modeling of Entities and Events
Previous work on event extraction has mainly focused on the predictions
for event triggers and argument roles, treating entity mentions as being
provided by human annotators. This is unrealistic as entity mentions are
usually predicted by some existing toolkits whose errors might be propagated to
the event trigger and argument role recognition. A few recent works have
addressed this problem by jointly predicting entity mentions, event triggers
and arguments. However, such work is limited to using discrete engineering
features to represent contextual information for the individual tasks and their
interactions. In this work, we propose a novel model to jointly perform
predictions for entity mentions, event triggers and arguments based on the
shared hidden representations from deep learning. The experiments demonstrate
the benefits of the proposed method, leading to state-of-the-art
performance for event extraction.
Comment: Accepted at The Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19) (Honolulu, Hawaii, USA
Analysis of Twitter Data Using a Multiple-level Clustering Strategy
Twitter, currently the leading microblogging social network, has attracted a large body of research. This paper proposes a data analysis framework to discover groups of similar Twitter messages posted about a given event. By analyzing these groups, user emotions or thoughts that seem to be associated with specific events can be extracted, as well as aspects characterizing events according to user perception. To deal with the inherent sparseness of micro-messages, the proposed approach relies on a multiple-level strategy that allows clustering text data with a variable distribution. Clusters are then characterized through the most representative words appearing in their messages, and association rules are used to highlight correlations among these words. To measure the relevance of specific words for a given event, text data has been represented in the Vector Space Model using the TF-IDF weighting score. As a case study, two real Twitter datasets have been analyse
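The TF-IDF weighting mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant the authors used, so this assumes a common one: term frequency normalized by message length, multiplied by log(N / document frequency). The function name `tf_idf` and the toy messages are hypothetical and are not taken from the paper or its datasets.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant (not confirmed by the abstract):
    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy messages about one event.
messages = [
    "great concert tonight".split(),
    "concert tickets sold out".split(),
    "traffic jam after concert".split(),
]
weights = tf_idf(messages)
```

Under this variant, a word that occurs in every message (here "concert") receives weight zero, while words specific to a single message receive positive weights, which is the sense in which TF-IDF measures how relevant a word is to a particular message.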