INEX Tweet Contextualization Task: Evaluation, Results and Lesson Learned

Author: Bellot Patrice
Juan Eric San
Moriceau Véronique
Mothe Josiane
SanJuan Eric
Tannier Xavier
Publication venue: Elsevier
Publication date: 01/03/2016

Microblogging platforms such as Twitter are increasingly used for online client and market analysis. This motivated the proposal of a new Tweet Contextualization track at the CLEF INEX lab. The objective of this task was to help a user understand a tweet by providing a short explanatory summary (500 words). This summary should be built automatically using resources such as Wikipedia, and generated by extracting relevant passages and aggregating them into a coherent summary. After four years of running the task, results show that the best systems combine NLP techniques with more traditional methods. More precisely, the best-performing systems combine passage retrieval, sentence segmentation and scoring, named entity recognition, part-of-speech (POS) analysis, anaphora detection, a content diversity measure, and sentence reordering. This paper provides a full summary report on the four-year task. While the yearly overviews focused on system results, this paper provides a detailed report on the approaches proposed by the participants, which can be considered the state of the art for this task. As an important result of the four-year competition, we also describe the open-access resources that have been built and collected. The evaluation measures for automatic summarization designed in DUC or MUC were not appropriate for evaluating tweet contextualization; we explain why and describe in detail the LogSim measure used to evaluate the informativeness of the produced contexts or summaries. Finally, we also mention the lessons we learned, which are worth considering when designing a task.

Scientific Publications of the University of Toulouse II Le Mirail

HAL AMU

ZENODO

Open Archive Toulouse Archive Ouverte

HAL Descartes

DCU@INEX-2012: exploring sentence retrieval for tweet contextualization

Author: Ganguly Debasis
Jones Gareth J.F.
Leveling Johannes
Publication date: 17/09/2012

For the participation of Dublin City University (DCU) in the INEX-2012 tweet contextualization task, we investigated sentence retrieval methodologies. The task requires providing the context for an ad hoc, real-life tweet. This context is to be constructed from Wikipedia articles. Our approach involves indexing the passages in Wikipedia articles as separate retrievable units, extracting sentences from the top-ranked passages, computing a sentence selection score for each such sentence with respect to the query, and then returning the most similar ones. This simple sentence selection strategy performed quite well in the task. Our best run ranked first from the readability perspective and eighth by informativeness out of 33 official runs.
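The sentence selection step described in this abstract can be illustrated with a minimal sketch. It assumes that passages have already been retrieved and ranked, and it uses cosine similarity over raw term counts as the sentence score. The abstract does not specify the scoring function, so that choice, along with the names `tokenize`, `cosine`, and `select_sentences`, is an assumption made purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_sentences(tweet, passages, k=2):
    """Return up to k sentences from the passages, most similar to the tweet first.

    Hypothetical scoring: cosine similarity over raw term counts.
    Sentences with no term overlap (score 0) are dropped.
    """
    query = Counter(tokenize(tweet))
    sentences = []
    for passage in passages:
        # Naive sentence segmentation on terminal punctuation.
        parts = re.split(r"(?<=[.!?])\s+", passage)
        sentences.extend(s.strip() for s in parts if s.strip())
    scored = [(cosine(query, Counter(tokenize(s))), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:k] if score > 0]


passages = [
    "Dublin is the capital of Ireland. It lies on the east coast.",
    "Paris is the capital of France.",
]
top = select_sentences("capital of Ireland", passages, k=2)
```

A real system would replace the regular-expression segmenter and the raw term counts with a proper sentence splitter and a retrieval model; the sketch only shows the shape of the extract, score, and rank steps.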

Irish Universities

DCU Online Research Access Service

Overview of INEX Tweet Contextualization 2013 track

Author: Bellot Patrice
Moriceau Véronique
Mothe Josiane
Sanjuan Eric
Tannier Xavier
Publication venue: HAL CCSD
Publication date: 01/01/2013

Twitter is increasingly used for online client and audience fishing; this motivated the tweet contextualization task at INEX. The objective is to help a user understand a tweet by providing a short summary (500 words). This summary should be built automatically using local resources such as Wikipedia, and generated by extracting relevant passages and aggregating them into a coherent summary. The task is evaluated on informativeness, which is computed using a variant of Kullback-Leibler divergence and passage pooling. Meanwhile, the effective readability of summaries in context is checked using binary questionnaires on small samples of results. Running since 2010, results show that only systems that efficiently combine passage retrieval, sentence segmentation and scoring, named entity recognition, text POS analysis, anaphora detection, a content diversity measure, and sentence reordering are effective.
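To make the informativeness idea concrete, the following sketch computes a plain Kullback-Leibler divergence between the unigram distribution of a reference text and that of a candidate summary. It is not the measure used by the track, which the abstract describes only as "a variant"; the add-alpha smoothing and the function name `smoothed_kl` are assumptions introduced here for illustration.

```python
import math
from collections import Counter


def smoothed_kl(reference_tokens, summary_tokens, alpha=1.0):
    """KL(reference || summary) over unigrams with add-alpha smoothing.

    Lower values mean the summary's word distribution is closer to the
    reference. Illustrative only; not the official track measure.
    """
    ref = Counter(reference_tokens)
    summ = Counter(summary_tokens)
    vocab = set(ref) | set(summ)
    # Smoothing keeps every probability strictly positive on the shared vocabulary.
    ref_total = sum(ref.values()) + alpha * len(vocab)
    summ_total = sum(summ.values()) + alpha * len(vocab)
    divergence = 0.0
    for term in vocab:
        p = (ref[term] + alpha) / ref_total
        q = (summ[term] + alpha) / summ_total
        divergence += p * math.log(p / q)
    return divergence


same = smoothed_kl(["tweet", "context"], ["tweet", "context"])
different = smoothed_kl(["tweet", "tweet", "context"], ["other", "other", "other"])
```

Identical token lists give a divergence of zero, and the value grows as the summary's vocabulary drifts away from the reference.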

Scientific Publications of the University of Toulouse II Le Mirail

HAL AMU

Open Archive Toulouse Archive Ouverte

A Method for Short Message Contextualization: Experiments at CLEF/INEX

Author: Ermakova Liana
Publication venue: HAL CCSD
Publication date: 08/09/2015

This paper presents the approach we developed for automatic multi-document summarization applied to short message contextualization, in particular to tweet contextualization. The proposed method is based on named entity recognition, part-of-speech weighting, and sentence quality measurement. In contrast to previous research, we introduce an algorithm for smoothing from the local context. Our approach exploits the topic-comment structure of a text. Moreover, we developed a graph-based algorithm for sentence reordering. The method was evaluated at the INEX/CLEF tweet contextualization track, and we provide the evaluation results over the four years of the track. The method was also adapted to snippet retrieval and query expansion. The evaluation results indicate good performance of the approach.

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL-Université de Bretagne Occidentale

HAL Descartes

Tweet Contextualization Based on Wikipedia and Dbpedia

Author: Berrut Catherine
Latiri Chiraz
Mulhem Philippe
Slimani Yahya
Zingla Meriem Amina
Publication venue: HAL CCSD
Publication date: 09/03/2016

Bound to 140 characters, tweets are short and are not written with formal grammar or proper spelling. These spelling variations increase the likelihood of vocabulary mismatch and make tweets difficult to understand without context. This paper addresses the tweet contextualization task, which aims to automatically provide a summary that explains a given tweet, allowing a reader to understand it. We propose different tweet expansion approaches based on Wikipedia and Dbpedia as external knowledge sources. The proposed approaches are divided into two steps. The first step consists of generating the candidate terms for a given tweet, while the second consists of ranking and selecting these candidate terms using a similarity measure. The effectiveness of our methods is demonstrated through an experimental study conducted on the INEX 2014 collection.

Hal - Université Grenoble Alpes

Overview of INEX Tweet Contextualization 2014 track

Author: Bellot Patrice
Moriceau Véronique
Mothe Josiane
Sanjuan Eric
Tannier Xavier
Publication venue: HAL CCSD
Publication date: 01/09/2014

Messages of 140 characters are rarely self-contained. The Tweet Contextualization task aims to automatically provide information, in the form of a summary, that explains the tweet. This requires combining multiple types of processing, from information retrieval to multi-document summarization, including entity linking. Running since 2010, the task in 2014 was a slight variant of previous ones, considering more complex queries from RepLab 2013. Given a tweet and a related entity, systems had to provide some context about the subject of the tweet from the perspective of the entity, in order to help the reader understand it.

Scientific Publications of the University of Toulouse II Le Mirail

HAL AMU

Open Archive Toulouse Archive Ouverte

From XML Retrieval to Semantic Search and Beyond: The INEX, SBS, and MC2 Labs of CLEF 2012-2018

Author: Bogers T.
Geva S.
Kamps J.
Koolen M.
SanJuan E.
Schenkel R.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

International Migration, Integration and Social Cohesion online publications

Évaluation de la contextualisation de tweets (Evaluation of Tweet Contextualization)

Author: Bellot Patrice
Moriceau Véronique
Mothe Josiane
Sanjuan Eric
Tannier Xavier
Publication venue: HAL CCSD
Publication date: 01/01/2013

This article addresses the evaluation of tweet contextualization. Contextualization is defined as a summary that puts back into context a text which, because of its size, does not contain all the elements a reader needs to understand all or part of its content. We define an evaluation framework for tweet contextualization that can be generalized to other short texts. We propose a reference collection as well as ad hoc evaluation measures. This evaluation framework was successfully tested in the context of the INEX Tweet Contextualization campaign. In light of the results obtained during this campaign, we discuss the measures used in relation to other measures in the literature.

Scientific Publications of the University of Toulouse II Le Mirail

HAL AMU

Open Archive Toulouse Archive Ouverte