
    A Method for Short Message Contextualization: Experiments at CLEF/INEX

    This paper presents the approach we developed for automatic multi-document summarization applied to short message contextualization, in particular to tweet contextualization. The proposed method is based on named entity recognition, part-of-speech weighting and sentence quality measuring. In contrast to previous research, we introduced an algorithm for smoothing from the local context. Our approach exploits the topic-comment structure of a text. Moreover, we developed a graph-based algorithm for sentence reordering. The method has been evaluated at the INEX/CLEF Tweet Contextualization track. We provide the evaluation results over the four years of the track. The method was also adapted to snippet retrieval and query expansion. The evaluation results indicate good performance of the approach.

    INEX Tweet Contextualization Task: Evaluation, Results and Lesson Learned

    Microblogging platforms such as Twitter are increasingly used for on-line client and market analysis. This motivated the proposal of a new Tweet Contextualization track at the CLEF INEX lab. The objective of this task was to help a user understand a tweet by providing a short explanatory summary (500 words). This summary should be built automatically using resources like Wikipedia, generated by extracting relevant passages and aggregating them into a coherent summary. Over the four years the track ran, results show that the best systems combine NLP techniques with more traditional methods. More precisely, the best performing systems combine passage retrieval, sentence segmentation and scoring, named entity recognition, part-of-speech (POS) analysis, anaphora detection, a content diversity measure, and sentence reordering. This paper provides a full summary report on the four-year task. While yearly overviews focused on system results, in this paper we provide a detailed report on the approaches proposed by the participants, which can be considered the state of the art for this task. As an important result of the four-year competition, we also describe the open access resources that have been built and collected. The evaluation measures for automatic summarization designed in DUC or MUC were not appropriate for evaluating tweet contextualization; we explain why and describe in detail the LogSim measure used to evaluate the informativeness of the produced contexts or summaries. Finally, we also mention the lessons we learned that are worth considering when designing a task.

    De quoi parle ce Tweet? Résumer Wikipédia pour contextualiser des microblogs (What Is This Tweet About? Summarizing Wikipedia to Contextualize Microblogs)

    Social networks are at the center of communication on the internet, and a large share of community exchanges takes place through them. Among them, the emergence of Twitter gave rise to a new kind of information sharing in which messages are limited to 140 characters. Users of this network therefore express themselves succinctly, often in real time from a smartphone, and the content of messages can sometimes be hard to understand without context. In this article we propose a method for automatically contextualizing Tweets using information taken directly from the online encyclopedia Wikipedia, with the ultimate goal of answering the question: What is this Tweet about? We treat this problem as an automatic summarization task in which the text to be summarized consists of Wikipedia articles related to the various pieces of information expressed in a Tweet. We explore the influence of different methods for retrieving articles related to Tweets, as well as several features useful for selecting the sentences that form the context. We evaluate our approach using the collection of the INEX 2012 Tweet Contextualization task and give an overview of what characterizes an important sentence for determining the context of a Tweet.

    Social Book Search: A Methodology that Combines both Retrieval and Recommendation

    University of Minnesota M.S. thesis. August 2014. Major: Computer Science. Advisor: Carolyn Crouch. 1 computer file (PDF); vii, 43 pages. Information Retrieval as an area of research aims at satisfying the information need of a user. Retrieval in the Information Age has expanded exponentially as its underlying technologies have expanded. Traditional IR systems that respond to a user's natural language search query are combined with recommendation through collaborative filtering [6]. This research focuses on a methodology that combines both traditional IR and recommender systems. It is done as part of the Social Book Search (SBS) Track, Suggestion task of INEX (INitiative for the Evaluation of XML Retrieval) 2014 [3]. The Social Book Search Track was introduced by INEX in 2011 with the purpose of supporting users in searching for and accessing books by using metadata. One complexity of the task lies in handling both professional and social metadata, which differ in both kind and quantity. The methodology and experiments discussed are inspired by background research [1,2,4,5,6] on the Social Book Search track. Our IR team submitted six runs for the track to the INEX 2014 competition, five of which use a recommender system that re-ranks the otherwise traditional set of results. Background work done to establish a good foundation for the methodology used in the SBS 2014 task includes experiments performed on both the 2011 and 2013 Social Book Search tracks. This research focuses on the 2013 experiments and their impact on results produced for SBS 2014.

    Personalized Book Retrieval System Using Amazon-LibraryThing Collection

    University of Minnesota M.S. thesis. August 2014. Major: Computer Science. Advisor: Carolyn Crouch. 1 computer file (PDF); vi, 41 pages. Information retrieval is the science of retrieving documents or information from a corpus based on the needs of the user. Selecting a book from a collection of available books based on its topical relevance to the query may not give us the "best" (or all the "best") such book(s). However, including social data, such as popularity, reviews and ratings, may improve the results, so we combine social data with book metadata for this purpose. The main goal of this research is to provide a book retrieval system for the Social Book Search (SBS) Track of the INEX forum. For the SBS track, participants are provided with an XML collection of data from Amazon and the LibraryThing (LT) forum, a set of topics from the LT forum enriched with user catalogue data (i.e., books that the topic creator has in their LibraryThing personal catalogue), and anonymous user profiles. Participants must devise a system which provides the ISBN/work IDs of the books which are relevant to the topic creator. For this purpose, we designed a recommender system which provides personalized search results.