185 research outputs found
Knowledge-based Query Expansion in Real-Time Microblog Search
Since the length of microblog texts, such as tweets, is strictly limited to
140 characters, traditional Information Retrieval techniques suffer from the
vocabulary mismatch problem severely and cannot yield good performance in the
context of microblogosphere. To address this critical challenge, in this paper,
we propose a new language modeling approach for microblog retrieval by
inferring various types of context information. In particular, we expand the
query using knowledge terms derived from Freebase so that the expanded one can
better reflect users' search intent. Besides, in order to further satisfy
users' real-time information need, we incorporate temporal evidences into the
expansion method, which can boost recent tweets in the retrieval results with
respect to a given topic. Experimental results on two official TREC Twitter
corpora demonstrate the significant superiority of our approach over baseline
methods.Comment: 9 pages, 9 figure
Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search
Despite substantial interest in applications of neural networks to
information retrieval, neural ranking models have only been applied to standard
ad hoc retrieval tasks over web pages and newswire documents. This paper
proposes MP-HCNN (Multi-Perspective Hierarchical Convolutional Neural Network)
a novel neural ranking model specifically designed for ranking short social
media posts. We identify document length, informal language, and heterogeneous
relevance signals as features that distinguish documents in our domain, and
present a model specifically designed with these characteristics in mind. Our
model uses hierarchical convolutional layers to learn latent semantic
soft-match relevance signals at the character, word, and phrase levels. A
pooling-based similarity measurement layer integrates evidence from multiple
types of matches between the query, the social media post, as well as URLs
contained in the post. Extensive experiments using Twitter data from the TREC
Microblog Tracks 2011--2014 show that our model significantly outperforms prior
feature-based as well and existing neural ranking models. To our best
knowledge, this paper presents the first substantial work tackling search over
social media posts using neural ranking models.Comment: AAAI 2019, 10 page
A Comparison of Nuggets and Clusters for Evaluating Timeline Summaries
There is growing interest in systems that generate timeline summaries by filtering high-volume streams of documents to retain only those that are relevant to a particular event or topic. Continued advances in algorithms and techniques for this task depend on standardized and reproducible evaluation methodologies for comparing systems. However, timeline summary evaluation is still in its infancy, with competing methodologies currently being explored in international evaluation forums such as TREC. One area of active exploration is how to explicitly represent the units of information that should appear in a 'good' summary. Currently, there are two main approaches, one based on identifying nuggets in an external 'ground truth', and the other based on clustering system outputs. In this paper, by building test collections that have both nugget and cluster annotations, we are able to compare these two approaches. Specifically, we address questions related to evaluation effort, differences in the final evaluation products, and correlations between scores and rankings generated by both approaches. We summarize advantages and disadvantages of nuggets and clusters to offer recommendations for future system evaluation
An evaluation of the role of sentiment in second screen microblog search tasks
The recent prominence of the real-time web is proving both challenging and disruptive for information retrieval and web data mining research. User-generated content on the real-time web is perhaps best epitomised by content on microblogging platforms, such as Twitter. Given the substantial quantity of microblog posts that may be relevant to a user's query at a point in time, automated methods are required to sift through this information. Sentiment analysis offers a promising direction for modelling microblog content. We build and evaluate a sentiment-based filtering system using real-time user studies. We find a significant role played by sentiment in the search scenarios, observing detrimental effects in filtering out certain sentiment types. We make a series of observations regarding associations between document-level sentiment and user feedback, including associations with user profile attributes, and users' prior topic sentiment
IRIT at TREC Microblog Track 2013
National audienceThis paper describes the participation of the IRIT lab, University of Toulouse, France, to the Microblog Track of TREC 2013. Two different approaches are experimented by our team for the real-time ad-hoc search task: (i) a Bayesian network retrieval model for tweet search and (ii) a document and query expansion model for microblog search
- …