Knowledge-based Query Expansion in Real-Time Microblog Search
Since the length of microblog texts, such as tweets, is strictly limited to
140 characters, traditional Information Retrieval techniques suffer from the
vocabulary mismatch problem severely and cannot yield good performance in the
context of microblogosphere. To address this critical challenge, in this paper,
we propose a new language modeling approach for microblog retrieval by
inferring various types of context information. In particular, we expand the
query using knowledge terms derived from Freebase so that the expanded one can
better reflect users' search intent. Besides, in order to further satisfy
users' real-time information need, we incorporate temporal evidences into the
expansion method, which can boost recent tweets in the retrieval results with
respect to a given topic. Experimental results on two official TREC Twitter
corpora demonstrate the significant superiority of our approach over baseline
methods. Comment: 9 pages, 9 figures
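The abstract above does not say how temporal evidence enters the ranking, so the following is only a minimal sketch of one common way to boost recent tweets: an exponential recency prior added to a text-match log score. The function names, the `rate` parameter, and the (id, score, timestamp) tuple layout are all assumptions made for illustration, not the paper's method.

```python
def recency_boosted_score(text_log_score, tweet_time, query_time, rate=0.1):
    """Combine a text-match log score with an exponential recency prior.

    Assumption: the prior is proportional to exp(-rate * age_in_hours), so in
    log space it subtracts rate * age from the text score. Times are in seconds.
    """
    age_hours = max(0.0, (query_time - tweet_time) / 3600.0)
    return text_log_score - rate * age_hours


def rerank(tweets, query_time, rate=0.1):
    """Sort hypothetical (tweet_id, text_log_score, timestamp) tuples, best first."""
    return sorted(
        tweets,
        key=lambda t: recency_boosted_score(t[1], t[2], query_time, rate),
        reverse=True,
    )


if __name__ == "__main__":
    candidates = [("old", -2.0, 0), ("new", -2.0, 7200)]
    # Both tweets match the query equally well; the newer one should rank first.
    print([tweet_id for tweet_id, _, _ in rerank(candidates, query_time=7200)])
```

A larger `rate` favors freshness more aggressively; setting it to zero recovers the purely text-based ranking.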
Finding co-solvers on Twitter, with a little help from Linked Data
In this paper we propose a method for suggesting potential collaborators for solving innovation challenges online, based on their competence, similarity of interests and social proximity with the user. We rely on Linked Data to derive a measure of semantic relatedness that we use to enrich both user profiles and innovation problems with additional relevant topics, thereby improving the performance of co-solver recommendation. We evaluate this approach against state of the art methods for query enrichment based on the distribution of topics in user profiles, and demonstrate its usefulness in recommending collaborators that are both complementary in competence and compatible with the user. Our experiments are grounded using data from the social networking service Twitter.com
An evaluation of the role of sentiment in second screen microblog search tasks
The recent prominence of the real-time web is proving both challenging and disruptive for information retrieval and web data mining research. User-generated content on the real-time web is perhaps best epitomised by content on microblogging platforms, such as Twitter. Given the substantial quantity of microblog posts that may be relevant to a user's query at a point in time, automated methods are required to sift through this information. Sentiment analysis offers a promising direction for modelling microblog content. We build and evaluate a sentiment-based filtering system using real-time user studies. We find a significant role played by sentiment in the search scenarios, observing detrimental effects in filtering out certain sentiment types. We make a series of observations regarding associations between document-level sentiment and user feedback, including associations with user profile attributes, and users' prior topic sentiment
Tweetement: Pseudo-relevance Feedback for Twitter Search
Microblogging platforms such as Twitter let users communicate with short messages. Due to the messages' short content and the users' tendency to type short queries while searching, it is particularly challenging to locate useful tweets that match user queries. The fundamental problems of word mismatch due to ambiguity are especially acute. To solve this problem, this thesis explores and compares multiple automatic query expansion methods that involve the most frequent hashtags and keywords. We built a Web service that provides real-time Twitter Search results incorporating automatic query expansion. Six pseudo-relevance feedback methods were studied and the numbers indicate that results without query expansion perform just as well as results with query expansion. However, the expanded queries find different relevant tweets than the original query, indicating, from multiple methods, that combining the results is a fruitful area for future investigations. Keywords: microblog, Twitter Search, query expansion, pseudo-relevance feedback, Web service
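As a rough illustration of pseudo-relevance feedback with frequent hashtags and keywords, the sketch below expands a query with the terms that occur most often in the top-ranked tweets of an initial search. It is not one of the six methods studied in the thesis; the tokenizer, the tiny stopword list, and the `top_docs` and `num_terms` parameters are assumptions chosen for brevity.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"#?\w+")  # keeps a leading '#' so hashtags stay distinct
STOPWORDS = {"the", "and", "for", "near", "with"}  # illustrative only


def expand_query(query, ranked_tweets, top_docs=3, num_terms=2):
    """Append the most frequent new terms from the top-ranked tweets to the query."""
    query_terms = set(TOKEN_RE.findall(query.lower()))
    counts = Counter()
    for tweet in ranked_tweets[:top_docs]:  # treat the top results as relevant
        for token in TOKEN_RE.findall(tweet.lower()):
            bare = token.lstrip("#")
            if len(bare) <= 2 or bare in STOPWORDS or bare in query_terms:
                continue
            counts[token] += 1
    expansion = [term for term, _ in counts.most_common(num_terms)]
    return query.split() + expansion


if __name__ == "__main__":
    initial_results = [
        "Big quake hits the coast #earthquake",
        "#earthquake damage reported near coast",
        "Stay safe everyone #earthquake",
        "unrelated lunch post",
    ]
    print(expand_query("quake", initial_results))
```

Because the expanded query can retrieve different relevant tweets than the original, one option is to run both and merge the two result lists.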
Temporal Information Models for Real-Time Microblog Search
Real-time search in Twitter and other social media services is often biased
towards the most recent results due to the "in the moment" nature of topic
trends and their ephemeral relevance to users and media in general. However,
"in the moment", it is often difficult to look at all emerging topics and single out
the important ones from the rest of the social media chatter. This thesis proposes
to leverage external sources to estimate the duration and burstiness of live
Twitter topics. It extends preliminary research where it was shown that temporal
re-ranking using external sources could indeed improve the accuracy of results.
To further explore this topic we pursued three significant novel approaches: (1)
multi-source information analysis that explores behavioral dynamics of users,
such as Wikipedia live edits and page view streams, to detect topic trends
and estimate the topic interest over time; (2) efficient methods for federated
query expansion towards the improvement of query meaning; and (3) exploiting
multiple sources towards the detection of temporal query intent. It differs from
past approaches in the sense that it will work over real-time queries, leveraging
live user-generated content. This approach contrasts with previous methods
that require an offline preprocessing step.
Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search
Despite substantial interest in applications of neural networks to
information retrieval, neural ranking models have only been applied to standard
ad hoc retrieval tasks over web pages and newswire documents. This paper
proposes MP-HCNN (Multi-Perspective Hierarchical Convolutional Neural Network),
a novel neural ranking model specifically designed for ranking short social
media posts. We identify document length, informal language, and heterogeneous
relevance signals as features that distinguish documents in our domain, and
present a model specifically designed with these characteristics in mind. Our
model uses hierarchical convolutional layers to learn latent semantic
soft-match relevance signals at the character, word, and phrase levels. A
pooling-based similarity measurement layer integrates evidence from multiple
types of matches between the query, the social media post, as well as URLs
contained in the post. Extensive experiments using Twitter data from the TREC
Microblog Tracks 2011-2014 show that our model significantly outperforms prior
feature-based as well as existing neural ranking models. To the best of our
knowledge, this paper presents the first substantial work tackling search over
social media posts using neural ranking models. Comment: AAAI 2019, 10 pages