130,091 research outputs found
Domain-specific queries and Web search personalization: some investigations
Major search engines deploy personalized Web results to enhance users'
experience, by showing them data supposed to be relevant to their interests.
Even if this process may bring benefits to users while browsing, it also raises
concerns on the selection of the search results. In particular, users may be
unknowingly trapped by search engines in protective information bubbles, called
"filter bubbles", which can have the undesired effect of separating users from
information that does not fit their preferences. This paper moves from early
results on quantification of personalization over Google search query results.
Inspired by previous works, we have carried out some experiments consisting of
search queries performed by a battery of Google accounts with differently
prepared profiles. Matching query results, we quantify the level of
personalization, according to topics of the queries and the profile of the
accounts. This work reports initial results and it is a first step a for more
extensive investigation to measure Web search personalization.Comment: In Proceedings WWV 2015, arXiv:1508.0338
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and point-of-interests, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on daily basis. Due to the world-wide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we also briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions.Comment: Accepted to TKDE. 30 pages, 1 figur
Topicality and Social Impact: Diverse Messages but Focused Messengers
Are users who comment on a variety of matters more likely to achieve high
influence than those who delve into one focused field? Do general Twitter
hashtags, such as #lol, tend to be more popular than novel ones, such as
#instantlyinlove? Questions like these demand a way to detect topics hidden
behind messages associated with an individual or a hashtag, and a gauge of
similarity among these topics. Here we develop such an approach to identify
clusters of similar hashtags by detecting communities in the hashtag
co-occurrence network. Then the topical diversity of a user's interests is
quantified by the entropy of her hashtags across different topic clusters. A
similar measure is applied to hashtags, based on co-occurring tags. We find
that high topical diversity of early adopters or co-occurring tags implies high
future popularity of hashtags. In contrast, low diversity helps an individual
accumulate social influence. In short, diverse messages and focused messengers
are more likely to gain impact.Comment: 9 pages, 7 figures, 6 table
Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction
User response prediction, which models the user preference w.r.t. the
presented items, plays a key role in online services. With two-decade rapid
development, nowadays the cumulated user behavior sequences on mature Internet
service platforms have become extremely long since the user's first
registration. Each user not only has intrinsic tastes, but also keeps changing
her personal interests during lifetime. Hence, it is challenging to handle such
lifelong sequential modeling for each individual user. Existing methodologies
for sequential modeling are only capable of dealing with relatively recent user
behaviors, which leaves huge space for modeling long-term especially lifelong
sequential patterns to facilitate user modeling. Moreover, one user's behavior
may be accounted for various previous behaviors within her whole online
activity history, i.e., long-term dependency with multi-scale sequential
patterns. In order to tackle these challenges, in this paper, we propose a
Hierarchical Periodic Memory Network for lifelong sequential modeling with
personalized memorization of sequential patterns for each user. The model also
adopts a hierarchical and periodical updating mechanism to capture multi-scale
sequential patterns of user interests while supporting the evolving user
behavior logs. The experimental results over three large-scale real-world
datasets have demonstrated the advantages of our proposed model with
significant improvement in user response prediction performance against the
state-of-the-arts.Comment: SIGIR 2019. Reproducible codes and datasets:
https://github.com/alimamarankgroup/HPM
Modeling Taxi Drivers' Behaviour for the Next Destination Prediction
In this paper, we study how to model taxi drivers' behaviour and geographical
information for an interesting and challenging task: the next destination
prediction in a taxi journey. Predicting the next location is a well studied
problem in human mobility, which finds several applications in real-world
scenarios, from optimizing the efficiency of electronic dispatching systems to
predicting and reducing the traffic jam. This task is normally modeled as a
multiclass classification problem, where the goal is to select, among a set of
already known locations, the next taxi destination. We present a Recurrent
Neural Network (RNN) approach that models the taxi drivers' behaviour and
encodes the semantics of visited locations by using geographical information
from Location-Based Social Networks (LBSNs). In particular, RNNs are trained to
predict the exact coordinates of the next destination, overcoming the problem
of producing, in output, a limited set of locations, seen during the training
phase. The proposed approach was tested on the ECML/PKDD Discovery Challenge
2015 dataset - based on the city of Porto -, obtaining better results with
respect to the competition winner, whilst using less information, and on
Manhattan and San Francisco datasets.Comment: preprint version of a paper submitted to IEEE Transactions on
Intelligent Transportation System
- …