A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the worldwide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts have focused on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim to offer an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
the Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions.
Comment: Accepted to TKDE. 30 pages, 1 figure
Interpretable Machine Learning for Privacy-Preserving Pervasive Systems
Our everyday interactions with pervasive systems generate traces that capture
various aspects of human behavior and enable machine learning algorithms to
extract latent information about users. In this paper, we propose a machine
learning interpretability framework that enables users to understand how these
generated traces violate their privacy
Recommending with an Agenda: Active Learning of Private Attributes using Matrix Factorization
Recommender systems leverage user demographic information, such as age,
gender, etc., to personalize recommendations and better place their targeted
ads. Oftentimes, users do not volunteer this information due to privacy
concerns, or due to a lack of initiative in filling out their online profiles.
We illustrate a new threat in which a recommender learns private attributes of
users who do not voluntarily disclose them. We design both passive and active
attacks that solicit ratings for strategically selected items, and could thus
be used by a recommender system to pursue this hidden agenda. Our methods are
based on a novel usage of Bayesian matrix factorization in an active learning
setting. Evaluations on multiple datasets illustrate that such attacks are
indeed feasible and use significantly fewer rated items than static inference
methods. Importantly, they succeed without sacrificing the quality of
recommendations to users.
Comment: This is the extended version of a paper that appeared in ACM RecSys
201