Of Wines and Reviews: Measuring and Modeling the Vivino Wine Social Network
This paper presents an analysis of social experiences around wine consumption
through the lens of Vivino, a social network for wine enthusiasts with over 26
million users worldwide. We compare users' perceptions of various wine types
and regional styles across both New and Old World wines, examining them across
price ranges, vintages, regions, varietals, and blends. Among other things, we
find that ratings provided by Vivino users are not biased by cost. We then
study how wine characteristics, language in wine reviews, and the distribution
of wine ratings can be combined to develop prediction models. More
specifically, we model user behavior to develop a regression model for
predicting wine ratings, and a classifier for determining user review
preferences.
Comment: A preliminary version of this paper appears in the Proceedings of the
IEEE/ACM International Conference on Advances in Social Networks Analysis and
Mining (ASONAM 2018). This is the full versio
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the world-wide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions.
Comment: Accepted to TKDE. 30 pages, 1 figur
Network Model Selection for Task-Focused Attributed Network Inference
Networks are models representing relationships between entities. Often these
relationships are explicitly given, or we must learn a representation which
generalizes and predicts observed behavior in underlying individual data (e.g.
attributes or labels). Whether given or inferred, choosing the best
representation affects subsequent tasks and questions on the network. This work
focuses on model selection to evaluate network representations from data,
focusing on fundamental predictive tasks on networks. We present a modular
methodology using general, interpretable network models, task neighborhood
functions found across domains, and several criteria for robust model
selection. We demonstrate our methodology on three online user activity
datasets and show that network model selection for the appropriate network task
vs. an alternate task increases performance by an order of magnitude in our
experiments.
Political Homophily in Independence Movements: Analysing and Classifying Social Media Users by National Identity
Social media and data mining are increasingly being used to analyse political
and societal issues. Here we undertake the classification of social media users
as supporting or opposing ongoing independence movements in their territories.
Independence movements occur in territories whose citizens have conflicting
national identities; users with opposing national identities will then support
or oppose the sense of being part of an independent nation that differs from
the officially recognised country. We describe a methodology that relies on
users' self-reported location to build large-scale datasets for three
territories -- Catalonia, the Basque Country and Scotland. An analysis of these
datasets shows that homophily plays an important role in determining who people
connect with, as users predominantly choose to follow and interact with others
from the same national identity. We show that a classifier relying on users'
follow networks can achieve accurate, language-independent classification
performances ranging from 85% to 97% for the three territories.
Comment: Accepted for publication in IEEE Intelligent System
Emergence of Equilibria from Individual Strategies in Online Content Diffusion
Social scientists have observed that human behavior in society can often be
modeled as corresponding to a threshold-type policy. A new behavior would
propagate by a procedure in which an individual adopts the new behavior if the
fraction of his neighbors or friends having adopted the new behavior exceeds
some threshold. In this paper we study the question of whether the emergence of
threshold policies may be modeled as a result of some rational process which
would describe the behavior of non-cooperative rational members of some social
network. We focus on situations in which individuals take the decision whether
to access or not some content, based on the number of views that the content
has. Our analysis aims at understanding not only the behavior of individuals,
but also the way in which information about the quality of a given content can
be deduced from view counts when only part of the viewers that access the
content are informed about its quality. In this paper we present a game
formulation for the behavior of individuals using a mean-field model: the
population of individuals is approximated by a continuum of atomless players,
for which the Wardrop equilibrium is the solution concept. We derive conditions
on the problem's parameters that indeed result in the emergence of threshold
equilibrium policies. We also identify some parameter values for which other
structures are obtained for the equilibrium behavior of individuals.
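The threshold-type adoption rule described in this abstract (an individual adopts once the fraction of adopting neighbors exceeds a threshold) can be illustrated with a minimal sketch. This is not the paper's game-theoretic mean-field model; the function name `threshold_cascade`, the single shared threshold, and the example graph are assumptions made purely for illustration.

```python
def threshold_cascade(neighbors, seeds, threshold):
    """Spread a behavior under a simple threshold rule (illustrative only).

    neighbors: dict mapping each node to a list of its neighbors.
    seeds: nodes that have adopted the behavior at the start.
    threshold: a node adopts when the fraction of its neighbors that
        have adopted is strictly greater than this value.
    Returns the set of nodes that have adopted once no node changes.
    """
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            # Skip nodes that already adopted or have no neighbors.
            if node in adopted or not nbrs:
                continue
            fraction = sum(1 for n in nbrs if n in adopted) / len(nbrs)
            if fraction > threshold:
                adopted.add(node)
                changed = True
    return adopted


# Hypothetical example: a path graph a - b - c - d with "a" as the seed.
EXAMPLE_GRAPH = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}

low = threshold_cascade(EXAMPLE_GRAPH, {"a"}, 0.4)
high = threshold_cascade(EXAMPLE_GRAPH, {"a"}, 0.5)
print(sorted(low))   # the behavior reaches every node
print(sorted(high))  # the behavior stays with the seed
```

With a threshold of 0.4, node b sees half of its neighbors adopting and follows, which then tips c and d in turn. With a threshold of 0.5 the fraction at b does not strictly exceed the threshold, so the behavior never leaves the seed.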
Effectiveness of dismantling strategies on moderated vs. unmoderated online social platforms
Online social networks are the perfect test bed to better understand
large-scale human behavior in interacting contexts. Although they are broadly
used and studied, little is known about how their terms of service and posting
rules affect the way users interact and information spreads. Acknowledging the
relation between network connectivity and functionality, we compare the
robustness of two different online social platforms, Twitter and Gab, with
respect to dismantling strategies based on the recursive censoring of users
characterized by social prominence (degree) or intensity of inflammatory
content (sentiment). We find that the moderated (Twitter) vs unmoderated (Gab)
character of the network is not a discriminating factor for intervention
effectiveness. We find, however, that more complex strategies based upon the
combination of topological and content features may be effective for network
dismantling. Our results provide useful indications to design better strategies
for countervailing the production and dissemination of anti-social content in
online social platforms.
Your Stance is Exposed! Analysing Possible Factors for Stance Detection on Social Media
To what extent can a user's stance towards a given topic be inferred? Most
studies on stance detection have focused on analysing users' posts on a
given topic to predict their stance. However, stance in social media can be
inferred from a mixture of signals that might reflect a user's beliefs, including
posts and online interactions. This paper examines various online features of
users to detect their stance towards different topics. We compare multiple sets
of features, including on-topic content, network interactions, user
preferences, and online network connections. Our objective is to understand the
online signals that can reveal users' stance. Experiments are conducted on the
tweet dataset from the SemEval stance detection task, which covers five
topics. Results show that the stance of a user can be detected from multiple
signals of the user's online activity, including their posts on the topic, the
network they interact with or follow, the websites they visit, and the content
they like. The performance of stance modelling using different network
features is comparable with that of the state-of-the-art reported model that used
textual content only. In addition, combining network and content features leads
to the highest reported performance to date on the SemEval dataset, with an
F-measure of 72.49%. We further present an extensive analysis to show how these
different sets of features can reveal stance. Our findings have distinct privacy
implications: they highlight that stance is so strongly embedded in a user's
online social network that, in principle, individuals can be profiled from
their interactions and connections even when they do not post about the topic.
Comment: Accepted as a full paper at CSCW 2019. Please cite the CSCW versio