    Of Wines and Reviews: Measuring and Modeling the Vivino Wine Social Network

    This paper presents an analysis of social experiences around wine consumption through the lens of Vivino, a social network for wine enthusiasts with over 26 million users worldwide. We compare users' perceptions of various wine types and regional styles across both New and Old World wines, examining them across price ranges, vintages, regions, varietals, and blends. Among other things, we find that ratings provided by Vivino users are not biased by cost. We then study how wine characteristics, language in wine reviews, and the distribution of wine ratings can be combined to develop prediction models. More specifically, we model user behavior to develop a regression model for predicting wine ratings, and a classifier for determining user review preferences.
    Comment: A preliminary version of this paper appears in the Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2018). This is the full version.

    A Survey of Location Prediction on Twitter

    Locations, e.g., countries, states, cities, and points of interest, are central to news, emergency events, and people's daily lives. Automatic identification of locations associated with or mentioned in documents has been explored for decades. As one of the most popular online social network platforms, Twitter has attracted a large number of users who send millions of tweets on a daily basis. Due to the worldwide coverage of its users and the real-time freshness of tweets, location prediction on Twitter has gained significant attention in recent years. Research efforts have been devoted to the new challenges and opportunities brought by the noisy, short, and context-rich nature of tweets. In this survey, we aim to offer an overall picture of location prediction on Twitter. Specifically, we concentrate on the prediction of user home locations, tweet locations, and mentioned locations. We first define the three tasks and review the evaluation metrics. By summarizing Twitter network, tweet content, and tweet context as potential inputs, we then structurally highlight how the problems depend on these inputs. Each dependency is illustrated by a comprehensive review of the corresponding strategies adopted in state-of-the-art approaches. In addition, we briefly review two related problems, i.e., semantic location prediction and point-of-interest recommendation. Finally, we list future research directions.
    Comment: Accepted to TKDE. 30 pages, 1 figure.

    Network Model Selection for Task-Focused Attributed Network Inference

    Networks are models representing relationships between entities. Often these relationships are explicitly given, or we must learn a representation which generalizes and predicts observed behavior in underlying individual data (e.g. attributes or labels). Whether given or inferred, choosing the best representation affects subsequent tasks and questions on the network. This work focuses on model selection to evaluate network representations from data, concentrating on fundamental predictive tasks on networks. We present a modular methodology using general, interpretable network models, task neighborhood functions found across domains, and several criteria for robust model selection. We demonstrate our methodology on three online user activity datasets and show that network model selection for the appropriate network task vs. an alternate task increases performance by an order of magnitude in our experiments.

    Political Homophily in Independence Movements: Analysing and Classifying Social Media Users by National Identity

    Social media and data mining are increasingly being used to analyse political and societal issues. Here we undertake the classification of social media users as supporting or opposing ongoing independence movements in their territories. Independence movements occur in territories whose citizens have conflicting national identities; users with opposing national identities will then support or oppose the sense of being part of an independent nation that differs from the officially recognised country. We describe a methodology that relies on users' self-reported location to build large-scale datasets for three territories -- Catalonia, the Basque Country and Scotland. An analysis of these datasets shows that homophily plays an important role in determining who people connect with, as users predominantly choose to follow and interact with others of the same national identity. We show that a classifier relying on users' follow networks can achieve accurate, language-independent classification performance ranging from 85% to 97% for the three territories.
    Comment: Accepted for publication in IEEE Intelligent Systems.

    Emergence of Equilibria from Individual Strategies in Online Content Diffusion

    Social scientists have observed that human behavior in society can often be modeled as following a threshold-type policy. A new behavior propagates by a procedure in which an individual adopts the new behavior if the fraction of his neighbors or friends who have adopted it exceeds some threshold. In this paper we study whether the emergence of threshold policies can be modeled as the result of a rational process describing the behavior of non-cooperative rational members of a social network. We focus on situations in which individuals decide whether or not to access some content based on the number of views that the content has. Our analysis aims at understanding not only the behavior of individuals, but also the way in which information about the quality of a given content can be deduced from view counts when only part of the viewers who access the content are informed about its quality. We present a game formulation for the behavior of individuals using a mean-field model: the number of individuals is approximated by a continuum of atomless players, for which the Wardrop equilibrium is the solution concept. We derive conditions on the problem's parameters that do indeed result in the emergence of threshold equilibrium policies. But we also identify some parameters for which other structures are obtained for the equilibrium behavior of individuals.
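The threshold adoption rule mentioned at the start of this abstract can be illustrated with a minimal sketch. It is not the paper's game-theoretic mean-field model; it only shows the classical rule in which a node adopts once the adopted fraction of its neighbors exceeds a threshold. The function names, the synchronous update, and the dictionary-based graph representation are assumptions made for illustration.

```python
def threshold_step(neighbors, adopted, threshold):
    """One synchronous update of the threshold rule (illustrative only).

    neighbors: dict mapping each node to a list of its neighbors.
    adopted:   set of nodes that have already adopted the behavior.
    A node adopts when the adopted fraction of its neighbors
    strictly exceeds `threshold`.
    """
    new_adopted = set(adopted)
    for node, nbrs in neighbors.items():
        if node in adopted or not nbrs:
            continue
        fraction = sum(1 for n in nbrs if n in adopted) / len(nbrs)
        if fraction > threshold:
            new_adopted.add(node)
    return new_adopted


def spread(neighbors, seeds, threshold):
    """Repeat the update until no further node adopts."""
    adopted = set(seeds)
    while True:
        updated = threshold_step(neighbors, adopted, threshold)
        if updated == adopted:
            return adopted
        adopted = updated


# Hypothetical four-node path graph a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
full_cascade = spread(path, {"a"}, threshold=0.4)
no_cascade = spread(path, {"a"}, threshold=0.5)
```

With the lower threshold the behavior reaches every node of the path, while with the threshold at 0.5 node b sees exactly half of its neighbors adopting, which does not strictly exceed the threshold, so the cascade stops at the seed.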

    Effectiveness of dismantling strategies on moderated vs. unmoderated online social platforms

    Online social networks are the perfect test bed to better understand large-scale human behavior in interacting contexts. Although they are broadly used and studied, little is known about how their terms of service and posting rules affect the way users interact and information spreads. Acknowledging the relation between network connectivity and functionality, we compare the robustness of two different online social platforms, Twitter and Gab, with respect to dismantling strategies based on the recursive censoring of users characterized by social prominence (degree) or intensity of inflammatory content (sentiment). We find that the moderated (Twitter) vs. unmoderated (Gab) character of the network is not a discriminating factor for intervention effectiveness. We find, however, that more complex strategies based upon the combination of topological and content features may be effective for network dismantling. Our results provide useful indications to design better strategies for countervailing the production and dissemination of anti-social content in online social platforms.
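A degree-based dismantling strategy of the kind described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes an undirected graph stored as a dictionary of neighbor sets, removes the currently highest-degree node at each step, and tracks the size of the largest connected component as a proxy for how well the network still holds together. The sentiment-based variant is not covered, and all names are hypothetical.

```python
from collections import deque


def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def dismantle_by_degree(adj, steps):
    """Recursively remove the highest-degree node, recomputing degrees each time.

    Returns the largest-component size before any removal and after each one.
    """
    adj = {n: set(nbrs) for n, nbrs in adj.items()}  # work on a copy
    sizes = [largest_component_size(adj)]
    for _ in range(steps):
        if not adj:
            break
        target = max(adj, key=lambda n: len(adj[n]))
        for nb in adj.pop(target):
            adj[nb].discard(target)
        sizes.append(largest_component_size(adj))
    return sizes


# Hypothetical star graph: one hub connected to four leaves.
star = {"h": {"a", "b", "c", "d"},
        "a": {"h"}, "b": {"h"}, "c": {"h"}, "d": {"h"}}
star_sizes = dismantle_by_degree(star, steps=1)
```

On the star graph a single removal of the hub leaves only isolated nodes, so the largest component drops from five nodes to one. Degrees are recomputed after every removal because removing a prominent user changes the prominence of the remaining ones.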

    Your Stance is Exposed! Analysing Possible Factors for Stance Detection on Social Media

    To what extent can a user's stance towards a given topic be inferred? Most studies on stance detection have focused on analysing a user's posts on a given topic to predict the stance. However, stance in social media can be inferred from a mixture of signals that might reflect a user's beliefs, including posts and online interactions. This paper examines various online features of users to detect their stance towards different topics. We compare multiple sets of features, including on-topic content, network interactions, user preferences, and online network connections. Our objective is to understand the online signals that can reveal users' stance. Experiments are conducted on the tweet dataset from the SemEval stance detection task, which covers five topics. Results show that the stance of a user can be detected from multiple signals of the user's online activity, including their posts on the topic, the network they interact with or follow, the websites they visit, and the content they like. The performance of stance modelling using different network features is comparable with that of the state-of-the-art reported model that used textual content only. In addition, combining network and content features leads to the highest reported performance to date on the SemEval dataset, with an F-measure of 72.49%. We further present an extensive analysis to show how these different sets of features can reveal stance. Our findings have distinct privacy implications: they highlight that stance is strongly embedded in a user's online social network, so that, in principle, individuals can be profiled from their interactions and connections even when they do not post about the topic.
    Comment: Accepted as a full paper at CSCW 2019. Please cite the CSCW version.