481 research outputs found

    Large scale homophily analysis in twitter using a twixonomy

    In this paper we perform a large-scale homophily analysis on Twitter using a hierarchical representation of users' interests which we call a Twixonomy. In order to build a population, community, or single-user Twixonomy we first associate "topical" friends in users' friendship lists (i.e. friends representing an interest rather than a social relation between peers) with Wikipedia categories. A word-sense disambiguation algorithm is used to select the appropriate wikipage for each topical friend. Starting from the set of wikipages representing "primitive" interests, we extract all paths connecting these pages with topmost Wikipedia category nodes, and we then prune the resulting graph G efficiently so as to induce a directed acyclic graph. This graph is the Twixonomy. Then, to analyze homophily, we compare different methods to detect communities in a peer friends Twitter network, and then for each community we compute the degree of homophily on the basis of a measure of pairwise semantic similarity. We show that the Twixonomy provides a means for describing users' interests in a compact and readable way and allows for a fine-grained homophily analysis. Furthermore, we show that mid-low level categories in the Twixonomy represent the best balance between informativeness and compactness of the representation.
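The pruning step in the abstract above (turning the category graph G into a directed acyclic graph) can be illustrated with a minimal sketch. The abstract does not say how the authors prune, so this example swaps in a generic technique: a depth-first traversal that drops every back edge. The function name `prune_to_dag` and the toy category names are hypothetical.

```python
def prune_to_dag(graph):
    """Return a copy of `graph` (dict: node -> list of successors) with
    depth-first back edges removed, which leaves a directed acyclic graph.
    Assumption: this is a generic cycle-breaking method, not the pruning
    algorithm used in the paper."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {node: WHITE for node in graph}
    for successors in graph.values():
        for node in successors:
            color.setdefault(node, WHITE)
    dag = {node: [] for node in color}

    def visit(node):
        color[node] = GREY
        for child in graph.get(node, []):
            if color[child] == GREY:
                continue  # back edge: it would close a cycle, so drop it
            dag[node].append(child)
            if color[child] == WHITE:
                visit(child)
        color[node] = BLACK

    for node in list(color):
        if color[node] == WHITE:
            visit(node)
    return dag


# Toy category graph (hypothetical): edges point from a page or category
# to its parent categories, and one edge closes a cycle.
toy = {
    "Python (page)": ["Programming languages"],
    "Programming languages": ["Computing"],
    "Computing": ["Programming languages", "Technology"],
    "Technology": [],
}
dag = prune_to_dag(toy)
```

In the toy graph, the edge from "Computing" back to "Programming languages" is the one that gets dropped. The recursive traversal is fine for small examples; a graph of Wikipedia's size would need an iterative version.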

    When Politicians Talk: Assessing Online Conversational Practices of Political Parties on Twitter

    Assessing political conversations in social media requires a deeper understanding of the underlying practices and styles that drive these conversations. In this paper, we present a computational approach for assessing online conversational practices of political parties. Following a deductive approach, we devise a number of quantitative measures from a discussion of theoretical constructs in sociological theory. The resulting measures make different - mostly qualitative - aspects of online conversational practices amenable to computation. We evaluate our computational approach by applying it in a case study. In particular, we study online conversational practices of German politicians on Twitter during the German federal election 2013. We find that political parties share some interesting patterns of behavior, but also exhibit some unique and interesting idiosyncrasies. Our work sheds light on (i) how complex cultural phenomena such as online conversational practices are amenable to quantification and (ii) the way social media such as Twitter are utilized by political parties. Comment: 10 pages, 2 figures, 3 tables, Proc. 8th International AAAI Conference on Weblogs and Social Media (ICWSM 2014).

    Can Real Social Epistemic Networks Deliver the Wisdom of Crowds?

    In this paper, we explain and showcase the promising methodology of testimonial network analysis and visualization for experimental epistemology, arguing that it can be used to gain insights and answer philosophical questions in social epistemology. Our use case is the epistemic community that discusses vaccine safety primarily in English on Twitter. In two studies, we show, using both statistical analysis and exploratory data visualization, that there is almost no neutral or ambivalent discussion of vaccine safety on Twitter. Roughly half the accounts engaging with this topic are pro-vaccine, while the other half are con-vaccine. We also show that these two camps rarely engage with one another, and that the con-vaccine camp has greater epistemic reach and receptivity than the pro-vaccine camp. In light of these findings, we question whether testimonial networks as they are currently constituted on popular fora such as Twitter are living up to their promise of delivering the wisdom of crowds. We conclude by pointing to directions for further research in digital social epistemology.

    Homofilia por tópicos em uma rede social online (Topical homophily in an online social network)

    Advisor: André Santanchè. Dissertation (master's), Universidade Estadual de Campinas, Instituto de Computação. Abstract: Understanding the dynamics of social interactions is crucial to address questions involving human behavior. The emergence of online social media, such as Facebook and Twitter, has enabled access to data on people's relationships at a large scale. These networks are information oriented, with users sharing and consuming information. In this dissertation, we are interested in the presence of topical homophily in an online social network. Specifically, we explore how individuals' connections are related to their topical similarity, i.e., their proximity regarding the different kinds of content that are shared in the network. To do so, we represent users using the information in their messages. Our results show that users, on average, are connected with users who are similar to them and that stronger interactions are related to a high topical similarity. We also verified that, when considering only users inside a topic, homophily manifests differently according to the topic. We believe that this research, besides providing a way to assess the topical similarity of users, deepens the evidence of homophily among individuals, contributing to a better understanding of how complex social systems are structured. Degree: Master's (Mestrado) in Computer Science (Ciência da Computação). 1629113, 1461658 CAPE

    Identification of Online Users' Social Status via Mining User-Generated Data

    With the burst of available online user-generated data, identifying online users’ social status by mining that data can play a significant role in commercial applications, research and policy-making in many domains. Social status, the position of a person in relation to others within a society, is an abstract concept, so its working definition depends on the indicator used to measure it. For example, opinion leadership measures individual social status in terms of influence and expertise in an online society, while socioeconomic status characterizes personal real-life social status based on social and economic factors. As an alternative to traditional surveys, which are time-consuming, expensive and sometimes difficult to carry out, some efforts have been made to identify a specific social status of users from specific user-generated data using classic machine learning methods. However, each combination of social status and data source poses its own challenges, and the classic machine learning methods in existing works fail to address them, which leads to low identification accuracy. Given the importance of improving identification accuracy, this thesis studies three specific cases of identifying online and offline social status. For each, it proposes a novel identification method that addresses the specific challenges in order to improve accuracy. The first work aims at identifying users’ online social status in terms of topic-sensitive influence and knowledge authority in social community question answering sites, namely identifying topical opinion leaders who are both influential and expert. 
A social community question answering (SCQA) site, an innovative community question answering platform, not only offers traditional question answering (QA) services but also integrates an online social network where users can follow each other. Identifying topical opinion leaders in SCQA has become an important research area due to the significant role of topical opinion leaders. However, most previous related work either focuses on using knowledge expertise to find experts for improving the quality of answers, or aims at measuring user influence to identify influential users. In order to identify the true topical opinion leaders, we propose a topical opinion leader identification framework called QALeaderRank, which takes account of both topic-sensitive influence and topical knowledge expertise. In the proposed framework, to measure the topic-sensitive influence of each user, we design a novel influence measure algorithm that exploits both the social and QA features of SCQA, taking into account social network structure, topical similarity and knowledge authority. In addition, we propose three topic-relevant metrics to infer the topical expertise of each user. Extensive experiments, along with an online user study, show that the proposed QALeaderRank achieves significant improvement compared with the state-of-the-art methods. Furthermore, we analyze how users' topic interests change over time and examine the predictability of user topic interest through experiments. The second work focuses on predicting individual socioeconomic status from mobile phone data. Socioeconomic status (SES) is an important social and economic indicator of wide concern. Assessing individual SES can assist related organizations in making a variety of policy decisions. The traditional approach suffers from the extremely high cost of collecting large-scale SES-related survey data. 
With the ubiquity of smart phones, mobile phone data has become a novel, low-cost data source for predicting individual SES. However, predicting individual SES from mobile phone data also poses some new challenges, including sparse individual records, scarce explicit relationships and limited labeled samples, which were not addressed in prior work restricted to regional or household-oriented SES prediction. To address these issues, we propose a semi-supervised Hypergraph based Factor Graph Model (HyperFGM) for individual SES prediction. HyperFGM is able to efficiently capture the associations between SES and individual mobile phone records to handle the individual record sparsity. For the scarce explicit relationships, HyperFGM models implicit high-order relationships among users on the hypergraph structure. In addition, HyperFGM exploits the limited labeled data together with unlabeled data in a semi-supervised way. Experimental results on a set of anonymized real mobile phone data show that HyperFGM greatly outperforms the baseline methods on individual SES prediction. The third work is to predict social media users’ socioeconomic status based on their social media content, which is useful for related organizations and companies in a range of applications, such as economic and social policy-making. Previous work leverages manually defined textual features and platform-based user level attributes from social media content and feeds them into a machine learning based classifier for SES prediction. However, it ignores some important information in social media content, including the order and the hierarchical structure of social media text as well as the relationships among user level attributes. 
To this end, we propose a novel coupled social media content representation model for individual SES prediction, which not only utilizes a hierarchical neural network to incorporate the order and the hierarchical structure of social media text but also employs a coupled attribute representation method to take into account intra-coupled and inter-coupled interaction relationships among user level attributes. The experimental results show that the proposed model significantly outperforms other state-of-the-art models on a real dataset, which validates the efficiency and robustness of the proposed model.

    Topical Alignment in Online Social Systems

    Understanding the dynamics of social interactions is crucial to comprehend human behavior. The emergence of online social media has enabled access to data regarding people's relationships at a large scale. Twitter, specifically, is an information oriented network, with users sharing and consuming information. In this work, we study whether users tend to be in contact with people interested in similar topics, i.e., if they are topically aligned. To do so, we propose an approach based on the use of hashtags to extract information topics from Twitter messages and model users' interests. Our results show that, on average, users are connected with other users similar to them. Furthermore, we show that topical alignment provides interesting information that can eventually allow inferring users' connectivity. Our work, besides providing a way to assess the topical similarity of users, quantifies topical alignment among individuals, contributing to a better understanding of how complex social systems are structured.
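As a rough illustration of the hashtag-based approach described in the abstract above, the sketch below builds a hashtag-count profile per user and compares the mean similarity of connected pairs with the mean over all pairs. The abstract does not name a similarity measure, so cosine similarity is an assumption here, and the helper names and toy tweets are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations
from math import sqrt

HASHTAG = re.compile(r"#(\w+)")


def hashtag_profile(tweets):
    """Count lowercased hashtags across one user's tweets."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


def cosine(a, b):
    """Cosine similarity of two hashtag-count profiles (assumed measure)."""
    dot = sum(a[tag] * b[tag] for tag in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def mean_similarity(profiles, pairs):
    """Average cosine similarity over an iterable of (user, user) pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(cosine(profiles[u], profiles[v]) for u, v in pairs) / len(pairs)


# Toy data (hypothetical users, tweets, and one follow edge).
tweets = {
    "ana": ["Loving #python and #datascience", "More #Python tips"],
    "bob": ["New #python release", "#datascience meetup"],
    "cy": ["#football tonight", "#football #goals"],
}
edges = [("ana", "bob")]

profiles = {user: hashtag_profile(texts) for user, texts in tweets.items()}
connected = mean_similarity(profiles, edges)
overall = mean_similarity(profiles, combinations(sorted(profiles), 2))
```

If users are topically aligned, `connected` should exceed `overall`, as it does in this toy data. A real analysis would also need a null model to show that the gap is not due to chance.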

    POISED: Spotting Twitter Spam Off the Beaten Paths

    Cybercriminals have found in online social networks a propitious medium to spread spam and malicious content. Existing techniques for detecting spam include predicting the trustworthiness of accounts and analyzing the content of these messages. However, advanced attackers can still successfully evade these defenses. Online social networks bring people who have personal connections or share common interests to form communities. In this paper, we first show that users within a networked community share some topics of interest. Moreover, content shared on these social networks tends to propagate according to the interests of people. Dissemination paths may emerge where some communities post similar messages, based on the interests of those communities. Spam and other malicious content, on the other hand, follow different spreading patterns. In this paper, we follow this insight and present POISED, a system that leverages the differences in propagation between benign and malicious messages on social networks to identify spam and other unwanted content. We test our system on a dataset of 1.3M tweets collected from 64K users, and we show that our approach is effective in detecting malicious messages, reaching 91% precision and 93% recall. We also show that POISED's detection is more comprehensive than previous systems, by comparing it to three state-of-the-art spam detection systems that have been proposed by the research community in the past. POISED significantly outperforms each of these systems. Moreover, through simulations, we show how POISED is effective in the early detection of spam messages and how it is resilient against two well-known adversarial machine learning attacks.

    Of Echo Chambers and Contrarian Clubs: Exposure to Political Disagreement Among German and Italian Users of Twitter

    Scholars have debated whether social media platforms, by allowing users to select the information to which they are exposed, may lead people to isolate themselves from viewpoints with which they disagree, thereby serving as political “echo chambers.” We investigate hypotheses concerning the circumstances under which Twitter users who communicate about elections would engage with (a) supportive, (b) oppositional, and (c) mixed political networks. Based on online surveys of representative samples of Italian and German individuals who posted at least one Twitter message about elections in 2013, we find substantial differences in the extent to which social media facilitates exposure to similar versus dissimilar political views. Our results suggest that exposure to supportive, oppositional, or mixed political networks on social media can be explained by broader patterns of political conversation (i.e., structure of offline networks) and specific habits in the political use of social media (i.e., the intensity of political discussion). These findings suggest that disagreement persists on social media even when ideological homophily is the modal outcome, and that scholars should pay more attention to specific situational and dispositional factors when evaluating the implications of social media for political communication.