99 research outputs found

    On the Role of Social Identity and Cohesion in Characterizing Online Social Communities

    Two prevailing theories for explaining social group or community structure are cohesion and identity. The social cohesion approach posits that social groups arise out of an aggregation of individuals who have mutual interpersonal attraction because they share common characteristics. These characteristics can range from common interests to kinship ties and from social values to ethnic backgrounds. In contrast, the social identity approach posits that an individual is likely to join a group based on an intrinsic self-evaluation at a cognitive or perceptual level. In other words, group members typically share an awareness of a common category membership. In this work we seek to understand the role of these two contrasting theories in explaining the behavior and stability of social communities in Twitter. A specific focal point of our work is to understand the role of these theories in disparate contexts ranging from disaster response to socio-political activism. We extract social identity and social cohesion features of interest for large-scale datasets of five real-world events and examine the effectiveness of such features in capturing behavioral characteristics and the stability of groups. We also propose a novel measure of social group sustainability based on the divergence in group discussion. Our main findings are: 1) Sharing of social identities (especially physical location) among group members has a positive impact on group sustainability, 2) Structural cohesion (represented by high group density and low average shortest path length) is a strong indicator of group sustainability, and 3) Event characteristics play a role in shaping group sustainability, as social groups in transient events behave differently from groups in events that last longer.
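The two structural cohesion indicators named in this abstract, group density and average shortest path length, have standard graph definitions. The following is a minimal sketch of those definitions for an undirected group graph, not the authors' feature-extraction code; the adjacency-dictionary representation and the toy graphs `triangle` and `path` are assumptions made for illustration.

```python
from collections import deque


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    # Each undirected edge appears twice in the adjacency sets.
    m = sum(len(neighbours) for neighbours in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def average_shortest_path_length(adj):
    """Mean hop distance over all ordered pairs of mutually reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Hypothetical toy groups: a fully connected triangle and a three-node chain.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

print(density(triangle), average_shortest_path_length(triangle))
print(density(path), average_shortest_path_length(path))
```

In this toy comparison the triangle is denser and has shorter paths than the chain, which is the direction the abstract associates with higher group sustainability.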

    Reverse Engineering Socialbot Infiltration Strategies in Twitter

    Data extracted from social networks like Twitter are increasingly being used to build applications and services that mine and summarize public reactions to events, such as traffic monitoring platforms, identification of epidemic outbreaks, and public perception about people and brands. However, such services are vulnerable to attacks from socialbots (automated accounts that mimic real users) seeking to tamper with statistics by posting messages generated automatically and interacting with legitimate users. Potentially, if created at large scale, socialbots could be used to bias or even invalidate many existing services, by infiltrating the social networks and acquiring the trust of other users over time. This study aims at understanding infiltration strategies of socialbots in the Twitter microblogging platform. To this end, we create 120 socialbot accounts with different characteristics and strategies (e.g., gender specified in the profile, how active they are, the method used to generate their tweets, and the group of users they interact with), and investigate the extent to which these bots are able to infiltrate the Twitter social network. Our results show that even socialbots employing simple automated mechanisms are able to successfully infiltrate the network. Additionally, using a 2^k factorial design, we quantify the infiltration effectiveness of different bot strategies. Our analysis unveils findings that are key for the design of detection and countermeasure approaches.
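A 2^k factorial design tests every combination of k two-level factors, so the effect of each factor can be estimated from the same set of runs. The sketch below shows one common way to compute main effects from such a design; it is an illustration under stated assumptions, not the study's analysis. The factor names, the -1/+1 level coding, and the response values are all hypothetical.

```python
from itertools import product


def main_effects(factors, response):
    """Main effect of each factor: mean response at +1 minus mean at -1.

    `response` maps every level combination (a tuple of -1/+1 values,
    one per factor) to the measured outcome for that run.
    """
    runs = list(product((-1, 1), repeat=len(factors)))
    effects = {}
    for j, name in enumerate(factors):
        high = [response[run] for run in runs if run[j] == 1]
        low = [response[run] for run in runs if run[j] == -1]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects


# Hypothetical 2^2 design: two bot attributes, made-up infiltration scores.
factors = ["activity", "tweet_method"]
response = {
    (-1, -1): 10.0,
    (-1, 1): 12.0,
    (1, -1): 20.0,
    (1, 1): 22.0,
}

effects = main_effects(factors, response)
print(effects)
```

With these made-up numbers, switching `activity` from low to high moves the response far more than switching `tweet_method`, which is the kind of comparison a factorial design makes possible.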

    Social Networks Influence Analysis

    Pew Research Center estimates that as of 2014, 74% of Internet users used social media, i.e., more than 2.4 billion users. With the growing popularity of social media, where Internet users exchange their opinions on many things including their daily life encounters, it is not surprising that many organizations are interested in learning what users say about their products and services. To be able to play a proactive role in steering what users say, many organizations have engaged in efforts aimed at identifying efficient ways of marketing certain products and services, and making sure user reviews are somewhat favorable. Favorable reviews are typically achieved by identifying users on social networks who have strong influence over a large number of other users, i.e., influential users. Twitter has emerged as one of the prominent social network services, with 320 million monthly active users worldwide. Based on the literature, influential Twitter users have typically been analyzed using the following three models: topic-based model, topology-based model, and user characteristics-based model. The topology-based model is criticized for being static, i.e., it does not adapt to social network changes such as users' new posts or new relationships. The user characteristics-based model was presented as an alternative approach; however, it was criticized for discounting the impact of interactions between users, and users' interests. Lastly, the topic-based model, while sensitive to users' interests, typically ignores inter-user interactions. This thesis research introduces a dynamic, comprehensive and topic-sensitive approach for identifying social network influencers, leveraging the strengths of the aforementioned models. Three separate experiments were conducted to evaluate the new approach using the information diffusion measure. 
In these experiments, software was developed to capture users’ tweets pertinent to a topic over a period of time, and store the tweet’s metadata in a relational database. A graph representing users was extracted from the database. The new approach was applied to the users’ graph to compute an influence score for each user. Results show that the new composite influence score is more accurate in comprehensively identifying true influential users, when compared to scores calculated using the characteristics-based, topic-based, and topology-based models. Also, this research shows that the new approach could leverage a variety of machine learning algorithms to accurately identify influencers. Last, while the focus of this research was on Twitter, our approach may be applicable to other social networks and micro-blogging services

    Identifying the topic-specific influential users in Twitter

    Social influence can be described as the ability to have an effect on the thoughts or actions of others. Influential members in online communities are becoming the new media to market products and sway opinions. Also, their guidance and recommendations can save some people search time and assist their selective decision making. The objective of this research is to detect the influential users in a specific topic on Twitter. In more detail, from a collection of tweets matching a specified query, we want to detect the influential users in an online fashion. In order to address this objective, we first want to focus our search on the individuals who write in their personal accounts, so we investigate how we can differentiate between personal and non-personal accounts. Secondly, we investigate which set of features can best lead us to the topic-specific influential users, and how these features can be expressed in a model to produce a ranked list of influential users. Finally, we look into the use of language and whether it can be used as a supporting feature for detecting the author's influence. In order to decide how to differentiate between personal and non-personal accounts, we compared the effectiveness of using SVM with that of using a manually assembled list of non-personal accounts. In order to decide on the features that can best lead us to the influential users, we ran a few experiments on a set of features inspired by the literature. Two ranking methods were then developed, using feature combinations, to identify the candidate users for being influential. For evaluation we manually examined the users, looking at their tweets and profile page in order to decide on their influence. To address our final objective, we ran a few experiments to investigate whether the SLM could be used to identify the influential users' tweets. 
For user account classification into personal and non-personal accounts, the SVM was found to be domain independent, reliable and consistent, with a precision of over 0.9. The results showed that the list's performance deteriorates over time, and that when the domain of the test data was changed, the SVM performed better than the list, with higher precision and specificity values. We extracted eight independent features from a set of 12, ran experiments on these eight, and found the best features for identifying influential users to be the Followers count, the Average Retweets count, the Average Retweets Frequency and the Age_Activity combination. Two ranking methods were developed and tested on a set of tweets retrieved using a specific query. In the first method, these best four features were combined in different ways. The best combination was the one that took the average of the Followers count and the Average Retweets count, producing a precision at 10 value of 0.9. In the second method, the users were ranked according to the eight independent features and the top 50 users of each were included in separate lists. The users were then ranked according to their appearance frequency in these lists. The best result was obtained when we considered the users who appeared in six or more of the lists, which resulted in a precision of 1.0. Both ranking methods were then applied to 20 different collections of retrieved tweets to verify their effectiveness in detecting influential users, and to compare their performance. The best result was obtained by the second method, for the set of users who appeared in six or more of the lists, with the highest precision mean of 0.692. Finally, for the SLM, we found a correlation between the users' average Retweets counts and their tweets' perplexity values, which supports the hypothesis that the SLM can be trained to detect highly retweeted tweets. 
However, the use of perplexity for identifying influential users resulted in very low precision values. The contributions of this thesis can be summarized as follows. A method to classify personal accounts was proposed. The features that help detect influential users were identified to be the Followers count, the Average Retweets count, the Average Retweets Frequency and the Age_Activity combination. Two methods for identifying the influential users were proposed. Finally, the simplistic approach using the SLM did not produce good results, and there is still a lot of work to be done before the SLM can be used for identifying influential users.
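The second ranking method described in this abstract (rank users by each feature, keep the top users per feature, then rank users by how many of those lists they appear in) can be sketched as follows. This is a hedged illustration, not the thesis code: the function name `rank_by_list_appearances`, the feature keys, and the three toy users are assumptions, and the toy call uses a top-2 cutoff and a two-list threshold in place of the top 50 and six lists reported above.

```python
from collections import Counter


def rank_by_list_appearances(users, features, top_n=50, min_lists=6):
    """Rank users by the number of per-feature top-N lists they appear in."""
    counts = Counter()
    for feature in features:
        # One list per feature: users sorted by that feature, highest first.
        ranked = sorted(users, key=lambda u: users[u][feature], reverse=True)
        counts.update(ranked[:top_n])
    kept = [(user, c) for user, c in counts.items() if c >= min_lists]
    # Most appearances first; ties broken alphabetically for determinism.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Hypothetical users and feature values, for illustration only.
users = {
    "alice": {"followers": 900, "avg_retweets": 40, "retweet_freq": 0.8},
    "bob": {"followers": 500, "avg_retweets": 5, "retweet_freq": 0.9},
    "carol": {"followers": 100, "avg_retweets": 30, "retweet_freq": 0.1},
}
features = ["followers", "avg_retweets", "retweet_freq"]

ranking = rank_by_list_appearances(users, features, top_n=2, min_lists=2)
print(ranking)
```

Raising `min_lists` trades recall for precision: fewer users pass the threshold, but each one ranks highly on more features.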

    Probabilistic Personalized Recommendation Models For Heterogeneous Social Data

    Content recommendation has risen to a new dimension with the advent of platforms like Twitter, Facebook, FriendFeed, Dailybooth, and Instagram. Although this flood of data has provided us with a goldmine of real-world information, the problem of information overload has become a major barrier in developing predictive models. Therefore, the objective of this Thesis is to propose various recommendation, prediction and information retrieval models that are capable of leveraging such vast heterogeneous content. More specifically, this Thesis focuses on proposing models based on probabilistic generative frameworks for the following tasks: (a) recommending backers and projects in the Kickstarter crowdfunding domain and (b) point of interest recommendation in Foursquare. Through a comprehensive set of experiments over a variety of datasets, we show that our models are capable of providing practically useful results for recommendation and information retrieval tasks.

    Inferring user interests in microblogging social networks: a survey

    With the growing popularity of microblogging services such as Twitter in recent years, an increasing number of users are using these services in their daily lives. The huge volume of information generated by users raises new opportunities in various applications and areas. Inferring user interests plays a significant role in providing personalized recommendations on microblogging services, and also on third-party applications providing social logins via these services, especially in cold-start situations. In this survey, we review user modeling strategies with respect to inferring user interests from previous studies. To this end, we focus on four dimensions of inferring user interest profiles: (1) data collection, (2) representation of user interest profiles, (3) construction and enhancement of user interest profiles, and (4) the evaluation of the constructed profiles. Through this survey, we aim to provide an overview of state-of-the-art user modeling strategies for inferring user interest profiles on microblogging social networks with respect to the four dimensions. For each dimension, we review and summarize previous studies based on specified criteria. Finally, we discuss some challenges and opportunities for future work in this research domain

    Making news in 140 characters: how the new media environment is changing our examination of audiences, journalists, and content

    This project answers the following questions: What does political reporting on social media look like? How is political journalists’ use of social media changing their relationships with sources and fellow political journalists? Triangulating qualitative and quantitative research methods (content analysis, social network analysis, and in-depth interviews) in an examination of Twitter, a social media platform popular among journalists, this project provides insight into how changes in media routines are affecting news content