    Behavioral Communities and the Atomic Structure of Networks

    We develop a theory of "behavioral communities" and the "atomic structure" of networks. We define atoms to be groups of agents whose behaviors always match each other in a set of coordination games played on the network. This provides a microfoundation for a method of detecting communities in social and economic networks. We provide theoretical results characterizing such behavior-based communities and atomic structures and discussing their properties in large random networks. We also provide an algorithm for identifying behavioral communities. We discuss applications including a method of estimating underlying preferences by observing behavioral conventions in data, and optimally seeding diffusion processes when there are peer interactions and homophily. We illustrate the techniques with applications to high school friendship networks and rural village networks.

    Topics in social network analysis and network science

    This chapter introduces statistical methods used in the analysis of social networks and in the rapidly evolving parallel field of network science. Although several instances of social network analysis in health services research have appeared recently, the majority involve only the most basic methods and thus only scratch the surface of what might be accomplished. Cutting-edge methods are presented using relevant examples and illustrations from health services research.

    Latent Space Model for Multi-Modal Social Data

    With the emergence of social networking services, researchers enjoy the increasing availability of large-scale heterogeneous datasets capturing online user interactions and behaviors. Traditional analysis of techno-social systems data has focused mainly on describing either the dynamics of social interactions, or the attributes and behaviors of the users. However, overwhelming empirical evidence suggests that the two dimensions affect one another, and therefore they should be jointly modeled and analyzed in a multi-modal framework. The benefits of such an approach include the ability to build better predictive models, leveraging social network information as well as user behavioral signals. To this end, we propose the Constrained Latent Space Model (CLSM), a generalized framework that combines Mixed Membership Stochastic Blockmodels (MMSB) and Latent Dirichlet Allocation (LDA), incorporating a constraint that forces the latent space to concurrently describe the multiple data modalities. We derive an efficient inference algorithm based on Variational Expectation Maximization that has a computational cost linear in the size of the network, thus making it feasible to analyze massive social datasets. We validate the proposed framework on two problems: prediction of social interactions from user attributes and behaviors, and behavior prediction exploiting network information. We perform experiments with a variety of multi-modal social systems, spanning location-based social networks (Gowalla), social media services (Instagram, Orkut), e-commerce and review sites (Amazon, Ciao), and finally citation networks (Cora). The results indicate significant improvement in prediction accuracy over state-of-the-art methods, and demonstrate the flexibility of the proposed approach for addressing a variety of different learning problems commonly occurring with multi-modal social data. Comment: 12 pages, 7 figures, 2 tables.

    POISED: Spotting Twitter Spam Off the Beaten Paths

    Cybercriminals have found in online social networks a propitious medium to spread spam and malicious content. Existing techniques for detecting spam include predicting the trustworthiness of accounts and analyzing the content of these messages. However, advanced attackers can still successfully evade these defenses. Online social networks bring together people who have personal connections or share common interests, and these people form communities. In this paper, we first show that users within a networked community share some topics of interest. Moreover, content shared on these social networks tends to propagate according to the interests of people. Dissemination paths may emerge where some communities post similar messages, based on the interests of those communities. Spam and other malicious content, on the other hand, follow different spreading patterns. We follow this insight and present POISED, a system that leverages the differences in propagation between benign and malicious messages on social networks to identify spam and other unwanted content. We test our system on a dataset of 1.3M tweets collected from 64K users, and we show that our approach is effective in detecting malicious messages, reaching 91% precision and 93% recall. We also show that POISED's detection is more comprehensive than previous systems, by comparing it to three state-of-the-art spam detection systems that have been proposed by the research community in the past. POISED significantly outperforms each of these systems. Moreover, through simulations, we show how POISED is effective in the early detection of spam messages and how it is resilient against two well-known adversarial machine learning attacks.

    Large scale homophily analysis in twitter using a twixonomy

    In this paper we perform a large-scale homophily analysis on Twitter using a hierarchical representation of users' interests which we call a Twixonomy. In order to build a population, community, or single-user Twixonomy, we first associate "topical" friends in users' friendship lists (i.e. friends representing an interest rather than a social relation between peers) with Wikipedia categories. A word-sense disambiguation algorithm is used to select the appropriate wikipage for each topical friend. Starting from the set of wikipages representing "primitive" interests, we extract all paths connecting these pages with topmost Wikipedia category nodes, and we then prune the resulting graph G efficiently so as to induce a directed acyclic graph. This graph is the Twixonomy. Then, to analyze homophily, we compare different methods to detect communities in a peer friends Twitter network, and for each community we compute the degree of homophily on the basis of a measure of pairwise semantic similarity. We show that the Twixonomy provides a means for describing users' interests in a compact and readable way and allows for a fine-grained homophily analysis. Furthermore, we show that mid-low level categories in the Twixonomy represent the best balance between informativeness and compactness of the representation.

    A SYSTEMATIC REVIEW OF COMPUTATIONAL METHODS IN AND RESEARCH TAXONOMY OF HOMOPHILY IN INFORMATION SYSTEMS

    Homophily is both a principle for social group formation with like-minded people and a mechanism for social interactions. Recent years have seen a growing body of management research on homophily, particularly on large-scale social media and digital platforms. However, the predominant traditional qualitative and quantitative methods employed face validity issues and/or are not well-suited for big social data. There are scant guidelines for applying computational methods to specific research domains concerning descriptive patterns, explanatory mechanisms, or predictive indicators of homophily. To fill this research gap, this paper offers a structured review of the emerging literature on computational social science approaches to homophily, with a particular emphasis on their relevance, appropriateness, and importance to information systems research. We derive a research taxonomy for homophily and offer methodological reflections and recommendations to help inform future research.