32,706 research outputs found

    Jointly Predicting Links and Inferring Attributes using a Social-Attribute Network (SAN)

    Full text link
    The effects of social influence and homophily suggest that both network structure and node attribute information should inform the tasks of link prediction and node attribute inference. Recently, Yin et al. proposed Social-Attribute Network (SAN), an attribute-augmented social network, to integrate network structure and node attributes to perform both link prediction and attribute inference. They focused on generalizing the random walk with restart algorithm to the SAN framework and showed improved performance. In this paper, we extend the SAN framework with several leading supervised and unsupervised link prediction algorithms and demonstrate performance improvement for each algorithm on both link prediction and attribute inference. Moreover, we make the novel observation that attribute inference can help inform link prediction, i.e., link prediction accuracy is further improved by first inferring missing attributes. We comprehensively evaluate these algorithms and compare them with other existing algorithms using a novel, large-scale Google+ dataset, which we make publicly available.Comment: 9 pages, 4 figures and 4 table

    Link Prediction in Social Networks: the State-of-the-Art

    Full text link
    In social networks, link prediction predicts missing links in current networks and new or dissolution links in future networks, is important for mining and analyzing the evolution of social networks. In the past decade, many works have been done about the link prediction in social networks. The goal of this paper is to comprehensively review, analyze and discuss the state-of-the-art of the link prediction in social networks. A systematical category for link prediction techniques and problems is presented. Then link prediction techniques and problems are analyzed and discussed. Typical applications of link prediction are also addressed. Achievements and roadmaps of some active research groups are introduced. Finally, some future challenges of the link prediction in social networks are discussed.Comment: 38 pages, 13 figures, Science China: Information Science, 201

    Multimodal Deep Network Embedding with Integrated Structure and Attribute Information

    Full text link
    Network embedding is the process of learning low-dimensional representations for nodes in a network, while preserving node features. Existing studies only leverage network structure information and focus on preserving structural features. However, nodes in real-world networks often have a rich set of attributes providing extra semantic information. It has been demonstrated that both structural and attribute features are important for network analysis tasks. To preserve both features, we investigate the problem of integrating structure and attribute information to perform network embedding and propose a Multimodal Deep Network Embedding (MDNE) method. MDNE captures the non-linear network structures and the complex interactions among structures and attributes, using a deep model consisting of multiple layers of non-linear functions. Since structures and attributes are two different types of information, a multimodal learning method is adopted to pre-process them and help the model to better capture the correlations between node structure and attribute information. We employ both structural proximity and attribute proximity in the loss function to preserve the respective features and the representations are obtained by minimizing the loss function. Results of extensive experiments on four real-world datasets show that the proposed method performs significantly better than baselines on a variety of tasks, which demonstrate the effectiveness and generality of our method.Comment: 15 pages, 10 figure

    Deep Generative Models for Relational Data with Side Information

    Full text link
    We present a probabilistic framework for overlapping community discovery and link prediction for relational data, given as a graph. The proposed framework has: (1) a deep architecture which enables us to infer multiple layers of latent features/communities for each node, providing superior link prediction performance on more complex networks and better interpretability of the latent features; and (2) a regression model which allows directly conditioning the node latent features on the side information available in form of node attributes. Our framework handles both (1) and (2) via a clean, unified model, which enjoys full local conjugacy via data augmentation, and facilitates efficient inference via closed form Gibbs sampling. Moreover, inference cost scales in the number of edges which is attractive for massive but sparse networks. Our framework is also easily extendable to model weighted networks with count-valued edges. We compare with various state-of-the-art methods and report results, both quantitative and qualitative, on several benchmark data sets

    A Survey of Heterogeneous Information Network Analysis

    Full text link
    Most real systems consist of a large number of interacting, multi-typed components, while most contemporary researches model them as homogeneous networks, without distinguishing different types of objects and links in the networks. Recently, more and more researchers begin to consider these interconnected, multi-typed data as heterogeneous information networks, and develop structural analysis approaches by leveraging the rich semantic meaning of structural types of objects and links in the networks. Compared to widely studied homogeneous network, the heterogeneous information network contains richer structure and semantic information, which provides plenty of opportunities as well as a lot of challenges for data mining. In this paper, we provide a survey of heterogeneous information network analysis. We will introduce basic concepts of heterogeneous information network analysis, examine its developments on different data mining tasks, discuss some advanced topics, and point out some future research directions.Comment: 45 pages, 12 figure

    Deep Unified Multimodal Embeddings for Understanding both Content and Users in Social Media Networks

    Full text link
    There has been an explosion of multimodal content generated on social media networks in the last few years, which has necessitated a deeper understanding of social media content and user behavior. We present a novel content-independent content-user-reaction model for social multimedia content analysis. Compared to prior works that generally tackle semantic content understanding and user behavior modeling in isolation, we propose a generalized solution to these problems within a unified framework. We embed users, images and text drawn from open social media in a common multimodal geometric space, using a novel loss function designed to cope with distant and disparate modalities, and thereby enable seamless three-way retrieval. Our model not only outperforms unimodal embedding based methods on cross-modal retrieval tasks but also shows improvements stemming from jointly solving the two tasks on Twitter data. We also show that the user embeddings learned within our joint multimodal embedding model are better at predicting user interests compared to those learned with unimodal content on Instagram data. Our framework thus goes beyond the prior practice of using explicit leader-follower link information to establish affiliations by extracting implicit content-centric affiliations from isolated users. We provide qualitative results to show that the user clusters emerging from learned embeddings have consistent semantics and the ability of our model to discover fine-grained semantics from noisy and unstructured data. Our work reveals that social multimodal content is inherently multimodal and possesses a consistent structure because in social networks meaning is created through interactions between users and content.Comment: Preprint submitted to IJC

    Privacy in Social Media: Identification, Mitigation and Applications

    Full text link
    The increasing popularity of social media has attracted a huge number of people to participate in numerous activities on a daily basis. This results in tremendous amounts of rich user-generated data. This data provides opportunities for researchers and service providers to study and better understand users' behaviors and further improve the quality of the personalized services. Publishing user-generated data risks exposing individuals' privacy. Users privacy in social media is an emerging task and has attracted increasing attention in recent years. These works study privacy issues in social media from the two different points of views: identification of vulnerabilities, and mitigation of privacy risks. Recent research has shown the vulnerability of user-generated data against the two general types of attacks, identity disclosure and attribute disclosure. These privacy issues mandate social media data publishers to protect users' privacy by sanitizing user-generated data before publishing it. Consequently, various protection techniques have been proposed to anonymize user-generated social media data. There is a vast literature on privacy of users in social media from many perspectives. In this survey, we review the key achievements of user privacy in social media. In particular, we review and compare the state-of-the-art algorithms in terms of the privacy leakage attacks and anonymization algorithms. We overview the privacy risks from different aspects of social media and categorize the relevant works into five groups 1) graph data anonymization and de-anonymization, 2) author identification, 3) profile attribute disclosure, 4) user location and privacy, and 5) recommender systems and privacy issues. We also discuss open problems and future research directions for user privacy issues in social media.Comment: This survey is currently under revie

    Learning multi-faceted representations of individuals from heterogeneous evidence using neural networks

    Full text link
    Inferring latent attributes of people online is an important social computing task, but requires integrating the many heterogeneous sources of information available on the web. We propose learning individual representations of people using neural nets to integrate rich linguistic and network evidence gathered from social media. The algorithm is able to combine diverse cues, such as the text a person writes, their attributes (e.g. gender, employer, education, location) and social relations to other people. We show that by integrating both textual and network evidence, these representations offer improved performance at four important tasks in social media inference on Twitter: predicting (1) gender, (2) occupation, (3) location, and (4) friendships for users. Our approach scales to large datasets and the learned representations can be used as general features in and have the potential to benefit a large number of downstream tasks including link prediction, community detection, or probabilistic reasoning over social networks

    Machine Learning on Graphs: A Model and Comprehensive Taxonomy

    Full text link
    There has been a surge of recent interest in learning representations for graph-structured data. Graph representation learning methods have generally fallen into three main categories, based on the availability of labeled data. The first, network embedding (such as shallow graph embedding or graph auto-encoders), focuses on learning unsupervised representations of relational structure. The second, graph regularized neural networks, leverages graphs to augment neural network losses with a regularization objective for semi-supervised learning. The third, graph neural networks, aims to learn differentiable functions over discrete topologies with arbitrary structure. However, despite the popularity of these areas there has been surprisingly little work on unifying the three paradigms. Here, we aim to bridge the gap between graph neural networks, network embedding and graph regularization models. We propose a comprehensive taxonomy of representation learning methods for graph-structured data, aiming to unify several disparate bodies of work. Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which generalizes popular algorithms for semi-supervised learning on graphs (e.g. GraphSage, Graph Convolutional Networks, Graph Attention Networks), and unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc) into a single consistent approach. To illustrate the generality of this approach, we fit over thirty existing methods into this framework. We believe that this unifying view both provides a solid foundation for understanding the intuition behind these methods, and enables future research in the area

    Deep Representation Learning for Social Network Analysis

    Full text link
    Social network analysis is an important problem in data mining. A fundamental step for analyzing social networks is to encode network data into low-dimensional representations, i.e., network embeddings, so that the network topology structure and other attribute information can be effectively preserved. Network representation leaning facilitates further applications such as classification, link prediction, anomaly detection and clustering. In addition, techniques based on deep neural networks have attracted great interests over the past a few years. In this survey, we conduct a comprehensive review of current literature in network representation learning utilizing neural network models. First, we introduce the basic models for learning node representations in homogeneous networks. Meanwhile, we will also introduce some extensions of the base models in tackling more complex scenarios, such as analyzing attributed networks, heterogeneous networks and dynamic networks. Then, we introduce the techniques for embedding subgraphs. After that, we present the applications of network representation learning. At the end, we discuss some promising research directions for future work
    • …
    corecore