5,721 research outputs found

    Link Prediction in Social Networks: the State-of-the-Art

    In social networks, link prediction, which predicts missing links in current networks and new or dissolving links in future networks, is important for mining and analyzing the evolution of social networks. In the past decade, much work has been done on link prediction in social networks. The goal of this paper is to comprehensively review, analyze, and discuss the state of the art of link prediction in social networks. A systematic categorization of link prediction techniques and problems is presented. Then link prediction techniques and problems are analyzed and discussed. Typical applications of link prediction are also addressed. Achievements and roadmaps of some active research groups are introduced. Finally, some future challenges of link prediction in social networks are discussed. Comment: 38 pages, 13 figures, Science China: Information Science, 201

    Predicting Anchor Links between Heterogeneous Social Networks

    People usually get involved in multiple social networks to enjoy new services or to fulfill their needs. Many new social networks try to attract users of other existing networks to increase the number of their users. Once a user (called the source user) of a social network (called the source network) joins a new social network (called the target network), a new inter-network link (called an anchor link) is formed between the source and target networks. In this paper, we concentrate on predicting the formation of such anchor links between heterogeneous social networks. Unlike conventional link prediction problems, in which the formation of a link between two existing users within a single network is predicted, in anchor link prediction the target user is missing and will be added to the target network once the anchor link is created. To solve this problem, we use meta-paths as a powerful tool for utilizing heterogeneous information in both the source and target networks. To this end, we propose an effective general meta-path-based approach called Connector and Recursive Meta-Paths (CRMP). By using those two different categories of meta-paths, we model different aspects of the social factors that may lead a source user to join the target network, resulting in the formation of a new anchor link. Extensive experiments on real-world heterogeneous social networks demonstrate the effectiveness of the proposed method against recent methods. Comment: To be published in "Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)"

    Reciprocal versus Parasocial Relationships in Online Social Networks

    Many online social networks are fundamentally directed, i.e., they consist of both reciprocal edges (edges that have already been linked back) and parasocial edges (edges that have not been linked back). Thus, understanding the structure and evolution of reciprocal and parasocial edges, exploring the factors that influence parasocial edges to become reciprocal ones, and predicting whether a parasocial edge will turn into a reciprocal one are basic research problems. However, there have been few systematic studies of such problems. In this paper, we bridge this gap using a novel large-scale Google+ dataset that we crawled ourselves as well as one publicly available social network dataset. First, we compare the structure and evolution of reciprocal edges with those of parasocial edges. For instance, we find that reciprocal edges are more likely to connect users with similar degrees, while parasocial edges are more likely to link ordinary users (e.g., users with low degrees) and popular users (e.g., celebrities). However, the impact of reciprocal edges linking ordinary and popular users on the network structure increases slowly as the social networks evolve. Second, we observe that factors including user behaviors, node attributes, and edge attributes all have significant impacts on the formation of reciprocal edges. Third, in contrast to previous studies that treat reciprocal edge prediction as either a supervised or a semi-supervised learning problem, we find that reciprocal edge prediction is better modeled as an outlier detection problem. Finally, we perform extensive evaluations with the two datasets, and we show that our proposal outperforms previous reciprocal edge prediction approaches. Comment: Social Network Analysis and Mining, Springer, 201

    Anxious Depression Prediction in Real-time Social Data

    Mental well-being and social media have been closely related domains of study. In this research, a novel model, the AD prediction model, is proposed for anxious depression prediction in real-time tweets. This mixed anxiety-depressive disorder is predominantly associated with erratic thought processes, restlessness, and sleeplessness. Based on linguistic cues and user posting patterns, the feature set is defined using a 5-tuple vector <word, timing, frequency, sentiment, contrast>. An anxiety-related lexicon is built to detect the presence of anxiety indicators. The timing and frequency of tweets are analyzed for irregularities, and opinion polarity analytics are used to find inconsistencies in posting behaviour. The model is trained using three classifiers (multinomial naïve Bayes, gradient boosting, and random forest), and majority voting is performed using an ensemble voting classifier. Preliminary results are evaluated on the tweets of 100 sampled users, and the proposed model achieves a classification accuracy of 85.09%.
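
The majority-voting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes hard (label-level) voting over the outputs of three already-trained binary classifiers; the function name `majority_vote` and the toy prediction lists are hypothetical and are not taken from the paper.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one list by hard majority vote.

    predictions: list of label lists, one per classifier, all the same length.
    """
    voted = []
    # zip(*predictions) groups the labels that each classifier gave to one sample.
    for labels in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

# Hypothetical outputs of three classifiers on four tweets (1 = positive class).
naive_bayes_preds = [1, 0, 1, 0]
gradient_boosting_preds = [1, 1, 0, 0]
random_forest_preds = [0, 1, 1, 0]

ensemble_preds = majority_vote(
    [naive_bayes_preds, gradient_boosting_preds, random_forest_preds]
)
print(ensemble_preds)
```

With an odd number of binary classifiers no ties can occur, which is one common reason for choosing three voters.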

    mvn2vec: Preservation and Collaboration in Multi-View Network Embedding

    Multi-view networks are broadly present in real-world applications. In the meantime, network embedding has emerged as an effective representation learning approach for networked data. Therefore, we are motivated to study the problem of multi-view network embedding with a focus on the optimization objectives that are specific and important in embedding this type of network. In our practice of embedding real-world multi-view networks, we explicitly identify two such objectives, which we refer to as preservation and collaboration. These two objectives are analyzed in depth throughout this paper. In addition, the mvn2vec algorithms are proposed to (i) study how varying degrees of preservation and collaboration can impact embedding learning and (ii) explore the feasibility of achieving better embedding quality by modeling them simultaneously. With experiments on a series of synthetic datasets, a large-scale internal Snapchat dataset, and two public datasets, we confirm the validity and importance of preservation and collaboration as two objectives for multi-view network embedding. These experiments further demonstrate that better embeddings can be obtained by simultaneously modeling the two objectives, without over-complicating the model or requiring additional supervision. The code and the processed datasets are available at http://yushi2.web.engr.illinois.edu/

    Learning multi-faceted representations of individuals from heterogeneous evidence using neural networks

    Inferring latent attributes of people online is an important social computing task, but it requires integrating the many heterogeneous sources of information available on the web. We propose learning individual representations of people using neural nets to integrate rich linguistic and network evidence gathered from social media. The algorithm is able to combine diverse cues, such as the text a person writes, their attributes (e.g. gender, employer, education, location), and their social relations to other people. We show that by integrating both textual and network evidence, these representations offer improved performance at four important tasks in social media inference on Twitter: predicting (1) gender, (2) occupation, (3) location, and (4) friendships for users. Our approach scales to large datasets, and the learned representations can be used as general features in, and have the potential to benefit, a large number of downstream tasks including link prediction, community detection, or probabilistic reasoning over social networks.

    Link Prediction in Multiplex Networks based on Interlayer Similarity

    Some networked systems can be better modelled by a multilayer structure in which the individual nodes develop relationships in multiple layers. Multilayer networks with similar nodes across layers are also known as multiplex networks. This manuscript proposes a novel framework for predicting forthcoming or missing links in multiplex networks. The link prediction problem in multiplex networks is how to predict links in one of the layers, taking into account the structural information of other layers. The proposed link prediction framework is based on interlayer similarity and proximity-based features extracted from the layer for which the link prediction is considered. To this end, commonly used proximity-based features such as Adamic-Adar and the Jaccard Coefficient are considered. These features, which were originally proposed to predict missing links in monolayer networks, do not require learning and thus are simple to compute. The proposed method introduces a systematic approach to take interlayer similarity into account for link prediction. Experimental results on both synthetic and real multiplex networks reveal the effectiveness of the proposed method and show that it outperforms state-of-the-art algorithms proposed for the link prediction problem in multiplex networks.
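
The two proximity-based features named in this abstract have standard definitions for a single (monolayer) undirected graph, sketched below. This covers only those two scores, not the interlayer-similarity framework the paper proposes; the toy edge list and the function names are assumptions made for illustration.

```python
import math

def jaccard(adj, u, v):
    """Jaccard Coefficient: |N(u) & N(v)| / |N(u) | N(v)|."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    """Adamic-Adar: sum of 1 / log(degree(w)) over common neighbours w."""
    # The degree > 1 guard avoids division by log(1) = 0 on malformed input.
    return sum(
        1.0 / math.log(len(adj[w]))
        for w in adj[u] & adj[v]
        if len(adj[w]) > 1
    )

# Hypothetical undirected toy graph, stored as node -> set of neighbours.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = {}
for x, y in edges:
    adj.setdefault(x, set()).add(y)
    adj.setdefault(y, set()).add(x)

# Score the currently unlinked pair (a, d); higher scores suggest a likelier link.
jc_score = jaccard(adj, "a", "d")
aa_score = adamic_adar(adj, "a", "d")
print(jc_score, aa_score)
```

Both scores need only neighbour sets, which is why such features require no training step.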

    Supervised Rank Aggregation for Predicting Influence in Networks

    Much work in Social Network Analysis has focused on the identification of the most important actors in a social network. This has resulted in several measures of influence and authority. While most such sociometrics (e.g., PageRank) are driven by intuitions based on an actor's location in a network, asking for the "most influential" actors is in itself an ill-posed question unless it is put in context with a specific measurable task. Constructing a predictive task of interest in a given domain provides a mechanism to quantitatively compare different measures of influence. Furthermore, when we know what type of actionable insight to gather, we need not rely on a single network centrality measure. A combination of measures is more likely to capture various aspects of the social network that are predictive and beneficial for the task. Towards this end, we propose an approach to supervised rank aggregation, driven by techniques from Social Choice Theory. We illustrate the effectiveness of this method through experiments on Twitter and citation networks.
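
To make the idea of combining several centrality rankings concrete, the sketch below uses the Borda count, a classic unsupervised rule from social choice theory. It is not the supervised method proposed in the paper; the three input rankings, the user identifiers, and the function name `borda_aggregate` are all hypothetical.

```python
def borda_aggregate(rankings):
    """Merge several rankings of the same items using the Borda count.

    Each ranking lists items from most to least important. An item in
    position p of a ranking with n items receives n - 1 - p points.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - 1 - position)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical rankings of four users under three centrality measures.
by_pagerank = ["u1", "u2", "u3", "u4"]
by_degree = ["u2", "u1", "u4", "u3"]
by_betweenness = ["u2", "u3", "u1", "u4"]

consensus = borda_aggregate([by_pagerank, by_degree, by_betweenness])
print(consensus)
```

A supervised variant would instead learn how much weight each input ranking deserves from labelled outcomes of the predictive task.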

    CONE: Community Oriented Network Embedding

    Detecting communities has long been popular in the research on networks. It is usually modeled as an unsupervised clustering problem on graphs, based on heuristic assumptions about community characteristics, such as edge density and node homogeneity. In this work, we question the universality of these widely adopted assumptions and compare human-labeled communities with machine-predicted ones obtained via various mainstream algorithms. Based on supportive results, we argue that communities are defined by various social patterns and that unsupervised learning based on heuristics is incapable of capturing all of them. Therefore, we propose to inject supervision into community detection through Community Oriented Network Embedding (CONE), which leverages limited ground-truth communities as examples to learn an embedding model aware of the social patterns underlying them. Specifically, a deep architecture is developed by combining recurrent neural networks with random walks on graphs to capture social patterns directed by ground-truth communities. Generic clustering algorithms applied to the embeddings of other nodes produced by the learned model then effectively reveal more communities that share similar social patterns with the ground-truth ones. Comment: 10 pages, accepted by IJCNN 201

    Latent Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey

    Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data and text documents. Researchers have published many articles in the field of topic modeling and applied it in various fields such as software engineering, political science, and medical and linguistic science. There are various methods for topic modeling, of which Latent Dirichlet Allocation (LDA) is one of the most popular. Researchers have proposed various models based on LDA in topic modeling. Building on this previous work, this paper aims to provide a useful introduction to LDA approaches in topic modeling. In this paper, we investigate highly relevant scholarly articles (between 2003 and 2016) related to topic modeling based on LDA to discover the research development, current trends, and intellectual structure of topic modeling. We also summarize challenges and introduce well-known tools and datasets in topic modeling based on LDA. Comment: arXiv admin note: text overlap with arXiv:1505.07302 by other author