
    Don't Walk, Skip! Online Learning of Multi-scale Network Embeddings

    We present Walklets, a novel approach for learning multiscale representations of vertices in a network. In contrast to previous works, these representations explicitly encode multiscale vertex relationships in a way that is analytically derivable. Walklets generates these multiscale relationships by subsampling short random walks on the vertices of a graph. By 'skipping' over steps in each random walk, our method generates a corpus of vertex pairs which are reachable via paths of a fixed length. This corpus can then be used to learn a series of latent representations, each of which captures successively higher order relationships from the adjacency matrix. We demonstrate the efficacy of Walklets' latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, DBLP, Flickr, and YouTube. Our results show that Walklets outperforms new methods based on neural matrix factorization. Specifically, we outperform DeepWalk by up to 10% and LINE by 58% Micro-F1 on challenging multi-label classification tasks. Finally, Walklets is an online algorithm, and can easily scale to graphs with millions of vertices and edges. Comment: 8 pages, ASONAM'1
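The skipping step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy 4-cycle graph, and the walk parameters are all assumptions chosen for illustration, and the step that learns embeddings from each corpus is omitted.

```python
import random


def random_walk(adj, start, length, rng):
    """Uniform random walk visiting up to `length` vertices, starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbors))
    return walk


def skip_pairs(walk, k):
    """Vertex pairs that are exactly k steps apart along one walk."""
    return [(walk[i], walk[i + k]) for i in range(len(walk) - k)]


def multiscale_corpora(adj, scales, walks_per_vertex=2, walk_length=6, seed=0):
    """Build one corpus of vertex pairs per scale k (hypothetical helper)."""
    rng = random.Random(seed)
    corpora = {k: [] for k in scales}
    for vertex in adj:
        for _ in range(walks_per_vertex):
            walk = random_walk(adj, vertex, walk_length, rng)
            for k in scales:
                corpora[k].extend(skip_pairs(walk, k))
    return corpora


# Toy graph (assumption): an undirected 4-cycle 0-1-2-3-0.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
corpora = multiscale_corpora(adj, scales=[1, 2, 3])
```

Each `corpora[k]` holds only pairs joined by a walk of exactly `k` steps, so a separate representation could be trained per scale, for example with a skip-gram style model.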

    Models for Capturing Temporal Smoothness in Evolving Networks for Learning Latent Representation of Nodes

    In a dynamic network, the neighborhoods of the vertices evolve across different temporal snapshots of the network. Accurate modeling of this temporal evolution can help solve complex tasks involving real-life social and interaction networks. However, existing models for learning latent representations are inadequate for obtaining the representation vectors of the vertices for different time-stamps of a dynamic network in a meaningful way. In this paper, we propose latent representation learning models for dynamic networks which overcome the above limitation by considering two different kinds of temporal smoothness: (i) retrofitted, and (ii) linear transformation. The retrofitted model tracks the representation vector of a vertex over time, facilitating vertex-based temporal analysis of a network. On the other hand, the linear transformation based model provides a smooth transition operator which maps the representation vectors of all vertices from one temporal snapshot to the next (unobserved) snapshot; this facilitates prediction of the state of a network in a future time-stamp. We validate the performance of our proposed models by employing them for solving the temporal link prediction task. Experiments on 9 real-life networks from various domains validate that the proposed models are significantly better than the existing models for predicting the dynamics of an evolving network.
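The idea of a transition operator between snapshots can be sketched as follows. The abstract does not state how the operator is learned, so the ordinary least-squares fit here is an assumption, as are the function name `fit_transition`, the synthetic embeddings, and the matrix `w_true`.

```python
import numpy as np


def fit_transition(x_prev, x_next):
    """Least-squares W such that x_prev @ W approximates x_next (assumed objective)."""
    w, *_ = np.linalg.lstsq(x_prev, x_next, rcond=None)
    return w


# Synthetic data (assumption): 6 vertices with 3-dimensional embeddings.
rng = np.random.default_rng(0)
x_t1 = rng.normal(size=(6, 3))
w_true = np.array([[0.9, 0.1, 0.0],
                   [0.0, 0.8, 0.2],
                   [0.1, 0.0, 0.9]])
x_t2 = x_t1 @ w_true  # the second snapshot follows the linear map exactly

w_hat = fit_transition(x_t1, x_t2)

# Apply the fitted operator once more to extrapolate to an unobserved snapshot.
x_t3_pred = x_t2 @ w_hat
```

In this noise-free toy setting the fit recovers `w_true`; with real embeddings the operator would only approximate the transition between snapshots.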

    Learning Representations using Spectral-Biased Random Walks on Graphs

    Several state-of-the-art neural graph embedding methods are based on short random walks (stochastic processes) because of their ease of computation, simplicity in capturing complex local graph properties, scalability, and interpretability. In this work, we are interested in studying how much a probabilistic bias in this stochastic process affects the quality of the nodes picked by the process. In particular, our biased walk, with a certain probability, favors movement towards nodes whose neighborhoods bear a structural resemblance to the current node's neighborhood. We succinctly capture this neighborhood as a probability measure based on the spectrum of the node's neighborhood subgraph represented as a normalized Laplacian matrix. We propose the use of a paragraph vector model with a novel Wasserstein regularization term. We empirically evaluate our approach against several state-of-the-art node embedding techniques on a wide variety of real-world datasets and demonstrate that our proposed method significantly improves upon existing methods on both link prediction and node classification tasks. Comment: Accepted at IJCNN 2020: International Joint Conference on Neural Network

    Link Prediction in Social Networks: the State-of-the-Art

    In social networks, link prediction, which predicts missing links in current networks and new or dissolving links in future networks, is important for mining and analyzing the evolution of social networks. In the past decade, much work has been done on link prediction in social networks. The goal of this paper is to comprehensively review, analyze and discuss the state of the art of link prediction in social networks. A systematic categorization of link prediction techniques and problems is presented. Then link prediction techniques and problems are analyzed and discussed. Typical applications of link prediction are also addressed. Achievements and roadmaps of some active research groups are introduced. Finally, some future challenges of link prediction in social networks are discussed. Comment: 38 pages, 13 figures, Science China: Information Science, 201

    Capturing Edge Attributes via Network Embedding

    Network embedding, which aims to learn low-dimensional representations of nodes, has been used for various graph related tasks including visualization, link prediction and node classification. Most existing embedding methods rely solely on network structure. However, in practice we often have auxiliary information about the nodes and/or their interactions, e.g., content of scientific papers in co-authorship networks, or topics of communication in Twitter mention networks. Here we propose a novel embedding method that uses both network structure and edge attributes to learn better network representations. Our method jointly minimizes the reconstruction error for higher-order node neighborhood, social roles and edge attributes using a deep architecture that can adequately capture highly non-linear interactions. We demonstrate the efficacy of our model over existing state-of-the-art methods on a variety of real-world networks including collaboration networks and social networks. We also observe that using edge attributes to inform network embedding yields better performance in downstream tasks such as link prediction and node classification.

    Efficient inference of overlapping communities in complex networks

    We discuss two views on extending existing methods for complex network modeling which we dub the communities first and the networks first view, respectively. Inspired by the networks first view that we attribute to White, Boorman, and Breiger (1976)[1], we formulate the multiple-networks stochastic blockmodel (MNSBM), which seeks to separate the observed network into subnetworks of different types and where the problem of inferring structure in each subnetwork becomes easier. We show how this model is specified in a generative Bayesian framework where parameters can be inferred efficiently using Gibbs sampling. The result is an effective multiple-membership model without the drawbacks of introducing complex definitions of "groups" and how they interact. We demonstrate results on the recovery of planted structure in synthetic networks and show very encouraging results on link prediction performance using multiple-networks models on a number of real-world network data sets.

    Musical recommendations and personalization in a social network

    This paper presents a set of algorithms used for music recommendations and personalization in a general purpose social network www.ok.ru, the second largest social network in the CIS, visited by more than 40 million users per day. In addition to classical recommendation features like "recommend a sequence" and "find similar items" the paper describes novel algorithms for construction of context aware recommendations, personalization of the service, handling of the cold-start problem, and more. All algorithms described in the paper work on-line and are able to detect and address changes in the user's behavior and needs in real time. The core component of the algorithms is a taste graph containing information about different entities (users, tracks, artists, etc.) and relations between them (for example, user A likes song B with certainty X, track B created by artist C, artist C is similar to artist D with certainty Y and so on). Using the graph it is possible to select tracks a user would most probably like, to arrange them in a way that they match each other well, to estimate which items from a fixed list are most relevant for the user, and more. In addition, the paper describes the approach used to estimate the algorithms' efficiency and analyze the impact of different recommendation related features on the users' behavior and overall activity at the service. Comment: This is a full version of a 4-page article published at ACM RecSys 201

    Graph Based Recommendations: From Data Representation to Feature Extraction and Application

    Modeling users for the purpose of identifying their preferences and then personalizing services on the basis of these models is a complex task, primarily due to the need to take into consideration various explicit and implicit signals, missing or uncertain information, contextual aspects, and more. In this study, a novel generic approach for uncovering latent preference patterns from user data is proposed and evaluated. The approach relies on representing the data using graphs, and then systematically extracting graph-based features and using them to enrich the original user models. The extracted features encapsulate complex relationships between users, items, and metadata. The enhanced user models can then serve as an input to any recommendation algorithm. The proposed approach is domain-independent (demonstrated on data from movies, music, and business recommender systems), and is evaluated using several state-of-the-art machine learning methods, on different recommendation tasks, and using different evaluation metrics. The results show a unanimous improvement in the recommendation accuracy across tasks and domains. In addition, the evaluation provides a deeper analysis regarding the performance of the approach in special scenarios, including high sparsity and variability of ratings

    CONE: Community Oriented Network Embedding

    Detecting communities has long been popular in the research on networks. It is usually modeled as an unsupervised clustering problem on graphs, based on heuristic assumptions about community characteristics, such as edge density and node homogeneity. In this work, we doubt the universality of these widely adopted assumptions and compare human-labeled communities with machine-predicted ones obtained via various mainstream algorithms. Based on supportive results, we argue that communities are defined by various social patterns and unsupervised learning based on heuristics is incapable of capturing all of them. Therefore, we propose to inject supervision into community detection through Community Oriented Network Embedding (CONE), which leverages limited ground-truth communities as examples to learn an embedding model aware of the social patterns underlying them. Specifically, a deep architecture is developed by combining recurrent neural networks with random walks on graphs towards capturing social patterns directed by ground-truth communities. Generic clustering algorithms on the embeddings of other nodes produced by the learned model then effectively reveal more communities that share similar social patterns with the ground-truth ones. Comment: 10 pages, accepted by IJCNN 201

    Inferring gene ontologies from pairwise similarity data.

    Motivation: While the manually curated Gene Ontology (GO) is widely used, inferring a GO directly from -omics data is a compelling new problem. Recognizing that ontologies are a directed acyclic graph (DAG) of terms and hierarchical relations, algorithms are needed that: analyze a full matrix of gene-gene pairwise similarities from -omics data; infer true hierarchical structure in these data rather than enforcing hierarchy as a computational artifact; and respect biological pleiotropy, by which a term in the hierarchy can relate to multiple higher level terms. Methods addressing these requirements are just beginning to emerge; none has been evaluated for GO inference. Methods: We consider two algorithms [Clique Extracted Ontology (CliXO), LocalFitness] that uniquely satisfy these requirements, compared with methods including standard clustering. CliXO is a new approach that finds maximal cliques in a network induced by progressive thresholding of a similarity matrix. We evaluate each method's ability to reconstruct the GO biological process ontology from a similarity matrix based on (a) semantic similarities for GO itself or (b) three -omics datasets for yeast. Results: For task (a) using semantic similarity, CliXO accurately reconstructs GO (>99% precision, recall) and outperforms other approaches (<20% precision, <20% recall). For task (b) using -omics data, CliXO outperforms other methods using two -omics datasets and achieves ∼30% precision and recall using YeastNet v3, similar to an earlier approach (Network Extracted Ontology) and better than LocalFitness or standard clustering (20-25% precision, recall). Conclusion: This study provides algorithmic foundation for building gene ontologies by capturing hierarchical and pleiotropic structure embedded in biomolecular data.
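The phrase "maximal cliques in a network induced by progressive thresholding of a similarity matrix" can be illustrated with a small sketch. This is not CliXO itself, whose details are not given in the abstract; it only lowers a threshold step by step and lists the maximal cliques that newly appear, using a plain Bron-Kerbosch enumeration. The gene names, similarity values, and thresholds are made up for illustration.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def threshold_graph(sim, names, t):
    """Undirected graph with an edge wherever similarity is at least t."""
    adj = {n: set() for n in names}
    for i, a in enumerate(names):
        for j in range(i + 1, len(names)):
            if sim[i][j] >= t:
                adj[a].add(names[j])
                adj[names[j]].add(a)
    return adj


def cliques_by_threshold(sim, names, thresholds):
    """Record each multi-vertex maximal clique at the highest threshold where it appears."""
    seen, result = set(), []
    for t in sorted(thresholds, reverse=True):
        for clique in maximal_cliques(threshold_graph(sim, names, t)):
            if len(clique) > 1 and clique not in seen:
                seen.add(clique)
                result.append((t, clique))
    return result


# Hypothetical similarity matrix for four genes (values are illustrative only).
names = ["g1", "g2", "g3", "g4"]
sim = [[1.0, 0.9, 0.5, 0.5],
       [0.9, 1.0, 0.5, 0.5],
       [0.5, 0.5, 1.0, 0.8],
       [0.5, 0.5, 0.8, 1.0]]
terms = cliques_by_threshold(sim, names, thresholds=[0.9, 0.8, 0.5])
```

In this toy case the tight pairs appear at high thresholds and the set of all four genes appears at the lowest one, so the nested cliques read as a small hierarchy of candidate terms.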