Don't Walk, Skip! Online Learning of Multi-scale Network Embeddings
We present Walklets, a novel approach for learning multiscale representations
of vertices in a network. In contrast to previous works, these representations
explicitly encode multiscale vertex relationships in a way that is analytically
derivable.
Walklets generates these multiscale relationships by subsampling short random
walks on the vertices of a graph. By `skipping' over steps in each random walk,
our method generates a corpus of vertex pairs which are reachable via paths of
a fixed length. This corpus can then be used to learn a series of latent
representations, each of which captures successively higher order relationships
from the adjacency matrix.
We demonstrate the efficacy of Walklets' latent representations on several
multi-label network classification tasks for social networks such as
BlogCatalog, DBLP, Flickr, and YouTube. Our results show that Walklets
outperforms new methods based on neural matrix factorization. Specifically, we
outperform DeepWalk by up to 10% and LINE by 58% Micro-F1 on challenging
multi-label classification tasks. Finally, Walklets is an online algorithm, and
can easily scale to graphs with millions of vertices and edges.
Comment: 8 pages, ASONAM'1
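The step-skipping described in this abstract can be sketched in a few lines; the toy graph, parameters, and function name below are illustrative assumptions, not the authors' implementation:

```python
import random

def walklets_corpus(adj, walk_length=10, walks_per_node=5, scale=2, seed=0):
    """Pair each vertex with the vertex `scale` steps ahead in short
    random walks, yielding a corpus of vertex pairs reachable via paths
    of a fixed length (a sketch; parameter names are illustrative)."""
    rng = random.Random(seed)
    pairs = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(walk_length - 1):
                nbrs = adj[walk[-1]]
                if not nbrs:
                    break
                walk.append(rng.choice(nbrs))
            # "skip" over intermediate steps: keep only pairs `scale` apart
            for i in range(len(walk) - scale):
                pairs.append((walk[i], walk[i + scale]))
    return pairs

# toy graph: a 4-cycle 0-1-2-3-0
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
pairs = walklets_corpus(adj, scale=2)
```

Each scale then yields its own corpus, and each corpus trains one latent representation, giving the multiscale series the abstract describes.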
Models for Capturing Temporal Smoothness in Evolving Networks for Learning Latent Representation of Nodes
In a dynamic network, the neighborhoods of the vertices evolve across
different temporal snapshots of the network. Accurate modeling of this temporal
evolution can help solve complex tasks involving real-life social and
interaction networks. However, existing models for learning latent
representation are inadequate for obtaining the representation vectors of the
vertices for different time-stamps of a dynamic network in a meaningful way. In
this paper, we propose latent representation learning models for dynamic
networks which overcome the above limitation by considering two different kinds
of temporal smoothness: (i) retrofitted, and (ii) linear transformation. The
retrofitted model tracks the representation vector of a vertex over time,
facilitating vertex-based temporal analysis of a network. On the other hand,
the linear-transformation-based model provides a smooth transition operator which
maps the representation vectors of all vertices from one temporal snapshot to
the next (unobserved) snapshot; this facilitates prediction of the state of a
network in a future time-stamp. We validate the performance of our proposed
models by employing them for solving the temporal link prediction task.
Experiments on 9 real-life networks from various domains validate that the
proposed models are significantly better than the existing models for
predicting the dynamics of an evolving network.
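The linear-transformation model described above amounts to fitting an operator that carries embeddings from one snapshot to the next. A minimal sketch with synthetic data (all names and numbers here are illustrative, not taken from the paper):

```python
import numpy as np

# Learn one operator W mapping embeddings at snapshot t to snapshot t+1,
# then reuse it to extrapolate the next (unobserved) snapshot.
rng = np.random.default_rng(0)
n, d = 50, 8
W_true = 0.3 * rng.normal(size=(d, d)) + np.eye(d)    # smooth drift operator
X_t = rng.normal(size=(n, d))                         # embeddings at time t
X_t1 = X_t @ W_true + 0.01 * rng.normal(size=(n, d))  # noisy embeddings at t+1

# least-squares fit of the transition operator: argmin_W ||X_t W - X_t1||_F
W, *_ = np.linalg.lstsq(X_t, X_t1, rcond=None)

# predicted state of the network at the next, unobserved time-stamp
X_t2_pred = X_t1 @ W
```

Ranking candidate edges by similarity of the predicted vectors would then give a temporal link predictor in the spirit of the evaluation task above.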
Learning Representations using Spectral-Biased Random Walks on Graphs
Several state-of-the-art neural graph embedding methods are based on short
random walks (stochastic processes) because of their ease of computation,
simplicity in capturing complex local graph properties, scalability, and
interpretability. In this work, we are interested in studying how much a
probabilistic bias in this stochastic process affects the quality of the nodes
picked by the process. In particular, our biased walk, with a certain
probability, favors movement towards nodes whose neighborhoods bear a
structural resemblance to the current node's neighborhood. We succinctly
capture this neighborhood as a probability measure based on the spectrum of the
node's neighborhood subgraph represented as a normalized Laplacian matrix. We
propose the use of a paragraph vector model with a novel Wasserstein
regularization term. We empirically evaluate our approach against several
state-of-the-art node embedding techniques on a wide variety of real-world
datasets and demonstrate that our proposed method significantly improves upon
existing methods on both link prediction and node classification tasks.
Comment: Accepted at IJCNN 2020: International Joint Conference on Neural
Networks
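The spectral signature idea above can be sketched as follows: each node's neighborhood subgraph is summarized by the eigenvalues of its normalized Laplacian, and two signatures are compared with a crude 1-D Wasserstein approximation over a quantile grid. Everything below (graph, grid size, function names) is an illustrative assumption, not the paper's exact construction:

```python
import numpy as np

def neighborhood_spectrum(adj, v):
    """Eigenvalues of the normalized Laplacian of the subgraph induced
    by v and its neighbors (a sketch of the node's spectral signature)."""
    nodes = [v] + sorted(adj[v])
    idx = {u: i for i, u in enumerate(nodes)}
    A = np.zeros((len(nodes), len(nodes)))
    for u in nodes:
        for w in adj[u]:
            if w in idx:
                A[idx[u], idx[w]] = 1.0
    deg = A.sum(axis=1)
    dinv = np.zeros_like(deg)
    dinv[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    L = np.eye(len(nodes)) - dinv[:, None] * A * dinv[None, :]
    return np.sort(np.linalg.eigvalsh(L))

def spectral_distance(s1, s2, grid=16):
    """Rough 1-D Wasserstein distance between two spectra, comparing
    their quantile functions on a common grid."""
    q = np.linspace(0, 1, grid)
    f1 = np.interp(q, np.linspace(0, 1, len(s1)), s1)
    f2 = np.interp(q, np.linspace(0, 1, len(s2)), s2)
    return float(np.mean(np.abs(f1 - f2)))

# toy graph: a triangle attached to a short path
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
s0, s4 = neighborhood_spectrum(adj, 0), neighborhood_spectrum(adj, 4)
d = spectral_distance(s0, s4)
```

A biased walk would then, with some probability, prefer the neighbor whose signature lies closest to the current node's.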
Link Prediction in Social Networks: the State-of-the-Art
In social networks, link prediction, which predicts missing links in current
networks and new or dissolving links in future networks, is important for
mining and analyzing the evolution of social networks. In the past decade,
much work has been done on link prediction in social networks. The goal of
this paper is to comprehensively review, analyze, and discuss the
state of the art of link prediction in social networks. A systematic
categorization of link prediction techniques and problems is presented. Then link
prediction techniques and problems are analyzed and discussed. Typical
applications of link prediction are also addressed. Achievements and roadmaps
of some active research groups are introduced. Finally, some future challenges
of link prediction in social networks are discussed.
Comment: 38 pages, 13 figures, Science China: Information Science, 201
Capturing Edge Attributes via Network Embedding
Network embedding, which aims to learn low-dimensional representations of
nodes, has been used for various graph related tasks including visualization,
link prediction and node classification. Most existing embedding methods rely
solely on network structure. However, in practice we often have auxiliary
information about the nodes and/or their interactions, e.g., content of
scientific papers in co-authorship networks, or topics of communication in
Twitter mention networks. Here we propose a novel embedding method that uses
both network structure and edge attributes to learn better network
representations. Our method jointly minimizes the reconstruction error for
higher-order node neighborhood, social roles and edge attributes using a deep
architecture that can adequately capture highly non-linear interactions. We
demonstrate the efficacy of our model over existing state-of-the-art methods on
a variety of real-world networks, including collaboration networks and social
networks. We also observe that using edge attributes to inform network
embedding yields better performance in downstream tasks such as link prediction
and node classification.
Efficient inference of overlapping communities in complex networks
We discuss two views on extending existing methods for complex network
modeling, which we dub the "communities first" and the "networks first" views,
respectively. Inspired by the networks first view that we attribute to White,
Boorman, and Breiger (1976)[1], we formulate the multiple-networks stochastic
blockmodel (MNSBM), which seeks to separate the observed network into
subnetworks of different types and where the problem of inferring structure in
each subnetwork becomes easier. We show how this model is specified in a
generative Bayesian framework where parameters can be inferred efficiently
using Gibbs sampling. The result is an effective multiple-membership model
without the drawbacks of introducing complex definitions of "groups" and how
they interact. We demonstrate results on the recovery of planted structure in
synthetic networks and show very encouraging results on link prediction
performance using multiple-network models on a number of real-world network
data sets.
Musical recommendations and personalization in a social network
This paper presents a set of algorithms used for music recommendations and
personalization in a general-purpose social network, www.ok.ru, the second
largest social network in the CIS, visited by more than 40 million users per
day. In addition to classical recommendation features like "recommend a
sequence" and "find similar items", the paper describes novel algorithms for
construction of context aware recommendations, personalization of the service,
handling of the cold-start problem, and more. All algorithms described in the
paper work online and can detect and address changes in the user's
behavior and needs in real time.
The core component of the algorithms is a taste graph containing information
about different entities (users, tracks, artists, etc.) and relations between
them (for example, user A likes song B with certainty X, track B created by
artist C, artist C is similar to artist D with certainty Y and so on). Using
the graph it is possible to select tracks a user would most probably like, to
arrange them in a way that they match each other well, to estimate which items
from a fixed list are most relevant for the user, and more.
In addition, the paper describes the approach used to estimate algorithms
efficiency and analyze the impact of different recommendation related features
on the users' behavior and overall activity at the service.
Comment: This is the full version of a 4-page article published at ACM RecSys
201
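The taste-graph traversal described above might be sketched as best-path certainty propagation over a small typed graph; the data, scoring rule, and names below are hypothetical, not ok.ru's actual algorithm:

```python
from collections import defaultdict

# hypothetical certainty-weighted relations from the abstract's example
edges = {
    ("user:A", "track:B"): 0.9,    # user A likes track B with certainty 0.9
    ("track:B", "artist:C"): 1.0,  # track B created by artist C
    ("artist:C", "artist:D"): 0.7, # artist C similar to artist D
    ("artist:D", "track:E"): 1.0,  # track E created by artist D
}

graph = defaultdict(list)
for (src, dst), w in edges.items():
    graph[src].append((dst, w))
    graph[dst].append((src, w))

def recommend_tracks(graph, user, max_hops=4):
    """Score candidate tracks by the strongest certainty-weighted path
    from the user within a few hops, then drop already-liked tracks."""
    best = {user: 1.0}
    frontier = [user]
    for _ in range(max_hops):
        nxt = []
        for node in frontier:
            for nbr, w in graph[node]:
                s = best[node] * w
                if s > best.get(nbr, 0.0):
                    best[nbr] = s
                    nxt.append(nbr)
        frontier = nxt
    liked = {n for n, _ in graph[user]}
    return {n: s for n, s in best.items()
            if n.startswith("track:") and n not in liked}

recs = recommend_tracks(graph, "user:A")
```

Here track E is reachable through artist similarity, so it surfaces as a recommendation with the product of certainties along the path.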
Graph Based Recommendations: From Data Representation to Feature Extraction and Application
Modeling users for the purpose of identifying their preferences and then
personalizing services on the basis of these models is a complex task,
primarily due to the need to take into consideration various explicit and
implicit signals, missing or uncertain information, contextual aspects, and
more. In this study, a novel generic approach for uncovering latent preference
patterns from user data is proposed and evaluated. The approach relies on
representing the data using graphs, and then systematically extracting
graph-based features and using them to enrich the original user models. The
extracted features encapsulate complex relationships between users, items, and
metadata. The enhanced user models can then serve as an input to any
recommendation algorithm. The proposed approach is domain-independent
(demonstrated on data from movies, music, and business recommender systems),
and is evaluated using several state-of-the-art machine learning methods, on
different recommendation tasks, and using different evaluation metrics. The
results show a unanimous improvement in the recommendation accuracy across
tasks and domains. In addition, the evaluation provides a deeper analysis
regarding the performance of the approach in special scenarios, including high
sparsity and variability of ratings.
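The graph-based enrichment pipeline described above can be sketched with stdlib tools: represent ratings as a bipartite user-item graph, extract simple graph features (degree and a power-iteration PageRank), and append them to each user's model. The data and the specific feature choices are illustrative assumptions, not the study's:

```python
from collections import defaultdict

# toy (user, item) interactions
ratings = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u3", "i2"), ("u3", "i3")]

graph = defaultdict(set)
for u, i in ratings:
    graph[u].add(i)
    graph[i].add(u)

def pagerank(graph, damping=0.85, iters=30):
    """Plain power-iteration PageRank on an undirected adjacency dict."""
    nodes = list(graph)
    pr = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            share = damping * pr[n] / len(graph[n])
            for m in graph[n]:
                nxt[m] += share
        pr = nxt
    return pr

pr = pagerank(graph)
# enriched user models: original features would be concatenated here
user_features = {u: [len(graph[u]), pr[u]] for u in graph if u.startswith("u")}
```

The enriched vectors can then be fed to any downstream recommender, which is what makes the approach algorithm-agnostic.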
CONE: Community Oriented Network Embedding
Detecting communities has long been popular in the research on networks. It
is usually modeled as an unsupervised clustering problem on graphs, based on
heuristic assumptions about community characteristics, such as edge density and
node homogeneity. In this work, we question the universality of these widely
adopted assumptions and compare human-labeled communities with
machine-predicted ones obtained via various mainstream algorithms. Based on supportive
results, we argue that communities are defined by various social patterns and
unsupervised learning based on heuristics is incapable of capturing all of
them. Therefore, we propose to inject supervision into community detection
through Community Oriented Network Embedding (CONE), which leverages limited
ground-truth communities as examples to learn an embedding model aware of the
social patterns underlying them. Specifically, a deep architecture is developed
by combining recurrent neural networks with random-walks on graphs towards
capturing social patterns directed by ground-truth communities. Generic
clustering algorithms applied to the embeddings of other nodes produced by the
learned model then effectively reveal more communities that share similar
social patterns with the ground-truth ones.
Comment: 10 pages, accepted by IJCNN 201
Inferring gene ontologies from pairwise similarity data.
Motivation: While the manually curated Gene Ontology (GO) is widely used, inferring a GO directly from -omics data is a compelling new problem. Recognizing that ontologies are a directed acyclic graph (DAG) of terms and hierarchical relations, algorithms are needed that: analyze a full matrix of gene-gene pairwise similarities from -omics data; infer true hierarchical structure in these data rather than enforcing hierarchy as a computational artifact; and respect biological pleiotropy, by which a term in the hierarchy can relate to multiple higher-level terms. Methods addressing these requirements are just beginning to emerge; none has been evaluated for GO inference.
Methods: We consider two algorithms [Clique Extracted Ontology (CliXO), LocalFitness] that uniquely satisfy these requirements, compared with methods including standard clustering. CliXO is a new approach that finds maximal cliques in a network induced by progressive thresholding of a similarity matrix. We evaluate each method's ability to reconstruct the GO biological process ontology from a similarity matrix based on (a) semantic similarities for GO itself or (b) three -omics datasets for yeast.
Results: For task (a) using semantic similarity, CliXO accurately reconstructs GO (>99% precision, recall) and outperforms other approaches (<20% precision, <20% recall). For task (b) using -omics data, CliXO outperforms other methods using two -omics datasets and achieves ∼30% precision and recall using YeastNet v3, similar to an earlier approach (Network Extracted Ontology) and better than LocalFitness or standard clustering (20-25% precision, recall).
Conclusion: This study provides an algorithmic foundation for building gene ontologies by capturing hierarchical and pleiotropic structure embedded in biomolecular data.
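The progressive-thresholding step of CliXO described in the Methods paragraph can be sketched with a plain Bron-Kerbosch clique enumerator. This covers only the clique-collection step under illustrative data; the real algorithm also assembles the term DAG and handles noise:

```python
def maximal_cliques(adj):
    """Bron-Kerbosch enumeration of maximal cliques (no pivoting)."""
    cliques = []
    def bk(R, P, X):
        if not P and not X:
            cliques.append(frozenset(R))
            return
        for v in list(P):
            bk(R | {v}, P & adj[v], X & adj[v])
            P = P - {v}
            X = X | {v}
    bk(set(), set(adj), set())
    return cliques

def clixo_like_terms(sim, genes, thresholds):
    """Collect maximal cliques appearing as the similarity matrix is
    progressively thresholded from high to low similarity."""
    terms = set()
    for t in sorted(thresholds, reverse=True):
        adj = {g: {h for h in genes
                   if h != g and sim.get(frozenset({g, h}), 0.0) >= t}
               for g in genes}
        for c in maximal_cliques(adj):
            if len(c) >= 2:
                terms.add(c)
    return terms

# toy pairwise similarities for four hypothetical genes
sim = {
    frozenset({"g1", "g2"}): 0.9,
    frozenset({"g1", "g3"}): 0.8,
    frozenset({"g2", "g3"}): 0.8,
    frozenset({"g3", "g4"}): 0.4,
}
genes = {g for pair in sim for g in pair}
terms = clixo_like_terms(sim, genes, thresholds=[0.9, 0.8, 0.4])
```

Because cliques found at higher thresholds nest inside cliques found at lower ones, the collected terms naturally suggest the hierarchical, pleiotropic structure the abstract describes.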