Don't Walk, Skip! Online Learning of Multi-scale Network Embeddings
We present Walklets, a novel approach for learning multiscale representations
of vertices in a network. In contrast to previous works, these representations
explicitly encode multiscale vertex relationships in a way that is analytically
derivable.
Walklets generates these multiscale relationships by subsampling short random
walks on the vertices of a graph. By `skipping' over steps in each random walk,
our method generates a corpus of vertex pairs which are reachable via paths of
a fixed length. This corpus can then be used to learn a series of latent
representations, each of which captures successively higher order relationships
from the adjacency matrix.
We demonstrate the efficacy of Walklets' latent representations on several
multi-label network classification tasks for social networks such as
BlogCatalog, DBLP, Flickr, and YouTube. Our results show that Walklets
outperforms new methods based on neural matrix factorization. Specifically, we
outperform DeepWalk by up to 10% and LINE by 58% Micro-F1 on challenging
multi-label classification tasks. Finally, Walklets is an online algorithm, and
can easily scale to graphs with millions of vertices and edges.
Comment: 8 pages, ASONAM'1
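The step-skipping described in this abstract can be sketched in a few lines; the toy graph, parameters, and function name below are illustrative assumptions, not the authors' implementation:

```python
import random

def walklets_corpus(adj, walk_length=10, walks_per_node=5, scale=2, seed=0):
    """Pair each vertex with the vertex `scale` steps ahead in short
    random walks, yielding a corpus of vertex pairs reachable via paths
    of a fixed length (a sketch; parameter names are illustrative)."""
    rng = random.Random(seed)
    pairs = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(walk_length - 1):
                nbrs = adj[walk[-1]]
                if not nbrs:
                    break
                walk.append(rng.choice(nbrs))
            # "skip" over intermediate steps: keep only pairs `scale` apart
            for i in range(len(walk) - scale):
                pairs.append((walk[i], walk[i + scale]))
    return pairs

# toy graph: a 4-cycle 0-1-2-3-0
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
pairs = walklets_corpus(adj, scale=2)
```

Each scale then yields its own corpus, and each corpus trains one latent representation, giving the multiscale series the abstract describes.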
Models for Capturing Temporal Smoothness in Evolving Networks for Learning Latent Representation of Nodes
In a dynamic network, the neighborhoods of the vertices evolve across
different temporal snapshots of the network. Accurate modeling of this temporal
evolution can help solve complex tasks involving real-life social and
interaction networks. However, existing models for learning latent
representation are inadequate for obtaining the representation vectors of the
vertices for different time-stamps of a dynamic network in a meaningful way. In
this paper, we propose latent representation learning models for dynamic
networks which overcome the above limitation by considering two different kinds
of temporal smoothness: (i) retrofitted, and (ii) linear transformation. The
retrofitted model tracks the representation vector of a vertex over time,
facilitating vertex-based temporal analysis of a network. On the other hand,
the linear-transformation-based model provides a smooth transition operator which
maps the representation vectors of all vertices from one temporal snapshot to
the next (unobserved) snapshot; this facilitates prediction of the state of a
network in a future time-stamp. We validate the performance of our proposed
models by employing them for solving the temporal link prediction task.
Experiments on 9 real-life networks from various domains validate that the
proposed models are significantly better than the existing models for
predicting the dynamics of an evolving network.
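The linear-transformation model described above amounts to fitting an operator that carries embeddings from one snapshot to the next. A minimal sketch with synthetic data (all names and numbers here are illustrative, not taken from the paper):

```python
import numpy as np

# Learn one operator W mapping embeddings at snapshot t to snapshot t+1,
# then reuse it to extrapolate the next (unobserved) snapshot.
rng = np.random.default_rng(0)
n, d = 50, 8
W_true = 0.3 * rng.normal(size=(d, d)) + np.eye(d)    # smooth drift operator
X_t = rng.normal(size=(n, d))                         # embeddings at time t
X_t1 = X_t @ W_true + 0.01 * rng.normal(size=(n, d))  # noisy embeddings at t+1

# least-squares fit of the transition operator: argmin_W ||X_t W - X_t1||_F
W, *_ = np.linalg.lstsq(X_t, X_t1, rcond=None)

# predicted state of the network at the next, unobserved time-stamp
X_t2_pred = X_t1 @ W
```

Ranking candidate edges by similarity of the predicted vectors would then give a temporal link predictor in the spirit of the evaluation task above.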
Learning Representations using Spectral-Biased Random Walks on Graphs
Several state-of-the-art neural graph embedding methods are based on short
random walks (stochastic processes) because of their ease of computation,
simplicity in capturing complex local graph properties, scalability, and
interpretability. In this work, we are interested in studying how much a
probabilistic bias in this stochastic process affects the quality of the nodes
picked by the process. In particular, our biased walk, with a certain
probability, favors movement towards nodes whose neighborhoods bear a
structural resemblance to the current node's neighborhood. We succinctly
capture this neighborhood as a probability measure based on the spectrum of the
node's neighborhood subgraph represented as a normalized Laplacian matrix. We
propose the use of a paragraph vector model with a novel Wasserstein
regularization term. We empirically evaluate our approach against several
state-of-the-art node embedding techniques on a wide variety of real-world
datasets and demonstrate that our proposed method significantly improves upon
existing methods on both link prediction and node classification tasks.
Comment: Accepted at IJCNN 2020: International Joint Conference on Neural
Networks
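The spectral signature idea above can be sketched as follows: each node's neighborhood subgraph is summarized by the eigenvalues of its normalized Laplacian, and two signatures are compared with a crude 1-D Wasserstein approximation over a quantile grid. Everything below (graph, grid size, function names) is an illustrative assumption, not the paper's exact construction:

```python
import numpy as np

def neighborhood_spectrum(adj, v):
    """Eigenvalues of the normalized Laplacian of the subgraph induced
    by v and its neighbors (a sketch of the node's spectral signature)."""
    nodes = [v] + sorted(adj[v])
    idx = {u: i for i, u in enumerate(nodes)}
    A = np.zeros((len(nodes), len(nodes)))
    for u in nodes:
        for w in adj[u]:
            if w in idx:
                A[idx[u], idx[w]] = 1.0
    deg = A.sum(axis=1)
    dinv = np.zeros_like(deg)
    dinv[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    L = np.eye(len(nodes)) - dinv[:, None] * A * dinv[None, :]
    return np.sort(np.linalg.eigvalsh(L))

def spectral_distance(s1, s2, grid=16):
    """Rough 1-D Wasserstein distance between two spectra, comparing
    their quantile functions on a common grid."""
    q = np.linspace(0, 1, grid)
    f1 = np.interp(q, np.linspace(0, 1, len(s1)), s1)
    f2 = np.interp(q, np.linspace(0, 1, len(s2)), s2)
    return float(np.mean(np.abs(f1 - f2)))

# toy graph: a triangle attached to a short path
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
s0, s4 = neighborhood_spectrum(adj, 0), neighborhood_spectrum(adj, 4)
d = spectral_distance(s0, s4)
```

A biased walk would then, with some probability, prefer the neighbor whose signature lies closest to the current node's.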
Link Prediction in Social Networks: the State-of-the-Art
In social networks, link prediction, which predicts missing links in current
networks and new or dissolving links in future networks, is important for
mining and analyzing the evolution of social networks. In the past decade,
much work has been done on link prediction in social networks. The goal of
this paper is to comprehensively review, analyze, and discuss the
state of the art of link prediction in social networks. A systematic
categorization of link prediction techniques and problems is presented. Then link
prediction techniques and problems are analyzed and discussed. Typical
applications of link prediction are also addressed. Achievements and roadmaps
of some active research groups are introduced. Finally, some future challenges
of link prediction in social networks are discussed.
Comment: 38 pages, 13 figures, Science China: Information Science, 201
Capturing Edge Attributes via Network Embedding
Network embedding, which aims to learn low-dimensional representations of
nodes, has been used for various graph related tasks including visualization,
link prediction and node classification. Most existing embedding methods rely
solely on network structure. However, in practice we often have auxiliary
information about the nodes and/or their interactions, e.g., content of
scientific papers in co-authorship networks, or topics of communication in
Twitter mention networks. Here we propose a novel embedding method that uses
both network structure and edge attributes to learn better network
representations. Our method jointly minimizes the reconstruction error for
higher-order node neighborhood, social roles and edge attributes using a deep
architecture that can adequately capture highly non-linear interactions. We
demonstrate the efficacy of our model over existing state-of-the-art methods on
a variety of real-world networks, including collaboration networks and social
networks. We also observe that using edge attributes to inform network
embedding yields better performance in downstream tasks such as link prediction
and node classification.
Efficient inference of overlapping communities in complex networks
We discuss two views on extending existing methods for complex network
modeling, which we dub the "communities first" and the "networks first" views,
respectively. Inspired by the networks first view that we attribute to White,
Boorman, and Breiger (1976)[1], we formulate the multiple-networks stochastic
blockmodel (MNSBM), which seeks to separate the observed network into
subnetworks of different types and where the problem of inferring structure in
each subnetwork becomes easier. We show how this model is specified in a
generative Bayesian framework where parameters can be inferred efficiently
using Gibbs sampling. The result is an effective multiple-membership model
without the drawbacks of introducing complex definitions of "groups" and how
they interact. We demonstrate results on the recovery of planted structure in
synthetic networks and show very encouraging results on link prediction
performance using multiple-network models on a number of real-world network
data sets.
Musical recommendations and personalization in a social network
This paper presents a set of algorithms used for music recommendations and
personalization in a general-purpose social network, www.ok.ru, the second
largest social network in the CIS, visited by more than 40 million users per
day. In addition to classical recommendation features like "recommend a
sequence" and "find similar items", the paper describes novel algorithms for
construction of context aware recommendations, personalization of the service,
handling of the cold-start problem, and more. All algorithms described in the
paper work online and can detect and address changes in the user's
behavior and needs in real time.
The core component of the algorithms is a taste graph containing information
about different entities (users, tracks, artists, etc.) and relations between
them (for example, user A likes song B with certainty X, track B created by
artist C, artist C is similar to artist D with certainty Y and so on). Using
the graph it is possible to select tracks a user would most probably like, to
arrange them in a way that they match each other well, to estimate which items
from a fixed list are most relevant for the user, and more.
In addition, the paper describes the approach used to estimate algorithms
efficiency and analyze the impact of different recommendation related features
on the users' behavior and overall activity at the service.
Comment: This is the full version of a 4-page article published at ACM RecSys
201
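The taste-graph traversal described above might be sketched as best-path certainty propagation over a small typed graph; the data, scoring rule, and names below are hypothetical, not ok.ru's actual algorithm:

```python
from collections import defaultdict

# hypothetical certainty-weighted relations from the abstract's example
edges = {
    ("user:A", "track:B"): 0.9,    # user A likes track B with certainty 0.9
    ("track:B", "artist:C"): 1.0,  # track B created by artist C
    ("artist:C", "artist:D"): 0.7, # artist C similar to artist D
    ("artist:D", "track:E"): 1.0,  # track E created by artist D
}

graph = defaultdict(list)
for (src, dst), w in edges.items():
    graph[src].append((dst, w))
    graph[dst].append((src, w))

def recommend_tracks(graph, user, max_hops=4):
    """Score candidate tracks by the strongest certainty-weighted path
    from the user within a few hops, then drop already-liked tracks."""
    best = {user: 1.0}
    frontier = [user]
    for _ in range(max_hops):
        nxt = []
        for node in frontier:
            for nbr, w in graph[node]:
                s = best[node] * w
                if s > best.get(nbr, 0.0):
                    best[nbr] = s
                    nxt.append(nbr)
        frontier = nxt
    liked = {n for n, _ in graph[user]}
    return {n: s for n, s in best.items()
            if n.startswith("track:") and n not in liked}

recs = recommend_tracks(graph, "user:A")
```

Here track E is reachable through artist similarity, so it surfaces as a recommendation with the product of certainties along the path.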
Graph Based Recommendations: From Data Representation to Feature Extraction and Application
Modeling users for the purpose of identifying their preferences and then
personalizing services on the basis of these models is a complex task,
primarily due to the need to take into consideration various explicit and
implicit signals, missing or uncertain information, contextual aspects, and
more. In this study, a novel generic approach for uncovering latent preference
patterns from user data is proposed and evaluated. The approach relies on
representing the data using graphs, and then systematically extracting
graph-based features and using them to enrich the original user models. The
extracted features encapsulate complex relationships between users, items, and
metadata. The enhanced user models can then serve as an input to any
recommendation algorithm. The proposed approach is domain-independent
(demonstrated on data from movies, music, and business recommender systems),
and is evaluated using several state-of-the-art machine learning methods, on
different recommendation tasks, and using different evaluation metrics. The
results show a unanimous improvement in the recommendation accuracy across
tasks and domains. In addition, the evaluation provides a deeper analysis
regarding the performance of the approach in special scenarios, including high
sparsity and variability of ratings.
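The graph-based enrichment pipeline described above can be sketched with stdlib tools: represent ratings as a bipartite user-item graph, extract simple graph features (degree and a power-iteration PageRank), and append them to each user's model. The data and the specific feature choices are illustrative assumptions, not the study's:

```python
from collections import defaultdict

# toy (user, item) interactions
ratings = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u3", "i2"), ("u3", "i3")]

graph = defaultdict(set)
for u, i in ratings:
    graph[u].add(i)
    graph[i].add(u)

def pagerank(graph, damping=0.85, iters=30):
    """Plain power-iteration PageRank on an undirected adjacency dict."""
    nodes = list(graph)
    pr = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            share = damping * pr[n] / len(graph[n])
            for m in graph[n]:
                nxt[m] += share
        pr = nxt
    return pr

pr = pagerank(graph)
# enriched user models: original features would be concatenated here
user_features = {u: [len(graph[u]), pr[u]] for u in graph if u.startswith("u")}
```

The enriched vectors can then be fed to any downstream recommender, which is what makes the approach algorithm-agnostic.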
CONE: Community Oriented Network Embedding
Detecting communities has long been popular in the research on networks. It
is usually modeled as an unsupervised clustering problem on graphs, based on
heuristic assumptions about community characteristics, such as edge density and
node homogeneity. In this work, we question the universality of these widely
adopted assumptions and compare human-labeled communities with
machine-predicted ones obtained via various mainstream algorithms. Based on supportive
results, we argue that communities are defined by various social patterns and
unsupervised learning based on heuristics is incapable of capturing all of
them. Therefore, we propose to inject supervision into community detection
through Community Oriented Network Embedding (CONE), which leverages limited
ground-truth communities as examples to learn an embedding model aware of the
social patterns underlying them. Specifically, a deep architecture is developed
by combining recurrent neural networks with random-walks on graphs towards
capturing social patterns directed by ground-truth communities. Generic
clustering algorithms applied to the embeddings of other nodes produced by the
learned model then effectively reveal more communities that share similar
social patterns with the ground-truth ones.
Comment: 10 pages, accepted by IJCNN 201
Inferring gene ontologies from pairwise similarity data.
Motivation: While the manually curated Gene Ontology (GO) is widely used, inferring a GO directly from -omics data is a compelling new problem. Recognizing that ontologies are a directed acyclic graph (DAG) of terms and hierarchical relations, algorithms are needed that: analyze a full matrix of gene-gene pairwise similarities from -omics data; infer true hierarchical structure in these data rather than enforcing hierarchy as a computational artifact; and respect biological pleiotropy, by which a term in the hierarchy can relate to multiple higher-level terms. Methods addressing these requirements are just beginning to emerge; none has been evaluated for GO inference.
Methods: We consider two algorithms [Clique Extracted Ontology (CliXO), LocalFitness] that uniquely satisfy these requirements, compared with methods including standard clustering. CliXO is a new approach that finds maximal cliques in a network induced by progressive thresholding of a similarity matrix. We evaluate each method's ability to reconstruct the GO biological process ontology from a similarity matrix based on (a) semantic similarities for GO itself or (b) three -omics datasets for yeast.
Results: For task (a) using semantic similarity, CliXO accurately reconstructs GO (>99% precision, recall) and outperforms other approaches (<20% precision, <20% recall). For task (b) using -omics data, CliXO outperforms other methods using two -omics datasets and achieves ∼30% precision and recall using YeastNet v3, similar to an earlier approach (Network Extracted Ontology) and better than LocalFitness or standard clustering (20-25% precision, recall).
Conclusion: This study provides an algorithmic foundation for building gene ontologies by capturing hierarchical and pleiotropic structure embedded in biomolecular data.
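The progressive-thresholding step of CliXO described in the Methods paragraph can be sketched with a plain Bron-Kerbosch clique enumerator. This covers only the clique-collection step under illustrative data; the real algorithm also assembles the term DAG and handles noise:

```python
def maximal_cliques(adj):
    """Bron-Kerbosch enumeration of maximal cliques (no pivoting)."""
    cliques = []
    def bk(R, P, X):
        if not P and not X:
            cliques.append(frozenset(R))
            return
        for v in list(P):
            bk(R | {v}, P & adj[v], X & adj[v])
            P = P - {v}
            X = X | {v}
    bk(set(), set(adj), set())
    return cliques

def clixo_like_terms(sim, genes, thresholds):
    """Collect maximal cliques appearing as the similarity matrix is
    progressively thresholded from high to low similarity."""
    terms = set()
    for t in sorted(thresholds, reverse=True):
        adj = {g: {h for h in genes
                   if h != g and sim.get(frozenset({g, h}), 0.0) >= t}
               for g in genes}
        for c in maximal_cliques(adj):
            if len(c) >= 2:
                terms.add(c)
    return terms

# toy pairwise similarities for four hypothetical genes
sim = {
    frozenset({"g1", "g2"}): 0.9,
    frozenset({"g1", "g3"}): 0.8,
    frozenset({"g2", "g3"}): 0.8,
    frozenset({"g3", "g4"}): 0.4,
}
genes = {g for pair in sim for g in pair}
terms = clixo_like_terms(sim, genes, thresholds=[0.9, 0.8, 0.4])
```

Because cliques found at higher thresholds nest inside cliques found at lower ones, the collected terms naturally suggest the hierarchical, pleiotropic structure the abstract describes.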