Predicting Social Links for New Users across Aligned Heterogeneous Social Networks
Online social networks have gained great success in recent years and many of
them involve multiple kinds of nodes and complex relationships. Among these
relationships, social links among users are of great importance. Many existing
link prediction methods focus on predicting social links that will appear in
the future among all users based upon a snapshot of the social network. In
real-world social networks, many new users join the service every
day, and predicting links for these new users is even more important. Unlike
conventional link prediction, link prediction for new users is more
challenging for two reasons: (1) differences in information
distributions between new users and the existing active users (i.e., old
users); (2) lack of information from the new users in the network. We propose a
link prediction method called SCAN-PS (Supervised Cross Aligned Networks link
prediction with Personalized Sampling), to solve the link prediction problem
for new users with information transferred from both the existing active users
in the target network and other source networks through aligned accounts. We
propose a within-target-network personalized sampling method to process the
existing active users' information in order to accommodate the differences in
information distributions before the intra-network knowledge transfer. SCAN-PS
can also exploit information in other source networks, where the user accounts
are aligned with the target network. In this way, SCAN-PS can solve the
cold-start problem when information about these new users is totally absent from the target
network.
Comment: 11 pages, 10 figures, 4 tables
edge2vec: Representation learning using edge semantics for biomedical knowledge discovery
Representation learning provides new and powerful graph analytical approaches
and tools for the highly valued data science challenge of mining knowledge
graphs. Since previous graph analytical methods have mostly focused on
homogeneous graphs, an important current challenge is extending this
methodology for richly heterogeneous graphs and knowledge domains. The
biomedical sciences are such a domain, reflecting the complexity of biology,
with entities such as genes, proteins, drugs, diseases, and phenotypes, and
relationships such as gene co-expression, biochemical regulation, and
biomolecular inhibition or activation. Therefore, the semantics of edges and
nodes are critical for representation learning and knowledge discovery in real
world biomedical problems. In this paper, we propose the edge2vec model, which
represents graphs considering edge semantics. An edge-type transition matrix is
trained by an Expectation-Maximization approach, and a stochastic gradient
descent model is employed to learn node embeddings on a heterogeneous graph via
the trained transition matrix. edge2vec is validated on three biomedical domain
tasks: biomedical entity classification, compound-gene bioactivity prediction,
and biomedical information retrieval. Results show that by incorporating
edge types into node embedding learning in heterogeneous graphs,
edge2vec significantly outperforms state-of-the-art models on all
three tasks. We propose this method for its added value relative to existing
graph analytical methodology and for its applicability to real-world biomedical
knowledge discovery.
Comment: 10 pages
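The idea of steering walks with an edge-type transition matrix can be illustrated with a minimal sketch. This is not the edge2vec implementation: the function name `typed_walk`, the adjacency format, and the hand-written matrix `M` are assumptions made for illustration, and the sketch omits both the Expectation-Maximization training of the matrix and the embedding step.

```python
import random

def typed_walk(adj, M, start, length, rng):
    """Random walk biased by an edge-type transition matrix (illustrative only).

    adj[u] is a list of (neighbor, edge_type) pairs.
    M[s][t] is the assumed weight of following a type-s edge with a type-t edge.
    """
    walk = [start]
    node, prev_type = start, None
    for _ in range(length):
        options = adj.get(node, [])
        if not options:
            break
        if prev_type is None:
            # First step: no previous edge type, so choose uniformly.
            weights = [1.0] * len(options)
        else:
            weights = [M[prev_type][t] for _, t in options]
        if sum(weights) == 0:
            break  # no admissible continuation under M
        node, prev_type = rng.choices(options, weights=weights, k=1)[0]
        walk.append(node)
    return walk

# Toy graph with two hypothetical edge types, 0 and 1.
adj = {
    "A": [("B", 0)],
    "B": [("A", 0), ("C", 1)],
    "C": [("B", 1), ("D", 0)],
    "D": [("C", 0)],
}
# Hand-written matrix that forces the walk to alternate edge types.
M = {0: {0: 0.0, 1: 1.0}, 1: {0: 1.0, 1: 0.0}}
walk = typed_walk(adj, M, "A", 3, random.Random(0))
```

Walks generated this way could then be fed to a skip-gram style embedding model; that step is left out here.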
Individual heterogeneity generates explosive system network dynamics
Individual heterogeneity is a key characteristic of many real-world systems,
from organisms to humans. However, its role in determining the system's
collective dynamics is typically not well understood. Here we study how
individual heterogeneity impacts the system network dynamics by comparing
linking mechanisms that favor similar or dissimilar individuals. We find that
this heterogeneity-based evolution can drive explosive network behavior and
dictates how a polarized population moves toward consensus. Our model shows
good agreement with data from both biological and social science domains. We
conclude that individual heterogeneity likely plays a key role in the
collective development of real-world networks and communities, and cannot be
ignored.
Comment: 6 pages, 4 figures
Exploring Student Check-In Behavior for Improved Point-of-Interest Prediction
With the availability of vast amounts of user visitation history on
location-based social networks (LBSN), the problem of Point-of-Interest (POI)
prediction has been extensively studied. However, much of the research has been
conducted solely on voluntary check-in datasets collected from social apps such
as Foursquare or Yelp. While these data contain rich information about
recreational activities (e.g., restaurants, nightlife, and entertainment),
information about more prosaic aspects of people's lives is sparse. This not
only limits our understanding of users' daily routines but, more importantly,
the modeling assumptions developed based on characteristics of recreation-based
data may not be suitable for richer check-in data. In this work, we present an
analysis of education "check-in" data using WiFi access logs collected at
Purdue University. We propose a heterogeneous graph-based method to encode the
correlations between users, POIs, and activities, and then jointly learn
embeddings for the vertices. We compare our method with previous
state-of-the-art POI prediction methods, and show that the assumptions made by
previous methods significantly degrade performance on our data with dense(r)
activity signals. We also show how our learned embeddings could be used to
identify similar students (e.g., for friend suggestions).
Comment: published in KDD'1
Multiscale mixing patterns in networks
Assortative mixing in networks is the tendency for nodes with the same
attributes, or metadata, to link to each other. It is a property often found in
social networks, manifesting as a higher tendency of links occurring between
people with the same age, race, or political belief. Quantifying the level of
assortativity or disassortativity (the preference of linking to nodes with
different attributes) can shed light on the factors involved in the formation
of links and contagion processes in complex networks. It is common practice to
measure the level of assortativity according to the assortativity coefficient,
or modularity in the case of discrete-valued metadata. This global value is the
average level of assortativity across the network and may not be a
representative statistic when mixing patterns are heterogeneous. For example, a
social network spanning the globe may exhibit local differences in mixing
patterns as a consequence of differences in cultural norms. Here, we introduce
an approach to localise this global measure so that we can describe the
assortativity, across multiple scales, at the node level. Consequently, we are
able to capture and qualitatively evaluate the distribution of mixing patterns
in the network. We find that for many real-world networks the distribution of
assortativity is skewed, overdispersed and multimodal. Our method provides a
clearer lens through which we can more closely examine mixing patterns in
networks.
Comment: 11 pages, 7 figures
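For reference, the global assortativity coefficient that the abstract takes as its starting point can be sketched as follows, assuming an undirected graph with one discrete attribute per node. This is the standard global measure only, not the localised multiscale method the paper introduces; the function name and input format are assumptions.

```python
from collections import defaultdict

def assortativity(edges, attr):
    """Global discrete assortativity coefficient for an undirected graph.

    edges: list of (u, v) pairs; attr: dict mapping node -> discrete label.
    Computes r = (sum_x e_xx - sum_x a_x^2) / (1 - sum_x a_x^2).
    """
    e = defaultdict(float)   # e[(x, y)]: fraction of edge ends joining label x to label y
    total = 2 * len(edges)   # each undirected edge is counted in both directions
    for u, v in edges:
        e[(attr[u], attr[v])] += 1 / total
        e[(attr[v], attr[u])] += 1 / total
    labels = set(attr.values())
    a = {x: sum(e[(x, y)] for y in labels) for x in labels}
    trace = sum(e[(x, x)] for x in labels)
    sq = sum(a[x] ** 2 for x in labels)
    # Undefined (division by zero) when every edge end carries the same label.
    return (trace - sq) / (1 - sq)

same = assortativity([(0, 1), (2, 3)], {0: "a", 1: "a", 2: "b", 3: "b"})
mixed = assortativity([(0, 1), (2, 3)], {0: "a", 1: "b", 2: "a", 3: "b"})
```

Here `same` describes a graph where every edge joins matching labels and `mixed` one where every edge joins differing labels, the two extremes of the coefficient.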
Dynamic Exploration of Networks: from general principles to the traceroute process
Dynamical processes taking place on real networks define on them evolving
subnetworks whose topology is not necessarily the same as that of the underlying one.
We investigate the problem of determining the emerging degree distribution,
focusing on a class of tree-like processes, such as those used to explore the
Internet's topology. A general theory based on mean-field arguments is
proposed, both for single-source and multiple-source cases, and applied to the
specific example of the traceroute exploration of networks. Our results provide
a qualitative improvement in the understanding of dynamical sampling and of the
interplay between dynamics and topology in large networks like the Internet.
Comment: 13 pages, 6 figures
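A small sketch can show why a tree-like exploration yields a degree distribution that differs from the underlying one. It assumes the single-source exploration is approximated by a breadth-first search tree, which is a simplification and not the mean-field theory the paper proposes; the function names are hypothetical.

```python
from collections import deque, Counter

def bfs_tree_edges(adj, source):
    """Edges of the breadth-first tree discovered from a single source."""
    seen = {source}
    queue = deque([source])
    edges = []
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                edges.append((u, v))
                queue.append(v)
    return edges

def degree_counts(edges):
    """Degree of each node within the sampled edge set."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

# Complete graph on 4 nodes: every node has true degree 3.
adj = {i: [j for j in range(4) if j != i] for i in range(4)}
tree = bfs_tree_edges(adj, 0)
observed = degree_counts(tree)
```

In this toy case only the source keeps its true degree in the sampled tree; every other node appears with degree 1, so the observed distribution is far from the underlying one.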