Jointly Predicting Links and Inferring Attributes using a Social-Attribute Network (SAN)
The effects of social influence and homophily suggest that both network
structure and node attribute information should inform the tasks of link
prediction and node attribute inference. Recently, Yin et al. proposed the
Social-Attribute Network (SAN), an attribute-augmented social network, to
integrate network structure and node attributes to perform both link prediction
and attribute inference. They focused on generalizing the random walk with
restart algorithm to the SAN framework and showed improved performance. In this
paper, we extend the SAN framework with several leading supervised and
unsupervised link prediction algorithms and demonstrate performance improvement
for each algorithm on both link prediction and attribute inference. Moreover,
we make the novel observation that attribute inference can help inform link
prediction, i.e., link prediction accuracy is further improved by first
inferring missing attributes. We comprehensively evaluate these algorithms and
compare them with other existing algorithms using a novel, large-scale Google+
dataset, which we make publicly available.
Comment: 9 pages, 4 figures and 4 table
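To illustrate the idea summarized in the abstract above, here is a minimal sketch of random walk with restart on a social-attribute network. It is not the authors' implementation: the toy users, the `employer:ExampleCorp` attribute, the helper names `build_san` and `random_walk_with_restart`, and the restart probability of 0.15 are all assumptions made for illustration.

```python
from collections import defaultdict


def build_san(social_edges, attribute_links=()):
    """Build an undirected graph holding both social and attribute nodes.

    Attribute nodes are stored as ("attr", name) tuples so they cannot
    collide with user names (a hypothetical encoding, chosen for clarity).
    """
    adj = defaultdict(set)
    for u, v in social_edges:
        adj[u].add(v)
        adj[v].add(u)
    for user, attr in attribute_links:
        node = ("attr", attr)
        adj[user].add(node)
        adj[node].add(user)
    return dict(adj)


def random_walk_with_restart(adj, source, restart=0.15, iters=100):
    """Approximate the stationary visit probabilities of a walker that
    follows a uniformly random edge, or jumps back to `source` with
    probability `restart`, at every step (power iteration)."""
    scores = {node: 0.0 for node in adj}
    scores[source] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, mass in scores.items():
            if mass == 0.0:
                continue
            share = (1.0 - restart) * mass / len(adj[node])
            for neighbor in adj[node]:
                nxt[neighbor] += share
        nxt[source] += restart
        scores = nxt
    return scores


# Toy data (assumed): a chain of four users, two of whom share an employer.
social = [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]
attrs = [("alice", "employer:ExampleCorp"), ("dave", "employer:ExampleCorp")]

plain = random_walk_with_restart(build_san(social), "alice")
augmented = random_walk_with_restart(build_san(social, attrs), "alice")

# Compare dave's proximity to alice without and with attribute nodes.
print("structure only:", plain["dave"])
print("with attributes:", augmented["dave"])
```

In this toy graph the shared employer gives alice a second, shorter route to dave, so dave's score as a candidate link rises once attribute nodes are added. The scores of the attribute nodes themselves could be read the same way for attribute inference.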
Link Prediction in Social Networks: the State-of-the-Art
In social networks, link prediction predicts missing links in current
networks and new or dissolving links in future networks; it is important for
mining and analyzing the evolution of social networks. In the past decade, much
work has been done on link prediction in social networks. The goal of
this paper is to comprehensively review, analyze and discuss the
state of the art of link prediction in social networks. A systematic
categorization of link prediction techniques and problems is presented. Then link
prediction techniques and problems are analyzed and discussed. Typical
applications of link prediction are also addressed. Achievements and roadmaps
of some active research groups are introduced. Finally, some future challenges
of link prediction in social networks are discussed.
Comment: 38 pages, 13 figures, Science China: Information Science, 201
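As a concrete instance of predicting missing links, the sketch below scores unconnected node pairs with three standard neighborhood-based heuristics: common neighbors, the Jaccard coefficient, and the Adamic-Adar index. The abstract above does not name these measures; they are chosen here only as well-known examples, and the toy graph and the helper names `build_graph`, `similarity_scores`, and `rank_candidate_links` are assumptions.

```python
import math
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def similarity_scores(adj, u, v):
    """Neighborhood-based similarity between two distinct nodes u and v."""
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # A common neighbor of two distinct nodes has degree >= 2,
        # so the logarithm below is always positive.
        "adamic_adar": sum(1.0 / math.log(len(adj[w])) for w in common),
    }


def rank_candidate_links(adj, metric="adamic_adar"):
    """Rank all currently unconnected pairs, most likely link first."""
    candidates = [
        (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
    ]
    return sorted(
        candidates,
        key=lambda pair: similarity_scores(adj, *pair)[metric],
        reverse=True,
    )


# Toy graph (assumed): two triangles sharing the edge b-c, plus a pendant node e.
graph = build_graph(
    [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
)
ranking = rank_candidate_links(graph)
for pair in ranking:
    print(pair, similarity_scores(graph, *pair))
```

Here the pair (a, d) ranks first because it is the only unconnected pair with two common neighbors. Supervised approaches typically feed scores like these into a classifier as features rather than ranking on a single measure.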
Multimodal Deep Network Embedding with Integrated Structure and Attribute Information
Network embedding is the process of learning low-dimensional representations
for nodes in a network, while preserving node features. Existing studies only
leverage network structure information and focus on preserving structural
features. However, nodes in real-world networks often have a rich set of
attributes providing extra semantic information. It has been demonstrated that
both structural and attribute features are important for network analysis
tasks. To preserve both features, we investigate the problem of integrating
structure and attribute information to perform network embedding and propose a
Multimodal Deep Network Embedding (MDNE) method. MDNE captures the non-linear
network structures and the complex interactions among structures and
attributes, using a deep model consisting of multiple layers of non-linear
functions. Since structures and attributes are two different types of
information, a multimodal learning method is adopted to pre-process them and
help the model to better capture the correlations between node structure and
attribute information. We employ both structural proximity and attribute
proximity in the loss function to preserve the respective features and the
representations are obtained by minimizing the loss function. Results of
extensive experiments on four real-world datasets show that the proposed method
performs significantly better than baselines on a variety of tasks, which
demonstrates the effectiveness and generality of our method.
Comment: 15 pages, 10 figure
Deep Generative Models for Relational Data with Side Information
We present a probabilistic framework for overlapping community discovery and
link prediction for relational data, given as a graph. The proposed framework
has: (1) a deep architecture which enables us to infer multiple layers of
latent features/communities for each node, providing superior link prediction
performance on more complex networks and better interpretability of the latent
features; and (2) a regression model which allows directly conditioning the
node latent features on the side information available in the form of node
attributes. Our framework handles both (1) and (2) via a clean, unified model,
which enjoys full local conjugacy via data augmentation, and facilitates
efficient inference via closed-form Gibbs sampling. Moreover, inference cost
scales with the number of edges, which is attractive for massive but sparse
networks. Our framework is also easily extendable to model weighted networks
with count-valued edges. We compare with various state-of-the-art methods and
report results, both quantitative and qualitative, on several benchmark data
sets.
A Survey of Heterogeneous Information Network Analysis
Most real systems consist of a large number of interacting, multi-typed
components, while most contemporary research models them as homogeneous
networks, without distinguishing different types of objects and links in the
networks. Recently, more and more researchers have begun to consider these
interconnected, multi-typed data as heterogeneous information networks, and
to develop structural analysis approaches by leveraging the rich semantic meaning
of structural types of objects and links in the networks. Compared to the widely
studied homogeneous network, the heterogeneous information network contains
richer structure and semantic information, which provides plenty of
opportunities as well as a lot of challenges for data mining. In this paper, we
provide a survey of heterogeneous information network analysis. We will
introduce basic concepts of heterogeneous information network analysis, examine
its developments on different data mining tasks, discuss some advanced topics,
and point out some future research directions.
Comment: 45 pages, 12 figure
Deep Unified Multimodal Embeddings for Understanding both Content and Users in Social Media Networks
There has been an explosion of multimodal content generated on social media
networks in the last few years, which has necessitated a deeper understanding
of social media content and user behavior. We present a novel
content-independent content-user-reaction model for social multimedia content
analysis. Compared to prior works that generally tackle semantic content
understanding and user behavior modeling in isolation, we propose a generalized
solution to these problems within a unified framework. We embed users, images
and text drawn from open social media in a common multimodal geometric space,
using a novel loss function designed to cope with distant and disparate
modalities, and thereby enable seamless three-way retrieval. Our model not only
outperforms unimodal embedding based methods on cross-modal retrieval tasks but
also shows improvements stemming from jointly solving the two tasks on Twitter
data. We also show that the user embeddings learned within our joint multimodal
embedding model are better at predicting user interests compared to those
learned with unimodal content on Instagram data. Our framework thus goes beyond
the prior practice of using explicit leader-follower link information to
establish affiliations by extracting implicit content-centric affiliations from
isolated users. We provide qualitative results showing that the user clusters
emerging from the learned embeddings have consistent semantics and that
our model can discover fine-grained semantics from noisy and unstructured data.
Our work reveals that social multimodal content is inherently multimodal and
possesses a consistent structure because in social networks meaning is created
through interactions between users and content.
Comment: Preprint submitted to IJC
Privacy in Social Media: Identification, Mitigation and Applications
The increasing popularity of social media has attracted a huge number of
people to participate in numerous activities on a daily basis. This results in
tremendous amounts of rich user-generated data. This data provides
opportunities for researchers and service providers to study and better
understand users' behaviors and further improve the quality of the personalized
services. Publishing user-generated data risks exposing individuals' privacy.
User privacy in social media is an emerging topic and has attracted increasing
attention in recent years. Existing works study privacy issues in social media
from two different points of view: identification of vulnerabilities, and
mitigation of privacy risks. Recent research has shown the vulnerability of
user-generated data to two general types of attack: identity
disclosure and attribute disclosure. These privacy issues require social media
data publishers to protect users' privacy by sanitizing user-generated data
before publishing it. Consequently, various protection techniques have been
proposed to anonymize user-generated social media data. There is a vast
literature on privacy of users in social media from many perspectives. In this
survey, we review the key achievements of user privacy in social media. In
particular, we review and compare the state-of-the-art algorithms in terms of
the privacy leakage attacks and anonymization algorithms. We overview the
privacy risks from different aspects of social media and categorize the
relevant works into five groups: 1) graph data anonymization and
de-anonymization, 2) author identification, 3) profile attribute disclosure, 4)
user location and privacy, and 5) recommender systems and privacy issues. We
also discuss open problems and future research directions for user privacy
issues in social media.
Comment: This survey is currently under revie
Learning multi-faceted representations of individuals from heterogeneous evidence using neural networks
Inferring latent attributes of people online is an important social computing
task, but requires integrating the many heterogeneous sources of information
available on the web. We propose learning individual representations of people
using neural nets to integrate rich linguistic and network evidence gathered
from social media. The algorithm is able to combine diverse cues, such as the
text a person writes, their attributes (e.g. gender, employer, education,
location) and social relations to other people. We show that by integrating
both textual and network evidence, these representations offer improved
performance at four important tasks in social media inference on Twitter:
predicting (1) gender, (2) occupation, (3) location, and (4) friendships for
users. Our approach scales to large datasets, and the learned representations
can be used as general features in, and have the potential to benefit, a large
number of downstream tasks including link prediction, community detection, or
probabilistic reasoning over social networks.
Machine Learning on Graphs: A Model and Comprehensive Taxonomy
There has been a surge of recent interest in learning representations for
graph-structured data. Graph representation learning methods have generally
fallen into three main categories, based on the availability of labeled data.
The first, network embedding (such as shallow graph embedding or graph
auto-encoders), focuses on learning unsupervised representations of relational
structure. The second, graph regularized neural networks, leverages graphs to
augment neural network losses with a regularization objective for
semi-supervised learning. The third, graph neural networks, aims to learn
differentiable functions over discrete topologies with arbitrary structure.
However, despite the popularity of these areas there has been surprisingly
little work on unifying the three paradigms. Here, we aim to bridge the gap
between graph neural networks, network embedding and graph regularization
models. We propose a comprehensive taxonomy of representation learning methods
for graph-structured data, aiming to unify several disparate bodies of work.
Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which
generalizes popular algorithms for semi-supervised learning on graphs (e.g.
GraphSage, Graph Convolutional Networks, Graph Attention Networks), and
unsupervised learning of graph representations (e.g. DeepWalk, node2vec)
into a single consistent approach. To illustrate the generality of this
approach, we fit over thirty existing methods into this framework. We believe
that this unifying view both provides a solid foundation for understanding the
intuition behind these methods, and enables future research in the area.
Deep Representation Learning for Social Network Analysis
Social network analysis is an important problem in data mining. A fundamental
step for analyzing social networks is to encode network data into
low-dimensional representations, i.e., network embeddings, so that the network
topology structure and other attribute information can be effectively
preserved. Network representation learning facilitates further applications such
as classification, link prediction, anomaly detection and clustering. In
addition, techniques based on deep neural networks have attracted great
interest over the past few years. In this survey, we conduct a comprehensive
review of current literature in network representation learning utilizing
neural network models. First, we introduce the basic models for learning node
representations in homogeneous networks. Meanwhile, we will also introduce some
extensions of the base models in tackling more complex scenarios, such as
analyzing attributed networks, heterogeneous networks and dynamic networks.
Then, we introduce the techniques for embedding subgraphs. After that, we
present the applications of network representation learning. At the end, we
discuss some promising research directions for future work.