Network Representation Learning: A Survey
With the widespread use of information technologies, information networks are
becoming increasingly popular to capture complex relationships across various
disciplines, such as social networks, citation networks, telecommunication
networks, and biological networks. Analyzing these networks sheds light on
different aspects of social life such as the structure of societies,
information diffusion, and communication patterns. In reality, however, the
large scale of information networks often makes network analytic tasks
computationally expensive or intractable. Network representation learning has
been recently proposed as a new learning paradigm to embed network vertices
into a low-dimensional vector space, by preserving network topology structure,
vertex content, and other side information. This facilitates the original
network to be easily handled in the new vector space for further analysis. In
this survey, we perform a comprehensive review of the current literature on
network representation learning in the data mining and machine learning field.
We propose new taxonomies to categorize and summarize the state-of-the-art
network representation learning techniques according to the underlying learning
mechanisms, the network information intended to preserve, as well as the
algorithmic designs and methodologies. We summarize evaluation protocols used
for validating network representation learning including published benchmark
datasets, evaluation methods, and open source algorithms. We also perform
empirical studies to compare the performance of representative algorithms on
common datasets, and analyze their computational complexity. Finally, we
suggest promising research directions to facilitate future study.
Comment: Accepted by IEEE Transactions on Big Data; 25 pages, 10 tables, 6 figures, and 127 references.
Neural‑Brane: Neural Bayesian Personalized Ranking for Attributed Network Embedding
Network embedding methodologies, which learn a distributed vector representation for each vertex in a network, have attracted considerable interest in recent years. Existing works have demonstrated that vertex representations learned through an embedding method provide superior performance in many real-world applications, such as node classification, link prediction, and community detection. However, most existing methods for network embedding utilize only the topological information of a vertex, ignoring the rich set of nodal attributes (such as user profiles in an online social network, or textual contents in a citation network) that is abundant in real-life networks. A joint network embedding that takes into account both attributional and relational information captures more complete network information and can further enrich the learned vector representations. In this work, we present Neural-Brane, a novel Neural Bayesian Personalized Ranking based Attributed Network Embedding. For a given network, Neural-Brane extracts latent feature representations of its vertices using a designed neural network model that unifies network topological information and nodal attributes. In addition, it utilizes a Bayesian personalized ranking objective, which exploits the proximity ordering between a similar node pair and a dissimilar node pair. We evaluate the quality of vertex embeddings produced by Neural-Brane by solving node classification and clustering tasks on four real-world datasets. Experimental results demonstrate the superiority of our proposed method over state-of-the-art existing methods.
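The Bayesian personalized ranking objective described in the abstract can be sketched as follows. This is a minimal illustration only: the function name, the dot-product similarity, and the pair-sampling convention are our assumptions, not Neural-Brane's exact neural formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_loss(emb, anchors, positives, negatives):
    """BPR loss over triples (anchor, similar node, dissimilar node).

    The objective pushes the anchor's similarity to a similar node above
    its similarity to a dissimilar node, i.e. it preserves the proximity
    *ordering* rather than absolute similarity values.
    """
    # Similarity scores (dot product here; the paper uses learned neural scores).
    s_pos = np.sum(emb[anchors] * emb[positives], axis=1)
    s_neg = np.sum(emb[anchors] * emb[negatives], axis=1)
    # Maximizing log sigmoid(s_pos - s_neg) == minimizing this loss.
    return -np.mean(np.log(sigmoid(s_pos - s_neg)))

# Tiny example: node 1 is similar to node 0, node 2 is not.
emb = np.array([[1.0, 0.0],
                [1.0, 0.0],
                [0.0, 1.0]])
loss = bpr_loss(emb, [0], [1], [2])
```

In a full embedding method, `emb` (or the neural scoring function producing the similarities) would be trained by gradient descent on this loss over many sampled triples.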
Truncated Affinity Maximization: One-class Homophily Modeling for Graph Anomaly Detection
One prevalent property we find empirically in real-world graph anomaly
detection (GAD) datasets is a one-class homophily, i.e., normal nodes tend to
have strong connection/affinity with each other, while the homophily in
abnormal nodes is significantly weaker than normal nodes. However, this
anomaly-discriminative property is ignored by existing GAD methods that are
typically built using a conventional anomaly detection objective, such as data
reconstruction. In this work, we explore this property to introduce a novel
unsupervised anomaly scoring measure for GAD -- local node affinity -- that
assigns a larger anomaly score to nodes that are less affiliated with their
neighbors, with the affinity defined as similarity on node
attributes/representations. We further propose Truncated Affinity Maximization
(TAM) that learns tailored node representations for our anomaly measure by
maximizing the local affinity of nodes to their neighbors. Optimizing on the
original graph structure can be biased by non-homophily edges (i.e., edges
connecting normal and abnormal nodes). Thus, TAM is instead optimized on
truncated graphs where non-homophily edges are removed iteratively to mitigate
this bias. The learned representations result in significantly stronger local
affinity for normal nodes than abnormal nodes. Extensive empirical results on
six real-world GAD datasets show that TAM substantially outperforms seven
competing models, achieving over 10% increase in AUROC/AUPRC compared to the
best contenders on challenging datasets. Our code will be made available at
https://github.com/mala-lab/TAM-master/.
Comment: 19 pages, 9 figures.
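The local-affinity anomaly score and the edge truncation described above can be sketched as follows. This is a simplified illustration under stated assumptions: we use cosine similarity on fixed node representations and a single truncation pass with a hypothetical threshold `tau`, whereas TAM learns the representations and truncates non-homophily edges iteratively.

```python
import numpy as np

def local_affinity_scores(adj, reps):
    """Anomaly score = 1 - mean similarity to neighbors.

    Nodes weakly affiliated with their neighbors (low local affinity)
    receive larger anomaly scores; isolated nodes get affinity 0.
    """
    # Cosine-normalize node representations, then take all pairwise similarities.
    norm = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    sims = norm @ norm.T
    deg = adj.sum(axis=1)
    affinity = np.where(deg > 0,
                        (adj * sims).sum(axis=1) / np.maximum(deg, 1.0),
                        0.0)
    return 1.0 - affinity

def truncate_edges(adj, reps, tau):
    """Drop edges whose endpoint similarity falls below tau.

    Mimics TAM's idea of removing likely non-homophily edges (edges
    connecting normal and abnormal nodes) before maximizing affinity.
    """
    norm = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    sims = norm @ norm.T
    return adj * (sims >= tau)

# Tiny example: three mutually similar "normal" nodes plus one outlier
# (node 3) attached to node 2.
reps = np.array([[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [0.0, 1.0]])
adj = np.zeros((4, 4))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
scores = local_affinity_scores(adj, reps)
```

Running the scoring on this toy graph gives node 3 the largest anomaly score, since its representation is orthogonal to its only neighbor's; truncation with a moderate `tau` removes exactly that non-homophily edge.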