Semi-supervised Embedding in Attributed Networks with Outliers
In this paper, we propose a novel framework, called Semi-supervised Embedding
in Attributed Networks with Outliers (SEANO), to learn a low-dimensional vector
representation that systematically captures the topological proximity,
attribute affinity and label similarity of vertices in a partially labeled
attributed network (PLAN). Our method is designed to work in both transductive
and inductive settings while explicitly alleviating noise effects from
outliers. Experimental results on various datasets drawn from the web, text and
image domains demonstrate the advantages of SEANO over state-of-the-art methods
in semi-supervised classification under transductive as well as inductive
settings. We also show that a subset of parameters in SEANO is interpretable as
outlier score and can significantly outperform baseline methods when applied
for detecting network outliers. Finally, we present the use of SEANO in a
challenging real-world setting -- flood mapping of satellite images and show
that it is able to outperform modern remote sensing algorithms for this task.
Comment: in Proceedings of SIAM International Conference on Data Mining
(SDM'18)
Structure of Heterogeneous Networks
Heterogeneous networks play a key role in the evolution of communities and
the decisions individuals make. These networks link different types of
entities, for example, people and the events they attend. Network analysis
algorithms usually project such networks onto simple graphs composed of
entities of a single type. In the process, they conflate relations between
entities of different types and lose important structural information. We
develop a mathematical framework that can be used to compactly represent and
analyze heterogeneous networks that combine multiple entity and link types. We
generalize Bonacich centrality, which measures connectivity between nodes by
the number of paths between them, to heterogeneous networks and use this
measure to study network structure. Specifically, we extend the popular
modularity-maximization method for community detection to use this centrality
metric. We also rank nodes based on their connectivity to other nodes. One
advantage of this centrality metric is that it has a tunable parameter we can
use to set the length scale of interactions. Studying how rankings change
with this parameter allows us to identify important nodes in the network. We
apply the proposed method to analyze the structure of several heterogeneous
networks. We show that exploiting additional sources of evidence corresponding
to links between, as well as among, different entity types yields new insights
into network structure.
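The path-counting idea behind this centrality can be illustrated with a minimal sketch. It assumes a single-type, undirected graph (not the heterogeneous generalization the abstract describes) and counts walks, that is, paths that may revisit nodes. A walk of length k is weighted by alpha^(k-1), so alpha plays the role of the tunable length-scale parameter. The function names, the truncation at `max_len`, and the toy graph are illustrative assumptions, not taken from the paper.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def walk_centrality(adj, alpha, max_len=20):
    """Score each node by the weighted number of walks that start at it.

    A walk of length k contributes alpha**(k - 1); the series is truncated
    at max_len (an assumption made to keep the sketch simple).
    """
    n = len(adj)
    scores = [0.0] * n
    power = [row[:] for row in adj]  # adj**1
    weight = 1.0                     # alpha**0
    for _ in range(max_len):
        for i in range(n):
            scores[i] += weight * sum(power[i])
        power = mat_mul(power, adj)
        weight *= alpha
    return scores


# Hypothetical toy graph: the path 0 - 1 - 2 - 3.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

local = walk_centrality(path, alpha=0.0)   # only direct links count (degree)
longer = walk_centrality(path, alpha=0.3)  # longer walks also contribute
```

With alpha set to 0 the score reduces to node degree; raising alpha lets longer walks influence the ranking, which is the sense in which the parameter sets a length scale.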
Multistep greedy algorithm identifies community structure in real-world and computer-generated networks
We have recently introduced a multistep extension of the greedy algorithm for
modularity optimization. The extension is based on the idea that merging l
pairs of communities (l>1) at each iteration prevents premature condensation
into few large communities. Here, an empirical formula is presented for the
choice of the step width l that generates partitions with (close to) optimal
modularity for 17 real-world and 1100 computer-generated networks. Furthermore,
an in-depth analysis of the communities of two real-world networks (the
metabolic network of the bacterium E. coli and the graph of coappearing words
in the titles of papers coauthored by Martin Karplus) provides evidence that
the partition obtained by the multistep greedy algorithm is superior to the one
generated by the original greedy algorithm not only with respect to modularity
but also according to objective criteria. In other words, the multistep
extension of the greedy algorithm reduces the danger of getting trapped in
local optima of modularity and generates more reasonable partitions.
Comment: 17 pages, 2 figures
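The multistep idea can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is undirected with integer node ids and no self-loops, the modularity gain of merging communities c and d is taken as 2(e_cd - a_c a_d), and a community takes part in at most one merge per step. The step width `l` is left as a plain parameter because the abstract does not state the empirical formula for it. All names are hypothetical.

```python
def multistep_greedy(edges, l=2):
    """Agglomerative modularity optimization, merging up to l pairs per step."""
    w = 1.0 / (2 * len(edges))
    e = {}        # e[c][d]: fraction of edge ends joining communities c and d
    a = {}        # a[c]: fraction of edge ends attached to community c
    members = {}  # community id -> set of nodes
    for u, v in edges:
        for x, y in ((u, v), (v, u)):
            members.setdefault(x, {x})
            e.setdefault(x, {})
            e[x][y] = e[x].get(y, 0.0) + w
            a[x] = a.get(x, 0.0) + w

    while True:
        # Modularity gain for every pair of connected communities, best first.
        gains = sorted(
            ((2 * (e[c][d] - a[c] * a[d]), c, d)
             for c in e for d in e[c] if c < d),
            reverse=True,
        )
        touched = set()
        for gain, c, d in gains:
            if len(touched) == 2 * l or gain <= 0:
                break
            if c in touched or d in touched:
                continue  # assumption: one merge per community per step
            touched.update((c, d))
            # Merge community d into community c.
            for k, weight in e.pop(d).items():
                del e[k][d]
                if k != c:
                    e[c][k] = e[c].get(k, 0.0) + weight
                    e[k][c] = e[k].get(c, 0.0) + weight
            a[c] += a.pop(d)
            members[c] |= members.pop(d)
        if not touched:
            break  # no merge increases modularity any further
    return sorted(sorted(group) for group in members.values())


# Hypothetical toy graph: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = multistep_greedy(toy_edges, l=2)
```

Setting `l=1` recovers the one-merge-per-iteration behaviour of the original greedy scheme in this sketch; on the toy graph both settings separate the two triangles.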
The Community Structure of R&D Cooperation in Europe. Evidence from a social network perspective
The focus of this paper is on pre-competitive R&D cooperation across Europe, as captured by R&D joint ventures funded by the European Commission in the time period 1998-2002, within the 5th Framework Program. The cooperations in this Framework Program give rise to a bipartite network with 72,745 network edges between 25,839 actors (representing organizations that include firms, universities, research organizations and public agencies) and 9,490 R&D projects. With this construction, participating actors are linked only through joint projects.
In this paper we describe the community identification problem based on the concept of modularity and use the recently introduced label-propagation algorithm to identify communities in the network. We then differentiate the identified communities by developing community-specific profiles using social network analysis and geographic visualization techniques. We expect the results to enrich our picture of the European Research Area by providing new insights into the global and local structures of R&D cooperation across Europe.
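A minimal sketch of label propagation on a bipartite actor-project graph is given here for illustration. It assumes the common asynchronous variant: nodes are visited in random order, each adopts the most frequent label among its neighbours, ties are broken at random, and a node keeps its current label when that label is among the most frequent. The toy network, the seed handling, and all names are hypothetical and are not drawn from the paper's data.

```python
import random


def label_propagation(adj, seed=0, max_sweeps=100):
    """Asynchronous label propagation; returns a dict node -> community label."""
    rng = random.Random(seed)
    labels = {node: node for node in adj}  # every node starts in its own community
    order = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(order)
        changed = False
        for node in order:
            counts = {}
            for neighbour in adj[node]:
                lab = labels[neighbour]
                counts[lab] = counts.get(lab, 0) + 1
            if not counts:
                continue  # isolated node keeps its own label
            best = max(counts.values())
            candidates = [lab for lab, cnt in counts.items() if cnt == best]
            if labels[node] not in candidates:
                labels[node] = rng.choice(candidates)
                changed = True
        if not changed:
            break  # every node already holds a most frequent neighbour label
    return labels


# Hypothetical bipartite toy network: actors A1..A4 linked only through
# the projects P1 and P2 in which they participate.
toy_network = {
    "A1": ["P1"], "A2": ["P1"], "P1": ["A1", "A2"],
    "A3": ["P2"], "A4": ["P2"], "P2": ["A3", "A4"],
}
communities = label_propagation(toy_network)
```

Because actors are linked only through joint projects, labels in this sketch can spread between actors only by passing through a shared project node.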
Neural‑Brane: Neural Bayesian Personalized Ranking for Attributed Network Embedding
Network embedding methodologies, which learn a distributed vector representation for each vertex in a network, have attracted considerable interest in recent years. Existing works have demonstrated that vertex representations learned through an embedding method provide superior performance in many real-world applications, such as node classification, link prediction, and community detection. However, most existing methods for network embedding use only the topological information of a vertex, ignoring a rich set of nodal attributes (such as user profiles in an online social network, or textual content in a citation network), which are abundant in all real-life networks. A joint network embedding that takes into account both attributional and relational information captures more complete network information and could further enrich the learned vector representations. In this work, we present Neural-Brane, a novel Neural Bayesian Personalized Ranking based Attributed Network Embedding. For a given network, Neural-Brane extracts latent feature representations of its vertices using a specially designed neural network model that unifies network topological information and nodal attributes. It also uses a Bayesian personalized ranking objective, which exploits the proximity ordering between a similar node pair and a dissimilar node pair. We evaluate the quality of the vertex embeddings produced by Neural-Brane by solving node classification and clustering tasks on four real-world datasets. Experimental results demonstrate the superiority of our proposed method over existing state-of-the-art methods.
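The ranking objective mentioned above can be illustrated with a small sketch of a Bayesian personalized ranking loss for one triple of nodes. It assumes fixed embedding vectors and a dot product as the similarity score; in Neural-Brane the embeddings come from a neural network that combines topology and attributes, which this sketch does not model. The function names and example vectors are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def bpr_loss(anchor, similar, dissimilar):
    """-log(sigmoid(s_pos - s_neg)) for one (anchor, similar, dissimilar) triple.

    The loss is small when the anchor scores higher with the similar node
    than with the dissimilar node.
    """
    diff = dot(anchor, similar) - dot(anchor, dissimilar)
    # Numerically stable form of log(1 + exp(-diff)).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical two-dimensional embeddings.
anchor = [1.0, 0.0]
near = [0.9, 0.1]
far = [-0.5, 0.8]

good_order = bpr_loss(anchor, near, far)  # similar node ranked above dissimilar
bad_order = bpr_loss(anchor, far, near)   # ordering violated, larger loss
```

Minimizing this quantity over many sampled triples pushes each node closer to its similar nodes than to its dissimilar ones in the embedding space.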