958 research outputs found
Search Efficient Binary Network Embedding
Traditional network embedding primarily focuses on learning a dense vector
representation for each node, which encodes network structure and/or node
content information, such that off-the-shelf machine learning algorithms can be
easily applied to the vector-format node representations for network analysis.
However, the learned dense vector representations are inefficient for
large-scale similarity search, which requires to find the nearest neighbor
measured by Euclidean distance in a continuous vector space. In this paper, we
propose a search efficient binary network embedding algorithm called BinaryNE
to learn a sparse binary code for each node, by simultaneously modeling node
context relations and node attribute relations through a three-layer neural
network. BinaryNE learns binary node representations efficiently through a
stochastic gradient descent based online learning algorithm. The learned binary
encoding not only reduces memory usage to represent each node, but also allows
fast bit-wise comparisons to support much quicker network node search compared
to Euclidean distance or other distance measures. Our experiments and
comparisons show that BinaryNE not only delivers more than 23 times faster
search speed, but also provides comparable or better search quality than
traditional continuous vector based network embedding methods
Defining and Evaluating Network Communities based on Ground-truth
Nodes in real-world networks organize into densely linked communities where
edges appear with high concentration among the members of the community.
Identifying such communities of nodes has proven to be a challenging task
mainly due to a plethora of definitions of a community, intractability of
algorithms, issues with evaluation and the lack of a reliable gold-standard
ground-truth.
In this paper we study a set of 230 large real-world social, collaboration
and information networks where nodes explicitly state their group memberships.
For example, in social networks nodes explicitly join various interest based
social groups. We use such groups to define a reliable and robust notion of
ground-truth communities. We then propose a methodology which allows us to
compare and quantitatively evaluate how different structural definitions of
network communities correspond to ground-truth communities. We choose 13
commonly used structural definitions of network communities and examine their
sensitivity, robustness and performance in identifying the ground-truth. We
show that the 13 structural definitions are heavily correlated and naturally
group into four classes. We find that two of these definitions, Conductance and
Triad-participation-ratio, consistently give the best performance in
identifying ground-truth communities. We also investigate a task of detecting
communities given a single seed node. We extend the local spectral clustering
algorithm into a heuristic parameter-free community detection method that
easily scales to networks with more than hundred million nodes. The proposed
method achieves 30% relative improvement over current local clustering methods.Comment: Proceedings of 2012 IEEE International Conference on Data Mining
(ICDM), 201
struc2gauss: Structural role preserving network embedding via Gaussian embedding
Network embedding (NE) is playing a principal role in network mining, due to its ability to map nodes into efficient low-dimensional embedding vectors. However, two major limitations exist in state-of-the-art NE methods: role preservation and uncertainty modeling. Almost all previous methods represent a node into a point in space and focus on local structural information, i.e., neighborhood information. However, neighborhood information does not capture global structural information and point vector representation fails in modeling the uncertainty of node representations. In this paper, we propose a new NE framework, struc2gauss, which learns node representations in the space of Gaussian distributions and performs network embedding based on global structural information. struc2gauss first employs a given node similarity metric to measure the global structural information, then generates structural context for nodes and finally learns node representations via Gaussian embedding. Different structural similarity measures of networks and energy functions of Gaussian embedding are investigated. Experiments conducted on real-world networks demonstrate that struc2gauss effectively captures global structural information while state-of-the-art network embedding methods fail to, outperforms other methods on the structure-based clustering and classification task and provides more information on uncertainties of node representations
- …