Spatial correlations in attribute communities
Community detection is an important tool for exploring and classifying the
properties of large complex networks and should be of great help for spatial
networks. Indeed, in addition to their location, nodes in spatial networks can
have attributes such as the language for individuals, or any other
socio-economical feature that we would like to identify in communities. We
discuss in this paper a crucial aspect that was not considered in previous
studies: the possible existence of correlations between space and
attributes. Introducing a simple toy model in which both space and node
attributes are considered, we discuss the effect of space-attribute
correlations on the results of various community detection methods proposed for
spatial networks in this paper and in previous studies. When space is
irrelevant, our model is equivalent to the stochastic block model which has
been shown to display a detectability-non detectability transition. In the
regime where space dominates the link formation process, most methods can fail
to recover the communities, an effect which is particularly marked when
space-attribute correlations are strong. In this latter case, community
detection methods which remove the spatial component of the network can miss a
large part of the community structure and can lead to incorrect results.
Comment: 10 pages and 7 figures
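For illustration only, and not taken from the paper: the abstract notes that when space is irrelevant the toy model reduces to the stochastic block model. A minimal sketch of that limit follows, assuming just two link probabilities; the names `sample_sbm`, `p_in` and `p_out` are hypothetical, and the paper's actual parametrisation (including the spatial term) is not given in the abstract.

```python
import random


def sample_sbm(sizes, p_in, p_out, seed=0):
    """Sample an undirected stochastic block model graph.

    sizes : number of nodes in each community (block)
    p_in  : link probability for two nodes in the same community (assumed name)
    p_out : link probability for two nodes in different communities (assumed name)
    Returns (labels, edges), where labels[i] is the community of node i
    and edges is a list of (i, j) pairs with i < j.
    """
    rng = random.Random(seed)
    # Assign consecutive node indices to consecutive blocks.
    labels = [block for block, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # The link probability depends only on community membership,
            # not on any spatial position.
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.append((i, j))
    return labels, edges


if __name__ == "__main__":
    labels, edges = sample_sbm([50, 50], p_in=0.3, p_out=0.02)
    within = sum(1 for i, j in edges if labels[i] == labels[j])
    print(f"{len(edges)} edges, {within} of them within communities")
```

As `p_in` and `p_out` approach each other the planted communities become harder to recover, which is the regime in which the detectability transition mentioned in the abstract arises.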
HIGH PERFORMANCE DECENTRALISED COMMUNITY DETECTION ALGORITHMS FOR BIG DATA FROM SMART COMMUNICATION APPLICATIONS
Many systems in the world can be represented as complex networks and then analysed fruitfully. One fundamental property of real-world networks is that they usually exhibit inhomogeneity: the network tends to organise according to an underlying modular structure, commonly referred to as community structure or clustering. Analysing such communities in large networks can help people better understand the structural makeup of the networks. For example, it can be used in mobile ad-hoc and sensor networks to improve energy consumption and communication tasks. Thus, community detection in networks has become an important research area within many application fields such as computer science, physical sciences, mathematics and biology. With the recent emergence of big data, clustering real-world networks with traditional methods and algorithms on a single machine has become almost impossible. The existing methods are limited by their computational requirements and most of them cannot be directly parallelised. Furthermore, in many cases the data set is too big to fit into the main memory of a single machine and therefore needs to be distributed among several machines. The main topic of this thesis is community detection within these big data networks. More specifically, this thesis introduces a novel approach, the Decentralized Iterative Community Clustering Approach (DICCA), for clustering large undirected networks. An important property of this approach is its ability to cluster the entire network without global knowledge of the network topology. Moreover, an extension of DICCA called the Parallel Decentralized Iterative Community Clustering Approach (PDICCA) is proposed for efficiently processing data distributed across several machines. PDICCA is based on the MapReduce computing platform so that it works efficiently in a distributed and parallel fashion.
In addition, real-world networks are usually noisy and imperfect, with missing and false edges. These imperfections are often difficult to eliminate and strongly affect the quality and accuracy of conventional methods used to find the community structure in the network. However, in real-world networks, node attribute information is also available in addition to topology information. Considering more than one source of information for community detection could produce meaningful clusters and improve the robustness of the network. Therefore, a pre-processing approach that considers attribute information, shared neighbours and connectivity information for community detection is presented in this thesis as part of my research. Finally, a set of real-world mobile phone usage data obtained from Cambridge Laboratories (Device Analyzer) has been analysed as an exploratory step to assess the viability of applying the algorithms developed in this thesis. All the proposed approaches have been evaluated and verified for feasibility using a large real-world data set. The evaluation results of these experiments are very promising for the type of large data networks considered.
Community Structure Characterization
This entry discusses the problem of describing communities identified in
a complex network of interest in a way that allows them to be interpreted. We suppose
the community structure has already been detected through one of the many
methods proposed in the literature. The question is then to know how to extract
valuable information from this first result, in order to allow human
interpretation. This requires subsequent processing, which we describe in the
rest of this entry.
Ordered community structure in networks
Community structure in networks is often a consequence of homophily, or
assortative mixing, based on some attribute of the vertices. For example,
researchers may be grouped into communities corresponding to their research
topic. This is possible if vertex attributes have discrete values, but many
networks exhibit assortative mixing by some continuous-valued attribute, such
as age or geographical location. In such cases, no discrete communities can be
identified. We consider how the notion of community structure can be
generalized to networks that are based on continuous-valued attributes: in
general, a network may contain discrete communities which are ordered according
to their attribute values. We propose a method of generating synthetic ordered
networks and investigate the effect of ordered community structure on the
spread of infectious diseases. We also show that community detection algorithms
fail to recover community structure in ordered networks, and evaluate an
alternative method using a layout algorithm to recover the ordering.
Comment: This is an extended preprint version that includes an extra example:
the college football network as an ordered (spatial) network. Further
improvements, not included here, appear in the journal version. Original
title changed (from "Ordered and continuous community structure in networks")
to match journal version
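For illustration only: the abstract does not say how the authors generate synthetic ordered networks, so the sketch below is one plausible construction under a stated assumption, not their method. Communities are placed on a line, and the link probability between two nodes decays geometrically with the gap between their community indices. The names `sample_ordered_network`, `p_in` and `decay` are hypothetical.

```python
import random


def sample_ordered_network(n_comms, size, p_in, decay, seed=0):
    """Sample a network whose communities are ordered along a line.

    Assumption (not from the paper): two nodes whose communities are
    `gap` positions apart are linked with probability p_in * decay**gap,
    so adjacent communities are more densely connected than distant ones.
    Returns (labels, edges) with labels[i] the community index of node i.
    """
    rng = random.Random(seed)
    labels = [c for c in range(n_comms) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            gap = abs(labels[i] - labels[j])
            # gap == 0 gives probability p_in (same community).
            if rng.random() < p_in * decay ** gap:
                edges.append((i, j))
    return labels, edges


if __name__ == "__main__":
    labels, edges = sample_ordered_network(n_comms=5, size=20, p_in=0.4, decay=0.3)
    by_gap = {}
    for i, j in edges:
        gap = abs(labels[i] - labels[j])
        by_gap[gap] = by_gap.get(gap, 0) + 1
    # Edge counts should fall off as the community gap grows.
    print(sorted(by_gap.items()))
```

With `decay` close to 1 the ordering signal vanishes and the network approaches a uniform random graph. With `decay` equal to 0 the communities become disconnected from each other.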
LATTE: Application Oriented Social Network Embedding
In recent years, many research works propose to embed the network structured
data into a low-dimensional feature space, where each node is represented as a
feature vector. However, because the embedding process is detached from
external tasks, the embeddings learned by most existing models
can be ineffective for application tasks with specific objectives, e.g.,
community detection or information diffusion. In this paper, we propose to study
the application oriented heterogeneous social network embedding problem.
Significantly different from the existing works, besides the network structure
preservation, the problem should also incorporate the objectives of external
applications in the objective function. To resolve the problem, in this paper,
we propose a novel network embedding framework, namely the "appLicAtion
orienTed neTwork Embedding" (Latte) model. In Latte, the heterogeneous network
structure can be applied to compute the node "diffusive proximity" scores,
which capture both local and global network structures. Based on these computed
scores, Latte learns the network representation feature vectors by extending
the autoencoder model to the heterogeneous network scenario, which can
also effectively unite the objectives of network embedding and external
application tasks. Extensive experiments have been done on real-world
heterogeneous social network datasets, and the experimental results have
demonstrated the outstanding performance of Latte in learning the
representation vectors for specific application tasks.
Comment: 11 Pages, 12 Figures, 1 Table
BL-MNE: Emerging Heterogeneous Social Network Embedding through Broad Learning with Aligned Autoencoder
Network embedding aims at projecting the network data into a low-dimensional
feature space, where each node is represented as a unique feature vector and
network structure can be effectively preserved. In recent years, more and more
online application service sites can be represented as massive and complex
networks, which are extremely challenging for traditional machine learning
algorithms to deal with. Effective embedding of the complex network data into
low-dimensional feature representations can both save data storage space and
make traditional machine learning algorithms applicable to the network
data. Network embedding performance will degrade greatly if the networks are of
a sparse structure, like the emerging networks with few connections. In this
paper, we propose to learn the embedding representation for a target emerging
network based on the broad learning setting, where the emerging network is
aligned with other external mature networks at the same time. To solve the
problem, a new embedding framework, namely "Deep alIgned autoencoder based
eMbEdding" (DIME), is introduced in this paper. DIME handles the diverse links
and attributes in a unified analytic based on broad learning, and introduces the
multiple aligned attributed heterogeneous social network concept to model the
network structure. A set of meta paths is introduced in the paper, which
define various kinds of connections among users via the heterogeneous link and
attribute information. The closeness among users in the networks is defined as
the meta proximity scores, which will be fed into DIME to learn the embedding
vectors of users in the emerging network. Extensive experiments have been done
on real-world aligned social networks, which have demonstrated the
effectiveness of DIME in learning the emerging network embedding vectors.
Comment: 10 pages, 9 figures, 4 tables. Full paper is accepted by ICDM 2017,
In: Proceedings of the 2017 IEEE International Conference on Data Mining
Community Detection in Networks with Node Attributes
Community detection algorithms are fundamental tools that allow us to uncover
organizational principles in networks. When detecting communities, there are
two possible sources of information one can use: the network structure, and the
features and attributes of nodes. Even though communities form around nodes
that have common edges and common attributes, typically, algorithms have only
focused on one of these two data modalities: community detection algorithms
traditionally focus only on the network structure, while clustering algorithms
mostly consider only node attributes. In this paper, we develop Communities
from Edge Structure and Node Attributes (CESNA), an accurate and scalable
algorithm for detecting overlapping communities in networks with node
attributes. CESNA statistically models the interaction between the network
structure and the node attributes, which leads to more accurate community
detection as well as improved robustness in the presence of noise in the
network structure. CESNA has a linear runtime in the network size and is able
to process networks an order of magnitude larger than comparable approaches.
Last, CESNA also helps with the interpretation of detected communities by
finding relevant node attributes for each community.
Comment: Published in the proceedings of IEEE ICDM '1