A framework for community detection in heterogeneous multi-relational networks
There has been a surge of interest in community detection in homogeneous
single-relational networks which contain only one type of nodes and edges.
However, many real-world systems are naturally described as heterogeneous
multi-relational networks which contain multiple types of nodes and edges. In
this paper, we propose a new method for detecting communities in such networks.
Our method is based on optimizing the composite modularity, which is a new
modularity proposed for evaluating partitions of a heterogeneous
multi-relational network into communities. Our method is parameter-free,
scalable, and suitable for various networks with general structure. We
demonstrate that it outperforms the state-of-the-art techniques in detecting
pre-planted communities in synthetic networks. Applied to a real-world Digg
network, it successfully detects meaningful communities. Comment: 27 pages, 10 figures
Topics in Network Analysis with Applications to Brain Connectomics
Large complex network data have become common in many scientific domains, and require new statistical tools for discovering the underlying structures and features of interest. This thesis presents new methodology for network data analysis, with a focus on problems arising in the field of brain connectomics. Our overall goal is to learn parsimonious and interpretable network features, with computationally efficient and theoretically justified methods.
The first project in the thesis focuses on prediction with network covariates. This setting is motivated by neuroimaging applications, in which each subject has an associated brain network constructed from fMRI data, and the goal is to derive interpretable prediction rules for a phenotype of interest or a clinical outcome. Existing approaches to this problem typically either reduce the data to a small set of global network summaries, losing a lot of local information, or treat network edges as a "bag of features" and use standard statistical tools without accounting for the network nature of the data. We develop a method that uses all edge weights, while still effectively incorporating network structure by using a penalty that encourages sparsity in both the number of edges and the number of nodes used. We develop efficient optimization algorithms for implementing this method and show it achieves state-of-the-art accuracy on a dataset of schizophrenic patients and healthy controls while using a smaller and more readily interpretable set of features than methods which ignore network structure. We also establish theoretical performance guarantees.
Communities in networks are observed in many different domains, and in brain networks they typically correspond to regions of the brain responsible for different functions. In connectomic analyses, there are standard parcellations of the brain into such regions, typically obtained by applying clustering methods to brain connectomes of healthy subjects. However, there is now increasing evidence that these communities are dynamic, and when the goal is predicting a phenotype or distinguishing between different conditions, these static communities from an unrelated set of healthy subjects may not be the most useful for prediction. We present a method for supervised community detection, that is, a method that finds a partition of the network into communities that is most useful for predicting a particular response. We use a block-structured regularization and compute the solution with a combination of a spectral method and an ADMM optimization algorithm. The method performs well on both simulated and real brain networks, providing support for the idea of task-dependent brain regions.
The last part of the thesis focuses on the problem of community detection in the general network setting. Unlike in neuroimaging, statistical network analysis is typically applied to a single network, motivated by datasets from the social sciences. While community detection has been well studied, in practice nodes in a network often belong to more than one community, leading to the much harder problem of overlapping community detection. We propose a new approach for overlapping community detection based on sparse principal component analysis, and develop efficient algorithms that are able to accurately recover community memberships, provided each node does not belong to too many communities at once. The method has a very low computational cost relative to other methods available for this problem. We show asymptotic consistency of recovering community memberships by the new method, and good empirical performance on both simulated and real-world networks. PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/145883/1/jarroyor_1.pd
Community Detection via Maximization of Modularity and Its Variants
In this paper, we first discuss the definition of modularity (Q) used as a
metric for community quality and then we review the modularity maximization
approaches which were used for community detection in the last decade. Then, we
discuss two opposite yet coexisting problems of modularity optimization: in
some cases, it tends to favor small communities over large ones while in
others, large communities over small ones (the so-called resolution limit
problem). Next, we overview several community quality metrics proposed to solve
the resolution limit problem and discuss Modularity Density (Qds) which
simultaneously avoids the two problems of modularity. Finally, we introduce two
novel fine-tuned community detection algorithms that iteratively attempt to
improve the community quality measurements by splitting and merging the given
network community structure. The first of them, referred to as Fine-tuned Q, is
based on modularity (Q) while the second one is based on Modularity Density
(Qds) and denoted as Fine-tuned Qds. Then, we compare the greedy algorithm of
modularity maximization (denoted as Greedy Q), Fine-tuned Q, and Fine-tuned Qds
on four real networks, and also on the classical clique network and the LFR
benchmark networks, each of which is instantiated by a wide range of
parameters. The results indicate that Fine-tuned Qds is the most effective
among the three algorithms discussed. Moreover, we show that Fine-tuned Qds can
be applied to the communities detected by other algorithms to significantly
improve their results.
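As a point of reference for the modularity (Q) metric discussed in this abstract, the following is a minimal sketch of the standard Newman-Girvan modularity for an undirected, unweighted graph. It is not the paper's Fine-tuned Q or Fine-tuned Qds algorithm, and it does not compute Modularity Density (Qds). The function name `modularity` and its arguments are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition (illustrative sketch).

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    """
    m = 0                          # total number of edges
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # sum of node degrees in community c
    for u, v in edges:
        m += 1
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    if m == 0:
        return 0.0
    # Q = sum over communities of (fraction of internal edges)
    #     minus (expected fraction under random rewiring with fixed degrees).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)
```

For two triangles joined by a single edge, splitting the graph into the two triangles gives a positive Q, while placing all nodes in one community gives Q = 0, which is the behavior a modularity maximizer exploits.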
Network Community Detection on Metric Space
Community detection in complex networks has been an important problem of much
interest in recent years. In general, a community detection algorithm chooses
an objective function and captures the communities of the network by optimizing
it, using various heuristics to solve the optimization problem and extract the
communities of interest to the user. In
this article, we demonstrate a procedure for transforming a graph into points of
a metric space and develop community detection methods based on
a metric defined for a pair of points. We have also studied and analyzed the
community structure of the network therein. The results obtained with our
approach are very competitive with most of the well-known algorithms in the
literature, as demonstrated on a large collection of datasets. In
addition, the time taken by our algorithm is considerably lower than that of
other methods, which is consistent with the theoretical findings.
Clustering and Community Detection in Directed Networks: A Survey
Networks (or graphs) appear as dominant structures in diverse domains,
including sociology, biology, neuroscience and computer science. In most of the
aforementioned cases graphs are directed - in the sense that there is
directionality on the edges, making the semantics of the edges non-symmetric.
An interesting feature that real networks present is the clustering or
community structure property, under which the graph topology is organized into
modules commonly called communities or clusters. The essence here is that nodes
of the same community are highly similar while on the contrary, nodes across
communities present low similarity. Revealing the underlying community
structure of directed complex networks has become a crucial and
interdisciplinary topic with a plethora of applications. Therefore, naturally
there is a recent wealth of research production in the area of mining directed
graphs - with clustering being the primary method and tool for community
detection and evaluation. The goal of this paper is to offer an in-depth review
of the methods presented so far for clustering directed networks along with the
relevant necessary methodological background and also related applications. The
survey commences by offering a concise review of the fundamental concepts and
methodological base on which graph clustering algorithms capitalize. Then we
present the relevant work along two orthogonal classifications. The first one
is mostly concerned with the methodological principles of the clustering
algorithms, while the second one approaches the methods from the viewpoint
of the properties of a good cluster in a directed network. Further, we
present methods and metrics for evaluating graph clustering results,
demonstrate interesting application domains and provide promising future
research directions. Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear)