Correlations and Clustering in Wholesale Electricity Markets
We study the structure of locational marginal prices in day-ahead and
real-time wholesale electricity markets. In particular, we consider the case of
two North American markets and show that the price correlations contain
information on the locational structure of the grid. We study various
clustering methods and introduce a type of correlation function based on event
synchronization for spiky time series, and another based on string correlations
of location names provided by the markets. This allows us to reconstruct
aspects of the locational structure of the grid.
Comment: 30 pages, several pictures
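The abstract does not give the exact correlation function used; as a hedged illustration, here is a minimal event-synchronization score for spiky time series in the spirit of the measure mentioned above. The function name, the single tolerance window `tau`, and the normalisation are assumptions, not the authors' exact choices:

```python
import numpy as np

def event_synchronization(events_a, events_b, tau):
    """Toy event-synchronization score for two spiky series, given their
    event (spike) times: count events in one series that fall within +/- tau
    of an event in the other, symmetrised and normalised to [0, 1]."""
    events_a = np.asarray(events_a, dtype=float)
    events_b = np.asarray(events_b, dtype=float)
    if events_a.size == 0 or events_b.size == 0:
        return 0.0
    # pairwise |t_i - t_j| between event times of the two series
    diff = np.abs(events_a[:, None] - events_b[None, :])
    c_ab = np.count_nonzero(np.any(diff <= tau, axis=1))  # A-events matched in B
    c_ba = np.count_nonzero(np.any(diff <= tau, axis=0))  # B-events matched in A
    return (c_ab + c_ba) / (2.0 * np.sqrt(events_a.size * events_b.size))
```

Applied to price-spike times at two grid locations, a matrix of such scores can then be fed to any clustering method.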
Flow-based Influence Graph Visual Summarization
Visually mining a large influence graph is appealing yet challenging. People
are amazed by pictures of newscasting graphs on Twitter and engaged by hidden
citation networks in academia, yet they are often troubled by the poor
readability of the underlying visualization. Existing summarization methods
enhance the graph visualization with blocked views, but have an adverse effect on
the latent influence structure. How can we visually summarize a large graph to
maximize influence flows? In particular, how can we illustrate the impact of an
individual node through the summarization? Can we maintain the appealing graph
metaphor while preserving both the overall influence pattern and fine
readability?
To answer these questions, we first formally define the influence graph
summarization problem. Second, we propose an end-to-end framework to solve the
new problem. Our method can not only highlight the flow-based influence
patterns in the visual summarization, but also inherently support rich graph
attributes. Last, we present a theoretical analysis and report our experimental
results. Both lines of evidence demonstrate that our framework can effectively
approximate the proposed influence graph summarization objective while
outperforming previous methods in a typical scenario of visually mining
academic citation networks.
Comment: to appear in IEEE International Conference on Data Mining (ICDM), Shenzhen, China, December 2014
Consensus clustering approach to group brain connectivity matrices
A novel approach rooted on the notion of consensus clustering, a strategy
developed for community detection in complex networks, is proposed to cope with
the heterogeneity that characterizes connectivity matrices in health and
disease. The method can be summarized as follows:
(i) define, for each node, a distance matrix for the set of subjects by
comparing the connectivity pattern of that node in all pairs of subjects; (ii)
cluster the distance matrix for each node; (iii) build the consensus network
from the corresponding partitions; (iv) extract groups of subjects by finding
the communities of the consensus network thus obtained.
Unlike previous implementations of consensus clustering, we propose
to use the consensus strategy to combine the information arising
from the connectivity patterns of each node. The proposed approach may be seen
either as an exploratory technique or as an unsupervised pre-training step to
help the subsequent construction of a supervised classifier. Applications on a
toy model and two real data sets show the effectiveness of the proposed
methodology, which represents the heterogeneity of a set of subjects in terms
of a weighted network, the consensus matrix.
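The four steps (i)-(iv) above can be sketched roughly as follows. The array layout, the threshold-based single-linkage clustering used in step (ii), and returning the consensus matrix itself rather than running a community-detection routine in step (iv) are all simplifying assumptions of this sketch:

```python
import numpy as np

def _single_linkage_cut(d, thr):
    """Connected components of the graph 'distance <= thr' (single linkage)."""
    n = d.shape[0]
    labels = np.arange(n)
    adj = d <= thr
    for _ in range(n):  # min-label propagation until stable
        new = np.array([labels[adj[i]].min() for i in range(n)])
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

def consensus_matrix(conn):
    """conn has shape (n_subjects, n_nodes, n_nodes): one connectivity
    matrix per subject.  Returns the consensus matrix over subjects."""
    n_subj, n_nodes, _ = conn.shape
    consensus = np.zeros((n_subj, n_subj))
    for node in range(n_nodes):
        # (i) subject-by-subject distance from this node's connectivity pattern
        patterns = conn[:, node, :]                          # (n_subj, n_nodes)
        d = np.linalg.norm(patterns[:, None] - patterns[None, :], axis=-1)
        # (ii) cluster the subjects for this node (here: cut at the median distance)
        labels = _single_linkage_cut(d, np.median(d))
        # (iii) accumulate co-assignments into the consensus network
        consensus += (labels[:, None] == labels[None, :]).astype(float)
    consensus /= n_nodes
    # (iv) groups of subjects would then be the communities of this weighted
    # network, extracted with any community-detection algorithm.
    return consensus
```

Entries near 1 mean two subjects are clustered together for almost every node, which is what makes the consensus matrix a readable summary of heterogeneity.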
Analysis of heat kernel highlights the strongly modular and heat-preserving structure of proteins
In this paper, we study the structure and dynamical properties of protein
contact networks with respect to other biological networks, together with
simulated archetypal models acting as probes. We consider both classical
topological descriptors, such as the modularity and statistics of the shortest
paths, and different interpretations in terms of diffusion provided by the
discrete heat kernel, which is computed from the normalized graph Laplacians.
A principal component analysis shows high discrimination among the network
types when considering either the topological or the heat kernel based vector
characterizations. Furthermore, a canonical correlation analysis demonstrates
the strong agreement among those two characterizations, providing thus an
important justification in terms of interpretability for the heat kernel.
Finally, and most importantly, the focused analysis of the heat kernel provides
insight into the fact that proteins must satisfy specific structural design
constraints that the other considered networks do not need to
obey. Notably, the heat trace decay of an ensemble of varying-size proteins
denotes subdiffusion, a peculiar property of proteins.
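A minimal sketch of the discrete heat kernel built from the normalized graph Laplacian, as described above; the function name and the plain dense eigendecomposition are assumptions of this sketch, not the paper's implementation:

```python
import numpy as np

def heat_kernel_trace(adj, t):
    """Discrete heat kernel H(t) = exp(-t L) of the normalized graph
    Laplacian L = I - D^{-1/2} A D^{-1/2}, and its trace."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # L is symmetric, so H(t) = U exp(-t Lambda) U^T via its spectral decomposition
    eigvals, eigvecs = np.linalg.eigh(lap)
    heat = (eigvecs * np.exp(-t * eigvals)) @ eigvecs.T
    return heat, float(np.trace(heat))
```

The heat trace starts at n (the number of nodes) at t = 0 and decays as heat diffuses; it is the decay rate of this quantity across an ensemble of proteins that signals subdiffusion.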
How is a data-driven approach better than random choice in label space division for multi-label classification?
We propose using five data-driven community detection approaches from social
network analysis to partition the label space for the task of multi-label
classification, as an alternative to random partitioning into equal subsets as
performed by RAkELd: the modularity-maximizing fastgreedy and leading-eigenvector
algorithms, infomap, walktrap, and label propagation. We construct a label
co-occurrence graph (both weighted and unweighted versions) based on training
data and perform community detection to partition the label set. We include
Binary Relevance and Label Powerset classification methods for comparison. We
use Gini-index-based Decision Trees as the base classifier. We compare these
data-driven approaches to label space division against random baselines on 12
benchmark data sets over five evaluation measures. We show that in almost all
cases the seven data-driven partitioning approaches are more likely than not to
outperform RAkELd on all measures except Hamming Loss. We show that the
fastgreedy and walktrap community detection methods on weighted label
co-occurrence graphs are 85-92% more likely to yield better F1 scores than
random partitioning. Infomap on the unweighted label co-occurrence graph is
better than random partitioning on average 90% of the time in terms of Subset
Accuracy and 89% in terms of Jaccard similarity. Weighted fastgreedy is better
on average than RAkELd in terms of Hamming Loss.
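The pipeline described above can be sketched in two steps: build the weighted label co-occurrence graph from training data, then partition labels by its communities. As a hedged stand-in for the paper's community detectors (fastgreedy, infomap, etc.), this sketch simply uses connected components; all names are illustrative:

```python
from collections import defaultdict
from itertools import combinations

def label_cooccurrence_graph(label_sets):
    """Weighted label co-occurrence graph from training data:
    edge weight = number of examples in which both labels appear."""
    weights = defaultdict(int)
    for labels in label_sets:
        for a, b in combinations(sorted(labels), 2):
            weights[(a, b)] += 1
    return dict(weights)

def partition_labels(weights, all_labels):
    """Stand-in for the community-detection step: connected components of
    the co-occurrence graph via union-find.  Each component becomes one
    label subset, replacing RAkELd's random equal-size split."""
    parent = {lab: lab for lab in all_labels}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in weights:
        parent[find(a)] = find(b)
    groups = defaultdict(set)
    for lab in all_labels:
        groups[find(lab)].add(lab)
    return list(groups.values())
```

A Label Powerset classifier would then be trained on each resulting label subset, exactly as RAkELd does on its random subsets.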
Different approaches to community detection
A precise definition of what constitutes a community in networks has remained
elusive. Consequently, network scientists have compared community detection
algorithms on benchmark networks with a particular form of community structure
and classified them based on the mathematical techniques they employ. However,
this comparison can be misleading because apparent similarities in their
mathematical machinery can disguise different reasons for why we would want to
employ community detection in the first place. Here we provide a focused review
of these different motivations that underpin community detection. This
problem-driven classification is useful in applied network science, where it is
important to select an appropriate algorithm for the given purpose. Moreover,
highlighting the different approaches to community detection also delineates
the many lines of research and points out open directions and avenues for
future research.
Comment: 14 pages, 2 figures. Written as a chapter for forthcoming Advances in network clustering and blockmodeling, and based on an extended version of The many facets of community detection in complex networks, Appl. Netw. Sci. 2: 4 (2017) by the same author.
A generalised significance test for individual communities in networks
Many empirical networks have community structure, in which nodes are densely
interconnected within each community (i.e., a group of nodes) and sparsely
across different communities. Like other local and meso-scale structures of
networks, communities are generally heterogeneous in various aspects, such as
size, density of edges, connectivity to other communities, and significance.
In the present study, we propose a method to statistically test the
significance of individual communities in a given network. Compared to
previous methods, the present algorithm is unique in that it accepts different
community-detection algorithms and their corresponding quality functions for
single communities. The method requires that the quality of each
community can be quantified and that community detection is performed as
optimisation of such a quality function summed over the communities. Various
community detection algorithms including modularity maximisation and graph
partitioning meet this criterion. Our method estimates a distribution of the
quality function for randomised networks to calculate a likelihood of each
community in the given network. We illustrate our algorithm by synthetic and
empirical networks.
Comment: 20 pages, 4 figures and 4 tables
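The testing scheme above (estimate the null distribution of the quality function on randomised networks, then score each community) can be sketched as follows. The choice of internal edge count as the quality function and of an Erdos-Renyi-style randomisation preserving only the edge count are simplifying assumptions; the paper accommodates other quality functions and null models:

```python
import numpy as np

rng = np.random.default_rng(0)

def community_quality(adj, members):
    """One simple per-community quality: the number of internal edges."""
    sub = adj[np.ix_(members, members)]
    return sub.sum() / 2.0

def community_p_value(adj, members, n_rand=200):
    """Fraction of randomised networks (same node and edge counts) whose
    community quality is at least the observed one: an empirical p-value
    for this individual community."""
    n = adj.shape[0]
    m = int(adj.sum() // 2)
    observed = community_quality(adj, members)
    iu = np.triu_indices(n, k=1)
    null = []
    for _ in range(n_rand):
        rand = np.zeros((n, n))
        pick = rng.choice(len(iu[0]), size=m, replace=False)
        rand[iu[0][pick], iu[1][pick]] = 1.0
        rand += rand.T  # keep the random network symmetric
        null.append(community_quality(rand, members))
    return float(np.mean(np.asarray(null) >= observed))
```

A small p-value indicates a community denser than expected by chance; heterogeneity across communities then shows up directly as a spread of p-values.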
Post-processing partitions to identify domains of modularity optimization
We introduce the Convex Hull of Admissible Modularity Partitions (CHAMP)
algorithm to prune and prioritize different network community structures
identified across multiple runs of possibly various computational heuristics.
Given a set of partitions, CHAMP identifies the domain of modularity
optimization for each partition (i.e., the parameter-space domain where it
has the largest modularity relative to the input set), discarding partitions
with empty domains to obtain the subset of "admissible" partitions that
remain potentially optimal over the indicated parameter domains.
Importantly, CHAMP can be used for multi-dimensional
parameter spaces, such as those for multilayer networks where one includes a
resolution parameter and interlayer coupling. Using the results from CHAMP, a
user can more appropriately select robust community structures by observing the
sizes of domains of optimization and the pairwise comparisons between
partitions in the admissible subset. We demonstrate the utility of CHAMP with
several example networks. In these examples, CHAMP focuses attention onto
pruned subsets of admissible partitions that are 20-to-1785 times smaller than
the sets of unique partitions obtained by community detection heuristics that
were input into CHAMP.
Comment: http://www.mdpi.com/1999-4893/10/3/9
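In one dimension the idea above can be sketched compactly: with a resolution parameter gamma, each partition's modularity is a line Q(gamma) = a - gamma * p, and the admissible partitions are those appearing on the upper envelope of these lines. Scanning a fine grid, as below, is a simplification of the exact halfspace-intersection computation in the real algorithm:

```python
def champ_1d(lines, gamma_max=3.0, n_grid=3001):
    """1-D sketch of CHAMP.  `lines` is a list of (a, p) pairs, one per
    partition, where Q(gamma) = a - gamma * p (a = intra-community edge
    weight, p = intra-community null-model weight).  Returns, for each
    partition with a non-empty domain, the (start, end) of the gamma range
    where its line dominates the input set."""
    domains = {}
    for i in range(n_grid):
        gamma = gamma_max * i / (n_grid - 1)
        # index of the partition with the largest modularity at this gamma
        best = max(range(len(lines)), key=lambda k: lines[k][0] - gamma * lines[k][1])
        g0, _ = domains.get(best, (gamma, gamma))
        domains[best] = (g0, gamma)  # extend this partition's domain
    return domains
```

Partitions absent from the returned dict have empty domains and are pruned, which is exactly how CHAMP shrinks a large input set to a few admissible candidates.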
Median evidential c-means algorithm and its application to community detection
Median clustering is of great value for partitioning relational data. In this
paper, a new prototype-based clustering method, called Median Evidential
C-Means (MECM), which extends median c-means and median fuzzy
c-means within the theoretical framework of belief functions, is proposed. The
median variant relaxes the restriction of a metric space embedding for the
objects but constrains the prototypes to be in the original data set. Due to
these properties, MECM could be applied to graph clustering problems. A
community detection scheme for social networks based on MECM is investigated
and the obtained credal partitions of graphs, which are more refined than crisp
and fuzzy ones, enable us to have a better understanding of the graph
structures. An initial prototype-selection scheme based on evidential
semi-centrality is presented to avoid local premature convergence and an
evidential modularity function is defined to choose the optimal number of
communities. Finally, experiments in synthetic and real data sets illustrate
the performance of MECM and show how it differs from other methods.
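The classical median c-means base of MECM can be sketched as below: it needs only a dissimilarity matrix, and prototypes are constrained to be objects of the data set, which is what makes the approach applicable to graphs. The belief-function (evidential) machinery of MECM itself is omitted from this sketch:

```python
import numpy as np

def median_c_means(d, c, n_iter=50, seed=0):
    """Median c-means on an n x n dissimilarity matrix d.  Prototypes are
    constrained to be data objects (medoids), so no metric embedding of the
    objects is required."""
    rng = np.random.default_rng(seed)
    n = d.shape[0]
    prototypes = rng.choice(n, size=c, replace=False)
    for _ in range(n_iter):
        # assignment step: each object joins its nearest prototype
        labels = np.argmin(d[:, prototypes], axis=1)
        # update step: new prototype = medoid of each cluster
        new_prototypes = prototypes.copy()
        for k in range(c):
            members = np.flatnonzero(labels == k)
            if len(members):
                within = d[np.ix_(members, members)].sum(axis=1)
                new_prototypes[k] = members[np.argmin(within)]
        if np.array_equal(new_prototypes, prototypes):
            break
        prototypes = new_prototypes
    return labels, prototypes
```

For community detection, d would be a dissimilarity between nodes of the graph; MECM replaces the crisp assignment step with a credal one over sets of clusters.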