Computing Vertex Centrality Measures in Massive Real Networks with a Neural Learning Model
Vertex centrality measures are a multi-purpose analysis tool, commonly used
in many application environments to retrieve information and unveil knowledge
from graph and network structural properties. However, the algorithms for
such metrics are expensive in terms of computational resources when running
in real-time applications or on massive real-world networks. Thus, approximation
techniques have been developed and used to compute the measures in such
scenarios. In this paper, we demonstrate and analyze the use of neural network
learning algorithms to tackle this task and compare their performance in terms
of solution quality and computation time with other techniques from the
literature. Our work offers several contributions. We highlight both the pros
and cons of approximating centralities through neural learning. Through empirical
analysis and statistical tests, we then show that the regression model generated with a
feedforward neural network trained by the Levenberg-Marquardt algorithm is not
only the best option considering computational resources, but also achieves the
best solution quality for relevant applications and large-scale networks.
Keywords: Vertex Centrality Measures, Neural Networks, Complex Network Models,
Machine Learning, Regression Model
Comment: 8 pages, 5 tables, 2 figures, version accepted at IJCNN 2018. arXiv
admin note: text overlap with arXiv:1810.1176
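The regression approach described in the abstract can be sketched with a small feedforward network trained on per-vertex features. Everything here is an illustrative stand-in: the features, the synthetic target, and the network size are invented for the sketch, and plain gradient descent is used instead of the Levenberg-Marquardt training the paper reports.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: per-vertex features (e.g. degree, clustering
# coefficient) and a target centrality score. A real pipeline would
# extract these from actual graphs; the mapping below is illustrative.
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = (0.7 * X[:, 0] + 0.3 * X[:, 1] ** 2).reshape(-1, 1)

# One-hidden-layer feedforward regressor with tanh activation.
W1 = rng.normal(0.0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0.0, 0.5, (8, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, h @ W2 + b2

_, pred0 = forward(X)
loss0 = float(np.mean((pred0 - y) ** 2))  # MSE before training

lr = 0.1
for _ in range(500):
    h, pred = forward(X)
    err = (pred - y) / len(X)              # d(MSE)/d(pred), up to a factor 2
    grad_W2 = h.T @ err
    grad_b2 = err.sum(0)
    dh = (err @ W2.T) * (1.0 - h ** 2)     # backprop through tanh
    W2 -= lr * grad_W2; b2 -= lr * grad_b2
    W1 -= lr * X.T @ dh; b1 -= lr * dh.sum(0)

_, pred1 = forward(X)
loss1 = float(np.mean((pred1 - y) ** 2))   # MSE after training
```

After training, `loss1` should fall well below `loss0`, which is the behaviour the abstract relies on: the learned regressor replaces the expensive exact centrality computation with a cheap forward pass.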
The architecture of the protein domain universe
Understanding the design of the universe of protein structures may provide
insights into protein evolution. We study the architecture of the protein
domain universe, which has been found to possess peculiar scale-free properties
(Dokholyan et al., Proc. Natl. Acad. Sci. USA 99: 14132-14136 (2002)). We
examine the origin of these scale-free properties of the graph of protein
domain structures (PDUG) and determine that the PDUG is not modular, i.e.
it does not consist of modules with uniform properties. Instead, we find the
PDUG to be self-similar at all scales. We further characterize the PDUG
architecture by studying the properties of the hub nodes that are responsible
for the scale-free connectivity of the PDUG. We introduce a measure of the
betweenness centrality of protein domains in the PDUG and find a power-law
distribution of the betweenness centrality values. The scale-free distribution
of hubs in the protein universe suggests that a set of specific statistical
mechanics models, such as the self-organized criticality model, can potentially
identify the principal driving forces of molecular evolution. We also find a
gatekeeper protein domain, removal of which partitions the largest cluster into
two large sub-clusters. We suggest that the loss of such gatekeeper protein
domains in the course of evolution is responsible for the creation of new fold
families.
Comment: 14 pages, 3 figures
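The betweenness centrality used above to characterize PDUG hubs can be computed exactly with Brandes' algorithm on unweighted graphs. The sketch below is a minimal dependency-free version; the adjacency-dict representation and the path-graph example are illustrative choices, not anything from the paper.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs.

    adj: dict mapping each vertex to a list of neighbours.
    Returns unnormalised betweenness, each unordered pair counted once.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, recording shortest-path counts (sigma) and
        # predecessors along shortest paths.
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {v: -1 for v in adj}; dist[s] = 0
        preds = {v: [] for v in adj}
        order, q = [], deque([s])
        while q:
            v = q.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # On an undirected graph each unordered pair was counted twice.
    return {v: c / 2.0 for v, c in bc.items()}

# Path graph 0-1-2-3-4: the middle vertex lies on the shortest paths
# of exactly four pairs, {0,3}, {0,4}, {1,3}, {1,4}.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(betweenness(path)[2])  # → 4.0
```

A "gatekeeper" node in the abstract's sense would show up here as a vertex whose betweenness dominates the distribution and whose removal disconnects the largest cluster.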
GLB: Lifeline-based Global Load Balancing library in X10
We present GLB, a programming model and an associated implementation that can
handle a wide range of irregular parallel programming problems running over
large-scale distributed systems. GLB is applicable both to problems that are
easily load-balanced via static scheduling and to problems that are hard to
statically load balance. GLB hides the intricate synchronizations (e.g.,
inter-node communication, initialization and startup, load balancing,
termination and result collection) from the users. GLB internally uses a
version of the lifeline graph based work-stealing algorithm proposed by
Saraswat et al. Users of GLB are simply required to write several pieces of
sequential code that comply with the GLB interface. GLB then schedules and
orchestrates the parallel execution of the code correctly and efficiently at
scale. We have applied GLB to two representative benchmarks: Betweenness
Centrality (BC) and Unbalanced Tree Search (UTS). Among them, BC can be
statically load-balanced whereas UTS cannot. In either case, GLB scales well,
achieving nearly linear speedup on different computer architectures (Power,
Blue Gene/Q, and K) at up to 16K cores.
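The work-stealing discipline GLB builds on can be illustrated with a toy sequential simulation: each worker pops tasks from the bottom of its own deque and, when idle, steals from the top of a random victim's deque. This sketch is only a single-process caricature; it omits the distributed places, lifeline graph, and termination protocol that GLB actually implements in X10.

```python
from collections import deque
import random

def run_workers(tasks, n_workers, seed=0):
    """Toy cooperative work-stealing round-robin simulation.

    Tasks are deliberately crammed onto the first two deques to mimic
    a skewed static split; the remaining workers obtain work only by
    stealing. Returns the number of tasks processed.
    """
    rng = random.Random(seed)
    deques = [deque() for _ in range(n_workers)]
    for i, t in enumerate(tasks):
        deques[i % 2].append(t)          # skewed initial distribution
    done = 0
    while True:
        idle = True
        for i in range(n_workers):
            if deques[i]:
                deques[i].pop()          # local work: LIFO pop (bottom)
                done += 1
                idle = False
            else:
                victim = rng.randrange(n_workers)
                if deques[victim]:
                    # Steal from the opposite (FIFO) end of the victim.
                    deques[i].append(deques[victim].popleft())
                    idle = False
        if idle:                         # every deque empty: terminate
            return done

print(run_workers(list(range(100)), 4))  # → 100
```

Popping locally from one end while stealing from the other is the standard deque discipline for work-stealing schedulers; it keeps local execution depth-first while thieves take the shallowest, typically largest, subtasks.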