Community Detection via Semi-Synchronous Label Propagation Algorithms
A recently introduced novel community detection strategy is based on a label
propagation algorithm (LPA) which uses the diffusion of information in the
network to identify communities. Studies of LPAs showed that the strategy is
effective in finding a good community structure. The label propagation step can be
performed in parallel on all nodes (synchronous model) or sequentially
(asynchronous model); both models have drawbacks, e.g., algorithm
termination is not guaranteed in the first case, and performance can be worse in the
second case. In this paper, we present a semi-synchronous version of LPA which
aims to combine the advantages of both synchronous and asynchronous models. We
prove that our models always converge to a stable labeling. Moreover, we
experimentally investigate the effectiveness of the proposed strategy by comparing
its performance with the asynchronous model in terms of quality,
efficiency, and stability. Tests show that the proposed protocol does not harm
the quality of the partitioning. Moreover, it is quite efficient: each
propagation step is highly parallelizable, and it is more stable than the
asynchronous model, thanks to the fact that only a small amount of
randomization is used by our proposal.
Comment: In Proc. of The International Workshop on Business Applications of
Social Network Analysis (BASNA '10)
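The difference between the synchronous and asynchronous models described in this abstract can be illustrated with a minimal sketch. This is not the paper's semi-synchronous algorithm, whose details the abstract does not give. The function names, the fixed node order, and the deterministic tie-breaking rule (smallest label wins) are assumptions made here for reproducibility; label propagation as usually described breaks ties and orders nodes at random.

```python
from collections import Counter


def most_frequent_label(node, graph, labels):
    """Most common label among the node's neighbors.

    Assumption for this sketch: ties go to the smallest label, and a node
    without neighbors keeps its own label.
    """
    neighbors = graph[node]
    if not neighbors:
        return labels[node]
    counts = Counter(labels[n] for n in neighbors)
    best = max(counts.values())
    return min(label for label, count in counts.items() if count == best)


def synchronous_step(graph, labels):
    # Every node reads only the labels of the previous round.
    return {v: most_frequent_label(v, graph, labels) for v in graph}


def asynchronous_step(graph, labels):
    # Nodes update one at a time and see updates made earlier in the sweep.
    new_labels = dict(labels)
    for v in graph:
        new_labels[v] = most_frequent_label(v, graph, new_labels)
    return new_labels


# Two nodes joined by one edge, each starting with its own label.
graph = {0: [1], 1: [0]}
labels = {0: 0, 1: 1}

sync_1 = synchronous_step(graph, labels)     # the two labels swap
sync_2 = synchronous_step(graph, sync_1)     # and swap back: an oscillation
async_1 = asynchronous_step(graph, labels)   # both nodes agree after one sweep
```

On this two-node graph the synchronous rounds alternate between two labelings indefinitely, while a single asynchronous sweep reaches a labeling that further sweeps leave unchanged. This is the termination issue the abstract attributes to the synchronous model.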
The Parallelism Motifs of Genomic Data Analysis
Genomic data sets are growing dramatically as the cost of sequencing
continues to decline and small sequencing devices become available. Enormous
community databases store and share this data with the research community, but
some of these genomic data analysis problems require large scale computational
platforms to meet both the memory and computational requirements. These
applications differ from scientific simulations that dominate the workload on
high end parallel systems today and place different requirements on programming
support, software libraries, and parallel architectural design. For example,
they involve irregular communication patterns such as asynchronous updates to
shared data structures. We consider several problems in high performance
genomics analysis, including alignment, profiling, clustering, and assembly for
both single genomes and metagenomes. We identify some of the common
computational patterns or motifs that help inform parallelization strategies
and compare our motifs to some of the established lists, arguing that at least
two key patterns, sorting and hashing, are missing.
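As a small illustration of the hashing and sorting patterns named in this abstract, the sketch below counts fixed-length substrings (k-mers) of sequencing reads in a hash table and then sorts the result. The choice of k-mer counting as the example, the function name `count_kmers`, and the toy reads are all assumptions made here; the abstract does not name a specific kernel.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across all reads using a hash table."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Toy input, invented for this sketch.
reads = ["ACGTAC", "CGTACG"]
counts = count_kmers(reads, 3)

# Sorting pattern: order the k-mers lexicographically for later merging.
sorted_kmers = sorted(counts.items())
```

In a distributed setting, each increment of `counts` would become an update to a shared or partitioned table, which is the kind of irregular, asynchronous communication the abstract mentions.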
CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance
In order to efficiently use the future generations of supercomputers, fault
tolerance and power consumption are two of the prime challenges anticipated by
the High Performance Computing (HPC) community. Checkpoint/Restart (CR) has
been and still is the most widely used technique to deal with hard failures.
Application-level CR is the most efficient CR technique in terms of overhead,
but it requires considerable implementation effort. This work presents the
implementation of our C++ based library CRAFT (Checkpoint-Restart and Automatic
Fault Tolerance), which serves two purposes. First, it provides an extendable
library that significantly eases the implementation of application-level
checkpointing. The most basic and frequently used checkpoint data types are
already part of CRAFT and can be directly used out of the box. The library can
be easily extended to add more data types. As a means of overhead reduction, the
library offers a built-in asynchronous checkpointing mechanism and also
supports the Scalable Checkpoint/Restart (SCR) library for node level
checkpointing. Second, CRAFT provides an easier interface for User-Level
Failure Mitigation (ULFM) based dynamic process recovery, which significantly
reduces the complexity and effort of implementing failure detection and communication
recovery mechanisms. By utilizing both functionalities together, applications
can write application-level checkpoints and recover dynamically from process
failures with very limited programming effort. This work presents the design
and use of our library in detail. The associated overheads are thoroughly
analyzed using several benchmarks.
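The idea of application-level checkpoint/restart can be sketched generically as follows. This is not CRAFT's API: CRAFT is a C++ library, and its interface is not shown in the abstract. The helpers `write_checkpoint`, `read_checkpoint`, and `run`, the pickle file format, and the simulated failure are all hypothetical and serve only to show the application choosing which state to save and where to resume.

```python
import os
import pickle
import tempfile


def write_checkpoint(path, state):
    # Write to a temporary file first, then rename it into place, so a crash
    # during the write does not corrupt the previous checkpoint.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)


def read_checkpoint(path, default):
    # Resume from the saved state if a checkpoint exists, else start fresh.
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, n_steps, checkpoint_every, fail_at=None):
    """Sum the step indices 0..n_steps-1, checkpointing periodically."""
    state = read_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < n_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated failure")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            write_checkpoint(path, state)
    return state["total"]


ckpt = os.path.join(tempfile.mkdtemp(), "state.ckpt")
try:
    run(ckpt, n_steps=10, checkpoint_every=2, fail_at=5)  # fails mid-run
except RuntimeError:
    pass
result = run(ckpt, n_steps=10, checkpoint_every=2)  # restarts from step 4
```

The restarted run repeats only the work done since the last checkpoint and produces the same result as an uninterrupted run. Only the two fields the application needs are saved, which is why the abstract describes the application-level approach as low in overhead but demanding in implementation effort.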