Tuning block-parallel all-pairs shortest path algorithm for efficient multi-core implementation
Finding shortest paths in a weighted graph is one of the key problems in computer science and has numerous practical applications in multiple domains. This paper analyzes the parallel blocked all-pairs shortest path algorithm with the aim of evaluating how a multi-core system and its hierarchical cache memory affect the parameters of the algorithm’s implementation, depending on the size of the graph and the size of the distance matrix’s block. It proposes a technique for tuning the block size to a given multi-core system. The technique uses profiling tools in the tuning process and increases the throughput of the parallel algorithm. Computational experiments carried out on a rack server equipped with two Intel Xeon E5-2620 v4 processors, each with 8 cores and 16 hardware threads, have shown for various graph sizes that the behavior and parameters of the hierarchical cache memory operation do not depend on the graph size and are determined only by the distance matrix’s block size. To tune the algorithm to the target multi-core system, the preferable block size can be found once, for a graph whose in-memory matrix representation is larger than the cache shared among all of the processor’s cores. This block size can then be reused on larger graphs to solve the all-pairs shortest path problem efficiently.
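As an illustration of the kind of blocked all-pairs shortest path computation the abstract refers to, the following is a minimal serial sketch of the standard three-phase blocked Floyd-Warshall scheme. It is not the paper's implementation: the function names, the `block_size` parameter, and the use of Python lists are assumptions made here for illustration, and the sketch is neither parallel nor cache-tuned. It assumes a square distance matrix with zeros on the diagonal and no negative cycles.

```python
import math

INF = math.inf  # marks "no edge" in the distance matrix


def floyd_warshall(dist):
    """Plain (unblocked) Floyd-Warshall, used here as a reference."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def _relax_block(d, n, b, bi, bj, bk):
    """Relax block (bi, bj) through the intermediate vertices of block bk."""
    for k in range(bk * b, min((bk + 1) * b, n)):
        for i in range(bi * b, min((bi + 1) * b, n)):
            for j in range(bj * b, min((bj + 1) * b, n)):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]


def blocked_floyd_warshall(dist, block_size):
    """Blocked Floyd-Warshall; block_size is the tunable parameter (hypothetical name)."""
    n = len(dist)
    d = [row[:] for row in dist]
    nb = (n + block_size - 1) // block_size  # number of blocks per dimension
    for kb in range(nb):
        # Phase 1: the diagonal block depends only on itself.
        _relax_block(d, n, block_size, kb, kb, kb)
        # Phase 2: blocks sharing a row or column with the diagonal block.
        for x in range(nb):
            if x != kb:
                _relax_block(d, n, block_size, kb, x, kb)
                _relax_block(d, n, block_size, x, kb, kb)
        # Phase 3: all remaining blocks; these are mutually independent.
        for bi in range(nb):
            for bj in range(nb):
                if bi != kb and bj != kb:
                    _relax_block(d, n, block_size, bi, bj, kb)
    return d


if __name__ == "__main__":
    graph = [
        [0, 3, INF, 7, INF],
        [8, 0, 2, INF, INF],
        [5, INF, 0, 1, INF],
        [2, INF, INF, 0, 4],
        [INF, 1, INF, INF, 0],
    ]
    for row in blocked_floyd_warshall(graph, 2):
        print(row)
```

In a scheme like this, the phase 3 blocks within one iteration do not depend on each other, which is what makes them natural candidates for distribution across cores, and the block size determines how much of the distance matrix each block update touches at a time.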
Blocked All-Pairs Shortest Paths Algorithm on Intel Xeon Phi KNL Processor: A Case Study
Manycore processors are becoming established in the HPC community as a way of
improving performance while keeping power efficiency. Knights Landing is the
recently released second generation of the Intel Xeon Phi architecture. While
optimizing applications on CPUs, GPUs and the first Xeon Phi has been widely
studied in recent years, the new features in Knights Landing processors require
a revision of programming and optimization techniques for these devices. In
this work, we selected the Floyd-Warshall algorithm as a representative case
study of graph and memory-bound applications. Starting from the default serial
version, we show how data, thread and compiler level optimizations help the
parallel implementation to reach 338 GFLOPS.
Comment: Computer Science - CACIC 2017. Springer Communications in Computer
and Information Science, vol 79
Routing on the Visibility Graph
We consider the problem of routing on a network in the presence of line
segment constraints (i.e., obstacles that edges in our network are not allowed
to cross). Let $P$ be a set of points in the plane and let $S$ be a set of
non-crossing line segments whose endpoints are in $P$. We present two
deterministic 1-local $O(1)$-memory routing algorithms that are guaranteed to
find a path of at most linear size between any pair of vertices of the
visibility graph of $P$ with respect to a set of constraints $S$ (i.e.,
the algorithms never look beyond the direct neighbours of the current location
and store only a constant amount of additional information). Contrary to
all existing deterministic local routing algorithms, our routing algorithms do
not route on a plane subgraph of the visibility graph. Additionally, we provide
lower bounds on the routing ratio of any deterministic local routing algorithm
on the visibility graph.
Comment: An extended abstract of this paper appeared in the proceedings of the
28th International Symposium on Algorithms and Computation (ISAAC 2017).
Final version appeared in the Journal of Computational Geometr