Preparing sparse solvers for exascale computing.
Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing Project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current success and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
GiViP: A Visual Profiler for Distributed Graph Processing Systems
Analyzing large-scale graphs provides valuable insights in different
application scenarios. While many graph processing systems working on top of
distributed infrastructures have been proposed to deal with big graphs, the
tasks of profiling and debugging their massive computations remain time
consuming and error-prone. This paper presents GiViP, a visual profiler for
distributed graph processing systems based on a Pregel-like computation model.
GiViP captures the huge amount of messages exchanged throughout a computation
and provides an interactive user interface for the visual analysis of the
collected data. We show how to take advantage of GiViP to detect anomalies
related to the computation and to the infrastructure, such as slow computing
units and anomalous message patterns.
Comment: Appears in the Proceedings of the 25th International Symposium on
Graph Drawing and Network Visualization (GD 2017)
Omniscopes: Large Area Telescope Arrays with only N log N Computational Cost
We show that the class of antenna layouts for telescope arrays allowing cheap
analysis hardware (with correlator cost scaling as N log N rather than N^2 with
the number of antennas N) is encouragingly large, including not only previously
discussed rectangular grids but also arbitrary hierarchies of such grids, with
arbitrary rotations and shears at each level. We show that all correlations for
such a 2D array with an n-level hierarchy can be efficiently computed via a
Fast Fourier Transform in not 2 but 2n dimensions. This can allow major
correlator cost reductions for science applications requiring exquisite
sensitivity at widely separated angular scales, for example 21cm tomography
(where short baselines are needed to probe the cosmological signal and long
baselines are needed for point source removal), helping enable future 21cm
experiments with thousands or millions of cheap dipole-like antennas. Such
hierarchical grids combine the angular resolution advantage of traditional
array layouts with the cost advantage of a rectangular Fast Fourier Transform
Telescope. We also describe an algorithm for how a subclass of hierarchical
arrays can efficiently use rotation synthesis to produce global sky maps with
minimal noise and a well-characterized synthesized beam.
Comment: Replaced to match accepted PRD version. 10 pages, 9 fig
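The N log N versus N^2 scaling mentioned in this abstract can be illustrated with a toy example. The sketch below is not the paper's hierarchical 2n-dimensional algorithm; it assumes a plain 1D row of equally spaced antennas and a single complex sample per antenna, and all function names are hypothetical. It sums the pairwise products for each baseline (antenna separation) once directly and once via a zero-padded FFT, using the fact that the inverse transform of the power spectrum is the autocorrelation.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def baseline_sums_direct(s):
    """O(N^2): for each separation d, sum s[i+d] * conj(s[i]) over all pairs."""
    n = len(s)
    return [
        sum(s[i + d] * s[i].conjugate() for i in range(n - d))
        for d in range(n)
    ]


def baseline_sums_fft(s):
    """O(N log N): same sums via the autocorrelation theorem."""
    n = len(s)
    size = 1
    while size < 2 * n:  # zero-pad so circular wrap-around cannot mix baselines
        size *= 2
    padded = [complex(v) for v in s] + [0j] * (size - n)
    spectrum = fft(padded)
    power = [v * v.conjugate() for v in spectrum]
    corr = fft(power, inverse=True)
    return [c / size for c in corr[:n]]  # divide by size to normalize the inverse


if __name__ == "__main__":
    samples = [1 + 2j, 0.5 - 1j, -1 + 0.25j, 2 + 0j, 0.3 + 0.7j]
    for d, (a, b) in enumerate(
        zip(baseline_sums_direct(samples), baseline_sums_fft(samples))
    ):
        print(d, a, b)
```

The two functions agree up to floating-point rounding. The FFT route only works because a regular grid makes many antenna pairs share the same baseline, which is the property the abstract extends to hierarchies of grids.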
Optimal Hierarchical Layouts for Cache-Oblivious Search Trees
This paper proposes a general framework for generating cache-oblivious
layouts for binary search trees. A cache-oblivious layout attempts to minimize
cache misses on any hierarchical memory, independent of the number of memory
levels and attributes at each level such as cache size, line size, and
replacement policy. Recursively partitioning a tree into contiguous subtrees
and prescribing an ordering amongst the subtrees, Hierarchical Layouts
generalize many commonly used layouts for trees such as in-order, pre-order and
breadth-first. They also generalize the various flavors of the van Emde Boas
layout, which have previously been used as cache-oblivious layouts.
Hierarchical Layouts thus unify all previous attempts at deriving layouts for
search trees.
The paper then derives a new locality measure (the Weighted Edge Product)
that mimics the probability of cache misses at multiple levels, and shows that
layouts that reduce this measure perform better. We analyze the various degrees
of freedom in the construction of Hierarchical Layouts, and investigate the
relative effect of each of these decisions in the construction of
cache-oblivious layouts. Optimizing the Weighted Edge Product for complete
binary search trees, we introduce the MinWEP layout, and show that it
outperforms previously used cache-oblivious layouts by almost 20%.
Comment: Extended version with proofs added to the appendix
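As a concrete illustration of the recursive partitioning this abstract describes, the following sketch generates one common flavor of the van Emde Boas layout for a complete binary tree. It is not the paper's MinWEP layout or its Weighted Edge Product measure. It assumes nodes are identified by breadth-first indices (root 1, children 2i and 2i+1) and that the top subtree receives the floor of half the height; the function name is hypothetical.

```python
def veb_order(root, height):
    """Return the nodes of a complete binary subtree in van Emde Boas order.

    Nodes are named by breadth-first index: the children of node i are
    2*i and 2*i + 1. `height` counts levels, so height 1 is a single node.
    """
    if height == 1:
        return [root]
    top_h = height // 2          # levels in the top subtree (assumed split rule)
    bottom_h = height - top_h    # levels in each bottom subtree
    # Lay out the top subtree contiguously first.
    order = veb_order(root, top_h)
    # The bottom subtrees are rooted top_h levels below `root`,
    # visited from left to right, each laid out contiguously.
    first_bottom_root = root << top_h
    for bottom_root in range(first_bottom_root, first_bottom_root + (1 << top_h)):
        order.extend(veb_order(bottom_root, bottom_h))
    return order


if __name__ == "__main__":
    # Memory order of a 15-node tree (height 4), as breadth-first indices.
    print(veb_order(1, 4))
```

Position k of the returned list holds the breadth-first index of the node stored at memory slot k. Changing the split rule or the ordering of the subtrees gives other members of the family of layouts that the abstract calls Hierarchical Layouts.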
Dynamic Multilevel Graph Visualization
We adapt multilevel, force-directed graph layout techniques to visualizing
dynamic graphs in which vertices and edges are added and removed in an online
fashion (i.e., unpredictably). We maintain multiple levels of coarseness using
a dynamic, randomized coarsening algorithm. To ensure the vertices follow
smooth trajectories, we employ dynamics simulation techniques, treating the
vertices as point particles. We simulate fine and coarse levels of the graph
simultaneously, coupling the dynamics of adjacent levels. Projection from
coarser to finer levels is adaptive, with the projection determined by an
affine transformation that evolves alongside the graph layouts. The result is a
dynamic graph visualizer that quickly and smoothly adapts to changes in a
graph.
Comment: 21 pages
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning together with applications and future research directions.
Edge Routing with Ordered Bundles
Edge bundling reduces the visual clutter in a drawing of a graph by uniting
the edges into bundles. We propose an edge bundling method that draws each edge
of a bundle separately, as in metro maps, and call it ordered bundles. To
produce aesthetically pleasing edge routes, the method minimizes a cost function
on the edges. The cost function depends on the ink required to draw the edges,
the edge lengths, widths, and separations. Using a constrained Delaunay
triangulation, the cost also penalizes routing too many edges through narrow
channels. The method avoids unnecessary edge-node and edge-edge crossings.
To draw the edges of a bundle separately and with a minimal number of
crossings, we develop an efficient algorithm that solves a variant of the
metro-line crossing minimization problem. In general, the method creates clear
and smooth edge routes that give an overview of the global graph structure while
still drawing each edge separately, thus enabling local analysis.