A Study of Energy and Locality Effects using Space-filling Curves
The cost of energy is becoming an increasingly important driver for the
operating cost of HPC systems, adding yet another facet to the challenge of
producing efficient code. In this paper, we investigate the energy implications
of trading computation for locality using Hilbert and Morton space-filling
curves with dense matrix-matrix multiplication. The advantage of these curves
is that they exhibit an inherent tiling effect without requiring specific
architecture tuning. By accessing the matrices in the order determined by the
space-filling curves, we can trade computation for locality. The index
computation overhead of the Morton curve is found to be balanced against its
locality and energy efficiency, while the overhead of the Hilbert curve
outweighs its improvements on our test system.
Comment: Proceedings of the 2014 IEEE International Parallel & Distributed
Processing Symposium Workshops (IPDPSW)
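To make the "index computation overhead" concrete, the following is a minimal sketch of Morton (Z-order) indexing by bit interleaving, used here only to choose the order in which the entries of the result matrix are visited. It is an illustration under stated assumptions, not the paper's implementation: the function names, the 16-bit default, and the row-major list-of-lists storage are all hypothetical, and the matrices are not stored in curve order here.

```python
def morton_encode(row, col, bits=16):
    """Interleave the bits of row and col into one Morton (Z-order) index."""
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)        # even bit positions: column
        index |= ((row >> b) & 1) << (2 * b + 1)    # odd bit positions: row
    return index


def morton_decode(index, bits=16):
    """Inverse of morton_encode: recover (row, col) from a Morton index."""
    row = col = 0
    for b in range(bits):
        col |= ((index >> (2 * b)) & 1) << b
        row |= ((index >> (2 * b + 1)) & 1) << b
    return row, col


def matmul_morton(a, b, n):
    """Multiply two n x n matrices (n a power of two, hypothetical layout:
    row-major lists), visiting the entries of C in Morton order."""
    c = [[0] * n for _ in range(n)]
    for z in range(n * n):
        i, j = morton_decode(z)   # the extra index computation per entry
        c[i][j] = sum(a[i][k] * b[k][j] for k in range(n))
    return c
```

For example, `morton_encode(2, 3)` yields 13, and the first four indices visit the 2 x 2 block `(0, 0), (0, 1), (1, 0), (1, 1)` before moving on, which is the inherent tiling effect the abstract refers to.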
A tetrahedral space-filling curve for non-conforming adaptive meshes
We introduce a space-filling curve for triangular and tetrahedral
red-refinement that can be computed using bitwise interleaving operations
similar to the well-known Z-order or Morton curve for cubical meshes. To store
sufficient information for random access, we define a low-memory encoding using
10 bytes per triangle and 14 bytes per tetrahedron. We present algorithms that
compute the parent, children, and face-neighbors of a mesh element in constant
time, as well as the next and previous element in the space-filling curve and
whether a given element is on the boundary of the root simplex or not. Our
presentation concludes with a scalability demonstration that creates and adapts
selected meshes on a large distributed-memory system.
Comment: 33 pages, 12 figures, 8 tables
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning together with applications and future research directions.
Sixteen space-filling curves and traversals for d-dimensional cubes and simplices
This article describes sixteen different ways to traverse d-dimensional space
recursively in a way that is well-defined for any number of dimensions. Each of
these traversals has distinct properties that may be beneficial for certain
applications. Some of the traversals are novel, some were known in
principle but had not been described adequately for any number of dimensions,
and some were already known. This article is the first to present
them all in a consistent notation system. Furthermore, with this article, tools
are provided to enumerate points in a regular grid in the order in which they
are visited by each traversal. In particular, we cover: five discontinuous
traversals based on subdividing cubes into 2^d subcubes: Z-traversal (Morton
indexing), U-traversal, Gray-code traversal, Double-Gray-code traversal, and
Inside-out traversal; two discontinuous traversals based on subdividing
simplices into 2^d subsimplices: the Hill-Z traversal and the Maehara-reflected
traversal; five continuous traversals based on subdividing cubes into 2^d
subcubes: the Base-camp Hilbert curve, the Harmonious Hilbert curve, the Alfa
Hilbert curve, the Beta Hilbert curve, and the Butz-Hilbert curve; four
continuous traversals based on subdividing cubes into 3^d subcubes: the Peano
curve, the Coil curve, the Half-coil curve, and the Meurthe curve. All of these
traversals are self-similar in the sense that the traversal in each of the
subcubes or subsimplices of a cube or simplex, on any level of recursive
subdivision, can be obtained by scaling, translating, rotating, reflecting
and/or reversing the traversal of the complete unit cube or simplex.
Comment: 28 pages, 12 figures. v2: fixed a confusing typo on page 12, line
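As a small illustration of what "enumerating points in a regular grid in the order in which they are visited" can look like, here is a sketch of the classic two-dimensional Hilbert curve, using the widely known iterative index-to-coordinate conversion. It is not one of the article's d-dimensional variants or tools; the function names and the `order` parameter (grid side length 2^order) are assumptions made for this example.

```python
def hilbert_d2xy(order, d):
    """Map a position d along the 2D Hilbert curve to grid coordinates (x, y).

    The grid has side length 2**order; d ranges over 0 .. 4**order - 1.
    """
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/reflect the sub-square so the pieces join up continuously.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_traversal(order):
    """List all grid points in the order the curve visits them."""
    return [hilbert_d2xy(order, d) for d in range(4 ** order)]
```

Because the Hilbert curve is a continuous traversal, consecutive points in `hilbert_traversal(order)` are always direct grid neighbours, in contrast to the jumps of a discontinuous traversal such as Morton indexing.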
Effectiveness of landmark analysis for establishing locality in p2p networks
Locality to other nodes on a peer-to-peer overlay network can be established by means of a set of landmarks shared among the participating nodes. Each node independently collects a set of latency measures to landmark nodes, which are used as a multi-dimensional feature vector. Each peer node uses the feature vector to generate a unique scalar index which is correlated to its topological locality. A popular dimensionality reduction technique is the space-filling Hilbert’s curve, as it possesses good locality-preserving properties. However, there exists little comparison between Hilbert’s curve and other techniques for dimensionality reduction. This work carries out a quantitative analysis of their properties. Linear and non-linear techniques for scaling the landmark vectors to a single dimension are investigated. Hilbert’s curve, Sammon’s mapping and Principal Component Analysis have been used to generate a 1d space with locality-preserving properties. This work provides empirical evidence to support the use of Hilbert’s curve in the context of locality preservation when generating peer identifiers by means of landmark vector analysis. A comparative analysis is carried out with an artificial 2d network model and with a realistic network topology model with a typical power-law distribution of node connectivity in the Internet. Nearest neighbour analysis confirms Hilbert’s curve to be very effective in both artificial and realistic network topologies. Nevertheless, the results in the realistic network model show that there is scope for improvement and that better techniques to preserve locality information are required.
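The mapping from a landmark vector to a scalar index can be sketched as follows for the simplest case of two landmarks. This is only an illustration under several assumptions that are not taken from the paper: the latency cap `max_latency_ms`, the grid resolution `order`, the quantization scheme, and all function names are hypothetical, and with quantization two peers in the same grid cell receive the same index, so the result is not guaranteed to be unique.

```python
def hilbert_xy2d(order, x, y):
    """Map grid coordinates (x, y) to a position along the 2D Hilbert curve.

    The grid has side length 2**order (classic iterative conversion).
    """
    n = 1 << order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            # Rotate/reflect so the next, finer level sees a canonical quadrant.
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def quantize(latency_ms, max_latency_ms, order):
    """Map a latency onto one of 2**order grid cells (assumed linear scale)."""
    cells = 1 << order
    clamped = min(max(latency_ms, 0.0), max_latency_ms)
    return min(int(clamped / max_latency_ms * cells), cells - 1)


def peer_index(latencies_ms, max_latency_ms=500.0, order=8):
    """Scalar index for a peer from its latencies to two landmark nodes."""
    x = quantize(latencies_ms[0], max_latency_ms, order)
    y = quantize(latencies_ms[1], max_latency_ms, order)
    return hilbert_xy2d(order, x, y)
```

Peers with similar latencies to both landmarks tend to land in nearby cells and therefore often, though not always, receive nearby indices; extending the sketch to more landmarks would require a higher-dimensional Hilbert mapping.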
The DUNE-ALUGrid Module
In this paper we present the new DUNE-ALUGrid module. This module contains a
major overhaul of the sources from the ALUgrid library and the binding to the
DUNE software framework. The main changes include user defined load balancing,
parallel grid construction, and a redesign of the 2d grid which can now also
be used for parallel computations. In addition many improvements have been
introduced into the code to increase the parallel efficiency and to decrease
the memory footprint.
The original ALUGrid library is widely used within the DUNE community due to
its good parallel performance for problems requiring local adaptivity and
dynamic load balancing. Therefore, this new module will benefit a number of DUNE
users. In addition we have added features to increase the range of problems for
which the grid manager can be used, for example, introducing a 3d tetrahedral
grid using a parallel newest vertex bisection algorithm for conforming grid
refinement. In this paper we will discuss the new features, extensions to the
DUNE interface, and explain for various examples how the code is used in
parallel environments.
Comment: 25 pages, 11 figures
Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors
In this work, we consider the reformulation of hierarchical ($\mathcal{H}$)
matrix algorithms for many-core processors with a model implementation on
graphics processing units (GPUs). $\mathcal{H}$ matrices approximate specific
dense matrices, e.g., from discretized integral equations or kernel ridge
regression, leading to log-linear time complexity in dense matrix-vector
products. The parallelization of $\mathcal{H}$ matrix operations on many-core
processors is difficult due to the complex nature of the underlying algorithms.
While previous algorithmic advances for many-core hardware focused on
accelerating existing $\mathcal{H}$ matrix CPU implementations by many-core
processors, we here aim at relying entirely on that processor type. As our main
contribution, we introduce the parallel algorithmic patterns needed
to map the full $\mathcal{H}$ matrix construction and the fast matrix-vector
product to many-core hardware. Here, crucial ingredients are space-filling
curves, parallel tree traversal and batching of linear algebra operations. The
resulting model GPU implementation hmglib is, to the best of the authors'
knowledge, the first entirely GPU-based Open Source $\mathcal{H}$ matrix library of
this kind. We conclude this work with an in-depth performance analysis and a
comparative performance study against a standard $\mathcal{H}$ matrix library,
highlighting profound speedups of our many-core parallel approach.
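The reason hierarchical matrices make matrix-vector products cheap is that suitable off-diagonal blocks are stored in low-rank factored form rather than densely. The following is a minimal CPU sketch of that single building block, assuming a block given as U V^T with rank k; it is not taken from hmglib, and all names are hypothetical. Applying the factors one after the other costs on the order of k(m + n) operations instead of m·n for the dense block.

```python
def dense_matvec(m, x):
    """Reference product y = M x for a dense block stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def lowrank_matvec(u, v, x):
    """Compute (U V^T) x as U (V^T x) without forming the dense block.

    u is m x k, v is n x k, x has length n.
    """
    k = len(u[0])
    # t = V^T x has only k entries.
    t = [sum(v[j][r] * x[j] for j in range(len(x))) for r in range(k)]
    # y = U t has m entries.
    return [sum(row[r] * t[r] for r in range(k)) for row in u]


def expand(u, v):
    """Form the dense block U V^T explicitly (for checking only)."""
    k = len(u[0])
    return [[sum(ui[r] * vj[r] for r in range(k)) for vj in v] for ui in u]
```

A quick check is to compare `lowrank_matvec(u, v, x)` with `dense_matvec(expand(u, v), x)` for small integer factors; the two agree exactly. Batching many such small products into a single call is the kind of pattern the abstract refers to when mapping this work to GPUs.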