The scheduling of sparse matrix-vector multiplication on a massively parallel dap computer
An efficient data structure is presented which supports general unstructured sparse matrix-vector multiplications on a Distributed Array of Processors (DAP). This approach seeks to reduce the inter-processor data movements and organises the operations in batches of massively parallel steps by a heuristic scheduling procedure performed on the host computer.
The resulting data structure is of particular relevance to iterative schemes for solving linear systems. Performance results for matrices taken from well-known Linear Programming (LP) test problems are presented and analysed.
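The core idea of scheduling nonzeros into batches of parallel steps can be sketched as follows. This is a minimal illustration, assuming a simple greedy rule in which each result row is updated at most once per step so that all updates in a step could run in lockstep on a SIMD processor array; the names and the greedy rule are illustrative assumptions, not the paper's exact heuristic.

```python
# Host-side batch scheduling for sparse y = A @ x on a SIMD array:
# nonzeros are greedily packed into "steps" in which each result row
# appears at most once, so all updates in a step are conflict-free and
# can be applied as one massively parallel operation.

def schedule_batches(nonzeros):
    """nonzeros: list of (row, col, value) triples."""
    steps = []  # each step: list of triples with pairwise distinct rows
    for entry in nonzeros:
        row = entry[0]
        for step in steps:
            if all(r != row for r, _, _ in step):
                step.append(entry)
                break
        else:
            steps.append([entry])
    return steps

def spmv(steps, x, n_rows):
    y = [0.0] * n_rows
    for step in steps:             # sequential over steps
        for r, c, v in step:       # conceptually one parallel SIMD step
            y[r] += v * x[c]
    return y

A = [(0, 0, 2.0), (0, 1, 1.0), (1, 1, 3.0), (2, 0, 4.0)]
steps = schedule_batches(A)
print(len(steps))                  # 2: row 0 has two nonzeros
print(spmv(steps, [1.0, 1.0], 3))  # [3.0, 3.0, 4.0]
```

In an iterative solver, the schedule is computed once on the host and then reused for every matrix-vector product, which is why the host-side scheduling cost amortizes well.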
Spin-polarized Quantum Transport in Mesoscopic Conductors: Computational Concepts and Physical Phenomena
Mesoscopic conductors are electronic systems of sizes in between nano- and
micrometers, and often of reduced dimensionality. In the phase-coherent regime
at low temperatures, the conductance of these devices is governed by quantum
interference effects, with the Aharonov-Bohm effect and conductance
fluctuations as prominent examples. While the first measurements of quantum charge
transport date back to the 1980s, spin phenomena in mesoscopic transport have
moved only recently into the focus of attention, as one branch of the field of
spintronics. The interplay of quantum coherence with confinement, disorder,
or interaction effects gives rise to a variety of unexpected spin phenomena
in mesoscopic conductors and, moreover, makes it possible to control and engineer
the spin of the charge carriers: spin interference is often the basis for
spin-valves, -filters, -switches or -pumps. Their underlying mechanisms may
gain relevance on the way to possible future semiconductor-based spin devices.
A quantitative theoretical understanding of spin-dependent mesoscopic
transport calls for developing efficient and flexible numerical algorithms,
including matrix-reordering techniques within Green function approaches, which
we will explain, review and employ.
Comment: To appear in the Encyclopedia of Complexity and System Science
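Matrix reordering in this context typically means permuting the sparse Hamiltonian so that it is closer to block-tridiagonal form before a recursive Green-function sweep. A standard bandwidth-reducing permutation of this kind is reverse Cuthill-McKee; the generic BFS-based sketch below is illustrative only and is not claimed to be the specific technique the authors employ.

```python
# Minimal reverse Cuthill-McKee sketch: a BFS from low-degree seed nodes,
# visiting neighbours in order of increasing degree, then reversing the
# ordering.  Such bandwidth-reducing permutations bring a sparse matrix
# closer to the banded/block-tridiagonal structure that recursive
# Green-function algorithms exploit.
from collections import deque

def reverse_cuthill_mckee(adj):
    """adj: dict node -> set of neighbour nodes (undirected graph)."""
    order, seen = [], set()
    for start in sorted(adj, key=lambda n: len(adj[n])):  # low-degree seeds
        if start in seen:
            continue
        queue = deque([start]); seen.add(start)
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in sorted(adj[node] - seen, key=lambda n: len(adj[n])):
                seen.add(nb); queue.append(nb)
    return order[::-1]  # reversing tends to further reduce fill-in

# A path graph 0-2-3-1 labelled out of order; RCM recovers a labelling in
# which every edge connects consecutively numbered nodes (bandwidth 1).
adj = {0: {2}, 2: {0, 3}, 3: {2, 1}, 1: {3}}
print(reverse_cuthill_mckee(adj))
```

Applied to a transport Hamiltonian, the permuted matrix can then be partitioned into slices that are coupled only to their neighbours, which is the structure the recursive Green-function recursion needs.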
Permutation and Grouping Methods for Sharpening Gaussian Process Approximations
Vecchia's approximate likelihood for Gaussian process parameters depends on
how the observations are ordered, which can be viewed as a deficiency because
the exact likelihood is permutation-invariant. This article takes the
alternative standpoint that the ordering of the observations can be tuned to
sharpen the approximations. Advantageously chosen orderings can drastically
improve the approximations, and in fact, completely random orderings often
produce far more accurate approximations than default coordinate-based
orderings do. In addition to the permutation results, automatic methods for
grouping the component calculations of the approximation are introduced, which
simultaneously improve the quality of the approximation and reduce its
computational burden. In common settings, reordering combined with
grouping reduces Kullback-Leibler divergence from the target model by a factor
of 80 and computation time by a factor of 2 compared to ungrouped
approximations with default ordering. The claims are supported by theory and
numerical results with comparisons to other approximations, including tapered
covariances and stochastic partial differential equation approximations.
Computational details are provided, including efficiently finding the orderings
and ordered nearest neighbors, and profiling out linear mean parameters and
using the approximations for prediction and conditional simulation. An
application to space-time satellite data is presented.
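The role the ordering plays can be seen in a small sketch of a Vecchia-type likelihood: the joint Gaussian density is factored into conditionals, each conditioning only on the m nearest previously ordered points, so swapping the permutation changes the approximation. The kernel, sizes, and variable names below are illustrative assumptions, not the article's implementation.

```python
# Vecchia-type approximate Gaussian log-likelihood: condition each
# observation on its m nearest neighbours among *earlier* points in the
# chosen ordering.  Changing `order` (random permutation vs. a default
# coordinate sort) changes the quality of the approximation.
import numpy as np

def vecchia_loglik(y, locs, order, m, cov):
    y, locs = y[order], locs[order]
    n, ll = len(y), 0.0
    for i in range(n):
        # m nearest neighbours among the points ordered before i
        d = np.linalg.norm(locs[:i] - locs[i], axis=1)
        nb = np.argsort(d)[:m]
        K_nn = cov(locs[nb], locs[nb])
        k_in = cov(locs[i:i + 1], locs[nb]).ravel()
        have_nb = len(nb) > 0
        w = np.linalg.solve(K_nn, k_in) if have_nb else np.array([])
        mu = w @ y[nb] if have_nb else 0.0
        var = cov(locs[i:i + 1], locs[i:i + 1])[0, 0] - (w @ k_in if have_nb else 0.0)
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return ll

def expcov(a, b):  # exponential covariance kernel, illustrative choice
    return np.exp(-np.linalg.norm(a[:, None] - b[None, :], axis=-1))

rng = np.random.default_rng(0)
locs = rng.uniform(size=(40, 2))
y = rng.standard_normal(40)
coord = np.argsort(locs[:, 0])  # default left-to-right coordinate ordering
rand = rng.permutation(40)      # completely random ordering
print(vecchia_loglik(y, locs, coord, 5, expcov))
print(vecchia_loglik(y, locs, rand, 5, expcov))
```

Grouping, in this picture, amounts to evaluating several of these conditionals jointly with a shared conditioning set, which both tightens the approximation and lets the linear algebra be batched.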
BSML: A Binding Schema Markup Language for Data Interchange in Problem Solving Environments (PSEs)
We describe a binding schema markup language (BSML) for describing data
interchange between scientific codes. Such a facility is an important
constituent of scientific problem solving environments (PSEs). BSML is designed
to integrate with a PSE or application composition system that views model
specification and execution as a problem of managing semistructured data. The
data interchange problem is addressed by three techniques for processing
semistructured data: validation, binding, and conversion. We present BSML and
describe its application to a PSE for wireless communications system design.
Software trace cache
We explore the use of compiler optimizations that optimize the layout of instructions in memory. The goal is to enable the code to make better use of the underlying hardware resources, regardless of the specific details of the processor/architecture, in order to increase fetch performance. The Software Trace Cache (STC) is a code layout algorithm with a broader target than previous layout optimizations. We target not only an improvement in the instruction cache hit rate but also an increase in the effective fetch width of the fetch engine. The STC algorithm organizes basic blocks into chains, trying to make sequentially executed basic blocks reside in consecutive memory positions, and then maps the basic block chains in memory to minimize conflict misses in the important sections of the program. We evaluate and analyze in detail the impact of the STC, and of code layout optimizations in general, on the three main aspects of fetch performance: the instruction cache hit rate, the effective fetch width, and the branch prediction accuracy. Our results show that layout-optimized codes have some special characteristics that make them more amenable to high-performance instruction fetch. They have a very high rate of not-taken branches and execute long chains of sequential instructions; they also make very effective use of instruction cache lines, mapping only useful instructions that will execute close in time, increasing both spatial and temporal locality.
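The chain-building phase described above can be sketched as a greedy walk over a profiled control-flow graph: starting from hot entry blocks, repeatedly follow each block's most frequently executed successor so the common path becomes a fall-through sequence. The profile format and the greedy rule here are illustrative assumptions about this style of algorithm, not the STC's exact heuristics.

```python
# Greedy basic-block chaining for profile-guided code layout: follow the
# hottest unplaced successor of each block so that sequentially executed
# blocks end up contiguous in memory (maximizing not-taken branches and
# effective fetch width).

def build_chains(succ_counts, seeds):
    """succ_counts: {block: {successor: exec_count}}; seeds: hot entry blocks."""
    placed, chains = set(), []
    for seed in seeds:
        if seed in placed:
            continue
        chain, block = [], seed
        while block is not None and block not in placed:
            chain.append(block); placed.add(block)
            succs = {s: c for s, c in succ_counts.get(block, {}).items()
                     if s not in placed}
            block = max(succs, key=succs.get) if succs else None
        chains.append(chain)
    return chains

# Toy CFG profile: A mostly falls through to B, B mostly branches to D.
profile = {"A": {"B": 90, "C": 10}, "B": {"D": 80, "E": 20}, "C": {"E": 10}}
print(build_chains(profile, ["A", "C"]))  # [['A', 'B', 'D'], ['C', 'E']]
```

A second phase would then place the chains themselves, keeping the hottest chains in cache-conflict-free positions, which is the conflict-miss part of the layout problem.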
Optimizing compilation with preservation of structural code coverage metrics to support software testing
Code-coverage-based testing is a widely used testing strategy that aims to provide a meaningful decision criterion for the adequacy of a test suite. Code-coverage-based testing is also mandated for the development of safety-critical applications; for example, the DO-178B document requires the application of modified condition/decision coverage. One critical issue of code-coverage testing is that structural code coverage criteria are typically applied to source code, whereas the generated machine code may have a different structure because of code optimizations performed by the compiler. In this work, we present the automatic calculation of coverage profiles describing which structural code-coverage criteria are preserved by which code optimizations, independently of the concrete test suite. These coverage profiles make it possible to extend compilers with the ability to preserve any given code-coverage criterion by enabling only those code optimizations that preserve it. Furthermore, we describe the integration of these coverage profiles into the compiler GCC. With these coverage profiles, we answer the question of how much code optimization is possible without compromising the error-detection likelihood of a given test suite. Experimental results show that the performance cost of preserving structural code coverage in GCC is rather low.
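The way a coverage profile could gate optimization passes can be sketched as a simple lookup: each pass is annotated with the structural coverage criteria it preserves, and for a required criterion only the preserving passes are enabled. The pass names and profile contents below are illustrative assumptions, not GCC's actual pass tables or the paper's computed profiles.

```python
# A coverage profile maps each optimization pass to the set of structural
# coverage criteria it preserves; given a required criterion, enable only
# the passes that keep it intact.

COVERAGE_PROFILE = {
    "constant_folding": {"statement", "branch", "mcdc"},
    "dead_code_elim":   {"statement", "branch", "mcdc"},
    "loop_unrolling":   {"statement"},
    "branch_merging":   set(),  # merging decisions can break branch/MC/DC coverage
}

def enabled_passes(required_criterion, profile=COVERAGE_PROFILE):
    """Return the passes safe to run while preserving the given criterion."""
    return [p for p, kept in profile.items() if required_criterion in kept]

print(enabled_passes("mcdc"))       # ['constant_folding', 'dead_code_elim']
print(enabled_passes("statement"))  # ['constant_folding', 'dead_code_elim', 'loop_unrolling']
```

The key property is that the profile is computed once per (criterion, optimization) pair, independently of any test suite, so the same table serves every project that must certify against that criterion.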