15,644 research outputs found
Holistic debugging - enabling instruction set simulation for software quality assurance
We present holistic debugging, a novel method for observing execution of complex and distributed software. It builds on an instruction set simulator, which provides reproducible experiments and non-intrusive probing of state in a distributed system. Instruction set simulators, however, only provide low-level information, so a holistic debugger contains a translation framework that maps this information to higher abstraction level observation tools, such as source code debuggers. We have created Nornir, a proof-of-concept holistic debugger, built on the simulator Simics. For each observed process in the simulated system, Nornir creates an abstraction translation stack, with virtual machine translators that map machine-level storage contents (e.g. physical memory, registers) provided by Simics, to application-level data (e.g. virtual memory contents) by parsing the data structures of operating systems and virtual machines. Nornir includes a modified version of the GNU debugger (GDB), which supports non-intrusive symbolic debugging of distributed applications. Nornir's main interface is a debugger shepherd, a programmable interface that controls multiple debuggers, and allows users to coherently inspect the entire state of heterogeneous, distributed applications. It provides a robust observation platform for construction of new observation tools
Applying inspection to object-oriented software
The benefits of the object-oriented paradigmare widely cited. At the same time, inspection is deemed to be the most cost-effective means of detecting defects in software products. Why then, is there no published experience, let alone quantitative data, on the application of inspection to object-oriented systems? We describe the facilities of the object-oriented paradigm and the issues that these raise when inspecting object-oriented code. Several problems are caused by the disparity between the static code structure and its dynamic runtime behaviour. The large number of small methods in object-oriented systems can also cause problems. We then go on to describe three areas which may help mitigate problems found. Firstly, the use of various programming methods may assist in making object-oriented code easier to inspect. Secondly, improved program documentation can help the inspector understand the code which is under inspection. Finally, tool support can help the inspector to analyse the dynamic behaviour of the code. We conclude that while both the object-oriented paradigm and inspection provide excellent benefits on their own, combining the two may be a difficult exercise, requiring extensive support if it is to be successful
A Multilevel Approach to Topology-Aware Collective Operations in Computational Grids
The efficient implementation of collective communiction operations has
received much attention. Initial efforts produced "optimal" trees based on
network communication models that assumed equal point-to-point latencies
between any two processes. This assumption is violated in most practical
settings, however, particularly in heterogeneous systems such as clusters of
SMPs and wide-area "computational Grids," with the result that collective
operations perform suboptimally. In response, more recent work has focused on
creating topology-aware trees for collective operations that minimize
communication across slower channels (e.g., a wide-area network). While these
efforts have significant communication benefits, they all limit their view of
the network to only two layers. We present a strategy based upon a multilayer
view of the network. By creating multilevel topology-aware trees we take
advantage of communication cost differences at every level in the network. We
used this strategy to implement topology-aware versions of several MPI
collective operations in MPICH-G2, the Globus Toolkit[tm]-enabled version of
the popular MPICH implementation of the MPI standard. Using information about
topology provided by MPICH-G2, we construct these multilevel topology-aware
trees automatically during execution. We present results demonstrating the
advantages of our multilevel approach by comparing it to the default
(topology-unaware) implementation provided by MPICH and a topology-aware
two-layer implementation.Comment: 16 pages, 8 figure
Recommended from our members
Harnessing enforcement leverage at the border to minimize biological risk from international live species trade
Allocating inspection resources over a diverse set of imports to prevent entry of plant pests and pathogens presents a substantial policy design challenge. We model inspections of live plant imports and producer responses to inspections using a “state-dependent” monitoring and enforcement model. We capture exporter abatement response to a set of feasible inspection policies from the regulator. Conditional on this behavioral response, we solve the regulator’s problem of selecting the parameters for the state-dependent monitoring regime to minimize entry of infested shipments. We account for exporter heterogeneity, fixed penalties for noncompliance, imperfect abatement control and imperfect inspections at the border. Overall, we estimate that state-dependent targeting (based on historical interceptions) cuts the rate of infested shipments that are accepted by one-fifth, relative to uniformly allocated inspections
cphVB: A System for Automated Runtime Optimization and Parallelization of Vectorized Applications
Modern processor architectures, in addition to having still more cores, also
require still more consideration to memory-layout in order to run at full
capacity. The usefulness of most languages is deprecating as their
abstractions, structures or objects are hard to map onto modern processor
architectures efficiently.
The work in this paper introduces a new abstract machine framework, cphVB,
that enables vector oriented high-level programming languages to map onto a
broad range of architectures efficiently. The idea is to close the gap between
high-level languages and hardware optimized low-level implementations. By
translating high-level vector operations into an intermediate vector bytecode,
cphVB enables specialized vector engines to efficiently execute the vector
operations.
The primary success parameters are to maintain a complete abstraction from
low-level details and to provide efficient code execution across different,
modern, processors. We evaluate the presented design through a setup that
targets multi-core CPU architectures. We evaluate the performance of the
implementation using Python implementations of well-known algorithms: a jacobi
solver, a kNN search, a shallow water simulation and a synthetic stencil
simulation. All demonstrate good performance
rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Parallel Independent Tasks
Scientific applications often contain large and computationally intensive
parallel loops. Dynamic loop self scheduling (DLS) is used to achieve a
balanced load execution of such applications on high performance computing
(HPC) systems. Large HPC systems are vulnerable to processors or node failures
and perturbations in the availability of resources. Most self-scheduling
approaches do not consider fault-tolerant scheduling or depend on failure or
perturbation detection and react by rescheduling failed tasks. In this work, a
robust dynamic load balancing (rDLB) approach is proposed for the robust self
scheduling of independent tasks. The proposed approach is proactive and does
not depend on failure or perturbation detection. The theoretical analysis of
the proposed approach shows that it is linearly scalable and its cost decrease
quadratically by increasing the system size. rDLB is integrated into an MPI DLS
library to evaluate its performance experimentally with two computationally
intensive scientific applications. Results show that rDLB enables the tolerance
of up to (P minus one) processor failures, where P is the number of processors
executing an application. In the presence of perturbations, rDLB boosted the
robustness of DLS techniques up to 30 times and decreased application execution
time up to 7 times compared to their counterparts without rDLB
Synapse: Synthetic Application Profiler and Emulator
We introduce Synapse motivated by the needs to estimate and emulate workload
execution characteristics on high-performance and distributed heterogeneous
resources. Synapse has a platform independent application profiler, and the
ability to emulate profiled workloads on a variety of heterogeneous resources.
Synapse is used as a proxy application (or "representative application") for
real workloads, with the added advantage that it can be tuned at arbitrary
levels of granularity in ways that are simply not possible using real
applications. Experiments show that automated profiling using Synapse
represents application characteristics with high fidelity. Emulation using
Synapse can reproduce the application behavior in the original runtime
environment, as well as reproducing properties when used in a different
run-time environments
- …