18,643 research outputs found
Interval simulation: raising the level of abstraction in architectural simulation
Detailed architectural simulators suffer from a long development cycle and extremely long evaluation times. This longstanding problem is further exacerbated in the multi-core processor era. Existing solutions address the simulation problem by either sampling the simulated instruction stream or by mapping the simulation models on FPGAs; these approaches achieve substantial simulation speedups while simulating performance in a cycle-accurate manner This paper proposes interval simulation which rakes a completely different approach: interval simulation raises the level of abstraction and replaces the core-level cycle-accurate simulation model by a mechanistic analytical model. The analytical model estimates core-level performance by analyzing intervals, or the timing between two miss events (branch mispredictions and TLB/cache misses); the miss events are determined through simulation of the memory hierarchy, cache coherence protocol, interconnection network and branch predictor By raising the level of abstraction, interval simulation reduces both development time and evaluation time. Our experimental results using the SPEC CPU2000 and PARSEC benchmark suites and the MS multi-core simulator show good accuracy up to eight cores (average error of 4.6% and max error of 11% for the multi-threaded full-system workloads), while achieving a one order of magnitude simulation speedup compared to cycle-accurate simulation. Moreover interval simulation is easy to implement: our implementation of the mechanistic analytical model incurs only one thousand lines of code. Its high accuracy, fast simulation speed and ease-of-use make interval simulation a useful complement to the architect's toolbox for exploring system-level and high-level micro-architecture trade-offs
An occam Style Communications System for UNIX Networks
This document describes the design of a communications system which provides occam style communications primitives under a Unix environment, using TCP/IP protocols, and any number of other protocols deemed suitable as underlying transport layers. The system will integrate with a low overhead scheduler/kernel without incurring significant costs to the execution of processes within the run time environment. A survey of relevant occam and occam3 features and related research is followed by a look at the Unix and TCP/IP facilities which determine our working constraints, and a description of the T9000 transputer's Virtual Channel Processor, which was instrumental in our formulation. Drawing from the information presented here, a design for the communications system is subsequently proposed. Finally, a preliminary investigation of methods for lightweight access control to shared resources in an environment which does not provide support for critical sections, semaphores, or busy waiting, is made. This is presented with relevance to mutual exclusion problems which arise within the proposed design. Future directions for the evolution of this project are discussed in conclusion
Design and Implementation of a Distributed Middleware for Parallel Execution of Legacy Enterprise Applications
A typical enterprise uses a local area network of computers to perform its
business. During the off-working hours, the computational capacities of these
networked computers are underused or unused. In order to utilize this
computational capacity an application has to be recoded to exploit concurrency
inherent in a computation which is clearly not possible for legacy applications
without any source code. This thesis presents the design an implementation of a
distributed middleware which can automatically execute a legacy application on
multiple networked computers by parallelizing it. This middleware runs multiple
copies of the binary executable code in parallel on different hosts in the
network. It wraps up the binary executable code of the legacy application in
order to capture the kernel level data access system calls and perform them
distributively over multiple computers in a safe and conflict free manner. The
middleware also incorporates a dynamic scheduling technique to execute the
target application in minimum time by scavenging the available CPU cycles of
the hosts in the network. This dynamic scheduling also supports the CPU
availability of the hosts to change over time and properly reschedule the
replicas performing the computation to minimize the execution time. A prototype
implementation of this middleware has been developed as a proof of concept of
the design. This implementation has been evaluated with a few typical case
studies and the test results confirm that the middleware works as expected
- …