Approximate simulation of distributed-memory multithreaded multiprocessors
The performance of modern computer systems is increasingly limited by long latencies of accesses to their memory systems. Instruction-level multithreading is a technique to tolerate long latencies of memory accesses by switching from one instruction thread to another. The paper shows that the simulation-based performance evaluation of distributed-memory multithreaded multiprocessor systems can be significantly simplified by using approximate models, composed of only a few processors, but with some parameters adjusted to represent the behavior of the original system.
Asynchronous Graph Pattern Matching on Multiprocessor Systems
Pattern matching on large graphs is the foundation for a variety of
application domains. Strict latency requirements and continuously increasing
graph sizes demand the usage of highly parallel in-memory graph processing
engines that need to consider non-uniform memory access (NUMA) and concurrency
issues to scale up on modern multiprocessor systems. To tackle these aspects,
graph partitioning becomes increasingly important. Hence, we present a
technique to process graph pattern matching on NUMA systems in this paper. As a
scalable pattern matching processing infrastructure, we leverage a
data-oriented architecture that preserves data locality and minimizes
concurrency-related bottlenecks on NUMA systems. We show in detail how graph
pattern matching can be asynchronously processed on a multiprocessor system.
Comment: 14 Pages, Extended version for ADBIS 201
Validation of multiprocessor systems
Experiments that can be used to validate the fault-free performance of multiprocessor systems in aerospace systems integrating flight controls and avionics are discussed. Engineering prototypes for two fault-tolerant multiprocessors are tested.
Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 4: FTMP executive summary
The FTMP architecture is a high-reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP are operated in tight synchronism with one another, and hardware fault detection and fault masking are provided in a way that is transparent to the software. Operating system design and user software design are thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of the fault-handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine, and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts.
A bibliography on parallel and vector numerical algorithms
This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.
Performance Analysis of a Novel GPU Computation-to-core Mapping Scheme for Robust Facet Image Modeling
Though the GPGPU concept is well-known in image processing, much more work remains to be done to fully exploit GPUs as an alternative computation engine. This paper investigates the computation-to-core mapping strategies to probe the efficiency and scalability of the robust facet image modeling algorithm on GPUs. Our fine-grained computation-to-core mapping scheme shows a significant performance gain over the standard pixel-wise mapping scheme. With in-depth performance comparisons across the two different mapping schemes, we analyze the impact of the level of parallelism on the GPU computation and suggest two principles for optimizing future image processing applications on the GPU platform.