
    Fast Byte Copying: A Re-Evaluation of the Opportunities for Optimization

    High-performance byte copying is important for many operating systems because it is the principal method used for transferring data between kernel and user protection domains. For example, byte copying is commonly used for transferring data from kernel buffers to user buffers during file system read and IPC recv calls, and to kernel buffers from user buffers during write and send calls. Because of its impact on overall system performance, commercial operating systems tend to employ many specialized byte copy routines, each one optimized for a different circumstance. This paper revisits the opportunities for optimizing byte copy performance by discussing a series of experiments run under HP-UX 9.03 on a range of Hewlett-Packard PA-RISC processors. First, we compare the performance improvements that result from several existing byte copy optimizations. Then we show that byte copy performance is dominated by cache effects that arise when source and target addresses overlap. Finally, we discuss the opportunities and difficulties associated with choosing appropriate source and target addresses to optimize byte copy performance.
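    The abstract's central observation is that copy throughput depends on whether the source and target regions map onto the same cache lines. As a hedged illustration only (not the paper's actual benchmark), the C sketch below times a copy loop at two source/target distances: one that collides in a direct-mapped cache of an assumed size and one that does not. The cache size, copy size, and iteration count are assumptions chosen for illustration.

        /* Hypothetical sketch: time a byte copy at two source/target distances to
         * expose cache conflict effects on a direct-mapped (or low-associativity)
         * cache. Cache size, copy size, and iteration count are assumptions. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <time.h>

        #define ASSUMED_CACHE_BYTES (256 * 1024)  /* assumed data cache size */
        #define COPY_BYTES          (64 * 1024)   /* bytes moved per copy */
        #define ITERATIONS          10000

        int main(void)
        {
            /* One arena so source and target can be placed at controlled offsets. */
            char *arena = malloc(4 * ASSUMED_CACHE_BYTES);
            if (!arena)
                return 1;
            memset(arena, 0xAB, 4 * ASSUMED_CACHE_BYTES);

            /* An offset equal to the cache size makes source and target share cache
             * indices; half the cache size keeps their indices disjoint. */
            size_t offsets[] = { ASSUMED_CACHE_BYTES, ASSUMED_CACHE_BYTES / 2 };

            for (int i = 0; i < 2; i++) {
                char *src = arena;
                char *dst = arena + offsets[i];

                clock_t start = clock();
                for (int it = 0; it < ITERATIONS; it++)
                    memcpy(dst, src, COPY_BYTES);
                double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

                printf("offset %zu bytes: %.1f MB/s\n", offsets[i],
                       (double)COPY_BYTES * ITERATIONS / (1024.0 * 1024.0) / secs);
            }
            free(arena);
            return 0;
        }

    On a machine whose data cache behaves like the assumed one, the colliding offset should show noticeably lower throughput, which is the kind of effect the paper attributes to overlapping source and target addresses.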

    Reducing consistency traffic and cache misses in the avalanche multiprocessor

    Journal Article: For a parallel architecture to scale effectively, communication latency between processors must be avoided. We have found that the source of a large number of avoidable cache misses is the use of hardwired write-invalidate coherency protocols, which often exhibit high cache miss rates due to excessive invalidations and subsequent reloading of shared data. In the Avalanche project at the University of Utah, we are building a 64-node multiprocessor designed to reduce the end-to-end communication latency of both shared memory and message passing programs. As part of our design efforts, we are evaluating the potential performance benefits and implementation complexity of providing hardware support for multiple coherency protocols. Using a detailed architecture simulation of Avalanche, we have found that support for multiple consistency protocols can reduce the time parallel applications spend stalled on memory operations by up to 66% and overall execution time by up to 31%. Most of this reduction in memory stall time is due to a novel release-consistent multiple-writer write-update protocol implemented using a write state buffer.
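    The claimed benefit hinges on a simple traffic trade-off: write-invalidate forces a sharer to reload a line after every remote write, while write-update refreshes the sharer's copy at the cost of an update message per write. The toy C model below is a rough illustration only (not the Avalanche simulator; the producer/consumer pattern and single-line model are assumptions) and counts misses and coherence messages for one shared line under each policy.

        /* Hypothetical toy model: count misses and coherence messages for a single
         * shared cache line under idealized write-invalidate vs. write-update,
         * for a repeated producer (cpu 0) / consumer (cpu 1) sharing pattern. */
        #include <stdio.h>
        #include <stdbool.h>

        enum protocol { WRITE_INVALIDATE, WRITE_UPDATE };

        struct stats { long misses; long coherence_msgs; };

        static struct stats run(enum protocol p, long rounds)
        {
            struct stats s = { 0, 0 };
            bool valid[2] = { false, false };   /* does each cache hold the line? */

            for (long r = 0; r < rounds; r++) {
                /* cpu 0 writes the shared line */
                if (!valid[0]) { s.misses++; valid[0] = true; }
                if (valid[1]) {
                    s.coherence_msgs++;             /* invalidate or update message */
                    if (p == WRITE_INVALIDATE)
                        valid[1] = false;           /* consumer loses its copy ... */
                    /* ... under write-update the copy is refreshed in place */
                }
                /* cpu 1 reads the shared line */
                if (!valid[1]) { s.misses++; valid[1] = true; }
            }
            return s;
        }

        int main(void)
        {
            struct stats inv = run(WRITE_INVALIDATE, 1000000);
            struct stats upd = run(WRITE_UPDATE, 1000000);
            printf("write-invalidate: %ld misses, %ld coherence messages\n",
                   inv.misses, inv.coherence_msgs);
            printf("write-update:     %ld misses, %ld coherence messages\n",
                   upd.misses, upd.coherence_msgs);
            return 0;
        }

    In this idealized pattern the invalidate policy pays one miss per round to reload the invalidated line, while the update policy misses only on the first round; the paper's contribution is making a write-update (and multiple-writer, release-consistent) option available in hardware alongside other protocols.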

    Analysis of Avalanche's shared memory architecture

    Technical Report: In this paper, we describe the design of the Avalanche multiprocessor's shared memory subsystem, evaluate its performance, and discuss problems associated with using commodity workstations and network interconnects as the building blocks of a scalable shared memory multiprocessor. Compared to other scalable shared memory architectures, Avalanche has a number of novel features, including its support for the Simple COMA memory architecture and its support for multiple coherency protocols (migratory, delayed write update, and (soon) write invalidate). We describe the performance implications of Avalanche's architecture and the impact of various novel low-level design options, and we report a number of interesting phenomena we encountered while developing a scalable multiprocessor built on the HP PA-RISC platform.

    A study of workstation computational performance for real-time flight simulation

    With recent advances in microprocessor technology, some have suggested that modern workstations provide enough computational power to properly operate a real-time simulation. This paper presents the results of a computational benchmark, based on actual real-time flight simulation code used at Langley Research Center, which was executed on various workstation-class machines. The benchmark was executed on different machines from several companies including: CONVEX Computer Corporation, Cray Research, Digital Equipment Corporation, Hewlett-Packard, Intel, International Business Machines, Silicon Graphics, and Sun Microsystems. The machines are compared by their execution speed, computational accuracy, and porting effort. The results of this study show that the raw computational power needed for real-time simulation is now offered by workstations.

    Message passing support in the Avalanche Widget

    Journal Article: Minimizing communication latency in message passing multiprocessing systems is critical. An emerging problem in these systems is the latency cost of percolating a message through the memory hierarchy (at both the sending and receiving nodes) and the additional cost of managing consistency within that hierarchy. This paper considers three important aspects of these costs: cache coherence, message copying, and cache miss rates. It then shows, via a simulation study, how a design called the Widget can be used with existing commercial workstation technology to significantly reduce these costs and support efficient message passing in the Avalanche multiprocessing system.