Network virtual memory
User-mode access, zero-copy transfer, and sender-managed communication have emerged as
essential for improving communication performance in workstation and PC clusters. The goal of
these techniques is to provide application-level DMA to remote memory. Achieving this goal is
difficult, however, because the network interface accesses physical rather than virtual memory.
As a result, previous systems have confined source and destination data to pages in pinned
physical memory. Unfortunately, this approach increases application complexity and reduces
memory-management effectiveness.
This thesis describes the design and implementation of NetVM, which is a network interface
that supports user-mode access, zero-copy transfer and sender-managed communication without
pinning source or destination memory. To do this, the network interface maintains a
shadow page table, which the host operating system updates whenever it maps or unmaps a
page in host memory. The network interface uses this table to briefly lock and translate the
virtual address of a page when it accesses that page for a DMA transfer. The operating system
cannot replace a page during the short interval in which the network interface holds its lock.
If a destination page is not resident in memory, the network interface redirects the
data to an intermediate system buffer, which the operating system uses to complete the transfer
with a single host-to-host memory copy after fetching the required page. A credit-based
flow-control scheme prevents the system buffer from overflowing.
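To make the paging interaction concrete, the following C sketch shows the kind of lookup the network interface might perform against the shadow page table before starting a DMA. It is a minimal single-host illustration under stated assumptions, not the actual LANai firmware; all names (spt_entry, spt_lock_translate, SPT_RESIDENT) and the hash-indexed table layout are hypothetical.

    /*
     * Sketch of a NIC-side shadow page table lookup before a DMA.
     * The host OS updates this table on every map or unmap.
     */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12
    #define SPT_SLOTS  (1u << 16)        /* hypothetical table size */

    typedef struct {
        _Atomic uint32_t lock;           /* held only while DMA touches the page */
        uint32_t flags;                  /* SPT_RESIDENT, etc. */
        uint64_t frame;                  /* physical frame number */
    } spt_entry;

    enum { SPT_RESIDENT = 1u << 0 };

    static spt_entry shadow_pt[SPT_SLOTS];

    /* Lock and translate one page; on false, the caller redirects the
     * data to the intermediate system buffer instead. */
    bool spt_lock_translate(uint64_t vaddr, uint64_t *paddr)
    {
        spt_entry *e = &shadow_pt[(vaddr >> PAGE_SHIFT) % SPT_SLOTS];
        uint32_t free = 0;
        while (!atomic_compare_exchange_weak(&e->lock, &free, 1))
            free = 0;                    /* spin briefly: locks are short */
        if (!(e->flags & SPT_RESIDENT)) {
            atomic_store(&e->lock, 0);
            return false;                /* page not resident */
        }
        *paddr = (e->frame << PAGE_SHIFT) |
                 (vaddr & ((1u << PAGE_SHIFT) - 1));
        return true;                     /* OS must not replace this page
                                            until spt_unlock() runs */
    }

    void spt_unlock(uint64_t vaddr)
    {
        atomic_store(&shadow_pt[(vaddr >> PAGE_SHIFT) % SPT_SLOTS].lock, 0);
    }

The brief per-entry lock is what allows the operating system to treat a locked page as temporarily unreplaceable, while the false return path models the redirect into the intermediate system buffer.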
Application-level DMA transfers only data. To support control transfers, NetVM implements a
counter-based notification mechanism for applications to issue and detect notifications. The
sending application increments an event counter by specifying its identifier in an RDMA write
operation. The receiving application detects the event by busy waiting, block waiting, or
triggering a user-defined handler whenever the notifying write completes. This range of detection
mechanisms lets the application choose an appropriate tradeoff between signaling latency and
processor overhead. NetVM enforces ordered notifications over an out-of-order
delivery network by using a sequence window.
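The sequence-window idea can be illustrated with a short C sketch. This is a simplified reconstruction, not NetVM's firmware: notif_counter, WINDOW, and both function names are hypothetical, and only the busy-wait detection variant is shown.

    /*
     * Sketch of counter-based notification with a sequence window that
     * restores order over an out-of-order network.
     */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define WINDOW 64   /* assumed: flow control keeps senders within this */

    typedef struct {
        _Atomic uint64_t count;      /* events visible to the application */
        uint64_t next_seq;           /* next in-order sequence number */
        bool arrived[WINDOW];        /* out-of-order arrivals, pending */
    } notif_counter;

    /* NIC side: a notifying RDMA write with sequence number s completed.
     * The application-visible counter advances only in sequence order. */
    void notify_arrive(notif_counter *c, uint64_t s)
    {
        c->arrived[s % WINDOW] = true;
        while (c->arrived[c->next_seq % WINDOW]) {
            c->arrived[c->next_seq % WINDOW] = false;
            c->next_seq++;
            atomic_fetch_add(&c->count, 1);
        }
    }

    /* Application side: busy-wait detection, the lowest-latency option. */
    void notify_wait(notif_counter *c, uint64_t target)
    {
        while (atomic_load(&c->count) < target)
            ;                        /* spin until enough events arrive */
    }

The sketch assumes the sender never runs more than WINDOW sequence numbers ahead of the receiver, a bound that credit-based flow control of the kind described above would enforce.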
NetVM supports efficient mutual-exclusion, wait-queue and semaphore synchronization implementations.
It augments the network interface with low-overhead atomic-operation primitives that provide
scalable, MCS-lock-inspired high-level synchronization for applications. As a result, these
operations complete with lower latency and fewer network transactions than traditional
implementations.
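For reference, the classic MCS queue lock that inspired this design looks roughly as follows in C. The sketch uses local stdatomic operations where NetVM would issue NIC-resident atomics over the network; each waiter spins only on its own queue node, which is what makes the scheme scalable.

    /* Classic MCS queue lock, shown with local atomics standing in for
     * the NIC-resident atomic primitives the thesis describes. */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>

    typedef struct mcs_node {
        _Atomic(struct mcs_node *) next;
        _Atomic bool locked;
    } mcs_node;

    typedef struct {
        _Atomic(mcs_node *) tail;
    } mcs_lock;

    void mcs_acquire(mcs_lock *l, mcs_node *me)
    {
        atomic_store(&me->next, NULL);
        atomic_store(&me->locked, true);
        mcs_node *prev = atomic_exchange(&l->tail, me); /* one atomic op */
        if (prev) {
            atomic_store(&prev->next, me);
            while (atomic_load(&me->locked))
                ;                        /* spin on our own node only */
        }
    }

    void mcs_release(mcs_lock *l, mcs_node *me)
    {
        mcs_node *succ = atomic_load(&me->next);
        if (!succ) {
            mcs_node *expected = me;
            if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
                return;                  /* no waiter: lock is now free */
            while (!(succ = atomic_load(&me->next)))
                ;                        /* waiter is still enqueuing */
        }
        atomic_store(&succ->locked, false); /* hand off to successor */
    }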
The NetVM prototype is implemented in firmware for the Myrinet LANai-9.2 and integrated with
the FreeBSD 4.6 virtual memory system. NetVM's memory-management overhead is low: it adds
less than 5.0% to write latency compared with a static pinning approach, and its pinning
cost is lower than that of a dynamic pinning approach with up to a 94.5% hit rate in the
pinned-page cache. Minimum write latency is 5.56 µs and maximum throughput is 155.46 MB/s, which
is 97.2% of the link bandwidth. Transferring control through notification adds between 2.96 µs
and 17.49 µs to the write operation, depending on the detection mechanism used. Compared with
standard low-level atomic operations, NetVM adds at most 18.2% and 12.6% to application
latencies for high-level wait-queue and counting-semaphore operations, respectively.
Using Embedded Network Processors to Implement Global Memory Management in a Workstation Cluster
Advances in network technology continue to improve the communication performance of workstation and PC clusters, making high-performance workstation-cluster computing increasingly viable. These hardware advances, however, are taxing traditional host-software network protocols to the breaking point. A modern gigabit network can swamp a host's I/O bus and processor, limiting communication performance and slowing computation unacceptably. Fortunately, the host-programmable network processors used by these networks present a potential solution. Offloading selected host processing to these embedded network processors lowers host overhead and improves latency. This paper examines the use of embedded network processors to improve the performance of workstation-cluster global memory management. We have implemented a revised version of the GMS global memory system that eliminates host overhead by as much as 29% on active nodes and improves page-fault latency by as much as 39%.
Using Idle Workstations to Implement Predictive Prefetching
The benefits of Markov-based predictive prefetching have been largely overshadowed by the overhead required to produce high-quality predictions. While both theoretical and simulation results for prediction algorithms appear promising, substantial limitations exist in practice. This outcome can be partially attributed to the fact that practical implementations ultimately make compromises in order to reduce overhead. These compromises limit the level of algorithm complexity, the variety of access patterns, and the granularity of trace data the implementation supports. This paper describes the design and implementation of GMS-3P, an operating-system kernel extension that offloads prediction overhead to idle network nodes. GMS-3P builds on the GMS global memory system, which pages to and from remote workstation memory. In GMS-3P, the target node sends an on-line trace of an application's page faults to an idle node that is running a Markov-based prediction algorithm. The prediction node then uses GMS to prefetch pages to the target node from the memory of other workstations in the network. Our preliminary results show that predictive prefetching can reduce remote-memory page-fault time by 60% or more, and that by offloading prediction overhead to an idle node, GMS-3P can reduce this improved latency by between 24% and 44%, depending on Markov-model order.
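As an illustration of the kind of predictor the idle node might run, here is a minimal order-1 Markov model in C. It is a toy reconstruction: the fixed-size, hash-indexed transition table and the function names are hypothetical, and the paper's models also support higher orders.

    /* Toy order-1 Markov page predictor fed by a page-fault trace. */
    #include <stdint.h>

    #define NPAGES 4096                  /* hypothetical page-id space */
    #define FANOUT 4                     /* successors tracked per page */

    typedef struct {
        uint32_t succ[FANOUT];           /* observed successor pages */
        uint32_t count[FANOUT];          /* transition frequencies */
    } markov_row;

    static markov_row table[NPAGES];
    static uint32_t prev_page = UINT32_MAX;

    /* Feed one page-fault event from the target node's on-line trace. */
    void markov_update(uint32_t page)
    {
        if (prev_page != UINT32_MAX) {
            markov_row *r = &table[prev_page % NPAGES];
            int victim = 0;
            for (int i = 0; i < FANOUT; i++) {
                if (r->succ[i] == page) { r->count[i]++; goto done; }
                if (r->count[i] < r->count[victim]) victim = i;
            }
            r->succ[victim] = page;      /* evict least-used successor */
            r->count[victim] = 1;
        }
    done:
        prev_page = page;
    }

    /* Most likely next page after `page`, or UINT32_MAX if unknown; the
     * prediction node would ask GMS to prefetch the returned page into
     * the target node's memory. */
    uint32_t markov_predict(uint32_t page)
    {
        markov_row *r = &table[page % NPAGES];
        int best = 0;
        for (int i = 1; i < FANOUT; i++)
            if (r->count[i] > r->count[best]) best = i;
        return r->count[best] ? r->succ[best] : UINT32_MAX;
    }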