Reducing Host Load, Network Load and Latency in a Distributed Shared Memory
Mether is a Distributed Shared Memory (DSM) that runs on Sun workstations under the SunOS 4.0 operating system. User programs access the Mether address space in a way indistinguishable from other memory. Mether had a number of performance problems that we had also seen on a distributed shared memory called MemNet[2]. In this paper we discuss changes we made to Mether, and protocols we developed to use Mether, that minimize host load, network load, and latency. An interesting (and unexpected) result was that, for one problem we studied, the best protocol for Mether is identical to the best protocol for MemNet[6].
The changes to Mether involve exposing an inconsistent store to the application and making access to the consistent and inconsistent versions very convenient; providing both demand-driven and data-driven semantics for updating pages; and allowing the user to specify that only a small subset of a page need be transferred. All of these operations are encoded in a few address bits in the Mether virtual address.
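As a rough illustration of how a few address bits could select among these operations, the sketch below packs three mode flags into the upper bits of an address. The abstract does not give the actual layout, so the bit positions, the 8 KiB page size, and every name here (`INCONSISTENT_BIT`, `DATA_DRIVEN_BIT`, `SHORT_PAGE_BIT`, `encode`, `decode`) are assumptions made for the example, not Mether's real encoding.

```python
from dataclasses import dataclass

# Hypothetical layout: the abstract says only that "a few address bits"
# carry the mode. Positions, names, and page size below are assumptions.
PAGE_SHIFT = 13                      # 8 KiB pages, assumed
OFFSET_MASK = (1 << PAGE_SHIFT) - 1
INCONSISTENT_BIT = 1 << 28           # use the possibly stale local copy
DATA_DRIVEN_BIT = 1 << 29            # wait for a pushed update instead of demanding one
SHORT_PAGE_BIT = 1 << 30             # transfer only a small subset of the page
MODE_MASK = INCONSISTENT_BIT | DATA_DRIVEN_BIT | SHORT_PAGE_BIT


@dataclass(frozen=True)
class Access:
    page: int
    offset: int
    inconsistent: bool
    data_driven: bool
    short_page: bool


def encode(page: int, offset: int, *, inconsistent: bool = False,
           data_driven: bool = False, short_page: bool = False) -> int:
    """Build an address whose upper bits select the access mode."""
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset does not fit in a page")
    base = (page << PAGE_SHIFT) | offset
    if page < 0 or base >= INCONSISTENT_BIT:
        raise ValueError("page number collides with the mode bits")
    if inconsistent:
        base |= INCONSISTENT_BIT
    if data_driven:
        base |= DATA_DRIVEN_BIT
    if short_page:
        base |= SHORT_PAGE_BIT
    return base


def decode(addr: int) -> Access:
    """Split an address back into page, offset, and mode flags."""
    base = addr & ~MODE_MASK
    return Access(
        page=base >> PAGE_SHIFT,
        offset=base & OFFSET_MASK,
        inconsistent=bool(addr & INCONSISTENT_BIT),
        data_driven=bool(addr & DATA_DRIVEN_BIT),
        short_page=bool(addr & SHORT_PAGE_BIT),
    )
```

Under this assumed scheme, the same page and offset are reachable through several addresses that differ only in the mode bits, which is one way an application could switch between the consistent and inconsistent views without a separate system call.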
The Mether System: Distributed Shared Memory for SunOS 4.0
Mether is a Distributed Shared Memory (DSM) that runs on Sun workstations under the SunOS 4.0 operating system. User programs access the Mether address space in a way indistinguishable from other memory. Mether was inspired by the MemNet DSM, but unlike MemNet, Mether consists of software communicating over a conventional Ethernet. The kernel part of Mether actually does no data transmission over the network. Data transmission is accomplished by a user-level server. The kernel driver has no preference for a server, and indeed does not know that servers exist. The kernel driver has been made very safe, and in fact panic is not in its dictionary.
A dynamic kernel modifier for linux
Dynamic Kernel Modifier, or DKM, is a kernel module for Linux that allows user-mode programs to modify the execution of functions in the kernel without recompiling or modifying the kernel source in any way. Functions may be traced, either function entry only or function entry and exit; nullified; or replaced with some other function. For the tracing case, function execution results in the activation of a watchpoint. When the watchpoint is activated, the address of the function is logged in a FIFO buffer that is readable by external applications. The watchpoints are time-stamped with the resolution of the processor high resolution timers, which on most modern processors are accurate to a single processor tick. DKM is very similar to earlier systems such as the SunOS trace device or Linux TT. Unlike these two systems, and other similar systems, DKM requires no kernel modifications. DKM allows users to do initial probing of the kernel to look for performance problems, or even to resolve potential problems by turning functions off or replacing them. DKM watchpoints are not without cost: it takes about 200 nanoseconds to make a log entry on an 800 MHz Pentium III. The overhead numbers are actually competitive with other hardware-based trace systems, although DKM has less accuracy than an In-Circuit Emulator such as the American Arium. Once the user has zeroed in on a problem, other mechanisms with a higher degree of accuracy can be used. (Los Alamos National Laboratory is operated by the University of California for the National Nuclear Security Administration of the United States Department of Energy under contract W-7405-ENG-36.)
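The logging path described in this abstract (a watchpoint fires, a time-stamped function address goes into a FIFO, an external reader drains it) can be modeled in user space as follows. This is only a sketch: the class and method names are hypothetical, `time.perf_counter_ns` stands in for the processor's high-resolution timer, and dropping the oldest entry when the buffer is full is an assumption, since the abstract does not say how DKM handles overflow.

```python
import time
from collections import deque
from typing import Callable, List, NamedTuple


class Entry(NamedTuple):
    timestamp_ns: int
    address: int
    event: str  # "entry" or "exit"


class TraceLog:
    """User-space model of a time-stamped FIFO trace buffer (hypothetical API)."""

    def __init__(self, capacity: int,
                 clock: Callable[[], int] = time.perf_counter_ns) -> None:
        # A bounded deque discards the oldest entry on overflow (assumed policy).
        self._fifo = deque(maxlen=capacity)
        self._clock = clock

    def watchpoint(self, address: int, event: str = "entry") -> None:
        """Record that the function at `address` was entered or exited."""
        self._fifo.append(Entry(self._clock(), address, event))

    def drain(self) -> List[Entry]:
        """Return all buffered entries, oldest first, and empty the buffer."""
        out = []
        while self._fifo:
            out.append(self._fifo.popleft())
        return out
```

A reader would call `drain()` periodically and pair each "entry" record with the following "exit" record for the same address to estimate time spent in a function.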
Architecture and Performance of the Mether Network Shared Memory
Mether is a Network Shared Memory (NSM). It allows applications on autonomous computers connected by a network to share a segment of memory.
NSMs offer the attraction of a simple abstraction for shared state, i.e., shared memory. NSMs have a potential performance problem in the cost of remote references, which is typically solved by grouping memory into larger units such as pages, and caching pages. While Mether employs grouping and caching to reduce the average memory reference delay, it also removes the need for many remote references (page faults) by providing a facility with relaxed consistency requirements.
Applications ported from a multiprocessor supercomputer with shared memory to a 16-workstation Mether configuration showed a cost/performance advantage of over 300 in favor of the Mether system. While Mether is currently implemented for Sun-3 and Sun-4 systems connected via Ethernet, other characteristics (such as a choice of page sizes and a semaphore-like access mode useful for process synchronization) should suit it to a wide variety of networks. A reimplementation for an alternate configuration employing packet-switched networks is in progress.
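The trade described above, accepting possibly stale data in exchange for fewer remote references, can be shown with a toy model. This is not Mether's protocol: `Owner`, `PageCache`, and both read methods are hypothetical names, and the consistent read is simplified to always contact the owning host, where a real system would use ownership transfer or invalidation.

```python
class Owner:
    """Stands in for the remote host that holds the current page contents."""

    def __init__(self) -> None:
        self.pages = {}
        self.fetches = 0  # counts remote references (page faults)

    def fetch(self, page_no: int) -> bytes:
        self.fetches += 1
        return self.pages.get(page_no, b"")


class PageCache:
    """Local cache offering a consistent and a relaxed (inconsistent) read."""

    def __init__(self, owner: Owner) -> None:
        self.owner = owner
        self.local = {}

    def read_consistent(self, page_no: int) -> bytes:
        # Simplification: always pay for a remote reference to get fresh data.
        data = self.owner.fetch(page_no)
        self.local[page_no] = data
        return data

    def read_inconsistent(self, page_no: int) -> bytes:
        # Relaxed view: reuse any cached copy, even if the owner has moved on.
        if page_no not in self.local:
            self.local[page_no] = self.owner.fetch(page_no)
        return self.local[page_no]
```

In this model a polling loop that calls `read_inconsistent` generates one remote reference no matter how often it runs, while the same loop using `read_consistent` generates one per iteration.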
Approaches for scalable modeling and emulation of cyber systems : LDRD final report.
The goal of this research was to combine theoretical and computational approaches to better understand the potential emergent behaviors of large-scale cyber systems, such as networks of approximately 10^6 computers. The scale and sophistication of modern computer software, hardware, and deployed networked systems have significantly exceeded the computational research community's ability to understand, model, and predict current and future behaviors. This predictive understanding, however, is critical to the development of new approaches for proactively designing new systems or enhancing existing systems with robustness to current and future cyber threats, including distributed malware such as botnets. We have developed preliminary theoretical and modeling capabilities that can ultimately answer questions such as: How would we reboot the Internet if it were taken down? Can we change network protocols to make them more secure without disrupting existing Internet connectivity and traffic flow? We have begun to address these issues by developing new capabilities for understanding and modeling Internet systems at scale. Specifically, we have addressed the need for scalable network simulation by carrying out emulations of a network with approximately 10^6 virtualized operating system instances on a high-performance computing cluster, a 'virtual Internet'. We have also explored mappings between previously studied emergent behaviors of complex systems and their potential cyber counterparts. Our results provide foundational capabilities for further research toward understanding the effects of complexity in cyber systems, to allow anticipating and thwarting hackers.
Mether: A memory system for network multiprocessors
Memory is an attractive network abstraction for distributed computing systems. This thesis presents the following new results: (1) A high-performance implementation of a Network Shared Memory (NSM) which does not require broadcast capabilities and can be implemented in hardware. (2) A new semantics for shared memory consistency which allows effective interprocess communication (IPC) via sharing while minimizing performance degradation due to synchronization. (3) A detailed performance study of applications which have been ported from shared memory processors to NSM. The results of the research are embodied in Mether 3.0, which serves as an applications platform for such diverse tasks as a Gaussian elimination program for sparse linear systems, a DNA pattern matcher, and network IPC via pipes.