5 research outputs found

    High performance communication subsystem for clustering standard high-volume servers using Gigabit Ethernet

    Get PDF
    This paper presents an efficient communication subsystem, DP-II, for clustering standard high-volume (SHV) servers using Gigabit Ethernet. The DP-II employs several lightweight messaging mechanisms to achieve low-latency and high-bandwidth communication. The test shows an 18.32 us single-trip latency and 72.8 MB/s bandwidth on a Gigabit Ethernet network for connecting two Dell PowerEdge 6300 Quad Xeon SMP servers running Linux. To improve the programmability of the DP-II communication subsystem, the development of DP-II was based on a concise yet powerful abstract communication model, Directed Point Model, which can be conveniently used to depict the inter-process communication pattern of a parallel task in the cluster environment. In addition, the API of DP-II preserves the syntax and semantics of traditional UNIX I/O operations, which make it easy to use.published_or_final_versio

    Doctor of Philosophy

    Get PDF
    dissertationHigh-performance supercomputers on the Top500 list are commonly designed around commodity CPUs. Most of the codes executed on these machines are message-passing codes using the message-passing toolkit (MPI). Thus it makes sense to look at these machines from a holistic systems architecture perspective and consider optimizations to commodity processors that make them more efficient in message-passing architectures. Described herein is a new User-Level Notification (ULN) architecture that significantly improves message-passing performance. The architecture integrates a simultaneous multithreaded (SMT) processor with a user-level network interface (NI) that can directly control the execution scheduling of threads on the processor. By allowing the network interface to control the execution of message handling code at the user level, the operating system (OS) related overhead for handling interrupts and user code dispatch related to notifications is eliminated. By using an SMT processor, message handling can be performed in one thread concurrent to user computation in other threads, thus most of the overhead of executing message handlers can be hidden. This dissertation presents measurements showing the OS overheads related to message-passing are significant in modern architectures and describes a new architecture that significantly reduces these overheads. On a communication-intensive real-world application, the ULN architecture provides a 50.9% performance improvement over a more traditional OS-based NIC and a 5.29-31.9% improvement over a best-of-class user-level NIC due to the user-level notifications

    Mechanisms for efficient, protected messaging

    Get PDF
    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.Includes bibliographical references (p. 143-149).by Whay Sing Lee.Ph.D

    Early Experience with Message-Passing on the SHRIMP Multicomputer

    No full text
    The SHRIMP multicomputer provides virtual memorymapped communication (VMMC), which supports protected, user-level message passing, allows user programs to perform their own buffer management, and separates data transfers from control transfers so that a data transfer can be done without the intervention of the receiving node CPU. An important question is whether such a mechanism can indeed deliver all of the available hardware performance to applications which use conventional message-passing libraries. This pape
    corecore