
30 research outputs found

Anatomy of a message in the Alewife multiprocessor

Author: Agarwal Anant
Kubiatowicz John
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

Shared memory provides a uniform and attractive mechanism for communication. For efficiency, it is often implemented with a layer of interpretive hardware on top of a message-passing communications network. This interpretive layer is responsible for data location, data movement, and cache coherence. It uses patterns of communication that benefit common programming styles, but which are only heuristics. This suggests that certain styles of communication may benefit from direct access to the underlying communications substrate. The Alewife machine, a shared-memory multiprocessor being built at MIT, provides such an interface. The interface is an integral part of the shared-memory implementation and affords direct, user-level access to the network queues, supports an efficient DMA mechanism, and includes fast trap handling for message reception. This paper discusses the design and implementation of the Alewife message-passing interface and addresses the issues and advantages of using such an interface to complement hardware-synthesized shared memory. Supported by the National Science Foundation (U.S.) (Grant MIP-9012773) and the United States Defense Advanced Research Projects Agency (Contract N00014-87-K-0825).

DSpace@MIT

Crossref

Message passing support in the Avalanche widget

Author: Stoller Leigh B.
Swanson Mark R.
Publication venue: University of Utah
Publication date: 01/01/1996

Journal Article. Minimizing communication latency in message-passing multiprocessing systems is critical. An emerging problem in these systems is the latency cost of percolating each message through the memory hierarchy (at both the sending and receiving nodes), plus the additional cost of managing consistency within the hierarchy. This paper considers three important aspects of these costs: cache coherence, message copying, and cache miss rates. It then shows via a simulation study how a design called the Widget can be used with existing commercial workstation technology to significantly reduce these costs and support efficient message passing in the Avalanche multiprocessing system.

The University of Utah: J. Willard Marriott Digital Library

Integrated shared-memory and message-passing communication in the Alewife multiprocessor

Author: Kubiatowicz John, 1964-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1998

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 237-246) and index. By John David Kubiatowicz. Ph.D.

DSpace@MIT

An evaluation of Fugu's network deadlock avoidance solution

Author: Lee Victor Wui-Keung
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1996

Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996. Includes bibliographical references (leaves 82-86). By Victor Lee. M.S.

DSpace@MIT

QuickStep, a system for performance monitoring and debugging parallel applications on the Alewife multiprocessor

Author: Mitra Sramana
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1995

Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (leaves 77-78). By Sramana Mitra. M.S.

DSpace@MIT

Mechanisms and interfaces for software-extended coherent shared memory

Author: Chaiken David Lars
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1994

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994. Includes bibliographical references (p. 140-146). By David L. Chaiken. Ph.D.

DSpace@MIT

Coherent network interfaces for fine-grain communication

Author: Falsafi Babak
Hill Mark D.
Mukherjee Shubhendu S.
Wood David A.
Publication date: 06/04/2009

Using coherence can improve performance by facilitating burst transfers of whole cache blocks and reducing control overheads. This paper describes an attempt to explore network interfaces that use coherence, i.e., coherent network interfaces (CNIs), to improve communication performance. First, it reports on the development and optimization of two mechanisms that CNIs use to communicate with processors. A taxonomy and comparison of four CNIs with a more conventional NI are then presented.

Infoscience - École polytechnique fédérale de Lausanne

Multigrain shared memory

Author: Yeung Donald, 1968-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1998

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 197-203). By Donald Yeung. Ph.D.

CiteSeerX

DSpace@MIT

Doctor of Philosophy

Author: Parker Michael Allen
Publication venue: University of Utah
Publication date: 01/08/2013

Dissertation. High-performance supercomputers on the Top500 list are commonly designed around commodity CPUs. Most of the codes executed on these machines are message-passing codes using the message-passing toolkit (MPI). Thus it makes sense to look at these machines from a holistic systems architecture perspective and consider optimizations to commodity processors that make them more efficient in message-passing architectures. Described herein is a new User-Level Notification (ULN) architecture that significantly improves message-passing performance. The architecture integrates a simultaneous multithreaded (SMT) processor with a user-level network interface (NI) that can directly control the execution scheduling of threads on the processor. By allowing the network interface to control the execution of message-handling code at the user level, the operating system (OS) overhead for handling interrupts and dispatching user code in response to notifications is eliminated. By using an SMT processor, message handling can be performed in one thread concurrently with user computation in other threads, so most of the overhead of executing message handlers can be hidden. This dissertation presents measurements showing that the OS overheads related to message passing are significant in modern architectures and describes a new architecture that significantly reduces these overheads. On a communication-intensive real-world application, the ULN architecture provides a 50.9% performance improvement over a more traditional OS-based NIC and a 5.29-31.9% improvement over a best-of-class user-level NIC due to the user-level notifications.

The University of Utah: J. Willard Marriott Digital Library

Reactive synchronization algorithms for multiprocessors

Author: Lim Beng-Hong
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1995

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (p. 157-162). By Beng-Hong Lim. Ph.D.

DSpace@MIT