315 research outputs found

Double Standards: The Suppression of Abortion Protesters' Free Speech Rights

Author: Keleher Christopher P.
Publication venue: DePaul University
Publication date: 01/03/2002
Field of study

Via Sapientiae: The Institutional Repository at DePaul University

Improving fundamental stockpile management procedures

Author: Cameron D.
Keleher P.
Knijnikov M.
Publication venue: 'Sociological Research Online'
Publication date: 01/01/1998
Field of study

Coal Quality management and the control of the flow of coal through complex mining preparation and transport phases of standard mining operations has assumed greater importance over recent years. Considering the history of the Australian industry from 1970, it is significant to note the increase in production levels and the inferred increase in focus on quality control, both of which drive the management of product quality into a position of greater importance. Quality management is one fundamental of the industry coming under increased pressure

Research Online

Leveraging Transparent Data Distribution in OpenMP via User-Level Dynamic Page Migration

Author: C. Koelbel
P. Keleher
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems

Author: Cox A.L.
Dwarkadas S.
Keleher P.
Publication venue
Publication date: 20/10/2005
Field of study

TreadMarks is a distributed shared memory (DSM) system for standard Unix systems such as SunOS and Ultrix. This paper presents a performance evaluation of TreadMarks running on Ultrix using DECstation-5000/240's that are connected by a 100-Mbps switch-based ATM LAN and a 10-Mbps Ethernet. Our objective is to determine the efficiency of a user-level DSM implementation on commercially available workstations and operating systems. We achieved good speedups on the 8-processor ATM network for Jacobi (7.4), TSP (7.2), Quicksort (6.3), and ILINK (5.7). For a slightly modified version of Water from the SPLASH benchmark suite, we achieved only moderate speedups (4.0) due to the high communication and synchronization rate. Speedups decline on the 10-Mbps Ethernet (5.5 for Jacobi, 6.5 for TSP, 4.2 for Quicksort, 5.1 for ILINK, and 2.1 for Water), reflecting the bandwidth limitations of the Ethernet. These results support the contention that, with suitable networking technology, DSM is a viable technique for parallel computation on clusters of workstations. To achieve these speedups, TreadMarks goes to great lengths to reduce the amount of communication performed to maintain memory consistency. It uses a lazy implementation of release consistency, and it allows multiple concurrent writers to modify a page, reducing the impact of false sharing. Great care was taken to minimize communication overhead. In particular, on the ATM network, we used a standard low-level protocol, AAL3/4, bypassing the TCP/IP protocol stack. Unix communication overhead, however, remains the main obstacle in the way of better performance for programs like Water. Compared to the Unix communication overhead, memory management cost (both kernel and user level) is small and wire time is negligible

Infoscience - École polytechnique fédérale de Lausanne

Lazy Release Consistency for Software Distributed Shared Memory

Author: Cox A.L.
Keleher P.
Zwaenepoel Willy
Publication venue
Publication date: 20/10/2005
Field of study

Release consistency, a relaxed memory consistency model that reduces the impact of remote memory access latency in both software and hardware distributed shared memory, is considered. To reduce the number of messages and the amount of data exchanged for remote memory access, a lazy release consistency algorithm is introduced. It pulls modifications across the interconnect only when necessary. Trace-driven simulation using the SPLASH benchmarks indicates that lazy release consistency reduces both the number of messages and the amount of data transferred between processors. These reductions are especially significant for programs that exhibit false sharing and make extensive use of locks

Infoscience - École polytechnique fédérale de Lausanne

A Decision-Process Analysis of Implicit Coscheduling

Author: Baras John S.
Keleher P.
Poovendran R.
Publication venue
Publication date: 01/01/2000
Field of study

This paper presents a theoretical framework based on Bayesian decision theory for analyzing recently reported results on implicit coscheduling of parallel applications on clusters of workstations. Using probabilistic modeling, we show that the approach presented can be applied for processes with arbitrary communication mixes. We also note that our approach can be used for deciding the additional spin times in the case of spin-yield. Finally, we present arguments for the use of a different notion of fairness than assumed by prior work. International Conference on Parallel and Distributed Computing

CiteSeerX

Digital Repository at the University of Maryland

TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems

Author: Cox A.L.
Dwarkadas S.
Keleher P.
Zwaenepoel Willy
Publication venue
Publication date: 20/10/2005
Field of study

TreadMarks is a distributed shared memory (DSM) system for standard Unix systems such as SunOS and Ultrix. This paper presents a performance evaluation of TreadMarks running on Ultrix using DECstation-5000/240's that are connected by a 100-Mbps switch-based ATM LAN and a 10-Mbps Ethernet. Our objective is to determine the efficiency of a user-level DSM implementation on commercially available workstations and operating systems. We achieved good speedups on the 8-processor ATM network for Jacobi (7.4), TSP (7.2), Quicksort (6.3), and ILINK (5.7). For a slightly modified version of Water from the SPLASH benchmark suite, we achieved only moderate speedups (4.0) due to the high communication and synchronization rate. Speedups decline on the 10-Mbps Ethernet (5.5 for Jacobi, 6.5 for TSP, 4.2 for Quicksort, 5.1 for ILINK, and 2.1 for Water), reflecting the bandwidth limitations of the Ethernet. These results support the contention that, with suitable networking technology, DSM is a viable technique for parallel computation on clusters of workstations. To achieve these speedups, TreadMarks goes to great lengths to reduce the amount of communication performed to maintain memory consistency. It uses a lazy implementation of release consistency, and it allows multiple concurrent writers to modify a page, reducing the impact of false sharing. Great care was taken to minimize communication overhead. In particular, on the ATM network, we used a standard low-level protocol, AAL3/4, bypassing the TCP/IP protocol stack. Unix communication overhead, however, remains the main obstacle in the way of better performance for programs like Water. Compared to the Unix communication overhead, memory management cost (both kernel and user level) is small and wire time is negligible

Infoscience - École polytechnique fédérale de Lausanne

An Evaluation of Software Distributed Shared Memory for Next-Generation Processors and Networks

Author: Cox A.L.
Dwarkadas S.
Keleher P.
Zwaenepoel W.
Publication venue
Publication date: 20/10/2005
Field of study

We evaluate the effect of processor speed, network characteristics, and software overhead on the performance of release-consistent software distributed shared memory. We examine five different protocols for implementing release consistency: eager update, eager invalidate, lazy update, lazy invalidate, and a new protocol called lazy hybrid. This lazy hybrid protocol combines the benefits of both lazy update and lazy invalidate. Our simulations indicate that with the processors and networks that are becoming available, coarse-grained applications such as Jacobi and TSP perform well, more or less independent of the protocol used. Medium-grained applications, such as Water, can achieve good performance, but the choice of protocol is critical. For sixteen processors, the best protocol, lazy hybrid, performed more than three times better than the worst, the eager update. Fine-grained applications such as Cholesky achieve little speedup regardless of the protocol used because of the frequency of synchronization operations and the high latency involved. While the use of relaxed memory models, lazy implementations, and multiple-writer protocols has reduced the impact of false sharing, synchronization latency remains a serious problem for software distributed shared memory systems. These results suggest that future work on software DSMs should concentrate on reducing the amount of synchronization or its effect

Infoscience - École polytechnique fédérale de Lausanne

An Evaluation of Software Release-Consistent Protocols

Author: Cox A.L.
Dwarkadas S.
Keleher P.
Zwaenepoel W.
Publication venue: 'Elsevier BV'
Publication date: 17/10/2005
Field of study

This paper presents an evaluation of three software implementations of release consistency. Release consistent protocols allow data communication to be aggregated, and multiple writers to simultaneously modify a single page. We evaluated an eager invalidate protocol that enforces consistency when synchronization variables are released, a lazy invalidate protocol that enforces consistency when synchronization variables are acquired, and a lazy hybrid protocol that selectively uses update to reduce access misses. Our evaluation is based on implementations running on DECstation-5000/240s connected by an ATM LAN, and an execution driven simulator that allows us to vary network parameters. Our results show that the lazy protocols consistently outperform the eager protocol for all but one application, and that the lazy hybrid performs the best overall. However, the relative performance of the implementations is highly dependent on the relative speeds of the network, processor, and communication software. Lower bandwidths and high per byte software communication costs favor the lazy invalidate protocol, while high bandwidths and low per byte costs favor the hybrid. Performance of the eager protocol approaches that of the lazy protocols only when communication becomes essentially free

Infoscience - École polytechnique fédérale de Lausanne

Parallelization of General Linkage Analysis Problems

Author: Cottingham R.W.
Cox A.L.
Dwarkadas S.
Keleher P.
Schaffer A.A.
Zwaenepoel W.
Publication venue: 'S. Karger AG'
Publication date: 17/10/2005
Field of study

We describe a parallel implementation of a genetic linkage analysis program that achieves good speedups, even for analyses on a single pedigree and with a single starting recombination fraction vector. Our parallel implementation has been run on three different platforms: an Ethernet network of workstations, a higher-bandwidth Asynchronous Transfer Mode (ATM) network of workstations, and a shared-memory multiprocessor. The same program, written in a shared memory programming style, is used on all platforms. On the workstation networks, the hardware does not provide shared memory, so the program executes on a distributed shared memory system that implements shared memory in software. These three platforms represent different points on the price/performance scale. Ethernet networks are cheap and omnipresent. ATM networks are an emerging technology that offers higher bandwidth, and shared-memory multiprocessors offer the best performance because communication is implemented entirely by hardware. On 8 processors and for the longer runs, we achieve speedups between 3.5 and 5 on the Ethernet network and between 4.8 and 6 on the ATM network. On the shared-memory multiprocessor, we achieve speedups in the 5.5 to 6.5 range for all runs

Infoscience - École polytechnique fédérale de Lausanne