281 research outputs found

Lattice QCD on a Beowulf Cluster

Author: Kim Seyong
Publication venue: 'Elsevier BV'
Publication date: 01/01/1999

Using commodity personal computers based on the Alpha processor, commodity network devices, and a switch, we built an 8-node parallel computer. GNU/Linux is chosen as the operating system, and message-passing libraries such as PVM, LAM, and MPICH have been tested as parallel programming environments. We discuss our lattice QCD project for a heavy quark system on this computer.

Comment: Lattice99 (algorithms and machines), 3 pages, 3 figures, espcrc2.st

arXiv.org e-Print Archive

CiteSeerX

Crossref

CERN Document Server

MPICH-G2: A Grid-Enabled Implementation of the Message Passing Interface

Author: Foster I.
Karonis N. T.
Toonen B.
Publication date: 01/01/2002

Application development for distributed computing "Grids" can benefit from tools that variously hide or enable application-level management of critical aspects of the heterogeneous environment. As part of an investigation of these issues, we have developed MPICH-G2, a Grid-enabled implementation of the Message Passing Interface (MPI) that allows a user to run MPI programs across multiple computers, at the same or different sites, using the same commands that would be used on a parallel computer. This library extends the Argonne MPICH implementation of MPI to use services provided by the Globus Toolkit for authentication, authorization, resource allocation, executable staging, and I/O, as well as for process creation, monitoring, and control. Various performance-critical operations, including startup and collective operations, are configured to exploit network topology information. The library also exploits MPI constructs for performance management; for example, the MPI communicator construct is used for application-level discovery of, and adaptation to, both network topology and network quality-of-service mechanisms. We describe the MPICH-G2 design and implementation, present performance results, and review application experiences, including record-setting distributed simulations.

Comment: 20 pages, 8 figures

arXiv.org e-Print Archive

CiteSeerX

Implementation of MPICH on top of MP_Lite

Author: Selvarajan Shoba
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2002

The goal of this thesis is to develop a new Channel Interface device for the MPICH implementation of the MPI (Message Passing Interface) standard using MP_Lite. MP_Lite is a lightweight message-passing library that is not a full MPI implementation, but offers high performance. MPICH (Message Passing Interface CHameleon) is a full implementation of the MPI standard that has the p4 library as the underlying communication device for TCP/IP networks. By integrating MP_Lite as a Channel Interface device in MPICH, a parallel programmer can utilize the full MPI implementation of MPICH as well as the high bandwidth offered by MP_Lite. There are several layers in the MPICH library where one can tie a new device. The Channel Interface is the lowest layer that requires very few functions to add a new device. By attaching MP_Lite to MPICH at the lowest level, the Channel Interface, almost all of the performance of the MP_Lite library can be delivered to the applications using MPICH. MP_Lite can be implemented either as a blocking or a non-blocking Channel Interface device. The performance was measured on two separate test clusters, the PC and the Alpha mini-clusters, having Gigabit Ethernet connections. The PC cluster has two 1.8 GHz Pentium 4 PCs and the Alpha cluster has two 500 MHz Compaq DS20 workstations. Different network interface cards like Netgear, TrendNet and SysKonnect Gigabit Ethernet cards were used for the measurements. Both the blocking and non-blocking MPICH-MP_Lite Channel Interface devices perform close to raw TCP, whereas a performance loss of 25-30% is seen in the MPICH-p4 Channel Interface device for larger messages. The superior performance offered by the MPICH-MP_Lite device compared to the MPICH-p4 device can be easily seen on the SysKonnect cards using jumbo frames. The throughput curve also improves considerably by increasing the Eager/Rendezvous threshold.

Digital Repository @ Iowa State University (ISU)

Minimalist's Linux Cluster

Author: Chang-Yeong Choi
Jeong-Hyun Kim
Seyong Kim
Publication venue: 'Elsevier BV'
Publication date: 23/11/2003

Using barebone PC components and NICs, we construct a Linux cluster which has a 2-dimensional mesh structure. This cluster has a smaller footprint, is less expensive, and uses less power than a conventional Linux cluster. Here, we report our experience in building such a machine and discuss our current lattice project on the machine.

Comment: 3 pages, 2 figures, Proceedings of the Lattice 03 Conference (Tsukuba, Japan)

arXiv.org e-Print Archive

Crossref

CERN Document Server

Speeding up parallel GROMACS on high-latency networks

Author: de Groot B.
Fechner M.
Grubmueller H.
Kutzner C.
Lindahl E.
Schmitt U.
van der Spoel D.
Publication date: 01/01/2007

MPG.PuRe

Process migration for MPI applications based on coordinated checkpoint

Author: Cao J
Guo M
Li Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/12/2014

PolyU Institutional Repository

User-Friendly Parallel Computations with Econometric Examples

Author: Michael Creel

This paper shows how a high-level matrix programming language may be used to perform Monte Carlo simulation, bootstrapping, estimation by maximum likelihood and GMM, and kernel regression in parallel on symmetric multiprocessor computers or clusters of workstations. The implementation of parallelization is done in a way such that an investigator may use the programs without any knowledge of parallel programming. A bootable CD that allows rapid creation of a cluster for parallel computing is introduced. Examples show that parallelization can lead to important reductions in computational time. Detailed discussion of how the Monte Carlo problem was parallelized is included as an example for learning to write parallel programs for Octave.

Keywords: parallel computing, Monte Carlo, bootstrapping, maximum likelihood, GMM, kernel regression

Research Papers in Economics

Scalable group-based checkpoint/restart for large-scale message-passing systems

Author: Ho JCY
Lau FCM
Wang CL
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

The ever-increasing number of processors used in parallel computers is making fault tolerance support in large-scale parallel systems more and more important. We discuss the inadequacies of existing system-level checkpointing solutions for message-passing applications as the system scales up. We analyze the coordination cost and blocking behavior of two current MPI implementations with checkpointing support. A group-based solution combining coordinated checkpointing and message logging is then proposed. Experimental results demonstrate that it performs and scales better than LAM/MPI and MPICH-VCL. To assist group formation, a method to analyze the communication behaviors of the application is proposed. © 2008 IEEE.

HKU Scholars Hub

Components and Interfaces of a Process Management System for Parallel Programs

Author: Butler Ralph
Gropp William
Lusk Ewing
Publication date: 01/01/2001

Parallel jobs are different from sequential jobs and require a different type of process management. We present here a process management system for parallel programs such as those written using MPI. A primary goal of the system, which we call MPD (for multipurpose daemon), is to be scalable. By this we mean that startup of interactive parallel jobs comprising thousands of processes is quick, that signals can be quickly delivered to processes, and that stdin, stdout, and stderr are managed intuitively. Our primary target is parallel machines made up of clusters of SMPs, but the system is also useful in more tightly integrated environments. We describe how MPD enables much faster startup and better runtime management of parallel jobs. We show how close control of stdio can support the easy implementation of a number of convenient system utilities, even a parallel debugger. We describe a simple but general interface that can be used to separate any process manager from a parallel library, which we use to keep MPD separate from MPICH.

Comment: 12 pages, Workshop on Clusters and Computational Grids for Scientific Computing, Sept. 24-27, 2000, Le Chateau de Faverges de la Tour, France

arXiv.org e-Print Archive

CiteSeerX

UNT Digital Library