Many parallel applications require high-performance I/O to avoid negating some or all of the benefit derived from parallelizing their computation. When these applications run on a loosely coupled cluster of SMPs, the limitations of existing hardware and software present further hurdles to achieving high-performance I/O. In this paper, we describe our full implementation of the I/O portion of the MPI-2 specification. In particular, we discuss the limitations inherent in performing high-performance I/O on a cluster of SMPs and demonstrate the benefits of using a cluster-based filesystem over a traditional node-based filesystem.

1 Introduction

The MPI I/O interface enables efficient use of an underlying parallel file system by allowing parallel applications to describe complex I/O requests at a high level [1, 2]. For example, parallel applications in weather/climate modeling, seismic processing, computational fluid dynamics, and data mining can use such an interface to bett..