
    A Study of Client-based Caching for Parallel I/O

    The trend in parallel computing toward large-scale cluster computers running thousands of cooperating processes per application has led to an I/O bottleneck that has only grown more severe as the number of processing cores per CPU has increased. Current parallel file systems provide high-bandwidth access to large, contiguous file regions; however, applications that repeatedly access small file regions on unaligned boundaries continue to experience poor I/O throughput due to the high overhead of each parallel file system access. In this dissertation we demonstrate how client-side file data caching can improve parallel file system throughput for applications performing frequent small and unaligned file I/O. We examine the impacts of cache page size and cache capacity using the popular FLASH I/O benchmark, and explore a novel cache-sharing approach that leverages the trend toward multi-core processors. We also explore a technique we call progressive page caching, which represents cached data using dynamic data structures rather than fixed-size pages of file data. Finally, we explore a cache aggregation scheme that leverages the high-level file I/O interfaces provided by the PVFS file system to deliver further performance gains. In summary, our results indicate that a correctly configured middleware-based file data cache can dramatically improve the performance of I/O workloads dominated by small unaligned file accesses. Further, we demonstrate that a well-designed cache can offer stable performance even when the selected cache page granularity is not well matched to the workload. Finally, we show that high-level file system interfaces can significantly accelerate application performance, and that interfaces beyond those currently envisioned by the MPI-IO standard could provide further benefits.
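
    A minimal sketch, in the spirit of the fixed-page caching the dissertation evaluates: small, unaligned writes land in fixed-size cached pages and reach the file system only as page-aligned, page-sized transfers. The names (PageCache, PAGE_SIZE), the write-back policy, and the naive eviction policy are illustrative assumptions, not the dissertation's implementation.

        # Hypothetical fixed-page client-side cache; PAGE_SIZE and the
        # eviction policy are assumptions for illustration only.
        PAGE_SIZE = 65536

        class PageCache:
            def __init__(self, backing, capacity_pages):
                self.backing = backing      # file opened 'r+b', standing in for the PFS
                self.capacity = capacity_pages
                self.pages = {}             # page index -> bytearray
                self.dirty = set()

            def _load(self, idx):
                if idx not in self.pages:
                    if len(self.pages) >= self.capacity:
                        self._evict()
                    self.backing.seek(idx * PAGE_SIZE)
                    raw = self.backing.read(PAGE_SIZE)
                    self.pages[idx] = bytearray(raw.ljust(PAGE_SIZE, b"\0"))
                return self.pages[idx]

            def _evict(self):
                idx, page = self.pages.popitem()   # naive: evict most recently inserted
                if idx in self.dirty:
                    self.backing.seek(idx * PAGE_SIZE)
                    self.backing.write(page)       # one aligned, page-sized write
                    self.dirty.discard(idx)

            def write(self, offset, data):
                # Small, unaligned writes modify whole cached pages instead of
                # issuing many tiny, unaligned requests to the file system.
                while data:
                    idx, off = divmod(offset, PAGE_SIZE)
                    page = self._load(idx)
                    n = min(len(data), PAGE_SIZE - off)
                    page[off:off + n] = data[:n]
                    self.dirty.add(idx)
                    offset, data = offset + n, data[n:]

            def flush(self):
                for idx in sorted(self.dirty):
                    self.backing.seek(idx * PAGE_SIZE)
                    self.backing.write(self.pages[idx])
                self.dirty.clear()

    The page granularity question the abstract raises shows up here directly: a PAGE_SIZE far larger than the application's typical access wastes capacity and bandwidth, while one far smaller loses the aggregation benefit.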

    A Mechanism for Scalable Redundancy in Parallel File Systems

    As parallel file systems span larger and larger numbers of nodes in order to provide the performance and scalability modern cluster applications require, the need for fault tolerance and high data availability has arisen. Modern parallel file systems spanning tens, hundreds, or even thousands of servers require fault tolerance to avoid job failure and catastrophic data loss from a single disk or server failure. Effective fault tolerance in parallel file systems must provide a high degree of data resiliency, consistency, and scalable performance. In this thesis we provide an in-depth description of the resiliency and consistency requirements of parallel file systems. We then describe a data replication mechanism that meets those requirements while providing scalable performance. We also describe in depth how the file system responds during a system fault and how it may be recovered to its original, fully redundant state after a failure. Finally, we measure the performance of the proposed mechanism by implementing it in a popular parallel file system, PVFS2. We primarily focus on measuring…
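
    A minimal sketch of the general shape of such a mechanism, not the thesis's PVFS2 code: writes are applied to every replica, a failure is tolerated as long as one copy survives, and a recovery pass copies missing blocks from a live replica to restore full redundancy. The Replica class and every function name here are hypothetical.

        # Hypothetical replicated-write and recovery path, for illustration only.
        class Replica:
            def __init__(self, name):
                self.name, self.data, self.alive = name, {}, True

            def write(self, block, payload):
                if not self.alive:
                    raise IOError(f"{self.name} unavailable")
                self.data[block] = payload

        def _try_write(replica, block, payload):
            try:
                replica.write(block, payload)
                return True
            except IOError:
                return False

        def replicated_write(replicas, block, payload):
            # Apply the write to every replica; tolerate failures as long
            # as at least one copy of the block is stored.
            failed = [r for r in replicas if not _try_write(r, block, payload)]
            if len(failed) == len(replicas):
                raise IOError("all replicas failed")
            return failed    # callers can schedule these for recovery

        def recover(replicas, block):
            # After the failed server returns, copy the block from any live
            # replica so the system is again fully redundant.
            source = next(r for r in replicas if block in r.data)
            for r in replicas:
                if block not in r.data:
                    r.write(block, source.data[block])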

    Using server-to-server communication in parallel file systems to simplify consistency and improve performance

    The trend in parallel computing toward clusters running thousands of cooperating processes per application has led to an I/O bottleneck that has only grown more severe as the CPU density of clusters has increased. Current parallel file systems provide large amounts of aggregate I/O bandwidth; however, they do not achieve the degree of metadata scalability required to manage files distributed across hundreds or thousands of storage nodes. In this paper we examine the use of collective communication between storage servers to improve the scalability of file metadata operations. In particular, we apply server-to-server communication to simplify consistency checking and improve the performance of file creation, file removal, and file stat. Our results indicate that collective communication is an effective scheme for simplifying consistency checks and significantly improving performance on several real metadata-intensive workloads.
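
    A minimal sketch of the contrast the paper draws, with hypothetical names (StorageServer, collective_create) rather than the paper's PVFS implementation: instead of the client contacting every server, one server fans the metadata operation out to its peers, so consistency can be enforced among the servers themselves.

        # Hypothetical server-to-server fan-out for file creation;
        # names and structure are illustrative, not the paper's code.
        class StorageServer:
            def __init__(self):
                self.objects = {}
                self.peers = []    # other storage servers

            def create_local(self, handle):
                self.objects[handle] = b""

            def collective_create(self, handle):
                # The client sends one request; the contacted server
                # forwards the create to its peers, keeping consistency
                # checks server-side rather than client-side.
                self.create_local(handle)
                for peer in self.peers:
                    peer.create_local(handle)

        def client_driven_create(servers, handle):
            # Baseline: the client issues one request per server and must
            # itself detect and clean up partial failures.
            for s in servers:
                s.create_local(handle)

    With N storage servers, the baseline costs N client round trips per operation, while the collective path costs one, with the remaining messages traveling over the server interconnect, which is exactly where the fan-out for creation, removal, and stat can be optimized.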

    Workload characterization of a leadership class storage cluster

    Understanding workload characteristics is critical for optimizing and improving the performance of current systems and software, and for architecting new storage systems based on observed workload patterns. In this paper we characterize the scientific workloads of the world’s fastest HPC (High Performance Computing) storage cluster, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). Spider provides an aggregate bandwidth of over 240 GB/s and more than 10 petabytes of RAID 6 formatted capacity. OLCF’s flagship petascale simulation platform, Jaguar, and other large HPC clusters, together comprising more than 250,000 compute cores, depend on Spider for their I/O needs. We characterize system utilization, read and write demands, idle time, and the ratio of read requests to write requests over a six-month observation period. From this study we develop synthesized workloads, and we show that read and write I/O bandwidth usage, as well as request inter-arrival times, can be modeled with a Pareto distribution.
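
    As a minimal sketch of the paper's modeling claim, the snippet below synthesizes Pareto-distributed request inter-arrival times; the shape and scale values are placeholders, not the parameters fitted to the Spider traces.

        # Synthesize Pareto inter-arrival times; shape/scale are assumed
        # placeholder values, not the fitted Spider parameters.
        import numpy as np

        shape, scale = 1.5, 0.01              # placeholder parameters (seconds)
        rng = np.random.default_rng(0)
        gaps = scale * (1.0 + rng.pareto(shape, size=10_000))
        timestamps = np.cumsum(gaps)          # synthetic request arrival times

        # Heavy tail: a few very long gaps account for most idle time.
        print(f"mean gap = {gaps.mean():.4f} s")
        print(f"p99 gap  = {np.quantile(gaps, 0.99):.4f} s")
        print(f"max gap  = {gaps.max():.4f} s")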