104 research outputs found

    Direct-pNFS: Scalable, transparent, and versatile access to parallel file systems

    Full text link
    Grid computations require global access to massive data stores. To meet this need, the GridNFS project aims to provide scalable, high-performance, transparent, and secure wide-area data management as well as a scalable and agile name space. While parallel file systems offer high I/O throughput, they are highly specialized, have limited operating system and hardware platform support, and often lack strong security mechanisms. Remote data access tools such as NFS and GridFTP overcome some of these limitations, but fail to provide universal, transparent, and scalable remote data access. As part of GridNFS, this paper introduces Direct-pNFS, which builds on the NFSv4.1 protocol to meet a key challenge in accessing remote parallel file systems: high-performance and scalable data access without sacrificing transparency, security, or portability. Experiments with Direct-pNFS demonstrate I/O throughput that equals or outperforms the exported parallel file system across a range of workloads.
    http://deepblue.lib.umich.edu/bitstream/2027.42/107917/1/citi-tr-07-2.pd
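
    The paper's central claim is that pNFS clients reach the parallel file system through the standard I/O interface, so applications need no changes. As a minimal illustration of that transparency (the mount point /mnt/pnfs and file name are hypothetical), an unmodified program simply issues ordinary POSIX calls, and the NFSv4.1 client layer performs the parallel access:

```c
/* Minimal sketch: an unmodified application writing through a
 * pNFS (NFSv4.1) mount with ordinary POSIX calls. The mount point
 * /mnt/pnfs is a hypothetical example; the kernel pNFS client,
 * not the application, performs the parallel striping. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char msg[] = "checkpoint data\n";
    int fd = open("/mnt/pnfs/output.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, msg, sizeof msg - 1) != (ssize_t)(sizeof msg - 1))
        perror("write");
    close(fd);
    return 0;
}
```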

    I/O performance evaluation with Parabench — programmable I/O benchmark

    Get PDF
    Choosing an appropriate cluster file system for a specific high performance computing application is challenging and depends mainly on the application's I/O needs. I/O requirements vary widely: some applications read and write large datasets, others need out-of-core data access, and others have database access requirements. Application access patterns reflect this differing I/O behavior and can be used for performance testing. This paper presents the programmable I/O benchmarking tool Parabench. It takes access patterns as input, which can be adapted to mimic the behavior of a rich set of applications. Using this benchmarking tool, composed patterns can be automatically tested and easily compared across different local and cluster file systems. We introduce the design of the proposed benchmark, focusing on the Parabench programming language, which was developed for flexible pattern creation, and demonstrate an exemplary usage of Parabench and its ability to handle the POSIX and MPI-IO interfaces.
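
    The abstract does not reproduce Parabench's pattern language, so the sketch below is illustrative only: it shows, in plain C with POSIX calls, the kind of parameterized access pattern (block size, stride, block count) such a programmable benchmark would drive. The function and parameter names are assumptions, not Parabench syntax:

```c
/* Illustrative only: a parameterized access pattern of the kind a
 * programmable I/O benchmark might generate. Block size, stride,
 * and count are the tunable parameters; this is not Parabench's
 * actual pattern language. */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Write `count` blocks of `block` bytes, `stride` bytes apart. */
static void strided_write(int fd, size_t block, off_t stride, int count)
{
    char *buf = calloc(1, block);
    if (!buf)
        return;
    for (int i = 0; i < count; i++)
        pwrite(fd, buf, block, (off_t)i * stride);
    free(buf);
}

int main(void)
{
    int fd = open("pattern.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return 1;
    strided_write(fd, 4096, 1 << 20, 64);  /* 64 x 4 KiB blocks, 1 MiB apart */
    close(fd);
    return 0;
}
```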

    Measurement of PVFS2 performance on InfiniBand

    Get PDF
    InfiniBand is becoming increasingly popular as a fast interconnect between servers and storage. It offers a far better price/performance ratio than both Gigabit Ethernet and 10 Gigabit Ethernet, and hence is increasingly used for high-performance computing applications. PVFS2, the second-generation Parallel Virtual File System (PVFS), is a distributed file system for parallel data access that is increasingly used in clustered applications. Previous studies have shown that, in general, PVFS2 over InfiniBand offers higher I/O rates than PVFS2 over TCP and Gigabit Ethernet. Apart from the hardware technology, the application programming interface into the file system also makes a difference: to get better parallel performance, the choice of file system interface is important. Our study benchmarks and compares the performance of PVFS2 running over InfiniBand using different file system interfaces. IOR is a popular I/O microbenchmarking tool that supports the POSIX and MPI-IO file system interfaces. In addition to testing these already supported interfaces, we have written a PVFS2 module extension for IOR to support the native PVFS2 interface. As this study shows, using the native PVFS2 interface offers a significant performance benefit over the other file system interfaces on the PVFS2 file system. Our benchmarking effort also studies the effect of a multi-client environment on the I/O performance of the different file system interfaces. Based on the benchmarking results, we determine the most efficient application programming interface for parallel I/O on PVFS2 in a typical multi-client parallel application scenario.
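
    To make the interface comparison concrete, here is a minimal sketch of the MPI-IO access pattern IOR exercises: every rank writes its own block of a shared file with a collective call. The MPI calls are the standard MPI-IO API; the file name and block size are illustrative:

```c
/* Minimal MPI-IO sketch: each rank writes one block of a shared
 * file at its own offset, the pattern exercised by IOR's MPI-IO
 * backend. File name and block size are illustrative. */
#include <mpi.h>
#include <string.h>

#define BLOCK (1 << 20)  /* 1 MiB per rank */

static char buf[BLOCK];

int main(int argc, char **argv)
{
    int rank;
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    memset(buf, rank & 0xff, BLOCK);

    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    /* Collective write: rank i owns byte range [i*BLOCK, (i+1)*BLOCK). */
    MPI_File_write_at_all(fh, (MPI_Offset)rank * BLOCK, buf, BLOCK,
                          MPI_BYTE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```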

    Hopes and facts in evaluating the performance of HPC-I/O on a cloud environment

    Get PDF
    There is currently increasing interest in cloud platforms within the High Performance Computing (HPC) community, and parallel I/O for high-performance systems is no exception. On cloud platforms, the user must take into account not only execution time but also cost, because cost can be one of the most important issues. In this paper, we propose a methodology to quickly evaluate the performance and cost of Virtual Clusters for parallel scientific applications that use parallel I/O. From the application's parallel I/O model, extracted automatically with our tool PAS2P-IO, we obtain the I/O requirements; the user can then select the Virtual Cluster that meets the application's requirements. The application I/O model does not depend on the underlying I/O system. One of the main benefits of applying our methodology is that it is not necessary to execute the application in order to select the Virtual Cluster on the cloud. Finally, costs and the performance-cost ratio of the Virtual Clusters are provided to facilitate decision making when selecting resources on a cloud platform.
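
    The selection step ranks candidate Virtual Clusters by their performance-cost ratio. Below is a minimal sketch of that metric under assumed units (MB/s for bandwidth, USD per hour for cluster price); the struct, function names, and figures are hypothetical, not output of PAS2P-IO or results from the paper:

```c
/* Illustrative cost-efficiency metric for ranking virtual clusters:
 * performance-cost ratio = achieved I/O bandwidth / hourly price.
 * Units (MB/s, USD/hour) and all figures are assumptions for the
 * sketch, not results from the paper. */
#include <stdio.h>

struct vcluster {
    const char *name;
    double bandwidth_mbs;  /* measured or predicted I/O bandwidth */
    double price_per_hour; /* on-demand cost of the whole cluster */
};

static double perf_cost_ratio(const struct vcluster *vc)
{
    return vc->bandwidth_mbs / vc->price_per_hour;
}

int main(void)
{
    struct vcluster candidates[] = {
        { "vc-small", 400.0, 2.0 },   /* hypothetical figures */
        { "vc-large", 900.0, 6.0 },
    };
    for (int i = 0; i < 2; i++)
        printf("%s: %.1f MB/s per USD/hour\n", candidates[i].name,
               perf_cost_ratio(&candidates[i]));
    return 0;
}
```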

    HEC: Collaborative Research: SAM^2 Toolkit: Scalable and Adaptive Metadata Management for High-End Computing

    Get PDF
    The increasing demand for exabyte-scale storage capacity by high-end computing applications requires a higher level of scalability and dependability than current file and storage systems provide. The proposal deals with file systems research for metadata management of scalable cluster-based parallel and distributed file storage systems in the HEC environment. It aims to develop a scalable and adaptive metadata management (SAM2) toolkit to extend the features of, and fully leverage the peak performance promised by, state-of-the-art cluster-based parallel and distributed file storage systems used by the high performance computing community. There is a large body of research on scaling data movement and management; however, the need to scale the handling of the attributes of cluster-based file systems and I/O, that is, metadata, has been underestimated. An understanding of the characteristics of metadata traffic, and the application of proper load-balancing, caching, prefetching, and grouping mechanisms to metadata management, will lead to high scalability. It is anticipated that by appropriately plugging the scalable and adaptive metadata management components into state-of-the-art cluster-based parallel and distributed file storage systems, one could increase the performance of applications and file systems and help translate the promise of high peak performance of such systems into real application performance improvements. The project involves the following components:
    1. Develop multi-variable forecasting models to analyze and predict file metadata access patterns.
    2. Develop scalable and adaptive file name mapping schemes using the duplicative Bloom filter array technique to enforce load balance and increase scalability (a sketch of the underlying Bloom filter idea follows this list).
    3. Develop decentralized, locality-aware metadata grouping schemes to facilitate bulk metadata operations such as prefetching.
    4. Develop an adaptive cache coherence protocol using a distributed shared object model for client-side and server-side metadata caching.
    5. Prototype the SAM2 components in the state-of-the-art parallel virtual file system PVFS2 and a distributed storage data caching system, set up an experimental framework for a DOE CMS Tier 2 site at the University of Nebraska-Lincoln, and conduct benchmark, evaluation, and validation studies.
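
    As a companion to component 2, here is a minimal sketch of the plain Bloom filter membership test that underlies file-name-to-server mapping: a server can answer "might this name map here?" with a few hash probes and no disk access. The sizes and FNV-1a-based hashing are illustrative choices, and the duplicative-array aspect of the actual scheme is not shown:

```c
/* Minimal Bloom filter sketch for metadata name lookup. Sizes and
 * the seeded FNV-1a hashing are illustrative choices, not the
 * paper's duplicative Bloom filter array scheme. */
#include <stdint.h>

#define BITS (1u << 20)  /* 1 Mbit filter */
#define K    4           /* number of hash probes */

static uint8_t filter[BITS / 8];

static uint32_t fnv1a(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

static void bloom_add(const char *name)
{
    for (uint32_t i = 0; i < K; i++) {
        uint32_t bit = fnv1a(name, i) % BITS;
        filter[bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
}

/* Returns 0 only if `name` was definitely never added. */
static int bloom_maybe_contains(const char *name)
{
    for (uint32_t i = 0; i < K; i++) {
        uint32_t bit = fnv1a(name, i) % BITS;
        if (!(filter[bit / 8] & (1u << (bit % 8))))
            return 0;
    }
    return 1;
}

int main(void)
{
    bloom_add("/scratch/run42/output.h5");
    return !bloom_maybe_contains("/scratch/run42/output.h5");
}
```

    False positives are possible but false negatives are not, which is why such filters suit load-balanced name mapping: a miss definitively rules a server out.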

    A MIDDLE-WARE LEVEL CLIENT CACHE FOR A HIGH PERFORMANCE COMPUTING I/O SIMULATOR

    Get PDF
    This thesis describes the design and run-time analysis of the system-level middleware cache for Hecios, a high-performance cluster I/O simulator. With Hecios, we provide a simulation environment that accurately captures the performance characteristics of all the components in a cluster-wide parallel file system. Hecios was specifically modeled after PVFS2 and was designed to be extensible, allowing individual component modules to be replaced by modules that model other system types. Built around the OMNeT++ simulation package, Hecios' inner-cluster communication module is easily adaptable to any TCP/IP-based protocol and all standard network interface cards, switches, hubs, and routers. We examine the system cache component and describe a methodology for implementing other coherence and replacement techniques within Hecios. As in other cache simulation tools, the size of the system cache can be varied independently of the replacement policy and caching technique used.
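
    That last point, varying cache size independently of the replacement policy, suggests a design in which victim selection is a pluggable hook. A minimal sketch of that separation follows; the types and names are illustrative, not Hecios' actual interface:

```c
/* Sketch of a cache whose replacement policy is a pluggable hook,
 * mirroring the separation described above. The types and the LRU
 * policy are illustrative, not Hecios' API. */
#include <stddef.h>

struct cache_entry {
    unsigned long block_id;
    unsigned long last_used;  /* logical timestamp for LRU */
    int valid;
};

/* A replacement policy picks a victim slot; the cache core stays
 * unchanged when the policy is swapped. */
typedef size_t (*victim_fn)(const struct cache_entry *e, size_t n);

static size_t pick_lru(const struct cache_entry *e, size_t n)
{
    size_t victim = 0;
    for (size_t i = 0; i < n; i++) {
        if (!e[i].valid)
            return i;  /* free slot: no eviction needed */
        if (e[i].last_used < e[victim].last_used)
            victim = i;
    }
    return victim;
}

struct cache {
    struct cache_entry *entries;
    size_t nentries;          /* size varies independently... */
    victim_fn choose_victim;  /* ...of the replacement policy */
};

int main(void)
{
    struct cache_entry slots[64] = { 0 };
    struct cache c = { slots, 64, pick_lru };
    return (int)c.choose_victim(c.entries, c.nentries);  /* empty cache: slot 0 */
}
```

    Swapping in a different policy means assigning a different function to choose_victim; the cache core and its size are untouched.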

    Extending the POSIX I/O interface: a parallel file system perspective.

    Full text link