57 research outputs found

    NFSv4 and High Performance File Systems: Positioning to Scale

    Full text link
    The avant-garde of high performance computing is building petabyte storage systems. At CITI, we are investigating the use of NFSv4 as a standard for fast and secure access to this data, both across a WAN and within a (potentially massive) cluster. An NFSv4 server manages much state information, which hampers exporting objects via multiple servers and allows the NFSv4 server to become a bottleneck as load increases. This paper introduces Parallel NFSv4, extending the NFSv4 protocol with a new server-to-server protocol and a new file description and location mechanism for increased scalability.
    http://deepblue.lib.umich.edu/bitstream/2027.42/107881/1/citi-tr-04-2.pd
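
    A minimal, hypothetical sketch (not the protocol defined in the paper) of the file location idea behind Parallel NFSv4: a striped file layout tells the client which data servers hold which byte ranges, so load spreads across servers instead of funneling through one. Names and hosts are illustrative.

    from dataclasses import dataclass

    @dataclass
    class FileLayout:
        stripe_size: int          # bytes per stripe unit
        data_servers: list[str]   # servers holding the stripes, in round-robin order

    def servers_for_range(layout: FileLayout, offset: int, length: int) -> set[str]:
        # Determine which data servers a client would contact for one byte range.
        first = offset // layout.stripe_size
        last = (offset + length - 1) // layout.stripe_size
        return {layout.data_servers[i % len(layout.data_servers)]
                for i in range(first, last + 1)}

    layout = FileLayout(stripe_size=1 << 20,
                        data_servers=["ds1.example.org", "ds2.example.org"])
    print(servers_for_range(layout, offset=0, length=3 << 20))  # touches both servers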

    Naming, Migration, and Replication for NFSv4

    Full text link
    In this paper, we discuss a global name space for NFSv4 and mechanisms for transparent migration and replication. By convention, any file or directory name beginning with /nfs on an NFS client is part of this shared global name space. Our system supports file system migration and replication through DNS resolution, provides directory migration and replication using built-in NFSv4 mechanisms, and supports read/write replication with precise consistency guarantees, a small performance penalty, and good scaling. We implement these features with small extensions to the published NFSv4 protocol, and demonstrate a practical way to enhance the network transparency and administrability of NFSv4 in wide-area networks.
    http://deepblue.lib.umich.edu/bitstream/2027.42/107939/1/citi-tr-06-1.pd
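
    A minimal sketch, assuming a DNS-based lookup similar in spirit to the convention described above: the path component after /nfs names a DNS domain, and that domain resolves to the NFSv4 server exporting that site's part of the global name space. The helper name and use of an address record are assumptions, not the paper's exact mechanism.

    import socket

    def locate_nfs_server(path: str) -> tuple[str, str]:
        # Map /nfs/<domain>/rest... to (server address, path on that server).
        parts = path.lstrip("/").split("/")
        if parts[0] != "nfs":
            raise ValueError("only paths under /nfs belong to the global name space")
        domain, rest = parts[1], "/" + "/".join(parts[2:])
        server_ip = socket.gethostbyname(domain)  # a real deployment might use SRV records
        return server_ip, rest

    print(locate_nfs_server("/nfs/example.org/projects/data.txt"))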

    Direct-pNFS: Scalable, transparent, and versatile access to parallel file systems

    Full text link
    Grid computations require global access to massive data stores. To meet this need, the GridNFS project aims to provide scalable, high-performance, transparent, and secure wide-area data management as well as a scalable and agile name space. While parallel file systems deliver high I/O throughput, they are highly specialized, have limited operating system and hardware platform support, and often lack strong security mechanisms. Remote data access tools such as NFS and GridFTP overcome some of these limitations, but fail to provide universal, transparent, and scalable remote data access. As part of GridNFS, this paper introduces Direct-pNFS, which builds on the NFSv4.1 protocol to meet a key challenge in accessing remote parallel file systems: high-performance and scalable data access without sacrificing transparency, security, or portability. Experiments with Direct-pNFS demonstrate I/O throughput that equals or outperforms that of the exported parallel file system across a range of workloads.
    http://deepblue.lib.umich.edu/bitstream/2027.42/107917/1/citi-tr-07-2.pd
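
    An illustrative sketch of the NFSv4.1 split that Direct-pNFS builds on: the client asks the metadata server for a layout, then moves data straight to and from the data servers, so bulk I/O bypasses the metadata path. The classes and calls below are hypothetical stand-ins for LAYOUTGET and the data-server I/O path, not an implementation of the paper's system.

    class MetadataServer:
        # Stands in for the NFSv4.1 metadata server answering LAYOUTGET.
        def layoutget(self, path: str) -> dict:
            return {"stripe_size": 4096, "data_servers": ["dsA", "dsB"]}

    class DataServer:
        # Stands in for one storage node of the exported parallel file system.
        def __init__(self, name: str):
            self.name = name
        def read(self, path: str, offset: int, count: int) -> bytes:
            return f"<{count} bytes of {path}@{offset} from {self.name}>".encode()

    def direct_read(mds, nodes, path, offset, count):
        layout = mds.layoutget(path)                      # one trip to the metadata server
        idx = (offset // layout["stripe_size"]) % len(layout["data_servers"])
        return nodes[layout["data_servers"][idx]].read(path, offset, count)  # data bypasses the MDS

    mds = MetadataServer()
    nodes = {n: DataServer(n) for n in ("dsA", "dsB")}
    print(direct_read(mds, nodes, "/pfs/output.dat", offset=8192, count=4096))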

    Reliable Replication at Low Cost

    Full text link
    Emerging global scientific collaborations demand a scalable, efficient, reliable, and convenient data access and management scheme. To meet these requirements, this paper describes a replicated file system that supports mutable (i.e., read/write) replication with strong consistency guarantees, a small performance penalty, high failure resilience, and good scaling properties. The paper further evaluates the system using a real scientific application. The evaluation results show that the presented replication system can significantly improve the application's performance by reducing the first-time access latency for reading the input data and by delegating the verification of data access to a nearby server. Furthermore, the penalty of file replication is negligible as long as applications use synchronous writes at a moderate rate.
    http://deepblue.lib.umich.edu/bitstream/2027.42/107950/1/citi-tr-06-2.pd
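
    A minimal sketch of the trade-off the abstract describes, under the simplifying assumption of a single synchronous update path: reads are served by the nearest replica, while every write is applied to all replicas before it completes, preserving strong consistency. The actual system's concurrency control and failure handling are not modeled here.

    class Replica:
        def __init__(self, name: str, rtt_ms: int):
            self.name, self.rtt_ms, self.store = name, rtt_ms, {}
        def write(self, key, value):
            self.store[key] = value
        def read(self, key):
            return self.store.get(key)

    class ReplicatedFS:
        def __init__(self, replicas):
            self.replicas = replicas
        def write(self, key, value):
            for r in self.replicas:           # synchronous write to every copy
                r.write(key, value)
        def read(self, key):
            nearest = min(self.replicas, key=lambda r: r.rtt_ms)
            return nearest.read(key)          # nearby replica keeps read latency low

    fs = ReplicatedFS([Replica("local", 1), Replica("remote", 90)])
    fs.write("/data/input.dat", b"experiment input")
    print(fs.read("/data/input.dat"))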

    SDS@hd – Scientific Data Storage

    Get PDF
    SDS@hd (Scientific Data Storage) is a central storage service for hot, large-scale scientific data that can be used by researchers from all universities in Baden-Württemberg. It offers fast and secure file system storage to individuals or groups, e.g. in the context of cooperative projects. Fast data access is possible even with large numbers of small files. User authentication and authorization are implemented via the federated identity management of Baden-Württemberg, allowing researchers to use the existing ID of their home institution transparently for this service. Data protection requirements can be met through data encryption and secure data transfer protocols. The service is operated by the computing center of Heidelberg University.

    File Creation Strategies in a Distributed Metadata File System

    Full text link

    Infrastructure Plan for ASC Petascale Environments

    Full text link

    Game of Templates. Deploying and (re-)using Virtualized Research Environments in High-Performance and High-Throughput Computing

    Get PDF
    The Virtual Open Science Collaboration Environment project worked on different use cases to evaluate the steps necessary for virtualization or containerization, especially when considering the external dependencies of digital workflows. Virtualized Research Environments (VREs) can both help to broaden the user base of an HPC cluster like NEMO and offer new ways of packaging scientific workflows and managing software stacks. The eResearch initiative on VREs sponsored by the state of Baden-Württemberg provided the necessary framework for researchers from various disciplines as well as providers of (large-scale) compute infrastructure to define future operational models of HPC clusters and scientific clouds. In daily operations, VREs running on virtualization or containerization technologies such as OpenStack or Singularity help to disentangle the responsibilities for the software stacks needed to fulfill a certain task. Nevertheless, reproducing VREs and provisioning the research data to be computed and stored afterwards pose several challenges that need to be solved beyond traditional scientific computing models.
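
    A hedged illustration of one way a VRE-packaged workflow step might be run: the software stack lives in a container image, and the step is executed through Singularity with the data directory bound in, leaving the host cluster's stack untouched. The image name, bind path, and command are placeholders, not the project's actual setup.

    import subprocess

    def run_step(image: str, workdir: str, command: list[str]) -> int:
        # Run one workflow step inside a Singularity image, binding the data directory.
        argv = ["singularity", "exec", "--bind", f"{workdir}:/data", image, *command]
        return subprocess.run(argv, check=False).returncode

    # Hypothetical usage: an analysis script packaged inside the image.
    # run_step("vre-analysis.sif", "/scratch/project", ["python", "/opt/analyze.py", "/data"])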

    Software Roadmap to Plug and Play Petaflop/s

    Full text link
    • …