    The architecture of the High Performance Storage System (HPSS)

    The rapid growth in the size of datasets has caused a serious imbalance in I/O and storage system performance and functionality relative to application requirements and the capabilities of other system components. The High Performance Storage System (HPSS) is a scalable, next-generation storage system that will meet the functionality and performance requirements of large-scale scientific and commercial computing environments. Our goal is to improve the performance and capacity of storage by two orders of magnitude or more over what is available in the general or mass marketplace today. We are also providing corresponding improvements in architecture and functionality. This paper describes the architecture and functionality of HPSS.

    SAFIUS - A secure and accountable filesystem over untrusted storage

    We describe SAFIUS, a secure, accountable file system that resides over untrusted storage. SAFIUS provides strong security guarantees: confidentiality, integrity, protection from rollback attacks, and accountability. It also enables read/write sharing of data and provides a standard UNIX-like interface for applications. To achieve accountability with good performance, it uses asynchronous signatures; to reduce the space required for storing these signatures, a novel signature-pruning mechanism is used. SAFIUS has been implemented on a GNU/Linux-based system by modifying OpenGFS. Preliminary performance studies show that SAFIUS has a tolerable overhead for providing secure storage: while it has an overhead of about 50% relative to OpenGFS in data-intensive workloads (due to performing encryption/decryption in software), it is comparable to (or in some cases better than) OpenGFS in metadata-intensive workloads.
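    The asynchronous-signature idea lends itself to a short sketch: the write path only enqueues a record and returns, while a background thread batches records, signs a digest of each batch, and prunes older signatures once a newer one covers the state. The sketch below is a minimal illustration in Python, not the SAFIUS implementation; the batching policy, the use of Ed25519 via the `cryptography` package, and the keep-only-latest pruning rule are all assumptions.

```python
# Minimal sketch of asynchronous signing with signature pruning.
# Assumptions (not from the paper): Ed25519 keys via the "cryptography"
# package, drain-the-queue batching, and keep-only-latest pruning.
import hashlib
import queue
import threading

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

class AsyncSigner:
    def __init__(self):
        self._key = Ed25519PrivateKey.generate()
        self._pending = queue.Queue()
        self._log = []          # retained (digest, signature) pairs
        threading.Thread(target=self._worker, daemon=True).start()

    def record_write(self, block_id: int, data: bytes) -> None:
        # Write path: enqueue and return immediately, so the public-key
        # operation stays off the critical path.
        self._pending.put((block_id, data))

    def _worker(self) -> None:
        while True:
            batch = [self._pending.get()]       # block for the first record
            while not self._pending.empty():    # then drain the rest
                batch.append(self._pending.get())
            h = hashlib.sha256()
            for block_id, data in batch:
                h.update(block_id.to_bytes(8, "big"))
                h.update(hashlib.sha256(data).digest())
            digest = h.digest()
            signature = self._key.sign(digest)
            # "Pruning" here simply keeps the newest signature; a real
            # scheme would sign cumulative state before dropping old ones.
            self._log = [(digest, signature)]

signer = AsyncSigner()
signer.record_write(7, b"new block contents")   # returns immediately
```

    Deferring the signature this way keeps public-key operations off the critical write path, which is the performance argument the abstract makes for asynchronous signatures.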

    Techniques for building highly available distributed file systems

    This paper analyzes recent research in the field of distributed file systems, with a particular emphasis on the problem of high availability. Several of the techniques involved in building such a system are discussed individually: naming, replication, multiple versions, caching, stashing, and logging. These techniques range from extensions of ideas used in centralized file systems, through new notions already in use, to radical ideas that have not yet been implemented. A number of working and proposed systems are described in conjunction with the analysis of each technique. The paper concludes that a low degree of replication, liberal use of client and server caching, and optimistic behavior in the face of network partition are all necessary to ensure high availability.
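    As a rough illustration of the survey's last point (optimistic behavior across a network partition), the sketch below tags each replica's copy of a file with a version vector; replicas keep accepting writes while partitioned and detect conflicting histories at reconciliation time. All names and the conflict policy here are hypothetical, illustrating the general technique rather than any one surveyed system.

```python
# Sketch of optimistic replication with version vectors; names and the
# conflict policy are illustrative, not taken from any surveyed system.
from dataclasses import dataclass, field

@dataclass
class Versioned:
    data: bytes
    vv: dict = field(default_factory=dict)      # replica id -> counter

def dominates(a: dict, b: dict) -> bool:
    """True if version vector a reflects every update recorded in b."""
    return all(a.get(rid, 0) >= count for rid, count in b.items())

class Replica:
    def __init__(self, rid: str):
        self.rid = rid
        self.files = {}                         # name -> Versioned

    def write(self, name: str, data: bytes) -> None:
        # Optimistic: accept the write even if other replicas are unreachable.
        old = self.files.get(name, Versioned(b""))
        vv = dict(old.vv)
        vv[self.rid] = vv.get(self.rid, 0) + 1
        self.files[name] = Versioned(data, vv)

    def reconcile(self, name: str, remote: Versioned) -> None:
        # Runs when the partition heals.
        local = self.files.get(name)
        if local is None or dominates(remote.vv, local.vv):
            self.files[name] = remote           # remote subsumes local
        elif dominates(local.vv, remote.vv):
            pass                                # local subsumes remote
        else:
            raise RuntimeError(f"conflict on {name}: concurrent updates")
```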

    An efficient method to avoid path lookup in file access auditing in IO path to improve file system IO performance

    One of the biggest challenges for metadata management schemes that sit outside the filesystem layer is indexing meaningful path information for files referenced in an external system, such as a database or a metadata journal file. The path to a file is critical: it allows meaningful interpretation of the locality of the file and its metadata, and it enables more efficient user-mode services that transform the file or its metadata. Path information is also essential in compliance systems, where audit logs must record what happened to a file and where it is located. However, when the data path is audited from layers such as network protocols, reconstructing the full path for every file is hard, because the protocol layers do not integrate directly with the underlying filesystem. The protocol layer must then rely on the system cache for path data, and when that is not possible it must perform an expensive reverse path walk to reconstruct the path, which heavily degrades system performance. In this paper we discuss a mechanism that records, in the protocol layer, just enough information about a file (its own unique ID and that of its parent) so that the path can be reconstructed later, when required, by a reliable reverse lookup in a database or a file-based journal. The idea is to capture enough information to rebuild the path at a later time and outside the system where the information originated. The paper also discusses keeping this record consistent under all conditions.
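    A minimal sketch of the recording idea described in this abstract: at audit time the protocol layer logs only the file's unique ID, its parent's ID, and its name, and an offline pass later rebuilds full paths by walking parent IDs. The SQLite schema, table name, and root convention below are hypothetical, chosen only for illustration.

```python
# Sketch of path reconstruction from (id, parent_id, name) audit records.
# The schema, table name, and root convention are assumptions for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE journal (id INTEGER PRIMARY KEY,"
             " parent_id INTEGER, name TEXT)")
# Fast path at audit time: one insert per file, no path walk in the IO path.
conn.executemany(
    "INSERT INTO journal VALUES (?, ?, ?)",
    [(1, None, ""),            # filesystem root
     (2, 1, "home"),
     (3, 2, "alice"),
     (4, 3, "report.txt")])

def reconstruct_path(file_id: int) -> str:
    """Offline reverse lookup: follow parent IDs up to the root."""
    parts = []
    row = conn.execute("SELECT parent_id, name FROM journal WHERE id = ?",
                       (file_id,)).fetchone()
    while row is not None and row[0] is not None:
        parts.append(row[1])
        row = conn.execute("SELECT parent_id, name FROM journal WHERE id = ?",
                           (row[0],)).fetchone()
    return "/" + "/".join(reversed(parts))

print(reconstruct_path(4))   # -> /home/alice/report.txt
```

    The expensive reverse path walk thus moves out of the IO path entirely: the audit-time cost is a single record insert, and path reconstruction happens on demand, outside the system where the record originated.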