
    Enabling object storage via shims for grid middleware

    The Object Store model has quickly become the basis of most commercially successful mass storage infrastructure, backing so-called "Cloud" storage such as Amazon S3, but also underlying the implementation of most parallel distributed storage systems. Many of the assumptions in Object Store design are similar, but not identical, to concepts in the design of Grid Storage Elements, although the requirement for "POSIX-like" filesystem structures on top of SEs makes the disjunction seem larger. As modern Object Stores provide many features that most Grid SEs do not (block-level striping, parallel access, automatic file repair, etc.), it is of interest to see how easily we can provide interfaces to typical Object Stores via plugins and shims for Grid tools, and how well experiments can adapt their data models to them. We present an evaluation of, and first-deployment experiences with, interfaces such as Xrootd-Ceph for direct object-store access, as part of an initiative within GridPP [1] hosted at RAL. Additionally, we discuss the trade-offs and experience of developing plugins for the currently popular Ceph parallel distributed filesystem for the GFAL2 access layer at Glasgow.
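
    As a rough illustration of the object-store access model discussed above (not the actual GFAL2 or Xrootd-Ceph plugin code), the following Python sketch stores and retrieves an object through an S3-compatible gateway such as Ceph's RADOS Gateway; the endpoint, bucket, key names and credentials are placeholders.

        import boto3

        # Hypothetical endpoint and credentials for an S3-compatible gateway
        # (e.g. a Ceph RADOS Gateway); replace with site-specific values.
        s3 = boto3.client(
            "s3",
            endpoint_url="https://objectstore.example.org",
            aws_access_key_id="ACCESS_KEY",
            aws_secret_access_key="SECRET_KEY",
        )

        BUCKET = "gridpp-demo"  # placeholder bucket name

        # Store a byte payload under a flat object key (no POSIX directory tree).
        s3.put_object(Bucket=BUCKET, Key="run2024/event-000001.root",
                      Body=b"detector payload")

        # Retrieve the same object; data is addressed by key, not by path lookup.
        obj = s3.get_object(Bucket=BUCKET, Key="run2024/event-000001.root")
        print(len(obj["Body"].read()), "bytes read back")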

    Assise: Performance and Availability via NVM Colocation in a Distributed File System

    The adoption of very low latency persistent memory modules (PMMs) upends the long-established model of disaggregated file system access. Instead, by colocating computation and PMM storage, we can provide applications much higher I/O performance, sub-second application failover, and strong consistency. To demonstrate this, we built the Assise distributed file system, based on a persistent, replicated coherence protocol for managing a set of server-colocated PMMs as a fast, crash-recoverable cache between applications and slower disaggregated storage, such as SSDs. Unlike disaggregated file systems, Assise maximizes locality for all file I/O by carrying out I/O on colocated PMM whenever possible, and minimizes coherence overhead by maintaining consistency at I/O operation granularity rather than at fixed block sizes. We compare Assise to Ceph/BlueStore, NFS, and Octopus on a cluster with Intel Optane DC PMMs and SSDs for common cloud applications and benchmarks, such as LevelDB, Postfix, and FileBench. We find that Assise improves write latency by up to 22x, throughput by up to 56x, and failover time by up to 103x, and scales up to 6x better than its counterparts, while providing stronger consistency semantics. Assise promises to beat the MinuteSort world record by 1.5x.
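
    The general pattern described above, persisting a write on fast local storage and on a replica before acknowledging it, can be sketched as follows. This is a conceptual illustration only, not Assise's coherence protocol; the directories stand in for a local PMM and a remote replica's PMM, and their paths are assumptions.

        import os

        LOCAL_PMM = "/tmp/pmm_local"    # stand-in for a local persistent-memory mount
        REPLICA   = "/tmp/pmm_replica"  # stand-in for a remote replica's PMM

        os.makedirs(LOCAL_PMM, exist_ok=True)
        os.makedirs(REPLICA, exist_ok=True)

        def replicated_write(name: str, data: bytes) -> None:
            """Persist locally first, then to the replica, before acknowledging."""
            for root in (LOCAL_PMM, REPLICA):       # in a real system the second
                path = os.path.join(root, name)     # copy goes over the network
                fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
                try:
                    os.write(fd, data)
                    os.fsync(fd)                    # durability point for this copy
                finally:
                    os.close(fd)
            # Only now is the write acknowledged to the application.

        replicated_write("queue-0001", b"message body")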

    From FPGA to ASIC: A RISC-V processor experience

    This work documents a correct design flow using these tools for the Lagarto RISC-V processor, and the RTL design considerations that must be taken into account when moving from an FPGA design to an ASIC design.

    Service-oriented models for audiovisual content storage

    What are the important topics to understand if you are involved with storage services that hold digital audiovisual content? This report looks at how content is created and moves into and out of storage; the storage-service value networks and architectures found now and expected in the future; what sort of data transfer is expected to and from an audiovisual archive; which transfer protocols to use; and a summary of security and interface issues.

    I/O benchmarking of data intensive applications

    The increasing computerization of society over the last decade has led to growing volumes of data stored around the world. The need to handle and store these massive amounts of data, arising from sources as diverse as scientific records, web pages, or social networks, has created a new class of application: data-intensive applications. Such applications are usually designed around their specific requirements, and one of the most challenging questions is the choice of an appropriate storage back-end. I/O benchmarking tools can ease this decision process. However, despite their variety, there is a lack of portable and easily adaptable benchmarks that correspond to real application behavior. The programmable I/O benchmark Parabench tries to close this gap. Its input is based on access patterns, which can be adjusted to the application for which the system is to be used. Our work concentrates on the ability of Parabench to mimic real applications. We describe its capabilities for handling MPI-I/O and POSIX, and present a modeling example of a data-intensive application from the field of business intelligence.
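
    To make the idea of an access-pattern-driven benchmark concrete, the Python sketch below times a sequential-write pattern and a random-read pattern against a plain POSIX file. The pattern description format and the file path are invented for illustration; this is not Parabench's actual input language.

        import os, random, time

        def run_pattern(path: str, pattern: dict) -> float:
            """Replay a simple access pattern and return the elapsed wall-clock time."""
            block = b"x" * pattern["block_size"]
            start = time.perf_counter()
            if pattern["op"] == "write_seq":
                with open(path, "wb") as f:
                    for _ in range(pattern["count"]):
                        f.write(block)
                    f.flush()
                    os.fsync(f.fileno())           # include the durability cost
            elif pattern["op"] == "read_rand":
                size = os.path.getsize(path)
                with open(path, "rb") as f:
                    for _ in range(pattern["count"]):
                        f.seek(random.randrange(0, max(size - pattern["block_size"], 1)))
                        f.read(pattern["block_size"])
            return time.perf_counter() - start

        # Hypothetical patterns mimicking a log-writing and an index-probing phase.
        patterns = [
            {"op": "write_seq", "block_size": 64 * 1024, "count": 1024},
            {"op": "read_rand", "block_size": 4 * 1024, "count": 2048},
        ]
        for p in patterns:
            print(p["op"], f"{run_pattern('/tmp/iobench.dat', p):.3f} s")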

    File system metadata virtualization

    The advance of computing systems has brought new ways to use and access stored data that push the architecture of traditional file systems to its limits, making them inadequate to handle the new needs. Current challenges affect both the performance of high-end computing systems and their usability from the applications' perspective. On one side, high-performance computing equipment is rapidly developing into large-scale aggregations of computing elements in the form of clusters, grids or clouds. On the other side, there is a widening range of scientific and commercial applications that seek to exploit these new computing facilities. The requirements of such applications are also heterogeneous, leading to dissimilar patterns of use of the underlying file systems.

    Data centres have tried to compensate for this situation by providing several file systems to fulfil distinct requirements. Typically, the different file systems are mounted on different branches of a directory tree, and the preferred use of each branch is publicised to users. A similar approach is used in personal computing devices: in a personal computer, there is a visible and clear distinction between the portion of the file system name space dedicated to local storage, the part corresponding to remote file systems and, recently, the areas linked to cloud services, for example, directories that keep data synchronized across devices, that are shared with other users, or that are remotely backed up. In practice, this approach compromises the usability of the file systems and the possibility of exploiting all their potential benefits.

    We consider that this burden can be alleviated by determining applicable features on a per-file basis, rather than associating them with a location in a static, rigid name space. Moreover, usability would be further increased by providing multiple dynamic name spaces that could be adapted to specific application needs. This thesis contributes to this goal by proposing a mechanism to decouple the user view of the storage from its underlying structure. The mechanism consists in the virtualization of file system metadata (including both the name space and the object attributes) and the interposition of a sensible layer that decides where and how files should be stored in order to benefit from the underlying file system features, without incurring usability or performance penalties due to inadequate usage. This technique makes it possible to present multiple, simultaneous virtual views of the name space and the file system object attributes that can be adapted to specific application needs without altering the underlying storage configuration.

    The first contribution of the thesis introduces the design of a metadata virtualization framework that makes the above-mentioned decoupling possible; the second contribution consists in a method to improve file system performance in large-scale systems by using such a metadata virtualization framework; finally, the third contribution consists in a technique to improve the usability of cloud-based storage systems in personal computing devices.
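
    A minimal illustration of the decoupling described above might look like the following Python sketch, which maps a single virtual name space onto different backend directories according to per-file rules. The rule table, backend mounts and paths are assumptions for illustration, not the thesis's actual framework.

        import os

        # Hypothetical backend mounts with different characteristics.
        BACKENDS = {
            "fast":    "/mnt/ssd_pool",
            "archive": "/mnt/tape_staging",
            "shared":  "/mnt/cloud_sync",
        }

        # Per-file placement rules keyed on file extension; in a real metadata
        # virtualization layer this decision could use any object attribute.
        RULES = [(".tmp", "fast"), (".mkv", "archive"), (".doc", "shared")]

        def resolve(virtual_path: str) -> str:
            """Translate a path in the virtual name space to its backend location."""
            backend = "fast"                   # default placement
            for suffix, target in RULES:
                if virtual_path.endswith(suffix):
                    backend = target
                    break
            return os.path.join(BACKENDS[backend], virtual_path.lstrip("/"))

        # The application sees one name space; placement is decided per file.
        print(resolve("/projects/demo/render.mkv"))  # -> /mnt/tape_staging/projects/demo/render.mkv
        print(resolve("/projects/demo/notes.doc"))   # -> /mnt/cloud_sync/projects/demo/notes.doc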