18 research outputs found

    A File System Level Snapshot In Ext4

    A snapshot makes a copy of the current working system. Creating a snapshot with some processing and overheads is important to provide a reliable data service during backup. There are two types of snapshot techniques: the volume-based approach and the system-based approach. The volume-based Logical Volume Manager provides compact space usability, but requires some space to save the snapshot and burdens the system with extra overheads. The system-based snapshot works better than the volume-based one. The file-system-based snapshot does not need reserved space for the snapshot. Therefore, the system-based snapshot is considered a good solution in computing environments where large-scale space management capability is not that critical. However, such a snapshot feature is available in Linux kernels 2.2 and 2.6 for the ext3 file system. In this paper, we propose a system-based snapshot for the ext4 file system in Linux kernel 2.6 or higher. The main concept of the system-based snapshot comes from the old-version system-based snapshot. Keywords: Snapcreate, kernel, Inode, SnapFS

    Building regulatory compliant storage systems

    In the past decade, informational records have become entirely digital. These include financial statements, health care records, student records, private consumer information and other sensitive data. Because of the delicate nature of the data these records contain, Congress and the courts have begun to recognize the importance of properly storing and securing electronic records. Examples of legislation include the Health Insurance Portability and Accountabilit

    Analyzing Metadata Performance in Distributed File Systems

    Distributed file systems are important building blocks in modern computing environments. The challenge of increasing I/O bandwidth to files has been largely resolved by the use of parallel file systems and sufficient hardware. However, determining the best means by which to manage large amounts of metadata, which contains information about files and directories stored in a distributed file system, has proved a more difficult challenge. The objective of this thesis is to analyze the role of metadata and present past and current implementations and access semantics. Understanding the development of the current file system interfaces and functionality is key to understanding their performance limitations. Based on this analysis, a distributed metadata benchmark termed DMetabench is presented. DMetabench significantly improves on existing benchmarks and allows metadata operations in a distributed file system to be stressed in a parallelized manner. Both intra-node and inter-node parallelism, current trends in computer architecture, can be explicitly tested with DMetabench. This matters because a distributed file system can have different semantics inside a client node than between multiple nodes. As measurements in larger distributed environments may exhibit performance artifacts that are difficult to explain by reference to average numbers, DMetabench uses a time-logging technique to record time-related changes in the performance of metadata operations and also logs additional details of the runtime environment for post-benchmark analysis. Using the large production file systems at the Leibniz Supercomputing Center (LRZ) in Munich, the functionality of DMetabench is evaluated by means of measurements on different distributed file systems. The results not only demonstrate the effectiveness of the methods proposed but also provide unique insight into the current state of metadata performance in modern file systems.

    Autonomous storage management for low-end computing environments

    To make storage management transparent to users, enterprises rely on expensive storage infrastructure, such as high-end storage appliances, tape robots, and offsite storage facilities, maintained by full-time professional system administrators. From the user's perspective, access to data is seamless regardless of location, backup requires no periodic, manual action by the user, and help is available to recover from storage problems. The equipment and administrators protect users from the loss of data due to failures, such as device crashes, user errors, or viruses, as well as from being inconvenienced by the unavailability of critical files. Home users and small businesses must manage increasing amounts of important data distributed among an increasing number of storage devices. At the same time, expert system administration and specialized backup hardware are rarely available in these environments, due to their high cost. Users must make do with error-prone, manual, and time-consuming ad hoc solutions, such as periodically copying data to an external hard drive. Non-technical users are likely to make mistakes, which could result in the loss of a critical piece of data, such as a tax return, customer database, or an irreplaceable digital photograph. In this thesis, we show how to provide transparent storage management for home and small business users. We introduce two new systems: The first, PodBase, transparently ensures availability and durability for mobile, personal devices that are mostly disconnected. The second, SLStore, provides enterprise-level data safety (e.g. protection from user error, software faults, or virus infection) without requiring expert administration or expensive hardware. Experimental results show that both systems are feasible, perform well, require minimal user attention, and do not depend on expert administration during disaster-free operation. PodBase relieves home users of many of the burdens of managing data on their personal devices.
    In the home environment, users typically have a large number of personal devices, many of them mobile, each of which contains storage, and which connect to each other intermittently. Each of these devices contains data that must be made durable and available on other storage devices. Ensuring durability and availability is difficult and tiresome for non-expert users, as they must keep track of what data is stored on which devices. PodBase transparently ensures the durability of data despite the loss or failure of a subset of devices; at the same time, PodBase aims to make data available on all the devices appropriate for a given data type. PodBase takes advantage of storage resources and network bandwidth between devices that typically go unused. The system uses an adaptive replication algorithm, which makes replication transparent to the user, even when complex replication strategies are necessary. Results from a prototype deployment in a small community of users show that PodBase can ensure the durability and availability of data stored on personal devices under a wide range of conditions with minimal user attention. Our second system, SLStore, brings enterprise-level data protection to home office and small business computing. It ensures that data can be recovered despite incidents like accidental data deletion, data corruption resulting from software errors or security breaches, or even catastrophic storage failure. However, unlike enterprise solutions, SLStore does not require professional system administrators, expensive backup hardware, or routine, manual actions on the part of the user. The system relies on storage leases, which ensure that data cannot be overwritten for a pre-determined period, and an adaptive storage management layer which automatically adapts the level of backup to the storage available. We show that this system is practical, reliable, and easy to manage, even in the presence of hardware and software faults.

    Rethinking the I/O Stack for Persistent Memory

    Modern operating systems have been designed around the hypotheses that (a) memory is both byte-addressable and volatile and (b) storage is block-addressable and persistent. The arrival of new Persistent Memory (PM) technologies has made these assumptions obsolete. Despite much of the recent work in this space, the need for consistently sharing PM data across multiple applications remains an urgent, unsolved problem. Furthermore, the availability of simple yet powerful operating system support remains elusive. In this dissertation, we propose and build The Region System, a high-performance operating system stack for PM that implements usable consistency and persistence for application data. The region system provides support for consistently mapping and sharing data resident in PM across user application address spaces. The region system creates a novel IPI-based PMSYNC operation, which ensures atomic persistence of mapped pages across multiple address spaces. This allows applications to consume PM using the well-understood and much-desired memory-like model with an easy-to-use interface. Next, we propose a metadata structure without any redundant metadata to reduce CPU cache flushes. The high-performance design minimizes the expensive PM ordering and durability operations by embracing a minimalistic approach to metadata construction and management. To strengthen the case for the region system, in this dissertation, we analyze different types of applications to identify their dependence on memory-mapped data usage, and propose user-level libraries LIBPM-R and LIBPMEMOBJ-R to support shared persistent containers. The user-level libraries along with the region system demonstrate a comprehensive end-to-end software stack for consuming the PM devices.

    Building high-performance web-caching servers
