27 research outputs found
Clockwise: a mixed-media file system
This paper presents Clockwise, a mixed-media file system. The primary goal of Clockwise is to provide a storage architecture that supports the storage and retrieval of best-effort and real-time file system data. Clockwise provides an abstraction called a dynamic partition that groups lists of related (large) blocks on one or more disks. Dynamic partitions can grow and shrink in size and reading or writing of dynamic partitions can be scheduled explicitly. With respect to scheduling, Clockwise uses a novel strategy to pre-calculate schedule slack time and it schedules best-effort requests before queued real-time requests in this slack tim
CRAID: Online RAID upgrades using dynamic hot data reorganization
Current algorithms used to upgrade RAID arrays typically require large amounts of data to be migrated, even those that move only the minimum amount of data required to keep a balanced data load. This paper presents CRAID, a self-optimizing RAID array that performs an online block reorganization of frequently used, long-term accessed data in order to reduce this migration even further. To achieve this objective, CRAID tracks frequently used, long-term data blocks and copies them to a dedicated partition spread across all the disks in the array. When new disks are added, CRAID only needs to extend this process to the new devices to redistribute this partition, thus greatly reducing the overhead of the upgrade process. In addition, the reorganized access patterns within this partition improve the array’s performance, amortizing the copy overhead and allowing CRAID to offer a performance competitive with traditional RAIDs.
We describe CRAID’s motivation and design and we evaluate it by replaying seven real-world workloads including a file server, a web server and a user share. Our experiments show that CRAID can successfully detect hot data variations and begin using new disks as soon as they are added to the array. Also, the usage of a dedicated
partition improves the sequentiality of relevant data access, which amortizes the cost of reorganizations. Finally, we prove that a full-HDD CRAID array with a small distributed partition (<1.28% per disk) can compete in performance with an ideally restriped RAID-5 and a hybrid RAID-5 with a small SSD cache.Peer ReviewedPostprint (published version
Cut-and-paste file-systems: integrating simulators and file-systems
We have implemented an integrated and configurable file system called the PFS and a trace-driven file-system simulator called Patsy. Patsy is used for off-line analysis of file-system algorithms, PFS is used for on-line file-system data storage. Algorithms are first analyzed in Patsy and when we are satisfied\ud
with the performance results, migrated into PFS for on-line usage. Since Patsy and PFS are derived from a common cut-and-paste file-system framework, this migration proceeds smoothly.\ud
We have found this integration quite useful: algorithm bottlenecks have been found through Patsy that could have led to performance degradations in PFS. Off-line simulators are simpler to analyze compared to on-line file-systems because a work load can repeatedly be replayed on the same off-line simulator. This is almost impossible in on-line file-systems since it is hard to provide similar conditions for each experiment run. Since simulator and file-system are integrated (hence, use the same code), experiment results from the simulator have relevance in the real system. \ud
This paper describes the cut-and-paste framework, the instantiation of the framework to PFS and Patsy and finally, some of the experiments we conducted in Patsy
Introduction to Multiprocessor I/O Architecture
The computational performance of multiprocessors continues to improve by leaps and bounds, fueled in part by rapid improvements in processor and interconnection technology. I/O performance thus becomes ever more critical, to avoid becoming the bottleneck of system performance. In this paper we provide an introduction to I/O architectural issues in multiprocessors, with a focus on disk subsystems. While we discuss examples from actual architectures and provide pointers to interesting research in the literature, we do not attempt to provide a comprehensive survey. We concentrate on a study of the architectural design issues, and the effects of different design alternatives
Recommended from our members
Logging versus Soft Updates: Asynchronous Meta-data Protection in File Systems
The UNIX Fast File System (FFS) is probably the most widely-used file system for performance comparisons. However, such comparisons frequently overlook many of the performance enhancements that have been added over the past decade. In this paper, we explore the two most commonly used approaches for improving the performance of meta-data operations and recovery: logging and Soft Updates. The commercial sector has moved en masse to logging file systems, as evidenced by their presence on nearly every server platform available today: Solaris, AIX, Digital UNIX, HP-UX, Irix, and Windows NT. On all but Solaris, the default file system uses logging. In the meantime, Soft Updates holds the promise of providing stronger reliability guarantees than logging, with faster recovery and superior performance in certain boundary cases. In this paper, we explore the benefits of both Soft Updates and logging, comparing their behavior on both microbenchmarks and workload-based macrobenchmarks. We find that logging alone is not sufficient to “solve” the meta-data update problem. If synchronous semantics are required (i.e., meta-data operations are durable once the system call returns), then the logging systems cannot realize their full potential. Only when this synchronicity requirement is relaxed can logging systems approach the performance of systems like Soft Updates. Our asynchronous logging and Soft Updates systems perform comparably in most cases. While Soft Updates excels in some meta-data intensive microbenchmarks, it outperforms logging on only two of the four workloads we examined and performs less well on one.Engineering and Applied Science
Cooperative caching and prefetching in parallel/distributed file systems
If we examine the structure of the applications that run on parallel machines, we observe that their I/O needs increase tremendously every day. These applications work with very large data sets which, in most cases, do not fit in memory and have to be kept in the disk. The input and output data files are also very large and have to be accessed very fast. These large applications also want to be able to checkpoint themselves without wasting too much time. These facts constantly increase the expectations placed on parallel and distributed file systems. Thus, these file systems have to improve their performance to avoid becoming the bottleneck in parallel/distributed environments. On the other hand, while the performance of the new processors, interconnection networks and memory increases very rapidly, no such thing happens with the disk performance. This lack of improvement is due to the mechanical parts used to build the disks. These components are slow and limit both the latency and t..