thesis

Matching distributed file systems with application workloads

Abstract

Modern storage systems have a large number of configurable parameters, distributed over many layers of abstraction. The number of combinations of these parameters, that can be altered to create an instance of such a system, is enormous. In practise, many of these parameters are never altered; instead default values, intended to support generic workloads and access patterns, are used. As systems become larger and evolve to support different workloads, the appropriateness of using default parameters in this way comes into question. This thesis examines the implications of changing some of these parameters and explores the effects these changes have on performance. As part of that work multiple contributions have been made, including the creation of a structured method to create and evaluate different storage configurations, choosing appropriate access sizes for the evaluation, picking representative cloud workloads and capturing storage traces for further analysis, extraction of the workload storage characteristics, creating logical partitions of the distributed file system used for the optimization, the creation of heterogeneous storage pools within the homogeneous system and the mapping and evaluation of the chosen workloads to the examined configurations

    Similar works