
    Towards a Storage Stack for the Data Center

    The storage stack in a data center consists of all the hardware and software layers involved in processing and persisting data to durable storage. The shift of the world's computation to data centers is placing significant strain on the storage stack, leading to a stack that is unreliable and non-performant. This is caused in large part by a lack of understanding of the failure and performance characteristics of critical hardware components, and a lack of programmability and control over the numerous software layers in the stack. The broad goal of this thesis is to improve the storage stack by leveraging insights gained from empirical studies of real-world production systems, and by proposing a new paradigm for implementing and enhancing distributed storage functionality that enables the vertical specialization of the storage stack to a wide variety of customer and data center provider needs. The first part of this thesis studies the reliability of main memory in large-scale production systems. Our findings show that conventional wisdom about memory reliability is incorrect, and that physical hardware is in fact the main culprit for most errors in main memory in the field. As a result, existing memory error protection mechanisms are inadequate. We then use the insights gained from the empirical study to propose and evaluate a suitable error protection mechanism for future data centers. The second part of this thesis offers an empirical study of the effects of temperature on the performance and power consumption of the storage stack. Since cooling constitutes a large fraction of the total cost of ownership in a data center, increasing temperatures in a data center without sacrificing performance can have a huge impact on the power consumption and carbon footprint of data centers. 
The final part of this thesis proposes a new paradigm for implementing and enhancing distributed storage functionality by creating programmable APIs that allow dynamic configuration and control of the software stages along the storage stack, and designing and implementing an IO routing primitive for the storage stack.

    Ph.D.