64 research outputs found

    Workload characterization and synthesis for data center optimization

    Workload Prediction for Efficient Performance Isolation and System Reliability

    In large-scale distributed systems, such as multi-tier storage systems and cloud data centers, resource sharing among workloads brings multiple benefits while introducing many performance challenges. The key to effective workload multiplexing is accurate workload prediction. This thesis focuses on how to capture the salient characteristics of real-world workloads in order to develop workload prediction methods and to drive scheduling and resource allocation policies that achieve efficient and timely resource isolation among applications. For a multi-tier storage system, high-priority user work is often multiplexed with low-priority background work. This brings the challenge of how to strike a balance between maintaining user performance and maximizing the amount of finished background work. In this thesis, we propose two resource isolation policies based on different workload prediction methods: one is Markovian model-based and the other is neural network-based. These policies aim, via workload prediction, to discover the opportune time to schedule background work with minimum impact on user performance. Trace-driven simulations verify the efficiency of the two proposed resource isolation policies. The Markovian model-based policy successfully schedules the background work at the appropriate periods with small impact on user performance. The neural network-based policy adaptively schedules user and background work, consistently meeting both performance requirements. This thesis also proposes an accurate yet efficient neural network-based prediction method for data center usage series, called PRACTISE. Unlike traditional neural networks for time series prediction, PRACTISE selects the most informative features from the past observations of the time series itself.
Testing on a large set of usage series from production data centers illustrates the accuracy (e.g., prediction error) and efficiency (e.g., time cost) of PRACTISE. The superiority of the usage prediction also allows proactive resource management in highly virtualized cloud data centers. In this thesis, we analyze the performance tickets in cloud data centers and propose an active sizing algorithm, named ATM, that predicts the usage workloads and re-allocates capacity to workloads to avoid VM performance tickets. Moreover, driven by cheap prediction of usage tails, we also present TailGuard, which dynamically clones VMs among co-located boxes in order to efficiently reduce the performance violations of physical boxes in cloud data centers.

    Minimally Invasive Solutions to Challenges Posed by Mobility Changes

    Today, things have changed radically. As network technologies have proliferated and evolved, the components of, and participants in, computerized systems have become increasingly decoupled. Users travel and commute while connecting to their office computer or home media server. Hardware devices may be carried by users, move on their own, or reside in data centers, never to be seen or touched by end-users. Even operating systems (OSes) and applications may now migrate across the network while executing, thanks to advances in virtualization that are only just beginning to remake the computing landscape. The decoupling of users, devices, and software has invalidated properties that enabled desired functionality, resulting in compromised function. Power interfaces utilize physical user interactions to determine when transitions between high and lower power states should occur; what happens when users are no longer physically present? Operating system execution often relies on components such as CPU and local disk responding with tightly bounded delays; what should be done when the OS itself is in the process of migrating between two separate physical machines? The fundamental question explored by this dissertation is: Can we find highly adoptable solutions to restore desired functionality that has been lost because of changed mobility characteristics? Our emphasis on adoptability stems from pragmatic concerns: if a solution is difficult to adopt, it is highly unlikely to be used. Consequently, while many potential approaches may involve changes to the network itself, our work focuses on modifying end-point behavior. We show that practical solutions implemented solely in software and deployed only on network endpoints can be developed for a wide range of problems. We consider concrete challenges arising from user, device, and software mobility changes, affecting sub-disciplines spanning cloud computing, green computing, and wireless networks.
Cloud Computing: Users increasingly utilize virtual machine (VM) technology to migrate and replicate OS and software amongst networked hosts. Traditional execution required one VM image copy on each host's local storage. By transitioning to networked execution, dozens, if not hundreds, of VM replicas may now be distributed from a single networked storage location to a commensurately large set of physical machines. As these systems expand, they have come to be plagued by boot storms (and similar problems) caused when networked access to storage becomes a major bottleneck, drastically delaying VM distribution and execution. Can we develop techniques that resolve this network bottleneck without the need for expensive hardware over-provisioning? Green Computing: Remote access technologies have enabled users to travel while still interacting with computational machinery left in the office or home. Yet, energy savings mechanisms have traditionally relied on the activity of attached peripherals to determine power usage. The shift to remote interaction, which bypasses physically attached peripherals, has effectively broken these energy savings mechanisms. Can we build an economic and practical system that accommodates energy efficiency without compromising the fluid remote interactions users have now come to expect? Wireless Computing: Increasingly advanced mobile devices have provoked a shift towards heavy use of 3G and 4G bandwidth. Accordingly, the capacity of infrastructure wireless networks becomes increasingly strained. Can we find a way of supplementing this relatively low-latency infrastructure with high-latency, high-bandwidth opportunistic content exchange? In each scenario, we design a solution that aims to strike the proper balance between adoptability and technical efficiency, producing what we believe are rigorous, practical, and adoptable solutions.

    Flexibility in Data Management

    With the ongoing expansion of information technology, new fields of application requiring data management emerge virtually every day. In our knowledge culture, increasing amounts of data and a workforce organized in more creativity-oriented ways also radically change traditional fields of application and question established assumptions about data management. For instance, investigative analytics and agile software development move towards a very agile and flexible handling of data. As the primary facilitators of data management, database systems have to reflect and support these developments. However, traditional database management technology, in particular relational database systems, is built on assumptions of relatively stable application domains. The need to model all data up front in a prescriptive database schema earned relational database management systems the reputation among developers of being inflexible, dated, and cumbersome to work with. Nevertheless, relational systems still dominate the database market. They are a proven, standardized, and interoperable technology, well-known in IT departments with a workforce of experienced and trained developers and administrators. This thesis aims at resolving the growing contradiction between the popularity and omnipresence of relational systems in companies and their increasingly bad reputation among developers. It adapts relational database technology towards more agility and flexibility. We envision a descriptive schema-comes-second relational database system, which is entity-oriented instead of schema-oriented; descriptive rather than prescriptive.
The thesis provides four main contributions: (1) a flexible relational data model, which frees relational data management from having a prescriptive schema; (2) autonomous physical entity domains, which partition self-descriptive data according to their schema properties for better query performance; (3) a freely adjustable storage engine, which allows adapting the physical data layout used to properties of the data and of the workload; and (4) a self-managed indexing infrastructure, which autonomously collects and adapts index information in the presence of dynamic workloads and evolving schemas. The flexible relational data model is the thesis's central contribution. It describes the functional appearance of the descriptive schema-comes-second relational database system. The other three contributions improve components in the architecture of database management systems to increase the query performance and the manageability of descriptive schema-comes-second relational database systems. We are confident that these four contributions can help pave the way to a more flexible future for relational database management technology.