
    Workload Interleaving with Performance Guarantees in Data Centers

    In the era of global, large-scale data centers residing in clouds, many applications and users share the same pool of resources for the purposes of reducing energy and operating costs, and of improving availability and reliability. Along with these benefits, resource sharing also introduces performance challenges: when multiple workloads access the same resources concurrently, contention may occur and introduce delays in the performance of individual workloads. Providing performance isolation to individual workloads requires effective management methodologies. The challenge in deriving such methodologies lies in finding accurate, robust, compact metrics and models to drive algorithms that can meet different performance objectives while achieving efficient utilization of resources. This dissertation proposes a set of methodologies aimed at solving the challenging performance isolation problem of workload interleaving in data centers, focusing on both storage and computing components. At the storage node level, we focus on methodologies for better interleaving user traffic with background workloads, such as tasks for improving reliability, availability, and power savings. More specifically, a scheduling policy for background workload based on the statistical characteristics of the system busy periods and a methodology that quantitatively estimates the performance impact of power savings are developed. At the storage cluster level, we consider methodologies for how to efficiently conduct work consolidation and schedule asynchronous updates without violating user performance targets. More specifically, we develop a framework that can estimate beforehand the benefits and overheads of each option in order to automate the process of reaching intelligent consolidation decisions while achieving faster eventual consistency. At the computing node level, we focus on improving workload interleaving at off-the-shelf servers, as they are the basic building blocks of large-scale data centers. We develop priority scheduling middleware that employs different policies to schedule background tasks based on the instantaneous resource requirements of the high-priority applications running on the server node. Finally, at the computing cluster level, we investigate popular computing frameworks for large-scale data-intensive distributed processing, such as MapReduce and its Hadoop implementation. We develop a new Hadoop scheduler called DyScale that exploits the capabilities offered by heterogeneous cores in order to achieve a variety of performance objectives.
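    The priority-scheduling middleware mentioned above lends itself to a compact illustration. The Python sketch below is hypothetical (the dissertation's code is not reproduced here); it shows one plausible policy of the kind described, gating background tasks on the instantaneous CPU headroom left by the high-priority application. The names cpu_headroom, HEADROOM_THRESHOLD, and run_background_tasks, and the use of the psutil library, are illustrative assumptions.

        # Hypothetical sketch: run background work only while the
        # high-priority application leaves enough CPU headroom.
        import time
        import psutil  # third-party: pip install psutil

        HEADROOM_THRESHOLD = 30.0  # percent of CPU that must stay free

        def cpu_headroom() -> float:
            # Free CPU capacity, in percent, over a short sampling window.
            return 100.0 - psutil.cpu_percent(interval=0.5)

        def run_background_tasks(tasks):
            # Drain a queue of short background work units, yielding to
            # the foreground workload whenever headroom drops too low.
            while tasks:
                if cpu_headroom() >= HEADROOM_THRESHOLD:
                    task = tasks.pop(0)
                    task()           # run one short work unit
                else:
                    time.sleep(1.0)  # back off while foreground is busy

    In practice such middleware would also track memory, disk, and network pressure, but the CPU-only version conveys the scheduling idea.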

    Low-power system design techniques for mobile computers

    Portable products are being used increasingly. Because these systems are battery powered, reducing power consumption is vital. In this report we give the properties of low-power design and techniques for exploiting them in the architecture of the system. We focus on: minimizing capacitance, avoiding unnecessary and wasteful activity, and reducing voltage and frequency. We review energy-reduction techniques in the architecture and design of a hand-held computer and the wireless communication system, including error control, system decomposition, communication and MAC protocols, and low-power short-range networks.

    Design techniques for low-power systems

    Portable products are being used increasingly. Because these systems are battery powered, reducing power consumption is vital. In this report we give the properties of low-power design and techniques for exploiting them in the architecture of the system. We focus on: minimizing capacitance, avoiding unnecessary and wasteful activity, and reducing voltage and frequency. We review energy-reduction techniques in the architecture and design of a hand-held computer and the wireless communication system, including error control, system decomposition, communication and MAC protocols, and low-power short-range networks.
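    Both low-power reports above rest on the same first-order model of dynamic power dissipation in CMOS logic, which is worth stating explicitly (this formulation is standard background, not quoted from the reports):

        P_{\mathrm{dyn}} = \alpha \, C \, V_{dd}^{2} \, f

    where \alpha is the switching-activity factor, C the switched capacitance, V_{dd} the supply voltage, and f the clock frequency. The three techniques the authors list map directly onto the factors: minimizing capacitance attacks C, avoiding unnecessary and wasteful activity attacks \alpha, and reducing voltage and frequency attacks V_{dd}^{2} f. Because the achievable clock frequency falls roughly linearly as V_{dd} is lowered, scaling voltage and frequency together yields approximately cubic power savings, and quadratic savings in energy per operation.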

    Test, Control and Monitor System (TCMS) operations plan

    The purpose is to provide a clear understanding of the Test, Control and Monitor System (TCMS) operating environment and to describe the method of operations for TCMS. TCMS is a complex and sophisticated checkout system focused on support of the Space Station Freedom Program (SSFP) and related activities. An understanding of the TCMS operating environment is provided and operational responsibilities are defined. NASA and the Payload Ground Operations Contractor (PGOC) will use this plan as a guide to manage the operation of the TCMS computer systems and associated networks and workstations. All TCMS operational functions are examined. Other plans and detailed operating procedures relating to an individual operational function are referenced within this plan. This plan augments existing Technical Support Management Directives (TSMDs), Standard Practices, and other management documentation, which will be followed where applicable.

    Space station automation of common module power management and distribution

    The purpose is to automate a breadboard-level Power Management and Distribution (PMAD) system which possesses many functional characteristics of a specified Space Station power system. The automation system was built upon a 20 kHz ac source with redundant power buses. There are two power distribution control units that furnish power to six load centers, which in turn enable load circuits based upon a system-generated schedule. The progress in building this specified autonomous system is described. Automation of Space Station Module PMAD was accomplished by segmenting the complete task into the following four independent tasks: (1) develop a detailed approach for PMAD automation; (2) define the software and hardware elements of automation; (3) develop the automation system for the PMAD breadboard; and (4) select an appropriate host processing environment.
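    As a rough illustration of the schedule-driven operation the abstract describes (distribution control units feeding six load centers that enable load circuits per a system-generated schedule), consider the toy Python model below. It is an assumption only; the breadboard's actual software, data layout, and names (LoadCenter, apply, schedule) are not taken from the source.

        # Toy model of schedule-driven load enabling; all names hypothetical.
        from dataclasses import dataclass, field

        @dataclass
        class LoadCenter:
            name: str
            circuits: dict = field(default_factory=dict)  # circuit -> on/off

            def apply(self, commands):
                # Enable or disable circuits per the scheduled commands.
                for circuit, enable in commands.items():
                    self.circuits[circuit] = enable

        # Six load centers, as in the breadboard description.
        centers = {f"LC{i}": LoadCenter(f"LC{i}") for i in range(1, 7)}

        # A system-generated schedule: time step -> per-center commands.
        schedule = {
            0: {"LC1": {"heater": True}, "LC4": {"pump": True}},
            1: {"LC1": {"heater": False}},
        }

        for t in sorted(schedule):
            for center, commands in schedule[t].items():
                centers[center].apply(commands)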

    ASCR/HEP Exascale Requirements Review Report

    This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude greater -- and in some cases more -- than what is currently available. 2) The growth rate of data produced by simulations is overwhelming the current ability, of both facilities and researchers, to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. 3) Data rates and volumes from HEP experimental facilities are also straining the ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. 4) A close integration of HPC simulation and data analysis will aid greatly in interpreting results from HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. 5) Long-range planning between HEP and ASCR will be required to meet HEP's research needs. To best use ASCR HPC resources, the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources, c) the ability for ASCR facilities to accommodate workflows run by collaborations that can have thousands of individual members, d) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, and e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.
    Comment: 77 pages, 13 figures; draft report, subject to further revision

    Spike: AI scheduling for Hubble Space Telescope after 18 months of orbital operations

    This paper is a progress report on the Spike scheduling system, developed by the Space Telescope Science Institute for long-term scheduling of Hubble Space Telescope (HST) observations. Spike is an activity-based scheduler that exploits artificial intelligence (AI) techniques for constraint representation and for scheduling search. The system has been in operational use since shortly after HST launch in April 1990. Spike has been adopted for several other satellite scheduling problems; of particular interest was the demonstration that the Spike framework is sufficiently flexible to handle both long-term and short-term scheduling, on timescales from years down to minutes or less. We describe recent progress in scheduling search techniques, lessons learned from early HST operations, and the application of Spike to other problem domains. We also describe plans for the future evolution of the system.
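    To make the constraint-based scheduling search concrete, the sketch below implements a generic min-conflicts repair loop of the kind used in AI scheduling. It is not Spike's actual algorithm (Spike's constraint representation and repair methods are more elaborate); the activities, time bins, and constraints here are purely illustrative.

        # Generic min-conflicts repair for assigning activities to time
        # bins under pairwise constraints. Illustrative only; not Spike.
        import random

        def conflicts(assignment, constraints):
            # Count violated constraints; each is (act_a, act_b, ok_fn).
            return sum(1 for a, b, ok in constraints
                       if not ok(assignment[a], assignment[b]))

        def min_conflicts(activities, bins, constraints, max_steps=1000):
            # Start from a random assignment, then repair violations.
            assignment = {a: random.choice(bins) for a in activities}
            for _ in range(max_steps):
                violated = [c for c in constraints
                            if not c[2](assignment[c[0]], assignment[c[1]])]
                if not violated:
                    return assignment  # all constraints satisfied
                a, b, _ok = random.choice(violated)
                victim = random.choice([a, b])
                # Move the chosen activity to its least-conflicted bin.
                assignment[victim] = min(
                    bins,
                    key=lambda t: conflicts({**assignment, victim: t},
                                            constraints))
            return assignment  # best effort within the step budget

        # Example: two observations that must not share a time bin.
        acts = ["obs1", "obs2"]
        cons = [("obs1", "obs2", lambda t1, t2: t1 != t2)]
        print(min_conflicts(acts, [0, 1, 2], cons))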