5,097 research outputs found

    Automatic Energy Saving Schemes for Parallel Applications

    Get PDF
    Although high-performance computing traditionally focuses on the efficient execution of large-scale applications, both energy and power have become critical concerns when approaching exascale. Drastic increases in the power consumption of supercomputers affect significantly their operating costs and failure rates. In modern microprocessor architectures, equipped with dynamic voltage and frequency scaling (DVFS) and CPU clock modulation (throttling), the power consumption may be controlled in software. Additionally, network interconnect, such as Infiniband, may be exploited to maximize energy savings while the application performance loss and frequency switching overheads must be carefully balanced. This work first studies two important collective communication operations, all-to-all and allgather and proposes energy saving strategies on the per-call basis. Next, it targets point-to-point communications to group them into phases and apply frequency scaling to them to save energy by exploiting the architectural and communication stalls. Finally, it proposes an automatic runtime system which combines both collective and point-to-point communications into phases, and applies throttling to them apart from DVFS to maximize energy savings. The experimental results are presented for NAS parallel benchmark problems as well as for the realistic parallel electronic structure calculations performed by the widely used quantum chemistry package GAMESS. Close to the maximum energy savings were obtained with a substantially low performance loss on the given platform

    Introduction to the Computation Offloading from Mobile Devices to the Edge of Mobile Network

    Get PDF
    This paper introduces the concept of Small Cell Cloud (SCC) composed of multiple Cloud-enabled Small Cells (CeSCs), which provide radio connection for mobile User Equipment (UE) such as smart-phones or wearables such as smart glasses. Moreover, CeSCs host computations offloaded from UEs in a way similar to centralized cloud, yet different in its proximity to users. Proposed client-server architecture of SCC con-veys mechanisms for moving offloaded computations from the UEs to CeSCs. Real-life implementation of the SCC architecture relies on custom-developed Of-floading Framework which is responsible for low-level communication between the UE and the SCC. The Of-floading Framework is accompanied by an Augmented Reality (AR) app, which employs intensive computa-tions for discovery of places of interest. Such app is latency-sensitive, a criterion which makes computation offloading beneficial due to its ability to decrease la-tency. The combination of the O˜oading Framework and the AR app makes up an SCC testbed used for fur-ther performance evaluation. Numerous measurements are carried out to examine the impact of various pa-rameters. Based on Proof-of-concept implementation and thorough measurements, it has been revealed that computation offloading can decrease overall latency as much as to 47 % and energy consumption on the UE side to 56

    Algorithms for advance bandwidth reservation in media production networks

    Get PDF
    Media production generally requires many geographically distributed actors (e.g., production houses, broadcasters, advertisers) to exchange huge amounts of raw video and audio data. Traditional distribution techniques, such as dedicated point-to-point optical links, are highly inefficient in terms of installation time and cost. To improve efficiency, shared media production networks that connect all involved actors over a large geographical area, are currently being deployed. The traffic in such networks is often predictable, as the timing and bandwidth requirements of data transfers are generally known hours or even days in advance. As such, the use of advance bandwidth reservation (AR) can greatly increase resource utilization and cost efficiency. In this paper, we propose an Integer Linear Programming formulation of the bandwidth scheduling problem, which takes into account the specific characteristics of media production networks, is presented. Two novel optimization algorithms based on this model are thoroughly evaluated and compared by means of in-depth simulation results

    Automatic application object migration in sensor networks

    Get PDF
    Object migration in wireless sensor networks has the potential to reduce energy consumption for a wireless sensor network mesh. Automated migration reduces the need for the programmer to perform manual static analysis to find an efficient layout solution. Instead, the system can self-optimise and adjust to changing conditions. This paper describes an automated, transparent object migration system for wireless sensor networks, implemented on a micro Java virtual machine. The migration system moves objects at runtime around the sensor mesh to reduce communication overheads. The movement of objects is transparent to the application developer. Automated transparent object migration is a core component of Hydra, a distributed operating system for wireless sensor networks that is currently under development. Performance of the system under a complex performance test scenario using a real-world dataset of seismic events is described. The results show that under both simple and complex conditions the migration technique can result in lower data traffic and consequently lower overall energy cost

    Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

    Get PDF
    The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

    Runtime Energy Savings Based on Machine Learning Models for Multicore Applications

    Get PDF
    To improve the power consumption of parallel applications at the runtime, modern processors provide frequency scaling and power limiting capabilities. In this work, a runtime strategy is proposed to maximize energy savings under a given performance degradation. Machine learning techniques were utilized to develop performance models which would provide accurate performance prediction with change in operating core-uncore frequency. Experiments, performed on a node (28 cores) of a modern computing platform showed significant energy savings of as much as 26% with performance degradation of as low as 5% under the proposed strategy compared with the execution in the unlimited power case
    corecore