8 research outputs found

    On the effect of exploiting gpus for a more Eco-sustainable lease of life

    Get PDF
    It has been estimated that about 2% of global carbon dioxide emissions can be attributed to IT systems. Green (or sustainable) computing refers to supporting business critical computing needs with the least possible amount of power. This phenomenon changes the priorities in the design of new software systems and in the way companies handle existing ones. In this paper, we present the results of a research project aimed to develop a migration strategy to give an existing software system a new and more eco-sustainable lease of life. We applied a strategy for migrating a subject system that performs intensive and massive computation to a target architecture based on a Graphics Processing Unit (GPU). We validated our solution on a system for path finding robot simulations. An analysis on execution time and energy consumption indicated that: (i) the execution time of the migrated system is less than the execution time of the original system; and (ii) the migrated system reduces energy waste, so suggesting that it is more eco-sustainable than its original version. Our findings improve the body of knowledge on the effect of using the GPU in green computing. © 2015 World Scientific Publishing Company

    Analysis of DVFS Techniques for Improving the GPU Energy Effi-ciency

    Get PDF
    Abstract Dynamic Voltage Frequency Scaling (DVFS) techniques are used to improve energy efficiency of GPUs. Literature survey and thorough analysis of various schemes on DVFS techniques during the last decade are presented in this paper. Detailed analysis of the schemes is included with respect to comparison of various DVFS techniques over the years. To endow with knowledge of various power management techniques that utilize DVFS during the last decade is the main objective of this paper. During the study, we find that DVFS not only work solely but also in coordination with other power optimization techniques like load balancing and task mapping where performance and energy efficiency are affected by varying the platform and benchmark. Thorough analysis of various schemes on DVFS techniques is presented in this paper such that further research in the field of DVFS can be enhanced

    Design and Synthesis of Stable, Aligned and Welded Magnesium Silicide Nanowire Assemblies for Fabrication of Efficient and Reliable Thermoelectrics

    Get PDF
    The ever-increasing energy needs of humanity, coupled with geopolitics of fossil fuels, are imposing demand on the energy supply. A perennial supply of energy is possible by tapping into renewable sources as well as making the current process more efficient. This can be achieved via the use of thermoelectrics, i.e. they can be used to generate energy by converting sunlight into electricity via solar thermoelectrics and increase the efficiency of current processes by converting the wasted heat into electricity. However, the current state-of-the-art thermoelectrics are inefficient and expensive due to the use of rare earth materials. Nanomaterials, especially nanowires, are on the forefront of advancement in thermoelectrics. Recent theoretical and experimental reports indicate the superior energy conversion performances of nanowires compared to their bulk counter parts. However, this superior performance is only observed in single-nanowire devices. Despite more than two decades of research, macro-devices based on large-scale nanowire assemblies, have not been realized. The three main roadblocks for fabricating such large-scale assemblies are lack of 1) methods for mass producing nanowires of any desired chemical composition, 2) techniques for assembling these nanowires into energy conversion devices in an interface engineered manner, and 3) techniques to stabilize the nanowires at higher temperatures against air- water- and acid-assisted degradation. In this context, the objective of this work is to design a strategy to obtain stable and efficient nanowire-based thermoelectric devices. The material systems chosen for this effort are Zinc Phosphide Zn3P2, Magnesium Silicide, Mg2Si, and Manganese Silicide MnSi1.75. Methods for mass production of Zn3P2 and Mg2Si nanowires have been realized previously via a combination of chemical vapor deposition, electroless etching and solid-state diffusion. The assembly and aligning of nanowires was achieved using shear forces via Equal Channel Angular Extrusion (a severe plastic deformation technique). Assembly of the nanowires was also achieved via welding of nanowires utilizing the solid-state diffusion. The nanowires were chemically stabilized by nonconformally decorating them with boron nitride. The stabilization and welding of nanowires helped to achieve decreased thermal conductivity thereby improving their thermoelectric performance compared to the as-obtained nanowire pellets

    Efficient implementation of resource-constrained cyber-physical systems using multi-core parallelism

    Get PDF
    The quest for more performance of applications and systems became more challenging in the recent years. Especially in the cyber-physical and mobile domain, the performance requirements increased significantly. Applications, previously found in the high-performance domain, emerge in the area of resource-constrained domain. Modern heterogeneous high-performance MPSoCs provide a solid foundation to satisfy the high demand. Such systems combine general processors with specialized accelerators ranging from GPUs to machine learning chips. On the other side of the performance spectrum, the demand for small energy efficient systems exposed by modern IoT applications increased vastly. Developing efficient software for such resource-constrained multi-core systems is an error-prone, time-consuming and challenging task. This thesis provides with PA4RES a holistic semiautomatic approach to parallelize and implement applications for such platforms efficiently. Our solution supports the developer to find good trade-offs to tackle the requirements exposed by modern applications and systems. With PICO, we propose a comprehensive approach to express parallelism in sequential applications. PICO detects data dependencies and implements required synchronization automatically. Using a genetic algorithm, PICO optimizes the data synchronization. The evolutionary algorithm considers channel capacity, memory mapping, channel merging and flexibility offered by the channel implementation with respect to execution time, energy consumption and memory footprint. PICO's communication optimization phase was able to generate a speedup almost 2 or an energy improvement of 30% for certain benchmarks. The PAMONO sensor approach enables a fast detection of biological viruses using optical methods. With a sophisticated virus detection software, a real-time virus detection running on stationary computers was achieved. Within this thesis, we were able to derive a soft real-time capable virus detection running on a high-performance embedded system, commonly found in today's smart phones. This was accomplished with smart DSE algorithm which optimizes for execution time, energy consumption and detection quality. Compared to a baseline implementation, our solution achieved a speedup of 4.1 and 87\% energy savings and satisfied the soft real-time requirements. Accepting a degradation of the detection quality, which still is usable in medical context, led to a speedup of 11.1. This work provides the fundamentals for a truly mobile real-time virus detection solution. The growing demand for processing power can no longer satisfied following well-known approaches like higher frequencies. These so-called performance walls expose a serious challenge for the growing performance demand. Approximate computing is a promising approach to overcome or at least shift the performance walls by accepting a degradation in the output quality to gain improvements in other objectives. Especially for a safe integration of approximation into existing application or during the development of new approximation techniques, a method to assess the impact on the output quality is essential. With QCAPES, we provide a multi-metric assessment framework to analyze the impact of approximation. Furthermore, QCAPES provides useful insights on the impact of approximation on execution time and energy consumption. With ApproxPICO we propose an extension to PICO to consider approximate computing during the parallelization of sequential applications

    Exploration of cyber-physical systems for GPGPU computer vision-based detection of biological viruses

    Get PDF
    This work presents a method for a computer vision-based detection of biological viruses in PAMONO sensor images and, related to this, methods to explore cyber-physical systems such as those consisting of the PAMONO sensor, the detection software, and processing hardware. The focus is especially on an exploration of Graphics Processing Units (GPU) hardware for “General-Purpose computing on Graphics Processing Units” (GPGPU) software and the targeted systems are high performance servers, desktop systems, mobile systems, and hand-held systems. The first problem that is addressed and solved in this work is to automatically detect biological viruses in PAMONO sensor images. PAMONO is short for “Plasmon Assisted Microscopy Of Nano-sized Objects”. The images from the PAMONO sensor are very challenging to process. The signal magnitude and spatial extension from attaching viruses is small, and it is not visible to the human eye on raw sensor images. Compared to the signal, the noise magnitude in the images is large, resulting in a small Signal-to-Noise Ratio (SNR). With the VirusDetectionCL method for a computer vision-based detection of viruses, presented in this work, an automatic detection and counting of individual viruses in PAMONO sensor images has been made possible. A data set of 4000 images can be evaluated in less than three minutes, whereas a manual evaluation by an expert can take up to two days. As the most important result, sensor signals with a median SNR of two can be handled. This enables the detection of particles down to 100 nm. The VirusDetectionCL method has been realized as a GPGPU software. The PAMONO sensor, the detection software, and the processing hardware form a so called cyber-physical system. For different PAMONO scenarios, e.g., using the PAMONO sensor in laboratories, hospitals, airports, and in mobile scenarios, one or more cyber-physical systems need to be explored. Depending on the particular use case, the demands toward the cyber-physical system differ. This leads to the second problem for which a solution is presented in this work: how can existing software with several degrees of freedom be automatically mapped to a selection of hardware architectures with several hardware configurations to fulfill the demands to the system? Answering this question is a difficult task. Especially, when several possibly conflicting objectives, e.g., quality of the results, energy consumption, and execution time have to be optimized. An extensive exploration of different software and hardware configurations is expensive and time-consuming. Sometimes it is not even possible, e.g., if the desired architecture is not yet available on the market or the design space is too big to be explored manually in reasonable time. A Pareto optimal selection of software parameters, hardware architectures, and hardware configurations has to be found. To achieve this, three parameter and design space exploration methods have been developed. These are named SOG-PSE, SOG-DSE, and MOGEA-DSE. MOGEA-DSE is the most advanced method of these three. It enables a multi-objective, energy-aware, measurement-based or simulation-based exploration of cyber-physical systems. This can be done in a hardware/software codesign manner. In addition, offloading of tasks to a server and approximate computing can be taken into account. With the simulation-based exploration, systems that do not exist can be explored. This is useful if a system should be equipped, e.g., with the next generation of GPUs. Such an exploration can reveal bottlenecks of the existing software before new GPUs are bought. With MOGEA-DSE the overall goal—to develop a method to automatically explore suitable cyber-physical systems for different PAMONO scenarios—could be achieved. As a result, a rapid, reliable detection and counting of viruses in PAMONO sensor data using high-performance, desktop, laptop, down to hand-held systems has been made possible. The fact that this could be achieved even for a small, hand-held device is the most important result of MOGEA-DSE. With the automatic parameter and design space exploration 84% energy could be saved on the hand-held device compared to a baseline measurement. At the same time, a speedup of four and an F-1 quality score of 0.995 could be obtained. The speedup enables live processing of the sensor data on the embedded system with a very high detection quality. With this result, viruses can be detected and counted on a mobile, hand-held device in less than three minutes and with real-time visualization of results. This opens up completely new possibilities for biological virus detection that were not possible before
    corecore