620 research outputs found

    Adaptive heterogeneous parallelism for semi-empirical lattice dynamics in computational materials science.

    Get PDF
    With the variability in performance of the multitude of parallel environments available today, the conceptual overhead created by the need to anticipate runtime information to make design-time decisions has become overwhelming. Performance-critical applications and libraries carry implicit assumptions based on incidental metrics that are not portable to emerging computational platforms or even alternative contemporary architectures. Furthermore, the significance of runtime concerns such as makespan, energy efficiency and fault tolerance depends on the situational context. This thesis presents a case study in the application of both Mattsons prescriptive pattern-oriented approach and the more principled structured parallelism formalism to the computational simulation of inelastic neutron scattering spectra on hybrid CPU/GPU platforms. The original ad hoc implementation as well as new patternbased and structured implementations are evaluated for relative performance and scalability. Two new structural abstractions are introduced to facilitate adaptation by lazy optimisation and runtime feedback. A deferred-choice abstraction represents a unified space of alternative structural program variants, allowing static adaptation through model-specific exhaustive calibration with regards to the extrafunctional concerns of runtime, average instantaneous power and total energy usage. Instrumented queues serve as mechanism for structural composition and provide a representation of extrafunctional state that allows realisation of a market-based decentralised coordination heuristic for competitive resource allocation and the Lyapunov drift algorithm for cooperative scheduling

    Energy aware approach for HPC systems

    Get PDF
    International audienceHigh‐performance computing (HPC) systems require energy during their full life cycle from design and production to transportation to usage and recycling/dismanteling. Because of increase of ecological and cost awareness, energy performance is now a primary focus. This chapter focuses on the usage aspect of HPC and how adapted and optimized software solutions could improve energy efficiency. It provides a detailed explanation of server power consumption, and discusses the application of HPC, phase detection, and phase identification. The chapter also suggests that having the load and memory access profiles is insufficient for an effective evaluation of the power consumed by an application. The available leverages in HPC systems are also shown in detail. The chapter proposes some solutions for modeling the power consumption of servers, which allows designing power prediction models for better decision making.These approaches allow the deployment and usage of a set of available green leverages, permitting energy reduction

    Power, Performance, and Energy Management of Heterogeneous Architectures

    Get PDF
    abstract: Many core modern multiprocessor systems-on-chip offers tremendous power and performance optimization opportunities by tuning thousands of potential voltage, frequency and core configurations. Applications running on these architectures are becoming increasingly complex. As the basic building blocks, which make up the application, change during runtime, different configurations may become optimal with respect to power, performance or other metrics. Identifying the optimal configuration at runtime is a daunting task due to a large number of workloads and configurations. Therefore, there is a strong need to evaluate the metrics of interest as a function of the supported configurations. This thesis focuses on two different types of modern multiprocessor systems-on-chip (SoC): Mobile heterogeneous systems and tile based Intel Xeon Phi architecture. For mobile heterogeneous systems, this thesis presents a novel methodology that can accurately instrument different types of applications with specific performance monitoring calls. These calls provide a rich set of performance statistics at a basic block level while the application runs on the target platform. The target architecture used for this work (Odroid XU3) is capable of running at 4940 different frequency and core combinations. With the help of instrumented application vast amount of characterization data is collected that provides details about performance, power and CPU state at every instrumented basic block across 19 different types of applications. The vast amount of data collected has enabled two runtime schemes. The first work provides a methodology to find optimal configurations in heterogeneous architecture using classifiers and demonstrates an average increase of 93%, 81% and 6% in performance per watt compared to the interactive, ondemand and powersave governors, respectively. The second work using same data shows a novel imitation learning framework for dynamically controlling the type, number, and the frequencies of active cores to achieve an average of 109% PPW improvement compared to the default governors. This work also presents how to accurately profile tile based Intel Xeon Phi architecture while training different types of neural networks using open image dataset on deep learning framework. The data collected allows deep exploratory analysis. It also showcases how different hardware parameters affect performance of Xeon Phi.Dissertation/ThesisMasters Thesis Engineering 201

    DNA-inspired Scheme for Building the Energy Profile of HPC Systems

    Get PDF
    International audienceEnergy usage is becoming a challenge for the design of next generation large scale distributed systems. This paper explores an inno- vative approach of profiling such systems. It proposes a DNA-like solution without making any assumptions on the running applications and used hardware. This profiling based on internal counters usage and energy monitoring allows to isolate specific phases during the execution and enables some energy consumption control and energy usage prediction. First experimental validations of the system modeling are presented and analyzed

    Power-Performance Modeling and Adaptive Management of Heterogeneous Mobile Platforms​

    Get PDF
    abstract: Nearly 60% of the world population uses a mobile phone, which is typically powered by a system-on-chip (SoC). While the mobile platform capabilities range widely, responsiveness, long battery life and reliability are common design concerns that are crucial to remain competitive. Consequently, state-of-the-art mobile platforms have become highly heterogeneous by combining a powerful SoC with numerous other resources, including display, memory, power management IC, battery and wireless modems. Furthermore, the SoC itself is a heterogeneous resource that integrates many processing elements, such as CPU cores, GPU, video, image, and audio processors. Therefore, CPU cores do not dominate the platform power consumption under many application scenarios. Competitive performance requires higher operating frequency, and leads to larger power consumption. In turn, power consumption increases the junction and skin temperatures, which have adverse effects on the device reliability and user experience. As a result, allocating the power budget among the major platform resources and temperature control have become fundamental consideration for mobile platforms. Dynamic thermal and power management algorithms address this problem by putting a subset of the processing elements or shared resources to sleep states, or throttling their frequencies. However, an adhoc approach could easily cripple the performance, if it slows down the performance-critical processing element. Furthermore, mobile platforms run a wide range of applications with time varying workload characteristics, unlike early generations, which supported only limited functionality. As a result, there is a need for adaptive power and performance management approaches that consider the platform as a whole, rather than focusing on a subset. Towards this need, our specific contributions include (a) a framework to dynamically select the Pareto-optimal frequency and active cores for the heterogeneous CPUs, such as ARM big.Little architecture, (b) a dynamic power budgeting approach for allocating optimal power consumption to the CPU and GPU using performance sensitivity models for each PE, (c) an adaptive GPU frame time sensitivity prediction model to aid power management algorithms, and (d) an online learning algorithm that constructs adaptive run-time models for non-stationary workloads.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Teenustele orienteeritud ja tÔendite-teadlik mobiilne pilvearvutus

    Get PDF
    Arvutiteaduses on kaks kĂ”ige suuremat jĂ”udu: mobiili- ja pilvearvutus. Kui pilvetehnoloogia pakub kasutajale keerukate ĂŒlesannete lahendamiseks salvestus- ning arvutusplatvormi, siis nutitelefon vĂ”imaldab lihtsamate ĂŒlesannete lahendamist mistahes asukohas ja mistahes ajal. TĂ€psemalt on mobiilseadmetel vĂ”imalik pilve vĂ”imalusi Ă€ra kasutades energiat sÀÀsta ning jagu saada kasvavast jĂ”udluse ja ruumi vajadusest. Sellest tulenevalt on kĂ€esoleva töö peamiseks kĂŒsimuseks kuidas tuua pilveinfrastruktuur mobiilikasutajale lĂ€hemale? Antud töös uurisime kuidas mobiiltelefoni pilveteenust saab mobiilirakendustesse integreerida. Saime teada, et töö delegeerimine pilve eeldab mitmete pilve aspektide kaalumist ja integreerimist, nagu nĂ€iteks ressursimahukas töötlemine, asĂŒnkroonne suhtlus kliendiga, programmaatiline ressursside varustamine (Web APIs) ja pilvedevaheline kommunikatsioon. Nende puuduste ĂŒletamiseks lĂ”ime Mobiilse pilve vahevara Mobile Cloud Middleware (Mobile Cloud Middleware - MCM) raamistiku, mis kasutab deklaratiivset teenuste komponeerimist, et delegeerida töid mobiililt mitmetele pilvedele kasutades minimaalset andmeedastust. Teisest kĂŒljest on nĂ€idatud, et koodi teisaldamine on peamisi strateegiaid seadme energiatarbimise vĂ€hendamiseks ning jĂ”udluse suurendamiseks. Sellegipoolest on koodi teisaldamisel miinuseid, mis takistavad selle laialdast kasutuselevĂ”ttu. Selles töös uurime lisaks, mis takistab koodi mahalaadimise kasutuselevĂ”ttu ja pakume lahendusena vĂ€lja raamistiku EMCO, mis kogub seadmetelt infot koodi jooksutamise kohta erinevates kontekstides. Neid andmeid analĂŒĂŒsides teeb EMCO kindlaks, mis on sobivad tingimused koodi maha laadimiseks. VĂ”rreldes kogutud andmeid, suudab EMCO jĂ€reldada, millal tuleks mahalaadimine teostada. EMCO modelleerib kogutud andmeid jaotuse mÀÀra jĂ€rgi lokaalsete- ning pilvejuhtude korral. Neid jaotusi vĂ”rreldes tuletab EMCO tĂ€psed atribuudid, mille korral mobiilirakendus peaks koodi maha laadima. VĂ”rreldes EMCO-t teiste nĂŒĂŒdisaegsete mahalaadimisraamistikega, tĂ”useb EMCO efektiivsuse poolest esile. LĂ”puks uurisime kuidas arvutuste maha laadimist Ă€ra kasutada, et tĂ€iustada kasutaja kogemust pideval mobiilirakenduse kasutamisel. Meie peamiseks motivatsiooniks, et sellist adaptiivset tööde tĂ€itmise kiirendamist pakkuda, on tagada kasutuskvaliteet (QoE), mis muutub vastavalt kasutajale, aidates seelĂ€bi suurendada mobiilirakenduse eluiga.Mobile and cloud computing are two of the biggest forces in computer science. While the cloud provides to the user the ubiquitous computational and storage platform to process any complex tasks, the smartphone grants to the user the mobility features to process simple tasks, anytime and anywhere. Smartphones, driven by their need for processing power, storage space and energy saving are looking towards remote cloud infrastructure in order to solve these problems. As a result, the main research question of this work is how to bring the cloud infrastructure closer to the mobile user? In this thesis, we investigated how mobile cloud services can be integrated within the mobile apps. We found out that outsourcing a task to cloud requires to integrate and consider multiple aspects of the clouds, such as resource-intensive processing, asynchronous communication with the client, programmatically provisioning of resources (Web APIs) and cloud intercommunication. Hence, we proposed a Mobile Cloud Middleware (MCM) framework that uses declarative service composition to outsource tasks from the mobile to multiple clouds with minimal data transfer. On the other hand, it has been demonstrated that computational offloading is a key strategy to extend the battery life of the device and improves the performance of the mobile apps. We also investigated the issues that prevent the adoption of computational offloading, and proposed a framework, namely Evidence-aware Mobile Computational Offloading (EMCO), which uses a community of devices to capture all the possible context of code execution as evidence. By analyzing the evidence, EMCO aims to determine the suitable conditions to offload. EMCO models the evidence in terms of distributions rates for both local and remote cases. By comparing those distributions, EMCO infers the right properties to offload. EMCO shows to be more effective in comparison with other computational offloading frameworks explored in the state of the art. Finally, we investigated how computational offloading can be utilized to enhance the perception that the user has towards an app. Our main motivation behind accelerating the perception at multiple response time levels is to provide adaptive quality-of-experience (QoE), which can be used as mean of engagement strategy that increases the lifetime of a mobile app
    • 

    corecore