7 research outputs found

Low-power high-efficiency video decoding using general purpose processors

Authors: Chi Chi Ching, Juurlink Ben, Álvarez-Mesa Mauricio
Publication date: 01/01/2015

In this article, we investigate how code optimization techniques and low-power states of general-purpose processors improve the power efficiency of HEVC decoding. For offline video decoding, we analyze in detail the power and performance efficiency of SIMD instructions, multicore architectures, and low-power active and idle states. For real-time video decoding, we additionally evaluate the power efficiency of techniques such as “race to idle” and “exploiting slack” with DVFS. Results show that “exploiting slack” is more power efficient than “race to idle” on all evaluated platforms, which represent smartphone, tablet, laptop, and desktop computing systems.
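The two real-time strategies named in the abstract can be contrasted with a small sketch. It is only a toy model, not the article's measurement method: it assumes dynamic power of the form c_eff * V^2 * f plus a constant idle power, and the operating points, cycle count, and constants are all hypothetical. Whether "exploiting slack" wins in such a model depends on the numbers chosen; the article's conclusion comes from measurements on real platforms.

```python
# Toy energy model (an assumption, not taken from the article):
# active power = c_eff * V^2 * f, and a constant idle power is drawn
# while waiting for the next frame deadline.

def frame_energy(cycles, freq_hz, volt, deadline_s, c_eff, p_idle_w):
    """Energy in joules to decode one frame and idle until the deadline."""
    busy_s = cycles / freq_hz
    if busy_s > deadline_s:
        raise ValueError("operating point misses the frame deadline")
    active_j = c_eff * volt ** 2 * freq_hz * busy_s
    idle_j = p_idle_w * (deadline_s - busy_s)
    return active_j + idle_j


def race_to_idle(cycles, points, deadline_s, c_eff, p_idle_w):
    """Run at the highest frequency, then idle for the rest of the period."""
    freq_hz, volt = max(points)
    return frame_energy(cycles, freq_hz, volt, deadline_s, c_eff, p_idle_w)


def exploit_slack(cycles, points, deadline_s, c_eff, p_idle_w):
    """Run at the lowest frequency that still meets the deadline (DVFS)."""
    feasible = [p for p in points if cycles / p[0] <= deadline_s]
    if not feasible:
        raise ValueError("no operating point meets the frame deadline")
    freq_hz, volt = min(feasible)
    return frame_energy(cycles, freq_hz, volt, deadline_s, c_eff, p_idle_w)


# Hypothetical (frequency in Hz, voltage in V) operating points.
POINTS = [(0.5e9, 0.8), (1.0e9, 1.0), (2.0e9, 1.2)]
```

Calling both functions with the same workload, for example `race_to_idle(1e7, POINTS, 1 / 30, 1e-9, 0.05)` and `exploit_slack(1e7, POINTS, 1 / 30, 1e-9, 0.05)`, gives the per-frame energy of each strategy at 30 frames per second under these assumed parameters.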

DepositOnce

Power management for interactive 3D games

Author: GU YAN
Publication date: 18/08/2008

Ph.D. (Doctor of Philosophy)

ScholarBank@NUS

Power-Performance Modeling and Adaptive Management of Heterogeneous Mobile Platforms

Publication date: 01/01/2018

Nearly 60% of the world population uses a mobile phone, which is typically powered by a system-on-chip (SoC). While mobile platform capabilities range widely, responsiveness, long battery life, and reliability are common design concerns that are crucial to remaining competitive. Consequently, state-of-the-art mobile platforms have become highly heterogeneous by combining a powerful SoC with numerous other resources, including display, memory, power management IC, battery, and wireless modems. Furthermore, the SoC itself is a heterogeneous resource that integrates many processing elements (PEs), such as CPU cores, GPU, video, image, and audio processors. Therefore, CPU cores do not dominate the platform power consumption under many application scenarios. Competitive performance requires a higher operating frequency and leads to larger power consumption. In turn, power consumption increases the junction and skin temperatures, which have adverse effects on device reliability and user experience. As a result, allocating the power budget among the major platform resources and controlling temperature have become fundamental considerations for mobile platforms. Dynamic thermal and power management algorithms address this problem by putting a subset of the processing elements or shared resources into sleep states, or by throttling their frequencies. However, an ad hoc approach could easily cripple performance if it slows down the performance-critical processing element. Furthermore, mobile platforms run a wide range of applications with time-varying workload characteristics, unlike early generations, which supported only limited functionality. As a result, there is a need for adaptive power and performance management approaches that consider the platform as a whole, rather than focusing on a subset.
Towards this need, our specific contributions include (a) a framework to dynamically select the Pareto-optimal frequency and active cores for heterogeneous CPUs, such as the ARM big.LITTLE architecture, (b) a dynamic power budgeting approach for allocating optimal power consumption to the CPU and GPU using performance sensitivity models for each PE, (c) an adaptive GPU frame time sensitivity prediction model to aid power management algorithms, and (d) an online learning algorithm that constructs adaptive run-time models for non-stationary workloads.

Dissertation/Thesis. Doctoral Dissertation, Electrical Engineering, 201
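Contribution (a) above refers to selecting Pareto-optimal combinations of frequency and active cores. The sketch below is not the dissertation's framework. It only illustrates what "Pareto-optimal" means for a set of configurations that have already been characterized by power and performance. The `Config` type, the configuration names, and all numbers are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Config(NamedTuple):
    name: str        # hypothetical label, e.g. cluster, frequency, core count
    power_w: float   # assumed measured or modeled power in watts
    perf: float      # assumed performance metric, higher is better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep configurations that no other one beats on both power and perf."""
    front: List[Config] = []
    # Sort by ascending power; on ties, the higher-performance one comes first.
    for c in sorted(configs, key=lambda c: (c.power_w, -c.perf)):
        # A configuration is kept only if it outperforms every cheaper one.
        if not front or c.perf > front[-1].perf:
            front.append(c)
    return front


def select(configs: List[Config], min_perf: float) -> Optional[Config]:
    """Return the lowest-power Pareto-optimal configuration meeting min_perf."""
    for c in pareto_front(configs):
        if c.perf >= min_perf:
            return c
    return None


# Hypothetical characterization data.
CONFIGS = [
    Config("little@1.0GHz x2", 0.4, 10.0),
    Config("little@1.4GHz x4", 0.9, 18.0),
    Config("big@1.2GHz x1", 1.1, 16.0),   # dominated: more power, less perf
    Config("big@2.0GHz x2", 2.5, 40.0),
]
```

A run-time manager built this way would call `select(CONFIGS, target)` whenever the performance target changes. A dynamic framework such as the one the abstract describes would also need to update the characterization data online, which this sketch does not attempt.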

ASU Digital Repository

Cooperative Power and Resource Management for Heterogeneous Mobile Architectures

Author: Hsieh Chenying
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Heterogeneous architectures have become ubiquitous in mobile system-on-chips (SoCs). Demand from application domains such as games, computer vision, and machine learning, which require massively parallel computation, has driven the integration of more accelerators into mobile SoCs to provide satisfactory performance energy-efficiently. These on-chip computing resources typically have their own runtime systems, including (1) a software governor, which continuously monitors hardware utilization and decides the trade-off between performance and power consumption, and (2) a software stack, which allows application developers to program the hardware for general-purpose computation and to perform memory management and profiling. Because mobile applications may demand all sorts of combinations of computing resources, we identify two problems: (1) individual runtimes can often lead to a poor performance-power trade-off or inefficient utilization of computing resources, and (2) existing approaches fail to schedule subprograms among different computing resources and thus lose the opportunity to avoid resource contention and gain better performance.

eScholarship - University of California

Shader optimization and specialization

Author: Crawford Lewis
Publication venue: The University of Edinburgh
Publication date: 11/10/2022

In the field of real-time graphics for computer games, performance has a significant effect on the player’s enjoyment and immersion. Graphics processing units (GPUs) are hardware accelerators that run small parallelized shader programs to speed up computationally expensive rendering calculations. This thesis examines optimizing shader programs and explores ways in which data patterns on both the CPU and GPU can be analyzed to automatically speed up rendering in games. Initially, the effect of traditional compiler optimizations on shader source code was explored. Techniques such as loop unrolling or arithmetic reassociation provided speed-ups on several devices, but different GPU hardware responded differently to each set of optimizations. Analyzing execution traces from numerous popular PC games revealed that much of the data passed from CPU-based API calls to GPU-based shaders is either unused or remains constant. A system was developed to capture this constant data and fold it into the shaders’ source code. Re-running the game’s rendering code using these specialized shader variants resulted in performance improvements in several commercial games without impacting their visual quality.
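The idea of folding constant data into shader source code can be illustrated with a short sketch. This is not the thesis's system. It assumes GLSL-style shaders, handles only scalar `float` uniforms, and takes the "observed constant" values as given. The function name `specialize_shader` and the sample shader are hypothetical.

```python
import re
from typing import Dict


def specialize_shader(source: str, constants: Dict[str, float]) -> str:
    """Rewrite `uniform float NAME;` as `const float NAME = VALUE;`.

    Only uniforms listed in `constants` are changed, so a shader compiler
    can then fold the known values into the surrounding arithmetic.
    """
    for name, value in constants.items():
        pattern = re.compile(r"uniform\s+float\s+" + re.escape(name) + r"\s*;")
        source = pattern.sub(f"const float {name} = {float(value)!r};", source)
    return source


# Hypothetical fragment shader with one constant and one varying uniform.
SHADER = """uniform float uScale;
uniform float uTime;
void main() {
    gl_FragColor = vec4(uScale * uTime);
}
"""
```

Calling `specialize_shader(SHADER, {"uScale": 2.0})` replaces only the `uScale` declaration and leaves `uTime` as a uniform. A real system would also have to cover other uniform types and confirm that the value really stays constant across draw calls.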

Edinburgh Research Archive

Games are up for DVFS

Authors: Chakraborty S., Gu Y., Ooi W.T.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2006

DOI: 10.1145/1146909.1147063
Proceedings - Design Automation Conference, pp. 598-603
PDAW

CiteSeerX

ScholarBank@NUS