34 research outputs found

Markov Decision Process Based Energy-Efficient On-Line Scheduling for Slice-Parallel Video Decoders on Multicore Systems

Author: Atienza David
Frossard Pascal
Kanoun Karim
Mastronarde Nicholas
van der Schaar Mihaela
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/01/2012

We consider the problem of energy-efficient on-line scheduling for slice-parallel video decoders on multicore systems. We assume that each of the processors is Dynamic Voltage Frequency Scaling (DVFS) enabled, such that it can independently trade off performance for power while taking the video decoding workload into account. In the past, scheduling and DVFS policies in multicore systems have been formulated heuristically due to the inherent complexity of the on-line multicore scheduling problem. The key contribution of this report is that we rigorously formulate the problem as a Markov decision process (MDP), which simultaneously takes into account the on-line scheduling and per-core DVFS capabilities; the power consumption of the processor cores and caches; and the loss-tolerant and dynamic nature of the video decoder's traffic. In particular, we model the video traffic using a Directed Acyclic Graph (DAG) to capture the precedence constraints among frames in a Group of Pictures (GOP) structure, while also accounting for the fact that frames have different display/decoding deadlines and non-deterministic decoding complexities. The objective of the MDP is to minimize long-term power consumption subject to a minimum Quality of Service (QoS) constraint related to the decoder's throughput. Although MDPs notoriously suffer from the curse of dimensionality, we show that, with appropriate simplifications and approximations, the complexity of the MDP can be mitigated. We implement a slice-parallel version of H.264 on a multiprocessor ARM (MPARM) virtual platform simulator, which provides cycle-accurate and bus signal-accurate simulation for different processors. We use this platform to generate realistic video decoding traces with which we evaluate the proposed on-line scheduling algorithm in Matlab.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

Crossref

Online Energy-Efficient Task-Graph Scheduling for Multicore Platforms

Author: Atienza Alonso David
Kanoun Karim
Mastronarde Nicholas
Van der Schaar Mihaela
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/03/2014

Numerous Directed Acyclic Graph (DAG) schedulers have been developed to improve the energy efficiency of various multicore platforms. However, these schedulers make a priori assumptions about the relationship between the task dependencies, and they are unable to adapt online to the characteristics of each application without offline profiling data. Therefore, we propose a novel energy-efficient online scheduling solution for the general DAG model to address the two aforementioned problems. Our proposed scheduler is able to adapt at runtime to the characteristics of each application by making smart foresighted decisions, which take into account the impact of current scheduling decisions on the present and future deadline miss rates and energy efficiency. Moreover, our scheduler is able to efficiently handle execution with very limited resources by avoiding scheduling tasks that are expected to miss their deadlines and do not have an impact on future deadlines. We validate our approach against state-of-the-art solutions. In our first set of experiments, our results with the H.264 video decoder demonstrate that the proposed low-complexity solution for the general DAG model reduces the energy consumption by up to 15% compared to an existing sophisticated and complex scheduler that was specifically built for the H.264 video decoder application. In our second set of experiments, our results with different configurations of synthetic DAGs demonstrate that our proposed solution is able to reduce the energy consumption by up to 55% and the deadline miss rates by up to 99% compared to a second existing scheduling solution. Finally, we show that our DFM and scheduler have low complexities on a real mobile platform, and we show that our solution is resilient to workload prediction errors by using different estimator accuracies.

Infoscience - École polytechnique fédérale de Lausanne

Power-Aware HEVC Decoding with Tunable Image Quality

Author: Holmbacka Simon
Lilius Johan
Menard Daniel
Nogues Erwan
Pelcat Maxime
Publication venue: HAL CCSD
Publication date: 20/10/2014

Mobile devices are under strong pressure to support increasingly advanced applications that require more processing capability. Among those, the emerging High Efficiency Video Coding (HEVC) standard provides better video quality at the same bit rate than the previous H.264 standard. A limitation in the usability of a mobile video playing device is the lack of support for guaranteeing stand-by time and up time for battery-driven devices. The Green Metadata initiative within the MPEG standard was launched to address the power saving issues of the decoder and defines the technology requirements. In this paper, we propose an HEVC decoder with tunable decoding quality levels for maximum power savings, as suggested in the scope of the Green Metadata initiative. Our experiments reveal that the modified HEVC video decoder can reduce power consumption by up to 28% on real-world platforms while keeping better quality than decoding with H.264.

Error tolerant multimedia stream processing: There's plenty of room at the top (of the system stack)

Author: Andreopoulos Y
Publication date: 08/12/2012

There is a growing realization that the expected fault rates and energy dissipation stemming from increases in CMOS integration will lead to the abandonment of traditional system reliability in favor of approaches that offer reliability to hardware-induced errors across the application, runtime support, architecture, device and integrated-circuit (IC) layers. Commercial stakeholders of multimedia stream processing (MSP) applications, such as information retrieval, stream mining systems, and high-throughput image and video processing systems already feel the strain of inadequate system-level scaling and robustness under the always-increasing user demand. While such applications can tolerate certain imprecision in their results, today's MSP systems do not support a systematic way to exploit this aspect for cross-layer system resilience. However, research is currently emerging that attempts to utilize the error-tolerant nature of MSP applications for this purpose. This is achieved by modifications to all layers of the system stack, from algorithms and software to the architecture and device layer, and even the IC digital logic synthesis itself. Unlike conventional processing that aims for worst-case performance and accuracy guarantees, error-tolerant MSP attempts to provide guarantees for the expected performance and accuracy. In this paper we review recent advances in this field from an MSP and a system (layer-by-layer) perspective, and attempt to foresee some of the components of future cross-layer error-tolerant system design that may influence the multimedia and the general computing landscape within the next ten years.

UCL Discovery

Energy Efficiency and Performance Management of Parallel Dataflow Applications

Author: Holmbacka Simon
Lafond Sébastien
Lilius Johan
Nogues Erwan
Pelcat Maxime
Publication venue: HAL CCSD
Publication date: 08/10/2014

Parallelizing software is a popular way of achieving high energy efficiency, since parallel applications can be mapped onto many cores and the clock frequency can be lowered. Perfect parallelism is, however, not often reached, and different program phases usually contain different levels of parallelism due to data dependencies. Applications currently have no means of expressing the level of parallelism, and power management is mostly done based on the workload alone. In this work, we provide means of expressing QoS and levels of parallelism in applications for tighter integration with the power management, to obtain optimal energy efficiency in multicore systems. We utilize the dataflow framework PREESM to create and analyze program structures and expose the parallelism in the program phases to the power management. We use the derived parameters in an NLP (NonLinear Programming) solver to determine the minimum power for allocating resources to the applications.

HAL Descartes

Hal-Diderot

HAL-Rennes 1

Dynamic Resource Management of Network-on-Chip Platforms for Multi-stream Video Processing

Author: Mendis Hashan Roshantha
Publication venue: University of York
Publication date: 01/03/2017

This thesis considers resource management in the context of parallel multiple video stream decoding on multicore/many-core platforms. Such platforms have tens or hundreds of on-chip processing elements which are connected via a Network-on-Chip (NoC). Inefficient task allocation configurations can negatively affect the communication cost and resource contention in the platform, leading to predictability and performance issues. Efficient resource management for large-scale complex workloads is considered a challenging research problem, especially when applications such as video streaming and decoding have dynamic and unpredictable workload characteristics. For these types of applications, runtime heuristic-based task mapping techniques are required. As the application and platform size increase, decentralised resource management techniques are more desirable to overcome the reliability and performance bottlenecks in centralised management. In this work, several heuristic-based runtime resource management techniques targeting real-time video decoding workloads are proposed. Firstly, two admission control approaches are proposed: one is fully deterministic and highly predictable; the other is heuristic-based and balances predictability and performance. Secondly, a pair of runtime task mapping schemes is presented, which make use of limited known application properties, communication cost and blocking-aware heuristics. Combined with the proposed deterministic admission controller, these techniques can provide strict timing guarantees for hard real-time streams whilst improving resource usage. The third contribution in this thesis is a distributed, bio-inspired, low-overhead task re-allocation technique, which is used to further improve the timeliness and workload distribution of admitted soft real-time streams.
Finally, this thesis explores parallelisation and resource management issues surrounding soft real-time video streams that have been encoded using complex encoding tools and modern codecs such as High Efficiency Video Coding (HEVC). Properties of real streams and decoding trace data are analysed to statistically model and generate synthetic HEVC video decoding workloads. These workloads are shown to have complex and varying task dependency structures and resource requirements. To address these challenges, two novel runtime task clustering and mapping techniques for Tile-parallel HEVC decoding are proposed. These strategies consider the workload communication to computation ratio and stream-specific characteristics to balance predictability improvement and communication energy reduction. Lastly, several task to memory controller port assignment schemes are explored to alleviate performance bottlenecks resulting from memory traffic contention.

White Rose E-theses Online

Energy and reliability challenges in next generation devices: integrated software solutions

Author: Mulas Fabrizio
Publication venue: Università degli Studi di Cagliari
Publication date: 24/02/2011

Archivio istituzionale della ricerca - Università di Cagliari

UniCA Eprints

Optimized Fundamental Signal Processing Operations for Energy Minimization on Heterogeneous Mobile Devices

Author: Badia Contelles J. M.
Belloch Rodríguez José Antonio
Gonzalez Alberto
Igual Peña Francisco Daniel
Quintana Ortí Enrique Salvador
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2017

Numerous signal processing applications are emerging on both mobile and high-performance computing systems. These applications are subject to responsiveness constraints for user interactivity and, at the same time, must be optimized for energy efficiency. The increasingly heterogeneous power-versus-performance profile of modern hardware introduces new opportunities for energy savings as well as challenges. In this line, recent systems-on-chip (SoC) composed of low-power multicore processors, combined with a small graphics accelerator (or GPU), yield a notable increase in computational capacity while partially retaining the appealing low power consumption of embedded systems. This paper analyzes the potential of these new hardware systems to accelerate applications that involve a large number of floating-point arithmetic operations, mainly in the form of convolutions. To assess the performance, a headphone-based spatial audio application for mobile devices based on a Samsung Exynos 5422 SoC has been developed. We discuss different implementations and analyze the tradeoffs between performance and energy efficiency for different scenarios and configurations. Our experimental results reveal that we can extend the battery lifetime of a device featuring such an architecture by 238% by properly configuring and leveraging the computational resources.

This work was supported by the Spanish Ministerio de Economia y Competitividad projects under Grant TIN2014-53495-R and Grant TEC2015-67387-C4-1-R, in part by the University Project UJI-B2016-20, and in part by the Project PROMETEOII/2014/003. The work of J. A. Belloch was supported by the GVA Post-Doctoral Contract under Grant APOSTD/2016/069. This paper was recommended by Associate Editor Y. Ha.

Belloch Rodríguez, JA.; Badia Contelles, JM.; Igual Peña, FD.; Gonzalez, A.; Quintana Ortí, ES. (2017). Optimized Fundamental Signal Processing Operations for Energy Minimization on Heterogeneous Mobile Devices. IEEE Transactions on Circuits and Systems I: Regular Papers. 65(5):1614-1627. https://doi.org/10.1109/TCSI.2017.2761909

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Repositori Institucional de la Universitat Jaume I

RiuNet

Beyond multimedia adaptation: Quality of experience-aware multi-sensorial media delivery

Author: Ghinea G
Muntean G-M
Yuan Z
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Multiple sensorial media (mulsemedia) combines multiple media elements which engage three or more human senses and, like most other media content, requires support for delivery over existing networks. This paper proposes an adaptive mulsemedia framework (ADAMS) for delivering scalable video and sensorial data to users. Unlike existing two-dimensional joint source-channel adaptation solutions for video streaming, the ADAMS framework includes three joint adaptation dimensions: video source, sensorial source, and network optimization. Using an MPEG-7 description scheme, ADAMS recommends the integration of multiple sensorial effects (e.g., haptic, olfaction, air motion) as metadata into multimedia streams. The ADAMS design includes both coarse- and fine-grained adaptation modules on the server side: mulsemedia flow adaptation and packet priority scheduling. Feedback from subjective quality evaluation and network conditions is used to develop the two modules. The subjective evaluation investigated users' enjoyment levels when exposed to mulsemedia and multimedia sequences, respectively, and studied users' preference levels for some sensorial effects in the context of mulsemedia sequences with video components at different quality levels. Results of the subjective study inform guidelines for an adaptive strategy that selects the optimal combination of video segments and sensorial data for a given bandwidth constraint and user requirement. User perceptual tests show how ADAMS outperforms existing multimedia delivery solutions in terms of both user-perceived quality and user enjoyment during adaptive streaming of various mulsemedia content. In doing so, it highlights the case for tailored, adaptive mulsemedia delivery over traditional multimedia adaptive transport mechanisms.

CiteSeerX

Crossref

Northumbria Research Link

Brunel University Research Archive