    Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

    Technological evolution has significantly increased the number of transistors for a given die area and raised switching speeds from a few MHz to the GHz range. This decline in feature size and boost in performance demands a lower supply voltage and effective management of power dissipation in chips with millions of transistors. This has triggered a substantial amount of research into power reduction techniques for almost every aspect of the chip, particularly the processor cores it contains. This paper presents an overview of techniques for achieving power efficiency, mainly at the processor core level, but also visits related domains such as buses and memories. Various processor parameters and features, such as supply voltage, clock frequency, cache, and pipelining, can be optimized to reduce the power consumption of the processor, and this paper discusses ways in which they can be optimized. Emerging power-efficient processor architectures are also surveyed, and research activities are discussed, which should help the reader identify how these factors contribute to a processor's power consumption. Some of these concepts are already established, whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
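
    To illustrate why supply voltage and clock frequency are the usual optimization targets, the sketch below uses the standard first-order CMOS switching-power model, P = a * C * Vdd^2 * f. The model is a textbook approximation and is not quoted from the paper; the function name and the operating points are hypothetical values chosen only for illustration.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """First-order CMOS switching power in watts: P = a * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical operating points (assumed values, not measurements).
nominal = dynamic_power(activity=0.2, capacitance_f=1e-9, vdd_v=1.2, freq_hz=2.0e9)
scaled = dynamic_power(activity=0.2, capacitance_f=1e-9, vdd_v=0.9, freq_hz=1.0e9)

# Fraction of nominal switching power left after scaling voltage and frequency.
ratio = scaled / nominal
print(f"scaled / nominal = {ratio:.3f}")
```

    Because voltage enters quadratically, lowering it alongside frequency reduces switching power faster than lowering frequency alone. Leakage power is ignored in this model.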

    A Library for Pattern-based Sparse Matrix Vector Multiply

    Pattern-based Representation (PBR) is a novel approach to improving the performance of Sparse Matrix-Vector Multiply (SMVM) numerical kernels. Motivated by our observation that many matrices can be divided into blocks that share a small number of distinct patterns, we generate custom multiplication kernels for frequently recurring block patterns. The resulting reduction in index overhead significantly reduces memory bandwidth requirements and improves performance. Unlike existing methods, PBR requires neither detection of dense blocks nor zero filling, making it particularly advantageous for matrices that lack dense nonzero concentrations. SMVM kernels for PBR can benefit from explicit prefetching and vectorization, and are amenable to parallelization. The analysis and format conversion to PBR are implemented as a library, making it suitable for applications that generate matrices dynamically at runtime. We present sequential and parallel performance results for PBR on two current multicore architectures, which show that PBR outperforms available alternatives for the matrices to which it is applicable, and that the analysis and conversion overhead is amortized in realistic application scenarios.
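
    The idea of grouping blocks by shared nonzero pattern can be sketched in a few lines. This is a simplified illustration under assumed conventions (fixed 2x2 blocks, a bitmask per pattern, pure Python lists), not the library's actual data layout or API. The names `to_pattern_blocks` and `multiply` are hypothetical.

```python
from collections import defaultdict


def to_pattern_blocks(dense, bs=2):
    """Group bs x bs blocks by nonzero pattern, encoded as a bitmask.

    Returns {mask: [(block_row, block_col, values)]}, where values holds only
    the nonzeros of the block in row-major order (no zero filling).
    """
    n_rows, n_cols = len(dense), len(dense[0])
    groups = defaultdict(list)
    for br in range(0, n_rows, bs):
        for bc in range(0, n_cols, bs):
            mask, vals = 0, []
            for i in range(bs):
                for j in range(bs):
                    r, c = br + i, bc + j
                    if r < n_rows and c < n_cols and dense[r][c] != 0:
                        mask |= 1 << (i * bs + j)
                        vals.append(dense[r][c])
            if mask:  # skip all-zero blocks entirely
                groups[mask].append((br, bc, vals))
    return dict(groups)


def multiply(groups, x, n_rows, bs=2):
    """Compute y = A @ x from the pattern-grouped blocks."""
    y = [0.0] * n_rows
    for mask, blocks in groups.items():
        # The pattern is decoded once per group; a generated kernel would
        # hard-code these offsets instead of storing per-element indices.
        offsets = [(k // bs, k % bs) for k in range(bs * bs) if (mask >> k) & 1]
        for br, bc, vals in blocks:
            for (i, j), v in zip(offsets, vals):
                y[br + i] += v * x[bc + j]
    return y


A = [[1, 0, 2, 0],
     [0, 3, 0, 4],
     [0, 0, 5, 0],
     [0, 0, 0, 6]]
groups = to_pattern_blocks(A)
y = multiply(groups, [1, 2, 3, 4], n_rows=4)
```

    In this example all three nonzero blocks share one diagonal pattern, so the per-element index information is stored once for the group rather than once per nonzero.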

    Optimized mobile thin clients through a MPEG-4 BiFS semantic remote display framework

    According to the thin client computing principle, the user interface is physically separated from the application logic. In practice, only a viewer component is executed on the client device, rendering the display updates received from the distant application server and capturing the user interaction. Existing remote display frameworks are not optimized to encode the complex scenes of modern applications, which are composed of objects with very diverse graphical characteristics. To tackle this challenge, we propose to transfer to the client, in addition to the binary encoded objects, semantic information about the characteristics of each object. With this semantic knowledge, the client can react autonomously to user input and does not have to wait for the display update from the server. Because it reduces the interaction latency and mitigates the bursty remote display traffic pattern, the presented framework is of particular interest in a wireless context, where bandwidth is limited and expensive. In this paper, we describe a generic architecture of a semantic remote display framework. Furthermore, we have developed a prototype using the MPEG-4 Binary Format for Scenes (BiFS) to convey the semantic information to the client. We experimentally compare the bandwidth consumption of MPEG-4 BiFS with existing, non-semantic remote display frameworks. In a text editing scenario, we achieve an average reduction of 23% in the data peaks that are observed in remote display protocol traffic.

    Huffman-based Code Compression Techniques for Embedded Systems
