7,246 research outputs found

    Low power memory allocation and mapping for area-constrained systems-on-chips

    Get PDF
    Large fractions of today’s embedded systems’ power consumption can be attributed to the memory subsystem. In order to reduce this fraction, we propose a mathematical model to optimize on-chip memory configurations for minimal power. We exploit the power reduction effect of splitting memory into subunits with frequently accessed addresses mapped to small memories. The definition of an integer linear programming model enables us to solve the twofold problem of allocating an optimal set of memory instances with varying size on the one hand and finding an optimal mapping of application segments to allocated memories on the other hand. Experimental results yield power reductions of up to 82 % for instruction memory and 73 % for data memory. Area usage, at the same time, deteriorates by only 2.1 %, respectively, 1.2 % on average and even improves in some cases. Flexibility and performance of our model make it a valuable tool for low power system-on-chip design, either for efficient design space exploration or as part of a HW/SW codesign synthesis flow

    Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

    Get PDF
    Deep neural networks have achieved impressive results in computer vision and machine learning. Unfortunately, state-of-the-art networks are extremely compute and memory intensive which makes them unsuitable for mW-devices such as IoT end-nodes. Aggressive quantization of these networks dramatically reduces the computation and memory footprint. Binary-weight neural networks (BWNs) follow this trend, pushing weight quantization to the limit. Hardware accelerators for BWNs presented up to now have focused on core efficiency, disregarding I/O bandwidth and system-level efficiency that are crucial for deployment of accelerators in ultra-low power devices. We present Hyperdrive: a BWN accelerator dramatically reducing the I/O bandwidth exploiting a novel binary-weight streaming approach, which can be used for arbitrarily sized convolutional neural network architecture and input resolution by exploiting the natural scalability of the compute units both at chip-level and system-level by arranging Hyperdrive chips systolically in a 2D mesh while processing the entire feature map together in parallel. Hyperdrive achieves 4.3 TOp/s/W system-level efficiency (i.e., including I/Os)---3.1x higher than state-of-the-art BWN accelerators, even if its core uses resource-intensive FP16 arithmetic for increased robustness

    On-Line Dependability Enhancement of Multiprocessor SoCs by Resource Management

    Get PDF
    This paper describes a new approach towards dependable design of homogeneous multi-processor SoCs in an example satellite-navigation application. First, the NoC dependability is functionally verified via embedded software. Then the Xentium processor tiles are periodically verified via on-line self-testing techniques, by using a new IIP Dependability Manager. Based on the Dependability Manager results, faulty tiles are electronically excluded and replaced by fault-free spare tiles via on-line resource management. This integrated approach enables fast electronic fault detection/diagnosis and repair, and hence a high system availability. The dependability application runs in parallel with the actual application, resulting in a very dependable system. All parts have been verified by simulation

    A survey of system level power management schemes in the dark-silicon era for many-core architectures

    Get PDF
    Power consumption in Complementary Metal Oxide Semiconductor (CMOS) technology has escalated to a point that only a fractional part of many-core chips can be powered-on at a time. Fortunately, this fraction can be increased at the expense of performance through the dark-silicon solution. However, with many-core integration set to be heading towards its thousands, power consumption and temperature increases per time, meaning the number of active nodes must be reduced drastically. Therefore, optimized techniques are demanded for continuous advancement in technology. Existing eïŹ€orts try to overcome this challenge by activating nodes from diïŹ€erent parts of the chip at the expense of communication latency. Other eïŹ€orts on the other hand employ run-time power management techniques to manage the power performance of the cores trading-oïŹ€ performance for power. We found out that, for a signiïŹcant amount of power to saved and high temperature to be avoided, focus should be on reducing the power consumption of all the on-chip components. Especially, the memory hierarchy and the interconnect. Power consumption can be minimized by, reducing the size of high leakage power dissipating elements, turning-oïŹ€ idle resources and integrating power saving materials
    • 

    corecore