44 research outputs found

    Integrating VLIW processors with a network on chip

    Networks are becoming a necessity for easily integrating multiple processors on a single chip. A crucial question is whether it is good enough to reason about statistical performance as opposed to hard real-time performance constraints. Today’s processors often do not allow software design for hard real-time systems because of the design of their bus and/or memory interfaces, which necessitates elaborate performance analysis through simulation. In this presentation I will indicate what options a processor designer has, using Silicon Hive processor design tools, in specifying the interfaces and local memory subsystem of a processor. These tools allow a multitude of communication options for building either type of system: statistically bound or hard real-time bound performance. Additionally, I will describe the multi-processor simulation and prototyping environment and touch on the processor design methodology.

    Power distribution and management


    Implementation issues of 3rd generation mobile communication turbo decoding


    FADIC: Architectural synthesis applied in IC design

    This paper discusses the design of a chip using architectural synthesis. The chip, FADIC, is applied in Digital Audio Broadcasting (DAB) receivers. The paper shows that architectural synthesis tools are used for the design of new complex applications and that they support the evolutionary development of challenging applications like DAB. It was found that the success of such tools in the design community depends on the way user interaction is supported and stimulated. Fast and accurate feedback from the synthesis tools, in combination with a rich set of hints for the compiler to guide the architecture exploration, are the key issues. It is shown that a short time to market is possible for implementations that are an order of magnitude more efficient than alternative implementations on commercially available DSP processors.

    Improving Efficiency of Power Gated Circuits through Concurrent Optimization of Power Switch Size and Forward Body Biasing

    Power gating (PG) has emerged as an effective technique to reduce standby leakage power in portable devices where battery lifetime is vital. However, it comes at the cost of timing overhead, which is a problem for most applications where real-time constraints exist. Designing efficient power gated circuits is a very challenging problem due to contrasting requirements in active mode (low timing overhead implying larger power switch size) and standby mode (low standby leakage power implying smaller power switch size). In this work, we show that applying Forward Body Biasing (FBB) to the logic gates in conjunction with power gating (PG + FBB) provides an additional degree of freedom which can be utilized to improve the efficiency of the power gated circuit. We propose an optimization algorithm to find the optimum power switch size and FBB value such that the total leakage energy of the design (active + standby) is minimized. Results show that our PG + FBB technique on average improves the leakage energy savings by 2X-5X as compared to using only power gating. With the PG + FBB technique, one can also design a zero delay penalty power gated circuit, which is not possible if only power gating is used.
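    The trade-off described in this abstract can be illustrated with a small grid search. This is only a sketch under stated assumptions: the delay and leakage models, every constant, and the function names (`delay`, `leakage_energy`, `optimize`) are hypothetical placeholders chosen to mimic the qualitative behaviour in the abstract (a larger switch lowers delay but raises standby leakage; forward bias lowers delay but raises active leakage). It is not the optimization algorithm proposed in the paper.

```python
import math
from typing import Iterable, Optional, Tuple

# All constants are made-up, unitless illustration values (assumptions),
# not figures from the paper.
K_SWITCH = 0.5     # delay penalty of the power switch, scaled by 1/size
K_FBB = 0.4        # fractional speed-up per volt of forward body bias
LEAK_ACTIVE = 1.0  # active-mode leakage power of the logic at zero bias
V_SCALE = 0.5      # bias (V) over which active leakage grows by a factor e
LEAK_SWITCH = 0.1  # standby leakage power per unit of switch size


def delay(size: float, bias: float) -> float:
    """Toy delay relative to the ungated circuit at zero bias (1.0 = no penalty)."""
    return (1.0 + K_SWITCH / size) * (1.0 - K_FBB * bias)


def leakage_energy(size: float, bias: float,
                   t_active: float, t_standby: float) -> float:
    """Toy total leakage energy: active-mode part plus standby-mode part."""
    active = LEAK_ACTIVE * math.exp(bias / V_SCALE) * t_active
    standby = LEAK_SWITCH * size * t_standby
    return active + standby


def optimize(sizes: Iterable[float], biases: Iterable[float], max_delay: float,
             t_active: float = 1.0,
             t_standby: float = 10.0) -> Optional[Tuple[float, float, float]]:
    """Exhaustively search the grid for the lowest-energy feasible point.

    Returns (energy, size, bias), or None if no point meets max_delay.
    """
    biases = list(biases)
    best = None
    for size in sizes:
        for bias in biases:
            if delay(size, bias) > max_delay:
                continue  # violates the timing constraint
            energy = leakage_energy(size, bias, t_active, t_standby)
            if best is None or energy < best[0]:
                best = (energy, size, bias)
    return best


sizes = [0.5 * i for i in range(1, 41)]    # candidate switch sizes
biases = [0.05 * i for i in range(0, 11)]  # candidate FBB values, 0 V to 0.5 V

pg_only = optimize(sizes, [0.0], max_delay=1.06)  # power gating alone
pg_fbb = optimize(sizes, biases, max_delay=1.06)  # power gating plus FBB
```

    In this toy model the joint search never does worse than power gating alone, because zero bias is one of the candidates. A delay bound of 1.0 (no penalty) has no feasible point at zero bias but does once forward bias is allowed, which mirrors the zero-delay-penalty remark in the abstract only qualitatively.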

    Designing energy efficient approximate multipliers for neural acceleration

    Many error resilient applications can be approximated using multi-layer perceptrons (MLPs) with insignificant degradation in output quality. Faster and more energy efficient execution of such an application is achieved using a neural accelerator (NA). This work exploits the error resilience characteristics of an MLP by approximating the accelerator itself. An error resilience analysis of the MLP is performed to obtain key constraints, which are used for designing energy efficient approximate multipliers. A systematic methodology for the design of approximate multipliers is used, based on a graph-based netlist modification approach. Approximate versions of basic standard cells are generated and used to replace accurate cells in the synthesized netlist in a systematic, quality-controlled manner. These approximate multipliers are further used for approximating the multiply and accumulate (MAC) units in the neural accelerator. The results are validated by considering approximate neural replication of a robotic application, inversek2j. System level energy savings of up to 14% are obtained for less than 7% degradation in output quality. An average application speedup of 24% is obtained over an accurate neural accelerator. The results are compared with state-of-the-art approximate multipliers, and a comparison with truncation (bit-wise scaling) is performed. Moreover, the error healing capability of MLPs is shown by studying the impact of retraining on networks with approximate multipliers.
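    The truncation baseline mentioned in this abstract can be sketched in a few lines. The version below assumes one common form of truncation, zeroing the low-order bits of both unsigned operands before an exact multiply; the abstract does not say which variant the authors compared against, and the names `truncated_multiply` and `mean_relative_error` are hypothetical. It does not model the paper's cell-level approximate multipliers.

```python
def truncated_multiply(a: int, b: int, drop_bits: int) -> int:
    """Multiply two unsigned integers after zeroing the lowest drop_bits of each."""
    mask = ~((1 << drop_bits) - 1)  # e.g. drop_bits=2 clears bits 0 and 1
    return (a & mask) * (b & mask)


def mean_relative_error(width: int, drop_bits: int) -> float:
    """Average (exact - approx) / exact over all non-zero width-bit operand pairs."""
    total = 0.0
    count = 0
    for a in range(1, 1 << width):
        for b in range(1, 1 << width):
            exact = a * b
            approx = truncated_multiply(a, b, drop_bits)
            total += (exact - approx) / exact
            count += 1
    return total / count


# 13 = 0b1101 becomes 12 and 11 = 0b1011 becomes 8 when two bits are dropped.
product = truncated_multiply(13, 11, 2)
```

    Sweeping `drop_bits` with `mean_relative_error` gives a simple error curve against which a more refined approximate multiplier could be compared.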

    An automated approximation methodology for arithmetic circuits

    Arithmetic circuits like adders and multipliers are key workhorses of many error resilient applications. Prior efforts on approximating these arithmetic circuits mainly focused on manual circuit level functional modifications. These manual approaches need high design time and effort. As a result, only a limited number of approximate design points can be generated from the original circuit, leading to a sparsely occupied Pareto front. This work proposes an automated approximation methodology for arithmetic circuits. The proposed method approximates the gate level standard cell library and uses these approximate standard cells to modify the netlist of the original circuit. A heuristic design space exploration methodology is proposed to speed up the design process. We integrate this methodology with a traditional ASIC flow and validate our results using adders and multipliers of different bitwidths. We show that our methodology improves on existing state-of-the-art manual as well as automated design techniques by generating non-dominated Pareto fronts. An application case study (Sobel edge detection) is shown using approximate arithmetic circuits generated by our methodology. In the case of the Sobel edge detector, we show up to 50% energy improvements for hardly any quality degradation (PSNR ≥ 20dB).
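    The Pareto front referred to in this abstract is the set of design points that no other point beats on every metric. A minimal sketch of that filter follows, assuming each design point is an (error, energy) pair where lower is better on both axes; the function name `pareto_front` and the sample values are hypothetical, not results from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (error, energy), both to be minimized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing error."""
    front = []
    for p in points:
        # q dominates p if q is no worse on both axes and differs from p,
        # which means it is strictly better on at least one axis.
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


# Made-up design points for illustration only.
designs = [(0.0, 10.0), (1.0, 7.0), (2.0, 8.0), (3.0, 4.0), (5.0, 4.0)]
front = pareto_front(designs)
```

    Here (2.0, 8.0) is dropped because (1.0, 7.0) has lower error and lower energy, and (5.0, 4.0) is dropped because (3.0, 4.0) matches its energy at lower error. The quadratic scan is adequate for the small point sets typical of such an exploration.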

    Application of medium-grain multiprocessor mapping methodology to epileptic seizure predictor

    In this paper we present a methodology that enables mapping and scheduling of a dynamic real-time medical signal processing application onto an MPSoC platform. We apply the Task Concurrency Management (TCM) methodology to a Lyapunov Exponent calculator, which is part of an epileptic seizure predictor. TCM requires a division of an application into thread frames and thread nodes. In particular, we demonstrate a new technique for thread node splitting that reduces execution time variance. This is necessary to meet stringent energy and performance requirements during mapping and scheduling. Through experiments we verify that the resulting model of the Lyapunov Exponent calculator fulfills the requirements of the TCM methodology.

    A chip set for a digital audio broadcasting channel decoder

    In this paper the design of two chips for an ASIC-based channel decoder for a Digital Audio Broadcasting (DAB) system is discussed. The ASIC solution is a follow-up to an expensive implementation which is based on general purpose DSP processors. Both ASICs are used in a test receiver and a precursor consumer DAB receiver.