121 research outputs found

    On-Chip Transparent Wire Pipelining (invited paper)

    Get PDF
    Wire pipelining has been proposed as a viable mean to break the discrepancy between decreasing gate delays and increasing wire delays in deep-submicron technologies. Far from being a straightforwardly applicable technique, this methodology requires a number of design modifications in order to insert it seamlessly in the current design flow. In this paper we briefly survey the methods presented by other researchers in the field and then we thoroughly analyze the solutions we recently proposed, ranging from system-level wire pipelining to physical design aspects

    Adaptive Latency Insensitive Protocols

    Get PDF
    Latency-insensitive design copes with excessive delays typical of global wires in current and future IC technologies. It achieves its goal via encapsulation of synchronous logic blocks in wrappers that communicate through a latency-insensitive protocol (LIP) and pipelined interconnects. Previously proposed solutions suffer from an excessive performance penalty in terms of throughput or from a lack of generality. This article presents an adaptive LIP that outperforms previous static implementations, as demonstrated by two relevant cases — a microprocessor and an MPEG encoder — whose components we made insensitive to the latencies of their interconnections through a newly developed wrapper. We also present an informal exposition of the theoretical basis of adaptive LIPs, as well as implementation detail

    A new system design methodology for wire pipelined SoC

    Get PDF
    Wire Pipelining (WP) has been proposed in order to limit the impact of increasing wire delays. In general, the added pipeline elements alters the system such that architectural changes are needed to preserve functionality. We illustrate a proposal that, while allowing the use of IP blocks without modification, takes advantage of a minimal knowledge of the IP's communication profile to dramatically increase the performances. We showed the formal equivalence between WP and original system and proved the higher performance achievable through a relevant case study

    Issues in Implementing Latency Insensitive Protocols

    Get PDF
    The performance of future Systems-on-Chip will be limited by the latency of long interconnects requiring more than one clock cycle for the signals to propagate. To deal with the problem L. Carloni et alii proposed the Latency Insensitive Protocols (LIP). A design that works under the assumption of zero-delay connections between functional modules is modified in a Latency Insensitive Design (LID) by encapsulating them within wrappers (“shells”) and connecting them through internally pipelined blocks (“relay stations”) complying with a protocol that guarantees identity of behavior [1]. The wrappers perform:- Data Validation: each output channel signals whether the datum therein present has still to be consumed.- Back Pressure: when the pearl is stopped the shell generates a stop signal sent in the opposite direction of inputs;- Clock Gating: a module waiting for new data and/or stopped keeps its present state. Such a protocol was implemented [2] through the introductio

    Throughput-driven floorplanning with wire pipelining

    Get PDF
    The size of future high-performance SoC is such that the time-of-flight of wires connecting distant pins in the layout can be much higher than the clock period. In order to keep the frequency as high as possible, the wires may be pipelined. However, the insertion of flip-flops may alter the throughput of the system due to the presence of loops in the logic netlist. In this paper, we address the problem of floorplanning a large design where long interconnects are pipelined by inserting the throughput in the cost function of a tool based on simulated annealing. The results obtained on a series of benchmarks are then validated using a simple router that breaks long interconnects by suitably placing flip-flops along the wires

    Energy Detection UWB Receiver Design using a Multi-resolution VHDL-AMS Description

    Get PDF
    Ultra Wide Band (UWB) impulse radio systems are appealing for location-aware applications. There is a growing interest in the design of UWB transceivers with reduced complexity and power consumption. Non-coherent approaches for the design of the receiver based on energy detection schemes seem suitable to this aim and have been adopted in the project the preliminary results of which are reported in this paper. The objective is the design of a UWB receiver with a top-down methodology, starting from Matlab-like models and refining the description down to the final transistor level. This goal will be achieved with an integrated use of VHDL for the digital blocks and VHDL-AMS for the mixed-signal and analog circuits. Coherent results are obtained using VHDL-AMS and Matlab. However, the CPU time cost strongly depends on the description used in the VHDL-AMS models. In order to show the functionality of the UWB architecture, the receiver most critical functions are simulated showing results in good agreement with the expectations

    Power-performance assessment of different DVFS control policies in NoCs

    Get PDF
    We analyze the power-delay trade-off in a Network-on-Chip (NoC) under three Dynamic Voltage and Frequency Scaling (DVFS) policies. The first rate-based policy sets frequency and voltage of the NoC to the minimum value that allows to sustain the injection rate without reaching saturation. The second queue-based policy uses a feedback-loop approach to throttle the NoC frequency and voltage such that the average backlog of the injection queues tracks a target value. The third delay-based policy uses a closed- loop strategy that targets a given NoC end-to-end average delay. We first show that, despite the different mechanism and implementation, both rate-based and queue-based policies obtain very similar results in terms of power and delay, and we propose a theoretical interpretation of this similarity. Then, we show that delay-based policy generally offers a better power-delay trade-off. We obtained our results with an extensive set of experiments on synthetic traffic, as well as multimedia, communications and PARSEC benchmarks. For all the experiments, we report both cycle-accurate simulation results for the analysis of NoC delay and accurate power results obtained targeting a standard-cell library in an advanced 28-nm FDSOI CMOS technology

    An effective AMS Top-Down Methodology Applied to the Design of a Mixed-SignalUWB System-on-Chip

    Get PDF
    The design of Ultra Wideband (UWB) mixed-signal SoC for localization applications in wireless personal area networks is currently investigated by several researchers. The complexity of the design claims for effective top-down methodologies. We propose a layered approach based on VHDL-AMS for the first design stages and on an intelligent use of a circuit-level simulator for the transistor-level phase. We apply the latter just to one block at a time and wrap it within the system-level VHDL-AMS description. This method allows to capture the impact of circuit-level design choices and non-idealities on system performance. To demonstrate the effectiveness of the methodology we show how the refinement of the design affects specific UWB system parameters such as bit-error rate and localization estimations

    A 1-bit Synchronization Algorithm for a Reduced Complexity Energy Detection UWB Receiver

    Get PDF
    This work investigates the possibility of performing synchronization in a reduced complexity Energy Detection receiver. A new receiver scheme employing a single comparator only is defined and the related synchronization algorithm is presented. The possibility of synchronizing has been analyzed both for an idealized Dirac Delta input signal and for realistic UWB signals obtained through the TG4a channel model. The matlab simulations show that it is possible to obtain coarse synchronization using a simple maximum detection algorithm computed on collected energies for the ideal case of Dirac Delta pulses. For realistic UWB signals better synchronization performances are possible by employing a searchback algorithm. Due to the low complexity of the receiver scheme, the synchronization algorithm requires a long locking time
    • 

    corecore