416 research outputs found

    Comparision of Different Logic style for High Performance Wave Pipeline Circuit

    Get PDF
    High throughput and low latency designs are required in modern high performance systems, especially for signal processing applications .Existing logic families cannot provide both of them simultaneously. We propose Double Pass Transistor Logic (DPL) which can be used as a universal logic to provide finest grain pipelining without affecting overall latency or increasing the area. It  does not require  any special process steps and hence, can be  realized  in  a  normal process  technology  as  against  the CPL proposed  by  Yano et  al  [2] which uses  threshold  voltage  adjustment  of  selected  devices.  The design procedure is described for (a) low latency, (b) high throughput and (c) low area requirements. In  addition to the various  advantages,  it  is envisioned  that DPL  designs  can also be used to build ultra-high speed  pipelined system without pipelining latches, viz., wave pipelined  digital systems,  where the throughput achievable  is beyond  that permitted  by  the delay  of  a pipeline stage

    Design of High Performance Modified Wave pipelined DAA Filter with Critical Path Approach

    Get PDF
    In this paper, a new high speed control circuit is proposed which will act as a critical path for the data which will go from input to output to improve the performance of wave pipelining circuits The wave pipelining is a method of high performance circuit designs which implements pipelining in logic without the use of intermediate registers. Wave pipelining has been widely used in the past few years with a great deal of significant features in technology and applications. It has the ability to improve speed, efficiency, economy in every aspect which it presents. Wave pipelining is being used in wide range of applications like digital filters, network routers, multipliers, fast convolvers, MODEMs, image processing, control systems, radars and many others. In previous work, the operating speed of the wave-pipelined circuit can be increased by the following three tasks: adjustment of the clock period, clock skew and equalization of path delays. The path-delay equalization task can be done theoretically, but the real challenge is to accomplish it in the presence of various different delays. So, the main objective of this paper is to solve the path delay equalization problem by inserting the control circuit in wave pipelined based circuit which will act as critical path for the data that moves from input to output. The proposed technique is evaluated for DSP applications by designing 4- tap FIR filter using Distributed arithmetic algorithm (DAA). Then comparison of this design is done with 4-tap FIR filter designs using conventional pipelining and non pipelining. The synthesis and simulation results based on Xilinx ISE Navigator 12.3 shows that wave pipelined DAA based filter is faster by a factor of 1.43 compared to non pipelined one and the conventional pipelined filter is faster than non pipelined by factor of 1.61 but at the cost of increased logic utilization by 200 %. So, the wave-pipelined DA filters designed with the proposed control circuit can operate at higher frequency than that of non-pipelined but less than that of pipelined. The gain in speed in pipelined compared to that of wavepipelined is at the cost of increased area and more dissipated power. When latency is considered, wavepipelined design filters with the proposed scheme are having the lowest latency among three schemes designed

    A Practical Application of Wave-Pipelining Theory on a Adaptive Differential Pulse Code Modulation Coder-Decoder Design

    Get PDF
    Pipeline architectures are often considered in VLSI designs that require high throughput. The draw-backs for traditional pipelined architectures are the increased area, power, and latency required for implementation. However, with increased design effort, wave-pipelining can be applied as an alternative to a pipelined circuit to reduce the pipeline area, power, and latency while maintaining the original functionality and timing of the overall circuit. The objective of this paper is the successful application of the theories of wave-pipelining in a practical digital system. To accomplish this, the pipelined portion of an Multi-Channel Adaptive Differential Pulse Code Modulation (ADPCM) Coder-Decoder (CODEC) is replaced with a wave pipeline design

    FPGA Acceleration of 3GPP Channel Model Emulator for 5G New Radio

    Get PDF
    The channel model is by far the most computing intensive part of the link level simulations of multiple-input and multiple-output (MIMO) fifth-generation new radio (5G NR) communication systems. Simulation effort further increases when using more realistic geometry-based channel models, such as the three-dimensional spatial channel model (3D-SCM). Channel emulation is used for functional and performance verification of such models in the network planning phase. These models use multiple finite impulse response (FIR) filters and have a very high degree of parallelism which can be exploited for accelerated execution on Field Programmable Gate Array (FPGA) and Graphics Processing Unit (GPU) platforms. This paper proposes an efficient re-configurable implementation of the 3rd generation partnership project (3GPP) 3D-SCM on FPGAs using a design flow based on high-level synthesis (HLS). It studies the effect of various HLS optimization techniques on the total latency and hardware resource utilization on Xilinx Alveo U280 and Intel Arria 10GX 1150 high-performance FPGAs, using in both cases the commercial HLS tools of the producer. The channel model accuracy is preserved using double precision floating point arithmetic. This work analyzes in detail the effort to target the FPGA platforms using HLS tools, both in terms of common parallelization effort (shared by both FPGAs), and in terms of platform-specific effort, different for Xilinx and Intel FPGAs. Compared to the baseline general-purpose central processing unit (CPU) implementation, the achieved speedups are 65X and 95X using the Xilinx UltraScale+ and Intel Arria FPGA platform respectively, when using a Double Data Rate (DDR) memory interface. The FPGA-based designs also achieved ~3X better performance compared to a similar technology node NVIDIA GeForce GTX 1070 GPU, while consuming ~4X less energy. The FPGA implementation speedup improves up to 173X over the CPU baseline when using the Xilinx UltraRAM (URAM) and High-Bandwidth Memory (HBM) resources, also achieving 6X lower latency and 12X lower energy consumption than the GPU implementation

    Exploration and Design of High Performance Variation Tolerant On-Chip Interconnects

    Get PDF
    Siirretty Doriast

    Exploiting level sensitive latches in wire pipelining

    Get PDF
    The present research presents procedures for exploitation of level sensitive latches in wire pipelining. The user gives a Steiner tree, having a signal source and set of destination or sinks, and the location in rectangular plane, capacitive load and required arrival time at each of the destinations. The user also defines a library of non-clocked (buffer) elements and clocked elements (flip-flop and latch), also known as synchronous elements. The first procedure performs concurrent repeater and synchronous element insertion in a bottom-up manner to find the minimum latency that may be achieved between the source and the destinations. The second procedure takes additional input (required latency) for each destination, derived from previous procedure, and finds the repeater and synchronous element assignments for all internal nodes of the Steiner tree, which minimize overall area used. These procedures utilize the latency and area advantages of latch based pipelining over flip-flop based pipelining. The second procedure suggests two methods to tackle the challenges that exist in a latch based design. The deferred delay padding technique is introduced, which removes the short path violations for latches with minimal extra cost

    Ultrasound Beamforming on a FPGA

    Get PDF

    Yield Modeling and Analysis of a Clockless Asynchronous Wave Pipeline with Pulse Faults

    Get PDF
    corecore