519 research outputs found

    High-radix division and square-root with speculation

    The speed of high-radix digit-recurrence dividers and square-root units is mainly determined by the complexity of the result-digit selection. We present a scheme in which a simpler function speculates the result digit, and, when this speculation is incorrect, a rollback or a partial advance is performed. This results in operations with a shorter cycle time and a variable number of cycles. The scheme can be used in separate division and square-root units, or in a combined one. Several designs were realized and compared in terms of execution time and area. The fastest unit considered is a radix-512 divider with a partial advance of six bits. Peer reviewed; postprint (published version).

    Division with speculation of quotient digits

    The speed of SRT-type dividers is mainly determined by the complexity of the quotient-digit selection, so implementations are limited to low-radix stages. A scheme is presented in which the quotient digit is speculated and, when this speculation is incorrect, a rollback or a partial advance is performed. This results in a division operation with a shorter cycle time and a variable number of cycles. Several designs have been realized. A radix-64 implementation is 30% faster than the fastest conventional implementation (radix-8), at an increase of about 45% in area per quotient bit. A radix-16 implementation is about 10% faster than the conventional radix-8 one, with the additional advantage of requiring about 25% less area per quotient bit. Peer reviewed; postprint (published version).
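
The speculate-then-correct idea in the abstract above can be illustrated with a toy software model. This is only a sketch under stated assumptions: the function name `speculative_divide`, the truncation-based guess, and the one-extra-cycle rollback cost are all hypothetical choices made for illustration. It uses a non-redundant digit set and plain integers, so it is not the selection function or the hardware described in the paper, and it models rollback only (no partial advance).

```python
def speculative_divide(x, d, digits, radix_bits=4, kept_bits=6):
    """Toy digit-recurrence division with a speculated quotient digit.

    Computes floor(x * r**digits / d) for 0 <= x < d, with r = 2**radix_bits.
    Returns (quotient, final_residual, cycles). All parameter names are
    illustrative assumptions, not taken from the paper.
    """
    if d <= 0 or not 0 <= x < d:
        raise ValueError("require 0 <= x < d")
    r = 1 << radix_bits
    # Keep only the top `kept_bits` bits of the divisor for the cheap guess.
    shift = max(d.bit_length() - kept_bits, 0)
    # Rounding the truncated divisor up guarantees the guess is never too large.
    d_est = (d >> shift) + 1

    w = x          # residual, always kept in [0, d)
    q = 0          # quotient accumulated digit by digit
    cycles = 0
    for _ in range(digits):
        rw = r * w
        guess = (rw >> shift) // d_est      # simple, speculative digit
        w_next = rw - guess * d
        cycles += 1
        if not 0 <= w_next < d:
            # Mis-speculation: roll back and redo the step with the exact digit.
            guess = rw // d
            w_next = rw - guess * d
            cycles += 1
        q = q * r + guess
        w = w_next
    return q, w, cycles


q, w, cycles = speculative_divide(355, 1000, digits=4)
print(q, w, cycles)
```

The quotient is the same regardless of how often the guess fails; only the cycle count changes, which is the variable-latency behaviour the abstract refers to.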

    A multi-radix approach to asynchronous division

    The speed of high-radix digit-recurrence dividers is mainly determined by the hardware complexity of the quotient-digit selection function. In this paper we present a scheme that combines the area efficiency of bundled data with data-dependent computation time. In this scheme the selection function is very simple and may be implemented using a fast adder. This function speculates the result digit and, when the speculation is incorrect, a correction of the quotient and of the residual must be performed. When the residual satisfies some constraints it is also possible to switch to a higher radix, computing a fraction of the next digit in advance. This results in a division scheme with a variable iteration time and a variable number of iterations, and hence with an asynchronous behaviour. Several designs were realized and compared both in terms of execution time and area. The fastest unit considered is a radix-64 divider that may switch to radix 128 or 256. Our evaluations show that area × delay savings from 25% to 65%, compared to equivalent synchronous designs, may be achieved. Peer reviewed; postprint (published version).

    A radix-16 SRT division unit with speculation of the quotient digits

    The speed of a divider based on a digit-recurrence algorithm depends mainly on the latency of the quotient-digit generation function. In this paper we present an analytical approach that extends the theory developed for standard SRT division and permits us to implement division schemes where a simpler function speculates the quotient digit. This leads to division units with a shorter cycle time and variable latency, since a speculation error may occur and a post-correction of the quotient may be necessary. We have applied our algorithm to the design of a radix-16 speculative divider for double-precision floating-point numbers, which proved faster than analogous implementations. Peer reviewed; postprint (published version).

    High sample-rate Givens rotations for recursive least squares

    The design of an application-specific integrated circuit of a parallel array processor is considered for recursive least squares by QR decomposition using Givens rotations, applicable in adaptive filtering and beamforming applications. Emphasis is on high sample-rate operation, which, for this recursive algorithm, means that the time to perform arithmetic operations is critical. The algorithm, architecture and arithmetic are considered in a single integrated design procedure to achieve optimum results. A realisation approach using standard arithmetic operators (add, multiply and divide) is adopted. The design of high-throughput operators with low delay is addressed for fixed- and floating-point number formats, and the application of redundant arithmetic is considered. New redundant multiplier architectures are presented enabling reductions in area of up to 25%, whilst maintaining low delay. A technique is presented enabling the use of a conventional tree multiplier in recursive applications, allowing savings in area and delay. Two new divider architectures are presented showing benefits compared with the radix-2 modified SRT algorithm. Givens rotation algorithms are examined to determine their suitability for VLSI implementation. A novel algorithm, based on the Squared Givens Rotation (SGR) algorithm, is developed enabling the sample-rate to be increased by a factor of approximately 6 and offering area reductions up to a factor of 2 over previous approaches. An estimated sample-rate of 136 MHz could be achieved using a standard cell approach and 0.35 µm CMOS technology. The enhanced SGR algorithm has been compared with a CORDIC approach and shown to benefit by a factor of 3 in area and over 11 in sample-rate. When compared with a recent implementation on a parallel array of general purpose (GP) DSP chips, it is estimated that a single application-specific chip could offer up to 1,500 times the computation obtained from a single GP DSP chip.
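
For readers unfamiliar with the basic operation behind the abstract above, here is a minimal sketch of a conventional Givens rotation, the building block of QR decomposition. It is the textbook form, which needs a square root; it is not the Squared Givens Rotation (SGR) variant or the enhanced algorithm developed in the thesis. The helper names `givens` and `apply_rotation` are assumptions made for this illustration.

```python
import math


def givens(a, b):
    """Return (c, s, r) such that rotating the pair (a, b) gives (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        # Nothing to rotate; use the identity rotation.
        return 1.0, 0.0, 0.0
    return a / r, b / r, r


def apply_rotation(c, s, x, y):
    """Apply the rotation [[c, s], [-s, c]] to the pair (x, y)."""
    return c * x + s * y, -s * x + c * y


c, s, r = givens(3.0, 4.0)
top, bottom = apply_rotation(c, s, 3.0, 4.0)
print(r, top, bottom)  # bottom is zero up to rounding
```

In a QR-based recursive least squares array, the same `(c, s)` pair would be applied to every remaining column of the two rows being combined, which is why the latency of the divide and square-root operations matters for sample rate.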

    Carry-Free Radix-2 Subtractive Division Algorithm and Implementation of the Divider

    A carry-free subtractive division algorithm is proposed in this paper. In a conventional subtractive divider, adders are used to find both the quotient bit and the partial remainder. Carries are usually generated in the addition operation and take time to propagate, so the carry-propagation delay is usually the bottleneck of a conventional subtractive divider. In this paper, a carry-free scheme is proposed that uses a signed-bit representation for both the quotient and the partial remainder. During the arithmetic operation, a special technique is used to decide the quotient bit, and the new partial remainder is then found by a table-lookup-like method. The signed-bit format of the quotient can be converted to the binary representation by on-the-fly conversion. Based on this algorithm, a 32-b/32-b divider is designed and implemented, and simulation shows that the divider works well.
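
The on-the-fly conversion mentioned in the abstract above can be sketched in a few lines. This is the standard radix-2 scheme that tracks both the quotient so far and that value minus one, so that a negative digit never triggers a borrow chain. The function name `on_the_fly_convert` is an assumption, and Python integers stand in for the two hardware registers; the paper's own circuit may differ.

```python
def on_the_fly_convert(digits):
    """Convert radix-2 signed digits (-1, 0, 1), most significant first,
    to an ordinary integer without any carry or borrow propagation."""
    q, qm = 0, -1  # invariant: qm == q - 1
    for d in digits:
        if d == 1:
            q, qm = 2 * q + 1, 2 * q        # append 1 to Q; QM is Q with 0 appended
        elif d == 0:
            q, qm = 2 * q, 2 * qm + 1       # append 0 to Q; QM appends 1
        elif d == -1:
            q, qm = 2 * qm + 1, 2 * qm      # Q is QM with 1 appended
        else:
            raise ValueError("digits must be -1, 0 or 1")
    return q


print(on_the_fly_convert([1, -1, 0, 1, -1]))  # 16 - 8 + 0 + 2 - 1 = 9
```

Each step only shifts one of the two running values and appends a single bit, which is what makes the conversion possible at the same rate as the digits are produced.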

    State of the art baseband DSP platforms for Software Defined Radio: A survey

    Software Defined Radio (SDR) is an innovative approach that is becoming an increasingly promising technology for future mobile handsets. Several proposals in the field of embedded systems have been introduced by different universities and industries to support SDR applications. This article presents an overview of current platforms and analyzes the related architectural choices, the current issues in SDR, as well as potential future trends. Peer reviewed.

    Design of Fast Pipelined Multiplier using Modified Redundant Adder
