9,130 research outputs found

    Floating Point Square Root under HUB Format

    Get PDF
    Unit-Biased (HUB) is an emerging format based on shifting the representation line of the binary numbers by half unit in the last place. The HUB format is specially relevant for computers where rounding to nearest is required because it is performed simply by truncation. From a hardware point of view, the circuits implementing this representation save both area and time since rounding does not involve any carry propagation. Designs to perform the four basic operations have been proposed under HUB format recently. Nevertheless, the square root operation has not been confronted yet. In this paper we present an architecture to carry out the square root operation under HUB format for floating point numbers. The results of this work keep supporting the fact that the HUB representation involves simpler hardware than its conventional counterpart for computers requiring round-to-nearest mode.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tec

    An on-line approach for evaluating trigonometric functions

    Get PDF
    This thesis investigates the evaluation of trigonometric functions based on an on-line arithmetic approach. On-line algorithms have been developed to evaluate the sine and cosine functions. Error analysis and heuristics are carried out to arrive at a minimal error algorithm based on the series expansion of the sine and cosine function. A logical design based on the algorithm is presented where the unit is designed as a set of basic modules. A detailed bit slice design of each module is also presented. A simulator was designed as an experimental tool for synthesis of the on-line algorithms, and a tool for performance evaluation

    On-The-Fly Range Reduction

    Get PDF
    In several cases, the input argument of an elementary function evaluation is given bit-serially, most significant bit first. We suggest a solution for performing the first step of the evaluation (namely, the range reduction) on the fly: the computation is overlapped with the reception of the input bits. This algorithm can be used for the trigonometric functions sin, cos, tan as well as for the exponential function.Il arrive que l’oprande dont on doit calculer une fonction élémentaire soit disponible chiffre après chiffre, en série, en commençant par les poids forts. Nous proposons une solution permettant d’effectuer la première phase de l’évaluation(la réduction d’argument)au vol: le calcul et la réception des chiffres d’entré se recouvrent. Cet algorithme peut être utilisé pour les fonctions trigonométriques sin, cos, tan ainsi que pour l'exponentiell

    Carry-Free Radix-2 Subtractive Division Algorithm and Implementation of the Divider

    Get PDF
    [[abstract]]A carry-free subtractive division algorithm is proposed in this paper. In the conventional subtractive divider, adders are used to find both quotient bit and partial remainder. Carries are usually generated in the addition operation, and it may take time to finish the operation, therefore, the carry propagation delay usually is a bottleneck of the conventional subtractive divider. In this paper, a carry-free scheme is proposed by using signed bit representation to represent both quotient and partial remainder. During the arithmetic operation, a special technique is used to decide the quotient bit, and the new partial remainder can be found further by a table lookup-like method. The signed bit format of the quotient can be converted by on-the-fly conversion to the binary representation. Based on this algorithm a 32-b/32-b divider is designed and implemented, and the simulation shows that the divider works well.[[notice]]補正完畢[[incitationindex]]E

    Design and implementation of high-radix arithmetic systems based on the SDNR/RNS data representation

    Get PDF
    This project involved the design and implementation of high-radix arithmetic systems based on the hybrid SDNRIRNS data representation. Some real-time applications require a real-time arithmetic system. An SDNR/RNS arithmetic system provides parallel, real-time processing. The advantages and disadvantages of high-radix SDNR/RNS arithmetic, and the feasibility of implementing SDNR/RNS arithmetic systems in CMOS VLSI technology, were investigated in this project. A common methodological model, which included the stages of analysis, design, implementation, testing, and simulation, was followed. The combination of the SDNR and RNS transforms potential complex logic networks into simpler logic blocks. It was found that when constructing a SDNRIRNS adder, factors such as the radix, digit set, and moduli must be taken into account. There are many avenues still to explore. For example, implementing other arithmetic systems in the same CMOS VLSI technology used in this project and comparing them to equivalent SDNR/RNS systems would provide a set of benchmarks. These benchmarks would be useful in addressing issues relating to relative performance

    Random on-board pixel sampling (ROPS) X-ray Camera

    Full text link
    Recent advances in compressed sensing theory and algorithms offer new possibilities for high-speed X-ray camera design. In many CMOS cameras, each pixel has an independent on-board circuit that includes an amplifier, noise rejection, signal shaper, an analog-to-digital converter (ADC), and optional in-pixel storage. When X-ray images are sparse, i.e., when one of the following cases is true: (a.) The number of pixels with true X-ray hits is much smaller than the total number of pixels; (b.) The X-ray information is redundant; or (c.) Some prior knowledge about the X-ray images exists, sparse sampling may be allowed. Here we first illustrate the feasibility of random on-board pixel sampling (ROPS) using an existing set of X-ray images, followed by a discussion about signal to noise as a function of pixel size. Next, we describe a possible circuit architecture to achieve random pixel access and in-pixel storage. The combination of a multilayer architecture, sparse on-chip sampling, and computational image techniques, is expected to facilitate the development and applications of high-speed X-ray camera technology.Comment: 9 pages, 6 figures, Presented in 19th iWoRI

    A high-performance inner-product processor for real and complex numbers.

    Get PDF
    A novel, high-performance fixed-point inner-product processor based on a redundant binary number system is investigated in this dissertation. This scheme decreases the number of partial products to 50%, while achieving better speed and area performance, as well as providing pipeline extension opportunities. When modified Booth coding is used, partial products are reduced by almost 75%, thereby significantly reducing the multiplier addition depth. The design is applicable for digital signal and image processing applications that require real and/or complex numbers inner-product arithmetic, such as digital filters, correlation and convolution. This design is well suited for VLSI implementation and can also be embedded as an inner-product core inside a general purpose or DSP FPGA-based processor. Dynamic control of the computing structure permits different computations, such as a variety of inner-product real and complex number computations, parallel multiplication for real and complex numbers, and real and complex number division. The same structure can also be controlled to accept redundant binary number inputs for multiplication and inner-product computations. An improved 2's-complement to redundant binary converter is also presented

    Fast decimal floating-point division

    Get PDF
    A new implementation for decimal floating-point (DFP) division is introduced. The algorithm is based on high-radix SRT division The SRT division algorithm is named after D. Sweeney, J. E. Robertson, and T. D. Tocher. with the recurrence in a new decimal signed-digit format. Quotient digits are selected using comparison multiples, where the magnitude of the quotient digit is calculated by comparing the truncated partial remainder with limited precision multiples of the divisor. The sign is determined concurrently by investigating the polarity of the truncated partial remainder. A timing evaluation using a logic synthesis shows a significant decrease in the division execution time in contrast with one of the fastest DFP dividers reported in the open literatureHooman Nikmehr, Braden Phillips and Cheng-Chew Li
    • …
    corecore