9,130 research outputs found
Floating Point Square Root under HUB Format
Half-Unit-Biased (HUB) is an emerging format based on
shifting the representation line of the binary numbers by half
a unit in the last place. The HUB format is especially relevant
for computers where rounding to nearest is required because
it is performed simply by truncation. From a hardware point
of view, the circuits implementing this representation save both
area and time since rounding does not involve any carry propagation.
Designs to perform the four basic operations have been
proposed under HUB format recently. Nevertheless, the square
root operation has not been addressed yet. In this paper we
present an architecture to carry out the square root operation
under HUB format for floating point numbers. The results of
this work keep supporting the fact that the HUB representation
involves simpler hardware than its conventional counterpart for
computers requiring round-to-nearest mode.
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tec
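The claim in the HUB abstract above, that round-to-nearest reduces to truncation, can be illustrated with a minimal integer sketch. It assumes the usual HUB convention that every stored digit string carries an implicit trailing 1 (half a unit of the kept least significant bit). The function names `hub_round` and `hub_value` are hypothetical and are not taken from the paper.

```python
def hub_round(value: int, drop_bits: int) -> int:
    """Return the explicit HUB digits of a non-negative integer.

    Rounding is plain truncation: the low `drop_bits` bits are discarded
    and no increment (hence no carry propagation) is needed.
    """
    if drop_bits < 1:
        raise ValueError("drop_bits must be at least 1")
    return value >> drop_bits


def hub_value(digits: int, drop_bits: int) -> int:
    """Value represented by HUB digits, in units of the original LSB.

    The implicit trailing 1 adds half a unit of the kept LSB.
    """
    return (digits << drop_bits) + (1 << (drop_bits - 1))


if __name__ == "__main__":
    v, k = 0b10110111, 3
    d = hub_round(v, k)
    print(v, "->", hub_value(d, k), "error", abs(v - hub_value(d, k)))
```

Because the representable HUB values sit at the midpoints of the truncation intervals, the truncated result is never more than half a kept unit away from the input.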
An on-line approach for evaluating trigonometric functions
This thesis investigates the evaluation of trigonometric functions based on an on-line arithmetic approach. On-line algorithms have been developed to evaluate the sine and cosine functions. Error analysis and heuristics are carried out to arrive at a minimal error algorithm based on the series expansion of the sine and cosine function.
A logical design based on the algorithm is presented where the unit is designed as a set of basic modules. A detailed bit-slice design of each module is also presented. A simulator was designed as an experimental tool for synthesis of the on-line algorithms, and as a tool for performance evaluation.
On-The-Fly Range Reduction
In several cases, the input argument of an elementary function evaluation is given bit-serially, most significant bit first. We suggest a solution for performing the first step of the evaluation (namely, the range reduction) on the fly: the computation is overlapped with the reception of the input bits. This algorithm can be used for the trigonometric functions sin, cos, tan as well as for the exponential function.
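The idea of overlapping range reduction with the arrival of the input bits can be sketched as follows. This is only an illustration under simplifying assumptions, not the algorithm of the paper: it uses floating-point arithmetic, reduces modulo `math.pi`, and looks up the residue of each bit weight as the bit arrives. The names `to_bits` and `reduce_on_the_fly` are hypothetical.

```python
import math


def to_bits(n: int, width: int) -> list:
    """Bits of a non-negative integer, most significant bit first."""
    return [(n >> i) & 1 for i in range(width - 1, -1, -1)]


def reduce_on_the_fly(bits, msb_exp: int, modulus: float = math.pi) -> float:
    """Consume bits MSB first and keep a running residue in [0, modulus).

    The bit at position i has weight 2**(msb_exp - i). Each weight's residue
    could be precomputed in a table; here it is computed on demand.
    """
    r = 0.0
    exp = msb_exp
    for b in bits:
        if b:
            r += math.fmod(2.0 ** exp, modulus)
            if r >= modulus:  # both terms are below modulus, one subtraction suffices
                r -= modulus
        exp -= 1
    return r


if __name__ == "__main__":
    n, frac_bits, width = 100 * 256 + 37, 8, 16  # x = n / 2**frac_bits
    bits = to_bits(n, width)
    print(reduce_on_the_fly(bits, width - 1 - frac_bits))
    print(math.fmod(n / 2 ** frac_bits, math.pi))
```

The reduced argument is available as soon as the last input bit has been received, since each step only adds one residue and performs at most one subtraction.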
Carry-Free Radix-2 Subtractive Division Algorithm and Implementation of the Divider
A carry-free subtractive division algorithm is proposed in this paper. In the conventional subtractive divider, adders are used to find both the quotient bit and the partial remainder. Additions usually generate carries, and the resulting carry propagation delay is typically the bottleneck of the conventional subtractive divider. In this paper, a carry-free scheme is proposed that uses a signed bit representation for both the quotient and the partial remainder. During the arithmetic operation, a special technique is used to decide the quotient bit, and the new partial remainder is then found by a table-lookup-like method. The signed bit format of the quotient can be converted to the binary representation by on-the-fly conversion. Based on this algorithm, a 32-b/32-b divider is designed and implemented, and the simulation shows that the divider works well.
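The on-the-fly conversion mentioned in the divider abstract can be sketched with the standard two-register scheme for radix-2 signed digits. The sketch assumes quotient digits in {-1, 0, 1} arriving most significant first. It illustrates the general technique only, not the specific circuit of the paper, and the function name is hypothetical.

```python
def on_the_fly_convert(digits) -> int:
    """Convert radix-2 signed digits (MSD first) to an ordinary integer.

    Two registers are kept: q holds the value so far and qm holds q - 1.
    Each step only shifts one register and appends a bit, so a negative
    digit never triggers a borrow that ripples through the result.
    """
    q, qm = 0, -1
    for d in digits:
        if d not in (-1, 0, 1):
            raise ValueError("digits must be -1, 0 or 1")
        if d == 1:
            q, qm = (q << 1) | 1, (q << 1)
        elif d == 0:
            q, qm = (q << 1), (qm << 1) | 1
        else:  # d == -1
            q, qm = (qm << 1) | 1, (qm << 1)
    return q


if __name__ == "__main__":
    # 1*16 - 1*8 + 0*4 - 1*2 + 1*1
    print(on_the_fly_convert([1, -1, 0, -1, 1]))
```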
Design and implementation of high-radix arithmetic systems based on the SDNR/RNS data representation
This project involved the design and implementation of high-radix arithmetic systems based on the hybrid SDNR/RNS data representation. Some real-time applications require a real-time arithmetic system. An SDNR/RNS arithmetic system provides parallel, real-time processing. The advantages and disadvantages of high-radix SDNR/RNS arithmetic, and the feasibility of implementing SDNR/RNS arithmetic systems in CMOS VLSI technology, were investigated in this project. A common methodological model, which included the stages of analysis, design, implementation, testing, and simulation, was followed. The combination of the SDNR and RNS transforms potentially complex logic networks into simpler logic blocks. It was found that when constructing an SDNR/RNS adder, factors such as the radix, digit set, and moduli must be taken into account. There are many avenues still to explore. For example, implementing other arithmetic systems in the same CMOS VLSI technology used in this project and comparing them to equivalent SDNR/RNS systems would provide a set of benchmarks. These benchmarks would be useful in addressing issues relating to relative performance.
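To make the RNS half of the representation concrete, the following sketch shows carry-free, channel-wise addition in a residue number system. The moduli (3, 5, 7) are a hypothetical choice for illustration, and the signed-digit (SDNR) encoding of each residue channel described in the abstract is not modelled here.

```python
from math import prod

MODULI = (3, 5, 7)  # hypothetical pairwise-coprime moduli, dynamic range 105


def to_rns(x: int) -> tuple:
    """Encode an integer as its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)


def rns_add(a: tuple, b: tuple) -> tuple:
    """Add channel by channel; no carry crosses between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues: tuple) -> int:
    """Decode with the Chinese Remainder Theorem (requires Python 3.8+)."""
    big_m = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)  # modular inverse of m_i mod m
    return total % big_m


if __name__ == "__main__":
    s = rns_add(to_rns(17), to_rns(40))
    print(s, from_rns(s))
```

Each channel works on small numbers independently, which is the source of the parallelism the abstract refers to. The choice of moduli fixes the dynamic range and the width of each channel.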
Random on-board pixel sampling (ROPS) X-ray Camera
Recent advances in compressed sensing theory and algorithms offer new
possibilities for high-speed X-ray camera design. In many CMOS cameras, each
pixel has an independent on-board circuit that includes an amplifier, noise
rejection, signal shaper, an analog-to-digital converter (ADC), and optional
in-pixel storage. When X-ray images are sparse, i.e., when one of the following
cases is true: (a.) The number of pixels with true X-ray hits is much smaller
than the total number of pixels; (b.) The X-ray information is redundant; or
(c.) Some prior knowledge about the X-ray images exists, sparse sampling may be
allowed. Here we first illustrate the feasibility of random on-board pixel
sampling (ROPS) using an existing set of X-ray images, followed by a discussion
about signal to noise as a function of pixel size. Next, we describe a possible
circuit architecture to achieve random pixel access and in-pixel storage. The
combination of a multilayer architecture, sparse on-chip sampling, and
computational image techniques, is expected to facilitate the development and
applications of high-speed X-ray camera technology.
Comment: 9 pages, 6 figures, Presented in 19th iWoRI
A high-performance inner-product processor for real and complex numbers.
A novel, high-performance fixed-point inner-product processor based on a redundant binary number system is investigated in this dissertation. This scheme decreases the number of partial products to 50%, while achieving better speed and area performance, as well as providing pipeline extension opportunities. When modified Booth coding is used, partial products are reduced by almost 75%, thereby significantly reducing the multiplier addition depth. The design is applicable to digital signal and image processing applications that require real and/or complex-number inner-product arithmetic, such as digital filters, correlation and convolution. This design is well suited for VLSI implementation and can also be embedded as an inner-product core inside a general-purpose or DSP FPGA-based processor. Dynamic control of the computing structure permits different computations, such as a variety of inner-product real and complex number computations, parallel multiplication for real and complex numbers, and real and complex number division. The same structure can also be controlled to accept redundant binary number inputs for multiplication and inner-product computations. An improved 2's-complement to redundant binary converter is also presented.
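The modified Booth coding step mentioned in the abstract can be sketched as a radix-4 recoding of a two's-complement multiplier. The sketch shows only how the recoding halves the number of multiplier digits, and therefore of partial products. It does not reproduce the redundant binary part of the design, and the function name is hypothetical.

```python
def booth_radix4(y: int, n_bits: int) -> list:
    """Recode an n_bits two's-complement integer into radix-4 digits.

    Returns n_bits // 2 digits in {-2, -1, 0, 1, 2}, least significant first,
    using digit_i = y[2i-1] + y[2i] - 2*y[2i+1] with y[-1] = 0.
    """
    if n_bits % 2 != 0:
        raise ValueError("n_bits must be even")
    if not -(1 << (n_bits - 1)) <= y < (1 << (n_bits - 1)):
        raise ValueError("y does not fit in n_bits")
    u = y & ((1 << n_bits) - 1)  # two's-complement bit pattern

    def bit(i: int) -> int:
        return 0 if i < 0 else (u >> i) & 1

    return [bit(2 * i - 1) + bit(2 * i) - 2 * bit(2 * i + 1)
            for i in range(n_bits // 2)]


def booth_multiply(x: int, y: int, n_bits: int) -> int:
    """Multiply by summing one partial product per radix-4 digit."""
    return sum(d * x * 4 ** i for i, d in enumerate(booth_radix4(y, n_bits)))


if __name__ == "__main__":
    print(booth_radix4(-7, 8), booth_multiply(13, -7, 8))
```

An 8-bit multiplier yields four partial products instead of eight, each a multiple of the multiplicand by 0, ±1 or ±2, which needs only shifts and negation.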
Fast decimal floating-point division
A new implementation for decimal floating-point (DFP) division is introduced. The algorithm is based on high-radix SRT division (named after D. Sweeney, J. E. Robertson, and T. D. Tocher) with the recurrence in a new decimal signed-digit format. Quotient digits are selected using comparison multiples, where the magnitude of the quotient digit is calculated by comparing the truncated partial remainder with limited-precision multiples of the divisor. The sign is determined concurrently by examining the polarity of the truncated partial remainder. A timing evaluation using logic synthesis shows a significant decrease in the division execution time compared with one of the fastest DFP dividers reported in the open literature.
Hooman Nikmehr, Braden Phillips and Cheng-Chew Li