    An on-line approach for evaluating trigonometric functions

    This thesis investigates the evaluation of trigonometric functions based on an on-line arithmetic approach. On-line algorithms have been developed to evaluate the sine and cosine functions. Error analysis and heuristics are carried out to arrive at a minimal-error algorithm based on the series expansions of the sine and cosine functions. A logical design based on the algorithm is presented, in which the unit is designed as a set of basic modules. A detailed bit-slice design of each module is also presented. A simulator was designed as an experimental tool for synthesis of the on-line algorithms, and a tool for performance evaluation

    BKM: a new hardware algorithm for complex elementary functions

    A new algorithm for computing the complex logarithm and exponential functions is proposed. This algorithm is based on shift-and-add elementary steps, and it generalizes some algorithms by Briggs and De Lugish (1970), as well as the CORDIC algorithm. It can easily be used to compute the classical real elementary functions (sin, cos, arctan, ln, exp). This algorithm is more suitable for computations in a redundant number system than the CORDIC algorithm, since there is no scaling factor when computing trigonometric function
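The shift-and-add idea mentioned in this abstract can be illustrated with a minimal sketch. This is a simplified real-valued exponential scheme of the kind that BKM is said to generalize, not the BKM algorithm itself; the function name `shift_add_exp`, the step count, and the use of floating point in place of fixed-point hardware registers are all assumptions made for illustration.

```python
import math

def shift_add_exp(x, steps=40):
    """Approximate exp(x) for 0 <= x <= about 1.56 using shift-and-add steps.

    Illustrative sketch only: real hardware would use fixed-point registers
    and a small ROM for the logarithm table.
    """
    # Table of ln(1 + 2^-n); precomputed constants in a hardware design.
    log_table = [math.log1p(2.0 ** -n) for n in range(steps)]
    result = 1.0
    residual = x
    for n in range(steps):
        # Greedy digit selection d_n in {0, 1}.
        if residual >= log_table[n]:
            residual -= log_table[n]
            # Multiply by (1 + 2^-n): one shift and one add in hardware.
            result += result * 2.0 ** -n
    # At this point result == exp(x - residual), with residual close to 0.
    return result
```

Each accepted step multiplies the running result by (1 + 2^-n) while subtracting ln(1 + 2^-n) from the residual, so the product converges to exp(x) as the residual shrinks toward zero.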

    Super - cordic: Low delay cordic architectures for computing complex functions

    This thesis proposes an optimized Co-ordinate Rotation Digital Computer (CORDIC) algorithm in the rotation and extended vectoring modes of the circular co-ordinate system. The CORDIC algorithm computes the values of trigonometric functions and their inverses. The proposed algorithm provides the result with a lower overall latency than existing systems. This is done by using redundant representations and approximations of the required direction and angle of each rotation. The algorithm has been designed to provide the result in a fixed number of iterations: n for the rotation mode and 3⌈n/2⌉ + ⌊n/2⌋ for the extended vectoring mode, where n is a design parameter. In each iteration, the algorithm performs between 0 and p/n parallel rotations, where p is the number of precision bits and n is the selected number of iterations. A technique to handle the scaling factor compensation for such an algorithm is proposed. The results of the functional verification for different values of n and an estimation of the overall latency are presented. Based on the results, guidelines for choosing a value of n to meet the required performance are also presented. M.S.

    Application-Specific Number Representation

    Reconfigurable devices, such as Field Programmable Gate Arrays (FPGAs), enable application-specific number representations. Well-known number formats include fixed-point, floating-point, logarithmic number system (LNS), and residue number system (RNS). Such different number representations lead to different arithmetic designs and error behaviours, thus producing implementations with different performance, accuracy, and cost. To investigate the design options in number representations, the first part of this thesis presents a platform that enables automated exploration of the number representation design space. The second part of the thesis shows case studies that optimise the designs for area, latency or throughput from the perspective of number representations. Automated design space exploration in the first part addresses the following two major issues:
    • Automation requires arithmetic unit generation. This thesis provides optimised arithmetic library generators for logarithmic and residue arithmetic units, which support a wide range of bit widths and achieve significant improvement over previous designs.
    • Generation of arithmetic units requires specifying the bit widths for each variable. This thesis describes an automatic bit-width optimisation tool called R-Tool, which combines dynamic and static analysis methods, and supports different number systems (fixed-point, floating-point, and LNS numbers).
    Putting it all together, the second part explores the effects of application-specific number representation on practical benchmarks, such as radiative Monte Carlo simulation, and seismic imaging computations. Experimental results show that customising the number representations brings benefits to hardware implementations: by selecting a more appropriate number format, we can reduce the area cost by up to 73.5% and improve the throughput by 14.2% to 34.1%; by performing the bit-width optimisation, we can further reduce the area cost by 9.7% to 17.3%.
On the performance side, hardware implementations with customised number formats achieve 5 to potentially over 40 times speedup over software implementations
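To make the difference between number formats concrete, here is a minimal sketch of logarithmic number system (LNS) arithmetic. It is not taken from the thesis: the function names are hypothetical, only positive values are handled (a real LNS format also carries a sign bit and a zero flag), and floating point stands in for the fixed-point logarithm a hardware unit would store.

```python
import math

def to_lns(value):
    """Encode a positive real as its base-2 logarithm (sign handling omitted)."""
    return math.log2(value)

def from_lns(code):
    """Decode an LNS code back to a real value."""
    return 2.0 ** code

def lns_mul(a, b):
    """Multiplication in LNS is a plain addition of the logarithms."""
    return a + b

def lns_add(a, b):
    """Addition in LNS needs the nonlinear function log2(1 + 2^d), d <= 0.

    Hardware units typically approximate this function with tables or
    polynomials; here it is evaluated directly for clarity.
    """
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))
```

The sketch shows the trade-off that makes the choice of format application-specific: multiplication becomes cheap, while addition requires evaluating a nonlinear function.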

    DCT Implementation on GPU

    There has been great progress in the field of graphics processors. Since the speed of conventional CPUs is no longer rising, designers are turning to multi-core, parallel processors. Because of their popularity in parallel processing, GPUs are becoming more and more attractive for many applications. With the increasing demand for GPUs, there is a great need to develop operating systems that use the GPU to full capacity. GPUs offer a very efficient environment for many image processing applications. This thesis explores the processing power of GPUs for digital image compression using the Discrete Cosine Transform
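For readers unfamiliar with the transform named in this abstract, the following is a minimal CPU-side sketch of the unnormalized one-dimensional DCT-II. It is not the thesis's GPU implementation; the function name `dct2` and the direct O(N²) evaluation are assumptions chosen for clarity. Every output coefficient is an independent sum, which is the property that makes the transform a natural fit for parallel hardware.

```python
import math

def dct2(signal):
    """Unnormalized 1-D DCT-II, evaluated directly from its definition.

    X[u] = sum over k of x[k] * cos(pi * (k + 1/2) * u / N)
    """
    n = len(signal)
    return [
        sum(signal[k] * math.cos(math.pi * (k + 0.5) * u / n) for k in range(n))
        for u in range(n)
    ]
```

For a constant input, all of the energy lands in the first coefficient, which is the behaviour that image compression schemes exploit on smooth image blocks.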

    The implementation and applications of multiple-valued logic

    Multiple-Valued Logic (MVL) takes two major forms. Multiple-valued circuits can implement the logic directly by using multiple-valued signals, or the logic can be implemented indirectly with binary circuits, by using more than one binary signal to represent a single multiple-valued signal. Techniques such as carry-save addition can be viewed as indirectly implemented MVL. Both direct and indirect techniques have been shown in the past to provide advantages over conventional arithmetic and logic techniques in algorithms required widely in computing for applications such as image and signal processing. It is possible to implement basic MVL building blocks at the transistor level. However, these circuits are difficult to design due to their non-binary nature. In the design stage they are more like analogue circuits than binary circuits. Current integrated circuit technologies are biased towards binary circuitry. However, in spite of this, there is potential for power and area savings from MVL circuits, especially in technologies such as BiCMOS. This thesis shows that the use of voltage mode MVL will, in general, not provide bandwidth increases on circuit buses because the buses become slower as the number of signal levels increases. Current mode MVL circuits, however, do have potential to reduce power and area requirements of arithmetic circuitry. The design of transistor level circuits is investigated in terms of a modern production technology. A novel methodology for the design of current mode MVL circuits is developed. The methodology is based upon the novel concept of the use of non-linear current encoding of signals, providing the opportunity for the efficient design of many previously unimplemented circuits in current mode MVL. This methodology is used to design a useful set of basic MVL building blocks, and fabrication results are reported. The creation of libraries of MVL circuits is also discussed.
The CORDIC algorithm for two dimensional vector rotation is examined in detail as an example for indirect MVL implementation. The algorithm is extended to a set of three dimensional vector rotators using conventional arithmetic, redundant radix four arithmetic, and Taylor's series expansions. These algorithms can be used for two dimensional vector rotations in which no scale factor corrections are needed. The new algorithms are compared in terms of basic VLSI criteria against previously reported algorithms. A pipelined version of the redundant arithmetic algorithm is floorplanned and partially laid out to give indications of wiring overheads, and layout densities. An indirectly implemented MVL algorithm such as the CORDIC algorithm described in this thesis would clearly benefit from direct implementation in MVL
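The abstract above cites carry-save addition as an example of indirectly implemented MVL. A minimal sketch of that technique follows, assuming non-negative Python integers as stand-ins for bit vectors; the function name `carry_save_add` is hypothetical. Each bit position is held as a (sum, carry) pair of binary signals, so the pair together encodes a digit with more than two values and no carry has to ripple across the word.

```python
def carry_save_add(a, b, c):
    """Reduce three non-negative integers to a (sum, carry) pair.

    The identity a + b + c == partial_sum + carry always holds, and every
    bit position is computed independently of its neighbours.
    """
    partial_sum = a ^ b ^ c                      # bitwise sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1   # majority bits, shifted left
    return partial_sum, carry
```

For example, `carry_save_add(13, 7, 22)` yields a pair whose two members add up to 42; a single conventional adder is needed only once, at the very end of a chain of such reductions.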

    CALCULATION OF SINE AND COSINE OF AN ANGLE USING THE CORDIC ALGORITHM

    With increasing on-chip complexity, on-chip area is a major concern. Today users want every gadget to be small, particularly hand-held systems. CORDIC is one algorithm that serves this purpose. The CORDIC algorithm has become a widely used approach to elementary function evaluation when silicon area is a primary concern. CORDIC is more economical than DSP algorithms in terms of both area and power consumption. This paper presents how to calculate sine and cosine values of a given angle using the CORDIC algorithm. A brief description of the theory behind the algorithm is also given. A summary of CORDIC synthesis results based on Xilinx FPGAs is given. The system simulation was carried out using Xilinx ISE Design Suite 13.1. The system is implemented using a Virtex5 XC5VF70T FPGA with Xilinx ISE 12.1 and the Verilog Hardware Description Language
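The sine and cosine computation described in this abstract can be sketched in software as rotation-mode CORDIC. This is a floating-point illustration and not the paper's Verilog design; the function name `cordic_sin_cos`, the iteration count, and the restriction to angles in [-π/2, π/2] are assumptions. In hardware the multiplications by 2^-i are wire shifts and the arctangent values come from a small lookup table.

```python
import math

def cordic_sin_cos(theta, iterations=32):
    """Return (sin(theta), cos(theta)) for theta in [-pi/2, pi/2].

    Rotation-mode CORDIC in the circular coordinate system, written in
    floating point purely for illustration.
    """
    # Elementary rotation angles atan(2^-i), a lookup table in hardware.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
    # starting from the inverse of the total gain compensates for it.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0   # rotate toward the residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x
```

Each iteration adds roughly one bit of accuracy, so the iteration count is usually matched to the word length of the datapath.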

    Algorithms and VLSI architectures for parametric additive synthesis

    A parametric additive synthesis approach to sound synthesis is advantageous as it can model sounds in a large-scale manner, unlike the classical sinusoidal additive synthesis paradigms. It is known that a large body of naturally occurring sounds are resonant in character and thus fit the concept well. This thesis is concerned with the computational optimisation of a superclass of formant synthesis which extends the sinusoidal parameters with a spread parameter known as bandwidth. Here a modified formant algorithm is introduced which can be traced back to work done at IRCAM, Paris. When impulse driven, a filter-based approach to modelling a formant limits the computational workload. It is assumed that the filter's coefficients are fixed at initialisation, thus avoiding interpolation which can cause the filter to become chaotic. A filter which is more complex than a second order section is required. Temporal resolution of an impulse generator is achieved by using a two stage polyphase decimator which drives many filterbanks. Each filterbank describes one formant and is composed of sub-elements which allow variation of the formant’s parameters. A resource manager is discussed to overcome the possibility of all sub-banks operating in unison. All filterbanks for one voice are connected in series to the impulse generator and their outputs are summed and scaled accordingly. An exploratory study of number systems for DSP algorithms and their architectures is presented. I invented a new theoretical mechanism for multi-level logic based DSP. Its aims are to reduce the number of transistors and to increase their functionality. Synthesis algorithms and VLSI architectures are reviewed in a case study comparing a filter-based bit-serial generator and a CORDIC-based sinusoidal generator. They are both of similar size, but the latter is always guaranteed to be stable

    Searchable Sky Coverage of Astronomical Observations: Footprints and Exposures

    Sky coverage is one of the most important pieces of information about astronomical observations. We discuss possible representations, and present algorithms to create and manipulate shapes consisting of generalized spherical polygons with arbitrary complexity and size on the celestial sphere. This shape specification integrates well with our Hierarchical Triangular Mesh indexing toolbox, whose performance and capabilities are enhanced by the advanced features presented here. Our portable implementation of the relevant spherical geometry routines comes with wrapper functions for database queries, which are currently being used within several scientific catalog archives including the Sloan Digital Sky Survey, the Galaxy Evolution Explorer and the Hubble Legacy Archive projects as well as the Footprint Service of the Virtual Observatory. Comment: 11 pages, 7 figures, submitted to PAS