    Algorithms and architectures for decimal transcendental function computation

    Commercial demand for decimal floating-point (DFP) arithmetic is growing in areas such as financial analysis, tax calculation, currency conversion, Internet-based applications, and e-commerce. This trend drives further development of DFP arithmetic units that can perform accurate computations on exact decimal operands. Because of the significance of DFP arithmetic, the IEEE 754-2008 standard for floating-point arithmetic includes it in its specifications. Basic decimal arithmetic units, such as the decimal adder, subtracter, multiplier, divider, and square-root unit, form a main part of a decimal microprocessor and are attracting increasing attention from researchers. Recently, decimal-encoded formats and DFP arithmetic units have been implemented in IBM's system z900, POWER6, and z10 microprocessors. Increasing chip densities and transistor counts give designers room to add more application-domain functions to upcoming microprocessors. Decimal transcendental functions, such as the DFP logarithm, antilogarithm, exponential, reciprocal, and trigonometric functions, are useful arithmetic operations in many areas of science and engineering and have been specified as recommended operations in the IEEE 754-2008 standard. Thus, virtually all computing systems that comply with IEEE 754-2008 could include a DFP mathematical library providing transcendental function computation. Building on basic decimal arithmetic units, more complex DFP transcendental arithmetic will be the next building block in microprocessors. In this dissertation, we researched and developed several new decimal algorithms and architectures for DFP transcendental function computation. These designs are based on several different methods: 1) decimal transcendental function computation based on the table-based first-order polynomial approximation method; 2) DFP logarithmic and antilogarithmic converters based on the decimal digit-recurrence algorithm with selection by rounding; 3) a decimal reciprocal unit using an efficient table look-up based on Newton-Raphson iterations; and 4) a first radix-100 division unit based on the non-restoring algorithm with a pre-scaling method. Most of the decimal algorithms and architectures developed in this dissertation are first attempts to analyze and implement DFP transcendental arithmetic that achieves faithful results for DFP operands, as specified in IEEE 754-2008. To help researchers evaluate the hardware performance of DFP transcendental arithmetic units, the proposed architectures are modeled, verified, and synthesized using FPGAs or CMOS standard-cell libraries in ASIC. Some of the implementation results are compared with those of binary radix-16 logarithmic and exponential converters, a recently developed high-performance decimal CORDIC-based architecture, and Intel's DFP transcendental function computation software library. The comparisons show that the proposed architectures achieve a significant speed-up over these designs in terms of latency. The algorithms and architectures developed in this dissertation provide a useful starting point for future hardware-oriented research on DFP transcendental function computation.
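
The abstract mentions a reciprocal unit that combines a table look-up with Newton-Raphson iterations. The following is only a software sketch of that general idea, not the dissertation's architecture: the 90-entry seed table, the function name `reciprocal`, the guard-digit count, and the iteration count are all assumptions chosen for illustration. Each iteration applies y ← y(2 − my), which roughly squares the relative error of the estimate.

```python
from decimal import Decimal, localcontext

# Hypothetical seed table: indexed by the two leading digits (10..99) of the
# significand m in [1, 10); each entry approximates 1/m to three decimals.
SEED_TABLE = {
    k: (Decimal(10) / (Decimal(k) + Decimal("0.5"))).quantize(Decimal("0.001"))
    for k in range(10, 100)
}

def reciprocal(x, digits=16, iterations=4):
    """Approximate 1/x for a finite, nonzero decimal operand (illustrative only)."""
    x = Decimal(x)
    if not x.is_finite() or x == 0:
        raise ValueError("expects a finite, nonzero operand")
    with localcontext() as ctx:
        ctx.prec = digits + 6              # assumed number of guard digits
        ax = abs(x)
        shift = ax.adjusted()              # exponent of the leading digit
        m = ax.scaleb(-shift)              # significand scaled into [1, 10)
        y = SEED_TABLE[int(m * 10)]        # table look-up gives the seed
        two = Decimal(2)
        for _ in range(iterations):
            y = y * (two - m * y)          # Newton-Raphson step for 1/m
        result = y.scaleb(-shift).copy_sign(x)
        ctx.prec = digits
        return +result                     # round to the target precision
```

With a seed accurate to about 5%, four iterations are more than enough for 16 digits in this sketch. A hardware design would size the table and iteration count against its own error analysis, and this sketch makes no claim of correct or faithful rounding in every case.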

    Implementation and Applications of Logarithmic Signal Processing on an FPGA

    This thesis presents two novel algorithms for converting a normalised binary floating-point number into a binary logarithmic number with the precision of a single-precision floating-point number. The thesis highlights the importance of logarithmic number systems in real-time DSP applications. A real-time cross-correlation application, in which logarithmic signal processing simplifies the complex computation, is presented. The first algorithm comprises two stages. A piecewise linear approximation to the original logarithmic curve is performed in the first stage, and a scaled-down normalised error curve is stored in the second stage. The algorithm requires less than 20 kbits of ROM and a maximum of three small multipliers. The architecture is implemented on Xilinx's Spartan3 and Spartan6 FPGA families. Synthesis results confirm that the algorithm operates at a frequency of 42.3 MHz on a Spartan3 device and 127.8 MHz on a Spartan6. Both solutions have a pipeline latency of two clocks. The operating speed increases to 71.4 MHz and 160 MHz respectively when the pipeline latency increases to eight clocks. The proposed algorithm is further improved by using a PWL (Piece-Wise Linear) approximation of the transform curve combined with a PWL approximation of a scaled version of the normalised segment error. A hardware approach for reducing the memory with additional XOR gates in the second stage is also presented. This architecture uses just one 18 kbit Block RAM (BRAM), and synthesis results indicate operating frequencies of 93 and 110 MHz when implemented on the Xilinx Spartan3 and Spartan6 devices respectively. Finally, a novel prototype of an FPGA-based four-channel correlation velocimetry system is presented. The system operates at a higher sampling frequency than previously published work and outputs a new result after every new sample it receives. The system works at a sampling frequency of 195.31 kHz and a sample resolution of 12 bits. The prototype system calculates a delay in a range of 0 to 2.6 ms with a resolution of 5.12 µs.
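
The first-stage idea described above, a piecewise linear approximation of the logarithmic curve, can be sketched in software as follows. This is a generic illustration and not the thesis's algorithm: the segment count `SEGMENTS`, the breakpoint table `TABLE`, and the function name `pwl_log2` are assumptions, and the second-stage error correction is omitted.

```python
import math

SEGMENTS = 16  # assumed number of equal-width segments over the fraction

# Breakpoint values of log2(1 + f) at f = 0, 1/16, ..., 1.
TABLE = [math.log2(1.0 + i / SEGMENTS) for i in range(SEGMENTS + 1)]

def pwl_log2(x):
    """Approximate log2(x) for x > 0 by linear interpolation between breakpoints."""
    if x <= 0.0:
        raise ValueError("expects a positive operand")
    m, e = math.frexp(x)          # x = m * 2**e with m in [0.5, 1)
    f = 2.0 * m - 1.0             # x = (1 + f) * 2**(e - 1), f in [0, 1)
    e -= 1
    pos = f * SEGMENTS
    i = int(pos)                  # segment index from the leading fraction bits
    t = pos - i                   # position inside the segment
    return e + TABLE[i] + t * (TABLE[i + 1] - TABLE[i])
```

With 16 segments the interpolation error stays below about 1e-3 in this sketch, which is far from single precision. That gap is what a stored error-correction stage, as the abstract describes, would have to close.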

    Application-Specific Number Representation

    Reconfigurable devices, such as Field Programmable Gate Arrays (FPGAs), enable application-specific number representations. Well-known number formats include fixed-point, floating-point, logarithmic number system (LNS), and residue number system (RNS). Such different number representations lead to different arithmetic designs and error behaviours, thus producing implementations with different performance, accuracy, and cost. To investigate the design options in number representations, the first part of this thesis presents a platform that enables automated exploration of the number representation design space. The second part of the thesis shows case studies that optimise the designs for area, latency or throughput from the perspective of number representations. Automated design space exploration in the first part addresses the following two major issues:
    - Automation requires arithmetic unit generation. This thesis provides optimised arithmetic library generators for logarithmic and residue arithmetic units, which support a wide range of bit widths and achieve significant improvement over previous designs.
    - Generation of arithmetic units requires specifying the bit widths for each variable. This thesis describes an automatic bit-width optimisation tool called R-Tool, which combines dynamic and static analysis methods, and supports different number systems (fixed-point, floating-point, and LNS numbers).
    Putting it all together, the second part explores the effects of application-specific number representation on practical benchmarks, such as radiative Monte Carlo simulation and seismic imaging computations. Experimental results show that customising the number representations brings benefits to hardware implementations: by selecting a more appropriate number format, we can reduce the area cost by up to 73.5% and improve the throughput by 14.2% to 34.1%; by performing the bit-width optimisation, we can further reduce the area cost by 9.7% to 17.3%. On the performance side, hardware implementations with customised number formats achieve 5 to potentially over 40 times speedup over software implementations.

    Power-Aware Design Methodologies for FPGA-Based Implementation of Video Processing Systems

    The increasing capacity and capabilities of FPGA devices in recent years provide an attractive option for performance-hungry applications in the image and video processing domain. FPGA devices are often used as implementation platforms for real-time image and video processing algorithms because their programmable structure can exploit inherent spatial and temporal parallelism. While performance and area remain the two main design criteria, power consumption has become an important design goal, especially for mobile devices. Power consumption can be reduced by lowering the supply voltage, capacitances, clock frequency, and switching activity in a circuit. Switching activity can be reduced by architectural optimization of processing cores such as adders, multipliers, and multiply-accumulators (MACs). This dissertation focuses on reducing switching activity in digital circuits by considering data dependencies in bit-level, word-level, and block-level neighborhoods in a video frame. The bit-level data neighborhood dependency is illustrated in the design of pipelined array, Booth, and log-based multipliers. For an array multiplier, the operands are partitioned into higher and lower parts so that the probability of the higher-order parts being zero or one increases. The gating technique for the pipelined approach deactivates part(s) of the multiplier when these special values are detected. For the Booth multiplier, the partitioning and gating technique is integrated into the Booth recoding scheme. In addition, a delay correction strategy is developed for the Booth multiplier to reduce the switching activity of the sign-extension part of the partial products. A novel architecture for computing log and inverse-log functions with reduced power consumption in arithmetic circuits is also presented. This architecture also uses the proposed partitioning and gating technique to reduce switching activity and thereby dynamic power further. The word-level and block-level data dependencies for reducing dynamic power consumption are illustrated through the design of a 2-D convolution architecture. Here, the similarity of neighboring pixels in window-based operations of image and video processing algorithms is exploited to reduce switching activity. A partitioning and detection mechanism is developed to deactivate the parallel architecture for window-based operations if the higher-order parts of the pixel values are the same. A neighborhood dependent approach (NDA) is incorporated with different window buffering schemes. The symmetry property of filter kernels is also applied together with the NDA method to reduce switching activity further. The proposed design methodologies are implemented and evaluated in an FPGA environment. The dynamic power consumption of FPGA-based circuit implementations is observed to be significantly reduced in the bit-level, word-level, and block-level architectures compared with state-of-the-art design techniques. A specific application, the design of a real-time video processing system that incorporates the proposed low-power design methodologies, is also presented. An image enhancement application is considered, and the proposed partitioning and gating and NDA methods are used in the design of the enhancement system. Experimental results show that the proposed multi-level power-aware methodology achieves considerable power reduction. Research is progressing on using the data dependencies between subsequent frames in a video stream to reduce circuit switching activity and thereby dynamic power consumption.

    Analogue-to-digital conversion and image enhancement using neuron-mos technology

    This thesis describes the development of two novel circuits that use a newly developed technology, neuron-MOS, for analogue-to-digital conversion and image enhancement. Neuron-MOS has the potential to reduce both the complexity and the number of transistors required for analogue and digital circuits. A reduced-area, low-transistor-count analogue-to-digital converter that is suitable for inclusion in a massively parallel array of identical image processing elements is developed. To support the function of the array, some fundamental image enhancement operations, such as edge enhancement, are examined, exploiting the unique features of neuron-MOS technology.

    Space Communications: Theory and Applications. Volume 3: Information Processing and Advanced Techniques. A Bibliography, 1958 - 1963

    Annotated bibliography on information processing and advanced communication techniques - theory and applications of space communication

    Optimal control and robust estimation for ocean wave energy converters

    This thesis deals with the optimal control of wave energy converters and some associated observer design problems. The first part of the thesis will investigate model predictive control of an ocean wave energy converter to maximize extracted power. A generic heaving converter that can have both linear dampers and active elements as a power take-off system is considered and an efficient optimal control algorithm is developed for use within a receding horizon control framework. The optimal control is also characterized analytically. A direct transcription of the optimal control problem is also considered as a general nonlinear program. A variation of the projected gradient optimization scheme is formulated and shown to be feasible and computationally inexpensive compared to a standard nonlinear program solver. Since the system model is bilinear and the cost function is not convex quadratic, the resulting optimization problem is shown not to be a quadratic program. Results are compared with other methods like optimal latching to demonstrate the improvement in absorbed power under irregular sea condition simulations. In the second part, robust estimation of the radiation forces and states inherent in the optimal control of wave energy converters is considered. Motivated by this, low order H∞ observer design for bilinear systems with input constraints is investigated and numerically tractable methods for design are developed. A bilinear Luenberger type observer is formulated and the resulting synthesis problem reformulated as that for a linear parameter varying system. A bilinear matrix inequality problem is then solved to find nominal and robust quadratically stable observers. The performance of these observers is compared with that of an extended Kalman filter. The robustness of the observers to parameter uncertainty and to variation in the radiation subsystem model order is also investigated. 
This thesis also explores the numerical integration of bilinear control systems with zero-order hold on the control inputs. Making use of exponential integrators, integration that is exact to high accuracy is proposed for such systems. New a priori bounds are derived on the computational complexity of integrating bilinear systems with a given error tolerance. Employing these new bounds, we propose a direct exponential integrator that solves bilinear ODEs via the solution of sparse linear systems of equations. Based on this, a novel sparse direct collocation of bilinear systems for optimal control is proposed. These integration schemes are also used within the indirect optimal control method discussed in the first part.
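
To illustrate why a zero-order hold makes exponential integration natural for bilinear systems, the sketch below assumes the homogeneous form x' = (A + uN)x with u constant over each hold interval, so one step is x(t + h) = exp(h(A + uN)) x(t). The names `mat_mul`, `expm`, and `zoh_step`, the assumed system form, and the dense Taylor-based matrix exponential are all illustrative choices. The thesis instead describes an integrator based on sparse linear solves, which this sketch does not reproduce.

```python
import math

def mat_mul(P, Q):
    """Product of two square matrices stored as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(M, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(M)
    norm = max(sum(abs(v) for v in row) for row in M)
    squarings = 0
    while norm > 0.5:                      # scale until the norm is small
        norm /= 2.0
        squarings += 1
    scaled = [[v / 2.0 ** squarings for v in row] for row in M]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):          # accumulate scaled**k / k!
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):             # undo the scaling by squaring
        result = mat_mul(result, result)
    return result

def zoh_step(A, N, x, u, h):
    """Advance x' = (A + u*N) x over one hold interval of length h."""
    n = len(A)
    M = [[h * (A[i][j] + u * N[i][j]) for j in range(n)] for i in range(n)]
    Phi = expm(M)
    return [sum(Phi[i][j] * x[j] for j in range(n)) for i in range(n)]
```

Because the control is constant within the interval, the step has no time-discretisation error beyond the accuracy of the matrix exponential itself, which is the property the abstract refers to as exact to high accuracy.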

    Advances and Novel Approaches in Discrete Optimization

    Discrete optimization is an important area of Applied Mathematics with a broad spectrum of applications in many fields. This book results from a Special Issue in the journal Mathematics entitled ‘Advances and Novel Approaches in Discrete Optimization’. It contains 17 articles covering a broad spectrum of subjects which have been selected from 43 submitted papers after a thorough refereeing process. Among other topics, it includes seven articles dealing with scheduling problems, e.g., online scheduling, batching, dual and inverse scheduling problems, or uncertain scheduling problems. Other subjects are graphs and applications, evacuation planning, the max-cut problem, capacitated lot-sizing, and packing algorithms

    Applications of Artificial Intelligence to Cryptography

    This paper considers some recent advances in the field of Cryptography using Artificial Intelligence (AI). It specifically considers the applications of Machine Learning (ML) and Evolutionary Computing (EC) to analyze and encrypt data. A short overview is given of Artificial Neural Networks (ANNs) and the principles of Deep Learning using Deep ANNs. In this context, the paper considers: (i) the implementation of EC and ANNs for generating unique and unclonable ciphers; (ii) ML strategies for detecting the genuine randomness (or otherwise) of finite binary strings for applications in Cryptanalysis. The aim of the paper is to provide an overview of how AI can be applied to encrypt data and to undertake cryptanalysis of such data and other data types in order to assess the cryptographic strength of an encryption algorithm, e.g. to detect patterns in intercepted data streams that are signatures of encrypted data. This includes some of the authors' prior contributions to the field, which are referenced throughout. Applications are presented which include the authentication of high-value documents such as bank notes with a smartphone. This involves using the antenna of a smartphone to read (in the near field) a flexible radio frequency tag that couples to an integrated circuit with a non-programmable coprocessor. The coprocessor retains ultra-strong encrypted information generated using EC that can be decrypted on-line, thereby validating the authenticity of the document through the Internet of Things with a smartphone. The application of optical authentication methods using a smartphone and optical ciphers is also briefly explored.

    Contributions to cascade linear control strategies applied to grid-connected Voltage-Source Converters

    This thesis focuses on optimising the behaviour of Voltage-Source Converters (VSCs) when they are used as an interface to the electrical grid, both to absorb energy from the grid and to deliver energy to it with the best possible quality while complying with the standards. To that end, the thesis addresses the control of cascaded linear systems applied to VSCs connected in parallel with the grid through an L filter, especially in connections to weak grids, along two lines of work: (i) tracking of grid-current harmonics and rejection of grid-voltage harmonics, and (ii) control of the PCC voltage under unbalanced conditions. The thesis therefore makes contributions in the areas of current control and PCC voltage control. Among the existing techniques for implementing current control for harmonic compensation, two of the most widely used are resonant control and repetitive control, in both stationary and synchronous reference frames. An exhaustive study of different structures for implementing these controllers has been carried out, showing the frequency-adaptive algorithm for each of them and analysing their computational load. Basic guidelines for programming them on a DSP are also provided. The current control scheme has also been analysed in order to compare the different structures. After the in-depth study of the current control of a grid-connected VSC, the second control analysed is the PCC voltage control. An unbalanced voltage at the PCC gives rise to a negative-sequence current component, which degrades the behaviour of the control system when conventional control techniques are used. STATCOMs are well known as a power application capable of regulating the PCC voltage in distribution lines that may be subject to disturbances. This thesis proposes a voltage controller in synchronous reference frames to compensate an unbalanced voltage by means of a STATCOM, allowing the positive and negative sequences to be controlled independently. The controller also includes features such as an anti-windup mechanism and droop control to improve its behaviour. Several experimental tests have been carried out to analyse the characteristics of the current controllers addressed in this thesis. All of them were performed under the same power, voltage, and current conditions, so that comparative results can be drawn. These tests aim to characterise the transient response, the steady-state response, the behaviour under frequency steps, and the computational load of the current controllers studied.