Optimización de recursos hardware para la operación de convolución utilizada en el procesamiento digital de señales (Hardware resource optimization for the convolution operation used in digital signal processing)
This thesis presents several architectures of the multiply-accumulate (MAC) unit to
optimize the convolution operation, which is widely used in digital signal processing, on
several low-cost electronic devices. The optimization focuses mainly on Xilinx Spartan-3
and Spartan-6 FPGAs and uses redundant arithmetic, specifically carry-save arithmetic
(CSA). This type of arithmetic is rarely used on FPGAs because it increases the area
consumed, but this research shows experimentally that when the number of MAC operations
is high, as in the convolution of two signals, CSA is efficient: it significantly reduces
execution times without an excessive increase in the FPGA resources used.
Other electronic devices commonly used in digital signal processing, such as DSPs and
GPPs, have also been studied, and the execution times on FPGAs are compared with those
on these devices.
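As a rough illustration of why carry-save arithmetic suits long MAC chains, the following minimal Python sketch keeps the accumulator as a (sum, carry) pair and performs only one carry-propagate addition per output sample. It is a software model under stated assumptions, not the thesis's FPGA architecture: the names `csa` and `convolve_csa` are hypothetical, inputs are assumed to be non-negative integers, and each product is formed with an ordinary multiplication.

```python
def csa(a, b, c):
    """3:2 carry-save compressor on non-negative ints: a + b + c == s + cy."""
    s = a ^ b ^ c                              # bitwise sum, no carry propagation
    cy = ((a & b) | (a & c) | (b & c)) << 1    # majority bits, shifted to their weight
    return s, cy


def convolve_csa(x, h):
    """Linear convolution of two non-negative integer sequences (hypothetical helper).

    The running accumulator is held in redundant (sum, carry) form; the only
    carry-propagate addition happens once per output sample.
    """
    y = []
    for n in range(len(x) + len(h) - 1):
        s, cy = 0, 0
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                # Assumption: the product is computed conventionally.
                s, cy = csa(s, cy, h[k] * x[n - k])
        y.append(s + cy)                       # final carry-propagate addition
    return y
```

In hardware the benefit comes from the compressor's delay being independent of word length, so the carry chain is paid once at the end rather than on every accumulation.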
A fast parallel squarer based on divide-and-conquer
Journal Article
Fast and small squarers are needed in many applications such as image compression. A new family of high-performance parallel squarers based on the divide-and-conquer method is reported. The main result is the realization of the base cases of the divide-and-conquer recursion with optimized n-bit primitive squarers, where n ranges from 2 to 6. This method reduces the gate count and shortens the critical paths. A chip implementing an 8-bit squarer was designed, fabricated, and successfully tested, achieving 24 MOPS in a 2-μm CMOS fabrication technology. The squarer has two additional features: more squaring operations per unit circuit area, and the potential for reduced power consumption per squaring operation.
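The divide-and-conquer idea can be sketched in software by splitting an n-bit operand into high and low halves and recursing until a small base case is reached. The sketch below is only an arithmetic model under stated assumptions: the names `square_dc`, `BASE_BITS`, and `BASE_SQUARES` are hypothetical, the base-case width of 3 bits is an arbitrary choice within the 2 to 6 range mentioned in the abstract, a lookup table stands in for the optimized primitive squarers, and the cross term uses an ordinary multiplication.

```python
BASE_BITS = 3                                          # assumed base-case width
BASE_SQUARES = [v * v for v in range(1 << BASE_BITS)]  # stand-in for primitive squarers


def square_dc(x, n):
    """Square a non-negative integer x that fits in n bits (hypothetical helper)."""
    if n <= BASE_BITS:
        return BASE_SQUARES[x]
    lo_bits = n // 2
    hi_bits = n - lo_bits
    lo = x & ((1 << lo_bits) - 1)
    hi = x >> lo_bits
    # x = hi * 2^lo_bits + lo, so
    # x^2 = hi^2 * 2^(2*lo_bits) + 2 * hi * lo * 2^lo_bits + lo^2
    return (
        (square_dc(hi, hi_bits) << (2 * lo_bits))
        + ((hi * lo) << (lo_bits + 1))
        + square_dc(lo, lo_bits)
    )
```

For example, `square_dc(200, 8)` splits 200 into two 4-bit halves, each of which is split again into 2-bit halves handled by the lookup table.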
A study of arithmetic circuits and the effect of utilising Reed-Muller techniques
Reed-Muller algebraic techniques have recently become more attractive as an alternative approach to logic design because they represent logic functions compactly and yield easily testable circuits. Some researchers claim that Reed-Muller algebraic techniques are particularly suitable for arithmetic circuits, yet no practical application in this field can be found in the open literature.
This project investigates existing Reed-Muller algebraic techniques and explores their application in arithmetic circuits. The work described in this thesis is concerned with practical applications in arithmetic circuits, especially minimizing logic circuits at the transistor level. The results are compared with those obtained using conventional Boolean algebraic techniques. The work also relates to wider fields, from logic-level design to layout-level design in CMOS circuits, the current leading technology in VLSI. The emphasis is on circuit-level (transistor-level) design. The results show that, although Boolean logic is believed to be a more general tool in logic design, it is not the best tool in all situations. Reed-Muller logic can generate good results that cannot easily be obtained using Boolean logic.
For testing purposes, a gate fault model is often used in the conventional implementation of Reed-Muller logic, which restricts Reed-Muller logic to a small gate set and usually produces more complex circuits. When a cell fault model, which is more suitable for regular and iterative circuits such as arithmetic circuits, is used instead, a wider gate set can be employed to realize Reed-Muller functions. As a result, many circuits designed using Reed-Muller logic are comparable to those designed using Boolean logic. This conclusion is demonstrated by testing many randomly generated functions.
The main aim of this project is to develop arithmetic circuits for practical application. A number of practical arithmetic circuits are reported. The first is a carry chain adder. Exploiting CMOS circuit characteristics, a simple and high-speed carry chain is constructed to perform the carry operation. The proposed carry chain adder can be restructured to form a fast carry-skip adder, and it is also found to suit residue number adders well. An algorithm for an on-line adder and its implementation are also developed. Another circuit is a parallel multiplier based on a 5:3 counter. Simulations show that the proposed circuits are better than many previous designs in terms of transistor count and speed. In addition, a 4:2 compressor for a carry-free adder is investigated. It is shown that the two main schemes for constructing the 4:2 compressor have a unified structure. A variant of the Baugh and Wooley algorithm is also studied and generalized in this work.
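To make the Reed-Muller representation concrete for an arithmetic circuit, the sketch below derives the positive-polarity Reed-Muller (XOR-of-AND) coefficients of a full adder from its truth table. This is a generic textbook transform offered as an illustration, not the thesis's minimization procedure; the function name `pprm` and the input ordering (a is bit 0, b is bit 1, c is bit 2 of the table index) are assumptions made here.

```python
def pprm(truth):
    """Positive-polarity Reed-Muller coefficients of a Boolean function.

    truth[i] is the function value for the input whose bits form index i;
    the table length is assumed to be a power of two. coeffs[m] == 1 means
    the AND of the variables selected by the bits of m appears in the XOR sum.
    """
    coeffs = list(truth)
    n = len(coeffs).bit_length() - 1
    for bit in range(n):
        step = 1 << bit
        for i in range(len(coeffs)):
            if i & step:
                coeffs[i] ^= coeffs[i ^ step]   # XOR butterfly over one variable
    return coeffs


# Full adder truth tables, index bits: a = bit 0, b = bit 1, c = bit 2.
SUM_TT = [bin(i).count("1") & 1 for i in range(8)]
CARRY_TT = [1 if bin(i).count("1") >= 2 else 0 for i in range(8)]

sum_terms = [m for m, c in enumerate(pprm(SUM_TT)) if c]      # terms a, b, c
carry_terms = [m for m, c in enumerate(pprm(CARRY_TT)) if c]  # terms ab, ac, bc
```

The resulting term lists correspond to sum = a ⊕ b ⊕ c and carry = ab ⊕ ac ⊕ bc, which shows how compactly the full adder is expressed in XOR form.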