Search CORE

32 research outputs found

Novel arithmetic implementations using cellular neural network arrays.

Author: Ibrahim Youssef
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2005
Field of study

The primary goal of this research is to explore the use of arrays of analog self-synchronized cells---the cellular neural network (CNN) paradigm---in the implementation of novel digital arithmetic architectures. In exploring this paradigm we also discover that the implementation of these CNN arrays produces very low system noise; that is, noise generated by the rapid switching of current through power supply die connections---so called di/dt noise. With the migration to sub 100 nanometer process technology, signal integrity is becoming a critical issue when integrating analog and digital components onto the same chip, and so the CNN architectural paradigm offers a potential solution to this problem. A typical example is the replacement of conventional digital circuitry adjacent to sensitive bio-sensors in a SoC Bio-Platform. The focus of this research is therefore to discover novel approaches to building low-noise digital arithmetic circuits using analog cellular neural networks, essentially implementing asynchronous digital logic but with the same circuit components as used in analog circuit design. We address our exploration by first improving upon previous research into CNN binary arithmetic arrays. The second phase of our research introduces a logical extension of the binary arithmetic method to implement binary signed-digit (BSD) arithmetic. To this end, a new class of CNNs that has three stable states is introduced, and is used to implement arithmetic circuits that use binary inputs and outputs but internally uses the BSD number representation. Finally, we develop CNN arrays for a 2-dimensional number representation (the Double-base Number System - DBNS). A novel adder architecture is described in detail, that performs the addition as well as reducing the representation for further processing; the design incorporates an innovative self-programmable array. Extensive simulations have shown that our new architectures can reduce system noise by almost 70dB and crosstalk by more than 23dB over standard digital implementations.Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .I27. Source: Dissertation Abstracts International, Volume: 66-11, Section: B, page: 6159. Thesis (Ph.D.)--University of Windsor (Canada), 2005

Scholarship at UWindsor

Multiplierless CSD techniques for high performance FPGA implementation of digital filters.

Author: Wang Yunhua.
Publication venue
Publication date: 01/01/2007
Field of study

I leverage FastCSD to develop a new, high performance iterative multiplierless structure based on a novel real-time CSD recoding, so that more zero partial products are introduced. Up to 66.7% zero partial products occur compared to 50% in the traditional modified Booth's recoding. Also, this structure reduces the non-zero partial products to a minimum. As a result, the number of arithmetic operations in the carry-save structure is reduced. Thus, an overall speed-up, as well as low-power consumption can be achieved. Furthermore, because the proposed structure involves real time CSD recoding and does not require a fixed value for the multiplier input to be known a priori, the proposed multiplier can be applied to implement digital filters with non-fixed filter coefficients, such as adaptive filters.My work is based on a dramatic new technique for converting between 2's complement and CSD number systems, and results in high-performance structures that are particularly effective for implementing adaptive systems in reconfigurable logic.My research focus is on two key ideas for improving DSP performance: (1) Develop new high performance, efficient shift-add techniques ("multiplierless") to implement the multiply-add operations without the need for a traditional multiplier structure. (2) There is a growing trend toward design prototyping and even production in FPGAs as opposed to dedicated DSP processors or ASICs; leverage this trend synergistically with the new multiplierless structures to improve performance.Implementation of digital signal processing (DSP) algorithms in hardware, such as field programmable gate arrays (FPGAs), requires a large number of multipliers. Fast, low area multiply-adds have become critical in modern commercial and military DSP applications. In many contemporary real-time DSP and multimedia applications, system performance is severely impacted by the limitations of currently available speed, energy efficiency, and area requirement of an onboard silicon multiplier.I also introduce a new multi-input Canonical Signed Digit (CSD) multiplier unit, which requires fewer shift/add/subtract operations and reduced CSD number conversion overhead compared to existing techniques. This results in reduced power consumption and area requirements in the hardware implementation of DSP algorithms. Furthermore, because all the products are produced simultaneously, the multiplication speed and thus the throughput are improved. The multi-input multiplier unit is applied to implement digital filters with non-fixed filter coefficients, such as adaptive filters. The implementation cost of these digital filters can be further reduced by limiting the wordlength of the input signal with little or no sacrifice to the filter performance, which is confirmed by my simulation results. The proposed multiplier unit can also be applied to other DSP algorithms, such as digital filter banks or matrix and vector multiplications.Finally, the tradeoff between filter order and coefficient length in the design and implementation of high-performance filters in Field Programmable Gate Arrays (FPGAs) is discussed. Non-minimum order FIR filters are designed for implementation using Canonical Signed Digit (CSD) multiplierless implementation techniques. By increasing the filter order, the length of the coefficients can be decreased without reducing the filter performance. Thus, an overall hardware savings can be achieved.Adaptive system implementations require real-time conversion of coefficients to Canonical Signed Digit (CSD) or similar representations to benefit from multiplierless techniques for implementing filters. Multiplierless approaches are used to reduce the hardware and increase the throughput. This dissertation introduces the first non-iterative hardware algorithm to convert 2's complement numbers to their CSD representations (FastCSD) using a fixed number of shift and logic operations. As a result, the power consumption and area requirements required for hardware implementation of DSP algorithms in which the coefficients are not known a priori can be greatly reduced. Because all CSD digits are produced simultaneously, the conversion speed and thus the throughput are improved when compared to overlap-and-scan techniques such as Booth's recoding

SHAREOK repository

Techniques for Efficient Implementation of FIR and Particle Filtering

Author
Publication venue: 'Linkoping University Electronic Press'
Publication date
Field of study

Crossref

Evolutionary design of digital VLSI hardware

Author: Thomson Robert
Publication venue: The University of Edinburgh
Publication date: 01/01/2005
Field of study

Edinburgh Research Archive

A low-power quadrature digital modulator in 0.18um CMOS

Author: Hu Song
Publication venue: 'University of Saskatchewan Library'
Publication date
Field of study

Quadrature digital modulation techniques are widely used in modern communication systems because of their high performance and flexibility. However, these advantages come at the cost of high power consumption. As a result, power consumption has to be taken into account as a main design factor of the modulator.In this thesis, a low-power quadrature digital modulator in 0.18um CMOS is presented with the target system clock speed of 150 MHz. The quadrature digital modulator consists of several key blocks: quadrature direct digital synthesizer (QDDS), pulse shaping filter, interpolation filter and inverse sinc filter. The design strategy is to investigate different implementations for each block and compare the power consumption of these implementations. Based on the comparison results, the implementation that consumes the lowest power will be chosen for each block. First of all, a novel low-power QDDS is proposed in the thesis. Power consumption estimation shows that it can save up to 60% of the power consumption at 150 MHz system clock frequency compared with one conventional design. Power consumption estimation results also show that using two pulse shaping blocks to process I/Q data, cascaded integrator comb (CIC) interpolation structure, and inverse sinc filter with modified canonic signed digit (MCSD) multiplication consume less power than alternative design choices. These low-power blocks are integrated together to achieve a low-power modulator. The power consumption estimation after layout shows that it only consumes about 95 mW at 150 MHz system clock rate, which is much lower than similar commercial products. The designed modulator can provide a low-power solution for various quadrature modulators. It also has an output bandwidth from 0 to 75 MHz, configurable pulse shaping filters and interpolation filters, and an internal sin(x)/x correction filter

eCommons@USASK

University of Saskatchewan Research Archive

Design and implementation of digital wave filter adaptors

Author: Petrie Neil
Publication venue: The University of Edinburgh
Publication date: 01/01/1985
Field of study

Edinburgh Research Archive

Energy efficient hardware acceleration of multimedia processing tools

Author: Kinane Andrew
Publication venue: Dublin City University. School of Electronic Engineering
Publication date: 01/01/2006
Field of study

The world of mobile devices is experiencing an ongoing trend of feature enhancement and generalpurpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks Based on the survey that this thesis presents on modem video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at algorithmic level in order to design re-usable optimised hardware acceleration cores. To prove these conclusions, the work m this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high level techniques such as redundant computation elimination, parallelism and low switching computation structures. Both architectures compare favourably against the relevant pnor art in the literature. The SA-DCT/IDCT technologies are instances of a more general computation - namely, both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early exit mechanism that achieves large search space reductions .Results show an improvement on state of the art algorithms with future potential for even greater savings

Irish Universities

DCU Online Research Access Service

Design and Implementation of Complexity Reduced Digital Signal Processors for Low Power Biomedical Applications

Author: Eminaga Y.
Eminaga Y.
Publication venue
Publication date: 01/01/2019
Field of study

Wearable health monitoring systems can provide remote care with supervised, inde-pendent living which are capable of signal sensing, acquisition, local processing and transmission. A generic biopotential signal (such as Electrocardiogram (ECG), and Electroencephalogram (EEG)) processing platform consists of four main functional components. The signals acquired by the electrodes are ampliﬁed and preconditioned by the (1) Analog-Front-End (AFE) which are then digitized via the (2) Analog-to-Digital Converter (ADC) for further processing. The local digital signal processing is usually handled by a custom designed (3) Digital Signal Processor (DSP) which is responsible for either anyone or combination of signal processing algorithms such as noise detection, noise/artefact removal, feature extraction, classiﬁcation and compres-sion. The digitally processed data is then transmitted via the (4) transmitter which is renown as the most power hungry block in the complete platform. All the afore-mentioned components of the wearable systems are required to be designed and ﬁtted into an integrated system where the area and the power requirements are stringent. Therefore, hardware complexity and power dissipation of each functional component are crucial aspects while designing and implementing a wearable monitoring platform. The work undertaken focuses on reducing the hardware complexity of a biosignal DSP and presents low hardware complexity solutions that can be employed in the aforemen-tioned wearable platforms. A typical state-of-the-art system utilizes Sigma Delta (Σ∆) ADCs incorporating a Σ∆ modulator and a decimation ﬁlter whereas the state-of-the-art decimation ﬁlters employ linear phase Finite-Impulse-Response (FIR) ﬁlters with high orders that in-crease the hardware complexity [1–5]. In this thesis, the novel use of minimum phase Inﬁnite-Impulse-Response (IIR) decimators is proposed where the hardware complexity is massively reduced compared to the conventional FIR decimators. In addition, the non-linear phase eﬀects of these ﬁlters are also investigated since phase non-linearity may distort the time domain representation of the signal being ﬁltered which is un-desirable eﬀect for biopotential signals especially when the ﬁducial characteristics carry diagnostic importance. In the case of ECG monitoring systems the eﬀect of the IIR ﬁlter phase non-linearity is minimal which does not aﬀect the diagnostic accuracy of the signals. The work undertaken also proposes two methods for reducing the hardware complexity of the popular biosignal processing tool, Discrete Wavelet Transform (DWT). General purpose multipliers are known to be hardware and power hungry in terms of the number of addition operations or their underlying building blocks like full adders or half adders required. Higher number of adders leads to an increase in the power consumption which is directly proportional to the clock frequency, supply voltage, switching activity and the resources utilized. A typical Field-Programmable-Gate-Array’s (FPGA) resources are Look-up Tables (LUTs) whereas a custom Digital Signal Processor’s (DSP) are gate-level cells of standard cell libraries that are used to build adders [6]. One of the proposed methods is the replacement of the hardware and power hungry general pur-pose multipliers and the coeﬃcient memories with reconﬁgurable multiplier blocks that are composed of simple shift-add networks and multiplexers. This method substantially reduces the resource utilization as well as the power consumption of the system. The second proposed method is the design and implementation of the DWT ﬁlter banks using IIR ﬁlters which employ less number of arithmetic operations compared to the state-of-the-art FIR wavelets. This reduces the hardware complexity of the analysis ﬁlter bank of the DWT and can be employed in applications where the reconstruction is not required. However, the synthesis ﬁlter bank for the IIR wavelet transform has a higher computational complexity compared to the conventional FIR wavelet synthesis ﬁlter banks since re-indexing of the ﬁltered data sequence is required that can only be achieved via the use of extra registers. Therefore, this led to the proposal of a novel design which replaces the complex IIR based synthesis ﬁlter banks with FIR ﬁl-ters which are the approximations of the associated IIR ﬁlters. Finally, a comparative study is presented where the hybrid IIR/FIR and FIR/FIR wavelet ﬁlter banks are de-ployed in a typical noise reduction scenario using the wavelet thresholding techniques. It is concluded that the proposed hybrid IIR/FIR wavelet ﬁlter banks provide better denoising performance, reduced computational complexity and power consumption in comparison to their IIR/IIR and FIR/FIR counterparts

WestminsterResearch

Hardware Implementation Of Tunable Heterodyne Band-Pass Filters

Author: Azam Asad
Publication venue: 'Oklahoma State University Library'
Publication date: 01/08/2001
Field of study

Modem wireless and satellite communication systems make use of spreadspectrum modulation concepts such as Frequency-hopping spread-spectrum (FHSS) and Direct sequence spread-spectrum (DSSS). The spread-spectrum modulation method inherently possesses anti-jamming and anti-interception properties due to the fact that the narrowband information signal is spread over a wide range of frequencies, masking the information-bearing signal as noise. Despite these properties, these communication channels can be severely corrupted by high-powered narrowband interference signals generated by local FM or AM transmitters which may cause complications when detecting the information signal at the receiver. Therefore, the communication system is made more efficient with the use of signal processing techniques for narrowband interference attenuation. Control systems is another area where the presence of narrowband interference signal due to mechanical resonance can be responsible for causing distortion in information signal. Any Band-pass, High-pass or a Low-pass filter may be converted into a tunable filter through the use of new Tunable Heterodyne Band-pass Filter concept in which the frequency of the heterodyne signal is adjusted thereby creating the effect of translating the entire transfer function of the fixed filter in frequency. In this thesis, hardware implementation techniques and results of the new Digital Tunable Heterodyne Band-pass filter is proposed that allows a prototype IIR or FIR filter to be shifted through the entire range of digital frequencies with a single parameter, the heterodyning frequency. The unique property of this new tunable filter is the range of tunability it possesses. With this technique, the fixed filter is tuned continuously using the concept of frequency translation. The images created by the heterodyne process are cancelled without the use of image canceling filters, which significantly contribute towards a hardware efficient design. In this thesis, simulation results are observed to illustrate the effects ofhaving the fixed prototype filter as a band-pass, high-pass, low-pass or notch filter. This thesis concentrates on the hardware implementation of the tunable heterodyne filter structure with a band-pass filter as the fixed prototype filter. Thus, simulation and experimental results show that if the fixed filter is a narrowband Bandpass filter, a much hardware efficient implementation can be achieved by using the new Tunable Heterodyne Band-pass filter to extract the narrowband interference from broadband communication or control systems as compared to the standard techniques used. The proposed heterodyne filter is suitable both as a tunable filter or to be implemented with standard algorithms to design adaptive digital filters. The new structure proposed is composed of three main components which can be implemented using Field Programmable Gate Arrays (FPGA) or easily be retargeted for an Application Specific Integrated Circuits (ASIC) standard cell technology or custom designed for Very Large Scale Integration (VLSI) processes. A prototype system is implemented using a single chip Xilinx Virtex Series Field Programmable Gate Arrays (FPGA) and thesimulation results are compared with the hardware data

SHAREOK repository