230 research outputs found

    Implementation of the Trigonometric LMS Algorithm using Original Cordic Rotation

    Full text link
    The LMS algorithm is one of the most successful adaptive filtering algorithms. It uses the instantaneous value of the square of the error signal as an estimate of the mean-square error (MSE). The LMS algorithm changes (adapts) the filter tap weights so that the error signal is minimized in the mean square sense. In Trigonometric LMS (TLMS) and Hyperbolic LMS (HLMS), two new versions of LMS algorithms, same formulations are performed as in the LMS algorithm with the exception that filter tap weights are now expressed using trigonometric and hyperbolic formulations, in cases for TLMS and HLMS respectively. Hence appears the CORDIC algorithm as it can efficiently perform trigonometric, hyperbolic, linear and logarithmic functions. While hardware-efficient algorithms often exist, the dominance of the software systems has kept those algorithms out of the spotlight. Among these hardware- efficient algorithms, CORDIC is an iterative solution for trigonometric and other transcendental functions. Former researches worked on CORDIC algorithm to observe the convergence behavior of Trigonometric LMS (TLMS) algorithm and obtained a satisfactory result in the context of convergence performance of TLMS algorithm. But revious researches directly used the CORDIC block output in their simulation ignoring the internal step-by-step rotations of the CORDIC processor. This gives rise to a need for verification of the convergence performance of the TLMS algorithm to investigate if it actually performs satisfactorily if implemented with step-by-step CORDIC rotation. This research work has done this job. It focuses on the internal operations of the CORDIC hardware, implements the Trigonometric LMS (TLMS) and Hyperbolic LMS (HLMS) algorithms using actual CORDIC rotations. The obtained simulation results are highly satisfactory and also it shows that convergence behavior of HLMS is much better than TLMS.Comment: 12 pages, 5 figures, 1 table. Published in IJCNC; http://airccse.org/journal/cnc/0710ijcnc08.pdf, http://airccse.org/journal/ijc2010.htm

    AREA AND POWER-EFFICIENT RECONFIGURABLE DIGITAL DOWN CONVERTER ON FPGA

    Get PDF
    This paper presents a field-programmable gate array (FPGA)-based digital down converter (DDC) that can reduce the bandwidth from about 70 MHz to 182.292 kHz. The proposed DDC consists of a polyphase COordinate Rotation DIgital Computer (CORDIC) processor and a multirate filter. The advantage of polyphase CORDIC processor is to process with high sample rate input data and produces computational efficient noiseless baseband spectrum. The pipeline multirate filter works at a high clock speed. Moreover, the multirate filter generates a fractional sample rate factor using a cubic B-spline Farrow filter. The proposed DDC is coded with optimal hardware description language (HDL) and tested on Kintex-7 Xilinx FPGA as the target device. Experimental results indicate that the proposed design saves chip area, power consumption and operates at high speed without loss of any functionality. Additionally, the proposed design offers sufficient spurious-free dynamic range (SFDR) and produces less than 1 Hz frequency resolution at the output

    Design of an FPGA-based smart camera and its application towards object tracking : a thesis presented in partial fulfilment of the requirements for the degree of Master of Engineering in Electronics and Computer Engineering at Massey University, Manawatu, New Zealand

    Get PDF
    Smart cameras and hardware image processing are not new concepts, yet despite the fact both have existed several decades, not much literature has been presented on the design and development process of hardware based smart cameras. This thesis will examine and demonstrate the principles needed to develop a smart camera on hardware, based on the experiences from developing an FPGA-based smart camera. The smart camera is applied on a Terasic DE0 FPGA development board, using Terasic’s 5 megapixel GPIO camera. The algorithm operates at 120 frames per second at a resolution of 640x480 by utilising a modular streaming approach. Two case studies will be explored in order to demonstrate the development techniques established in this thesis. The first case study will develop the global vision system for a robot soccer implementation. The algorithm will identify and calculate the positions and orientations of each robot and the ball. Like many robot soccer implementations each robot has colour patches on top to identify each robot and aid finding its orientation. The ball is comprised of a single solid colour that is completely distinct from the colour patches. Due to the presence of uneven light levels a YUV-like colour space labelled YC1C2 is used in order to make the colour values more light invariant. The colours are then classified using a connected components algorithm to segment the colour patches. The shapes of the classified patches are then used to identify the individual robots, and a CORDIC function is used to calculate the orientation. The second case study will investigate an improved colour segmentation design. A new HSY colour space is developed by remapping the Cartesian coordinate system from the YC1C2 to a polar coordinate system. This provides improved colour segmentation results by allowing for variations in colour value caused by uneven light patterns and changing light levels

    NIKEL: Electronics and data acquisition for kilopixels kinetic inductance camera

    Full text link
    A prototype of digital frequency multiplexing electronics allowing the real time monitoring of microwave kinetic inductance detector (MKIDs) arrays for mm-wave astronomy has been developed. Thanks to the frequency multiplexing, it can monitor simultaneously 400 pixels over a 500 MHz bandwidth and requires only two coaxial cables for instrumenting such a large array. The chosen solution and the performances achieved are presented in this paper.Comment: 21 pages, 14 figure

    Digital implementation of an upstream DOCSIS QAM modulator and channel emulator

    Get PDF
    The concept of cable television, originally called community antenna television (CATV), began in the 1940's. The information and services provided by cable operators have changed drastically since the early days. Cable service providers are no longer simply providing their customers with broadcast television but are providing a multi-purpose, two-way link to the digital world. Custom programming, telephone service, radio, and high-speed internet access are just a few of the services offered by cable service providers in the 21st century. At the dawn of the internet the dominant mode of access was through telephone lines. Despite advances in dial-up modem technology, the telephone system was unable to keep pace with the demand for data throughput. In the late 1990's an industry consortium known as Cable Television Laboratories, Inc. developed a standard protocol for providing high-speed internet access through the existing CATV infrastructure. This protocol is known as Data Over Cable Service Interface Specification (DOCSIS) and it helped to usher in the era of the information superhighway. CATV systems use different parts of the radio frequency (RF) spectrum for communication to and from the user. The downstream portion (data destined for the user) consumes the bulk of the spectrum and is located at relatively high frequencies. The upstream portion (data destined to the network from the user) of the spectrum is smaller and located at the low end of the spectrum. This lower frequency region of the RF spectrum is particularly prone to impairments such as micro-reflections, which can be viewed as a type of multipath interference. Upstream data transfer in the presence of these impairments is therefore problematic and requires complex signal correction algorithms to be employed in the receiver. The quality of a receiver is largely determined by how well it mitigates the signal impairments introduced by the channel. For this reason, engineers developing a receiver require a piece of equipment that can emulate the channel impairments in any permutation in order to test their receiver. The conventional test methodology uses a hardware RF channel emulator connected between the transmitter and the receiver under test. This method not only requires an expensive RF channel emulator, but a functioning analog front-end as well. Of these two problems, the expense of the hardware emulator is likely less important than the delay in development caused by waiting for a functional analog front-end. Receiver design is an iterative, time consuming process that requires the receiver's digital signal processing (DSP) algorithms be tested as early as possible to reduce the time-to-market. This thesis presents a digital implementation of a DOCSIS-compliant channel emulator whereby cable micro-reflections and thermal noise at the analog front-end of the receiver are modelled digitally at baseband. The channel emulator and the modulator are integrated into a single hardware structure to produce a compact circuit that, during receiver testing, resides inside the same field programmable gate array (FPGA) as the receiver. This approach removes the dependence on the analog front-end allowing it to be developed concurrently with the receiver's DSP circuits, thus reducing the time-to-market. The approach taken in this thesis produces a fully programmable channel emulator that can be loaded onto FPGAs as needed by engineers working independently on different receiver designs. The channel emulator uses 3 independent data streams to produce a 3-channel signal, whereby a main channel with micro-reflections is flanked on either side by adjacent channels. Thermal noise normally generated by the receiver's analog front-end is emulated and injected into the signal. The resulting structure utilizes 43 dedicated multipliers and 401.125 KB of RAM, and achieves a modulation error ratio (MER) of 55.29 dB

    Algorithms and architectures for the multirate additive synthesis of musical tones

    Get PDF
    In classical Additive Synthesis (AS), the output signal is the sum of a large number of independently controllable sinusoidal partials. The advantages of AS for music synthesis are well known as is the high computational cost. This thesis is concerned with the computational optimisation of AS by multirate DSP techniques. In note-based music synthesis, the expected bounds of the frequency trajectory of each partial in a finite lifecycle tone determine critical time-invariant partial-specific sample rates which are lower than the conventional rate (in excess of 40kHz) resulting in computational savings. Scheduling and interpolation (to suppress quantisation noise) for many sample rates is required, leading to the concept of Multirate Additive Synthesis (MAS) where these overheads are minimised by synthesis filterbanks which quantise the set of available sample rates. Alternative AS optimisations are also appraised. It is shown that a hierarchical interpretation of the QMF filterbank preserves AS generality and permits efficient context-specific adaptation of computation to required note dynamics. Practical QMF implementation and the modifications necessary for MAS are discussed. QMF transition widths can be logically excluded from the MAS paradigm, at a cost. Therefore a novel filterbank is evaluated where transition widths are physically excluded. Benchmarking of a hypothetical orchestral synthesis application provides a tentative quantitative analysis of the performance improvement of MAS over AS. The mapping of MAS into VLSI is opened by a review of sine computation techniques. Then the functional specification and high-level design of a conceptual MAS Coprocessor (MASC) is developed which functions with high autonomy in a loosely-coupled master- slave configuration with a Host CPU which executes filterbanks in software. Standard hardware optimisation techniques are used, such as pipelining, based upon the principle of an application-specific memory hierarchy which maximises MASC throughput

    KAVUAKA: a low-power application-specific processor architecture for digital hearing aids

    Get PDF
    The power consumption of digital hearing aids is very restricted due to their small physical size and the available hardware resources for signal processing are limited. However, there is a demand for more processing performance to make future hearing aids more useful and smarter. Future hearing aids should be able to detect, localize, and recognize target speakers in complex acoustic environments to further improve the speech intelligibility of the individual hearing aid user. Computationally intensive algorithms are required for this task. To maintain acceptable battery life, the hearing aid processing architecture must be highly optimized for extremely low-power consumption and high processing performance.The integration of application-specific instruction-set processors (ASIPs) into hearing aids enables a wide range of architectural customizations to meet the stringent power consumption and performance requirements. In this thesis, the application-specific hearing aid processor KAVUAKA is presented, which is customized and optimized with state-of-the-art hearing aid algorithms such as speaker localization, noise reduction, beamforming algorithms, and speech recognition. Specialized and application-specific instructions are designed and added to the baseline instruction set architecture (ISA). Among the major contributions are a multiply-accumulate (MAC) unit for real- and complex-valued numbers, architectures for power reduction during register accesses, co-processors and a low-latency audio interface. With the proposed MAC architecture, the KAVUAKA processor requires 16 % less cycles for the computation of a 128-point fast Fourier transform (FFT) compared to related programmable digital signal processors. The power consumption during register file accesses is decreased by 6 %to 17 % with isolation and by-pass techniques. The hardware-induced audio latency is 34 %lower compared to related audio interfaces for frame size of 64 samples.The final hearing aid system-on-chip (SoC) with four KAVUAKA processor cores and ten co-processors is integrated as an application-specific integrated circuit (ASIC) using a 40 nm low-power technology. The die size is 3.6 mm2. Each of the processors and co-processors contains individual customizations and hardware features with a varying datapath width between 24-bit to 64-bit. The core area of the 64-bit processor configuration is 0.134 mm2. The processors are organized in two clusters that share memory, an audio interface, co-processors and serial interfaces. The average power consumption at a clock speed of 10 MHz is 2.4 mW for SoC and 0.6 mW for the 64-bit processor.Case studies with four reference hearing aid algorithms are used to present and evaluate the proposed hardware architectures and optimizations. The program code for each processor and co-processor is generated and optimized with evolutionary algorithms for operation merging,instruction scheduling and register allocation. The KAVUAKA processor architecture is com-pared to related processor architectures in terms of processing performance, average power consumption, and silicon area requirements

    Real-Time UAV Pose Estimation and Tracking Using FPGA Accelerated April Tag

    Get PDF
    April Tags and other passive fiducial markers are widely used to determine localization using a monocular camera. It utilizes specialized algorithms that detect markers to calculate their orientation and distance in three dimensional (3-D) space. The video and image processing steps performed to use these fiducial systems dominate the computation time of the algorithms. Low latency is a key component for the real-time application of these fiducial markers. The drawbacks of performing the video and image processing in software is the difficulty in performing the same operation in parallel effectively. Specialized hardware instantiations with the same algorithm scan efficiently parallelize them as well as operate on the image in a streaming fashion. Compared to graphics processing units (GPUs) that also perform well in the field, field programmable gate arrays (FPGAs) operate with less power, making them optimal with tight power constraints. This research describes such an optimization for the April Tag algorithm on an unmanned aerial vehicle with an embedded platform to perform real-time pose estimation, tracking, and localization in GPS-denied (global positioning system) environments at 30 frames per second (FPS) by converting the initial embedded C/C++ solution to a heterogeneous one through hardware acceleration. It compares the size, accuracy, and speed of the April Tag algorithm’s various implementations. The initial solution operated at around 2 FPS while the final solution, a novel heterogeneous algorithm on the Fusion 2 Zynq 7020 system on chip (SoC), operated at around 43 FPS using hardware acceleration. The research proposes a pipeline that breaks the algorithm into distinct steps where portions of it can be improved by utilizing algorithms optimized to run on a FPGA. Additional steps were made to further reduce the hardware algorithm’s resource utilization. Each step in the software was compared against its hardware counterpart using its utilization and timing as benchmarks
    • …
    corecore