Search CORE

1,080 research outputs found

KAVUAKA: a low-power application-specific processor architecture for digital hearing aids

Author: Gerlach Lukas
Publication venue: Hannover : Institutionelles Repositorium der Leibniz Universität Hannover
Publication date: 01/01/2021
Field of study

The power consumption of digital hearing aids is very restricted due to their small physical size and the available hardware resources for signal processing are limited. However, there is a demand for more processing performance to make future hearing aids more useful and smarter. Future hearing aids should be able to detect, localize, and recognize target speakers in complex acoustic environments to further improve the speech intelligibility of the individual hearing aid user. Computationally intensive algorithms are required for this task. To maintain acceptable battery life, the hearing aid processing architecture must be highly optimized for extremely low-power consumption and high processing performance.The integration of application-specific instruction-set processors (ASIPs) into hearing aids enables a wide range of architectural customizations to meet the stringent power consumption and performance requirements. In this thesis, the application-specific hearing aid processor KAVUAKA is presented, which is customized and optimized with state-of-the-art hearing aid algorithms such as speaker localization, noise reduction, beamforming algorithms, and speech recognition. Specialized and application-specific instructions are designed and added to the baseline instruction set architecture (ISA). Among the major contributions are a multiply-accumulate (MAC) unit for real- and complex-valued numbers, architectures for power reduction during register accesses, co-processors and a low-latency audio interface. With the proposed MAC architecture, the KAVUAKA processor requires 16 % less cycles for the computation of a 128-point fast Fourier transform (FFT) compared to related programmable digital signal processors. The power consumption during register file accesses is decreased by 6 %to 17 % with isolation and by-pass techniques. The hardware-induced audio latency is 34 %lower compared to related audio interfaces for frame size of 64 samples.The final hearing aid system-on-chip (SoC) with four KAVUAKA processor cores and ten co-processors is integrated as an application-specific integrated circuit (ASIC) using a 40 nm low-power technology. The die size is 3.6 mm2. Each of the processors and co-processors contains individual customizations and hardware features with a varying datapath width between 24-bit to 64-bit. The core area of the 64-bit processor configuration is 0.134 mm2. The processors are organized in two clusters that share memory, an audio interface, co-processors and serial interfaces. The average power consumption at a clock speed of 10 MHz is 2.4 mW for SoC and 0.6 mW for the 64-bit processor.Case studies with four reference hearing aid algorithms are used to present and evaluate the proposed hardware architectures and optimizations. The program code for each processor and co-processor is generated and optimized with evolutionary algorithms for operation merging,instruction scheduling and register allocation. The KAVUAKA processor architecture is com-pared to related processor architectures in terms of processing performance, average power consumption, and silicon area requirements

Institutionelles Repositorium der Leibniz Universität Hannover

Residue Number Systems: a Survey

Author: Nannarelli Alberto
Re Marco
Publication venue: Technical University of Denmark, DTU Informatics, Building 321
Publication date: 01/01/2008
Field of study

Online Research Database In Technology

Implementation of Carrier Phase Recovery Circuits for Optical Communication

Author: B\uf6rjeson Erik
Publication venue
Publication date: 01/01/2020
Field of study

Fiber-optic links form a vital part of our increasingly connected world, and as the number of Internet users and the network traffic increases, reducing the power dissipation of these links becomes more important. A considerable part of the total link power is dissipated in the digital signal processing (DSP) subsystems, which show a growing complexity as more advanced modulation formats are introduced. Since DSP designers can no longer take reduced power dissipation with each new CMOS process node for granted, the design of more efficient DSPalgorithms in conjunction with circuit implementation strategies focused on power efficiency is required.One part of the DSP for a coherent fiber-optic link is the carrier phase recovery (CPR) unit, which can account for a significant portion of the DSP power dissipation, especially for shorter links. A wide range of CPR algorithms is available, but reliable estimates of their power efficiency is missing, making accurate comparisons impossible. Furthermore, much of the current literature does not account for the limited precision arithmetic of the DSP.In this thesis, we develop circuit implementations based on a range of suggested CPR algorithms, focusing on power efficiency. These circuits allow us to contrast different CPR solutions based not only on power dissipation, but also on the quality of the phase estimation, including fixed-point arithmetic aspects. We also show how different parameter settings affect the power efficiency and the implementation penalty. Additionally, the thesis includes a description of our field-programmable gate-array fiber-emulation environment, which can be used to study rare phenomena in DSP implementations, or to reach very low bit-error rates. We use this environment to evaluate the cycle-slip probability of a CPR implementation

Chalmers Research

Studies on Implementation of . . . High Throughput and Low Power Consumption

Author: Henrik Ohlsson
Publication venue
Publication date
Field of study

In this thesis we discuss design and implementation of frequency selective digital filters with high throughput and low power consumption. The thesis includes proposed arithmetic transformations of lattice wave digital filters that aim at increasing the throughput and reduce the power consumption of the filter implementation. The thesis also includes two case studies where digital filters with high throughput and low power consumption are required. A method for obtaining high throughput as well as reduced power consumption of digital filters is arithmetic transformation of the filter structure. In this thesis arithmetic transformations of first- and second-order Richards’ allpass sections composed by symmetric two-port adaptors and implemented using carry-save arithmetic are proposed. Such filter sections can be used for implementation of lattice wave digital filters and bireciprocal lattice wave digital filters. The latter structures are efficient for implementation of interpolators and decimators by factors of two. Th

CiteSeerX

Serial-data computation in VLSI

Author: Smith Stewart Gresty
Publication venue: The University of Edinburgh
Publication date: 01/01/1987
Field of study

Edinburgh Research Archive

Towards adaptive balanced computing (ABC) using reconfigurable functional caches (RFCs)

Author: Kim Hue-Sung
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2001
Field of study

The general-purpose computing processor performs a wide range of functions. Although the performance of general-purpose processors has been steadily increasing, certain software technologies like multimedia and digital signal processing applications demand ever more computing power. Reconfigurable computing has emerged to combine the versatility of general-purpose processors with the customization ability of ASICs. The basic premise of reconfigurability is to provide better performance and higher computing density than fixed configuration processors. Most of the research in reconfigurable computing is dedicated to on-chip functional logic. If computing resources are adaptable to the computing requirement, the maximum performance can be achieved. To overcome the gap between processor and memory technology, the size of on-chip cache memory has been consistently increasing. The larger cache memory capacity, though beneficial in general, does not guarantee a higher performance for all the applications as they may not utilize all of the cache efficiently. To utilize on-chip resources effectively and to accelerate the performance of multimedia applications specifically, we propose a new architecture---Adaptive Balanced Computing (ABC). ABC uses dynamic resource configuration of on-chip cache memory by integrating Reconfigurable Functional Caches (RFC). RFC can work as a conventional cache or as a specialized computing unit when necessary. In order to convert a cache memory to a computing unit, we include additional logic to embed multi-bit output LUTs into the cache structure. We add the reconfigurability of cache memory to a conventional processor with minimal modification to the load/store microarchitecture and with minimal compiler assistance. ABC architecture utilizes resources more efficiently by reconfiguring the cache memory to computing units dynamically. The area penalty for this reconfiguration is about 50--60% of the memory cell cache array-only area with faster cache access time. In a base array cache (parallel decoding caches), the area penalty is 10--20% of the data array with 1--2% increase in the cache access time. However, we save 27% for FIR and 44% for DCT/IDCT in area with respect to memory cell array cache and about 80% for both applications with respect to base array cache if we were to implement all these units separately (such as ASICs). The simulations with multimedia and DSP applications (DCT/IDCT and FIR/IIR) show that the resource configuration with the RFC speedups ranging from 1.04X to 3.94X in overall applications and from 2.61X to 27.4X in the core computations. The simulations with various parameters indicate that the impact of reconfiguration can be minimized if an appropriate cache organization is selected

Digital Repository @ Iowa State University (ISU)

Image Processing Using FPGAs

Author: Bailey Donald
Publication venue: 'MDPI AG'
Publication date: 01/01/2019
Field of study

This book presents a selection of papers representing current research on using field programmable gate arrays (FPGAs) for realising image processing algorithms. These papers are reprints of papers selected for a Special Issue of the Journal of Imaging on image processing using FPGAs. A diverse range of topics is covered, including parallel soft processors, memory management, image filters, segmentation, clustering, image analysis, and image compression. Applications include traffic sign recognition for autonomous driving, cell detection for histopathology, and video compression. Collectively, they represent the current state-of-the-art on image processing using FPGAs

Directory of Open Access Books (DOAB)

Recommended from our members

Efficient FPGA implementation and power modelling of image and signal processing IP cores

Author: Chandrasekaran Shrutisagar
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2007
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number ofimage and signal processing application areas such as consumer electronics, instrumentation, medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power leading to a number of issues including reduced mobility, reliability concerns and increased design cost among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution. A pioneering study into the influence of supply voltage scaling for FPGA based designs, used in conjunction with performance enhancing strategies such as parallelism and pipelining has been performed. Initial results are very promising and indicated significant potential for future research in this area. A key contribution of this work includes the development of a novel high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed

Brunel University Research Archive

The RACE-H ‘ Processor

Author: Frank Barry
Gerald G. Pechanek
Hypercube Processor
Mihailo Stojancic
Nikos Pitsianis
Publication venue
Publication date
Field of study

CiteSeerX

Low power digital signal processing

Author: Paker Ozgun
Publication venue: Technical University of Denmark
Publication date: 01/01/2003
Field of study

Online Research Database In Technology