
376 research outputs found

Customisable arithmetic hardware designs

Author: Cheung Chak-Chung Ray
Publication date: 01/01/2007

Imperial Users only

Spiral - Imperial College Digital Repository

Pipeline-Based Power Reduction in FPGA Applications

Author: Díaz Lavadores Antonio
Rodellar Biarge M. Victoria
Sacristán Miguel Angel
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2008

This paper shows that temporal parallelism plays an important role in reducing power dissipation in FPGAs. Glitch propagation is blocked by the flip-flops or registers in the pipeline. Several multiplication structures are implemented on modern FPGAs, Stratix II and Virtex-4, and their results are compared with and without pipelining and hardware duplication.

Archivo Digital UPM

Adding aspect-oriented features to MATLAB

Author: Cardoso João M. P.
Fernandes João M.
Monteiro Miguel Pessoa
Publication date: 01/03/2006

This paper presents an approach to enrich MATLAB with aspect-oriented extensions to experiment with different implementation features. The language we propose aims to configure the low-level data representation of real variables and expressions, mapping them to a specifically tailored fixed-point data representation that is supported more efficiently by computing engines (e.g., DSPs, application-specific architectures, etc.) without dedicated hardware floating-point units. Additionally, the approach aims to help developers introduce handlers and monitoring features, and configure a function with an optimized implementation. FCT under projects PPC-VM (POSI/CHS/47158/2002) and SOFTAS (POSI/EIA/ 60189/200

Universidade do Minho: RepositoriUM
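
The fixed-point representation mentioned in the MATLAB abstract above can be illustrated with a short Python sketch. It is not taken from the paper: the function names, the 16-bit word length and the 8 fractional bits are assumptions chosen for illustration, and the paper's own extensions target MATLAB rather than Python.

```python
def to_fixed(x: float, word_bits: int = 16, frac_bits: int = 8) -> int:
    """Quantize x to a signed fixed-point integer, saturating on overflow."""
    scaled = round(x * (1 << frac_bits))      # scale by 2**frac_bits and round
    lo = -(1 << (word_bits - 1))              # most negative representable value
    hi = (1 << (word_bits - 1)) - 1           # most positive representable value
    return max(lo, min(hi, scaled))


def to_float(q: int, frac_bits: int = 8) -> float:
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)


def fixed_mul(a: int, b: int, word_bits: int = 16, frac_bits: int = 8) -> int:
    """Multiply two fixed-point values and rescale the product (floor)."""
    product = (a * b) >> frac_bits            # drop the extra fractional bits
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, product))
```

With these assumed parameters, values are stored in steps of 1/256 and anything outside roughly ±128 saturates, which is the kind of trade-off a developer would want to configure per variable.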

FPGA Implementation of DHT Algorithms for Image Compression

Author: Agrawal Richa
Publication date: 14/05/2010

Digital image processing is the use of computer algorithms to perform image processing on digital images. The basic operation performed by a simple digital camera is to convert light energy to electrical energy; the signal is then converted to digital format, and a compression algorithm is used to reduce the memory required to store the image. This compression algorithm is called frequently when capturing and storing images. This motivates the development of an efficient compression algorithm that gives the same result as existing algorithms with lower power consumption. Compression is useful because it reduces the usage of expensive resources, such as memory (hard disks) or transmission bandwidth. On the downside, compression techniques introduce distortion (due to lossy compression schemes), and additional computational resources are required for compression and decompression of the data. Reduction of these resources by comparing different algorithms for the DHT is required. FPGA implementations of different algorithms for the 1-D DHT, using VHDL, are carried out, and their comparison gives the optimum technique for compression. Finally, the 2-D DHT is implemented using the optimum 1-D technique for an 8x8 matrix input. The results obtained are discussed and improvements are suggested to further optimize the design.

ethesis@nitr
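
As a reference point for the DHT abstract above, here is a minimal Python sketch of a direct 1-D transform and a row-column 2-D transform for an 8x8 block. It assumes that DHT denotes the discrete Hartley transform (the abstract does not expand the acronym). It is a plain O(N^2) software model, not one of the thesis's hardware algorithms, and all function names are hypothetical. Note that the row-column form computes the separable (cas-cas) variant of the 2-D transform.

```python
import math


def dht(x):
    """Direct O(N^2) 1-D discrete Hartley transform: sum of x[t] * cas(2*pi*k*t/N)."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for t in range(n):
            angle = 2.0 * math.pi * k * t / n
            acc += x[t] * (math.cos(angle) + math.sin(angle))  # cas = cos + sin
        out.append(acc)
    return out


def idht(h):
    """The DHT is its own inverse up to a factor of 1/N."""
    n = len(h)
    return [v / n for v in dht(h)]


def dht2_row_column(block):
    """Separable 2-D transform: 1-D DHT on every row, then on every column."""
    rows = [dht(list(r)) for r in block]
    cols = [dht(list(c)) for c in zip(*rows)]   # transform each column
    return [list(r) for r in zip(*cols)]        # transpose back to row-major
```

Applying `dht2_row_column` twice to an 8x8 block and dividing by 64 recovers the input, which is a convenient self-check when comparing faster 1-D algorithms against this reference.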

Novel load identification techniques and a steady state self-tuning prototype for switching mode power supplies

Author: Congiu Andrea
Publication date: 14/04/2014

Control of Switched Mode Power Supplies (SMPS) has traditionally been achieved through analog means with dedicated integrated circuits (ICs). However, as power systems are becoming increasingly complex, the classical concept of control has gradually evolved into the more general problem of power management, demanding functionalities that are hardly achievable in analog controllers. The high flexibility offered by digital controllers and their capability to implement sophisticated control strategies, together with the programmability of controller parameters, make digital control very attractive as an option for improving the features of dc-dc converters. On the other hand, the major weak point of digital controllers is the achievable dynamic performance of the closed-loop system. Indeed, analog-to-digital conversion times, computational delays and sampling-related delays strongly limit the small-signal closed-loop bandwidth of a digitally controlled SMPS. Quantization effects set other severe constraints not known to analog solutions. For these reasons, intensive scientific research activity is addressing the problem of making digital compensators stronger competitors against their analog counterparts in terms of achievable performance. In a wide range of applications, dc-dc converters with high efficiency over the whole range of their load values are required. Integrated digital controllers for Switching Mode Power Supplies are gaining growing interest, since the feasibility of digital controller ICs specifically developed for high-frequency switching converters has been shown. One very interesting potential benefit is the use of autotuning of controller parameters (on-line controllers), so that the dynamic response can be set at the software level, independently of output capacitor filters, component variations and ageing.
These kinds of algorithms are able to identify the output filter configuration (system identification) and then automatically compute the best compensator gains to adjust system margins and bandwidth. To be an interesting solution, however, the self-tuning should satisfy two important requirements: it should not heavily affect converter operation under nominal conditions, and it should be based on a simple and robust algorithm whose complexity does not require a significant increase of the silicon area of the IC controller. The first issue is avoided by performing the system identification (SI) with the system in open-loop configuration, where perturbations can be induced in the system before start-up. Satisfying this requirement is much more challenging during steady-state operation, where perturbations on the output voltage are limited by the regular operation of the converter. The main advantage of steady-state SI methods is the detection of possible non-idealities occurring during converter operation. In this way, the system dynamics can be adjusted accordingly by tuning the compensator parameters. The resource-saving issue requires the development of "ad-hoc" self-tuning techniques specifically tailored for integrated digitally controlled converters. Considering the flexibility of digital control, self-tuning algorithms can be studied and easily integrated at hardware level into closed-loop SMPS, reducing development time and R&D costs. The work of this dissertation finds its origin in this context. Smart power management is accomplished by tuning the controller parameters according to the identified converter configuration. The main difficulty for self-tuning techniques is the identification of the converter output filter configuration. Two novel system identification techniques have been validated in this dissertation.
The open-loop SI method is based on the system step response, while dithering amplification effects are exploited for the steady-state SI method. The open-loop method can be used as an autotuning approach during or before system start-up: a step-evolving reference voltage is used as the system perturbation, and the output filter information is obtained from the Power Spectral Density (PSD) computation of the system step response. The use of ΔΣ modulators in digital control feedback is increasing considerably. During steady state, the finite resolution introduces quantization effects on the signal path, causing low-frequency contributions to the digital control word. Through the oversampling and dithering capabilities of ΔΣ modulators, resolution improvements are obtained. The presented steady-state identification technique demonstrates that, by amplifying the dithering effects on the signal path, the output filter information can be obtained on the digital side by processing the perturbed output voltage with the PSD computation. The amount of noise added to the output voltage does not affect converter operation; mathematical considerations have been addressed and then justified with both a Matlab/Simulink fixed-point and an FPGA-based closed-loop system. The load output filter identification of both algorithms refers to the frequency domain. When the respective perturbations occur, the system response is observed on the digital side and processed with the PSD computation. The extracted parameters are the resonant frequency and the possible ESR (Equivalent Series Resistance) contributions, which can be detected as maxima in the PSD output. The SI methods have been validated for different configurations of buck converters on a fixed-point closed-loop model; however, they can be easily applied to further converter configurations.
The steady-state method has been successfully integrated into an FPGA-based prototype for digitally controlled buck converters, which integrates the PSD computer needed for the load parameter identification. For this purpose, a novel VHDL-coded, fully scalable hybrid processor for Constant Geometry FFT (CG-FFT) computation has been designed and integrated into the PSD computation system. The processor is based on a variation of the conventional FFT algorithm, the Constant-Geometry FFT (CG-FFT). Hybrid CORDIC-LUT scalable architectures have been introduced as an alternative approach for computing the twiddle factors (TF, or phase factors) needed during execution of the FFT algorithm. The shared-core architecture uses a single phase rotator to satisfy all TF requests; it saves logic at the cost of computational speed. The pipelined architecture is composed of a number of stages equal to the number of PEs and achieves the highest possible throughput, at the expense of more hardware usage.

Archivio istituzionale della ricerca - Università di Cagliari

UniCA Eprints
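
The switching-mode power supply abstract above mentions CORDIC-based computation of FFT twiddle factors. The following Python sketch shows the textbook rotation-mode CORDIC iteration applied to that task. It is a floating-point model for illustration only and does not reproduce the thesis's hybrid CORDIC-LUT architecture; the function names, the 16-iteration default and the quadrant handling are assumptions.

```python
import math


def cordic_cos_sin(theta, iterations=16):
    """Rotation-mode CORDIC; returns (cos, sin) for |theta| <= pi/2."""
    # Elementary angles atan(2**-i); in hardware these come from a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Accumulated gain of the micro-rotations, compensated in the start vector.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0          # rotate toward zero residual angle
        shift = 2.0 ** -i                      # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    return x, y


def twiddle(k, n, iterations=16):
    """Approximate exp(-2j*pi*k/n) for 0 <= k < n/2 using CORDIC."""
    phi = 2.0 * math.pi * k / n                # 0 <= phi < pi
    if phi > math.pi / 2.0:
        c, s = cordic_cos_sin(math.pi - phi, iterations)
        c = -c                                 # cos(phi) = -cos(pi - phi)
    else:
        c, s = cordic_cos_sin(phi, iterations)
    return complex(c, -s)
```

Each extra iteration adds roughly one bit of angular resolution, so the iteration count is the knob that trades accuracy against latency or hardware.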

Energy-efficient embedded machine learning algorithms for smart sensing systems

Author: Osta Mario
Publication venue: Università degli studi di Genova
Publication date: 27/02/2020

Embedded autonomous electronic systems are required in numerous application domains such as Internet of Things (IoT), wearable devices, and biomedical systems. Embedded electronic systems usually host sensors, and each sensor hosts multiple input channels (e.g., tactile, vision), tightly coupled to the electronic computing unit (ECU). The ECU often extracts information by employing sophisticated methods, e.g., Machine Learning. However, embedding Machine Learning algorithms poses essential challenges in terms of hardware resources and energy consumption because of: 1) the large amount of data to be processed; 2) computationally demanding methods. Leveraging the trade-off between quality requirements and computational complexity and latency could reduce the system complexity without affecting the performance. The objectives of the thesis are to develop: 1) energy-efficient arithmetic circuits outperforming state-of-the-art solutions for embedded machine learning algorithms, 2) an energy-efficient embedded electronic system for the "electronic-skin" (e-skin) application. As such, this thesis exploits two main approaches: 1) Approximate Computing: In recent years, the approximate computing paradigm has become a major field of research since it is able to enhance the energy efficiency and performance of digital systems. "Approximate Computing" (AC) has turned out to be a practical approach to trade accuracy for better power, latency, and size. AC targets error-resilient applications and offers promising benefits by conserving some resources. Usually, approximate results are acceptable for many applications, e.g., tactile data processing, image processing, and data mining; thus, it is highly recommended to take advantage of energy reduction with minimal variation in performance.
In our work, we developed two approximate multipliers: 1) the first one is called the "META" multiplier and is based on the Error Tolerant Adder (ETA), 2) the second one is called the "Approximate Baugh-Wooley (BW)" multiplier, where the approximations are implemented in the generation of the partial products. We showed that the proposed approximate arithmetic circuits could achieve a relevant reduction in power consumption and time delay of around 80.4% and 24%, respectively, with respect to the exact BW multiplier. Next, to prove the feasibility of AC in real-world applications, we explored the approximate multipliers in a case study, the e-skin application. The e-skin application is defined as multiple sensing components, including 1) structural materials, 2) signal processing, 3) data acquisition, and 4) data processing. In particular, processing the data originating from the e-skin into low- or high-level information is the main problem to be addressed by the embedded electronic system. Many studies have shown that Machine Learning is a promising approach for processing tactile data when classifying input touch modalities. In our work, we proposed a methodology for evaluating the behavior of the system when introducing approximate arithmetic circuits in the main stages (i.e., signal and data processing stages) of the system. Based on the proposed methodology, we first implemented the approximate multipliers in the low-pass Finite Impulse Response (FIR) filter in the signal processing stage of the application. We noticed that the FIR filter based on Approx-BW outperforms state-of-the-art solutions, while respecting the trade-off between accuracy and power consumption, with an SNR degradation of 1.39 dB.
Second, we implemented approximate adders and multipliers in the Coordinate Rotation Digital Computer (CORDIC) and the Singular Value Decomposition (SVD) circuits, respectively, since CORDIC and SVD account for a significant part of the computationally expensive Machine Learning algorithms employed in tactile data processing. We showed benefits of up to 21% and 19% in power reduction at the cost of less than 5% accuracy loss for the CORDIC and SVD circuits when scaling the number of approximated bits. 2) Parallel Computing Platforms (PCP): Exploiting parallel architectures for near-threshold computing based on multi-core clusters is a promising approach to improve the performance of smart sensing systems. In our work, we exploited a novel computing platform embedding a Parallel Ultra Low Power processor (PULP), called "Mr. Wolf," for the implementation of Machine Learning (ML) algorithms for touch modality classification. First, we tested the ML algorithms at the software level; for RGB images (as a case study) and a tactile dataset, we achieved accuracies of 97% and 83.5%, respectively. After validating the effectiveness of the ML algorithm at the software level, we performed the on-board classification of two touch modalities, demonstrating the promising use of Mr. Wolf for smart sensing systems. Moreover, we proposed a memory management strategy for storing the needed amount of trained tensors (i.e., 50 trained tensors for each class) in the on-chip memory. We evaluated the execution cycles for Mr. Wolf using a single core, 2 cores, and 3 cores, taking advantage of the benefits of parallelization. We presented a comparison with the popular low-power ARM Cortex-M4F microcontroller, usually employed in battery-operated devices. We showed that the ML algorithm on the proposed platform runs 3.7 times faster than on the ARM Cortex-M4F (STM32F40), consuming only 28 mW.
The proposed platform achieves 15× better energy efficiency than the classification done on the STM32F40, consuming 81 mJ per classification and 150 pJ per operation.

Archivio istituzionale della ricerca - Università di Genova
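
To make the idea of approximating partial-product generation more concrete, the sketch below models a generic truncated unsigned multiplier in Python. It is not the META or Approximate Baugh-Wooley design from the thesis above, whose details the abstract does not give. The truncation scheme, the function names and the parameters `bits` and `dropped_columns` are all assumptions for illustration.

```python
def truncated_multiply(a: int, b: int, bits: int = 8, dropped_columns: int = 4) -> int:
    """Unsigned multiply that ignores partial-product bits in the lowest columns."""
    result = 0
    for i in range(bits):
        if not (a >> i) & 1:
            continue                          # this row of partial products is zero
        for j in range(bits):
            # Keep bit a_i * b_j only if its column (weight 2**(i+j)) is not dropped.
            if (b >> j) & 1 and i + j >= dropped_columns:
                result += 1 << (i + j)
    return result


def max_truncation_error(dropped_columns: int) -> int:
    """Worst-case error: column c holds c + 1 bits (valid for dropped_columns <= bits)."""
    return sum((c + 1) << c for c in range(dropped_columns))
```

In this model the result never exceeds the exact product, and the worst-case error grows quickly with the number of dropped columns, which mirrors the accuracy-versus-power trade-off the abstract describes.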

Novel load identification techniques and a steady state self-tuning prototype for switching mode power supplies

Author: Congiu Andrea
Publication venue: Università degli Studi di Cagliari
Publication date: 14/04/2014

Archivio istituzionale della ricerca - Università di Cagliari

Low energy HEVC and VVC video compression hardware

Author: Azgın Hasan
Publication date: 19/07/2019

Video compression standards compress a digital video by reducing and removing redundancy in the digital video using computationally complex algorithms. As spatial and temporal resolutions of videos increase, compression efficiencies of video compression algorithms are also increasing. However, increased compression efficiency comes with increased computational complexity. Therefore, it is necessary to reduce the computational complexity of video compression algorithms without reducing their visual quality in order to reduce the area and energy consumption of their hardware implementations. In this thesis, we propose a novel technique for reducing the amount of computation performed by the HEVC intra prediction algorithm. We designed low energy, reconfigurable HEVC intra prediction hardware using the proposed technique. We also designed a low energy FPGA implementation of the HEVC intra prediction algorithm using the proposed technique and DSP blocks. We propose a reconfigurable VVC intra prediction hardware architecture. We also propose an efficient VVC intra prediction hardware architecture using DSP blocks. We designed low energy VVC fractional interpolation hardware. We propose a novel approximate absolute difference technique. We designed low energy approximate absolute difference hardware using the proposed technique. We propose a novel approximate constant multiplication technique. We designed approximate constant multiplication hardware using the proposed technique. We quantified the computation reductions achieved by the proposed techniques and the video quality loss caused by the proposed approximation techniques. The proposed approximate absolute difference technique and approximate constant multiplication technique cause very small PSNR loss. The other proposed techniques cause no PSNR loss. We implemented the proposed hardware architectures in Verilog HDL.
We mapped the Verilog RTL code to Xilinx Virtex 6 or Xilinx Virtex 7 FPGAs and estimated the power consumption using the Xilinx XPower Analyzer tool. The proposed techniques significantly reduced the power and energy consumption of these FPGA implementations.

Sabanci University Research Database