812 research outputs found

Design of an Adaptable Run-Time Reconfigurable Software-Defined Radio Processing Architecture

Author: Templin Joshua R.
Publication venue: DigitalCommons@USU
Publication date: 01/01/2010

Processing power is a key technical challenge holding back the development of a high-performance software-defined radio (SDR). Traditionally, SDR has utilized digital signal processors (DSPs), but increasingly complex algorithms, higher data rates, and multi-tasking needs have exceeded the processing capabilities of modern DSPs. Reconfigurable computers, such as field-programmable gate arrays (FPGAs), are popular alternatives because of their performance gains over software for streaming data applications like SDR. However, FPGAs have not yet realized the ideal SDR because architectures have not fully utilized their partial reconfiguration (PR) capabilities to bring needed flexibility. A reconfigurable processor architecture is proposed that utilizes PR in reconfigurable computers to achieve a more sophisticated SDR. The proposed processor contains run-time swappable blocks whose parameters and interconnects are programmable. The architecture is analyzed for performance and flexibility and compared with available alternate technologies. For a sample QPSK algorithm, hardware performance gains of at least 44x are seen over modern desktop processors and DSPs while most of their flexibility and extensibility is maintained.

CiteSeerX

DigitalCommons@USU

Transformations of High-Level Synthesis Codes for High-Performance Computing

Authors: Besta Maciej, Hoefler Torsten, Licht Johannes de Fine, Meierhans Simon
Publication date: 29/10/2019

Specialized hardware architectures promise a major step in performance and energy efficiency over the traditional load/store devices currently employed in large-scale computing systems. The adoption of high-level synthesis (HLS) from languages such as C/C++ and OpenCL has greatly increased programmer productivity when designing for such platforms. While this has enabled a wider audience to target specialized hardware, the optimization principles known from traditional software design are no longer sufficient to implement high-performance codes. Fast and efficient codes for reconfigurable platforms are thus still challenging to design. To alleviate this, we present a set of optimizing transformations for HLS, targeting scalable and efficient architectures for high-performance computing (HPC) applications. Our work provides a toolbox for developers, where we systematically identify classes of transformations, the characteristics of their effect on the HLS code and the resulting hardware (e.g., increases data reuse or resource consumption), and the objectives that each transformation can target (e.g., resolve interface contention, or increase parallelism). We show how these can be used to efficiently exploit pipelining, on-chip distributed fast memory, and on-chip streaming dataflow, allowing for massively parallel architectures. To quantify the effect of our transformations, we use them to optimize a set of throughput-oriented FPGA kernels, demonstrating that our enhancements are sufficient to scale up parallelism within the hardware constraints. With the transformations covered, we hope to establish a common framework for performance engineers, compiler developers, and hardware developers, to tap into the performance potential offered by specialized hardware architectures using HLS.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Recommended from our members

Efficient FPGA implementation and power modelling of image and signal processing IP cores

Author: Chandrasekaran Shrutisagar
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2007

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number of image and signal processing application areas such as consumer electronics, instrumentation, medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power, leading to a number of issues including reduced mobility, reliability concerns and increased design cost among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution.
A pioneering study into the influence of supply voltage scaling for FPGA based designs, used in conjunction with performance enhancing strategies such as parallelism and pipelining, has been performed. Initial results are very promising and indicate significant potential for future research in this area. A key contribution of this work includes the development of a novel high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed.

Brunel University Research Archive

Reconfigurable microarchitectures at the programmable logic interface

Author: Donlin Adam
Publication venue: The University of Edinburgh
Publication date: 01/01/2001

Edinburgh Research Archive

Low power field programmable gate array implementation of fast digital signal processing algorithms: characterisation and manipulation of data locality

Authors: McKeown S., Woods R.
Publication venue: Institution of Engineering and Technology (IET)
Publication date: 01/03/2011

Queen's University Belfast Research Portal

Active FPGA Security through Decoy Circuits

Author: Christiansen Bradley D.
Publication venue: AFIT Scholar
Publication date: 01/06/2006

Field Programmable Gate Arrays (FPGAs) based on Static Random Access Memory (SRAM) are vulnerable to tampering attacks such as readback and cloning attacks. Such attacks enable the reverse engineering of the design programmed into an FPGA. To counter such attacks, measures that protect the design with low performance penalties should be employed. This research proposes a method which employs the addition of active decoy circuits to protect SRAM FPGAs from reverse engineering. The effects of the protection method on security, execution time, power consumption, and FPGA resource usage are quantified. The method significantly increases the security of the design with only minor increases in execution time, power consumption, and resource usage. For the circuits used to characterize the method, security increased to more than one million times the original values, while execution time increased to at most 1.2 times, dynamic power consumption increased to at most two times, and look-up table usage increased to at most seven times the original values. These are reasonable penalties given the size and security of the modified circuits. The proposed design protection method also extends to FPGAs based on other technologies and to Application-Specific Integrated Circuits (ASICs). In addition to the design methodology proposed, a new classification of tampering attacks and countermeasures is presented.

AFIT Scholar (Air Force Institute of Technology)

A Survey of Graph Pre-processing Methods: From Algorithmic to Hardware Perspectives

Authors: Dong Mengyao, Fan Dongrui, Liu Xin, Lv Zhengyang, Sun Ninghui, Yan Mingyu, Ye Xiaochun
Publication date: 14/09/2023

Graph-related applications have experienced significant growth in academia and industry, driven by the powerful representation capabilities of graphs. However, efficiently executing these applications faces various challenges, such as load imbalance and random memory access. To address these challenges, researchers have proposed various acceleration systems, including software frameworks and hardware accelerators, all of which incorporate graph pre-processing (GPP). GPP serves as a preparatory step before the formal execution of applications, involving techniques such as sampling and reordering. However, GPP execution often remains overlooked, as the primary focus is directed towards enhancing graph applications themselves. This oversight is concerning, especially considering the explosive growth of real-world graph data, where GPP becomes essential and even dominates system running overhead. Furthermore, GPP methods exhibit significant variations across devices and applications due to high customization. Unfortunately, no comprehensive work systematically summarizes GPP. To address this gap and foster a better understanding of GPP, we present a comprehensive survey dedicated to this area. We propose a double-level taxonomy of GPP, considering both algorithmic and hardware perspectives. Through listing relevant works, we illustrate our taxonomy and conduct a thorough analysis and summary of diverse GPP techniques. Lastly, we discuss challenges in GPP and potential future directions.

arXiv.org e-Print Archive

Blind adaptive equalization for QAM signals: New algorithms and FPGA implementation

Author: Banovic Kevin
Publication venue: University of Windsor Leddy Library
Publication date: 01/01/2006

Adaptive equalizers remove signal distortion attributed to intersymbol interference in band-limited channels. The tap coefficients of adaptive equalizers are time-varying and can be adapted using several methods. When these methods do not include the transmission of a training sequence, the process is referred to as blind equalization. The radius-adjusted approach is a method to achieve blind equalizer tap adaptation based on the equalizer output radius for quadrature amplitude modulation (QAM) signals. Static circular contours are defined around an estimated symbol in a QAM constellation, which create regions that correspond to fixed step sizes and weighting factors. The equalizer tap adjustment consists of a linearly weighted sum of adaptation criteria that is scaled by a variable step size. This approach is the basis of two new algorithms: the radius-adjusted modified multimodulus algorithm (RMMA) and the radius-adjusted multimodulus decision-directed algorithm (RMDA). An extension of the radius-adjusted approach is the selective update method, which is a computationally efficient method for equalization. The selective update method employs a stop-and-go strategy based on the equalizer output radius to selectively update the equalizer tap coefficients, thereby reducing the number of computations in steady-state operation. (Abstract shortened by UMI.) Source: Masters Abstracts International, Volume: 45-01, page: 0401. Thesis (M.A.Sc.)--University of Windsor (Canada), 2006.
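
The stop-and-go idea in the abstract above can be illustrated with a minimal sketch. This is not the thesis's RMMA or RMDA: the toy 4-QAM constellation, the single threshold `stop_radius`, the step size `mu`, the plain decision-directed error term, and the function names are all assumptions chosen for illustration. The sketch only shows the general pattern of measuring the distance between the equalizer output and the nearest constellation point, and skipping the tap update when that radius is small.

```python
# Illustrative sketch only. The constellation, threshold, step size and the
# decision-directed error term are assumptions, not the thesis's definitions.

QAM4_LEVELS = (-1.0, 1.0)  # per-axis levels of a toy 4-QAM constellation


def nearest_symbol(y: complex) -> complex:
    """Slice the equalizer output to the closest constellation point."""
    re = min(QAM4_LEVELS, key=lambda a: abs(y.real - a))
    im = min(QAM4_LEVELS, key=lambda a: abs(y.imag - a))
    return complex(re, im)


def selective_update(taps, window, mu=0.01, stop_radius=0.1):
    """Perform one stop-and-go step of a linear equalizer.

    taps   -- current complex tap coefficients
    window -- received samples aligned with the taps
    Returns (new_taps, output, updated), where `updated` is False when the
    output radius fell inside `stop_radius` and the update was skipped.
    """
    # Equalizer output: inner product of taps and received samples.
    y = sum(w * x for w, x in zip(taps, window))
    s_hat = nearest_symbol(y)
    radius = abs(y - s_hat)  # distance from the estimated symbol

    if radius < stop_radius:
        # "Stop": output is already close to a symbol, so save the computation.
        return list(taps), y, False

    # "Go": stochastic-gradient step on |y - s_hat|^2 (decision-directed).
    err = y - s_hat
    new_taps = [w - mu * err * x.conjugate() for w, x in zip(taps, window)]
    return new_taps, y, True
```

In steady state most outputs would fall inside the stop region, so most calls return without computing the tap update; that is the source of the computational saving the abstract describes. A fuller radius-adjusted scheme would presumably use several contours with different step sizes and weights, which this sketch does not attempt.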

Scholarship at UWindsor