2,866 research outputs found
Viability of Numerical Full-Wave Techniques in Telecommunication Channel Modelling
In telecommunication channel modelling the wavelength is small compared to the physical features of interest, therefore deterministic ray tracing techniques provide solutions that are more efficient, faster and still within time constraints than current numerical full-wave techniques. Solving fundamental Maxwell's equations is at the core of computational electrodynamics and best suited for modelling electrical field interactions with physical objects where characteristic dimensions of a computing domain is on the order of a few wavelengths in size. However, extreme communication speeds, wireless access points closer to the user and smaller pico and femto cells will require increased accuracy in predicting and planning wireless signals, testing the accuracy limits of the ray tracing methods. The increased computing capabilities and the demand for better characterization of communication channels that span smaller geographical areas make numerical full-wave techniques attractive alternative even for larger problems. The paper surveys ways of overcoming excessive time requirements of numerical full-wave techniques while providing acceptable channel modelling accuracy for the smallest radio cells and possibly wider. We identify several research paths that could lead to improved channel modelling, including numerical algorithm adaptations for large-scale problems, alternative finite-difference approaches, such as meshless methods, and dedicated parallel hardware, possibly as a realization of a dataflow machine
Research on Acceleration Technology for FDTD Based on Vivado HLS
时域有限差分法(Finitedifferencetimedomainmethod,FDTD)是一种电磁学计算的基本方法,通过空间内电场和磁场的交替计算,得到整个研究空间的电磁分布情况。对于很多电磁学问题,不论从概念上还是可实现性上来讲,时域有限差分方法都是最简单的计算方法。时域有限差分法可以解决复杂的电磁计算问题,但同时要消耗大量的计算机资源,并且花费较长的计算时间。为了更快速高效地得到计算结果,可以利用硬件技术进行加速,这也是近年来FDTD方法研究领域比较受关注的部分。Xilinx公司新推出的高级综合工具VivadoHLS(HighLevelSynthesis),直接通过C/C++语言开发硬...In the field of computational electromagnetics, finite difference time domain method (FDTD) has been widely used. Using FDTD, the electromagnetic distribution of the whole field is obtained by alternating calculation of the electric and magnetic field. For many electromagnetical computational problems, FDTD is the simplest method, in consideration of conception and achievability. Although FDTD can...学位:工程硕士院系专业:物理科学与技术学院_工程硕士(电子与通信工程)学号:3432014115280
FPGA Acceleration of Domain-specific Kernels via High-Level Synthesis
L'abstract è presente nell'allegato / the abstract is in the attachmen
Accelerating Reconfigurable Financial Computing
This thesis proposes novel approaches to the design, optimisation, and management of reconfigurable
computer accelerators for financial computing. There are three contributions. First, we propose novel
reconfigurable designs for derivative pricing using both Monte-Carlo and quadrature methods. Such
designs involve exploring techniques such as control variate optimisation for Monte-Carlo, and multi-dimensional
analysis for quadrature methods. Significant speedups and energy savings are achieved
using our Field-Programmable Gate Array (FPGA) designs over both Central Processing Unit (CPU)
and Graphical Processing Unit (GPU) designs. Second, we propose a framework for distributing computing
tasks on multi-accelerator heterogeneous clusters. In this framework, different computational
devices including FPGAs, GPUs and CPUs work collaboratively on the same financial problem based
on a dynamic scheduling policy. The trade-off in speed and in energy consumption of different accelerator
allocations is investigated. Third, we propose a mixed precision methodology for optimising
Monte-Carlo designs, and a reduced precision methodology for optimising quadrature designs. These
methodologies enable us to optimise throughput of reconfigurable designs by using datapaths with
minimised precision, while maintaining the same accuracy of the results as in the original designs
Hardware Acceleration of Beamforming in a UWB Imaging Unit for Breast Cancer Detection
The Ultrawideband (UWB) imaging technique for breast cancer detection is based on the fact that cancerous cells have different
dielectric characteristics than healthy tissues.When a UWB pulse in the microwave range strikes a cancerous region, the reflected
signal is more intense than the backscatter originating from the surrounding fat tissue. A UWB imaging system consists of transmitters, receivers, and antennas for the RF part, and of a digital back-end for processing the received signals. In this paper we focus on the imaging unit, which elaborates the acquired data and produces 2D or 3D maps of reflected energies.We show that one of the processing tasks, Beamforming, is the most timing critical and cannot be executed in software by a standard microprocessor in a reasonable time.We thus propose a specialized hardware accelerator for it.We design the accelerator in VHDL and test it in an FPGA-based prototype. We also evaluate its performance when implemented on a CMOS 45nm ASIC technology. The speed-up with respect to a software implementation is on the order of tens to hundreds, depending on the degree of parallelism permitted by the target technology
Recommended from our members
Efficient architectures and power modelling of multiresolution analysis algorithms on FPGA
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.In the past two decades, there has been huge amount of interest in Multiresolution Analysis Algorithms (MAAs) and their applications. Processing some of their applications such as medical imaging are computationally intensive, power hungry and requires large amount of memory which cause a high demand for efficient algorithm implementation, low power architecture and acceleration. Recently, some MAAs such as Finite Ridgelet Transform (FRIT) Haar Wavelet Transform (HWT) are became very popular and they are suitable for a number of image processing applications such as detection of line singularities and contiguous edges, edge detection (useful for compression and feature detection), medical image denoising and segmentation. Efficient hardware implementation and acceleration of these algorithms particularly when addressing large problems are becoming very chal-lenging and consume lot of power which leads to a number of issues including mobility, reliability concerns. To overcome the computation problems, Field Programmable Gate Arrays (FPGAs) are the technology of choice for accelerating computationally intensive applications due to their high performance. Addressing the power issue requires optimi- sation and awareness at all level of abstractions in the design flow.
The most important achievements of the work presented in this thesis are summarised
here.
Two factorisation methodologies for HWT which are called HWT Factorisation Method1 and (HWTFM1) and HWT Factorasation Method2 (HWTFM2) have been explored to increase number of zeros and reduce hardware resources. In addition, two novel efficient and optimised architectures for proposed methodologies based on Distributed Arithmetic (DA) principles have been proposed. The evaluation of the architectural results have shown that the proposed architectures results have reduced the arithmetics calculation (additions/subtractions) by 33% and 25% respectively compared to direct implementa-tion of HWT and outperformed existing results in place. The proposed HWTFM2 is implemented on advanced and low power FPGA devices using Handel-C language. The FPGAs implementation results have outperformed other existing results in terms of area and maximum frequency. In addition, a novel efficient architecture for Finite Radon Trans-form (FRAT) has also been proposed. The proposed architecture is integrated with the developed HWT architecture to build an optimised architecture for FRIT. Strategies such as parallelism and pipelining have been deployed at the architectural level for efficient im-plementation on different FPGA devices. The proposed FRIT architecture performance has been evaluated and the results outperformed some other existing architecture in place. Both FRAT and FRIT architectures have been implemented on FPGAs using Handel-C language. The evaluation of both architectures have shown that the obtained results out-performed existing results in place by almost 10% in terms of frequency and area. The proposed architectures are also applied on image data (256 £ 256) and their Peak Signal to Noise Ratio (PSNR) is evaluated for quality purposes.
Two architectures for cyclic convolution based on systolic array using parallelism and pipelining which can be used as the main building block for the proposed FRIT architec-ture have been proposed. The first proposed architecture is a linear systolic array with pipelining process and the second architecture is a systolic array with parallel process. The second architecture reduces the number of registers by 42% compare to first architec-ture and both architectures outperformed other existing results in place. The proposed pipelined architecture has been implemented on different FPGA devices with vector size (N) 4,8,16,32 and word-length (W=8). The implementation results have shown a signifi-cant improvement and outperformed other existing results in place.
Ultimately, an in-depth evaluation of a high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called func-tional level power modelling approach have been presented. The mathematical techniques that form the basis of the proposed power modeling has been validated by a range of custom IP cores. The proposed power modelling is scalable, platform independent and compares favorably with existing approaches. A hybrid, top-down design flow paradigm integrating functional level power modelling with commercially available design tools for systematic optimisation of IP cores has also been developed. The in-depth evaluation of this tool enables us to observe the behavior of different custom IP cores in terms of power consumption and accuracy using different design methodologies and arithmetic techniques on virous FPGA platforms. Based on the results achieved, the proposed model accuracy is almost 99% true for all IP core's Dynamic Power (DP) components.Thomas Gerald Gray Charitable Trus
- …