SMTBDD: New Form of BDD for Logic Synthesis
This paper proposes a new form of BDD, the SMTBDD, together with methods for obtaining it and its basic features. It presents the idea of using SMTBDDs in logic synthesis targeted at FPGA structures. An SMTBDD is created by cutting a BDD, which corresponds to a multiple decomposition. The essence of the proposed decomposition method lies in determining the number of bound functions 'g' that are needed from the content of the root table associated with the corresponding SMTBDD. The paper presents methods of searching for non-disjoint decompositions using SMTBDDs and analyzes techniques for choosing cutting levels with a view to effective technology mapping. It also discusses experimental results that confirm the efficiency of the analyzed decomposition methods.
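The link between a cut and the number of bound functions can be illustrated with classical functional decomposition. The sketch below is not the SMTBDD procedure from the paper: it assumes a plain truth-table view, counts the distinct columns of the decomposition chart for a chosen bound set (the column multiplicity), and derives the number of 'g' functions as the number of bits needed to encode those columns. The names `column_multiplicity` and `num_bound_functions` are hypothetical.

```python
from itertools import product


def column_multiplicity(f, n_vars, bound):
    """Count distinct columns of the decomposition chart of f.

    f      : callable taking a tuple of n_vars bits, returning 0 or 1
    bound  : indices of the bound-set variables; the rest form the free set
    """
    free = [i for i in range(n_vars) if i not in bound]
    columns = set()
    for b_bits in product((0, 1), repeat=len(bound)):
        column = []
        for f_bits in product((0, 1), repeat=len(free)):
            x = [0] * n_vars
            for i, v in zip(bound, b_bits):
                x[i] = v
            for i, v in zip(free, f_bits):
                x[i] = v
            column.append(f(tuple(x)))
        columns.add(tuple(column))
    return len(columns)


def num_bound_functions(multiplicity):
    """Bits needed to give each distinct column its own code."""
    return (multiplicity - 1).bit_length()


def at_least_two(x):
    """Example function: 1 when two or more of the four inputs are 1."""
    return int(sum(x) >= 2)


mu = column_multiplicity(at_least_two, 4, bound=[0, 1])
print(mu, num_bound_functions(mu))
```

In a BDD whose bound-set variables sit above the cut line, the same count is usually read off as the number of distinct nodes reached below the cut, which is why the choice of cutting level matters for technology mapping.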
Digital Beamforming Implementation on an FPGA Platform
This work is part of the UPC contribution to the CORPA (Cost-Optimised high Performance Active Receive Phase Array antenna for mobile terminals) project of the European Space Agency (ESA). The objective of the work presented is to implement a Digital Beamforming (DBF) platform for an antenna array receiver designed for the S-DMB system. Our project deals with the design of antenna arrays from a hardware point of view, in contrast to other, theoretical studies of DBF algorithms. Hence, we study practical aspects of DBF implementation such as signal quantization and the required computational resources.
The realization of signal processing methods and their hardware implementation over multi-carrier modulation using FPGA technology. Validation and implementation of multi-carrier modulation on FPGA, and signal processing of the channel estimation techniques and filter bank architectures for DWT using HDL coding for mobile and wireless applications.
The first part of this thesis presents the design, validation, and implementation of an Orthogonal Frequency Division Multiplexing (OFDM) transmitter and receiver on a Cyclone II FPGA chip using the DSP Builder and Quartus II high-level design tools. The resources allocated by the model, in terms of logic elements (LEs) including combinational functions and logic registers, have been investigated. The results show that the basic OFDM transceiver occupies about 14% of the available LE resources on an Altera Cyclone II EP2C35F672C6 FPGA chip (6% for the transmitter and 8% for the receiver), largely taken up by the FFT, IFFT and soft decision encoder.
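To make the transmitter and receiver data path concrete, here is a minimal software sketch of one OFDM symbol. It is not the thesis design: it assumes QPSK mapping, a naive O(N^2) DFT in place of the hardware FFT/IFFT cores, a cyclic prefix, an ideal channel, and hard decisions instead of the soft decision stage mentioned above. All function names are hypothetical.

```python
import cmath

# Gray-coded QPSK constellation (an assumption; the abstract names no mapping).
QPSK = {(0, 0): 1 + 1j, (0, 1): -1 + 1j, (1, 1): -1 - 1j, (1, 0): 1 - 1j}


def dft(x, inverse=False):
    """Naive DFT; the inverse is scaled by 1/N so that dft(dft(x, True)) == x."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def ofdm_modulate(bits, cp_len):
    """Map bit pairs to subcarriers, apply the IDFT, prepend a cyclic prefix."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    symbols = [QPSK[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]
    time = dft(symbols, inverse=True)
    return time[len(time) - cp_len:] + time


def ofdm_demodulate(samples, cp_len):
    """Drop the cyclic prefix, apply the DFT, pick the nearest QPSK point."""
    freq = dft(samples[cp_len:])
    bits = []
    for s in freq:
        pair = min(QPSK, key=lambda b: abs(QPSK[b] - s))
        bits.extend(pair)
    return bits


tx_bits = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1]
tx = ofdm_modulate(tx_bits, cp_len=2)
rx_bits = ofdm_demodulate(tx, cp_len=2)
print(rx_bits == tx_bits)
```

In hardware, the two `dft` calls correspond to the IFFT and FFT blocks that the abstract reports as dominating logic element usage.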
Secondly, a new wavelet-based OFDM system using FDPP-DA-based channel estimation is proposed as a reliable ECG Patient Monitoring System (ECG-PMS), a personal wireless telemedicine application. System performance has been investigated for different mother wavelets, and the effects of AWGN and multipath Rayleigh fading channels have also been studied. The performance of FDPP-DA-based and HDPP-DA-based channel estimation is compared for both DFT-based and wavelet-based OFDM systems. The system model was studied in MATLAB, with the average BER evaluated for randomized data. The main error measures that reflect the quality of the received ECG signals, computed between the reconstructed and original ECG signals, are established.
Finally, a DA-based architecture for 1-D iDWT/DWT based on an OFDM model is implemented for an ECG-PMS wireless telemedicine application. In the portable wireless body transmitter unit at the patient site, a fully Serial-DA-based scheme for the iDWT is realized to support higher hardware utilization and lower power consumption, whereas a fully Parallel-DA-based scheme for the DWT is applied at the base unit at the hospital site to support a higher throughput. Behavioural-level HDL models of the proposed system were developed and simulated to confirm their correctness. After simulation, the design models were synthesised and implemented on the target FPGA to validate them.
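The serial DA scheme mentioned above can be sketched in software. The example below assumes that DA stands for distributed arithmetic in its textbook form: an inner product with fixed coefficients is computed from a precomputed lookup table, consuming one bit of every two's-complement input per step. It is an illustration only, not the thesis architecture, and the names `build_da_lut` and `da_inner_product` are hypothetical.

```python
def build_da_lut(coeffs):
    """Precompute every partial sum of the fixed coefficients (2**K entries)."""
    k = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(1 << k)]


def da_inner_product(lut, xs, width):
    """Bit-serial DA: one LUT lookup and one shift-add per input bit.

    xs are signed integers that fit in `width`-bit two's complement.
    """
    mask = (1 << width) - 1
    acc = 0
    for b in range(width):
        # Gather bit b of every input into a LUT address.
        addr = 0
        for i, x in enumerate(xs):
            addr |= (((x & mask) >> b) & 1) << i
        term = lut[addr] << b
        # The sign bit carries weight -2**(width-1), so its term is subtracted.
        acc += -term if b == width - 1 else term
    return acc


coeffs = [3, -5, 7, 2]
xs = [10, -4, -128, 127]
lut = build_da_lut(coeffs)
print(da_inner_product(lut, xs, width=8))  # -592
```

A serial implementation performs the `width` iterations one per clock cycle with a single LUT and accumulator, which keeps area and power low; a parallel implementation replicates the LUT per bit position to finish in one step at the cost of more logic.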
Reducing the number of LUT elements in a Moore FSM circuit
A method of Moore FSM synthesis is proposed that reduces the number of LUT elements in the circuit generating the excitation functions of the memory flip-flops. The method relies on the free outputs of the embedded memory blocks (EMBs) that are used to implement the FSM's system of microoperations. An example of applying the proposed method is given.
Identifying the level of knowledge of final-year Bachelor of Engineering students at KUiTTHO in entrepreneurship from the aspect of capital management
Malaysia is a developing country. In this development process, the nation's aspiration to produce successful future entrepreneurs cannot be taken lightly. Knowledge of entrepreneurship therefore deserves due attention, and one of the main aspects of entrepreneurship is capital. Inefficient capital management is a major cause of entrepreneurial failure. With this in mind, a study of capital management was conducted among 100 final-year engineering students at KUiTTHO. This sample was chosen because these students are about to enter working life, where they may choose entrepreneurship as a career. Although they are not business students, they have skills in inventing products that can be commercialised. The findings show that these students are interested in entrepreneurship but still lack knowledge of capital management, particularly in determining start-up capital, managing working capital, and determining financing using the daily sales method. A capital management guideline was therefore developed to give them exposure to the subject.
Efficient architectures and power modelling of multiresolution analysis algorithms on FPGA
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. In the past two decades, there has been a huge amount of interest in Multiresolution Analysis Algorithms (MAAs) and their applications. Some of these applications, such as medical imaging, are computationally intensive and power hungry and require large amounts of memory, which creates a high demand for efficient algorithm implementation, low-power architectures and acceleration. Recently, some MAAs such as the Finite Ridgelet Transform (FRIT) and the Haar Wavelet Transform (HWT) have become very popular. They are suitable for a number of image processing applications such as detection of line singularities and contiguous edges, edge detection (useful for compression and feature detection), and medical image denoising and segmentation. Efficient hardware implementation and acceleration of these algorithms, particularly for large problems, is becoming very challenging and consumes a lot of power, which leads to a number of issues including mobility and reliability concerns. To overcome the computation problems, Field Programmable Gate Arrays (FPGAs) are the technology of choice for accelerating computationally intensive applications due to their high performance. Addressing the power issue requires optimisation and awareness at all levels of abstraction in the design flow.
The most important achievements of the work presented in this thesis are summarised here.
Two factorisation methodologies for the HWT, called HWT Factorisation Method 1 (HWTFM1) and HWT Factorisation Method 2 (HWTFM2), have been explored to increase the number of zeros and reduce hardware resources. In addition, two novel, efficient and optimised architectures for the proposed methodologies, based on Distributed Arithmetic (DA) principles, have been proposed. The evaluation shows that the proposed architectures reduce the number of arithmetic operations (additions/subtractions) by 33% and 25% respectively compared with a direct implementation of the HWT, and outperform existing results. The proposed HWTFM2 is implemented on advanced and low-power FPGA devices using the Handel-C language. The FPGA implementation results outperform other existing results in terms of area and maximum frequency. In addition, a novel efficient architecture for the Finite Radon Transform (FRAT) has been proposed. This architecture is integrated with the developed HWT architecture to build an optimised architecture for the FRIT. Strategies such as parallelism and pipelining have been deployed at the architectural level for efficient implementation on different FPGA devices. The performance of the proposed FRIT architecture has been evaluated, and the results outperform some other existing architectures. Both the FRAT and FRIT architectures have been implemented on FPGAs using the Handel-C language. The evaluation of both architectures shows that the obtained results outperform existing results by almost 10% in terms of frequency and area. The proposed architectures are also applied to image data (256 × 256), and their Peak Signal to Noise Ratio (PSNR) is evaluated for quality purposes.
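For orientation, the transform that these factorisations start from can be written in a few lines. The sketch below is the plain unnormalised Haar wavelet transform on integer data (pairwise sums and differences, repeated on the sums). It does not reproduce HWTFM1, HWTFM2 or the DA-based architectures, and the function names are hypothetical.

```python
def haar_forward(x):
    """Full unnormalised Haar transform of a power-of-two-length integer list."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(x)
    length = n
    while length > 1:
        half = length // 2
        sums = [out[2 * i] + out[2 * i + 1] for i in range(half)]
        diffs = [out[2 * i] - out[2 * i + 1] for i in range(half)]
        out[:length] = sums + diffs  # approximations first, then details
        length = half
    return out


def haar_inverse(coeffs):
    """Invert haar_forward exactly (sum + diff is always even)."""
    out = list(coeffs)
    n = len(out)
    length = 2
    while length <= n:
        half = length // 2
        block = []
        for i in range(half):
            s, d = out[i], out[half + i]
            block += [(s + d) // 2, (s - d) // 2]
        out[:length] = block
        length *= 2
    return out


signal = [4, 6, 10, 12, 8, 6, 5, 5]
coeffs = haar_forward(signal)
print(coeffs)  # [56, 8, -12, 4, -2, -2, 2, 0]
print(haar_inverse(coeffs) == signal)
```

Because every step is an addition or a subtraction, the cost of a hardware implementation is dominated by the adder count, which is the quantity the 33% and 25% reductions above refer to.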
Two architectures for cyclic convolution based on systolic arrays, using parallelism and pipelining, have been proposed; they can serve as the main building block of the proposed FRIT architecture. The first is a linear systolic array with pipelining, and the second is a systolic array with parallel processing. The second architecture reduces the number of registers by 42% compared with the first, and both architectures outperform other existing results. The proposed pipelined architecture has been implemented on different FPGA devices with vector sizes N = 4, 8, 16, 32 and word length W = 8. The implementation results show a significant improvement over other existing results.
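As a reference for what these arrays compute, the following sketch evaluates cyclic convolution directly from its definition. It is a software model only: the assumption is that each output sample corresponds to one chain of multiply-accumulate steps that a systolic array would spread across its processing elements. The function name is hypothetical.

```python
def cyclic_convolution(x, h):
    """y[i] = sum_j x[j] * h[(i - j) mod N] for two length-N sequences."""
    n = len(x)
    if len(h) != n:
        raise ValueError("both sequences must have the same length N")
    return [sum(x[j] * h[(i - j) % n] for j in range(n)) for i in range(n)]


x = [1, 2, 3, 4]
print(cyclic_convolution(x, [1, 0, 0, 0]))  # [1, 2, 3, 4]
print(cyclic_convolution(x, [0, 1, 0, 0]))  # [4, 1, 2, 3]
```

The direct form needs N multiply-accumulate operations per output, so an array of N processing elements can in principle produce one output per cycle once the pipeline is full.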
Finally, an in-depth evaluation of a high-level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called the functional level power modelling approach, has been presented. The mathematical techniques that form the basis of the proposed power modelling have been validated on a range of custom IP cores. The proposed power modelling is scalable and platform independent, and it compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating functional level power modelling with commercially available design tools for systematic optimisation of IP cores has also been developed. The in-depth evaluation of this tool makes it possible to observe the behaviour of different custom IP cores in terms of power consumption and accuracy using different design methodologies and arithmetic techniques on various FPGA platforms. Based on the results achieved, the proposed model is almost 99% accurate for the Dynamic Power (DP) components of all IP cores. Thomas Gerald Gray Charitable Trus
Efficient FPGA implementation and power modelling of image and signal processing IP cores
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number of image and signal processing application areas such as consumer electronics, instrumentation, medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power, leading to a number of issues including reduced mobility, reliability concerns and increased design cost, among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution. A pioneering study into the influence of supply voltage scaling for FPGA based designs, used in conjunction with performance enhancing strategies such as parallelism and pipelining, has been performed. Initial results are very promising and indicate significant potential for future research in this area.
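One of the test applications named above, the fast Walsh-Hadamard transform, shows how sparse factorisation turns a dense matrix product into stages of additions and subtractions. The sketch below is the standard in-place butterfly algorithm in natural (Hadamard) ordering. It is a generic illustration rather than the DA-based core from the thesis, and `fwht` is a hypothetical name.

```python
def fwht(x):
    """Unnormalised fast Walsh-Hadamard transform of a power-of-two-length list."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = list(x)
    h = 1
    while h < n:
        # Each stage is one sparse factor: butterflies on elements h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                u, v = a[i], a[i + h]
                a[i], a[i + h] = u + v, u - v
        h *= 2
    return a


data = [1, 0, 1, 0, 0, 1, 1, 0]
print(fwht(data))  # [4, 2, 0, -2, 0, 2, 0, 2]
```

The dense N × N product needs N(N - 1) additions, while the factored form needs N log2 N, and applying the transform twice returns the input scaled by N.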
A key contribution of this work is the development of a novel high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed.
Reconfigurable architectures for combinatorial optimization problems
Combinatorial problems have an extremely wide range of practical applications in a variety of engineering areas, including the testing of electronic circuits, pattern recognition, logic synthesis, etc. Many of the problems of interest belong to the classes NP-hard and NP-complete, which implies that the relevant algorithms have an exponential worst-case complexity. This fact precludes the solution of many practical problems with conventional computers. ASIC-based implementations are also not viable, in particular because of the inherent heterogeneity of combinatorial problems. Reconfigurable devices offer an alternative solution: they can be customized to the requirements of a specific algorithm and reused for other algorithms via a simple reprogramming of their internal structure. Implementations based on reconfigurable hardware permit the execution of the relevant algorithms to be optimized with the aid of such techniques as parallel processing, personalized functional units, etc. Such implementations allow the effect of exponential growth in the computation time to be delayed, thus enabling more complex problem instances to be solved.
Recently, a few reconfigurable engines for combinatorial problems have been developed. They are mainly based on the idea of instance-specific hardware, which assumes that a particular circuit is generated for each problem instance. In this thesis we explore two alternative approaches. The first, domain-specific, approach enables a variety of problems in the area of combinatorial computation to be addressed. For this purpose, a reconfigurable combinatorial processor has been designed and implemented, and a number of methods and tools that support its partial dynamic reconfiguration have been developed. The second, application-specific, approach is oriented towards solving individual combinatorial problems. In particular, a novel architecture is proposed for solving the Boolean satisfiability problem with the aid of software and reconfigurable hardware. The adopted technique avoids instance-specific hardware compilation and permits problems that exceed the available logic resources to be solved. The possibility of implementing evolutionary strategies for the traveling salesman problem in reconfigurable hardware is also explored.
This thesis extends the application domain of reconfigurable computing by demonstrating that it is effective in accelerating algorithms with complex control flows.
Hexarray: A Novel Self-Reconfigurable Hardware System
Evolvable hardware (EHW) is a powerful autonomous system for adapting and finding solutions within a changing environment. EHW consists of two main components: a reconfigurable hardware core and an evolutionary algorithm. The majority of prior research focuses on improving either the reconfigurable hardware or the evolutionary algorithm, but not both. Thus, current implementations suffer from being application oriented and having slow reconfiguration times, low efficiency, and limited routing flexibility. In this work, a novel evolvable hardware platform is proposed that combines a novel reconfigurable hardware core and a novel evolutionary algorithm.
The proposed reconfigurable hardware core is a systolic array, which is called HexArray. HexArray was constructed using processing elements with a redesigned architecture, called HexCells, which provide routing flexibility and support for hybrid reconfiguration schemes. The improved evolutionary algorithm is a genome-aware genetic algorithm (GAGA) that accelerates evolution. Guided by a fitness function the GAGA utilizes context-aware genetic operators to evolve solutions. The operators are genome-aware constrained (GAC) selection, genome-aware mutation (GAM), and genome-aware crossover (GAX). The GAC selection operator improves parallelism and reduces the redundant evaluations. The GAM operator restricts the mutation to the part of the genome that affects the selected output. The GAX operator cascades, interleaves, or parallel-recombines genomes at the cell level to generate better genomes. These operators improve evolution while not limiting the algorithm from exploring all areas of a solution space.
The system was implemented on an SoC that includes programmable logic (i.e., a field-programmable gate array) to realize the HexArray and a processing system to execute the GAGA. A computationally intensive application that evolves adaptive filters for image processing was chosen as a case study and used to conduct a set of experiments demonstrating the robustness of the developed system. Through an iterative process using the genetic operators and a fitness function, the EHW system configures and adapts itself to evolve fitter solutions. In a relatively short time (e.g., seconds), HexArray is able to evolve autonomously to the desired filter.
By exploiting the routing flexibility in the HexArray architecture, the EHW has a simple yet effective mechanism to detect and tolerate faulty cells, which improves system reliability. Finally, a mechanism that accelerates the evolution process by hiding the reconfiguration time in an “evolve-while-reconfigure” process is presented. In this process, the GAGA utilizes the array routing flexibility to bypass cells that are being configured and evaluates several genomes in parallel.