553 research outputs found
Implementation of Elliptic Curve Crypto Processor and Its Performance Analysis
ECDSA stands for "Elliptic Curve Digital Signature Algorithm"; it is used to create a digital signature of data (a file, for example) so that its authenticity can be verified without compromising its security. This paper presents the architecture of finite field multiplication. The multiplier used in this processor is a hybrid Karatsuba multiplier. For the multiplicative inverse we choose the Itoh-Tsujii Algorithm (ITA). This work presents the design of a high-performance elliptic curve crypto processor (ECCP) for an elliptic curve over the finite field GF(2^233). The curve we choose is the standard curve for the digital signature. The processor is synthesized for a Xilinx FPGA.
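To make the hybrid Karatsuba idea concrete, here is a minimal software sketch, not the paper's hardware design. It assumes polynomials over GF(2) are stored as Python integers (bit i is the coefficient of x^i), that "hybrid" means Karatsuba recursion with a fall-back to schoolbook multiplication below a threshold, and that the field is reduced by x^233 + x^74 + 1 (the NIST polynomial for 233-bit binary curves; the abstract does not name it). All function names and the threshold value are hypothetical.

```python
M = 233
# Assumed reduction polynomial f(x) = x^233 + x^74 + 1 (not stated in the abstract).
F = (1 << 233) | (1 << 74) | 1


def clmul_schoolbook(a: int, b: int) -> int:
    """Carry-less product in GF(2)[x]: shift-and-XOR instead of shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def clmul_karatsuba(a: int, b: int, threshold: int = 32) -> int:
    """Karatsuba product in GF(2)[x]; small operands use the schoolbook method."""
    n = max(a.bit_length(), b.bit_length())
    if n <= threshold:
        return clmul_schoolbook(a, b)
    half = n // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = clmul_karatsuba(a_lo, b_lo, threshold)
    hi = clmul_karatsuba(a_hi, b_hi, threshold)
    # In GF(2), addition and subtraction are both XOR.
    mid = clmul_karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, threshold) ^ lo ^ hi
    return (hi << (2 * half)) ^ (mid << half) ^ lo


def reduce_mod_f(c: int) -> int:
    """Reduce a polynomial modulo F by cancelling the top bit repeatedly."""
    while c.bit_length() > M:
        shift = c.bit_length() - 1 - M
        c ^= F << shift
    return c


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements of GF(2^233) under the assumed polynomial."""
    return reduce_mod_f(clmul_karatsuba(a, b))
```

Three sub-products replace the four of the schoolbook split, which is the saving a hardware multiplier exploits; the threshold trades recursion overhead against that saving.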
Parametrizable Architecture for Function Recursive Evaluation
Paper submitted to the XVIII Conference on Design of Circuits and Integrated Systems (DCIS), Ciudad Real, España, 2003. This paper presents a function evaluation method developed under the scope of recursive expression of function convolution. This approach is based on a unique parametrizable formula capable of providing function points by successive iteration. At the design level, it also proves suitable for developing architectural schemes capable of dealing with different speed and precision issues. An architecture for reconfigurable FPGAs based on serial distributed arithmetic implements the design for fast prototyping. The case of combined trigonometric functions involved in rotation is analyzed under this scope. Compared with other methods, our proposal offers a good balance between speed and precision.
Realtime image noise reduction FPGA implementation with edge detection
The purpose of this dissertation was to develop and implement, in a Field
Programmable Gate Array (FPGA), a noise reduction algorithm for real-time
sensor-acquired images. A Moving Average filter was chosen because of its
low computational cost, speed, good precision and low to medium hardware
resource utilization. The technique is simple to implement; however, if all
pixels are indiscriminately filtered, the result will be a blurry image, which
is undesirable.
Since the human eye is more sensitive to contrasts, a technique was
introduced to preserve sharp contour transitions which, in the author’s opinion,
is the dissertation’s contribution. Synthetic and real images were tested.
Synthetic images, composed of both sharp and soft tone transitions, were
generated with an algorithm developed for this purpose, while real images were
captured with an 8-kbit (8192 shades) high-resolution sensor and scaled up to
10 × 10^3 shades.
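The edge-preserving idea can be sketched as follows. The abstract does not state how contours are detected, so this example assumes a simple criterion: a pixel is left unfiltered when the spread (max minus min) inside its window exceeds a threshold. The function name, the threshold, and the one-dimensional row layout are all hypothetical.

```python
def edge_preserving_moving_average(row, window=3, edge_threshold=50):
    """Smooth a row of integer pixels, skipping windows that contain an edge.

    Assumes an odd window length. Border pixels without a full window are
    copied unchanged. Integer division mimics integer hardware arithmetic.
    """
    half = window // 2
    out = list(row)
    for i in range(half, len(row) - half):
        neighbourhood = row[i - half : i + half + 1]
        if max(neighbourhood) - min(neighbourhood) > edge_threshold:
            continue  # assumed edge criterion: keep the original pixel
        out[i] = sum(neighbourhood) // window
    return out
```

With `row = [10, 12, 11, 13, 200, 202, 201, 199]`, the flat regions on either side of the jump are averaged while the pixels at the 13-to-200 transition keep their original values.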
A least-squares polynomial data smoothing filter, Savitzky-Golay, was
used as comparison. It can be adjusted using 3 degrees of freedom: the
window frame length, which varies the size of the pixel neighborhood being
filtered; the derivative order, which varies the curviness; and the
polynomial coefficients, which change the adaptability of the curve. The Moving
Average filter permits only one degree of freedom, the window frame length.
Tests revealed promising results with 2nd and 4th polynomial orders.
Savitzky-Golay achieved qualitatively better results owing to its better
preservation of signal characteristics, especially at high frequencies.
FPGA algorithms were implemented in 64-bit integer registers, serving
two purposes: to increase precision, thereby reducing the error compared with
floating-point registers, and to accommodate the growth of the registers’
cumulative multiplications. Results were then compared with MATLAB’s double
precision 64-bit floating-point computations to verify the error difference
between both. The comparison parameters used were Mean Squared Error,
Signal-to-Noise Ratio and Similarity coefficient.
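A minimal sketch of two of the comparison metrics named above, assuming the usual textbook definitions (the abstract gives no formulas): MSE as the mean of squared differences, and SNR in decibels as the ratio of reference signal energy to error energy. The Similarity coefficient is not defined in the abstract, so it is omitted here. Function names are hypothetical.

```python
import math


def mse(reference, test):
    """Mean squared error between a reference signal and a test signal."""
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def snr_db(reference, test):
    """Signal-to-noise ratio in dB, treating (reference - test) as noise."""
    noise = sum((r - t) ** 2 for r, t in zip(reference, test))
    signal = sum(r ** 2 for r in reference)
    if noise == 0:
        return math.inf  # identical signals: no error energy
    return 10 * math.log10(signal / noise)
```

In the setting described, `reference` would hold the double-precision results and `test` the integer results read back from the FPGA.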
Application-Specific Number Representation
Reconfigurable devices, such as Field Programmable Gate Arrays (FPGAs), enable
application-specific number representations. Well-known number formats include
fixed-point, floating-point, logarithmic number system (LNS), and residue number
system (RNS). Such different number representations lead to different arithmetic
designs and error behaviours, thus producing implementations with different
performance, accuracy, and cost.
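As one illustration of how the number format changes the arithmetic design, the sketch below shows a residue number system in software. The moduli (7, 11, 13) and all function names are assumptions chosen for the example, not taken from the thesis. Addition and multiplication act on each residue independently, with no carries between channels; converting back uses the Chinese Remainder Theorem.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; values 0..1000 are representable.
MODULI = (7, 11, 13)


def to_rns(x):
    """Encode a non-negative integer as its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)


def rns_add(a, b):
    """Channel-wise addition: no carry propagates between residues."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def rns_mul(a, b):
    """Channel-wise multiplication, equally carry-free."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues):
    """Decode with the Chinese Remainder Theorem (needs Python 3.8+)."""
    total = prod(MODULI)
    x = 0
    for r, m in zip(residues, MODULI):
        partial = total // m
        x += r * partial * pow(partial, -1, m)  # modular inverse of partial mod m
    return x % total
```

For example, `from_rns(rns_mul(to_rns(25), to_rns(40)))` recovers 1000, which is still inside the dynamic range of 1001. Results outside that range wrap around, which is one of the error behaviours a designer has to account for.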
To investigate the design options in number representations, the first part of this thesis presents
a platform that enables automated exploration of the number representation design space. The
second part of the thesis shows case studies that optimise the designs for area, latency or
throughput from the perspective of number representations.
Automated design space exploration in the first part addresses the following two major issues:
- Automation requires arithmetic unit generation. This thesis provides optimised
  arithmetic library generators for logarithmic and residue arithmetic units, which support
  a wide range of bit widths and achieve significant improvement over previous designs.
- Generation of arithmetic units requires specifying the bit widths for each
  variable. This thesis describes an automatic bit-width optimisation tool called R-Tool,
  which combines dynamic and static analysis methods, and supports different number
  systems (fixed-point, floating-point, and LNS numbers).
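The following sketch illustrates what a bit-width decision looks like for a single fixed-point variable. It is not R-Tool and does not reproduce its analysis; it only assumes that a value range and an error tolerance are already known, and derives the smallest two's-complement integer width and the smallest fractional width under round-to-nearest quantisation. Function names are hypothetical.

```python
def integer_bits(lo, hi):
    """Smallest signed integer width n with -2^(n-1) <= lo and hi < 2^(n-1)."""
    n = 1
    while not (-(2 ** (n - 1)) <= lo and hi < 2 ** (n - 1)):
        n += 1
    return n


def fraction_bits(max_error):
    """Smallest fraction width f whose rounding error 2^-(f+1) is <= max_error."""
    f = 0
    while 2.0 ** -(f + 1) > max_error:
        f += 1
    return f
```

A variable known to lie in [-128, 127] with a tolerance of 0.001 would need `integer_bits(-128, 127)` = 8 integer bits and `fraction_bits(0.001)` = 9 fraction bits under these assumptions.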
Putting it all together, the second part explores the effects of application-specific number
representation on practical benchmarks, such as radiative Monte Carlo simulation, and seismic
imaging computations. Experimental results show that customising the number representations
brings benefits to hardware implementations: by selecting a more appropriate number format,
we can reduce the area cost by up to 73.5% and improve the throughput by 14.2% to 34.1%; by
performing the bit-width optimisation, we can further reduce the area cost by 9.7% to 17.3%.
On the performance side, hardware implementations with customised number formats achieve
5 to potentially over 40 times speedup over software implementations.
Precision analysis for hardware acceleration of numerical algorithms
The precision used in an algorithm affects the error and performance of individual computations, the
memory usage, and the potential parallelism for a fixed hardware budget. However, when migrating
an algorithm onto hardware, the potential improvements that can be obtained by tuning the precision
throughout an algorithm to meet a range or error specification are often overlooked; the major reason
is that it is hard to choose a number system which can guarantee any such specification can be met.
Instead, the problem is mitigated by opting to use IEEE standard double precision arithmetic so as to be
‘no worse’ than a software implementation. However, the flexibility in the number representation is one
of the key factors that can be exploited on reconfigurable hardware such as FPGAs, and hence ignoring
this potential significantly limits the performance achievable.
In order to optimise the performance of hardware reliably, we require a method that can tractably
calculate tight bounds for the error or range of any variable within an algorithm, but currently only a
handful of methods to calculate such bounds exist, and these either sacrifice tightness or tractability,
whilst simulation-based methods cannot guarantee the given error estimate. This thesis presents a new
method to calculate these bounds, taking into account both input ranges and finite precision effects,
which we show to be, in general, tighter in comparison to existing methods; this in turn can be used to
tune the hardware to the algorithm specifications.
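To show why existing bounding methods can sacrifice tightness, here is a small sketch of interval arithmetic, one well-known range analysis technique. It is not the method proposed in the thesis, and it ignores rounding of the bounds themselves. The class name and the example expressions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] that is guaranteed to contain a variable."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = (
            self.lo * other.lo,
            self.lo * other.hi,
            self.hi * other.lo,
            self.hi * other.hi,
        )
        return Interval(min(products), max(products))
```

With `x = Interval(0.0, 1.0)`, the expression `x - x` evaluates to `Interval(-1.0, 1.0)` although the true result is always 0, and `x * (Interval(1.0, 1.0) - x)` evaluates to `Interval(0.0, 1.0)` although the true range is [0, 0.25]. The bounds are safe but loose because each occurrence of `x` is treated as independent.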
We demonstrate the use of this software to optimise hardware for various algorithms to accelerate
the solution of a system of linear equations, which forms the basis of many problems in engineering
and science, and show that significant performance gains can be obtained by using this new approach in
conjunction with more traditional hardware optimisations.
Design and analysis of an FPGA-based, multi-processor HW-SW system for SCC applications
The last 30 years have seen an increase in the complexity of embedded systems from a collection of simple circuits to systems consisting of multiple processors managing a wide variety of devices. This ever-increasing complexity frequently requires that high assurance, fail-safe and secure design techniques be applied to protect against possible failures and breaches. To facilitate the implementation of these embedded systems in an efficient way, the FPGA industry recently created new families of devices. New features added to these devices include anti-tamper monitoring, bit stream encryption, and optimized routing architectures for physical and functional logic partition isolation. These devices have high capacities and are capable of implementing processors using their reprogrammable logic structures. This allows for an unprecedented level of hardware and software interaction within a single FPGA chip. High assurance and fail-safe systems can now be implemented within the reconfigurable hardware fabric of an FPGA, enabling these systems to maintain flexibility and achieve high performance while providing a high level of data security. The objective of this thesis was to design and analyze an FPGA-based system containing two isolated, softcore Nios processors that share data through two crypto-engines. FPGA-based single-chip cryptographic (SCC) techniques were employed to ensure proper component isolation when the design is placed on a device supporting the appropriate security primitives. Each crypto-engine is an implementation of the Advanced Encryption Standard (AES), operating in Galois/Counter Mode (GCM) for both encryption and authentication. The features of the microprocessors and architectures of the AES crypto-engines were varied with the goal of determining combinations which best target high performance, minimal hardware usage, or a combination of the two.
Tehomuuntajan säädön toteutus FPGA:lla
High switching frequencies and control rates in switched-mode power supplies are hard to implement with microcontrollers. A very high clock frequency is required to execute complex control algorithms at a high control rate. FPGA chips offer a solution with inherent parallel processing. In this thesis, the feasibility of implementing the control of a typical telecom power converter with an FPGA is studied. Requirements for the control system partitioning are considered. The control of a resonant LLC converter is studied in detail and implemented in VHDL. As part of the controller, a high-resolution variable frequency PWM module and floating-point arithmetic modules are implemented. Finally, a complete VHDL simulation model is created and run in different conditions to verify the functionality of the design.
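The sketch below illustrates why frequency resolution matters for a variable frequency PWM. It assumes the simplest scheme, a counter that divides the system clock by an integer period, which is not necessarily how the thesis's high-resolution module works. The 100 MHz clock and 100 kHz switching frequency in the usage note are hypothetical values, as are the function names.

```python
def pwm_period_counts(clock_hz, target_hz):
    """Integer number of clock cycles per PWM period closest to the target."""
    return max(1, round(clock_hz / target_hz))


def achieved_frequency(clock_hz, counts):
    """Switching frequency actually produced by a given period count."""
    return clock_hz / counts


def frequency_step(clock_hz, counts):
    """Frequency change caused by lengthening the period by one clock cycle."""
    return clock_hz / counts - clock_hz / (counts + 1)
```

With a 100 MHz clock and a 100 kHz target, `pwm_period_counts(100e6, 100e3)` gives 1000 counts and `frequency_step(100e6, 1000)` is roughly 99.9 Hz. That step is the coarsest adjustment a plain counter allows at this operating point, and it grows with the square of the switching frequency.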