Search CORE

1,243 research outputs found

FPGA acceleration of a quantized neural network for remote-sensed cloud detection

Author: Crockett Louise
Greenland Steve
Ireland Murray
Karagiannakis Philipp
Reiter Philippe
Publication venue
Publication date: 23/09/2020
Field of study

The capture and transmission of remote-sensed imagery for Earth observation is both computationally and bandwidth expensive. In the analyses of remote-sensed imagery in the visual band, atmospheric cloud cover can obstruct up to two-thirds of observations, resulting in costly imagery being discarded. Mission objectives and satellite operational details vary; however, assuming a cloud-free observation requirement, a doubling of useful data downlinked with an associated halving of delivery cost is possible through effective cloud detection. A minimal-resource, real-time inference neural network is ideally suited to perform automatic cloud detection, both for pre-processing captured images prior to transmission and preventing unnecessary images being taken by larger payload sensors. Much of the hardware complexity of modern neural network implementations resides in high-precision floating-point calculation pipelines. In recent years, research has been conducted in identifying quantized, or low-integer precision equivalents to known deep learning models, which do not require the extensive resources of their floating-point, full-precision counterparts. Our work leverages existing research on binary and quantized neural networks to develop a real-time, remote-sensed cloud detection solution using a commodity field-programmable gate array. This follows on developments of the Forwards Looking Imager for predictive cloud detection developed by Craft Prospect, a space engineering practice based in Glasgow, UK. The synthesized cloud detection accelerator achieved an inference throughput of 358.1 images per second with a maximum power consumption of 2.4 W. This throughput is an order of magnitude faster than alternate algorithmic options for the Forwards Looking Imager at around one third reduction in classification accuracy, and approximately two orders of magnitude faster than the CloudScout deep neural network, deployed with HyperScout 2 on the European Space Agency PhiSat-1 mission. Strategies for incorporating fault tolerance mechanisms are expounded

University of Strathclyde Institutional Repository

Support-reducing decomposition for FPGA mapping

Author: Cortadella Jordi
Machado Lucas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

Decomposition is a technology-independent process, in which a large complex function is broken into smaller, less complex functions. The costs of two-level or factored-form representations (cubes and literals) are used in most decomposition methods, as they have a high correlation with the area of cell-based designs. However, this correlation is weaker for field-programmable gate arrays (FPGAs) based on look-up tables. Furthermore, local optimizations have limited power due to the structural bias of the circuit descriptions. This paper tries to reduce the structural biasing by remapping the LUT network and decomposing the derived functions using the support as cost function. The proposed method improves the FPGA mapping results of a commercial tool for the 20 largest MCNC benchmarks, with gains of 28% in delay plus 18% in area when targeting delay, and a reduction of 28% in area plus 14% in delay with area as cost function. Results with 23% less area and 6% less delay are obtained after physical synthesis (post place-and-route). Moreover, 12 of the best known results for delay (and 3 for area) of the EPFL benchmarks are improved.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Recommended from our members

Efficient FPGA implementation and power modelling of image and signal processing IP cores

Author: Chandrasekaran Shrutisagar
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2007
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number ofimage and signal processing application areas such as consumer electronics, instrumentation, medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power leading to a number of issues including reduced mobility, reliability concerns and increased design cost among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution. A pioneering study into the influence of supply voltage scaling for FPGA based designs, used in conjunction with performance enhancing strategies such as parallelism and pipelining has been performed. Initial results are very promising and indicated significant potential for future research in this area. A key contribution of this work includes the development of a novel high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed

Brunel University Research Archive

Automatic design of domain-specific instructions for low-power processors

Author: Alvarez Carlos
Eeckhout Lieven
González-Álvarez Cecilia
Jimenez-Gonzalez Daniel
Sartor Jennifer
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

This paper explores hardware specialization of low power processors to improve performance and energy efficiency. Our main contribution is an automated framework that analyzes instruction sequences of applications within a domain at the loop body level and identifies exactly and partially-matching sequences across applications that can become custom instructions. Our framework transforms sequences to a new code abstraction, a Merging Diagram, that improves similarity identification, clusters alike groups of potential custom instructions to effectively reduce the search space, and selects merged custom instructions to efficiently exploit the available customizable area. For a set of 11 media applications, our fast framework generates instructions that significantly improve the energy-delay product and speed up, achieving more than double the savings as compared to a technique analyzing sequences within basic blocks. This paper shows that partially-matched custom instructions, which do not significantly increase design time, are crucial to achieving higher energy efficiency at limited hardware areas

Crossref

Ghent University Academic Bibliography

Nanosecond machine learning event classification with boosted decision trees in FPGA for high energy physics

Author: Carlson Benjamin
Eubanks Brandon
Hong Tae Min
Racz Stephen
Roche Stephen
Stelzer Joerg
Stumpp Daniel
Publication venue: 'IOP Publishing'
Publication date: 17/08/2021
Field of study

We present a novel implementation of classification using the machine learning / artificial intelligence method called boosted decision trees (BDT) on field programmable gate arrays (FPGA). The firmware implementation of binary classification requiring 100 training trees with a maximum depth of 4 using four input variables gives a latency value of about 10 ns, independent of the clock speed from 100 to 320 MHz in our setup. The low timing values are achieved by restructuring the BDT layout and reconfiguring its parameters. The FPGA resource utilization is also kept low at a range from 0.01% to 0.2% in our setup. A software package called fwXmachina achieves this implementation. Our intended user is an expert of custom electronics-based trigger systems in high energy physics experiments or anyone that needs decisions at the lowest latency values for real-time event classification. Two problems from high energy physics are considered, in the separation of electrons vs. photons and in the selection of vector boson fusion-produced Higgs bosons vs. the rejection of the multijet processes.Comment: 66 pages, 27 figures, 13 tables, JINST versio

arXiv.org e-Print Archive

Digital electronic predistortion for optical communications

Author: Waegemans R.
Publication venue: UCL (University College London)
Publication date: 28/05/2010
Field of study

The distortion of optical signals has long been an issue limiting the performance of communication systems. With the increase of transmission speeds the effects of distortion are becoming more prominent. Because of this, the use of methods known from digital signal processing (DSP) are being introduced to compensate for them. Applying DSP to improve optical signals has been limited by a discrepancy in digital signal processing speeds and optical transmission speeds. However high speed Field Programmable Gate Arrays (FPGA) which are sufficiently fast have now become available making DSP experiments without costly ASIC implementation possible for optical transmission experiments. This thesis focuses on Look Up Table (LUT) based digital Electronic Predistortion (EPD) for optical transmission. Because it is only one out of many possible implementations of EPD, it has to be placed in context with other EPD techniques and other distortion combating techniques in general, especially since it is possible to combine the different techniques. Building an actual transmitter means that compromises and decisions have to be made in the design and implementation of an EPD based system. These are based on balancing the desire to achieve optimal performance with technological and economic limitations. This is partly done using optical simulations to asses the performance. This thesis describes a novel experimental transmitter that has been built as part of this research applying LUT based EPD to an optical signal. The experimental transmitter consists of a digital design (using a hardware description language) for a pair of FPGAs and an analogue optical/electronic setup including two standard DAC integrated circuits. The DSP in the transmitter compensated for both chromatic dispersion and self phase modulation. We achieved transmission of 10.7 Gb/s non-return-to-zero (NRZ) signals with a +4 dBm launch power over 450 km keeping the required optical-signal-to-noise-ratio (OSNR) for a bit-error-rate of 2x10^{-3} below 11 dB. In doing so we showed experimentally, for the first time, that nonlinear effects can be compensated with this approach and that the combination of FPGA-DAC is a viable approach for an experimental setup

UCL Discovery

Anti-Tamper Method for Field Programmable Gate Arrays Through Dynamic Reconfiguration and Decoy Circuits

Author: Stone Samuel J.
Publication venue: AFIT Scholar
Publication date: 05/03/2008
Field of study

As Field Programmable Gate Arrays (FPGAs) become more widely used, security concerns have been raised regarding FPGA use for cryptographic, sensitive, or proprietary data. Storing or implementing proprietary code and designs on FPGAs could result in the compromise of sensitive information if the FPGA device was physically relinquished or remotely accessible to adversaries seeking to obtain the information. Although multiple defensive measures have been implemented (and overcome), the possibility exists to create a secure design through the implementation of polymorphic Dynamically Reconfigurable FPGA (DRFPGA) circuits. Using polymorphic DRFPGAs removes the static attributes from their design; thus, substantially increasing the difficulty of successful adversarial reverse-engineering attacks. A variety of dynamically reconfigurable methodologies exist for implementation that challenge designers in the reconfigurable technology field. A Hardware Description Language (HDL) DRFPGA model is presented for use in security applications. The Very High Speed Integrated Circuit HDL (VHSIC) language was chosen to take advantage of its capabilities, which are well suited to the current research. Additionally, algorithms that explicitly support granular autonomous reconfiguration have been developed and implemented on the DRFPGA as a means of protecting its designs. Documented testing validates the reconfiguration results and compares power usage, timing, and area estimates from a conventional and DRFPGA model

AFTI Scholar (Air Force Institute of Technology)

Null Convention Logic applications of asynchronous design in nanotechnology and cryptographic security

Author: Wu Jun
Publication venue: Scholars\u27 Mine
Publication date: 01/01/2012
Field of study

This dissertation presents two Null Convention Logic (NCL) applications of asynchronous logic circuit design in nanotechnology and cryptographic security. The first application is the Asynchronous Nanowire Reconfigurable Crossbar Architecture (ANRCA); the second one is an asynchronous S-Box design for cryptographic system against Side-Channel Attacks (SCA). The following are the contributions of the first application: 1) Proposed a diode- and resistor-based ANRCA (DR-ANRCA). Three configurable logic block (CLB) structures were designed to efficiently reconfigure a given DR-PGMB as one of the 27 arbitrary NCL threshold gates. A hierarchical architecture was also proposed to implement the higher level logic that requires a large number of DR-PGMBs, such as multiple-bit NCL registers. 2) Proposed a memristor look-up-table based ANRCA (MLUT-ANRCA). An equivalent circuit simulation model has been presented in VHDL and simulated in Quartus II. Meanwhile, the comparison between these two ANRCAs have been analyzed numerically. 3) Presented the defect-tolerance and repair strategies for both DR-ANRCA and MLUT-ANRCA. The following are the contributions of the second application: 1) Designed an NCL based S-Box for Advanced Encryption Standard (AES). Functional verification has been done using Modelsim and Field-Programmable Gate Array (FPGA). 2) Implemented two different power analysis attacks on both NCL S-Box and conventional synchronous S-Box. 3) Developed a novel approach based on stochastic logics to enhance the resistance against DPA and CPA attacks. The functionality of the proposed design has been verified using an 8-bit AES S-box design. The effects of decision weight, bitstream length, and input repetition times on error rates have been also studied. Experimental results shows that the proposed approach enhances the resistance to against the CPA attack by successfully protecting the hidden key --Abstract, page iii

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine