1,362 research outputs found

Hardware Accelarated Visual Tracking Algorithms. A Systematic Literature Review

Author: Korhonen Sirpa
Lahdenoja Olli
Lehtonen Teijo
Sakari Leo
Säntti Tero
Publication venue: University of Turku, Technology Research Center
Publication date: 14/08/2015

Many industrial applications need object recognition and tracking capabilities. The algorithms developed for those purposes are computationally expensive. Yet real-time performance, high accuracy, and low power consumption are essential measures of the system. When all these requirements are combined, hardware acceleration of these algorithms becomes a feasible solution. The purpose of this study is to analyze the current state of these hardware acceleration solutions: which algorithms have been implemented in hardware, and what modifications have been made to adapt these algorithms to hardware.

UTUPub

HARDWARE ACCELARATED VISUAL TRACKING ALGORITHMS – A Systematic Literature Review

Author: Leo Sakari
Olli Lahdenoja
Sirpa Korhonen
Teijo Lehtonen
Tero Säntti
Publication venue: Society of Social and Economic Research in the Universities of Turku
Publication date: 28/10/2022

UTUPub

Harnessing Reconfigurable Hardware to Design Heterogeneous Systems

Author: Iordanou Konstantinos
Publication date: 01/08/2023

The University of Manchester - Institutional Repository

TreeBASIS Feature Descriptor and Its Hardware Implementation

Author: Alok Desai
Dah-Jye Lee
Dan Ventura
James Archibald
Spencer Fowers
Publication venue: Hindawi Limited
Publication date: 01/01/2014

This paper presents a novel feature descriptor called TreeBASIS that provides improvements in descriptor size, computation time, matching speed, and accuracy. This new descriptor uses a binary vocabulary tree that is computed using basis dictionary images and a test set of feature region images. To facilitate real-time implementation, a feature region image is binary quantized and the resulting quantized vector is passed into the BASIS vocabulary tree. A Hamming distance is then computed between the feature region image and the effectively descriptive basis dictionary image at a node to determine the branch taken, and the path the feature region image takes is saved as a descriptor. The TreeBASIS feature descriptor is an excellent candidate for hardware implementation because of its reduced descriptor size and the fact that descriptors can be created and features matched without the use of floating-point operations. The TreeBASIS descriptor is more computationally and space efficient than other descriptors such as BASIS, SIFT, and SURF. Moreover, it can be computed entirely in hardware without the support of a CPU for additional software-based computations. Experimental results and a hardware implementation show that the TreeBASIS descriptor compares well with other descriptors for frame-to-frame homography computation while requiring fewer hardware resources.
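
The tree walk described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the quantization rule (comparing each pixel with the patch mean), the per-node threshold, the heap-ordered tree layout, and the names `binary_quantize`, `hamming`, and `tree_descriptor` are all assumptions made for illustration. Only the general idea (binary quantization, a Hamming distance at each node, and the path stored as the descriptor) comes from the abstract.

```python
import numpy as np

def binary_quantize(patch):
    """Quantize a patch to bits using integer arithmetic only.

    Assumed rule: a bit is 1 where the pixel exceeds the patch mean,
    written as pixel * N > sum so that no division is needed.
    """
    patch = np.asarray(patch, dtype=np.int64)
    return (patch * patch.size > patch.sum()).astype(np.uint8).ravel()

def hamming(a, b):
    """Number of positions where two bit vectors differ."""
    return int(np.count_nonzero(a != b))

def tree_descriptor(bits, basis, thresholds, depth):
    """Walk a complete binary tree stored in heap order.

    basis[node] is the (hypothetical) binary basis image at that node and
    thresholds[node] the Hamming distance above which the right branch is
    taken. The list of branch decisions is returned as the descriptor.
    """
    node = 0
    path = []
    for _ in range(depth):
        go_right = hamming(bits, basis[node]) > thresholds[node]
        path.append(int(go_right))
        node = 2 * node + 1 + int(go_right)
    return path

# Toy example: a 2x2 patch and a depth-2 tree (three internal nodes).
patch = [[0, 10], [10, 0]]
basis = np.array([[0, 1, 1, 0], [1, 0, 0, 1], [0, 0, 0, 0]], dtype=np.uint8)
thresholds = [1, 1, 1]
descriptor = tree_descriptor(binary_quantize(patch), basis, thresholds, depth=2)
```

Under these assumptions two descriptors can be compared with the same `hamming` helper, so both creation and matching stay in integer arithmetic.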

Crossref

Directory of Open Access Journals

Low power and high performance heterogeneous computing on FPGAs

Author: Ma Liang
Publication venue: Politecnico di Torino

The abstract is in the attachment.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Reconfigurable acceleration of Recurrent Neural Networks

Author: Que Zhiqiang
Publication venue: Computing, Imperial College London
Publication date: 01/09/2023

Recurrent Neural Networks (RNNs) have been successful in a wide range of applications involving temporal sequences such as natural language processing, speech recognition and video analysis. However, RNNs often require a significant amount of memory and computational resources. In addition, the recurrent nature and data dependencies in RNN computations can lead to system stall, resulting in low throughput and high latency. This work describes novel parallel hardware architectures for accelerating RNN inference using Field-Programmable Gate Array (FPGA) technology, which considers the data dependencies and high computational costs of RNNs. The first contribution of this thesis is a latency-hiding architecture that utilizes column-wise matrix-vector multiplication instead of the conventional row-wise operation to eliminate data dependencies and improve the throughput of RNN inference designs. This architecture is further enhanced by a configurable checkerboard tiling strategy which allows large dimensions of weight matrices, while supporting element-based parallelism and vector-based parallelism. The presented reconfigurable RNN designs show significant speedup over CPU, GPU, and other FPGA designs. The second contribution of this thesis is a weight reuse approach for large RNN models with weights stored in off-chip memory, running with a batch size of one. A novel blocking-batching strategy is proposed to optimize the throughput of large RNN designs on FPGAs by reusing the RNN weights. Performance analysis is also introduced to enable FPGA designs to achieve the best trade-off between area, power consumption and performance. Promising power efficiency improvement has been achieved in addition to speeding up over CPU and GPU designs. The third contribution of this thesis is a low latency design for RNNs based on a partially-folded hardware architecture. 
It also introduces a technique that balances the initiation interval of multi-layer RNN inference to increase hardware efficiency and throughput while reducing latency. The approach is evaluated on a variety of applications, including gravitational wave detection and Bayesian RNN-based ECG anomaly detection. To facilitate the use of this approach, we open-source an RNN template which enables the generation of low-latency FPGA designs with efficient resource utilization using high-level synthesis tools.
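
The difference between row-wise and column-wise matrix-vector multiplication mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the thesis's FPGA architecture: the function names and the use of a generator to model input elements arriving one at a time are assumptions.

```python
import numpy as np

def matvec_row_wise(W, x):
    """Conventional order: each output element needs the whole vector x."""
    y = np.zeros(W.shape[0])
    for i in range(W.shape[0]):
        y[i] = np.dot(W[i, :], x)
    return y

def matvec_column_wise(W, x_stream):
    """Column order: accumulate W[:, j] * x[j] as soon as x[j] arrives.

    x_stream may be any iterable, for example a generator that yields the
    elements of the previous hidden state one by one (an assumption here).
    """
    y = np.zeros(W.shape[0])
    for j, xj in enumerate(x_stream):
        y += W[:, j] * xj
    return y

W = np.arange(6, dtype=float).reshape(2, 3)
x = np.array([1.0, 2.0, 3.0])
y_row = matvec_row_wise(W, x)
y_col = matvec_column_wise(W, (v for v in x))
```

Both functions compute the same product; the column-wise form simply does not have to wait for the full input vector before starting work, which is the property the abstract relies on.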

Spiral - Imperial College Digital Repository

Analysis and Hardware In the Loop Testing of ADCS Algorithm for the CubeSat 3AMADEUS

Author: Pontes Hugo Brandão
Publication date: 22/01/2021

One of the main challenges with CubeSat ADCSs (Attitude Determination and Control Subsystems) is how heavy and power-hungry the most precise systems are. Developing lighter systems with lower power consumption is therefore of great importance. 3AMADEUS is a mission that aims to find a solution to this exact problem. Magnetic ADCS components are among the lightest, least power-consuming, and most reliable options in the CubeSat industry. However, due to their low precision, components of this kind cannot be used on their own in missions that require precise attitude control. One way to improve their precision is to use novel ADCS algorithms that maximize system performance for magnetic ADCSs. That is why 3AMADEUS aims not only to develop several of these algorithms but also to test them in flight, in the hope that purely magnetic ADCSs can one day be used widely in nanosatellites. To enable an analysis of which algorithms should be implemented in the 3AMADEUS mission, this work presents a satellite attitude model that allows for a SIL (Software In the Loop) simulation. Furthermore, a HIL (Hardware In the Loop) simulation is carried out to validate the use of an FPGA (Field Programmable Gate Array) for implementing this kind of algorithm, since the use of FPGAs in CubeSats has been rising significantly and is particularly interesting in a project where reprogrammability is useful. Since the algorithms for this mission are still under development, a purely magnetic ADCS algorithm developed in another context is tested in a SIL environment, where its performance in terms of accuracy and stabilization, as well as its suitability for the 3AMADEUS mission, is analyzed under different conditions. Finally, one of these tests is repeated in a HIL simulation that does not consider attitude determination.
The results of this simulation are compared to those obtained in the SIL test, providing relevant data on the feasibility and performance of a real-life ADCS algorithm implementation on an FPGA.

UBibliorum repositorio digital da ubi