354 research outputs found
Binary object recognition system on FPGA with bSOM
The Tri-state Self Organizing Map (bSOM), which takes binary inputs and maintains tri-state weights, is used in this paper for classification rather than clustering. The major contribution is a demonstration of the modified bSOM's potential use in security surveillance, as a recognition system on FPGA.
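As a rough illustration of how binary inputs can be matched against tri-state weights, the sketch below assumes each weight is 0, 1, or a "don't care" symbol, and that the winning node is the one with the fewest committed weights disagreeing with the input. The `#` symbol, the distance rule, the node labels, and the function names are all assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical sketch: tri-state weights are written as '0', '1', or '#'
# ('#' is an assumed "don't care" symbol that matches either input bit).
DONT_CARE = "#"

def tri_state_distance(bits, weights):
    """Count positions where a committed weight ('0' or '1') disagrees with the input bit."""
    return sum(1 for b, w in zip(bits, weights) if w != DONT_CARE and w != b)

def classify(bits, nodes):
    """Return the label of the node whose weights are closest to the binary input."""
    label, _ = min(nodes, key=lambda node: tri_state_distance(bits, node[1]))
    return label

# Illustrative labelled nodes (labels and weights are made up for this example).
nodes = [
    ("person", "11#0"),
    ("vehicle", "00#1"),
]

print(classify("1110", nodes))
```

A distance of this kind needs only bitwise comparisons and a population count, which is one plausible reason a binary-input map suits FPGA logic.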
Real-time implementation of 3D LiDAR point cloud semantic segmentation in an FPGA
Master's dissertation in Informatics Engineering
In the last few years, the automotive industry has relied heavily on deep learning applications for
perception solutions. With data-heavy sensors, such as LiDAR, becoming a standard, the task of
developing low-power and real-time applications has become increasingly more challenging. To obtain
the maximum computational efficiency, no longer can one focus solely on the software aspect of such
applications, while disregarding the underlying hardware.
In this thesis, a hardware-software co-design approach is used to implement an inference application
leveraging the SqueezeSegV3, a LiDAR-based convolutional neural network, on the Versal ACAP VCK190
FPGA. Automotive requirements carefully drive the development of the proposed solution, with real-time
performance and low power consumption being the target metrics.
A first experiment validates the suitability of Xilinx’s Vitis-AI tool for the deployment of deep
convolutional neural networks on FPGAs. Both the ResNet-18 and SqueezeNet neural networks are
deployed to the Zynq UltraScale+ MPSoC ZCU104 and Versal ACAP VCK190 FPGAs. The results show
that both networks far exceed the real-time requirements while consuming little power.
Compared to an NVIDIA RTX 3090 GPU, the performance per watt during inference of the two networks is
12x and 47.8x higher on the Zynq UltraScale+ MPSoC ZCU104, and 15.1x and 26.6x higher on
the Versal ACAP VCK190 FPGA. These results are obtained with no drop in accuracy in the quantization
step.
A second experiment builds upon the results of the first by deploying a real-time application containing
the SqueezeSegV3 model using the Semantic-KITTI dataset. A framerate of 11 Hz is achieved with a peak
power consumption of 78 Watts. The quantization step results in a minimal accuracy and IoU degradation
of 0.7 and 1.5 points respectively. A smaller version of the same model is also deployed achieving a
framerate of 19 Hz and a peak power consumption of 76 Watts. The application performs semantic
segmentation over all the point cloud with a field of view of 360°.
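The reported framerates and peak power figures can be combined into a rough energy-per-frame estimate. The sketch below is a back-of-the-envelope calculation, not a figure from the thesis: it divides peak power by framerate, so the result is only an upper bound on the energy per frame (average power is presumably lower than peak). The function name is hypothetical.

```python
def energy_per_frame_upper_bound(peak_power_w, framerate_hz):
    """Upper bound on joules per frame, assuming the board draws peak power throughout."""
    return peak_power_w / framerate_hz

# Figures quoted in the abstract: (peak power in W, framerate in Hz).
full_model = energy_per_frame_upper_bound(78, 11)
small_model = energy_per_frame_upper_bound(76, 19)

print(f"SqueezeSegV3:    at most {full_model:.2f} J per frame")
print(f"Smaller variant: at most {small_model:.2f} J per frame")
```

Under this assumption the smaller model uses a bit more than half the energy per frame of the full model, since its peak power is nearly the same while its framerate is substantially higher.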
A High-Performance System Architecture for Medical Imaging
Medical imaging is classified into different modalities such as ultrasound, X-ray, computed tomography (CT), positron emission tomography (PET), magnetic resonance imaging (MRI), single-photon emission tomography (SPECT), nuclear medicine (NM), mammography, and fluoroscopy. It comprises various diagnostic and treatment techniques and methods for modeling the human body, and therefore plays an essential role in improving community health care. Medical imaging scans (such as X-ray and CT) are essential in a variety of health-care environments. With improved health-care management and the increasing availability of medical imaging equipment, the number of imaging-based systems worldwide is growing. Effective, safe, and high-quality imaging is essential for medical decision-making. In this chapter, we propose a high-performance hardware architecture and software programming toolkit for medical imaging, called the high-performance medical imaging system (HPMIS). The HPMIS can perform medical image registration, storage, and processing in hardware, with the support of C/C++ function calls. The system is easy to program and delivers high performance for a range of medical imaging applications.