74 research outputs found

    Fast fluorescence lifetime imaging and sensing via deep learning

    Error on title page – year of award is 2023. Fluorescence lifetime imaging microscopy (FLIM) has become a valuable tool in diverse disciplines. This thesis presents deep learning (DL) approaches to addressing two major challenges in FLIM: slow and complex data analysis and the high photon budget for precisely quantifying the fluorescence lifetimes. DL's ability to extract high-dimensional features from data has revolutionized optical and biomedical imaging analysis. This thesis contributes several novel DL FLIM algorithms that significantly expand FLIM's scope. Firstly, a hardware-friendly pixel-wise DL algorithm is proposed for fast FLIM data analysis. The algorithm has a simple architecture yet can effectively resolve multi-exponential decay models. Its calculation speed and accuracy significantly outperform conventional methods. Secondly, a DL algorithm is proposed to improve FLIM image spatial resolution, obtaining high-resolution (HR) fluorescence lifetime images from low-resolution (LR) images. A computational framework is developed to generate large-scale semi-synthetic FLIM datasets, addressing the lack of sufficient high-quality FLIM datasets. This algorithm offers a practical approach to obtaining HR FLIM images quickly for FLIM systems. Thirdly, a DL algorithm, named the Few-Photon Fluorescence Lifetime Imaging (FPFLI) algorithm, is developed to analyze FLIM images with only a few photons per pixel. FPFLI uses spatial correlation and intensity information to robustly estimate the fluorescence lifetime images, pushing the photon budget to a record-low level of only a few photons per pixel. Finally, a time-resolved flow cytometry (TRFC) system is developed by integrating an advanced CMOS single-photon avalanche diode (SPAD) array and a DL processor. The SPAD array, using a parallel light detection scheme, shows an excellent photon-counting throughput. A quantized convolutional neural network (QCNN) algorithm is designed and implemented on a field-programmable gate array as an embedded processor. The processor resolves fluorescence lifetimes against disturbing noise, showing unparalleled high accuracy, fast analysis speed, and low power consumption.

    Implementation of Efficient Multilayer Perceptron ANN Neurons on Field Programmable Gate Array Chip

    Artificial Neural Networks (ANNs) are widely used to learn from system data in many types of applications. The capability of Integrated Circuit (IC) based ANN structures also depends on the hardware backbone used for their implementation. In this work, a Field Programmable Gate Array (FPGA) based Multilayer Perceptron Artificial Neural Network (MLP-ANN) neuron is developed. Experiments were carried out to demonstrate the hardware realization of the artificial neuron using an FPGA. Two different activation functions (i.e. tan-sigmoid and log-sigmoid) were tested for the implementation of the proposed neuron. Simulation results show that tan-sigmoid with a high index (i.e. k >= 40) is a better choice of sigmoid activation function for the hardware implementation of an MLP-ANN neuron.

    FPGA implementations of feed forward neural network by using floating point hardware accelerators

    This paper analyzes different solutions for implementing a Neural Network architecture on an FPGA design using floating point accelerators. In particular, two implementations are investigated: a high level solution that creates a neural network on a soft processor design, with different strategies for enhancing its performance; and a low level solution, achieved by a cascade of floating point arithmetic elements. The architectures are compared in terms of both execution time and FPGA resources employed.

    Accelerated artificial neural networks on FPGA for fault detection in automotive systems

    Modern vehicles are complex distributed systems with critical real-time electronic controls that have progressively replaced their mechanical/hydraulic counterparts, for performance and cost benefits. The harsh and varying vehicular environment can induce multiple errors in the computational/communication path, with temporary or permanent effects, thus demanding the use of fault-tolerant schemes. Constraints in location, weight, and cost prevent the use of physical redundancy for critical systems in many cases, such as within an internal combustion engine. Alternatively, algorithmic techniques like artificial neural networks (ANNs) can be used to detect errors and apply corrective measures in computation. Though the adaptability of ANNs presents advantages for fault-detection and fault-tolerance measures for critical sensors, implementations on automotive grade processors may not meet the required hard deadlines and accuracy simultaneously. In this work, we present an ANN-based fault-tolerance system based on hybrid FPGAs and evaluate it using a diesel engine case study. We show that the hybrid platform outperforms an optimised software implementation on an automotive grade ARM Cortex M4 processor in terms of latency and power consumption, while also providing better consolidation.

    A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking

    Vision Transformer (ViT) architectures are becoming increasingly popular and widely employed to tackle computer vision applications. Their main feature is the capacity to extract global information through the self-attention mechanism, outperforming earlier convolutional neural networks. However, ViT deployment and performance have grown steadily with their size, number of trainable parameters, and operations. Furthermore, self-attention's computational and memory cost increases quadratically with the image resolution. Generally speaking, it is challenging to employ these architectures in real-world applications due to many hardware and environmental restrictions, such as processing and computational capabilities. Therefore, this survey investigates the most efficient methodologies to ensure sub-optimal estimation performances. In more detail, four categories of efficiency techniques are analyzed: compact architecture, pruning, knowledge distillation, and quantization strategies. Moreover, a new metric called Efficient Error Rate is introduced in order to normalize and compare the model features that affect hardware devices at inference time, such as the number of parameters, bits, FLOPs, and model size. In summary, this paper first mathematically defines the strategies used to make Vision Transformers efficient, describes and discusses state-of-the-art methodologies, and analyzes their performance over different application scenarios. Toward the end of the paper, we also discuss open challenges and promising research directions.
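    The quadratic growth of self-attention cost mentioned in this abstract can be illustrated with a little arithmetic. The sketch below is not taken from the survey: the function name `attention_cost`, the patch size of 16, and the embedding width of 768 are assumptions chosen for illustration, and the class token and the linear projections are ignored.

```python
def attention_cost(height, width, patch=16, dim=768):
    """Rough per-layer self-attention cost for a ViT-style model.

    Assumptions (illustrative only): square non-overlapping patches,
    no class token, and only the two attention matrix products counted.
    """
    tokens = (height // patch) * (width // patch)
    # The attention score matrix holds one entry per token pair.
    score_entries = tokens * tokens
    # Q @ K^T and scores @ V each take about tokens^2 * dim multiply-accumulates.
    macs = 2 * tokens * tokens * dim
    return tokens, score_entries, macs


if __name__ == "__main__":
    for side in (224, 448):
        tokens, entries, macs = attention_cost(side, side)
        print(f"{side}x{side}: tokens={tokens}, score entries={entries}, MACs={macs}")
```

    Under these assumptions, doubling the image side length quadruples the token count, so the score matrix and the attention multiply-accumulates grow by a factor of 16.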

    Field programmable gate array based sigmoid function implementation using differential lookup table and second order nonlinear function

    The artificial neural network (ANN) is an established artificial intelligence technique that is widely used for solving numerous problems, such as classification and clustering, in various fields. However, a major problem with ANNs is execution time: an ANN with a large number of neurons takes a long time to execute. To overcome this, ANNs are implemented in hardware, namely the field-programmable gate array (FPGA). However, implementing an ANN in an FPGA leads to a new problem related to the sigmoid function. Often used as the activation function for ANNs, the sigmoid function cannot be directly implemented in an FPGA. Owing to its accuracy, the lookup table (LUT) has commonly been used to implement the sigmoid function in FPGAs. However, achieving high accuracy with a LUT is expensive, particularly in terms of its memory requirements in the FPGA. The second-order nonlinear function (SONF) is an appealing replacement for the LUT due to its small memory requirement, although there is a trade-off between accuracy and memory size. Taking advantage of both approaches, this thesis proposes a combination of SONF and a modified LUT, namely the differential lookup table (dLUT). The deviation values between the SONF and the sigmoid function are used to create the dLUT. As a first step, the SONF is used to approximate the sigmoid function. As a second step, the value stored in the dLUT is added to or subtracted from this approximation, as demonstrated via simulation. This combination successfully reduces the deviation, and the reduction is significant compared to previous implementations such as SONF alone and LUT alone. Further simulation has been carried out to evaluate the accuracy of an ANN that uses the proposed method as its sigmoid function when detecting an object in an indoor environment. The results show that the proposed method produces output almost as accurate as a software implementation in detecting the target in indoor positioning problems. Therefore, the proposed method can be applied in any field that demands higher processing speed and high accuracy in the sigmoid function output.
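    The two-step scheme described in this abstract (a SONF approximation followed by a dLUT correction) can be sketched in software as follows. This is a floating-point illustration under stated assumptions, not the thesis's FPGA design: the particular SONF used here is a commonly cited piecewise second-order approximation, and the table size, input range, bin-centre sampling, and all function names are hypothetical choices.

```python
import math

X_MAX = 8.0    # assumed input range covered by the table (|x| < 8)
ENTRIES = 64   # assumed number of dLUT entries
STEP = X_MAX / ENTRIES


def sigmoid(x):
    """Reference log-sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def sonf(x):
    """Assumed second-order nonlinear function: piecewise quadratic on (-4, 4)."""
    if x >= 4.0:
        return 1.0
    if x <= -4.0:
        return 0.0
    if x >= 0.0:
        return 1.0 - 0.5 * (1.0 - x / 4.0) ** 2
    return 0.5 * (1.0 + x / 4.0) ** 2


# dLUT: deviation between sigmoid and SONF, sampled at bin centres for x >= 0.
# Both functions satisfy f(-x) = 1 - f(x), so one table serves both signs.
DLUT = [sigmoid((i + 0.5) * STEP) - sonf((i + 0.5) * STEP) for i in range(ENTRIES)]


def sigmoid_dlut(x):
    """Step 1: SONF approximation. Step 2: add or subtract the stored deviation."""
    idx = min(int(abs(x) / STEP), ENTRIES - 1)
    correction = DLUT[idx]
    return sonf(x) + correction if x >= 0.0 else sonf(x) - correction


def max_abs_error(approx, lo=-10.0, hi=10.0, samples=4001):
    """Largest deviation from the reference sigmoid on a uniform grid."""
    worst = 0.0
    for n in range(samples):
        x = lo + (hi - lo) * n / (samples - 1)
        worst = max(worst, abs(approx(x) - sigmoid(x)))
    return worst


if __name__ == "__main__":
    print("SONF only  :", max_abs_error(sonf))
    print("SONF + dLUT:", max_abs_error(sigmoid_dlut))
```

    Because the deviation is antisymmetric about zero in this sketch, the table stores only non-negative inputs and the correction is added for x >= 0 and subtracted otherwise. A hardware version would additionally fix the word length of the table entries, which this sketch does not model.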

    Hardware implementation of a Neural Network with a microprocessor embedded in an FPGA (Implementação em hardware de uma Rede Neuronal com um microprocessador embebido em FPGA)

    The objective of this work is the hardware implementation of a Neural Network with an embedded microprocessor, which can be a valuable resource in several scientific areas. The importance of hardware implementations stems from their flexibility, higher performance, and low power consumption. For this implementation, the Xilinx Virtex II Pro XC2VP30 FPGA device with a MicroBlaze soft core was used. The MicroBlaze has advantages such as design simplicity, reusability, and easy integration with other technologies. The first phase of the work consisted of a study of the FPGA, a reconfigurable system with important characteristics such as the ability to execute complex tasks in parallel. Next, the implementation code for an Artificial Neural Network was developed in a high-level programming language. In the Neural Network implementation, the hyperbolic tangent activation function was applied in the hidden layer to provide the network's nonlinearity. The implementation uses a type of Neural Network that allows connections only in the direction of the output, called a Feedforward Neural Network (FNN). Since Artificial Neural Networks are information processing systems whose characteristics are shared with Biological Neural Networks, tests were applied to the hardware implementation, and its importance, efficiency, and performance were analyzed. Finally, in light of the results, the approach and methodology adopted, and their feasibility, were analyzed. Supervisor (Orientador): Fernando Manuel Rosmaninho Morgado Ferrão Dia