42 research outputs found

    Time domain based image generation for synthetic aperture radar on field programmable gate arrays

    Get PDF
    Aerial images are important in different scenarios including surface cartography, surveillance, disaster control, height map generation, etc. Synthetic Aperture Radar (SAR) is one way to generate these images even through clouds and in the absence of daylight. For a wide and easy usage of this technology, SAR systems should be small, mounted to Unmanned Aerial Vehicles (UAVs) and process images in real-time. Since UAVs are small and lightweight, more robust (but also more complex) time-domain algorithms are required for good image quality in case of heavy turbulence. Typically the SAR data set size does not allow for ground transmission and processing, while the UAV size does not allow for huge systems and high power consumption to process the data. A small and energy-efficient signal processing system is therefore required. To fill the gap between existing systems that are capable of either high-speed processing or low power consumption, the focus of this thesis is the analysis, design, and implementation of such a system. A survey shows that most architectures either have to high power budgets or too few processing capabilities to match real-time requirements for time-domain-based processing. Therefore, a Field Programmable Gate Array (FPGA) based system is designed, as it allows for high performance and low-power consumption. The Global Backprojection (GBP) is implemented, as it is the standard time-domain-based algorithm which allows for highest image quality at arbitrary trajectories at the complexity of O(N3). To satisfy real-time requirements under all circumstances, the accelerated Fast Factorized Backprojection (FFBP) algorithm with a complexity of O(N2logN) is implemented as well, to allow for a trade-off between image quality and processing time. Additionally, algorithm and design are enhanced to correct the failing assumptions for Frequency Modulated Continuous Wave (FMCW) Radio Detection And Ranging (Radar) data at high velocities. Such sensors offer high-resolution data at considerably low transmit power which is especially interesting for UAVs. A full analysis of all algorithms is carried out, to design a highly utilized architecture for maximum throughput. The process covers the analysis of mathematical steps and approximations for hardware speedup, the analysis of code dependencies for instruction parallelism and the analysis of streaming capabilities, including memory access and caching strategies, as well as parallelization considerations and pipeline analysis. Each architecture is described in all details with its surrounding control structure. As proof of concepts, the architectures are mapped on a Virtex 6 FPGA and results on resource utilization, runtime and image quality are presented and discussed. A special framework allows to scale and port the design to other FPGAs easily and to enable for maximum resource utilization and speedup. The result is streaming architectures that are capable of massive parallelization with a minimum in system stalls. It is shown that real-time processing on FPGAs with strict power budgets in time-domain is possible with the GBP (mid-sized images) and the FFBP (any image size with a trade-off in quality), allowing for a UAV scenario

    Radiation safety based on the sky shine effect in reactor

    Get PDF
    In the reactor operation, neutrons and gamma rays are the most dominant radiation. As protection, lead and concrete shields are built around the reactor. However, the radiation can penetrate the water shielding inside the reactor pool. This incident leads to the occurrence of sky shine where a physical phenomenon of nuclear radiation sources was transmitted panoramic that extends to the environment. The effect of this phenomenon is caused by the fallout radiation into the surrounding area which causes the radiation dose to increase. High doses of exposure cause a person to have stochastic effects or deterministic effects. Therefore, this study was conducted to measure the radiation dose from sky shine effect that scattered around the reactor at different distances and different height above the reactor platform. In this paper, the analysis of the radiation dose of sky shine effect was measured using the experimental metho

    AI and ML Accelerator Survey and Trends

    Full text link
    This paper updates the survey of AI accelerators and processors from past three years. This paper collects and summarizes the current commercial accelerators that have been publicly announced with peak performance and power consumption numbers. The performance and power values are plotted on a scatter graph, and a number of dimensions and observations from the trends on this plot are again discussed and analyzed. Two new trends plots based on accelerator release dates are included in this year's paper, along with the additional trends of some neuromorphic, photonic, and memristor-based inference accelerators.Comment: 10 pages, 4 figures, 2022 IEEE High Performance Extreme Computing (HPEC) Conference. arXiv admin note: substantial text overlap with arXiv:2009.00993, arXiv:2109.0895

    Parallelization and improvement of beamforming process in synthetic aperture systems for real-time ultrasonic image generation

    Get PDF
    Tesis inédita de la Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, leída el 9-02-2016La ecografía es hoy en día uno de los métodos de visualización más populares para examinar el interior de cuerpos opacos. Su aplicación es especialmente significativa tanto en el campo del diagnóstico médico como en las aplicaciones de evaluación no destructiva en el ámbito industrial, donde se evalúa la integridad de un componente o una estructura. El desarrollo de sistemas ecográficos de alta calidad y con buenas prestaciones se basa en el empleo de sistemas multisensoriales conocidos como arrays que pueden estar compuestos por varias decenas de elementos. El desarrollo de estos dispositivos tiene asociada una elevada complejidad, tanto por el número de sensores y la electrónica necesaria para la adquisición paralela de señales, como por la etapa de procesamiento de los datos adquiridos que debe operar en tiempo real. Esta etapa de procesamiento de señal trabaja con un elevado flujo de datos en paralelo y desarrolla, además de la composición de imagen, otras sofisticadas técnicas de medidas sobre los datos (medida de elasticidad, flujo, etc). En este sentido, el desarrollo de nuevos sistemas de imagen con mayores prestaciones (resolución, rango dinámico, imagen 3D, etc) está fuertemente limitado por el número de canales en la apertura del array. Mientras algunos estudios se han centrado en la reducción activa de sensores (sparse arrays como ejemplo), otros se han centrado en analizar diferentes estrategias de adquisiciónn que, operando con un número reducido de canales electrónicos en paralelo, sean capaz por multiplexación emular el funcionamiento de una apertura plena. A estas últimas técnicas se las agrupa mediante el concepto de Técnicas de Apertura Sintética (SAFT). Su interés radica en que no solo son capaces de reducir los requerimientos hardware del sistema (bajo consumo, portabilidad, coste, etc) sino que además permiten dentro de cierto compromiso la mejora de la calidad de imagen respecto a los sistemas convencionales...Ultrasound is nowadays one of the most popular visualization methods to examine the interior of opaque objects. Its application is particularly significant in the field of medical diagnosis as well as non-destructive evaluation applications in industry. The development of high performance ultrasound imaging systems is based on the use of multisensory systems known as arrays, which may be composed by dozens of elements. The development of these devices has associated a high complexity, due to the number of sensors and electronics needed for the parallel acquisition of signals, and for the processing stage of the acquired data which must operate on real-time. This signal processing stage works with a high data flow in parallel and develops, besides the image composition, other sophisticated measure techniques (measure of elasticity, flow, etc). In this sense, the development of new imaging systems with higher performance (resolution, dynamic range, 3D imaging, etc) is strongly limited by the number of channels in array’s aperture. While some studies have been focused on the reduction of active sensors (i.e. sparse arrays), others have been centered on analysing different acquisition strategies which, operating with reduced number of electronic channels in parallel, are able to emulate by multiplexing the behavior of a full aperture. These latest techniques are grouped under the term known as Synthetic Aperture Techniques (SAFT). Their interest is that they are able to reduce hardware requirements (low power, portability, cost, etc) and also allow to improve the image quality over conventional systems...Depto. de Arquitectura de Computadores y AutomáticaFac. de InformáticaTRUEunpu

    Performance Aspects of Synthesizable Computing Systems

    Get PDF

    Deep learning-based change detection in remote sensing images:a review

    Get PDF
    Images gathered from different satellites are vastly available these days due to the fast development of remote sensing (RS) technology. These images significantly enhance the data sources of change detection (CD). CD is a technique of recognizing the dissimilarities in the images acquired at distinct intervals and are used for numerous applications, such as urban area development, disaster management, land cover object identification, etc. In recent years, deep learning (DL) techniques have been used tremendously in change detection processes, where it has achieved great success because of their practical applications. Some researchers have even claimed that DL approaches outperform traditional approaches and enhance change detection accuracy. Therefore, this review focuses on deep learning techniques, such as supervised, unsupervised, and semi-supervised for different change detection datasets, such as SAR, multispectral, hyperspectral, VHR, and heterogeneous images, and their advantages and disadvantages will be highlighted. In the end, some significant challenges are discussed to understand the context of improvements in change detection datasets and deep learning models. Overall, this review will be beneficial for the future development of CD methods

    Large-Scale Discrete Fourier Transform on TPUs

    Full text link
    In this work, we present two parallel algorithms for the large-scale discrete Fourier transform (DFT) on Tensor Processing Unit (TPU) clusters. The two parallel algorithms are associated with two formulations of DFT: one is based on the Kronecker product, to be specific, dense matrix multiplications between the input data and the Vandermonde matrix, denoted as KDFT in this work; the other is based on the famous Cooley-Tukey algorithm and phase adjustment, denoted as FFT in this work. Both KDFT and FFT formulations take full advantage of TPU's strength in matrix multiplications. The KDFT formulation allows direct use of nonuniform inputs without additional step. In the two parallel algorithms, the same strategy of data decomposition is applied to the input data. Through the data decomposition, the dense matrix multiplications in KDFT and FFT are kept local within TPU cores, which can be performed completely in parallel. The communication among TPU cores is achieved through the one-shuffle scheme in both parallel algorithms, with which sending and receiving data takes place simultaneously between two neighboring cores and along the same direction on the interconnect network. The one-shuffle scheme is designed for the interconnect topology of TPU clusters, minimizing the time required by the communication among TPU cores. Both KDFT and FFT are implemented in TensorFlow. The three-dimensional complex DFT is performed on an example of dimension 8192×8192×81928192 \times 8192 \times 8192 with a full TPU Pod: the run time of KDFT is 12.66 seconds and that of FFT is 8.3 seconds. Scaling analysis is provided to demonstrate the high parallel efficiency of the two DFT implementations on TPUs
    corecore