360 research outputs found

    An FPGA-based LOCO-ANS implementation for lossless and near-lossless image compression using high-level synthesis

    Full text link
    In this work, we present and evaluate a hardware architecture for the LOCO-ANS (Low Complexity Lossless Compression with Asymmetric Numeral Systems) lossless and near-lossless image compressor, which is based on the JPEG-LS standard. The design is implemented in two FPGA generations, and its performance is evaluated for different codec configurations. The tests show that the design is capable of up to 40.5 MPixels/s and 124 MPixels/s per lane for Zynq 7020 and UltraScale+ FPGAs, respectively. Compared to the single-threaded LOCO-ANS software implementation running on a 1.2 GHz Raspberry Pi 3B, each hardware lane achieves 6.5 times higher throughput, even when implemented in an older and cost-optimized chip like the Zynq 7020. Results are also presented for a lossless-only version, which achieves a lower footprint and approximately 50% higher performance than the version supporting both lossless and near-lossless modes. Notably, these results were obtained by applying High-Level Synthesis, describing the coder in C++, an approach that typically trades some quality of results for reduced design time. These results show that the algorithm is very suitable for hardware implementation. Moreover, the implemented system is faster and achieves higher compression than the best previously available near-lossless JPEG-LS hardware implementation. This research was funded in part by the Spanish Research Agency under the project AgileMon (AEI PID2019-104451RB-C21).
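    The abstract describes the coder only at a high level. As a rough illustration of the ANS entropy-coding core that gives LOCO-ANS its name, below is a minimal byte-wise rANS round-trip sketch in C++; the constants, table layout, and two-symbol alphabet are illustrative assumptions, not the LOCO-ANS HLS sources.

```cpp
// Minimal rANS entropy-coder sketch (illustrative only, not the LOCO-ANS code).
// Symbol probabilities are quantized to integer frequencies summing to 1 << 12.
#include <cstdint>
#include <vector>

constexpr uint32_t SCALE_BITS = 12;        // probability resolution: 4096
constexpr uint32_t RANS_L     = 1u << 23;  // lower bound of the coder state

struct Sym { uint32_t freq, cum; };        // frequency and cumulative start

// Push one symbol into state x, spilling renormalization bytes into 'bytes'.
void put(uint32_t& x, Sym s, std::vector<uint8_t>& bytes) {
    const uint32_t x_max = ((RANS_L >> SCALE_BITS) << 8) * s.freq;
    while (x >= x_max) { bytes.push_back(x & 0xFF); x >>= 8; }
    x = ((x / s.freq) << SCALE_BITS) + (x % s.freq) + s.cum;
}

// Pop one symbol index; 'bytes' is consumed from the back (reverse of emission).
int get(uint32_t& x, const std::vector<Sym>& tab, std::vector<uint8_t>& bytes) {
    const uint32_t slot = x & ((1u << SCALE_BITS) - 1);
    int s = 0;
    while (slot >= tab[s].cum + tab[s].freq) ++s;   // linear search, for clarity
    x = tab[s].freq * (x >> SCALE_BITS) + slot - tab[s].cum;
    while (x < RANS_L && !bytes.empty()) { x = (x << 8) | bytes.back(); bytes.pop_back(); }
    return s;
}

int main() {
    // Two-symbol alphabet with P(0) = 3/4, P(1) = 1/4, quantized to 4096.
    std::vector<Sym> tab = { {3072, 0}, {1024, 3072} };
    std::vector<int> msg = {0, 0, 1, 0, 1, 1, 0, 0};

    uint32_t x = RANS_L;
    std::vector<uint8_t> bytes;
    for (auto it = msg.rbegin(); it != msg.rend(); ++it)  // encode in reverse...
        put(x, tab[*it], bytes);
    for (int want : msg)                                  // ...so decode runs forward
        if (get(x, tab, bytes) != want) return 1;         // round-trip check
    return 0;
}
```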

    Zooplankton visualization system: design and real-time lossless image compression

    Get PDF
    In this thesis, I present the design of a small, self-contained, underwater plankton imaging system. The imaging system's design uses an embedded PC architecture built on the PC/104-Plus standard to meet the compact size and low power requirements. I developed a simple graphical user interface, running on a real-time operating system, to control the imaging system. I also address how a real-time image compression scheme implemented on an FPGA chip speeds up the imaging system's image transfers. Since lossless compression is required in order to retain all image details, I began with an established compression scheme, SPIHT, and later proposed a new compression scheme that suits the imaging system's requirements. I provide an estimate of the total amount of resources required and propose suitable FPGA chips to implement the compression scheme. Finally, I present various parallel designs by which the FPGA chip can be integrated into the imaging system.

    Capsule endoscopy system with novel imaging algorithms

    Get PDF
    Wireless capsule endoscopy (WCE) is a state-of-the-art technology for capturing images of the human intestine for medical diagnostics. In WCE, the patient ingests a specially designed electronic capsule with imaging and wireless transmission capabilities inside it. While the capsule travels through the gastrointestinal (GI) tract, it captures images and sends them wirelessly to an outside data logger unit. The data logger stores the image data, which are then transferred to a personal computer (PC) where the images are reconstructed and displayed for diagnosis. The key design challenge in WCE is to reduce the area and power consumption of the capsule while maintaining acceptable image reconstruction. In this research, the unique properties of WCE images are identified by analyzing hundreds of endoscopic images and video frames, and these properties are then used to develop novel, low-complexity compression algorithms tailored for capsule endoscopy. The proposed image compressor consists of a new YEF color space converter, a lossless prediction coder, a customizable chrominance sub-sampler, and an efficient Golomb-Rice encoder. The scheme has both lossy and lossless modes and is further customized to work with two lighting modes: conventional white light imaging (WLI) and emerging narrow band imaging (NBI). The average compression ratio achieved using the proposed lossy compression algorithm is 80.4% for WLI and 79.2% for NBI, with a high reconstruction quality index for both bands. Two surveys have been conducted, showing that the reconstructed images have high acceptability among medical imaging doctors and gastroenterologists. The imaging algorithms have been realized in a hardware description language (HDL) and their functionality verified on a field programmable gate array (FPGA) board. The design was later implemented in a 0.18 μm complementary metal oxide semiconductor (CMOS) technology and the chip was fabricated. Due to the low complexity of the core compressor, it consumes only 43 µW of power and 0.032 mm² of area. The compressor is designed to work with commercial low-power image sensors that output image pixels in raster-scan fashion, eliminating the need for significant input buffer memory. To demonstrate the advantage, a prototype of the complete WCE system, including an FPGA-based electronic capsule, a microcontroller-based data logger unit, and Windows-based image reconstruction software, has been developed. The capsule contains the proposed low-complexity image compressor and can generate both lossy and lossless compressed bit-streams. The capsule prototype supports both WLI and NBI modes and communicates with the data logger in full-duplex fashion, which enables configuring the image size and imaging mode in real time during the examination. The developed data logger is portable, has high-data-rate wireless connectivity including Bluetooth, and features a graphical display with state-of-the-art touch-screen technology for real-time image viewing. The data are logged in micro SD cards and can be transferred to a PC or smartphone using a card reader, USB interface, or Bluetooth wireless link. The workstation software can decompress and display the reconstructed images, which can be navigated, marked, zoomed, and played as video. Finally, ex-vivo testing of the WCE system has been performed in a pig's intestine to validate its performance.
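    The compressor's entropy stage is named but not detailed in the abstract. As a sketch of what a Golomb-Rice encoder for prediction residuals typically looks like, here is a minimal C++ version; the zigzag mapping, the bit-writer, and the fixed Rice parameter k are common choices assumed for illustration, not the fabricated chip's exact design.

```cpp
// Minimal Golomb-Rice encoder sketch for signed prediction residuals
// (illustrative; not the WCE chip's implementation).
#include <cstdint>
#include <vector>

struct BitWriter {
    std::vector<uint8_t> buf; uint32_t acc = 0; int nbits = 0;
    void put(uint32_t bits, int n) {                 // append n bits, MSB-first
        for (int i = n - 1; i >= 0; --i) {
            acc = (acc << 1) | ((bits >> i) & 1);
            if (++nbits == 8) { buf.push_back(uint8_t(acc)); acc = 0; nbits = 0; }
        }
    }
    void flush() { if (nbits) { buf.push_back(uint8_t(acc << (8 - nbits))); nbits = 0; } }
};

// Zigzag-map the signed residual to unsigned (0,-1,1,-2,... -> 0,1,2,3,...),
// then Rice-code it: quotient q in unary, remainder in k fixed bits.
void rice_encode(int32_t residual, unsigned k, BitWriter& bw) {
    uint32_t u = (uint32_t(residual) << 1) ^ uint32_t(residual >> 31);
    uint32_t q = u >> k;
    for (uint32_t i = 0; i < q; ++i) bw.put(1, 1);   // unary prefix: q ones
    bw.put(0, 1);                                    // terminating zero
    bw.put(u & ((1u << k) - 1), k);                  // k-bit remainder
}
```

    For residuals following a two-sided geometric distribution, picking k near log2 of the mean residual magnitude keeps the unary prefix short, which is what makes this family of coders attractive under tight power budgets like the one reported above.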

    Image and Video Coding Techniques for Ultra-low Latency

    Get PDF
    The next generation of wireless networks fosters the adoption of latency-critical applications such as XR, connected industry, or autonomous driving. This survey gathers implementation aspects of different image and video coding schemes and discusses their tradeoffs. Standardized video coding technologies such as HEVC or VVC provide a high compression ratio, but their enormous complexity sets the scene for alternative approaches like still image, mezzanine, or texture compression in scenarios with tight resource or latency constraints. Regardless of the coding scheme, we found inter-device memory transfers and the lack of sub-frame coding to be limitations of current full-system and software-programmable implementations.
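    The survey's observation about missing sub-frame coding can be made concrete with a back-of-the-envelope buffering model: a frame-level codec must wait a full frame period before encoding can start, while a slice-level codec waits only one slice. The numbers below (1080p60, 8 slices per frame) are illustrative assumptions, not figures from the paper.

```cpp
// Capture-to-encode buffering delay, frame-level vs. slice-level coding.
// Numbers are illustrative assumptions, not measurements from the survey.
#include <cstdio>

int main() {
    const double fps      = 60.0;            // 1080p60 source
    const double frame_ms = 1000.0 / fps;    // 16.67 ms per frame
    const int    slices   = 8;               // sub-frame units per frame

    // Frame-level coding: encoding starts only once the full frame has arrived.
    const double frame_delay = frame_ms;
    // Slice-level coding: encoding starts after the first slice arrives.
    const double slice_delay = frame_ms / slices;

    std::printf("frame-level buffering: %.2f ms\n", frame_delay);  // 16.67 ms
    std::printf("slice-level buffering: %.2f ms\n", slice_delay);  //  2.08 ms
    return 0;
}
```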

    Efficient architectures of heterogeneous FPGA-GPU for 3-D medical image compression

    Get PDF
    The advent of three-dimensional (3-D) imaging modalities has generated a massive amount of volumetric data in 3-D images such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), and ultrasound (US). A survey of existing work reveals a large gap for further research in exploiting reconfigurable computing for 3-D medical image compression. This research proposes an FPGA-based co-processing solution to accelerate such medical imaging systems. The HWT block is implemented on the sbRIO-9632 FPGA board, a Spartan-3 (XC3S2000) prototyping board. Analysis and performance evaluation of the 3-D images were conducted. Furthermore, a novel architecture is proposed for the context-based adaptive binary arithmetic coder (CABAC), the advanced entropy coding tool employed by the main and higher profiles of H.264/AVC. This research focuses on a GPU implementation of CABAC and a comparative study of 3-D medical image compression systems with and without the discrete wavelet transform (DWT). Implementation results on MRI and CT images show the GPU significantly outperforming the single-threaded CPU implementation. Overall, CT and MRI modalities with DWT outperform images without the DWT step in terms of compression ratio, peak signal-to-noise ratio (PSNR), and latency. For heterogeneous computing, MRI images of various sizes and formats, such as JPEG and DICOM, were evaluated. The results show that, for each memory iteration, transfers from GPU to CPU consume more bandwidth; for a 786,486-byte JPEG image, the bandwidth consumed in the two directions tends to balance. Bandwidth scales with transfer size: larger transfers incur higher latency and throughput. Next, an OpenCL implementation of concurrent tasks on a dedicated FPGA was developed. The findings reveal that OpenCL in batch-processing mode with AOC techniques delivers substantial results, where the amounts of logic, area, registers, and memory increase proportionally with the batch count, because the kernel block is replicated once per batch and the memory banks therefore grow with the number of kernel blocks. A comparative study found that the tree-balanced, loop-unrolled architecture performs better in terms of local memory, latency, and throughput.
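    The abstract does not detail the HWT block; as an illustration of the kind of computation such a wavelet-transform stage performs, here is one level of a 1-D Haar transform in C++. The averaging/differencing normalization is a common choice assumed here, not necessarily the board's implementation; a separable 3-D DWT applies this step along rows, columns, and slices in turn.

```cpp
// One level of a 1-D Haar wavelet transform (illustrative sketch, not the
// sbRIO-9632 HWT block itself). Even-length input; produces N/2 approximation
// coefficients followed by N/2 detail coefficients.
#include <vector>

void haar_level(std::vector<float>& x) {
    const size_t n = x.size() / 2;
    std::vector<float> out(x.size());
    for (size_t i = 0; i < n; ++i) {
        out[i]     = (x[2*i] + x[2*i + 1]) * 0.5f;  // low-pass (average)
        out[n + i] = (x[2*i] - x[2*i + 1]) * 0.5f;  // high-pass (difference)
    }
    x.swap(out);
}
```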

    Pipelined implementation of JPEG image compression using HDL

    Full text link
    This thesis presents the architecture and design of a JPEG compressor for color images using VHDL. The system consists of major parts: a color space converter, down-sampler, 2-D DCT module, quantization, zigzag scanning, and entropy coding. The color space conversion transforms the RGB colors to YCbCr color coding. The down-sampling operation reduces the sampling rate of the color information (Cb and Cr). The 2-D DCT transforms the pixel data from the spatial domain to the frequency domain. The quantization operation eliminates the high-frequency components and small-amplitude coefficients of the cosine expansion. Finally, the entropy coding uses run-length encoding (RLE), Huffman coding, variable-length coding (VLC), and differential coding to decrease the number of bits used to represent the image. JPEG compression is lossy, since the down-sampling and quantization operations are irreversible, but the losses can be controlled in order to keep the necessary image quality. Architectures for these parts were designed and described in VHDL. The results were observed using the Active-HDL simulator, and the code was synthesized using Xilinx ISE for a Virtex-4 FPGA. This pipelined architecture has a minimum latency of 187 clock cycles.
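    The first stage of the pipeline is simple enough to sketch. Below is a fixed-point RGB-to-YCbCr converter of the kind such a stage implements in hardware; the BT.601 coefficients and the 8-bit fixed-point scaling are common choices assumed for illustration, not the thesis's exact constants.

```cpp
// Fixed-point RGB -> YCbCr (BT.601 full-range) conversion, as a JPEG front-end
// stage might implement it. Coefficients are scaled by 2^8 so the datapath
// needs only integer multiplies, adds, and a shift.
#include <cstdint>

struct YCbCr { uint8_t y, cb, cr; };

static inline uint8_t clamp8(int v) { return v < 0 ? 0 : (v > 255 ? 255 : uint8_t(v)); }

YCbCr rgb_to_ycbcr(uint8_t r, uint8_t g, uint8_t b) {
    // Y  =  0.299 R + 0.587 G + 0.114 B
    // Cb = -0.169 R - 0.331 G + 0.500 B + 128
    // Cr =  0.500 R - 0.419 G - 0.081 B + 128
    int y  = ( 77 * r + 150 * g +  29 * b + 128) >> 8;   // +128 rounds the shift
    int cb = ((-43 * r -  85 * g + 128 * b + 128) >> 8) + 128;
    int cr = ((128 * r - 107 * g -  21 * b + 128) >> 8) + 128;
    return { clamp8(y), clamp8(cb), clamp8(cr) };
}
```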

    Development of Advanced Closed-Loop Brain Electrophysiology Systems for Freely Behaving Rodents

    Full text link
    Extracellular electrophysiology is a technique widely used in neuroscience research. It can offer insights into how the brain works by measuring the electrical fields generated by neural activity. This is done through electrodes implanted in the brain and connected to amplification and digitization circuitry. Of the many animal models used in electrophysiology experimentation, rodents such as rats and mice are among the most popular species. Modern electrophysiology experiments seek increasingly complex conditions that are limited by acquisition hardware technology. Two aspects are of special interest: closed-loop feedback and naturalistic behavior. In this thesis, we present developments aiming to improve different facets of these two problems. Closed-loop feedback encompasses all techniques in which stimuli are produced in response to an event generated by the animal. Latency, the time between the trigger event and stimulus generation, must match the biological timescale being studied. While modern acquisition systems feature latencies on the order of 10 ms, responding to fast events, such as the high-frequency electrical transients created by neuronal activity, requires latencies under 1 ms. In addition, algorithms for triggering or generating closed-loop stimuli can be complex, integrating multiple inputs in real time. Integration of algorithm development into acquisition tools thus becomes an important part of experiment design. For electrophysiology experiments featuring naturalistic behavior, animals must be able to move freely in ecologically meaningful environments mimicking natural conditions. Experiments featuring elements such as large arenas, environmental objects, or the presence of other animals are, however, hindered by the wired nature of acquisition systems. Other physical constraints, such as implant weight or power restrictions, can also limit experiment duration. Beyond the technical limits, complex experiments are enriched when electrophysiology data are integrated with multiple sources, for example animal tracking or brain microscopy. Tools that can mix data independently of their source open new experimental possibilities. The technological advances presented in this thesis address these topics. We have designed devices with closed-loop latencies under 200 µs featuring high-bandwidth interfaces, which allow the simultaneous acquisition of hundreds of electrophysiological channels combined with other heterogeneous data sources, such as video or tracking.
The control software for these devices was designed with flexibility in mind, allowing easy implementation of closed-loop algorithms. Open interface standards were created to encourage the development of interoperable tools for experimental data integration. To solve wiring issues in behavioral experiments, we followed two different approaches. One was the design of light headstages coupled with ultra-thin coaxial cables and active commutator technology that makes use of animal tracking. This reduces animal strain to a minimum, allowing large arenas and prolonged experiments with advanced headstages. A different, wireless headstage was also developed. We created a digital compression algorithm specialized for neural electrophysiological signals, able to reduce data bandwidth to less than 65.5% of its original size without introducing distortions. Bandwidth has a large effect on power requirements, so this reduction allows for lighter batteries and extended operational time. The algorithm is designed to be implementable in a wide variety of devices, requiring few hardware resources and adding negligible power overhead to a system. Combined, the developments we present open new possibilities for neuroscience experiments that pair electrophysiology acquisition with natural behaviors and complex real-time stimuli. The research described in this thesis was carried out at the Polytechnic University of Valencia (Universitat Politècnica de València), Valencia, Spain, in extremely close collaboration with the Neuroscience Institute - Spanish National Research Council - Miguel Hernández University (Instituto de Neurociencias - Consejo Superior de Investigaciones Científicas - Universidad Miguel Hernández), San Juan de Alicante, Spain. The projects described in chapters 3 and 4 were developed in collaboration with, and funded by, Open Ephys, Cambridge, MA, USA and OEPS - Eléctronica e produção, unipessoal lda, Algés, Portugal. Cuevas López, A. (2021). Development of Advanced Closed-Loop Brain Electrophysiology Systems for Freely Behaving Rodents [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/179718
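    The wireless headstage's compression scheme is characterized above only by its result (under 65.5% of the original bandwidth, losslessly). As a generic illustration of how predictive coding shrinks slowly varying electrophysiology samples without loss, here is a minimal delta-plus-varint sketch in C++; the delta predictor and LEB128-style byte packing are assumptions for illustration, not the thesis's algorithm.

```cpp
// Illustrative lossless compression of 16-bit electrophysiology samples:
// delta-encode consecutive samples, zigzag-map, then pack each value as
// 7-bits-per-byte varints. A generic sketch, not the thesis's scheme.
#include <cstdint>
#include <vector>

std::vector<uint8_t> compress(const std::vector<int16_t>& samples) {
    std::vector<uint8_t> out;
    int16_t prev = 0;
    for (int16_t s : samples) {
        int32_t d = int32_t(s) - prev;   // deltas are small for smooth signals
        prev = s;
        uint32_t u = (uint32_t(d) << 1) ^ uint32_t(d >> 31);  // zigzag map
        do {                             // emit 7 bits per byte, MSB = "more"
            uint8_t byte = u & 0x7F;
            u >>= 7;
            out.push_back(byte | (u ? 0x80 : 0));
        } while (u);
    }
    return out;
}
```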

    Remote Sensing Data Compression

    Get PDF
    A huge amount of data is acquired nowadays by different remote sensing systems installed on satellites, aircraft, and UAVs. The acquired data then have to be transferred to image processing centres, stored, and/or delivered to customers. In resource-restricted scenarios, data compression is strongly desired or necessary. A wide diversity of coding methods can be used, depending on the requirements and their priority. In addition, the types and properties of images differ a lot; thus, practical implementation aspects have to be taken into account. The Special Issue paper collection taken as the basis of this book touches on all of the aforementioned items to some degree, giving the reader an opportunity to learn about recent developments and research directions in the field of image compression. In particular, lossless and near-lossless compression of multi- and hyperspectral images remains a current topic, since such images constitute extremely large data arrays with rich information that can be retrieved from them for various applications. Another important aspect is the impact of lossless compression on image classification and segmentation, where a reasonable compromise between the characteristics of compression and the final tasks of data processing has to be achieved. The problems of data transmission from UAV-based acquisition platforms, as well as the use of FPGAs and neural networks, have become very important. Finally, attempts to apply compressive sensing approaches in remote sensing image processing with positive outcomes are observed. We hope that readers will find our book useful and interesting.

    Implementation of JPEG compression and motion estimation on FPGA hardware

    Full text link
    A hardware implementation of JPEG allows for real-time compression in data-intensive applications, such as high-speed scanning, medical imaging, and satellite image transmission. Implementation options include dedicated DSP or media processors, FPGA boards, and ASICs. Factors that affect platform selection include cost, speed, memory, size, power consumption, and ease of reconfiguration. The proposed hardware solution is based on a Very High Speed Integrated Circuit Hardware Description Language (VHDL) implementation of the codec, with the preferred realization being an FPGA board due to speed, cost, and flexibility factors. The VHDL language is commonly used to model hardware implementations from a top-down perspective. The VHDL code may be simulated to correct mistakes and subsequently synthesized into hardware using a synthesis tool, such as the Xilinx ISE suite. The same VHDL code may be synthesized into a number of different hardware architectures based on the given constraints. For example, speed was the major constraint when synthesizing the JPEG encoding and decoding pipeline, while chip area and power consumption were the primary constraints when synthesizing the on-die memory because of its large area. Thus, there is a trade-off between area and speed in logic synthesis.