349 research outputs found

    A real-time deep learning OFDM receiver

    Get PDF

    Low-Complexity Hyperspectral Image Compression on a Multi-tiled Architecture

    Get PDF
    The increasing amount of data produced in satellites poses a downlink communication problem due to the limited data rate of the downlink. This bottleneck is solved by introducing more and more processing power on-board to compress data to a satisfiable rate. Currently, this processing power is often provided by custom off the shelf hardware which is needed to run the complex image compression standards. The increase in required processing power often increases the energy required to power the hardware. This in turn pushes algorithm developers to develop lower complexity algorithms which are able to compress the data for the least amount of processing per data element. On the other hand hardware developers are pushed to develop flexible hardware which can be used on multiple missions to cut development cost and can be re-used for different missions. This paper introduces an algorithm which has been developed\ud to compress hyperspectral images at low complexity and describes its mapping to a new hardware platform which has been developed to offer flexibility as well as high performance processing power called the Xentium tile processor

    An investigative study of multispectral data compression for remotely-sensed images using vector quantization and difference-mapped shift-coding

    Get PDF
    A study is conducted to investigate the effects and advantages of data compression techniques on multispectral imagery data acquired by NASA's airborne scanners at the Stennis Space Center. The first technique used was vector quantization. The vector is defined in the multispectral imagery context as an array of pixels from the same location from each channel. The error obtained in substituting the reconstructed images for the original set is compared for different compression ratios. Also, the eigenvalues of the covariance matrix obtained from the reconstructed data set are compared with the eigenvalues of the original set. The effects of varying the size of the vector codebook on the quality of the compression and on subsequent classification are also presented. The output data from the Vector Quantization algorithm was further compressed by a lossless technique called Difference-mapped Shift-extended Huffman coding. The overall compression for 7 channels of data acquired by the Calibrated Airborne Multispectral Scanner (CAMS), with an RMS error of 15.8 pixels was 195:1 (0.41 bpp) and with an RMS error of 3.6 pixels was 18:1 (.447 bpp). The algorithms were implemented in software and interfaced with the help of dedicated image processing boards to an 80386 PC compatible computer. Modules were developed for the task of image compression and image analysis. Also, supporting software to perform image processing for visual display and interpretation of the compressed/classified images was developed

    Sub-micron technology development and system-on-chip (Soc) design - data compression core

    Get PDF
    Data compression removes redundancy from the source data and thereby increases storage capacity of a storage medium or efficiency of data transmission in a communication link. Although several data compression techniques have been implemented in hardware, they are not flexible enough to be embedded in more complex applications. Data compression software meanwhile cannot support the demand of high-speed computing applications. Due to these deficiencies, in this project we develop a parameterized lossless universal data compression IP core for high-speed applications. The design of the core is based on the combination of Lempel-Ziv-Storer-Szymanski (LZSS) compression algorithm and Huffman coding. The resulting IP core offers a data-independent throughput that can process a symbol in every clock cycle. The design is described in parameterized VHDL code to enable a user to make a suitable compromise between resource constraints, operation speed and compression saving, so that it can be adapted for any target application. In implementation on Altera FLEX10KE FPGA device, the design offers a performance of 800 Mbps with an operating frequency of 50 MHz. This IP core is suitable for high-speed computing applications or for storage systems

    An Application-Specific VLIW Processor with Vector Instruction Set for CNN Acceleration

    Full text link
    In recent years, neural networks have surpassed classical algorithms in areas such as object recognition, e.g. in the well-known ImageNet challenge. As a result, great effort is being put into developing fast and efficient accelerators, especially for Convolutional Neural Networks (CNNs). In this work we present ConvAix, a fully C-programmable processor, which -- contrary to many existing architectures -- does not rely on a hard-wired array of multiply-and-accumulate (MAC) units. Instead it maps computations onto independent vector lanes making use of a carefully designed vector instruction set. The presented processor is targeted towards latency-sensitive applications and is capable of executing up to 192 MAC operations per cycle. ConvAix operates at a target clock frequency of 400 MHz in 28nm CMOS, thereby offering state-of-the-art performance with proper flexibility within its target domain. Simulation results for several 2D convolutional layers from well known CNNs (AlexNet, VGG-16) show an average ALU utilization of 72.5% using vector instructions with 16 bit fixed-point arithmetic. Compared to other well-known designs which are less flexible, ConvAix offers competitive energy efficiency of up to 497 GOP/s/W while even surpassing them in terms of area efficiency and processing speed.Comment: Accepted for publication in the proceedings of the 2019 IEEE International Symposium on Circuits and Systems (ISCAS
    corecore