
    Hardware Implementation of Lossless Adaptive Compression of Data From a Hyperspectral Imager

    Efficient onboard data compression reduces the data volume from hyperspectral imagers on NASA and DoD spacecraft so that as much imagery as possible can be returned through constrained downlink channels. Lossless compression is important for signature extraction, object recognition, and feature classification. To provide onboard data compression, a hardware implementation of a lossless hyperspectral compression algorithm was developed using a field-programmable gate array (FPGA). The underlying algorithm is the Fast Lossless (FL) compression algorithm reported in "Fast Lossless Compression of Multispectral-Image Data" (NPO-42517), NASA Tech Briefs, Vol. 30, No. 8 (August 2006), p. 26, with the modification reported in "Lossless, Multi-Spectral Data Compressor for Improved Compression for Pushbroom-Type Instruments" (NPO-45473), NASA Tech Briefs, Vol. 32, No. 7 (July 2008), p. 63, which improves compression performance for data from pushbroom-type imagers. An FPGA implementation of the unmodified FL algorithm was previously developed and reported in "Fast and Adaptive Lossless Onboard Hyperspectral Data Compression System" (NPO-46867), NASA Tech Briefs, Vol. 36, No. 5 (May 2012), p. 42. The essence of the FL algorithm is adaptive linear predictive compression using the sign algorithm for filter adaptation. The FL compressor achieves a combination of low complexity and compression effectiveness that exceeds that of state-of-the-art techniques currently in use. The modification changes the predictor structure to tolerate differences in sensitivity among detector elements, as occur in pushbroom-type imagers, which are suitable for spacecraft use. The FPGA implementation offers a low-cost, flexible solution compared to a traditional ASIC (application-specific integrated circuit) and can be integrated as an intellectual property (IP) block within a larger design, e.g., one that manages the instrument interface.
The FPGA implementation was benchmarked on the Xilinx Virtex IV LX25 device and ported to a Xilinx prototype board. The current implementation has a critical path of 29.5 ns, which dictated a clock speed of 33 MHz. The critical path delay is an end-to-end measurement between the uncompressed input data and the compressed output data stream. The implementation compresses one sample every clock cycle, which results in a speed of 33 Msample/s. The implementation has low device utilization on the Xilinx Virtex IV LX25, resulting in a total power consumption of about 1.27 W.
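
The idea of adaptive linear prediction with a sign-algorithm weight update, as named in the abstract above, can be sketched as follows. This is only a minimal illustration under stated assumptions: the predictor order, the step size `mu`, the use of floating-point weights, and the function names are all hypothetical and are not taken from the FL algorithm or its FPGA implementation. It shows only that the decoder can mirror the encoder's adaptation, so the residuals reconstruct the samples exactly.

```python
def _sign(x):
    # Returns -1, 0, or +1.
    return (x > 0) - (x < 0)


def _predict(weights, history):
    # Integer prediction from the most recent samples (assumed rounding rule).
    return round(sum(w * h for w, h in zip(weights, history)))


def _adapt(weights, history, err, mu):
    # Sign algorithm: move each weight by mu * sign(error) * input.
    s = _sign(err)
    return [w + mu * s * h for w, h in zip(weights, history)]


def encode(samples, order=3, mu=1e-4):
    """Turn integer samples into prediction residuals."""
    weights, history, residuals = [0.0] * order, [0] * order, []
    for s in samples:
        err = s - _predict(weights, history)
        residuals.append(err)
        weights = _adapt(weights, history, err, mu)
        history = [s] + history[:-1]
    return residuals


def decode(residuals, order=3, mu=1e-4):
    """Rebuild the samples by repeating the encoder's predictions."""
    weights, history, samples = [0.0] * order, [0] * order, []
    for err in residuals:
        s = _predict(weights, history) + err
        samples.append(s)
        weights = _adapt(weights, history, err, mu)
        history = [s] + history[:-1]
    return samples
```

In a real compressor the residuals would then be entropy coded; that stage is omitted here.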

    Sub-micron technology development and system-on-chip (SoC) design - data compression core

    Data compression removes redundancy from the source data and thereby increases the storage capacity of a storage medium or the efficiency of data transmission over a communication link. Although several data compression techniques have been implemented in hardware, they are not flexible enough to be embedded in more complex applications. Data compression software, meanwhile, cannot meet the demands of high-speed computing applications. To address these deficiencies, this project develops a parameterized lossless universal data compression IP core for high-speed applications. The design of the core is based on a combination of the Lempel-Ziv-Storer-Szymanski (LZSS) compression algorithm and Huffman coding. The resulting IP core offers a data-independent throughput and can process one symbol in every clock cycle. The design is described in parameterized VHDL code so that a user can make a suitable compromise among resource constraints, operating speed, and compression saving, and thereby adapt it to any target application. Implemented on an Altera FLEX10KE FPGA device, the design achieves 800 Mbps at an operating frequency of 50 MHz. This IP core is suitable for high-speed computing applications or for storage systems.
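
As a rough illustration of the LZSS stage mentioned in the abstract above, the sketch below tokenizes a byte string into literals and (offset, length) back-references using a greedy search. The window size, the match-length limits, the token representation, and the function names are assumptions chosen for readability; they do not describe the IP core, which also adds Huffman coding and a hardware-friendly search.

```python
def lzss_encode(data, window=4096, min_match=3, max_match=18):
    """Greedy LZSS tokenizer: ints are literals, tuples are (offset, length)."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Search the sliding window for the longest match starting at i.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            tokens.append((best_off, best_len))
            i += best_len
        else:
            # Short matches are cheaper to send as a literal byte.
            tokens.append(data[i])
            i += 1
    return tokens


def lzss_decode(tokens):
    """Rebuild the bytes; copies may overlap the data being written."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            for _ in range(length):
                out.append(out[-offset])
        else:
            out.append(token)
    return bytes(out)
```

For example, `lzss_encode(b"abcabcabcabc")` yields three literals followed by one back-reference that copies nine bytes from three positions back.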

    Learning parametric dictionaries for graph signals

    In sparse signal representation, the choice of a dictionary often involves a tradeoff between two desirable properties -- the ability to adapt to specific signal data and a fast implementation of the dictionary. To sparsely represent signals residing on weighted graphs, an additional design challenge is to incorporate the intrinsic geometric structure of the irregular data domain into the atoms of the dictionary. In this work, we propose a parametric dictionary learning algorithm to design data-adapted, structured dictionaries that sparsely represent graph signals. In particular, we model graph signals as combinations of overlapping local patterns. We impose the constraint that each dictionary is a concatenation of subdictionaries, with each subdictionary being a polynomial of the graph Laplacian matrix, representing a single pattern translated to different areas of the graph. The learning algorithm adapts the patterns to a training set of graph signals. Experimental results on both synthetic and real datasets demonstrate that the dictionaries learned by the proposed algorithm are competitive with and often better than unstructured dictionaries learned by state-of-the-art numerical learning algorithms in terms of sparse approximation of graph signals. In contrast to the unstructured dictionaries, however, the dictionaries learned by the proposed algorithm feature localized atoms and can be implemented in a computationally efficient manner in signal processing tasks such as compression, denoising, and classification
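
The structure described above, a dictionary formed by concatenating subdictionaries that are each a polynomial of the graph Laplacian, can be sketched in a few lines. This sketch assumes the combinatorial Laplacian L = D - A and fixed, hand-picked polynomial coefficients; the learning step that adapts the coefficients to training signals is not shown, and all function names are hypothetical.

```python
def laplacian(adj):
    """Combinatorial graph Laplacian L = D - A (an assumed choice)."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def poly_subdictionary(lap, coeffs):
    """Evaluate sum_k coeffs[k] * L^k; column m is the pattern centered at node m."""
    n = len(lap)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # L^0
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = [[result[i][j] + c * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = matmul(power, lap)
    return result


def dictionary(lap, coeff_sets):
    """Concatenate one polynomial subdictionary per coefficient set, column-wise."""
    subs = [poly_subdictionary(lap, c) for c in coeff_sets]
    return [sum((s[i] for s in subs), []) for i in range(len(lap))]
```

Because a degree-K polynomial of the Laplacian only couples nodes within K hops, each atom is localized: on a path graph, a degree-1 atom centered at the first node is zero at every node more than one hop away.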

    Lossy Multi/Hyperspectral Compression HW Implementation at high data rate

    Image compression is becoming more and more important, as new multispectral and hyperspectral instruments will generate very high data rates due to their increased spatial and spectral resolutions. Transmitting all the acquired data to the ground segment is a serious bottleneck, and compression techniques are a feasible solution to this problem. The CCSDS has established a working group (WG) on Multispectral and Hyperspectral Data Compression (MHDC), whose purpose is to standardize compression techniques to be used onboard. The WG has already standardized a lossless compression algorithm for multispectral and hyperspectral images, and has started working on a lossy compression algorithm. The complexity of lossy compression algorithms is typically larger than that of lossless ones, leading to potentially lower throughputs. Therefore, a careful assessment is required in order to identify techniques that are able to sustain very high data rates. The increased complexity can also lead to increased resource occupancy on a hardware device such as an FPGA. Lossy compression introduces information losses in the images; these losses must be accurately characterized, and their effect on the applications investigated. For these reasons, developing a lossy algorithm requires a more elaborate process. Under an ESA contract primed by Politecnico di Torino, TSD is currently designing an IP core for FPGA and/or ASIC implementation of a lossy compression algorithm that is being proposed for CCSDS standardization. In addition to the IP core, TSD is developing a HW platform based on the Xilinx Virtex-5 XQR5VFX130, the industry's first high-performance rad-hard reconfigurable FPGA for processing-intensive space systems. Advanced results, along with details of the electronic platform design, will be presented in this paper.

    Hu-Tucker algorithm for building optimal alphabetic binary search trees

    The purpose of this thesis is to study the behavior of the Hu-Tucker algorithm for building Optimal Alphabetic Binary Search Trees (OABST), to design an efficient implementation, and to evaluate the performance of both the algorithm and the implementation. The three phases of the algorithm are described and their time complexities evaluated. Two separate implementations of the most expensive phase, Combination, are presented, achieving O(n^2) and O(n log n) time and O(n) space complexity. The break-even point between them is established experimentally, and the measured complexities of the implementations are compared against their theoretical time complexities. The electronic version of The Complete Works of William Shakespeare is compressed using the Hu-Tucker algorithm and other popular compression algorithms to compare the performance of the different techniques. The experiments justify the price that has to be paid to implement the Hu-Tucker algorithm. It is shown that an efficient implementation can process extremely large data sets relatively fast and can achieve optimality close to that of the Optimal Binary Tree (OBT) built using the Huffman algorithm; however, the OABST can be used in both the encoding and decoding processes, unlike the OBT, where an additional mapping mechanism is needed for the decoding phase.
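
The gap between an optimal alphabetic tree and an unconstrained Huffman tree, which the abstract above measures experimentally, can be illustrated on a toy input. The sketch below does not implement the Hu-Tucker algorithm: the alphabetic cost is computed with a plain O(n^3) interval dynamic program, swapped in here because it is short and easy to check, and the function names are hypothetical. Both functions return the weighted path length, i.e. the sum over leaves of weight times depth.

```python
import heapq


def huffman_cost(weights):
    """Weighted path length of a Huffman tree (leaf order is free)."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b  # each merge adds one level above its leaves
        heapq.heappush(heap, a + b)
    return total


def alphabetic_cost(weights):
    """Weighted path length of an optimal tree that keeps leaves in order.

    Plain O(n^3) dynamic program over intervals, not Hu-Tucker.
    """
    n = len(weights)
    if n == 0:
        return 0
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)
    cost = [[0] * n for _ in range(n)]
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            best_split = min(cost[i][k] + cost[k + 1][j] for k in range(i, j))
            cost[i][j] = prefix[j + 1] - prefix[i] + best_split
    return cost[0][n - 1]
```

For the weights `[1, 5, 1, 5]`, the Huffman cost is 21 while the order-preserving optimum is 24, which shows the price of keeping the symbols in alphabetical order.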

    FPGA-Based Lossless Data Compression Using GNU Zip

    Lossless data compression algorithms are widely used by data communication systems and data storage systems to reduce the amount of data transferred and stored. GNU Zip (GZIP) [1] is a popular compression utility that delivers reasonable compression ratios without relying on patented compression algorithms [2, 3]. The compression algorithm in GZIP uses a variation of LZ77 encoding, static Huffman encoding and dynamic Huffman encoding. Given that web traffic accounts for 42% [4] of all internet traffic, accelerating algorithms like GZIP could be quite beneficial for reducing internet traffic. A hardware implementation of the GZIP algorithm could free CPUs to perform other tasks, thus boosting system performance. This thesis presents a hardware implementation of a GZIP encoder written in VHDL. Unlike previous attempts to design hardware-based encoders [5, 6], the design is compliant with the GZIP specification and includes all three of the GZIP compression modes. Files compressed in hardware can be decompressed with the software version of GZIP. The flexibility of the design allows for hardware-based implementations using either FPGAs or ASICs. The design has been prototyped on an Altera DE2 Educational Board. Data is read and stored using an on-board SD card reader implemented on a NIOS II processor. The design uses 20,610 LEs, 68,913 memory bits, the on-board SRAM, and the SDRAM to implement a fully functional GZIP encoder.
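
The compatibility property described above, that any compliant encoder's output can be read by a standard software decoder, can be exercised on the software side with Python's standard `gzip` and `zlib` modules. The sketch below is only a software round trip and says nothing about the thesis's VHDL design; the helper name `gzip_roundtrip` is hypothetical. A hardware encoder's output could be checked the same way by passing its bytes to `gzip.decompress`.

```python
import gzip
import zlib


def gzip_roundtrip(data, level=9):
    """Compress with the gzip container, then decode it two independent ways."""
    blob = gzip.compress(data, compresslevel=level)
    via_gzip = gzip.decompress(blob)
    # wbits=31 (16 + 15) tells zlib to expect a gzip header and trailer.
    via_zlib = zlib.decompress(blob, wbits=31)
    return blob, via_gzip, via_zlib


def looks_like_gzip(blob):
    """Check the two magic bytes and the DEFLATE method byte of the header."""
    return len(blob) >= 3 and blob[:2] == b"\x1f\x8b" and blob[2] == 8
```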

    Lossless data compression and decompression algorithm and its hardware architecture

    LZW (Lempel-Ziv-Welch) and AH (Adaptive Huffman) are among the most widely used algorithms for lossless data compression, but both require a large amount of memory in hardware implementations. This thesis discusses the design of a two-stage hardware architecture with a parallel dictionary LZW (PDLZW) algorithm in the first stage and an Adaptive Huffman algorithm in the second. In this architecture, an ordered list, instead of the tree-based structure, is used in the AH algorithm to speed up the compression data rate. The resulting architecture not only outperforms the AH algorithm at the cost of only one-fourth the hardware resources but is also competitive with the performance of the LZW algorithm (compress). In addition, both the compression and decompression rates of the proposed architecture are greater than those of the AH algorithm, even when the latter is realized in software. Three different schemes of the adaptive Huffman algorithm are designed, called AHAT, AHFB and AHDB. Compression ratios are calculated and the results are compared with an Adaptive Huffman algorithm implemented in C. The AHDB algorithm performs well compared to the AHAT and AHFB algorithms. The performance of the PDLZW algorithm is enhanced by combining it with the AH algorithm. The two-stage algorithm, with PDLZW in the first stage and AHDB in the second, is discussed as a way to increase the compression ratio. Results are compared with the LZW (compress) and AH algorithms. The percentage of data compression increases by more than 5% when PDLZW is cascaded with the adaptive algorithm, which implies that one can use a smaller dictionary size in the PDLZW algorithm if memory is limited and then use the AH algorithm as the second stage to compensate for the loss in the percentage of data reduction. The proposed two-stage compression/decompression processors have been coded in Verilog HDL, simulated in Xilinx ISE 9.1 and synthesized with Synopsys Design Vision.
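
For readers unfamiliar with the first stage, the sketch below shows textbook LZW with a single, unbounded dictionary. It is not the parallel dictionary (PDLZW) variant from the thesis, which splits the dictionary by word length and bounds its size, and it omits the Adaptive Huffman second stage. The function names are hypothetical.

```python
def lzw_compress(data):
    """Textbook LZW: emit the code of the longest known prefix, then extend."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # next free code
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the dictionary on the fly while decoding."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[len(table)] = previous + entry[:1]
        previous = entry
    return bytes(out)
```

The dictionary here grows without limit, which is exactly the memory cost that motivates a smaller dictionary followed by a second coding stage.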

    Reconfigurable Hardware for Compressing Hyperspectral Image Data

    High-speed, low-power, reconfigurable electronic hardware has been developed to implement ICER-3D, an algorithm for compressing hyperspectral-image data. The algorithm and parts thereof have been the topics of several NASA Tech Briefs articles, including "Context Modeler for Wavelet Compression of Hyperspectral Images" (NPO-43239) and "ICER-3D Hyperspectral Image Compression Software" (NPO-43238), which appear elsewhere in this issue of NASA Tech Briefs. As described in more detail in those articles, the algorithm includes three main subalgorithms: one for computing wavelet transforms, one for context modeling, and one for entropy encoding. For the purpose of designing the hardware, these subalgorithms are treated as modules to be implemented efficiently in field-programmable gate arrays (FPGAs). The design takes advantage of industry-standard, commercially available FPGAs. The implementation targets the Xilinx Virtex II Pro architecture, which has embedded PowerPC processor cores with a flexible on-chip bus architecture. It incorporates an efficient parallel and pipelined architecture to compress the three-dimensional image data. The design provides for internal buffering to minimize intensive input/output operations while making efficient use of off-chip memory. The design is scalable in that the subalgorithms are implemented as independent hardware modules that can be combined in parallel to increase throughput. The on-chip processor manages the overall operation of the compression system, including execution of the top-level control functions as well as scheduling, initiating, and monitoring processes. The design prototype has been demonstrated to be capable of compressing hyperspectral data at a rate of 4.5 megasamples per second at a conservative clock frequency of 50 MHz, with a potential for substantially greater throughput at a higher clock frequency. The power consumption of the prototype is less than 6.5 W. 
The reconfigurability (by means of reprogramming) of the FPGAs makes it possible to effectively alter the design to some extent to satisfy different requirements without adding hardware. The implementation could be easily propagated to future FPGA generations and/or to custom application-specific integrated circuits

    BinDCT Design and Implementation on FPGA with Low Power Architecture

    Image compression is widely used in today's consumer applications such as digital camcorders, digital cameras, videophones and high-definition television (HDTV). As the Discrete Cosine Transform (DCT) is dominant in many international standards for image/video and audio compression, the multiplierless algorithm for fast DCT computation known as BinDCT (Binary DCT) is very well suited for VLSI implementation. Its performance in terms of peak signal-to-noise ratio (PSNR), compression ratio and coding gain has been shown to be the best approximation to the DCT algorithm. In this work, the design and implementation of an 8 x 8 block 2-D forward BinDCT algorithm on a Field Programmable Gate Array (FPGA) is presented. As this algorithm uses simple arithmetic operations (shift and add) rather than floating-point multiplications, a low-power hardware implementation is very promising. The aim of low power was achieved at the architectural level by employing a four-stage pipeline architecture with parallel processing in each stage. However, due to the trade-off between hardware area and speed, this design focuses on optimising the hardware area in each stage so that it fits the target FPGA device. The 8 x 8 block two-dimensional (2-D) forward BinDCT implementation can run at 68.58 MHz with a power consumption of 144.10 mW. This implementation consumes 12.45% less power than the previously presented BinDCT implementation when both designs run at the same speed. Furthermore, results show that this implementation achieves good accuracy compared with a software implementation, as the maximum error of the output from the 2-D computation is 1.26%. Further power optimisation is possible, for example through data gating and latency balancing at each stage (which can also improve the throughput). 
In addition, an 8 x 8 block 2-D inverse BinDCT should be implemented so that its accuracy relative to the floating-point DCT can be analyzed in hardware.
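
The shift-and-add arithmetic mentioned above can be illustrated with a single pair of integer lifting steps. This is a sketch of the general multiplierless idea only: the coefficients 3/8 and 1/2 and the two-step structure are illustrative assumptions, not the actual BinDCT parameters or data flow used in this work, and the function names are hypothetical.

```python
def mul_3_8(x):
    """Approximate 3/8 * x with two shifts and one add (no multiplier)."""
    return (x >> 2) + (x >> 3)


def lifting_forward(a, b):
    """Two integer lifting steps using only shifts, adds, and subtracts."""
    b = b - mul_3_8(a)   # first step uses the unchanged a
    a = a + (b >> 1)     # second step uses the updated b
    return a, b


def lifting_inverse(a, b):
    """Undo the steps in reverse order; exact for all integers."""
    a = a - (b >> 1)
    b = b + mul_3_8(a)
    return a, b
```

Each step changes one value using only the other, so the inverse can recompute the same shifted term and subtract it, which makes the transform exactly reversible despite the rounding in the shifts.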

    A 65nm CMOS lossless bio-signal compression circuit with 250 femtoJoule performance per bit.

    A 65 nm CMOS integrated circuit implementation of a bio-physiological signal compression device is presented, reporting exceptionally low power and extremely low silicon area cost relative to the state of the art. A novel 'xor-log2-sub-band' data compression scheme is evaluated, achieving modest compression, but with very low resource cost. With the intent to design the 'simplest useful compression algorithm', the outcome is demonstrated to be very favourable where power must be saved by trading off compression effort against data storage capacity, or data transmission power, even where more complex algorithms can deliver higher compression ratios. A VLSI design and fabricated integrated circuit implementation are presented, and estimated performance gains and efficiency measures for various bio-medical use-cases are given. Power costs as low as 1.2 pJ per sample-bit are suggested for a 10 kSa/s data rate when utilizing a power-gating scenario, dropping to 250 fJ/bit at continuous conversion data rates of 5 MSa/s. This is achieved with a diminutive circuit area of 155 µm². Both power and area appear to be state-of-the-art in terms of compression versus resource cost, and this yields benefit for system optimization.