6,107 research outputs found

    New architecture for high throughput-rate real-time 2-D DCT and the VLSI design

    Get PDF
    [[abstract]]The discrete cosine transform (DCT) has been widely used as the core of digital image and video signal compression. However, its computation is so intensive and is of great necessity to meet the requirement of high speed. In this paper, a new architecture for the VLSI design of 2-D DCT has been developed. This architecture contains the following features: (1) using the programmable logic array (PLA) to replace multipliers, (2) overlapped row-column operations and pipeline structure to reduce the total computation time, and (3) highly modular and regular structure for the efficient VLSI implementation. The architecture is implemented to a 8×8 2-D DCT. The circuit is designed by UMC's 0.8 μm spdm CMOS process and the cell library is provided by ITRI CCL. The simulation is shown that the speed of the data processing for this DCT is more than 50 MHz. It performs equivalently 800 million multiplication and accumulations per second[[conferencetype]]國際[[conferencedate]]19960923~19960927[[booktype]]紙本[[iscallforpapers]]Y[[conferencelocation]]Rochester, NY, US

    Fast object detection in compressed JPEG Images

    Full text link
    Object detection in still images has drawn a lot of attention over past few years, and with the advent of Deep Learning impressive performances have been achieved with numerous industrial applications. Most of these deep learning models rely on RGB images to localize and identify objects in the image. However in some application scenarii, images are compressed either for storage savings or fast transmission. Therefore a time consuming image decompression step is compulsory in order to apply the aforementioned deep models. To alleviate this drawback, we propose a fast deep architecture for object detection in JPEG images, one of the most widespread compression format. We train a neural network to detect objects based on the blockwise DCT (discrete cosine transform) coefficients {issued from} the JPEG compression algorithm. We modify the well-known Single Shot multibox Detector (SSD) by replacing its first layers with one convolutional layer dedicated to process the DCT inputs. Experimental evaluations on PASCAL VOC and industrial dataset comprising images of road traffic surveillance show that the model is about 2×2\times faster than regular SSD with promising detection performances. To the best of our knowledge, this paper is the first to address detection in compressed JPEG images

    Data compression techniques applied to high resolution high frame rate video technology

    Get PDF
    An investigation is presented of video data compression applied to microgravity space experiments using High Resolution High Frame Rate Video Technology (HHVT). An extensive survey of methods of video data compression, described in the open literature, was conducted. The survey examines compression methods employing digital computing. The results of the survey are presented. They include a description of each method and assessment of image degradation and video data parameters. An assessment is made of present and near term future technology for implementation of video data compression in high speed imaging system. Results of the assessment are discussed and summarized. The results of a study of a baseline HHVT video system, and approaches for implementation of video data compression, are presented. Case studies of three microgravity experiments are presented and specific compression techniques and implementations are recommended

    Overview of Parallel Platforms for Common High Performance Computing

    Get PDF
    The paper deals with various parallel platforms used for high performance computing in the signal processing domain. More precisely, the methods exploiting the multicores central processing units such as message passing interface and OpenMP are taken into account. The properties of the programming methods are experimentally proved in the application of a fast Fourier transform and a discrete cosine transform and they are compared with the possibilities of MATLAB's built-in functions and Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementation phase was compared with CPU based computing methods and with possibilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain and new, fast routines' implementation is proposed as well

    A C++-embedded Domain-Specific Language for programming the MORA soft processor array

    Get PDF
    MORA is a novel platform for high-level FPGA programming of streaming vector and matrix operations, aimed at multimedia applications. It consists of soft array of pipelined low-complexity SIMD processors-in-memory (PIM). We present a Domain-Specific Language (DSL) for high-level programming of the MORA soft processor array. The DSL is embedded in C++, providing designers with a familiar language framework and the ability to compile designs using a standard compiler for functional testing before generating the FPGA bitstream using the MORA toolchain. The paper discusses the MORA-C++ DSL and the compilation route into the assembly for the MORA machine and provides examples to illustrate the programming model and performance
    corecore