139 research outputs found

    GPU acceleration of predictive partitioned vector quantization for ultraspectral sounder data compression

    Get PDF
    [[abstract]]For the large-volume ultraspectral sounder data, compression is desirable to save storage space and transmission time. To retrieve the geophysical paramters without losing precision the ultraspectral sounder data compression has to be lossless. Recently there is a boom on the use of graphic processor units (GPU) for speedup of scientific computations. By identifying the time dominant portions of the code that can be executed in parallel, significant speedup can be achieved by using GPU. Predictive partitioned vector quantization (PPVQ) has been proven to be an effective lossless compression scheme for ultraspectral sounder data. It consists of linear prediction, bit depth partitioning, vector quantization, and entropy coding. Two most time consuming stages of linear prediction and vector quantization are chosen for GPU-based implementation. By exploiting the data parallel characteristics of these two stages, a spatial division design shows a speedup of 72x in our four-GPU-based implementation of the PPVQ compression scheme.[[notice]]補正完畢[[journaltype]]國外[[incitationindex]]SCI[[booktype]]紙本[[countrycodes]]US

    A Self Organization-Based Optical Flow Estimator with GPU Implementation

    Get PDF
    This work describes a parallelizable optical flow estimator that uses a modified batch version of the Self Organizing Map (SOM). This gradient-based estimator handles the ill-posedness in motion estimation via a novel combination of regression and a self organization strategy. The aperture problem is explicitly modeled using an algebraic framework that partitions motion estimates obtained from regression into two sets, one (set Hc) with estimates with high confidence and another (set Hp) with low confidence estimates. The self organization step uses a uniquely designed pair of training set (Q=Hc) and the initial weights set (W=Hc U Hp). It is shown that with this specific choice of training and initial weights sets, the interpolation of flow vectors is achieved primarily due to the regularization property of SOM. Moreover, the computationally involved step of finding the winner unit in SOM simplifies to indexing into a 2D array making the algorithm parallelizable and highly scalable. To preserve flow discontinuities at occlusion boundaries, we have designed anisotropic neighborhood function for SOM that uses a novel OFCE residual-based distance measure. A multi-resolution or pyramidal approach is used to estimate large motion. As the algorithm is scalable, with sufficient number of computing cores (for example on a GPU), the implementation of the estimator can be made real-time. With the available true motion from Middlebury database, error metrics are computed

    Software Coherence in Multiprocessor Memory Systems

    Get PDF
    Processors are becoming faster and multiprocessor memory interconnection systems are not keeping up. Therefore, it is necessary to have threads and the memory they access as near one another as possible. Typically, this involves putting memory or caches with the processors, which gives rise to the problem of coherence: if one processor writes an address, any other processor reading that address must see the new value. This coherence can be maintained by the hardware or with software intervention. Systems of both types have been built in the past; the hardware-based systems tended to outperform the software ones. However, the ratio of processor to interconnect speed is now so high that the extra overhead of the software systems may no longer be significant. This issue is explored both by implementing a software maintained system and by introducing and using the technique of offline optimal analysis of memory reference traces. It finds that in properly built systems, software maintained coherence can perform comparably to or even better than hardware maintained coherence. The architectural features necessary for efficient software coherence to be profitable include a small page size, a fast trap mechanism, and the ability to execute instructions while remote memory references are outstanding

    Low bit-rate image sequence coding

    Get PDF

    Media gateway utilizando um GPU

    Get PDF
    Mestrado em Engenharia de Computadores e Telemátic

    Computational analysis of woodwind instruments using the lattice Boltzmann method

    Get PDF
    Through the use of the lattice Boltzmann method, a series of computational models were created to simulate air flow in woodwind instruments. Start- ing as a two-dimensional code in Matlab running on the CPU, the model went through a series of iterations before becoming a three-dimension code in Fortran that was accelerated through the use of GPU parallel computing. The accuracy and stability of the model are shown by comparison to various published benchmark tests. Thus far, the air flow in organ pipes for a two dimensional model was simulated showing oscillating flow by the labium as expected. This thesis offers the mathematical and computational background as well as a description of the implementation of the basic LBM for simulat- ing flow in musical instruments. The method described here is meant as a first step to a code that is highly flexible and can be used to study many as- pects of acoustics in musical instruments. Future applicability of the model includes observing flow at the exit of both square and round organ pipes in addition to modeling the reed-mouthpiece system of the clarinet

    Computing and data processing

    Get PDF
    The applications of computers and data processing to astronomy are discussed. Among the topics covered are the emerging national information infrastructure, workstations and supercomputers, supertelescopes, digital astronomy, astrophysics in a numerical laboratory, community software, archiving of ground-based observations, dynamical simulations of complex systems, plasma astrophysics, and the remote control of fourth dimension supercomputers
    corecore