
    Auxiliary Guided Autoregressive Variational Autoencoders

    Generative modeling of high-dimensional data is a key problem in machine learning. Successful approaches include latent variable models and autoregressive models. The complementary strengths of these approaches, to model global and local image statistics respectively, suggest hybrid models that encode global image structure into latent variables while autoregressively modeling low-level detail. Previous approaches to such hybrid models restrict the capacity of the autoregressive decoder to prevent degenerate models that ignore the latent variables and rely only on autoregressive modeling. Our contribution is a training procedure relying on an auxiliary loss function that controls which information is captured by the latent variables and what is left to the autoregressive decoder. Our approach can leverage arbitrarily powerful autoregressive decoders, achieves state-of-the-art quantitative performance among models with latent variables, and generates qualitatively convincing samples. Comment: Published as a conference paper at ECML-PKDD 201
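    The control knob in this setup is the auxiliary loss term. Below is a minimal sketch, assuming PyTorch, of how such a loss can be combined with a KL term and the decoder likelihood; the module shapes, the low-resolution reconstruction target, and the simple convolutional stand-in for the autoregressive decoder are illustrative assumptions, not the authors' architecture.

```python
# Hedged sketch, assuming PyTorch; names, shapes, and the simple conv
# stand-ins (especially ar_dec, which replaces a real PixelCNN-style
# decoder) are illustrative, not the authors' architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxGuidedVAE(nn.Module):
    def __init__(self, z_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),
                                 nn.Conv2d(64, 2 * z_dim, 4, 2, 1))
        # Auxiliary decoder: reconstructs a low-resolution view from z alone,
        # so the latents are forced to carry global image structure.
        self.aux_dec = nn.Sequential(nn.Conv2d(z_dim, 64, 3, 1, 1), nn.ReLU(),
                                     nn.Conv2d(64, 3, 3, 1, 1))
        # Placeholder for an arbitrarily powerful autoregressive decoder
        # conditioned on the latents (a PixelCNN in the paper's setting).
        self.ar_dec = nn.Sequential(nn.Conv2d(3 + z_dim, 64, 3, 1, 1), nn.ReLU(),
                                    nn.Conv2d(64, 3, 3, 1, 1))

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        x_lo = F.avg_pool2d(x, 4)                      # global target for z
        aux = self.aux_dec(z)                          # low-res reconstruction
        z_up = F.interpolate(z, size=x.shape[-2:])
        detail = self.ar_dec(torch.cat([x, z_up], 1))  # local detail (placeholder)
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).mean()
        aux_loss = F.mse_loss(aux, x_lo)   # controls what z must encode
        nll = F.mse_loss(detail, x)        # stand-in for the AR likelihood term
        return nll + kl + aux_loss

loss = AuxGuidedVAE()(torch.rand(2, 3, 32, 32))
```

    Because the auxiliary reconstruction depends only on z, the latents cannot collapse to noise even when the conditional decoder is powerful enough to model the image on its own.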

    Random on-board pixel sampling (ROPS) X-ray Camera

    Recent advances in compressed sensing theory and algorithms offer new possibilities for high-speed X-ray camera design. In many CMOS cameras, each pixel has an independent on-board circuit that includes an amplifier, noise rejection, a signal shaper, an analog-to-digital converter (ADC), and optional in-pixel storage. X-ray images are often sparse, i.e., at least one of the following holds: (a) the number of pixels with true X-ray hits is much smaller than the total number of pixels; (b) the X-ray information is redundant; or (c) some prior knowledge about the X-ray images exists. In such cases, sparse sampling may be allowed. Here we first illustrate the feasibility of random on-board pixel sampling (ROPS) using an existing set of X-ray images, followed by a discussion of the signal-to-noise ratio as a function of pixel size. Next, we describe a possible circuit architecture to achieve random pixel access and in-pixel storage. The combination of a multilayer architecture, sparse on-chip sampling, and computational imaging techniques is expected to facilitate the development and applications of high-speed X-ray camera technology. Comment: 9 pages, 6 figures, presented at the 19th iWoRI
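    To make the sparsity argument concrete, the sketch below (NumPy assumed; the synthetic frame, hit count, and 25% sampling fraction are illustrative choices, not values from the paper) reads out a random subset of pixels from a sparse frame and reports how many hits that subset captures.

```python
# Illustrative random pixel sampling, not the actual ROPS circuit.
import numpy as np

rng = np.random.default_rng(0)
H = W = 256
frame = np.zeros((H, W))
hits = rng.integers(0, H * W, size=200)            # sparse X-ray hits
frame.flat[hits] = rng.uniform(1.0, 10.0, size=200)

fraction = 0.25                                    # read out only 25% of pixels
mask = rng.random((H, W)) < fraction               # per-frame random pixel selection
samples = np.where(mask, frame, 0.0)               # values actually digitized

print(f"read {mask.sum() / mask.size:.0%} of pixels, captured "
      f"{np.count_nonzero(samples)} of {np.count_nonzero(frame)} hits")
```

    Digitizing fewer pixels per frame means fewer ADC conversions and less data off chip, which is where the frame-rate gain would come from.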

    Practical Full Resolution Learned Lossless Image Compression

    We propose the first practical learned lossless image compression system, L3C, and show that it outperforms the popular engineered codecs PNG, WebP, and JPEG 2000. At the core of our method is a fully parallelizable hierarchical probabilistic model for adaptive entropy coding which is optimized end-to-end for the compression task. In contrast to recent autoregressive discrete probabilistic models such as PixelCNN, our method i) models the image distribution jointly with learned auxiliary representations instead of exclusively modeling the image distribution in RGB space, and ii) only requires three forward passes to predict all pixel probabilities instead of one per pixel. As a result, L3C obtains sampling speedups of over two orders of magnitude compared to the fastest PixelCNN variant (Multiscale-PixelCNN). Furthermore, we find that learning the auxiliary representation is crucial and significantly outperforms predefined auxiliary representations such as an RGB pyramid. Comment: Updated preprocessing and Table 1, see A.1 in supplementary. Code and models: https://github.com/fab-jul/L3C-PyTorc
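    The constant number of forward passes is the main departure from per-pixel autoregression. In the sketch below (PyTorch assumed), one small predictor is evaluated per pyramid level, so the cost scales with the number of levels rather than the number of pixels; the fixed RGB pyramid, layer widths, and loss bookkeeping are illustrative assumptions, whereas L3C learns its auxiliary representations.

```python
# Hedged sketch of hierarchical per-level prediction, not the L3C model.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalPredictor(nn.Module):
    def __init__(self, levels=3):
        super().__init__()
        self.levels = levels
        # One predictor per level: maps the coarser representation to
        # 256-way logits for each RGB subpixel at the finer level.
        self.predictors = nn.ModuleList(
            [nn.Conv2d(3, 3 * 256, 3, padding=1) for _ in range(levels)])

    def forward(self, x):
        # Auxiliary representations: a fixed RGB pyramid here; L3C learns them.
        pyramid = [x]
        for _ in range(self.levels - 1):
            pyramid.append(F.avg_pool2d(pyramid[-1], 2))
        total_bits = 0.0
        for lvl in range(self.levels):          # one forward pass per level,
            target = pyramid[lvl]               # not one per pixel
            coarser = (F.interpolate(pyramid[lvl + 1], scale_factor=2)
                       if lvl + 1 < self.levels else torch.zeros_like(target))
            logits = self.predictors[lvl](coarser)
            B, _, Hh, Ww = target.shape
            logits = logits.view(B, 3, 256, Hh, Ww).permute(0, 2, 1, 3, 4)
            symbols = (target * 255).long().clamp(0, 255)
            nll = F.cross_entropy(logits, symbols)       # nats per subpixel
            total_bits = total_bits + nll / math.log(2)  # convert to bits
        return total_bits

bits = HierarchicalPredictor()(torch.rand(1, 3, 32, 32))
```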

    JP3D compression of solar data-cubes: photospheric imaging and spectropolarimetry

    Hyperspectral imaging is a ubiquitous technique in solar physics observations, and recent advances in solar instrumentation have enabled us to acquire and record data at an unprecedented rate. The huge amount of data which will be archived by the upcoming solar observatories presses us to compress the data in order to reduce storage space and transfer times. The correlation present over all dimensions of solar data-sets, spatial, temporal, and spectral, suggests the use of a 3D base wavelet decomposition to achieve higher compression rates. In this work, we evaluate the performance of the recent JPEG 2000 Part 10 standard, known as JP3D, for the lossless compression of several types of solar data-cubes. We explore the differences in: a) the compressibility of broad-band versus narrow-band time-sequences, and of Stokes I versus V profiles in spectropolarimetric data-sets; b) compressing data in [x, y, λ] packages at different times versus data in [x, y, t] packages at different wavelengths; c) compressing a single large data-cube versus several smaller data-cubes; d) compressing data which is under-sampled or super-sampled with respect to the diffraction cut-off.
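    A rough way to reproduce the packaging comparison in b) is to apply a 3D wavelet transform to [x, y, λ] and [x, y, t] blocks of the same cube and compare how many coefficients are negligible. The sketch below uses PyWavelets only as a stand-in for the JP3D transform stage; the synthetic cube, the db2 wavelet, and the small-coefficient threshold are assumptions.

```python
# Hedged compressibility proxy, not an actual JP3D encoder.
import numpy as np
import pywt

rng = np.random.default_rng(1)
nx, ny, nl, nt = 64, 64, 16, 16
# Synthetic hyperspectral time series: smooth structure plus weak noise.
x, y = np.meshgrid(np.linspace(0, 1, nx), np.linspace(0, 1, ny), indexing="ij")
cube = np.stack([[np.sin(6 * x + 0.1 * l) * np.cos(6 * y + 0.05 * t)
                  for t in range(nt)] for l in range(nl)])   # (nl, nt, nx, ny)
cube += 0.01 * rng.standard_normal(cube.shape)

def small_coeff_fraction(block, thresh=1e-2):
    coeffs = pywt.wavedecn(block, "db2", level=2)   # 3D wavelet decomposition
    flat, _ = pywt.coeffs_to_array(coeffs)
    return np.mean(np.abs(flat) < thresh)           # proxy for compressibility

# Package (a): [x, y, lambda] cubes at fixed times.
spectral = np.mean([small_coeff_fraction(cube[:, t].transpose(1, 2, 0))
                    for t in range(nt)])
# Package (b): [x, y, t] cubes at fixed wavelengths.
temporal = np.mean([small_coeff_fraction(cube[l].transpose(1, 2, 0))
                    for l in range(nl)])
print(f"small-coefficient fraction: [x,y,lambda]={spectral:.2f}, [x,y,t]={temporal:.2f}")
```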

    High-Fidelity Image Compression with Score-based Generative Models

    Despite the tremendous success of diffusion generative models in text-to-image generation, replicating this success in the domain of image compression has proven difficult. In this paper, we demonstrate that diffusion can significantly improve perceptual quality at a given bit-rate, outperforming the state-of-the-art approaches PO-ELIC and HiFiC as measured by FID score. This is achieved using a simple but theoretically motivated two-stage approach: an autoencoder targeting MSE followed by a score-based decoder. However, as we will show, implementation details matter, and the optimal design decisions can differ greatly from those of typical text-to-image models.
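    A minimal sketch of the two-stage decode follows (PyTorch assumed): stage one is the MSE-optimized autoencoder's reconstruction, and stage two runs a short DDIM-style refinement conditioned on it. The placeholder denoiser, noise schedule, and step count are illustrative assumptions, not the paper's model.

```python
# Hedged sketch of conditional score-based refinement after an MSE autoencoder.
import torch
import torch.nn as nn

class Refiner(nn.Module):
    # Stand-in denoiser: predicts the noise given the current sample,
    # the stage-one MSE reconstruction, and the timestep.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(7, 64, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(64, 3, 3, padding=1))

    def forward(self, x_t, x_mse, t):
        t_map = torch.full_like(x_t[:, :1], float(t))
        return self.net(torch.cat([x_t, x_mse, t_map], dim=1))

@torch.no_grad()
def decode(x_mse, refiner, steps=10):
    # x_mse would come from the MSE-targeted autoencoder (stage one, not shown).
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas_bar = torch.cumprod(1 - betas, dim=0)
    x = torch.randn_like(x_mse)
    for t in reversed(range(steps)):            # DDIM-style deterministic steps
        eps = refiner(x, x_mse, t)
        a_bar = alphas_bar[t]
        x0_hat = (x - (1 - a_bar).sqrt() * eps) / a_bar.sqrt()
        x = x0_hat if t == 0 else (alphas_bar[t - 1].sqrt() * x0_hat
                                   + (1 - alphas_bar[t - 1]).sqrt() * eps)
    return x

out = decode(torch.rand(1, 3, 64, 64), Refiner())
```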

    Scanned Document Compression Technique

    Different media files are used to communicate information these days: text documents, images, audio, video, and so on. All of these media files require a substantial amount of space when they are to be transferred. A typical five-page text document occupies about 75 KB of space, whereas a single image can take up around 1.4 MB. In this paper, the main focus is on two compression techniques, namely the DjVu compression method and a Block-based Hybrid Video Codec, with the emphasis on DjVu. DjVu is an image compression technique specifically geared towards the compression of scanned documents in color at high resolution. Typical magazine pages in color scanned at 300 dpi are compressed to between 40 and 80 KB, or 5 to 10 times smaller than with JPEG for a comparable level of subjective quality. The foreground layer, which contains the text and drawings and requires high spatial resolution, is separated from the background layer, which contains pictures and backgrounds and requires less resolution. The foreground is compressed with a bi-tonal image compression technique that exploits character shape similarities. The background is compressed with a progressive, wavelet-based compression method. A real-time, memory-efficient version of the decoder is available as a plug-in for popular web browsers. We also demonstrate that the proposed segmentation algorithm can improve the quality of decoded documents while simultaneously lowering the bit rate
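    The layered idea can be illustrated with a toy example: separate dark, high-resolution text from a smooth background, store the text as a 1-bit mask, and keep the background at reduced resolution. In the sketch below, NumPy and zlib stand in for DjVu's actual JB2 and IW44 codecs; the synthetic page and the threshold are assumptions.

```python
# Toy illustration of DjVu-style layer separation, not the DjVu codecs.
import numpy as np
import zlib

page = np.full((512, 512), 235, dtype=np.uint8)     # light paper background
page += (20 * np.sin(np.linspace(0, np.pi, 512))[None, :]).astype(np.uint8)  # shading
page[100:110, 50:450] = 10                          # a dark "text" stroke

mask = page < 128                                   # foreground: dark text/drawings
foreground = np.packbits(mask)                      # bi-tonal layer, 1 bit per pixel
background = page.copy()
background[mask] = 235                              # fill text pixels with paper color
background_lo = background[::4, ::4]                # background needs less resolution

layered = (len(zlib.compress(foreground.tobytes()))
           + len(zlib.compress(background_lo.tobytes())))
print(f"raw page: {page.size} bytes, layered (toy): {layered} bytes")
```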