
    Fractal-cluster theory and thermodynamic principles of the control and analysis for the self-organizing systems

    A theory of resource distribution in self-organizing systems, based on the fractal-cluster method, is presented. The theory consists of two parts: deterministic and probabilistic. The deterministic part includes static and dynamic criteria and the fractal-cluster dynamic equations, which are based on the fractal-cluster correlations and the characteristics of the Fibonacci sequence. The probabilistic part develops the probabilistic characteristics of fractal-cluster systems, including the dynamic equations of their probabilistic evolution. Numerical investigation of these equations for the stationary case yields the random state field of the system in the phase space of the D, H, F criteria. The theory has been tested on socio-economic and biological systems. Comment: 37 pages, 20 figures, 4 tables

    Fast and Efficient Entropy Coding Architectures for Massive Data Compression

    The compression of data is fundamental to alleviating the costs of transmitting and storing the massive datasets employed in myriad fields of our society. Most compression systems employ an entropy coder in their coding pipeline to remove the redundancy of coded symbols. The entropy-coding stage needs to be efficient, to yield high compression ratios, and fast, to process large amounts of data rapidly. Despite their widespread use, entropy coders are commonly assessed only for some particular scenario or coding system. This work provides a general framework to assess and optimize different entropy coders. First, the paper describes three main families of entropy coders, namely those based on variable-to-variable length codes (V2VLC), arithmetic coding (AC), and tabled asymmetric numeral systems (tANS). Then, a low-complexity architecture for the most representative coder(s) of each family is presented: more precisely, a general version of V2VLC, the MQ and M coders and a fixed-length version of AC, and two different implementations of tANS. These coders are evaluated under different coding conditions in terms of compression efficiency and computational throughput. The results obtained suggest that V2VLC and tANS achieve the highest compression ratios for most coding rates and that the AC coder that uses fixed-length codewords attains the highest throughput. The experimental evaluation discloses the advantages and shortcomings of each entropy-coding scheme, providing insights that may help to select this stage in forthcoming compression systems.
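    The ANS family evaluated above can be illustrated with a minimal range-variant (rANS) sketch, a close relative of tANS in which the table lookups are replaced by explicit arithmetic. This is not the paper's architecture, just a toy assuming a fixed frequency table; Python's unbounded integers stand in for the renormalized bit stream a real implementation would emit.

    ```python
    # Minimal non-streaming rANS coder. tANS performs the same state
    # transformation via precomputed tables; here it is done arithmetically.

    def rans_encode(symbols, freq):
        """Fold a symbol list into one big integer (unbounded ints avoid
        the renormalization a production coder would need)."""
        M = sum(freq)
        cum = [0]
        for f in freq:
            cum.append(cum[-1] + f)
        x = 1  # initial state
        for s in symbols:
            # Core rANS step: x grows by ~log2(M / freq[s]) bits, the
            # information content of s under the model.
            x = (x // freq[s]) * M + (x % freq[s]) + cum[s]
        return x

    def rans_decode(x, n, freq):
        """Recover n symbols; ANS is LIFO, so they come out in reverse."""
        M = sum(freq)
        cum = [0]
        for f in freq:
            cum.append(cum[-1] + f)
        out = []
        for _ in range(n):
            slot = x % M
            s = next(i for i in range(len(freq)) if cum[i] <= slot < cum[i + 1])
            x = freq[s] * (x // M) + slot - cum[s]
            out.append(s)
        out.reverse()
        return out
    ```

    The LIFO property visible here is exactly what distinguishes ANS-style coders from queue-like arithmetic coding, and it matters for how the schemes compose in a pipeline.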

    Lossless compression with latent variable models

    We develop a simple and elegant method for lossless compression using latent variable models, which we call `bits back with asymmetric numeral systems' (BB-ANS). The method involves interleaving encode and decode steps, and achieves an optimal rate when compressing batches of data. We demonstrate it first on the MNIST test set, showing that state-of-the-art lossless compression is possible using a small variational autoencoder (VAE) model. We then make use of a novel empirical insight, that fully convolutional generative models, trained on small images, are able to generalize to images of arbitrary size, and extend BB-ANS to hierarchical latent variable models, enabling state-of-the-art lossless compression of full-size colour images from the ImageNet dataset. We describe `Craystack', a modular software framework which we have developed for rapid prototyping of compression using deep generative models.
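    The interleaving of encode and decode steps can be sketched with a toy discrete model rather than the paper's VAE: every distribution below is a small, hypothetical frequency table, and the coder is a big-integer ANS stack whose push and pop operations are exact inverses.

    ```python
    # Toy bits-back-with-ANS round trip. The model (p_z, p_x_given_z,
    # q_z_given_x) is made up for illustration; BB-ANS proper uses a VAE's
    # prior, likelihood, and approximate posterior in these three roles.

    class ANSStack:
        """LIFO entropy-coder state; push and pop are exact inverses."""
        def __init__(self, state=1):
            self.x = state
        def push(self, s, freq):
            M, cum = sum(freq), sum(freq[:s])
            self.x = (self.x // freq[s]) * M + (self.x % freq[s]) + cum
        def pop(self, freq):
            M = sum(freq)
            slot = self.x % M
            cum = 0
            for s, f in enumerate(freq):
                if cum <= slot < cum + f:
                    self.x = f * (self.x // M) + slot - cum
                    return s
                cum += f

    p_z = [1, 1]                      # prior over a two-valued latent
    p_x_given_z = {0: [3, 1], 1: [1, 3]}   # likelihood tables
    q_z_given_x = {0: [3, 1], 1: [1, 3]}   # approximate posterior tables

    def bbans_encode(stack, x):
        z = stack.pop(q_z_given_x[x])  # "get bits back": decode z from the stream
        stack.push(x, p_x_given_z[z])  # encode the datum under p(x|z)
        stack.push(z, p_z)             # encode the latent under its prior

    def bbans_decode(stack):
        z = stack.pop(p_z)
        x = stack.pop(p_x_given_z[z])
        stack.push(z, q_z_given_x[x])  # return the borrowed bits
        return x
    ```

    Because each decode step is the exact inverse of the matching encode step, decoding a batch in reverse order restores both the data and the initial coder state, which is the mechanism behind the method's optimal batch rate.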

    The Traffic Phases of Road Networks

    We study the relation between the average traffic flow and the vehicle density on road networks, which we call the 2D-traffic fundamental diagram. We show that this diagram presents mainly four phases. We analyze different cases. First, the case of a junction managed with a priority rule is presented: four traffic phases are identified and described, and a good analytic approximation of the fundamental diagram is obtained by computing a generalized eigenvalue of the dynamics of the system. Then, the model is extended to the case of two junctions, and finally to a regular city. The system still presents mainly four phases. The role of a critical circuit of non-priority roads appears clearly in the two-junctions case. In Section 4, we use traffic light controls to improve the traffic diagram. We present the improvements obtained by open-loop, local feedback, and global feedback strategies. A comparison based on the response times to reach the stationary regime is also given. Finally, we show the importance of the design of the junction. It appears that if the junction is large enough, the traffic is hardly slowed down by the junction. Comment: 37 pages
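    The basic shape of a fundamental diagram can be reproduced with a much simpler system than the paper's junction model: a single-lane ring road under the elementary cellular automaton rule 184, where a car advances exactly when the cell ahead is empty. This toy (an assumption, not the authors' model) already shows flow growing linearly with density in the free phase and falling off once jams dominate.

    ```python
    # Single-lane ring-road simulation (rule 184). In steady state the flow
    # is min(density, 1 - density): the free-flow and jammed branches of a
    # triangular fundamental diagram.

    import random

    def average_flow(n_cells, n_cars, steps=300, seed=0):
        rng = random.Random(seed)
        road = [1] * n_cars + [0] * (n_cells - n_cars)
        rng.shuffle(road)
        moves = 0
        for t in range(steps):
            # Parallel update: read the old road, write the new one.
            nxt = road[:]
            step_moves = 0
            for i in range(n_cells):
                j = (i + 1) % n_cells
                if road[i] == 1 and road[j] == 0:
                    nxt[i], nxt[j] = 0, 1
                    step_moves += 1
            road = nxt
            if t >= steps // 2:        # measure after the transient
                moves += step_moves
        return moves / (steps - steps // 2) / n_cells
    ```

    Sweeping `n_cars` from 0 to `n_cells` traces the diagram; the paper's contribution is what happens to this picture when priority-ruled junctions couple several such roads, which splits the two branches into four phases.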

    Language Modeling Is Compression

    It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful self-supervised (language) models. Since these large language models exhibit impressive predictive capabilities, they are well-positioned to be strong compressors. In this work, we advocate for viewing the prediction problem through the lens of compression and evaluate the compression capabilities of large (foundation) models. We show that large language models are powerful general-purpose predictors and that the compression viewpoint provides novel insights into scaling laws, tokenization, and in-context learning. For example, Chinchilla 70B, while trained primarily on text, compresses ImageNet patches to 43.4% and LibriSpeech samples to 16.4% of their raw size, beating domain-specific compressors like PNG (58.5%) or FLAC (30.3%), respectively. Finally, we show that the prediction-compression equivalence allows us to use any compressor (like gzip) to build a conditional generative model.
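    The prediction-compression equivalence can be made concrete in a few lines: any model that outputs next-symbol probabilities can drive an arithmetic coder. In this sketch a simple adaptive count-based predictor stands in for the language model (an assumption for illustration), and exact rational arithmetic replaces the finite-precision coder a real system would use, trading speed for clarity.

    ```python
    # Prediction -> compression: an adaptive predictor driving an exact
    # arithmetic coder. Swapping the counts model for an LLM's next-token
    # distribution is, conceptually, all the paper's setup requires.

    from fractions import Fraction

    ALPHABET = 256  # byte symbols

    def probs(counts):
        total = sum(counts)
        return [Fraction(c, total) for c in counts]

    def encode(data):
        counts = [1] * ALPHABET            # Laplace-smoothed adaptive model
        low, width = Fraction(0), Fraction(1)
        for b in data:
            p = probs(counts)
            low += width * sum(p[:b])      # narrow to the symbol's subinterval
            width *= p[b]
            counts[b] += 1                 # update mirrors the decoder's
        return low, len(data)              # any rational in [low, low+width) works

    def decode(code, n):
        counts = [1] * ALPHABET
        low, width = Fraction(0), Fraction(1)
        out = bytearray()
        for _ in range(n):
            p = probs(counts)
            target = (code - low) / width
            cum = Fraction(0)
            for s in range(ALPHABET):
                if cum <= target < cum + p[s]:
                    break
                cum += p[s]
            low += width * cum
            width *= p[s]
            counts[s] += 1
            out.append(s)
        return bytes(out)
    ```

    The code length is roughly the sum of -log2 p(symbol) over the message, so a better predictor directly means a shorter code: this is the sense in which language modeling *is* compression.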