Fractal-cluster theory and thermodynamic principles of the control and analysis for the self-organizing systems
A theory of resource distribution in self-organizing systems based on the
fractal-cluster method is presented. The theory consists of two parts:
deterministic and probabilistic. The first part comprises the static and
dynamic criteria and the fractal-cluster dynamic equations, which are based on
the fractal-cluster correlations and the characteristics of the Fibonacci
sequence. The second part lays the foundations for the probabilistic
characteristics of fractal-cluster systems and includes the dynamic equations
of their probabilistic evolution. Numerical investigation of these equations
for the stationary case yields the random state field of the system in the
phase space of the criteria. The theory has been tested on socio-economic and
biological systems.
Comment: 37 pages, 20 figures, 4 tables
Fast and Efficient Entropy Coding Architectures for Massive Data Compression
The compression of data is fundamental to alleviating the costs of transmitting and storing massive datasets employed in myriad fields of our society. Most compression systems employ an entropy coder in their coding pipeline to remove the redundancy of coded symbols. The entropy-coding stage needs to be efficient, to yield high compression ratios, and fast, to process large amounts of data rapidly. Despite their widespread use, entropy coders are commonly assessed only for some particular scenario or coding system. This work provides a general framework to assess and optimize different entropy coders. First, the paper describes three main families of entropy coders, namely those based on variable-to-variable length codes (V2VLC), arithmetic coding (AC), and tabled asymmetric numeral systems (tANS). Then, a low-complexity architecture for the most representative coder(s) of each family is presented: more precisely, a general version of V2VLC, the MQ- and M-coder, a fixed-length version of AC, and two different implementations of tANS. These coders are evaluated under different coding conditions in terms of compression efficiency and computational throughput. The results obtained suggest that V2VLC and tANS achieve the highest compression ratios for most coding rates and that the AC coder that uses fixed-length codewords attains the highest throughput. The experimental evaluation discloses the advantages and shortcomings of each entropy-coding scheme, providing insights that may help to select this stage in forthcoming compression systems.
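The core idea behind the ANS family can be illustrated with a minimal, non-streaming rANS coder (a sketch, not the paper's tANS architectures): the coder state is a single integer into which symbols are pushed with a cost of roughly -log2(f/M) bits each, and popped in reverse order. Real coders bound the state and renormalize to an output stream; here a big-integer state keeps the example short. The frequency table below is illustrative.

```python
def build_model(freqs):
    # freqs: dict symbol -> positive integer frequency.
    # Returns cumulative frequencies and the total M.
    cum, c = {}, 0
    for s in sorted(freqs):
        cum[s] = c
        c += freqs[s]
    return cum, c  # c == M, the sum of all frequencies

def rans_encode(message, freqs):
    cum, M = build_model(freqs)
    x = 1  # initial coder state
    for s in message:
        f, c = freqs[s], cum[s]
        # push symbol s: cheap symbols (large f) grow x slowly
        x = (x // f) * M + c + (x % f)
    return x

def rans_decode(x, n, freqs):
    cum, M = build_model(freqs)
    out = []
    for _ in range(n):
        slot = x % M
        # the symbol whose cumulative interval [cum, cum+freq) contains slot
        s = max(t for t in cum if cum[t] <= slot)
        f, c = freqs[s], cum[s]
        x = f * (x // M) + slot - c  # inverse of the encode step
        out.append(s)
    return out[::-1]  # symbols pop in LIFO order

msg = list("abracadabra")
freqs = {'a': 5, 'b': 2, 'c': 2, 'd': 2, 'r': 2}
assert rans_decode(rans_encode(msg, freqs), len(msg), freqs) == msg
```

The LIFO (stack-like) behavior visible in the decoder is exactly what BB-ANS (below in this listing) exploits for bits-back coding.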
Lossless compression with latent variable models
We develop a simple and elegant method for lossless compression using latent variable models, which we call `bits back with asymmetric numeral systems' (BB-ANS). The method involves interleaving encode and decode steps, and achieves an optimal rate when compressing batches of data. We demonstrate it first on the MNIST test set, showing that state-of-the-art lossless compression is possible using a small variational autoencoder (VAE) model. We then make use of a novel empirical insight, that fully convolutional generative models, trained on small images, are able to generalize to images of arbitrary size, and extend BB-ANS to hierarchical latent variable models, enabling state-of-the-art lossless compression of full-size colour images from the ImageNet dataset. We describe `Craystack', a modular software framework which we have developed for rapid prototyping of compression using deep generative models.
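The rate claim behind bits-back coding can be checked numerically on a toy discrete model (an illustration under assumed toy distributions, not the paper's VAE): encoding x via a latent z costs -log2 p(x|z) - log2 p(z) bits, but decoding z from the ANS stack under q(z|x) recovers log2 1/q(z|x) bits, so the expected net rate is the negative ELBO in bits, which equals -log2 p(x) exactly when q is the true posterior.

```python
import math

# Toy model: latent z in {0,1}, observation x in {0,1}.
p_z = [0.5, 0.5]                         # prior p(z)
p_x_given_z = [[0.9, 0.1], [0.2, 0.8]]   # likelihood p(x|z)

def posterior(x):
    # Exact posterior q(z|x) = p(z) p(x|z) / p(x), and the marginal p(x).
    joint = [p_z[z] * p_x_given_z[z][x] for z in (0, 1)]
    px = sum(joint)
    return [j / px for j in joint], px

def bits_back_rate(x, q=None):
    # Expected net bits-back code length under q(z|x):
    #   E_q[-log2 p(x|z) - log2 p(z) + log2 q(z|x)]  (negative ELBO, in bits)
    exact, px = posterior(x)
    if q is None:
        q = exact
    rate = sum(q[z] * (-math.log2(p_x_given_z[z][x])
                       - math.log2(p_z[z])
                       + math.log2(q[z]))
               for z in (0, 1))
    return rate, px

rate, px = bits_back_rate(0)
assert abs(rate - (-math.log2(px))) < 1e-9   # exact posterior: optimal rate
approx_rate, _ = bits_back_rate(0, q=[0.7, 0.3])
assert approx_rate > rate                     # mismatched q pays the KL gap
```

With an approximate posterior, the extra cost is exactly KL(q || p(z|x)), which is why better variational posteriors translate directly into better compression.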
The Traffic Phases of Road Networks
We study the relation between the average traffic flow and the vehicle
density on road networks that we call 2D-traffic fundamental diagram. We show
that this diagram presents mainly four phases. We analyze different cases.
First, the case of a junction managed with a priority rule is presented; four
traffic phases are identified and described, and a good analytic approximation
of the fundamental diagram is obtained by computing a generalized eigenvalue of
the dynamics of the system. Then, the model is extended to the case of two
junctions, and finally to a regular city. The system still presents mainly four
phases. The role of a critical circuit of non-priority roads appears clearly in
the two junctions case. In Section 4, we use traffic light controls to improve
the traffic diagram. We present the improvements obtained by open-loop, local
feedback, and global feedback strategies. A comparison based on the response
times to reach the stationary regime is also given. Finally, we show the
importance of the design of the junction: if the junction is large enough, the
traffic is hardly slowed down by it.
Comment: 37 pages
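The shape of a fundamental diagram (flow versus density) is easy to reproduce with a much simpler model than the paper's junction dynamics: a rule-184 cellular automaton on a ring road, where a car advances one cell per step iff the cell ahead is empty. This is only an illustrative sketch of the flow-density relation, not the authors' model; it yields the classic triangular diagram, flow ≈ min(ρ, 1-ρ), with free flow below density 1/2 and congestion above.

```python
import random

def simulate_flow(density, n_cells=100, steps=800, warmup=300):
    # Rule-184 CA on a ring: a car moves forward iff the next cell is empty.
    # Returns the average flow (moves per cell per time step) after warmup.
    n_cars = int(density * n_cells)
    road = [1] * n_cars + [0] * (n_cells - n_cars)
    random.shuffle(road)
    moved = 0
    for t in range(steps):
        nxt = road[:]
        step_moves = 0
        for i in range(n_cells):
            j = (i + 1) % n_cells
            if road[i] == 1 and road[j] == 0:  # read old state, write new
                nxt[i], nxt[j] = 0, 1
                step_moves += 1
        road = nxt
        if t >= warmup:
            moved += step_moves
    return moved / ((steps - warmup) * n_cells)

random.seed(0)
for rho in (0.2, 0.5, 0.8):
    print(rho, simulate_flow(rho))
```

Below density 1/2 every car eventually moves each step (flow ≈ ρ); above it, by particle-hole symmetry, flow ≈ 1 - ρ. Junction rules, as the abstract notes, add further phases beyond this single-road picture.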
Language Modeling Is Compression
It has long been established that predictive models can be transformed into
lossless compressors and vice versa. Incidentally, in recent years, the machine
learning community has focused on training increasingly large and powerful
self-supervised (language) models. Since these large language models exhibit
impressive predictive capabilities, they are well-positioned to be strong
compressors. In this work, we advocate for viewing the prediction problem
through the lens of compression and evaluate the compression capabilities of
large (foundation) models. We show that large language models are powerful
general-purpose predictors and that the compression viewpoint provides novel
insights into scaling laws, tokenization, and in-context learning. For example,
Chinchilla 70B, while trained primarily on text, compresses ImageNet patches to
43.4% and LibriSpeech samples to 16.4% of their raw size, beating
domain-specific compressors like PNG (58.5%) or FLAC (30.3%), respectively.
Finally, we show that the prediction-compression equivalence allows us to use
any compressor (like gzip) to build a conditional generative model
- …
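The prediction-compression equivalence the abstract describes can be made concrete without a full arithmetic coder: coding a symbol with an arithmetic coder driven by a predictive model costs essentially -log2 p(symbol | context) bits, so summing those log losses gives the (ideal) compressed size. The sketch below uses a tiny adaptive order-0 byte model as the predictor, purely for illustration; the paper's point is that a stronger predictor (e.g. an LLM) plugged into the same formula yields a stronger compressor.

```python
import math
from collections import defaultdict

def predictive_code_length(data):
    # Ideal code length (bits) from arithmetic-coding `data` with an
    # adaptive order-0 model: each byte costs -log2 p(byte | bytes so far),
    # with Laplace-smoothed counts over the 256 possible byte values.
    counts = defaultdict(int)
    total, bits = 0, 0.0
    for b in data:
        p = (counts[b] + 1) / (total + 256)
        bits += -math.log2(p)
        counts[b] += 1
        total += 1
    return bits

text = b"abracadabra " * 50
ratio = predictive_code_length(text) / (8 * len(text))
print(f"compression ratio vs raw: {ratio:.3f}")  # well below 1 for repetitive data
```

Better predictions mean shorter codes, and running the construction in reverse (feeding a compressor's implied probabilities into a sampler) is what lets any compressor act as a conditional generative model.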