Source Coding for Quasiarithmetic Penalties
Huffman coding finds a prefix code that minimizes mean codeword length for a
given probability distribution over a finite number of items. Campbell
generalized the Huffman problem to a family of problems in which the goal is to
minimize not mean codeword length but rather a generalized mean known as a
quasiarithmetic or quasilinear mean. Such generalized means have a number of
diverse applications, including applications in queueing. Several
quasiarithmetic-mean problems have novel simple redundancy bounds in terms of a
generalized entropy. A related property involves the existence of optimal
codes: For "well-behaved" cost functions, optimal codes always exist for
(possibly infinite-alphabet) sources having finite generalized entropy. Solving
finite instances of such problems is done by generalizing an algorithm for
finding length-limited binary codes to a new algorithm for finding optimal
binary codes for any quasiarithmetic mean with a convex cost function. This
algorithm can be performed using quadratic time and linear space, and can be
extended to other penalty functions, some of which are solvable with similar
space and time complexity, and others of which are solvable with slightly
greater complexity. This reduces the computational complexity of a problem
involving minimum delay in a queue, allows combinations of previously
considered problems to be optimized, and greatly expands the space of problems
solvable in quadratic time and linear space. The algorithm can be extended for
purposes such as breaking ties among possibly different optimal codes, as with
bottom-merge Huffman coding.

Comment: 22 pages, 3 figures; submitted to IEEE Trans. Inform. Theory, revised per a reader's suggestions.
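As a concrete illustration of the generalized means involved, here is a minimal Python sketch (not from the paper) of Campbell's quasiarithmetic penalty, phi_inv(sum_i p_i * phi(l_i)), with the exponential cost as one instance. The probability distribution, code lengths, and trade-off parameter t below are illustrative assumptions:

```python
import math

def quasiarithmetic_mean(lengths, probs, phi, phi_inv):
    """Campbell's quasiarithmetic penalty: phi_inv(sum p_i * phi(l_i)).
    For phi(x) = x this reduces to ordinary mean codeword length."""
    return phi_inv(sum(p * phi(l) for p, l in zip(probs, lengths)))

# Exponential cost phi(l) = 2**(t*l) gives the exponential-average penalty
# (1/t) * log2(sum p_i * 2**(t*l_i)); t is a hypothetical trade-off parameter.
t = 1.0
phi = lambda l: 2.0 ** (t * l)
phi_inv = lambda y: math.log2(y) / t

probs = [0.5, 0.25, 0.25]   # illustrative source distribution
lengths = [1, 2, 2]         # lengths of an optimal binary prefix code for it

linear = quasiarithmetic_mean(lengths, probs, lambda x: x, lambda y: y)
exponential = quasiarithmetic_mean(lengths, probs, phi, phi_inv)
print(linear)       # ordinary mean length: 1.5
print(exponential)  # exponential penalty, here log2(3) > 1.5
```

For convex phi, penalties like this one are exactly the family the abstract's quadratic-time, linear-space algorithm addresses.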
An implementation of Deflate in Coq
The widely-used compression format "Deflate" is defined in RFC 1951 and is
based on prefix-free codings and backreferences. Several points about the way
these codings are specified are unclear, and the standard contains multiple
sources of confusion. We address this by giving a rigorous mathematical
specification, which we formalized in Coq. We produced a verified
implementation in Coq which achieves competitive performance on inputs of
several megabytes. In this paper we present the several parts of our
implementation: a fully verified implementation of canonical prefix-free
codings, which can be used in other compression formats as well, and an elegant
formalism for specifying sophisticated formats, which we used to implement both
a compression and decompression algorithm in Coq which we formally prove
inverse to each other -- the first time this has been achieved to our
knowledge. Compatibility with other Deflate implementations is shown
empirically. We furthermore discuss some of the difficulties, specifically
regarding memory and runtime requirements, and our approaches to overcoming them.
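The canonical prefix-free codings mentioned above can be built purely from code lengths, following the construction in RFC 1951, section 3.2.2. A minimal Python sketch of that procedure (independent of the paper's Coq implementation):

```python
def canonical_codes(lengths):
    """Assign canonical prefix codes from code lengths, per RFC 1951
    section 3.2.2. A length of 0 marks an unused symbol."""
    max_len = max(lengths)
    # Count how many codes exist at each bit length.
    bl_count = [0] * (max_len + 1)
    for l in lengths:
        if l:
            bl_count[l] += 1
    # Compute the smallest code value for each bit length.
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + bl_count[bits - 1]) << 1
        next_code[bits] = code
    # Assign consecutive codes to symbols in order, within each length.
    codes = []
    for l in lengths:
        if l:
            codes.append(format(next_code[l], f"0{l}b"))
            next_code[l] += 1
        else:
            codes.append(None)
    return codes

# RFC 1951's own example: lengths (3,3,3,3,3,2,4,4) for symbols A..H
print(canonical_codes([3, 3, 3, 3, 3, 2, 4, 4]))
```

Because the code is fully determined by the length sequence, only the lengths need to be transmitted, which is what makes canonical codings attractive for formats like Deflate.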
The map equation
Many real-world networks are so large that we must simplify their structure
before we can extract useful information about the systems they represent. As
the tools for doing these simplifications proliferate within the network
literature, researchers would benefit from some guidelines about which of the
so-called community detection algorithms are most appropriate for the
structures they are studying and the questions they are asking. Here we show
that different methods highlight different aspects of a network's structure and
that the sort of information we seek to extract about the system must
guide us in our decision. For example, many community detection algorithms,
including the popular modularity maximization approach, infer module
assignments from an underlying model of the network formation process. However,
we are not always as interested in how a system's network structure was formed,
as we are in how a network's extant structure influences the system's behavior.
To see how structure influences current behavior, we will recognize that links
in a network induce movement across the network and result in system-wide
interdependence. In doing so, we explicitly acknowledge that most networks
carry flow. To highlight and simplify the network structure with respect to
this flow, we use the map equation. We present an intuitive derivation of this
flow-based and information-theoretic method and provide an interactive on-line
application that anyone can use to explore the mechanics of the map equation.
We also describe an algorithm and provide source code to efficiently decompose
large weighted and directed networks based on the map equation.

Comment: 9 pages and 3 figures, corrected typos. For the associated Flash application, see http://www.tp.umu.se/~rosvall/livemod/mapequation
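To make the mechanics concrete: at two levels, the map equation scores a partition by an index-codebook term (the entropy of module exit rates, weighted by total exit rate) plus one module-codebook term per module (the entropy over that module's exit rate and node visit rates, weighted by their sum). The sketch below, with hypothetical flow rates standing in for the stationary flow of a random walk, is an illustrative assumption, not the authors' implementation:

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a normalized distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def map_equation(modules):
    """Two-level map equation: L(M) = q*H(Q) + sum_m p_m * H(P_m).

    `modules` is a list of (exit_rate, node_visit_rates) pairs; the
    rates are hypothetical inputs that would normally come from the
    stationary flow of a random walk on the network."""
    q_total = sum(exit for exit, _ in modules)
    # Index-codebook term: entropy of normalized module exit rates.
    L = q_total * entropy([exit / q_total for exit, _ in modules]) if q_total else 0.0
    # One module codebook per module, over its exit rate and node rates.
    for exit, visits in modules:
        p_m = exit + sum(visits)
        L += p_m * entropy([r / p_m for r in [exit] + visits])
    return L

# Toy example: two symmetric modules with assumed exit and visit rates.
modules = [(0.1, [0.25, 0.25]), (0.1, [0.25, 0.25])]
print(map_equation(modules))  # description length in bits, about 1.98 here
```

A good partition confines the flow within modules, shrinking the per-module codebooks faster than the index codebook grows, so minimizing L(M) over partitions reveals the flow-based community structure.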
Parallel data compression
Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods that boost system speed by coding data concurrently, and approaches that employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested.
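One of the simplest strategies in this space, coding independent blocks of data concurrently, can be sketched in a few lines of Python. The block size and worker count are illustrative assumptions, and zlib stands in for any sequential compressor:

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_blocks(data, block_size=1 << 16, workers=4):
    """Block-parallel compression sketch: split the input into fixed-size
    blocks and compress them concurrently. Independent blocks trade a
    little compression ratio for parallel speedup."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, blocks))

def decompress_blocks(compressed):
    """Invert compress_blocks by decompressing and concatenating blocks."""
    return b"".join(zlib.decompress(b) for b in compressed)

payload = b"parallel data compression " * 10000
assert decompress_blocks(compress_blocks(payload)) == payload
```

Threads suffice here because zlib releases the interpreter lock while compressing; the ratio loss comes from each block starting with an empty history, one of the speed-versus-ratio trade-offs the survey discusses.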