Arithmetic coding revisited
Over the last decade, arithmetic coding has emerged as an important compression tool. It is now the method of choice for adaptive coding on multisymbol alphabets because of its speed,
low storage requirements, and effectiveness of compression. This article describes a new implementation of arithmetic coding that incorporates several improvements over a widely used earlier version by Witten, Neal, and Cleary, which has become a de facto standard. These improvements include fewer multiplicative operations, greatly extended range of alphabet sizes and symbol probabilities, and the use of low-precision arithmetic, permitting implementation by fast shift/add operations. We also describe a modular structure that separates the coding, modeling, and probability estimation components of a compression system. To motivate the improved coder, we consider the needs of a word-based text compression program. We report a range of experimental results using this and other models. Complete source code is available
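As a rough illustration of the interval-narrowing idea behind arithmetic coding, the sketch below encodes a short message under a fixed (non-adaptive) model using exact fractions. It is not the implementation described in the article, which uses low-precision shift/add arithmetic and adaptive models; the function name `narrow_interval` and the example probabilities are assumptions made only for this example.

```python
from fractions import Fraction
import math

def narrow_interval(message, probs):
    """Return (low, width) of the final coding interval for `message`.

    `probs` maps each symbol to an exact probability (a static model,
    assumed here for simplicity; the article targets adaptive models).
    """
    # Assign each symbol the start of its slice of [0, 1), in sorted order.
    cum = {}
    total = Fraction(0)
    for sym in sorted(probs):
        cum[sym] = total
        total += probs[sym]

    low, width = Fraction(0), Fraction(1)
    for sym in message:
        # Shrink the current interval to the slice belonging to `sym`.
        low += width * cum[sym]
        width *= probs[sym]
    return low, width

# Hypothetical three-symbol model.
probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, width = narrow_interval("abac", probs)

# The final width equals the product of the symbol probabilities, so
# ceil(-log2(width)) + 1 bits are enough to name a point inside the interval.
bits = math.ceil(-math.log2(width)) + 1
print(low, width, bits)
```

Exact fractions keep the sketch easy to check by hand; a practical coder replaces them with fixed-width integers and renormalization.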
Convolutional Radio Modulation Recognition Networks
We study the adaptation of convolutional neural networks to the complex
temporal radio signal domain. We compare the efficacy of radio modulation
classification using naively learned features against using expert features
which are widely used in the field today, and we show significant performance
improvements. We show that blind temporal learning on large and densely encoded
time series using deep convolutional neural networks is viable and a strong
candidate approach for this task, especially at low signal-to-noise ratio
Distributed Binary Detection with Lossy Data Compression
Consider the problem where a statistician in a two-node system receives
rate-limited information from a transmitter about marginal observations of a
memoryless process generated from two possible distributions. Using its own
observations, this receiver is required to first identify the legitimacy of its
sender by declaring the joint distribution of the process, and then depending
on such authentication it generates the adequate reconstruction of the
observations satisfying an average per-letter distortion. The performance of
this setup is investigated through the corresponding rate-error-distortion
region describing the trade-off between: the communication rate, the error
exponent induced by the detection and the distortion incurred by the source
reconstruction. In the special case of testing against independence, where the
alternative hypothesis implies that the sources are independent, the optimal
rate-error-distortion region is characterized. An application example to binary
symmetric sources is given subsequently and the explicit expression for the
rate-error-distortion region is provided as well. The case of "general
hypotheses" is also investigated. A new achievable rate-error-distortion region
is derived based on the use of non-asymptotic binning, improving the quality of
communicated descriptions. Further improvement of performance in the general
case is shown to be possible when the requirement of source reconstruction is
relaxed, which stands in contrast to the case of testing against independence. Comment: to appear in IEEE Trans. Information Theor
On Prediction Using Variable Order Markov Models
This paper is concerned with algorithms for prediction of discrete sequences
over a finite alphabet, using variable order Markov models. The class of such
algorithms is large and in principle includes any lossless compression
algorithm. We focus on six prominent prediction algorithms, including Context
Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic
Suffix Trees (PSTs). We discuss the properties of these algorithms and compare
their performance using real life sequences from three domains: proteins,
English text and music pieces. The comparison is made with respect to
prediction quality as measured by the average log-loss. We also compare
classification algorithms based on these predictors with respect to a number of
large protein classification tasks. Our results indicate that a "decomposed"
CTW (a variant of the CTW algorithm) and PPM outperform all other algorithms in
sequence prediction tasks. Somewhat surprisingly, a different algorithm, which
is a modification of the Lempel-Ziv compression algorithm, significantly
outperforms all algorithms on the protein classification problems
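To make the average log-loss criterion concrete, here is a minimal sketch that scores a simple fixed-order Markov predictor with add-one smoothing. It is not one of the six algorithms compared in the paper (those are variable-order models); the function names, the smoothing choice, and the toy sequences are assumptions made only for illustration.

```python
import math
from collections import defaultdict

def train_counts(seq, order):
    """Count how often each symbol follows each context of length `order`."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(order, len(seq)):
        counts[seq[i - order:i]][seq[i]] += 1
    return counts

def average_log_loss(counts, seq, order, alphabet):
    """Average of -log2 P(symbol | context) over `seq`, in bits per symbol.

    Probabilities use add-one (Laplace) smoothing, an assumption of this
    sketch, so unseen symbols and contexts never get probability zero.
    """
    total = 0.0
    n = 0
    for i in range(order, len(seq)):
        ctx = counts.get(seq[i - order:i], {})
        num = ctx.get(seq[i], 0) + 1
        den = sum(ctx.values()) + len(alphabet)
        total += -math.log2(num / den)
        n += 1
    return total / n

alphabet = "ab"
model = train_counts("abababab", order=1)
loss = average_log_loss(model, "abab", order=1, alphabet=alphabet)
print(f"{loss:.4f} bits per symbol")
```

A lower value means the predictor assigns higher probability to the symbols that actually occur; an untrained model over a binary alphabet scores exactly 1 bit per symbol.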
Information-Theoretic Foundations of Mismatched Decoding
Shannon's channel coding theorem characterizes the maximal rate of
information that can be reliably transmitted over a communication channel when
optimal encoding and decoding strategies are used. In many scenarios, however,
practical considerations such as channel uncertainty and implementation
constraints rule out the use of an optimal decoder. The mismatched decoding
problem addresses such scenarios by considering the case that the decoder
cannot be optimized, but is instead fixed as part of the problem statement.
This problem is not only of direct interest in its own right, but also has
close connections with other long-standing theoretical problems in information
theory. In this monograph, we survey both classical literature and recent
developments on the mismatched decoding problem, with an emphasis on achievable
random-coding rates for memoryless channels. We present two widely-considered
achievable rates known as the generalized mutual information (GMI) and the LM
rate, and overview their derivations and properties. In addition, we survey
several improved rates via multi-user coding techniques, as well as recent
developments and challenges in establishing upper bounds on the mismatch
capacity, and an analogous mismatched encoding problem in rate-distortion
theory. Throughout the monograph, we highlight a variety of applications and
connections with other prominent information theory problems. Comment: Published in Foundations and Trends in Communications and Information
Theory (Volume 17, Issue 2-3
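For orientation, the GMI mentioned above is commonly written as follows for a memoryless channel with input distribution $Q$ and a fixed decoding metric $q(x,y)$. The notation is an assumption of this note and may differ from the monograph's own.

```latex
I_{\mathrm{GMI}}(Q) \;=\; \sup_{s \ge 0}\;
\mathbb{E}\!\left[ \log
\frac{q(X,Y)^{s}}{\sum_{\bar{x}} Q(\bar{x})\, q(\bar{x},Y)^{s}}
\right],
\qquad (X,Y) \sim Q(x)\, W(y \mid x)
```

Here $W$ denotes the true channel transition law. When $q(x,y) = W(y \mid x)$, taking $s = 1$ recovers the mutual information $I(X;Y)$.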
An Assistive Multimedia Courseware for Dyslexics
One of the most promising areas of education is the development of computer-based
teaching materials, especially interactive multimedia programs. Interactive multimedia allows
independent and interactive learning while presenting the learning material to learners in
new, engaging, and meaningful ways. This paper presents the theoretical concepts and design of a
multimedia courseware called "MyLexic". "MyLexic" is the first learning tool to nurture interest in
basic Malay-language reading among preschool dyslexic children in Malaysia. The theoretical
framework proposed in the study is based on research in dyslexia theory together with Dual Coding Theory,
Structured Multi-sensory Phonic Teaching, and the Scaffolding instructional technique. Detailed
explanations of its learning content are also given. The courseware is hoped to contribute a
significant idea to the development of technology in Malay-language education for dyslexics in
Malaysia