Properties and Construction of Polar Codes
Recently, Arıkan introduced the method of channel polarization, with which one can construct efficient capacity-achieving codes, called polar codes, for any binary discrete memoryless channel. In this thesis, we show that the decoding algorithm of polar codes, called successive cancellation decoding, can be regarded as belief propagation decoding on a tree graph, as used for decoding low-density parity-check codes. On the basis of this observation, we give an efficient construction method for polar codes using density evolution, which has been used to evaluate the error probability of belief propagation decoding on a tree graph. We further show that the channel polarization phenomenon and polar codes can be generalized to non-binary discrete memoryless channels. The asymptotic performance of non-binary polar codes built from non-binary matrices called Reed-Solomon matrices is better than that of the best explicitly known binary polar code. We also find that the Reed-Solomon matrices can be regarded as a natural generalization of the original binary channel polarization introduced by Arıkan.
Comment: Master's thesis. The supervisor is Toshiyuki Tanaka. 24 pages, 3 figures
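For the binary erasure channel, density-evolution-based construction of polar codes reduces to a closed-form recursion on erasure probabilities, since each application of Arıkan's 2x2 kernel maps a BEC(z) pair into a degraded BEC(2z - z^2) and an upgraded BEC(z^2). A minimal sketch under that BEC assumption (the general channels treated in the thesis require quantized density evolution instead; function names are illustrative):

```python
def polarize_bec(eps, n):
    """Erasure probabilities of the 2**n synthetic channels obtained by
    recursively applying Arikan's 2x2 kernel to a BEC(eps)."""
    probs = [eps]
    for _ in range(n):
        nxt = []
        for z in probs:
            nxt.append(2 * z - z * z)  # "minus" (degraded) channel
            nxt.append(z * z)          # "plus"  (upgraded) channel
        probs = nxt
    return probs

def construct_polar_code(eps, n, k):
    """Indices of the k most reliable synthetic channels (the information set)."""
    probs = polarize_bec(eps, n)
    return sorted(range(len(probs)), key=lambda i: probs[i])[:k]
```

Note that the total erasure probability is conserved at each stage, since (2z - z^2) + z^2 = 2z; polarization only redistributes it toward 0 and 1.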
Tree-Based Construction of LDPC Codes Having Good Pseudocodeword Weights
We present a tree-based construction of LDPC codes that have minimum
pseudocodeword weight equal to or almost equal to the minimum distance, and
perform well with iterative decoding. The construction involves enumerating a
-regular tree for a fixed number of layers and employing a connection
algorithm based on permutations or mutually orthogonal Latin squares to close
the tree. Methods are presented for degrees and , for a
prime. One class corresponds to the well-known finite-geometry and finite
generalized quadrangle LDPC codes; the other codes presented are new. We also
present some bounds on pseudocodeword weight for -ary LDPC codes. Treating
these codes as -ary LDPC codes rather than binary LDPC codes improves their
rates, minimum distances, and pseudocodeword weights, thereby giving a new
importance to the finite geometry LDPC codes where .
Comment: Submitted to Transactions on Information Theory. Submitted: Oct. 1, 2005; Revised: May 1, 2006, Nov. 25, 200
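The first step of the construction above, enumerating a d-regular tree layer by layer, can be sketched as follows (the specific degree values are elided in the abstract, so d is a generic parameter here; the connection algorithm that closes the tree via permutations or mutually orthogonal Latin squares is the paper's contribution and is not reproduced):

```python
def enumerate_regular_tree(d, layers):
    """Breadth-first enumeration of a d-regular tree: the root has d
    children and every other internal node has d-1 children, so each
    non-root node has total degree d. Returns adjacency lists and the
    final layer of leaves (the nodes the connection algorithm must close)."""
    adj = {0: []}
    frontier = [0]
    next_id = 1
    for _ in range(layers):
        new_frontier = []
        for v in frontier:
            n_children = d if v == 0 else d - 1
            for _ in range(n_children):
                adj[v].append(next_id)
                adj[next_id] = [v]
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return adj, frontier
```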
Less redundant codes for variable size dictionaries
We report on a family of variable-length codes with less redundancy than the flat code used in most variable-size dictionary-based compression methods. The length of codes belonging to this family is still bounded above by ⌈log₂ |D|⌉, where |D| denotes the dictionary size. We describe three of these codes, namely, the balanced code, the phase-in-binary code (PB), and the depth-span code (DS). As the name implies, the balanced code is constructed from a height-balanced tree, so it has the shortest average codeword length. The coding tree for the PB code has the interesting property that it is made of full binary phases, so the code can be computed efficiently using simple binary shift operations. The DS coding tree is maintained in such a way that the coder always finds the longest extendable codeword and extends it until it reaches the maximum length; it is optimal with respect to the code-length contrast. The PB and balanced codes achieve similar improvements, around 3% to 7%, which is close to the relative redundancy of the flat code. The DS code is particularly good at dealing with files containing a large amount of redundancy, such as a running sequence of one symbol. We also carried out an empirical study of the codeword distribution in the LZW dictionary and propose a scheme called dynamic block shifting (DBS) to further improve the codes' performance. Experiments suggest that DBS is helpful in compressing random sequences. From an application point of view, the PB code with DBS is recommended for general practical use.
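The phase-in-binary idea is closely related to the standard phased-in binary code: for an n-symbol dictionary with k = ⌈log₂ n⌉, the first u = 2^k - n symbols receive (k-1)-bit codewords and the rest receive k-bit codewords, which keeps every length at most ⌈log₂ |D|⌉ while shortening some codewords. A sketch under that assumption (the paper's tree-based construction may differ in detail):

```python
from math import ceil, log2

def phase_in_code(n):
    """Phased-in binary codewords for an n-symbol dictionary.
    The first u = 2**k - n symbols get (k-1)-bit codewords; the rest
    get k-bit codewords, where k = ceil(log2 n). The result is prefix-free
    because every k-bit codeword's leading k-1 bits encode a value >= u."""
    k = max(1, ceil(log2(n)))
    u = 2 ** k - n
    codes = []
    for i in range(n):
        if i < u:
            codes.append(format(i, f'0{k - 1}b'))      # short codeword
        else:
            codes.append(format(i + u, f'0{k}b'))      # full-length codeword
    return codes
```

For n = 5 this yields codewords of lengths 2, 2, 2, 3, 3, for an average of 2.4 bits against the flat code's 3.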
On the error probability of general tree and trellis codes with applications to sequential decoding
An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random binary tree codes is derived and shown to be independent of the length of the tree. An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random L-branch binary trellis codes of rate R = 1/n is derived which separates the effects of the tail length T and the memory length M of the code. It is shown that the bound is independent of the length L of the information sequence. This implication is investigated by computer simulations of sequential decoding utilizing the stack algorithm. These simulations confirm the implication and further suggest an empirical formula for the true undetected decoding error probability with sequential decoding.
Multiresolution vector quantization
Multiresolution source codes are data compression algorithms yielding embedded source descriptions. The decoder of a multiresolution code can build a source reproduction by decoding the embedded bit stream in part or in whole. All decoding procedures start at the beginning of the binary source description and decode some fraction of that string. Decoding a small portion of the binary string gives a low-resolution reproduction; decoding more yields a higher-resolution reproduction; and so on. Multiresolution vector quantizers are block multiresolution source codes. This paper introduces algorithms for designing fixed- and variable-rate multiresolution vector quantizers. Experiments on synthetic data demonstrate performance close to the theoretical limit. Experiments on natural images demonstrate performance improvements of up to 8 dB over tree-structured vector quantizers. Some of the lessons learned through multiresolution vector quantizer design lend insight into the design of more sophisticated multiresolution codes.
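The decode-any-prefix property of an embedded description can be illustrated with a toy scalar example: bisection coding of a value in [0, 1), where each successive bit halves the reproduction interval. This is only the embedding principle in one dimension; the paper's multiresolution vector quantizers are block codes:

```python
def embed(x, nbits):
    """Embedded binary description of x in [0, 1): each bit halves the
    current interval, so any prefix of the bit string decodes to a
    coarser reproduction of x."""
    bits = []
    lo, hi = 0.0, 1.0
    for _ in range(nbits):
        mid = (lo + hi) / 2
        if x >= mid:
            bits.append(1); lo = mid
        else:
            bits.append(0); hi = mid
    return bits

def decode_prefix(bits):
    """Reproduce x from any prefix: midpoint of the final interval,
    so the error is at most half the interval width, 2**-(len(bits)+1)."""
    lo, hi = 0.0, 1.0
    for b in bits:
        mid = (lo + hi) / 2
        if b:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Decoding only the first two bits already pins the value down to within 1/8; each additional bit halves the worst-case error, mirroring how a longer embedded prefix yields a higher-resolution reproduction.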
Optimal Prefix Codes with Fewer Distinct Codeword Lengths are Faster to Construct
A new method for constructing minimum-redundancy binary prefix codes is described. Our method does not explicitly build a Huffman tree; instead it uses a property of optimal prefix codes to compute the codeword lengths corresponding to the input weights. Let be the number of weights and be the number of distinct codeword lengths produced by the algorithm for the optimal codes. The running time of our algorithm is . Following our previous work in \cite{be}, no algorithm can construct optimal prefix codes in time. When the given weights are presorted, our algorithm performs comparisons.
Comment: 23 pages; a preliminary version appeared in STACS 200
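For context, a classical way to compute optimal codeword lengths from presorted weights is the two-queue Huffman method, which also avoids an explicit code tree by tracking only leaf depths. This is not the paper's algorithm (whose cost depends on the number of distinct codeword lengths); it is a baseline sketch, and the list concatenation below makes it worse than linear time:

```python
from collections import deque

def huffman_lengths(weights):
    """Optimal (Huffman) codeword lengths for weights sorted in
    nondecreasing order, via the two-queue method: leaves in one queue,
    merged internal nodes in the other, so the minimum is always at a
    queue front. Each item is (total_weight, [leaf indices under it])."""
    if len(weights) == 1:
        return [1]
    leaves = deque((w, [i]) for i, w in enumerate(weights))
    internal = deque()
    depth = [0] * len(weights)

    def pop_min():
        if not internal or (leaves and leaves[0][0] <= internal[0][0]):
            return leaves.popleft()
        return internal.popleft()

    while len(leaves) + len(internal) > 1:
        w1, s1 = pop_min()
        w2, s2 = pop_min()
        for i in s1 + s2:      # every leaf under the merged node
            depth[i] += 1      # moves one level deeper
        internal.append((w1 + w2, s1 + s2))
    return depth
```

The resulting lengths always satisfy the Kraft equality, i.e. the sum of 2^(-length) over all codewords is exactly 1.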
Solving Multiclass Learning Problems via Error-Correcting Output Codes
Multiclass learning problems involve finding a definition for an unknown
function f(x) whose range is a discrete set containing k > 2 values (i.e., k
``classes''). The definition is acquired by studying collections of training
examples of the form [x_i, f (x_i)]. Existing approaches to multiclass learning
problems include direct application of multiclass algorithms such as the
decision-tree algorithms C4.5 and CART, application of binary concept learning
algorithms to learn individual binary functions for each of the k classes, and
application of binary concept learning algorithms with distributed output
representations. This paper compares these three approaches to a new technique
in which error-correcting codes are employed as a distributed output
representation. We show that these output representations improve the
generalization performance of both C4.5 and backpropagation on a wide range of
multiclass learning tasks. We also demonstrate that this approach is robust
with respect to changes in the size of the training sample, the assignment of
distributed representations to particular classes, and the application of
overfitting avoidance techniques such as decision-tree pruning. Finally, we
show that---like the other methods---the error-correcting code technique can
provide reliable class probability estimates. Taken together, these results
demonstrate that error-correcting output codes provide a general-purpose method
for improving the performance of inductive learning programs on multiclass
problems.
Comment: See http://www.jair.org/ for any accompanying files
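The decoding step of the error-correcting output code approach can be sketched with a hypothetical 4-class code matrix (the matrix and names below are illustrative, not taken from the paper): each column defines one binary subproblem learned by its own classifier, and a test example is assigned the class whose row is nearest in Hamming distance to the vector of binary predictions.

```python
import numpy as np

# Hypothetical 4-class code with pairwise Hamming distance 4 between
# rows, so any single binary-classifier error is corrected.
CODE = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
])

def ecoc_decode(bit_predictions):
    """Return the class whose codeword is nearest in Hamming distance
    to the binary classifier outputs for one test example."""
    dists = (CODE != np.asarray(bit_predictions)).sum(axis=1)
    return int(dists.argmin())
```

With minimum row distance d, up to floor((d - 1) / 2) individual classifier errors are corrected, which is the sense in which the distributed output representation adds robustness.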