86 research outputs found
Low-Complexity Vector Source Coding for Discrete Long Sequences with Unknown Distributions
In this paper, we propose a source coding scheme that represents data from
unknown distributions through frequency and support information. Existing
encoding schemes often compress data by sacrificing computational efficiency or
by assuming the data follows a known distribution. We take advantage of the
structure that arises within the spatial representation and utilize it to
encode run-lengths within this representation using Golomb coding. Through
theoretical analysis, we show that our scheme yields an overall bit rate that
nears entropy without a computationally complex encoding algorithm and verify
these results through numerical experiments.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
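As a rough illustration of the run-length idea mentioned in the abstract above, the following sketch Golomb-codes the zero runs that precede each 1 in a binary vector. It is a generic textbook Golomb coder, not the scheme proposed in the paper; the function names (`golomb_encode`, `golomb_decode`, `zero_runs`) and the choice of parameter `m = 3` are assumptions made only for this example.

```python
def golomb_encode(n: int, m: int) -> str:
    """Golomb-code a non-negative integer n with parameter m >= 1.

    The quotient n // m is written in unary (q ones, then a zero) and the
    remainder n % m in truncated binary.
    """
    if n < 0 or m < 1:
        raise ValueError("need n >= 0 and m >= 1")
    q, r = divmod(n, m)
    bits = "1" * q + "0"
    b = (m - 1).bit_length()      # ceil(log2(m)); 0 when m == 1
    if b == 0:
        return bits               # m == 1: the remainder is always 0
    cutoff = (1 << b) - m         # remainders below cutoff use b - 1 bits
    if r < cutoff:
        return bits + format(r, "b").zfill(b - 1)
    return bits + format(r + cutoff, "b").zfill(b)


def golomb_decode(bits: str, pos: int, m: int):
    """Decode one codeword starting at bits[pos]; return (value, next_pos)."""
    q = 0
    while bits[pos] == "1":
        q += 1
        pos += 1
    pos += 1                      # skip the terminating zero of the unary part
    b = (m - 1).bit_length()
    if b == 0:
        return q, pos
    cutoff = (1 << b) - m
    r = int(bits[pos:pos + b - 1], 2) if b > 1 else 0
    pos += b - 1
    if r >= cutoff:               # long codeword: read one more bit
        r = ((r << 1) | int(bits[pos])) - cutoff
        pos += 1
    return q * m + r, pos


def zero_runs(vector):
    """Lengths of the zero runs that precede each 1 (trailing zeros ignored)."""
    runs, run = [], 0
    for bit in vector:
        if bit:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs


# Hypothetical input vector, chosen only for illustration.
runs = zero_runs([0, 0, 1, 0, 0, 0, 0, 1, 1])
code = "".join(golomb_encode(r, 3) for r in runs)
```

Decoding walks the bit string one codeword at a time with `golomb_decode`, which should recover the same run lengths. In practice `m` would be tuned to the run-length statistics; the value here is arbitrary.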
Design of sequences with good correlation properties
This thesis is dedicated to exploring sequences with good correlation properties. Periodic sequences with desirable correlation properties have numerous applications in communications. Ideally, one would like to have a set of sequences whose out-of-phase auto-correlation magnitudes and cross-correlation magnitudes are very small, preferably zero. However, theoretical bounds show that the maximum magnitudes of auto-correlation and cross-correlation of a sequence set are mutually constrained, i.e., if a set of sequences possesses good auto-correlation properties, then the cross-correlation properties are not good and vice versa. The design of sequence sets that achieve those theoretical bounds is therefore of great interest. In addition, instead of pursuing the least possible correlation values within an entire period, it is also interesting to investigate families of sequences with ideal correlation in a smaller zone around the origin. Such sequences are referred to as sequences with zero correlation zone or ZCZ sequences, which have been extensively studied due to their applications in 4G LTE and 5G NR systems, as well as quasi-synchronous code-division multiple-access communication systems.
Paper I and a part of Paper II aim to construct sequence sets with low correlation within a whole period. Paper I presents a construction of sequence sets that meets the Sarwate bound. The construction builds a connection between generalised Frank sequences and combinatorial objects called circular Florentine arrays. The size of the sequence sets is determined by the existence of circular Florentine arrays of some order. Paper II further connects circular Florentine arrays to a unified construction of perfect polyphase sequences, which include generalised Frank sequences as a special case. The size of a sequence set that meets the Sarwate bound depends on a divisor of the period of the employed sequences, as well as on the existence of circular Florentine arrays.
Papers III-VI and a part of Paper II are devoted to ZCZ sequences.
Papers II and III propose infinite families of optimal ZCZ sequence sets with respect to some bound, which are used to eliminate interference within a single cell in a cellular network. Papers V, VI and a part of Paper II focus on constructions of multiple optimal ZCZ sequence sets with favorable inter-set cross-correlation, which can be used in multi-user communication environments to minimize inter-cell interference. In particular, Paper II employs circular Florentine arrays and improves the number of optimal ZCZ sequence sets with optimal inter-set cross-correlation in some cases.
Doctoral thesis
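To make the notion of "ideal" periodic autocorrelation in the abstract above concrete, here is a minimal sketch that builds a classical Frank sequence of length n^2 and evaluates its periodic correlation. It shows only the classical construction, not the generalised Frank sequences or the Florentine-array constructions of the thesis; the function names and the choice n = 4 are assumptions for illustration.

```python
import cmath


def frank_sequence(n: int):
    """Classical Frank sequence of length n*n: s[i*n + j] = exp(2*pi*1j*i*j/n)."""
    return [cmath.exp(2j * cmath.pi * ((i * j) % n) / n)
            for i in range(n) for j in range(n)]


def periodic_correlation(a, b, shift: int) -> complex:
    """Periodic correlation of a and b at the given shift:
    sum over t of a[(t + shift) mod L] * conj(b[t])."""
    length = len(a)
    return sum(a[(t + shift) % length] * b[t].conjugate()
               for t in range(length))


seq = frank_sequence(4)
# Magnitudes of the periodic autocorrelation at every shift of the period.
magnitudes = [abs(periodic_correlation(seq, seq, s)) for s in range(len(seq))]
```

For a perfect sequence, the magnitude at shift 0 equals the sequence length and every out-of-phase magnitude vanishes up to floating-point error. The same `periodic_correlation` helper can be applied to two different sequences to inspect cross-correlation.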
The Weight Distributions of a Class of Cyclic Codes with Three Nonzeros over F3
Cyclic codes have efficient encoding and decoding algorithms. The decoding
error probability and the undetected error probability are usually bounded by
or given by the weight distributions of the codes. Most research determines
the weight distributions of cyclic codes with few nonzeros by using quadratic
forms and exponential sums, but is limited to low moments. In this paper, we
focus on applying higher moments of the exponential sum to determine the weight
distributions of a class of ternary cyclic codes with three nonzeros, combining
them with both quadratic forms and the MacWilliams identities. The paper also
highlights the use of the computer algebra system Magma for investigating the
higher moments. Finally, the result is verified on one example using Matlab.
Comment: 10 pages, 3 tables
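The MacWilliams identities mentioned above relate the weight distribution of a linear code to that of its dual. The sketch below applies them through Krawtchouk polynomials to a toy ternary code; it is not the cyclic code family studied in the paper, and the function names and the example (the length-3 ternary repetition code) are assumptions chosen so the result can be checked by hand.

```python
from math import comb


def krawtchouk(j: int, i: int, n: int, q: int) -> int:
    """Krawtchouk polynomial K_j(i) for length n over an alphabet of size q."""
    return sum((-1) ** s * (q - 1) ** (j - s) * comb(i, s) * comb(n - i, j - s)
               for s in range(j + 1))


def dual_weight_distribution(weights, q: int):
    """Weight distribution of the dual code via the MacWilliams identities.

    weights[i] is the number of codewords of Hamming weight i, so the code
    length is len(weights) - 1 and the code size is sum(weights).
    """
    n = len(weights) - 1
    size = sum(weights)
    dual = []
    for j in range(n + 1):
        total = sum(a * krawtchouk(j, i, n, q) for i, a in enumerate(weights))
        dual.append(total // size)   # the division is exact for a linear code
    return dual


# Toy example: the ternary repetition code {000, 111, 222}.
repetition = [1, 0, 0, 2]
dual = dual_weight_distribution(repetition, q=3)
```

The dual of this repetition code is the set of length-3 ternary vectors whose coordinates sum to zero, so the computed distribution can be compared against a direct enumeration of those nine vectors.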
Postings List Compression with Run-length and Zombit Encodings
An inverted index is a core index structure for systems such as search engines and databases.
It stores a mapping from terms, numbers, etc. to a list of locations in a document, set of documents, database, table, etc., and allows efficient full-text searches on the indexed structure.
The list of locations mapped to an entry of the inverted index is usually called a postings list.
In real-life applications, inverted indices can grow very large.
An efficient representation is therefore needed, but efficient queries must be supported at the same time.
This thesis explores ways to represent postings lists efficiently while allowing efficient nextGEQ queries on the set.
Efficient nextGEQ queries are needed to implement inverted indices.
First, we convert the postings lists into a single bitvector that concatenates the characteristic bitvector of each postings list.
Representing an integer set efficiently then reduces to representing this bitvector efficiently, which is expected to have long runs of 0s and 1s.
Run-length encoding of bitvectors has recently led to promising results.
Therefore, in this thesis we experiment with two encoding methods (Top-k Hybrid coder, RLZ) that encode postings lists via run-length encoding of the bitvector.
We also investigate a new bitvector compression method (Zombit-vector), which encodes bitvectors by finding redundancies in runs of 0s and 1s.
We compare all encodings to the current state-of-the-art Partitioned Elias-Fano (PEF) coding.
All encodings compressed more efficiently than the current state-of-the-art PEF encoding.
Zombit-vector nextGEQ queries were slightly more efficient than PEF's, which makes it more attractive for bitvectors that have long runs of 0s and 1s.
More work is needed on the Top-k Hybrid coder and RLZ so that nextGEQ queries on those encodings can be compared to Zombit-vector and PEF.
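The pipeline described above (postings list, characteristic bitvector, runs, nextGEQ) can be sketched in a few lines. This is a plain run-length representation with binary search, not the Top-k Hybrid coder, RLZ, Zombit-vector, or PEF; it assumes nextGEQ(x) means "the smallest stored element greater than or equal to x", and the names `characteristic_bitvector`, `one_runs`, and `RunLengthSet` are hypothetical.

```python
from bisect import bisect_right
from itertools import groupby


def characteristic_bitvector(postings, universe: int) -> str:
    """Bit i is '1' exactly when i is in the postings list (0 <= i < universe)."""
    members = set(postings)
    return "".join("1" if i in members else "0" for i in range(universe))


def one_runs(bits: str):
    """Return (start, length) for every maximal run of 1s in the bitvector."""
    runs, pos = [], 0
    for bit, group in groupby(bits):
        length = len(list(group))
        if bit == "1":
            runs.append((pos, length))
        pos += length
    return runs


class RunLengthSet:
    """Integer set stored as runs of consecutive elements."""

    def __init__(self, bits: str):
        self.runs = one_runs(bits)
        self.starts = [start for start, _ in self.runs]

    def next_geq(self, x: int):
        """Smallest element >= x, or None if there is no such element."""
        i = bisect_right(self.starts, x) - 1   # last run starting at or before x
        if i >= 0:
            start, length = self.runs[i]
            if x < start + length:
                return x                        # x itself lies inside a run
        if i + 1 < len(self.runs):
            return self.starts[i + 1]           # first element of the next run
        return None


# Hypothetical postings list over a universe of 10 document identifiers.
bits = characteristic_bitvector([2, 3, 4, 8, 9], universe=10)
index = RunLengthSet(bits)
```

Each query costs one binary search over the run starts, so long runs of 0s and 1s make both the representation and the search small. A compressed encoding would additionally store the run lengths compactly, which this sketch does not attempt.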