86 research outputs found

    Low-Complexity Vector Source Coding for Discrete Long Sequences with Unknown Distributions

    In this paper, we propose a source coding scheme that represents data from unknown distributions through frequency and support information. Existing encoding schemes often compress data by sacrificing computational efficiency or by assuming the data follows a known distribution. We take advantage of the structure that arises within the spatial representation and utilize it to encode run-lengths within this representation using Golomb coding. Through theoretical analysis, we show that our scheme yields an overall bit rate that approaches the entropy without a computationally complex encoding algorithm, and we verify these results through numerical experiments. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
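As a rough illustration of the Golomb-coded run-lengths mentioned in the abstract, the sketch below encodes the lengths of zero runs in a binary sequence with a Golomb code. It is a generic textbook construction, not the scheme proposed in the paper: the function names, the choice of the divisor `m`, and the convention that a run is terminated by a 1 are all assumptions made for this example.

```python
from typing import List, Tuple


def golomb_encode(n: int, m: int) -> str:
    """Golomb codeword (as a bit string) of integer n >= 0 with divisor m >= 1."""
    if n < 0 or m < 1:
        raise ValueError("need n >= 0 and m >= 1")
    q, r = divmod(n, m)
    bits = "1" * q + "0"              # quotient in unary, terminated by a 0
    if m == 1:
        return bits                   # no remainder bits are needed
    b = (m - 1).bit_length()          # ceil(log2(m))
    cutoff = (1 << b) - m             # remainders below this use b - 1 bits
    if r < cutoff:
        bits += format(r, "0{}b".format(b - 1))
    else:
        bits += format(r + cutoff, "0{}b".format(b))
    return bits


def golomb_decode(bits: str, m: int, pos: int = 0) -> Tuple[int, int]:
    """Decode one codeword starting at pos; return (value, next position)."""
    q = 0
    while bits[pos] == "1":
        q += 1
        pos += 1
    pos += 1                          # skip the terminating 0 of the unary part
    if m == 1:
        return q, pos
    b = (m - 1).bit_length()
    cutoff = (1 << b) - m
    r = int(bits[pos:pos + b - 1], 2) if b > 1 else 0
    pos += b - 1
    if r >= cutoff:                   # long codeword: read one more bit
        r = ((r << 1) | int(bits[pos])) - cutoff
        pos += 1
    return q * m + r, pos


def zero_run_lengths(symbols: List[int]) -> List[int]:
    """Lengths of the zero runs preceding each 1 (trailing zeros are ignored)."""
    runs, current = [], 0
    for s in symbols:
        if s == 0:
            current += 1
        else:
            runs.append(current)
            current = 0
    return runs


def encode_runs(runs: List[int], m: int) -> str:
    return "".join(golomb_encode(r, m) for r in runs)


def decode_runs(bits: str, m: int) -> List[int]:
    runs, pos = [], 0
    while pos < len(bits):
        value, pos = golomb_decode(bits, m, pos)
        runs.append(value)
    return runs
```

For geometrically distributed run lengths, a Golomb code with a suitably chosen `m` is known to be an optimal prefix code, which is one reason it is a common low-complexity choice for this kind of data.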

    Design of sequences with good correlation properties

    This thesis is dedicated to exploring sequences with good correlation properties. Periodic sequences with desirable correlation properties have numerous applications in communications. Ideally, one would like to have a set of sequences whose out-of-phase auto-correlation magnitudes and cross-correlation magnitudes are very small, preferably zero. However, theoretical bounds show that the maximum magnitudes of auto-correlation and cross-correlation of a sequence set are mutually constrained, i.e., if a set of sequences possesses good auto-correlation properties, then the cross-correlation properties are not good and vice versa. The design of sequence sets that achieve those theoretical bounds is therefore of great interest. In addition, instead of pursuing the least possible correlation values within an entire period, it is also interesting to investigate families of sequences with ideal correlation in a smaller zone around the origin. Such sequences are referred to as sequences with zero correlation zone or ZCZ sequences, which have been extensively studied due to their applications in 4G LTE and 5G NR systems, as well as quasi-synchronous code-division multiple-access communication systems. Paper I and a part of Paper II aim to construct sequence sets with low correlation within a whole period. Paper I presents a construction of sequence sets that meets the Sarwate bound. The construction builds a connection between generalised Frank sequences and combinatorial objects called circular Florentine arrays. The size of the sequence sets is determined by the existence of circular Florentine arrays of some order. Paper II further connects circular Florentine arrays to a unified construction of perfect polyphase sequences, which include generalised Frank sequences as a special case. The size of a sequence set that meets the Sarwate bound depends on a divisor of the period of the employed sequences, as well as the existence of circular Florentine arrays.
Papers III-VI and a part of Paper II are devoted to ZCZ sequences. Papers II and III propose infinite families of optimal ZCZ sequence sets with respect to some bound, which are used to eliminate interference within a single cell in a cellular network. Papers V, VI and a part of Paper II focus on constructions of multiple optimal ZCZ sequence sets with favorable inter-set cross-correlation, which can be used in multi-user communication environments to minimize inter-cell interference. In particular, Paper II employs circular Florentine arrays and improves the number of the optimal ZCZ sequence sets with optimal inter-set cross-correlation property in some cases. Doctoral thesis.
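To make "good correlation properties" concrete, the sketch below computes periodic correlations and checks that a classical Frank sequence of length N^2 is perfect, i.e. all of its out-of-phase periodic auto-correlation values vanish. It covers only the classical Frank sequence, not the generalised Frank sequences or the constructions of the thesis. The function names and the correlation convention (shift applied to the first argument, conjugate on the second) are assumptions for this example.

```python
import cmath


def frank_sequence(n):
    """Classical Frank sequence of length n*n: s[j*n + k] = exp(2*pi*i*j*k/n)."""
    return [cmath.exp(2j * cmath.pi * j * k / n)
            for j in range(n) for k in range(n)]


def periodic_correlation(x, y, shift):
    """Periodic correlation sum_t x[(t + shift) mod L] * conj(y[t])."""
    length = len(x)
    return sum(x[(t + shift) % length] * y[t].conjugate()
               for t in range(length))


def max_out_of_phase_autocorrelation(x):
    """Largest auto-correlation magnitude over all nonzero shifts."""
    return max(abs(periodic_correlation(x, x, s)) for s in range(1, len(x)))


if __name__ == "__main__":
    s = frank_sequence(4)
    print(abs(periodic_correlation(s, s, 0)))      # in-phase value: the sequence energy
    print(max_out_of_phase_autocorrelation(s))     # numerically zero for a perfect sequence
```

The same `periodic_correlation` helper can be applied to two different sequences to inspect cross-correlation, or restricted to small shifts to inspect a zero correlation zone.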

    The Weight Distributions of a Class of Cyclic Codes with Three Nonzeros over F3

    Cyclic codes have efficient encoding and decoding algorithms. The decoding error probability and the undetected error probability are usually bounded by or given from the weight distributions of the codes. Most research determines the weight distributions of cyclic codes with few nonzeros using quadratic forms and exponential sums, but is limited to low moments. In this paper, we focus on the application of higher moments of the exponential sum to determine the weight distributions of a class of ternary cyclic codes with three nonzeros, combining them with not only quadratic forms but also the MacWilliams identities. The paper also highlights the use of the computer algebra system Magma for investigating the higher moments. Finally, the result is verified on one example using MATLAB. Comment: 10 pages, 3 tables
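For readers unfamiliar with the term, the weight distribution of a code counts how many codewords have each Hamming weight. The sketch below computes it by brute-force enumeration for a small cyclic code over a prime field. This is not the exponential-sum method of the paper and it only scales to tiny codes. The function name is hypothetical, and the generator polynomial is assumed (not checked) to divide x^n - 1.

```python
from collections import Counter
from itertools import product


def cyclic_code_weight_distribution(g, n, q=3):
    """Weight distribution of the length-n cyclic code over F_q generated by g.

    g lists the coefficients of the generator polynomial, lowest degree first.
    Assumptions: q is prime (arithmetic is done mod q) and g divides x^n - 1.
    """
    k = n - (len(g) - 1)                       # code dimension
    # Generator matrix rows are the shifts x^i * g(x) for i = 0 .. k-1.
    rows = [[0] * i + list(g) + [0] * (n - len(g) - i) for i in range(k)]
    dist = Counter()
    for msg in product(range(q), repeat=k):    # enumerate all q^k messages
        word = [sum(m * row[j] for m, row in zip(msg, rows)) % q
                for j in range(n)]
        dist[sum(1 for c in word if c != 0)] += 1
    return dict(sorted(dist.items()))


if __name__ == "__main__":
    # Toy example: g(x) = 1 + x^2 divides x^4 - 1 over F_3.
    print(cyclic_code_weight_distribution([1, 0, 1], 4))
```

For the toy code the codewords are (a, b, a, b), so the result is `{0: 1, 2: 4, 4: 4}`. Enumeration costs q^k codewords, which is why analytic techniques are needed for code families of growing length.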

    Postings List Compression with Run-length and Zombit Encodings

    An inverted index is a core index structure underlying systems such as search engines and databases. It stores a mapping from terms, numbers, etc. to lists of locations in a document, a set of documents, a database, a table, etc., and allows efficient full-text searches over the indexed structure. The list of locations that an entry maps to is usually called a postings list. In real-life applications, inverted indices can grow very large. An efficient representation is therefore needed, but efficient queries must still be supported. This thesis explores ways to represent postings lists efficiently while allowing efficient nextGEQ queries on the set. Efficient nextGEQ queries are needed to implement inverted indices. First we convert the postings lists into one bitvector that concatenates each postings list's characteristic bitvector. Representing an integer set efficiently then reduces to representing this bitvector efficiently, and the bitvector is expected to have long runs of 0s and 1s. Run-length encoding of bitvectors has recently led to promising results. Therefore, in this thesis we experiment with two encoding methods (Top-k Hybrid coder, RLZ) that encode postings lists via run-length encoding of the bitvector. We also investigate another new bitvector compression method (Zombit-vector), which encodes bitvectors by finding redundancies in runs of 0s and 1s. We compare all encodings to the current state-of-the-art Partitioned Elias-Fano (PEF) coding. All of the encodings achieved better compression than the current state-of-the-art PEF encoding. Zombit-vector nextGEQ queries were slightly more efficient than PEF's, which makes it more attractive for bitvectors that have long runs of 0s and 1s. More work is needed on the Top-k Hybrid coder and RLZ so that nextGEQ on those encodings can be compared to Zombit-vector and PEF.
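The nextGEQ query returns the smallest element of the set that is greater than or equal to a given value. A minimal sketch of how it can be answered on a run-length view of the characteristic bitvector follows. It stores each maximal run of 1s as a (start, end) pair and uses binary search; it is not the Top-k Hybrid coder, RLZ, Zombit-vector, or PEF from the thesis, and the class and method names are hypothetical.

```python
from bisect import bisect_right


class RunLengthSet:
    """Sorted integer set stored as maximal runs of consecutive values."""

    def __init__(self, sorted_ints):
        # Assumes sorted_ints is strictly increasing.
        self.starts, self.ends = [], []
        for v in sorted_ints:
            if self.ends and v == self.ends[-1] + 1:
                self.ends[-1] = v              # extend the current run of 1s
            else:
                self.starts.append(v)          # open a new run
                self.ends.append(v)

    def next_geq(self, x):
        """Smallest element >= x, or None if there is no such element."""
        i = bisect_right(self.starts, x) - 1   # last run starting at or before x
        if i >= 0 and x <= self.ends[i]:
            return x                           # x lies inside run i
        if i + 1 < len(self.starts):
            return self.starts[i + 1]          # first element of the next run
        return None


if __name__ == "__main__":
    postings = RunLengthSet([3, 4, 5, 9, 12, 13, 14])
    print(postings.next_geq(6))   # 9
    print(postings.next_geq(13))  # 13
```

Storing run boundaries uncompressed keeps the example short; a practical encoder would compress the run lengths and add sampling so that the search does not need every boundary in memory.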