22 research outputs found

    Multiset and Set Decipherable Codes

    We extend some results of Lempel and Restivo on multiset decipherable codes to set decipherable codes.

    Note on Decipherability of Three-Word Codes

    The theory of uniquely decipherable (UD) codes has been widely developed in connection with automata theory, combinatorics on words, formal languages, and monoid theory. Recently, the concepts of multiset decipherable (MSD) and set decipherable (SD) codes were developed to handle some special problems in the transmission of information. Unique decipherability is a vital requirement in a wide range of coding applications where distinct sequences of code words carry different information. However, in several applications, it is necessary or desirable to communicate a description of a sequence of events where the information of interest is the set of possible events, including multiplicity, but where the order of occurrences is irrelevant. Suitable codes for these communication purposes need not possess the UD property, only the weaker MSD property. In other applications, the information of interest may be the presence or absence of possible events. The SD property is adequate for such codes. Lempel (1986) showed that the UD and MSD properties coincide for two-word codes and conjectured that every three-word MSD code is a UD code. Guzmán (1995) showed that the UD, MSD, and SD properties coincide for two-word codes and conjectured that these properties coincide for three-word codes. In an earlier paper (2001), Blanchet-Sadri answered both conjectures positively for all three-word codes {c1,c2,c3} satisfying |c1| = |c2| = |c3|. In this note, we answer both conjectures positively for other special three-word codes. Our procedures are based on techniques related to dominoes.

    Testing decipherability of directed figure codes with domino graphs

    Various kinds of decipherability of codes, weaker than unique decipherability, have been studied since the mid-1980s. We consider decipherability of directed figure codes, where directed figures are defined as labelled polyominoes with designated start and end points, equipped with a catenation operation that may use a merging function to resolve possible conflicts. This setting extends decipherability questions from words to 2D structures. In the present paper we develop a (variant of) domino graph that will allow us to decide some of the decipherability kinds by searching the graph for specific paths. Thus the main result characterizes directed figure decipherability by graph properties.

    Unique Decipherability in Formal Languages

    We consider several language-theoretic aspects of various notions of unique decipherability (or unique factorization) in formal languages. Given a language L at some position within the Chomsky hierarchy, we investigate the language of words UD(L) in L^* that have unique factorization over L. We also consider similar notions for weaker forms of unique decipherability, such as numerically decipherable words ND(L), multiset decipherable words MSD(L) and set decipherable words SD(L). Although these notions of unique factorization have been considered before, it appears that the languages of words having these properties have not been positioned in the Chomsky hierarchy up until now. We show that UD(L), ND(L), MSD(L) and SD(L) need not be context-free if L is context-free. In fact ND(L) and MSD(L) need not be context-free even if L is finite, although UD(L) and SD(L) are regular in this case. We show that if L is context-sensitive, then so are UD(L), ND(L), MSD(L) and SD(L). We also prove that the membership problem (resp., emptiness problem) for these classes is PSPACE-complete (resp., undecidable). We finally determine upper and lower bounds on the length of the shortest word of L^* not having the various forms of unique decipherability into elements of L.
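
    As an illustration of the word-level notions in the abstract above, the following Python sketch enumerates all factorizations of a word over a finite L by brute force and checks each property. The function names `factorizations` and `classify` are hypothetical, and reading ND as "all factorizations use the same number of code words" is an assumption that the abstract does not spell out. The paper's own constructions are not reproduced here.

```python
from collections import Counter


def factorizations(word, code):
    """Return every way to write `word` as a concatenation of words in `code`.

    Brute-force recursion; only suitable for short words and small finite codes.
    """
    if word == "":
        return [()]  # the empty word has exactly one (empty) factorization
    results = []
    for c in sorted(set(code)):
        if c and word.startswith(c):
            for rest in factorizations(word[len(c):], code):
                results.append((c,) + rest)
    return results


def classify(word, code):
    """Report which decipherability properties `word` has over `code`.

    Returns None when `word` has no factorization at all (it is not in L^*).
    """
    facts = factorizations(word, code)
    if not facts:
        return None
    return {
        # UD: exactly one factorization.
        "UD": len(facts) == 1,
        # ND (assumed definition): all factorizations have the same number of factors.
        "ND": len({len(f) for f in facts}) == 1,
        # MSD: all factorizations use the same multiset of code words.
        "MSD": len({frozenset(Counter(f).items()) for f in facts}) == 1,
        # SD: all factorizations use the same set of code words.
        "SD": len({frozenset(f) for f in facts}) == 1,
    }
```

    For L = {aa, aaa}, the word aaaaa has the two factorizations aa·aaa and aaa·aa, so under these definitions it is multiset and set decipherable but not uniquely decipherable, while aaaaaa = aa·aa·aa = aaa·aaa fails all four properties.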

    Optimal coding and the origins of Zipfian laws

    The problem of compression in standard information theory consists of assigning codes as short as possible to numbers. Here we consider the problem of optimal coding -- under an arbitrary coding scheme -- and show that it predicts Zipf's law of abbreviation, namely a tendency in natural languages for more frequent words to be shorter. We apply this result to investigate optimal coding also under so-called non-singular coding, a scheme where unique segmentation is not warranted but codes stand for a distinct number. Optimal non-singular coding predicts that the length of a word should grow approximately as the logarithm of its frequency rank, which is again consistent with Zipf's law of abbreviation. Optimal non-singular coding in combination with the maximum entropy principle also predicts Zipf's rank-frequency distribution. Furthermore, our findings on optimal non-singular coding challenge common beliefs about random typing. It turns out that random typing is in fact an optimal coding process, in stark contrast with the common assumption that it is detached from cost-cutting considerations. Finally, we discuss the implications of optimal coding for the construction of a compact theory of Zipfian laws and other linguistic laws. Comment: in press in the Journal of Quantitative Linguistics; definition of concordant pair corrected, proofs polished, references updated.
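
    The logarithmic growth mentioned in the abstract above can be seen in a small Python sketch: if the item of frequency rank r simply receives the r-th shortest non-empty string over the alphabet, no string is reused (non-singularity) and no shorter assignment exists. The names `shortlex` and `nonsingular_code` are hypothetical, and the binary alphabet is an assumption made for illustration; the paper's exact formulas are not reproduced here.

```python
from itertools import count, islice, product


def shortlex(alphabet):
    """Yield all non-empty strings over `alphabet`, shortest first."""
    for n in count(1):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)


def nonsingular_code(num_items, alphabet="01"):
    """Assign the r-th shortest string to the item of frequency rank r (1-based).

    The result is a list whose index r - 1 holds the code of rank r.
    Distinct ranks get distinct strings, but the code need not be
    uniquely segmentable.
    """
    return list(islice(shortlex(alphabet), num_items))


def binary_length(rank):
    """Length of the code of `rank` over a binary alphabet: floor(log2(rank + 1))."""
    return (rank + 1).bit_length() - 1
```

    With a binary alphabet, ranks 1 to 2 get length 1, ranks 3 to 6 get length 2, and ranks 7 to 14 get length 3, so the length grows as the logarithm of the rank.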

    On instantaneous codes

    Maximal instantaneous codes are characterized by the property that they allow unique parsing of every infinite string. The sequence of codeword lengths of a maximal instantaneous code, listed in lexicographic order of the codewords, completely determines the code itself. Any increasing, decreasing or unimodal reordering of such a sequence again corresponds to a maximal instantaneous code. Lexicographic length sequences are characterized by a family of Kraft-type equalities.
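
    For a finite binary code, prefix-freeness together with the classical Kraft equality (the sum of 2^(-length) over all codewords equals 1) is the standard test for a maximal instantaneous code. The Python sketch below checks both conditions and lists the codeword lengths in lexicographic order of the codewords. The function names are hypothetical, and the sketch does not attempt the paper's family of Kraft-type equalities for lexicographic length sequences.

```python
from fractions import Fraction


def is_prefix_free(code):
    """True if no codeword is a proper prefix of another.

    After lexicographic sorting, a word that is a prefix of any other word
    is also a prefix of its immediate successor, so adjacent pairs suffice.
    """
    words = sorted(set(code))
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def kraft_sum(code, alphabet_size=2):
    """Exact Kraft sum of the code: sum of alphabet_size ** (-len(word))."""
    return sum(Fraction(1, alphabet_size ** len(w)) for w in set(code))


def is_maximal_instantaneous(code, alphabet_size=2):
    """Finite-code test: prefix-free and Kraft sum exactly 1."""
    return is_prefix_free(code) and kraft_sum(code, alphabet_size) == 1


def lexicographic_lengths(code):
    """Codeword lengths listed in lexicographic order of the codewords."""
    return [len(w) for w in sorted(set(code))]
```

    For example, {0, 10, 110, 111} passes the test with lexicographic length sequence 1, 2, 3, 3, while {0, 10} is prefix-free but has Kraft sum 3/4 and is therefore not maximal.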

    Codes, orderings, and partial words

    Codes play an important role in the study of the combinatorics of words. In this paper, we introduce pcodes that play a role in the study of combinatorics of partial words. Partial words are strings over a finite alphabet that may contain a number of “do not know” symbols. Pcodes are defined in terms of the compatibility relation that considers two strings over the same alphabet that are equal except for a number of insertions and/or deletions of symbols. We describe various ways of defining and analyzing pcodes. In particular, many pcodes can be obtained as antichains with respect to certain partial orderings. Using a technique related to dominoes, we show that the pcode property is decidable.