153 research outputs found

    Unique Decipherability in Formal Languages

    We consider several language-theoretic aspects of various notions of unique decipherability (or unique factorization) in formal languages. Given a language L at some position within the Chomsky hierarchy, we investigate the language of words UD(L) in L^* that have unique factorization over L. We also consider similar notions for weaker forms of unique decipherability, such as numerically decipherable words ND(L), multiset decipherable words MSD(L) and set decipherable words SD(L). Although these notions of unique factorization have been considered before, it appears that the languages of words having these properties have not been positioned in the Chomsky hierarchy until now. We show that UD(L), ND(L), MSD(L) and SD(L) need not be context-free if L is context-free. In fact, ND(L) and MSD(L) need not be context-free even if L is finite, although UD(L) and SD(L) are regular in this case. We show that if L is context-sensitive, then so are UD(L), ND(L), MSD(L) and SD(L). We also prove that the membership problem (resp., emptiness problem) for these classes is PSPACE-complete (resp., undecidable). We finally determine upper and lower bounds on the length of the shortest word of L^* not having the various forms of unique decipherability into elements of L.
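For a finite L, membership of a word in UD(L) can be illustrated by counting its factorizations over L. The following is a minimal sketch, not the construction used in the paper; the function names `count_factorizations` and `in_UD` are hypothetical, and the empty word is dropped from L on the assumption that only factorizations into nonempty words are counted.

```python
def count_factorizations(word, language):
    """Count the ways to write `word` as a concatenation of words from `language`."""
    # Assumption: the empty word is ignored, since it would allow
    # infinitely many factorizations of every word.
    words = {w for w in language if w}
    # ways[i] = number of factorizations of the prefix word[:i]
    ways = [0] * (len(word) + 1)
    ways[0] = 1  # the empty prefix has exactly one (empty) factorization
    for end in range(1, len(word) + 1):
        for w in words:
            if end >= len(w) and word[end - len(w):end] == w:
                ways[end] += ways[end - len(w)]
    return ways[len(word)]


def in_UD(word, language):
    """A word is in UD(L) when it has exactly one factorization over L."""
    return count_factorizations(word, language) == 1
```

With L = {a, ab, ba}, the word aba factors both as a·ba and as ab·a, so it lies in L^* but not in UD(L), whereas abab factors only as ab·ab.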

    Compression and the origins of Zipf's law for word frequencies

    Here we sketch a new derivation of Zipf's law for word frequencies based on optimal coding. The structure of the derivation is reminiscent of Mandelbrot's random typing model, but it has multiple advantages over random typing: (1) it starts from realistic cognitive pressures, (2) it does not require fine tuning of parameters, and (3) it sheds light on the origins of other statistical laws of language and thus can lead to a compact theory of linguistic laws. Our findings suggest that the recurrence of Zipf's law in human languages could originate from pressure for easy and fast communication. Comment: arguments have been improved; in press in Complexity (Wiley).

    Word Equations and Related Topics. Independence, Decidability and Characterizations

    The three main topics of this work are independent systems and chains of word equations, parametric solutions of word equations on three unknowns, and unique decipherability in the monoid of regular languages. The most important result about independent systems is a new method giving an upper bound for their sizes in the case of three unknowns. The bound depends on the length of the shortest equation. This result has generalizations for decreasing chains and for more than three unknowns. The method also leads to shorter proofs and generalizations of some old results. Hmelevskii's theorem states that every word equation on three unknowns has a parametric solution. We give a significantly simplified proof for this theorem. As a new result we estimate the lengths of parametric solutions and get a bound for the length of the minimal nontrivial solution and for the complexity of deciding whether such a solution exists. The unique decipherability problem asks whether given elements of some monoid form a code, that is, whether they satisfy no nontrivial equation. We give characterizations for when a collection of unary regular languages is a code. We also prove that it is undecidable whether a collection of binary regular languages is a code.

    The entropy of Łukasiewicz-languages

    The paper presents an elementary approach for the calculation of the entropy of a class of languages. This approach is based on the consideration of roots of a real polynomial and is also suitable for calculating the Bernoulli measure. The class of languages we consider here is a generalisation of the Łukasiewicz language.

    Note on Decipherability of Three-Word Codes

    The theory of uniquely decipherable (UD) codes has been widely developed in connection with automata theory, combinatorics on words, formal languages, and monoid theory. Recently, the concepts of multiset decipherable (MSD) and set decipherable (SD) codes were developed to handle some special problems in the transmission of information. Unique decipherability is a vital requirement in a wide range of coding applications where distinct sequences of code words carry different information. However, in several applications, it is necessary or desirable to communicate a description of a sequence of events where the information of interest is the set of possible events, including multiplicity, but where the order of occurrences is irrelevant. Suitable codes for these communication purposes need not possess the UD property, only the weaker MSD property. In other applications, the information of interest may be the presence or absence of possible events. The SD property is adequate for such codes. Lempel (1986) showed that the UD and MSD properties coincide for two-word codes and conjectured that every three-word MSD code is a UD code. Guzmán (1995) showed that the UD, MSD, and SD properties coincide for two-word codes and conjectured that these properties coincide for three-word codes. In an earlier paper (2001), Blanchet-Sadri answered both conjectures positively for all three-word codes {c1,c2,c3} satisfying |c1| = |c2| = |c3|. In this note, we answer both conjectures positively for other special three-word codes. Our procedures are based on techniques related to dominoes.
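The distinction between the three properties can be illustrated on a single message by enumerating all of its factorizations and comparing them as sequences, as multisets, and as sets. This is a brute-force sketch for a finite code, not the domino technique of the note; the names `factorizations` and `decipherability` are hypothetical, and the check is per word, whereas a code has a given property only if every message over it does.

```python
from collections import Counter


def factorizations(word, code):
    """Yield every factorization of `word` into nonempty code words, as a tuple."""
    if not word:
        yield ()
        return
    for c in code:
        if c and word.startswith(c):
            for rest in factorizations(word[len(c):], code):
                yield (c,) + rest


def decipherability(word, code):
    """Report whether `word` decodes uniquely as a sequence, a multiset, or a set."""
    facts = list(factorizations(word, code))
    # Multisets are compared via their (code word, multiplicity) pairs.
    multisets = {frozenset(Counter(f).items()) for f in facts}
    sets = {frozenset(f) for f in facts}
    return {
        "UD": len(facts) == 1,       # one sequence of code words
        "MSD": len(multisets) == 1,  # all factorizations agree up to order
        "SD": len(sets) == 1,        # all factorizations use the same code words
    }
```

For the code {aa, aaa}, the message a^5 factors as aa·aaa and aaa·aa, so it is not uniquely decipherable as a sequence, but both factorizations give the same multiset. The message a^11 admits factorizations with four copies of aa and one of aaa, or one copy of aa and three of aaa, so only the set of code words used is determined.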

    Introduction to Coding Theory for Flow Equations of Complex Systems Models

    The modeling of complex dynamic systems depends on the solution of a system of differential equations. Some problems appear because we do not know the mathematical expressions of these equations, even though enough numerical data on the system variables are known. The authors think that it is very important to establish a code between the different languages so that information can be encoded and decoded. Coding permits us to reduce the study of some objects to the study of others. The mathematical expressions used to model certain variables of the system are complex, so it is convenient to define an alphabet code determining the correspondence between these equations and words in the alphabet. In this paper the authors begin with an introduction to the coding and decoding of complex structural systems modeling.

    Rational, Recognizable, and Aperiodic Sets in the Partially Lossy Queue Monoid

    Partially lossy queue monoids (or plq monoids) model the behavior of queues that can forget arbitrary parts of their content. While many decision problems on recognizable subsets in the plq monoid are decidable, most of them are undecidable if the sets are rational. In particular, in this monoid the classes of rational and recognizable subsets do not coincide. By restricting multiplication and iteration in the construction of rational sets and by allowing complementation we obtain precisely the class of recognizable sets. From these special rational expressions we can obtain an MSO logic describing the recognizable subsets. Moreover, we provide similar results for the class of aperiodic subsets in the plq monoid.

    Methods for relativizing properties of codes

    The usual setting for information transmission systems assumes that all words over the source alphabet need to be encoded. The demands on encodings of messages with respect to decodability, error-detection, etc. are thus relative to the whole set of words. In reality, depending on the information source, far fewer messages are transmitted, all belonging to some specific language. Hence the original demands on encodings can be weakened if only the words in that language are to be considered. This leads one to relativize the properties of encodings or codes to the language at hand. We analyse methods of relativization in this sense. It seems there are four equally convincing notions of relativization. We compare them and clarify the differences between the four approaches; each has its own merits for specific code properties. We also consider the decidability of relativized properties. If P is a property defining a class of codes and L is a language, one asks, for a given language C, whether C satisfies P relative to L. We show that in the realm of regular languages this question is mostly decidable.
