Unique Decipherability in Formal Languages
We consider several language-theoretic aspects of various notions of unique decipherability (or unique factorization) in formal languages. Given a language L at some position within the Chomsky hierarchy, we investigate the language of words UD(L) in L^* that have unique factorization over L. We also consider similar notions for weaker forms of unique decipherability, such as numerically decipherable words ND(L), multiset decipherable words MSD(L) and set decipherable words SD(L). Although these notions of unique factorization have been considered before, it appears that the languages of words having these properties have not been positioned in the Chomsky hierarchy until now. We show that UD(L), ND(L), MSD(L) and SD(L) need not be context-free if L is context-free. In fact ND(L) and MSD(L) need not be context-free even if L is finite, although UD(L) and SD(L) are regular in this case. We show that if L is context-sensitive, then so are UD(L), ND(L), MSD(L) and SD(L). We also prove that the membership problem (resp., emptiness problem) for these classes is PSPACE-complete (resp., undecidable). We finally determine upper and lower bounds on the length of the shortest word of L^* not having the various forms of unique decipherability into elements of L.
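As an illustration of the definition of UD(L), the following is a minimal sketch that assumes L is finite and given as a Python set of non-empty strings. The names `count_factorizations` and `in_ud` are hypothetical and do not come from the paper, which treats languages at arbitrary levels of the Chomsky hierarchy. The sketch counts factorizations of a word over L by dynamic programming on prefixes; a word belongs to UD(L) exactly when the count is one.

```python
def count_factorizations(word, language):
    """Count the ways to write `word` as a concatenation of words from `language`."""
    # Assumption: the empty word is ignored, since it would allow
    # infinitely many factorizations.
    pieces = [p for p in language if p]
    # ways[i] = number of factorizations of the prefix word[:i]
    ways = [0] * (len(word) + 1)
    ways[0] = 1  # the empty prefix has exactly one (empty) factorization
    for i in range(1, len(word) + 1):
        for p in pieces:
            # True when word[:i] ends with p
            if word.endswith(p, 0, i):
                ways[i] += ways[i - len(p)]
    return ways[len(word)]


def in_ud(word, language):
    """A word is in UD(L) if it has exactly one factorization over L."""
    return count_factorizations(word, language) == 1


L = {"a", "ab", "ba"}
print(count_factorizations("aba", L))  # a.ba and ab.a
print(in_ud("ab", L))
```

With L = {a, ab, ba}, the word aba has the two factorizations a·ba and ab·a, so it lies in L^* but not in UD(L), whereas ab factors only as itself.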
Compression and the origins of Zipf's law for word frequencies
Here we sketch a new derivation of Zipf's law for word frequencies based on
optimal coding. The structure of the derivation is reminiscent of Mandelbrot's
random typing model but it has multiple advantages over random typing: (1) it
starts from realistic cognitive pressures, (2) it does not require fine tuning
of parameters, and (3) it sheds light on the origins of other statistical laws
of language and thus can lead to a compact theory of linguistic laws. Our
findings suggest that the recurrence of Zipf's law in human languages could
originate from pressure for easy and fast communication.
Comment: arguments have been improved; in press in Complexity (Wiley)
Word Equations and Related Topics. Independence, Decidability and Characterizations
The three main topics of this work are independent systems and chains of
word equations, parametric solutions of word equations on three unknowns,
and unique decipherability in the monoid of regular languages.
The most important result about independent systems is a new method
giving an upper bound for their sizes in the case of three unknowns. The
bound depends on the length of the shortest equation. This result has
generalizations for decreasing chains and for more than three unknowns.
The method also leads to shorter proofs and generalizations of some old
results.
Hmelevskii’s theorem states that every word equation on three unknowns
has a parametric solution. We give a significantly simplified proof for this
theorem. As a new result we estimate the lengths of parametric solutions
and get a bound for the length of the minimal nontrivial solution and for
the complexity of deciding whether such a solution exists.
The unique decipherability problem asks whether given elements of some
monoid form a code, that is, whether they satisfy no nontrivial equation. We
give characterizations for when a collection of unary regular languages is a
code. We also prove that it is undecidable whether a collection of binary
regular languages is a code.
The entropy of Łukasiewicz-languages
The paper presents an elementary approach for the calculation of the entropy
of a class of languages. This approach is based on the consideration of
roots of a real polynomial and is also suitable for calculating the
Bernoulli measure. The class of languages we consider here is a
generalisation of the Łukasiewicz language.
Note on Decipherability of Three-Word Codes
The theory of uniquely decipherable (UD) codes has been widely developed in connection
with automata theory, combinatorics on words, formal languages, and monoid theory.
Recently, the concepts of multiset decipherable (MSD) and set decipherable (SD) codes were
developed to handle some special problems in the transmission of information. Unique
decipherability is a vital requirement in a wide range of coding applications where distinct
sequences of code words carry different information. However, in several applications,
it is necessary or desirable to communicate a description of a sequence of events where
the information of interest is the set of possible events, including multiplicity, but where
the order of occurrences is irrelevant. Suitable codes for these communication purposes
need not possess the UD property, but the weaker MSD property. In other applications,
the information of interest may be the presence or absence of possible events. The SD
property is adequate for such codes. Lempel (1986) showed that the UD and MSD properties
coincide for two-word codes and conjectured that every three-word MSD code is a UD
code. Guzmán (1995) showed that the UD, MSD, and SD properties coincide for two-word
codes and conjectured that these properties coincide for three-word codes. In an earlier
paper (2001), Blanchet-Sadri answered both conjectures positively for all three-word codes
{c1,c2,c3} satisfying |c1| = |c2| = |c3|. In this note, we answer both conjectures positively
for other special three-word codes. Our procedures are based on techniques related to
dominoes.
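The distinction between the UD, MSD, and SD properties can be illustrated at the level of a single message. The sketch below is a brute-force check under stated assumptions, not the domino technique used in the note: the helper names `factorizations` and `decipherability` are hypothetical, the code is a finite set of non-empty strings, and enumeration is exponential in the worst case. It lists every factorization of a message and compares them as sequences (UD), as multisets (MSD), and as sets (SD).

```python
def factorizations(word, code):
    """Return every factorization of `word` over `code` as a tuple of code words."""
    if not word:
        return [()]  # the empty word has only the empty factorization
    result = []
    for c in sorted(code):
        if c and word.startswith(c):
            for rest in factorizations(word[len(c):], code):
                result.append((c,) + rest)
    return result


def decipherability(word, code):
    """Report whether all factorizations of `word` agree as sequences,
    multisets, or sets. Returns None if `word` has no factorization."""
    facs = factorizations(word, code)
    if not facs:
        return None
    return {
        "UD": len(facs) == 1,
        # same multiset: sorting forgets the order but keeps multiplicity
        "MSD": len({tuple(sorted(f)) for f in facs}) == 1,
        # same set: forgets both order and multiplicity
        "SD": len({frozenset(f) for f in facs}) == 1,
    }


print(decipherability("aaaaa", {"aa", "aaa"}))
print(decipherability("aba", {"a", "ab", "ba"}))
```

Over {aa, aaa}, the message aaaaa factors as aa·aaa and as aaa·aa, so it is not uniquely decipherable, yet both factorizations use the same multiset of code words. Over {a, ab, ba}, the message aba factors as a·ba and ab·a, which differ even as sets.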
Introduction to Coding Theory for Flow Equations of Complex Systems Models
The modeling of complex dynamic systems depends on the solution of a system of differential equations. Some problems appear because we do not know the mathematical expressions of these equations. However, enough numerical data on the system variables are known. The authors think that it is very important to establish a code between the different languages so that information can be coded and decoded. Coding permits us to reduce the study of some objects to that of others. The mathematical expressions used to model certain variables of the system are complex, so it is convenient to define an alphabet code determining the correspondence between these equations and words in the alphabet. In this paper the authors begin with an introduction to the coding and decoding of complex structural systems modeling.
Rational, Recognizable, and Aperiodic Sets in the Partially Lossy Queue Monoid
Partially lossy queue monoids (or plq monoids) model the behavior of queues that can forget arbitrary parts of their content. While many decision problems on recognizable subsets in the plq monoid are decidable, most of them are undecidable if the sets are rational. In particular, in this monoid the classes of rational and recognizable subsets do not coincide. By restricting multiplication and iteration in the construction of rational sets and by allowing complementation, we obtain precisely the class of recognizable sets. From these special rational expressions we can obtain an MSO logic describing the recognizable subsets. Moreover, we provide similar results for the class of aperiodic subsets in the plq monoid.
Methods for relativizing properties of codes
The usual setting for information transmission systems assumes that all words over the source alphabet need to be encoded. The demands on encodings of messages with respect to decodability, error-detection, etc. are thus relative to the whole set of words. In reality, depending on the information source, far fewer messages are transmitted, all belonging to some specific language. Hence the original demands on encodings can be weakened if only the words in that language are to be considered. This leads one to relativize the properties of encodings or codes to the language at hand. We analyse methods of relativization in this sense. It seems there are four equally convincing notions of relativization. We compare them and clarify the differences between the four approaches. Each has its own merits for specific code properties. We also consider the decidability of relativized properties. If P is a property defining a class of codes and L is a language, one asks, for a given language C, whether C satisfies P relative to L. We show that in the realm of regular languages this question is mostly decidable.