Language, which allows complex ideas to be communicated through symbolic
sequences, is a characteristic feature of our species and is manifested in a
multitude of forms. Using large written corpora for many different languages
and scripts, we show that the occurrence probability distributions of signs at
the left and right ends of words are distinctly heterogeneous.
Characterizing this asymmetry using quantitative inequality measures, viz.
information entropy and the Gini index, we show that the beginning of a word is
less restrictive in sign usage than the end. This property is not simply
attributable to the use of common affixes as it is seen even when only word
roots are considered. We use this asymmetry to infer the direction of writing
in undeciphered inscriptions, and the inferred direction agrees with the
archaeological evidence. Unlike traditional investigations of phonotactic
constraints, which focus on language-specific patterns, our study reveals a
property valid across languages and writing systems. As both language and
writing are unique to our species, this universal signature may reflect
an innate feature of human cognition.

Comment: 10 pages, 4 figures + Supplementary Information (15 pages, 8
figures), final corrected version
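
The two inequality measures named in the abstract, information entropy and the Gini index, can be illustrated on the distributions of word-initial and word-final signs. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, each sign is assumed to be a single character, the Gini index is assumed to be computed only over signs that actually occur at the given position, and the word list is a toy example rather than corpus data.

```python
from collections import Counter
from math import log2


def terminal_sign_counts(words, position):
    """Count signs at one end of each word (position=0: first, position=-1: last).

    Assumption: every sign is a single character of the string.
    """
    return Counter(word[position] for word in words if word)


def entropy(counts):
    """Shannon entropy (in bits) of the empirical sign distribution."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def gini(counts):
    """Gini index of the sign frequencies (0 = all observed signs equally frequent).

    Assumption: only signs observed at this position are included.
    """
    xs = sorted(counts.values())
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum over the frequencies sorted in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    toy_words = ["cat", "dog", "bat", "rat", "pig"]  # illustrative only
    for label, pos in (("initial", 0), ("final", -1)):
        counts = terminal_sign_counts(toy_words, pos)
        print(label, round(entropy(counts), 3), round(gini(counts), 3))
```

In this sketch, a higher entropy and a lower Gini index at one end of the word would indicate less restrictive sign usage at that end, which is the kind of comparison the abstract describes.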