Language Time Series Analysis
We use the Detrended Fluctuation Analysis (DFA) and the Grassberger-Procaccia
(GP) analysis methods to study language characteristics. Although we construct
our signals using only word lengths or word frequencies, thereby excluding a
huge amount of linguistic information, the GP analysis indicates that
linguistic signals may be considered the manifestation of a complex system of
high dimensionality, different from random signals or from systems of low
dimensionality such as the Earth's climate. The DFA method is additionally able
to distinguish a natural language signal from a computer code signal. This last
result may be useful in the field of cryptography. Comment: 21 pages, 5
figures, accepted in Physica
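As a rough illustration of the approach described in this abstract, the sketch below builds a word-length signal from a text and estimates a DFA scaling exponent. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the tokenizing regular expression, the choice of linear (order-1) detrending, and the window sizes are all assumptions, since the abstract does not specify them.

```python
import re
import numpy as np

def word_length_series(text):
    """Map a text to a signal with one sample per word, valued by word length.
    The tokenizer (runs of letters and apostrophes) is an assumption."""
    return np.array([len(w) for w in re.findall(r"[A-Za-z']+", text)], dtype=float)

def dfa_exponent(signal, scales):
    """Estimate the DFA scaling exponent with linear detrending (assumed order 1)."""
    signal = np.asarray(signal, dtype=float)
    # Profile: cumulative sum of the mean-removed signal
    profile = np.cumsum(signal - signal.mean())
    fluctuations = []
    for s in scales:
        n_windows = len(profile) // s
        # Non-overlapping windows of length s, one per column after transposing
        windows = profile[: n_windows * s].reshape(n_windows, s).T
        t = np.arange(s)
        # Least-squares straight line fitted to every window at once
        coeffs = np.polyfit(t, windows, 1)
        trend = np.outer(t, coeffs[0]) + coeffs[1]
        # Root-mean-square fluctuation around the local trends
        fluctuations.append(np.sqrt(np.mean((windows - trend) ** 2)))
    # Slope of log F(s) versus log s is the scaling exponent
    slope, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return slope
```

For an uncorrelated signal the exponent is expected to be close to 0.5, and for its running sum (a random walk) close to 1.5, which gives a quick sanity check before applying the function to `word_length_series(text)` for a real text.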
Equilibrium (Zipf) and Dynamic (Grassberger-Procaccia) method based analyses of human texts. A comparison of natural (English) and artificial (Esperanto) languages
Two English texts by Lewis Carroll, one (Alice in Wonderland) also translated
into Esperanto, the other (Through the Looking-Glass), are compared in order to
observe whether natural and artificial languages differ significantly from each
other. One-dimensional time-series-like signals are constructed using only word
frequencies (FTS) or word lengths (LTS). The data are studied through (i) a
Zipf method for sorting out
correlations in the FTS and (ii) a Grassberger-Procaccia (GP) technique based
method for finding correlations in the LTS. Features are compared: different
power laws are observed, with characteristic exponents for the ranking
properties and for the phase space attractor dimensionality. The Zipf exponent
can take values much less than unity (0.50 or 0.30) depending on how a sentence
is
defined. This non-universality is conjectured to be a measure of the author
. Moreover, the attractor dimension is a simple function of the so-called
phase space dimension , i.e., , with . Such an exponent could also be conjectured to be a measure of the author
. However, even though there are quantitative differences between
the original English text and its Esperanto translation, the qualitative
differences are very minute, indicating in this case a translation that
respects relatively well, along the lines of our analysis, the content of the
author's writing. Comment: 22 pages, 87 references, 5 tables, 8 figures
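The two kinds of measurement named in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the paper's procedure: `zipf_exponent` uses the textbook rank-frequency form of Zipf's law over words, which may differ from the Zipf variant the authors apply to the FTS, and `correlation_dimension` uses a plain delay embedding with a maximum-norm distance. All function names, the tokenizer, the delay of 1, and the choice of radii are assumptions.

```python
import re
from collections import Counter
import numpy as np

def zipf_exponent(text):
    """Magnitude of the log-log slope of word frequency versus rank
    (textbook rank-frequency form; assumed, not the paper's variant)."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    freqs = np.array(sorted(counts.values(), reverse=True), dtype=float)
    ranks = np.arange(1, len(freqs) + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return -slope

def correlation_sum(signal, embed_dim, radius, delay=1):
    """Grassberger-Procaccia correlation sum C(r): the fraction of pairs of
    delay vectors closer than `radius` in the maximum norm."""
    signal = np.asarray(signal, dtype=float)
    n_vectors = len(signal) - (embed_dim - 1) * delay
    # Delay embedding: row k is (x[k], x[k + delay], ..., x[k + (m - 1) * delay])
    vectors = np.stack(
        [signal[i * delay : i * delay + n_vectors] for i in range(embed_dim)],
        axis=1,
    )
    # Pairwise maximum-norm distances; O(N^2) memory, so keep N modest
    dists = np.max(np.abs(vectors[:, None, :] - vectors[None, :, :]), axis=2)
    upper = np.triu_indices(n_vectors, k=1)  # each distinct pair once
    return np.mean(dists[upper] < radius)

def correlation_dimension(signal, embed_dim, radii, delay=1):
    """Slope of log C(r) versus log r over the supplied radii."""
    sums = [correlation_sum(signal, embed_dim, r, delay) for r in radii]
    slope, _ = np.polyfit(np.log(radii), np.log(sums), 1)
    return slope
```

Calling `correlation_dimension` for increasing `embed_dim` and watching how the estimated dimension grows is one way to explore the relation between attractor dimension and phase space dimension that the abstract refers to. For a purely random signal the estimate keeps growing with the embedding dimension instead of saturating.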