Entropy and Long range correlations in literary English

Abstract

Several authors have recently detected long-range correlations in nucleotide sequences and in human writings. We undertake here a systematic investigation of two books, Moby Dick by H. Melville and Grimm's tales, with respect to the existence of long-range correlations. The analysis is based on the calculation of entropy-like quantities, such as the mutual information for pairs of letters and the entropy per letter (the mean uncertainty per letter). We further estimate the number of different subwords of a given length n. After filtering out the contributions due to the finite length of the texts, we find correlations ranging to a few hundred letters. We find scaling laws for the mutual information (power-law decay), for the entropy per letter (decay with the inverse square root of n), and for the word numbers (stretched exponential growth with n and a power law in the text length).

Comment: 8 pages
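
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the estimators are plain frequency counts, and no correction for the finite length of the text is applied, although the abstract says such corrections matter. The entropy per letter is taken here as the difference of successive block entropies, which is one common definition and an assumption about the paper's convention.

```python
import math
from collections import Counter


def block_entropy(text, n):
    """Shannon entropy (bits) of the observed subwords of length n."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def entropy_per_letter(text, n):
    """Assumed definition: uncertainty of the next letter given n previous ones."""
    return block_entropy(text, n + 1) - block_entropy(text, n)


def distinct_subwords(text, n):
    """Number of different subwords of length n that occur in the text."""
    return len({text[i:i + n] for i in range(len(text) - n + 1)})


def mutual_information(text, k):
    """Mutual information (bits) between letters k positions apart."""
    pairs = Counter(zip(text, text[k:]))
    total = sum(pairs.values())
    # Marginal counts of the first and second letter of each pair.
    first, second = Counter(), Counter()
    for (a, b), c in pairs.items():
        first[a] += c
        second[b] += c
    return sum(
        c / total * math.log2(c * total / (first[a] * second[b]))
        for (a, b), c in pairs.items()
    )


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog " * 20
    for k in (1, 2, 5, 10):
        print(k, round(mutual_information(sample, k), 4))
    for n in (1, 2, 3):
        print(n, distinct_subwords(sample, n), round(entropy_per_letter(sample, n), 4))
```

On a real book one would load the full text, normalise it to a fixed alphabet, and plot these quantities against k and n on logarithmic axes to look for the scaling behaviour described above. The naive counts used here become unreliable once n is large relative to the text length.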


    Last time updated on 01/04/2019