Randomness versus specifics for word-frequency distributions

Minnhagen, Petter; Yan, Xiao-Yong

Publication date: 10 November 2015
Publisher: Elsevier BV

Abstract

The text-length dependence of real word-frequency distributions can be connected to the general properties of a random book. This finding has strong implications when deciding between two conceptually different views on word-frequency distributions: the specific `Zipf's-view' and the non-specific `Randomness-view'. The text-length transformation of a random book has an exact scaling property precisely for the power-law index $\gamma=1$, as opposed to the Zipf's exponent $\gamma=2$, and the implication of this exact scaling property is discussed. However, a real text has $\gamma>1$, and as a consequence $\gamma$ increases when a real text is shortened. The connections to the predictions from RGF (Random Group Formation) and to the infinite-length limit of a meta-book are also discussed. The difference between `curve-fitting' and `predicting' word-frequency distributions is stressed. The question of randomness versus specifics for the distribution of outcomes in sufficiently complex systems has a much wider relevance than the word-frequency example analyzed in the present work.

Comment: 9 pages, 7 figures
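To make the idea of shortening a text concrete, the following minimal Python sketch assumes that "shortening" means keeping a uniformly random subset of the word tokens, and then tabulates how many distinct words occur exactly k times. The function names (`toy_book`, `shorten_text`, `frequency_distribution`) and the toy word counts are hypothetical illustrations, not the authors' code or data.

```python
import random
from collections import Counter


def toy_book(num_words=500, top_count=1000):
    """Build a toy token list: the word of rank r occurs max(1, top_count // r) times.

    This is an illustrative stand-in for a real text, not data from the paper.
    """
    tokens = []
    for rank in range(1, num_words + 1):
        tokens.extend([f"w{rank}"] * max(1, top_count // rank))
    return tokens


def shorten_text(tokens, fraction, seed=0):
    """Keep a uniformly random subset of the tokens (sampling without replacement).

    Assumption: this models shortening a 'random book' to a fraction of its length.
    """
    rng = random.Random(seed)
    keep = int(len(tokens) * fraction)
    return rng.sample(tokens, keep)


def frequency_distribution(tokens):
    """Return a Counter mapping k to the number of distinct words occurring exactly k times."""
    occurrences_per_word = Counter(tokens)
    return Counter(occurrences_per_word.values())


book = toy_book()
half = shorten_text(book, 0.5)

full_dist = frequency_distribution(book)
half_dist = frequency_distribution(half)

# Compare the two distributions for the smallest occurrence counts.
for k in range(1, 6):
    print(k, full_dist.get(k, 0), half_dist.get(k, 0))
```

Comparing `full_dist` and `half_dist` at different values of `fraction` is one way to explore how the shape of the distribution changes with text length. Estimating a power-law index from such counts would require a separate fitting step that is not shown here.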

Available Versions

Swepub

oai:DiVA.org:umu-114601

Last time updated on 03/01/2025

Crossref

info:doi/10.1016%2Fj.physa.201...

Last time updated on 03/12/2019