
3,120 research outputs found

Log-log Convexity of Type-Token Growth in Zipf's Systems

Author: Corral Alvaro
Font-Clos Francesc
Publication venue: 'American Physical Society (APS)'
Publication date: 15/12/2014

It is traditionally assumed that Zipf's law implies the power-law growth of the number of different elements with the total number of elements in a system - the so-called Heaps' law. We show that a careful definition of Zipf's law leads to the violation of Heaps' law in random systems, and obtain alternative growth curves. These curves fulfill universal data collapses that depend only on the value of the Zipf exponent. We observe that real books behave very much in the same way as random systems, despite the presence of burstiness in word occurrence. We advance an explanation for this unexpected correspondence.

arXiv.org e-Print Archive

RECERCAT

Languages cool as they expand: Allometric scaling and the decreasing need for new words

Author: A Clauset
A Gnedin
A Vespignani
AA Tsonis
AL Barabási
AM Petersen
B Mandelbrot
B Podobnik
D Fu
D Helbing
D Lazer
DC van Leijenhorst
E Alvarez-Lacalle
EA Altmann
EG Altmann
GB West
GW Oehlert
HA Makse
HAJrJSA Makse
HD Rozenfeld
J Gao
J-B Michel
JA Evans
L Lü
LAN Amaral
LMA Bettencourt
M Batty
M Kleiber
M Markosova
M Riccaboni
M Sigman
M Steyvers
MA Montemurro
MEJ Newman
MHR Stanley
MÁ Serrano
R Ferrer i Cancho
RN Mantegna
S Bernhardsson
S Karlin
SK Baek
X Gabaix
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/12/2012

We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use, which show a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This "cooling pattern" forms the basis of a third statistical regularity, which, unlike the Zipf and the Heaps law, is dynamical in nature.

arXiv.org e-Print Archive

Crossref

Boston University Institutional Repository (OpenBU)

Digital library of University of Maribor

PubMed Central

IMT Institutional Repository

Zipf's Law Leads to Heaps' Law: Analyzing Their Relation in Finite-Size Systems

Author: Linyuan Lü
Olaf Sporns
Tao Zhou
Zi-Ke Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 11/05/2010

Background: Zipf's law and Heaps' law are observed in disparate complex systems. Of particular interest, these two laws often appear together. Many theoretical models and analyses have been developed to understand their co-occurrence in real systems, but a clear picture of their relation is still lacking. Methodology/Principal Findings: We show that Heaps' law can be considered a derivative phenomenon if the system obeys Zipf's law. Furthermore, we refine the known approximate solution for the Heaps' exponent given the Zipf's exponent. We show that the approximate solution is indeed an asymptotic solution for infinite systems, while in finite-size systems the Heaps' exponent is sensitive to the system size. Extensive empirical analysis on tens of disparate systems demonstrates that our refined results can better capture the relation between the Zipf's and Heaps' exponents. Conclusions/Significance: The present analysis provides a clear picture of the relation between Zipf's law and Heaps' law without the help of any specific stochastic model, namely that Heaps' law is indeed a derivative phenomenon of Zipf's law. The presented numerical method gives a considerably better estimate of the Heaps' exponent given the Zipf's exponent and the system size. Our analysis provides some insights into and implications for real complex systems; for example, one can naturally obtain a better explanation of the accelerated growth of scale-free networks. Comment: 15 pages, 6 figures, 1 Table
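
The dependence of vocabulary growth on a Zipf-distributed source, as described in this abstract, can be explored with a small simulation. The following is a minimal sketch, not the authors' numerical method: it assumes independent random draws from a finite rank distribution with P(r) proportional to r^(-alpha), and the function names `zipf_vocabulary_growth` and `loglog_slope` are hypothetical. The fitted log-log slope is only a rough stand-in for the Heaps' exponent.

```python
import math
import random
from bisect import bisect_left
from itertools import accumulate


def zipf_vocabulary_growth(alpha, n_ranks, n_tokens, seed=0):
    """Draw n_tokens ranks with P(r) proportional to r**(-alpha).

    Returns a list whose entry t-1 is the number of distinct ranks
    seen after t draws (the type-token growth curve).
    """
    rng = random.Random(seed)
    weights = [r ** (-alpha) for r in range(1, n_ranks + 1)]
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    seen = set()
    growth = []
    for _ in range(n_tokens):
        # Inverse-CDF sampling on the cumulative weights.
        idx = bisect_left(cumulative, rng.random() * total)
        seen.add(min(idx, n_ranks - 1))  # guard against float round-off
        growth.append(len(seen))
    return growth


def loglog_slope(growth):
    """Least-squares slope of log(vocabulary size) against log(text length)."""
    xs = [math.log(t) for t in range(1, len(growth) + 1)]
    ys = [math.log(v) for v in growth]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


if __name__ == "__main__":
    # Assumed illustrative parameters; vary n_tokens to see the
    # fitted slope change with system size.
    for n_tokens in (1000, 5000, 20000):
        curve = zipf_vocabulary_growth(alpha=2.0, n_ranks=10000, n_tokens=n_tokens)
        print(n_tokens, curve[-1], round(loglog_slope(curve), 3))
```

Running the script for several values of `n_tokens` gives one way to see how the fitted slope shifts with the size of the sample, which is the finite-size sensitivity the abstract refers to.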

arXiv.org e-Print Archive

Crossref

PubMed Central

Innovation and Nested Preferential Growth in Chess Playing Behavior

Author: Billoni Orlando V.
Jo Hang-Hyun
Perotti Juan I.
Schaigorodsky Ana L.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2013

Complexity develops via the incorporation of innovative properties. Chess is one of the most complex strategy games, where expert contenders exercise decision making by imitating old games or introducing innovations. In this work, we study innovation in chess by analyzing how different move sequences are played at the population level. It is found that the probability of exploring a new or innovative move decreases as a power law with the frequency of the preceding move sequence. Chess players also exploit already known move sequences according to their frequencies, following a preferential growth mechanism. Furthermore, innovation in chess exhibits Heaps' law, suggesting similarities with the process of vocabulary growth. We propose a robust generative mechanism based on nested Yule-Simon preferential growth processes that reproduces the empirical observations. These results, supporting the self-similar nature of innovations in chess, are important in the context of decision making in a competitive scenario, and extend the scope of relevant findings recently discovered regarding the emergence of Zipf's law in chess. Comment: 8 pages, 4 figures, accepted for publication in Europhysics Letters (EPL)
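
For orientation, the basic building block named in this abstract, a Yule-Simon preferential growth process, can be sketched in a few lines. This is the plain, single-level process with a constant innovation probability, not the nested mechanism the authors propose (in which, per the abstract, the innovation probability depends on the frequency of the preceding move sequence). The function name `simon_process` and the parameter `p_new` are hypothetical.

```python
import random
from collections import Counter


def simon_process(n_tokens, p_new, seed=0):
    """Generate a token sequence with a plain Yule-Simon process.

    At each step a brand-new type is introduced with probability p_new;
    otherwise an earlier token is copied uniformly at random, so a type
    is reused with probability proportional to its current frequency.
    """
    rng = random.Random(seed)
    tokens = [0]      # start with a single token of type 0
    next_type = 1
    for _ in range(n_tokens - 1):
        if rng.random() < p_new:
            tokens.append(next_type)   # innovation: a new type
            next_type += 1
        else:
            tokens.append(rng.choice(tokens))  # preferential reuse
    return tokens


if __name__ == "__main__":
    sequence = simon_process(n_tokens=10000, p_new=0.05)
    counts = Counter(sequence)
    print("distinct types:", len(counts))
    print("five most frequent types:", counts.most_common(5))
```

Copying a uniformly chosen earlier token is a simple way to obtain frequency-proportional reuse without maintaining explicit weights.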

arXiv.org e-Print Archive

CONICET Digital

EDP Sciences OAI-PMH repository (1.2.0)

IMT Institutional Repository

COSMICAH 2005: workshop on verification of COncurrent Systems with dynaMIC Allocated Heaps (a Satellite event of ICALP 2005) - Informal Proceedings

Author: Distefano Dino
Iosif Radu
O'Hearn Peter
Publication venue
Publication date: 30/12/2013

Lisboa, Portugal, 10 July 2005

Queen Mary Research Online

Do Neural Nets Learn Statistical Laws behind Natural Language?

Author: Takahashi Shuntaro
Tanaka-Ishii Kumiko
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 28/11/2017

The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf's law and Heaps' law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks. Comment: 21 pages, 11 figures

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

Footprints in Local Reasoning

Author: Gardner Philippa
Raza Mohammad
Publication venue: 'Logical Methods in Computer Science e.V.'
Publication date: 01/01/2008

Local reasoning about programs exploits the natural local behaviour common in programs by focussing on the footprint - that part of the resource accessed by the program. We address the problem of formally characterising and analysing the footprint notion for abstract local functions introduced by Calcagno, O'Hearn and Yang. With our definition, we prove that the footprints are the only essential elements required for a complete specification of a local function. We formalise the notion of small specifications in local reasoning and show that for well-founded resource models, a smallest specification always exists that only includes the footprints, and also present results for the non-well-founded case. Finally, we use this theory of footprints to investigate the conditions under which the footprints correspond to the smallest safe states. We present a new model of RAM in which, unlike the standard model, the footprints of every program correspond to the smallest safe states, and we also identify a general condition on the primitive commands of a programming language which guarantees this property for arbitrary models. Comment: LMCS 2009 (FOSSACS 2008 special issue)

arXiv.org e-Print Archive

CiteSeerX

Episciences.org

Spiral - Imperial College Digital Repository