29 research outputs found

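List of Master's Thesis Titles (Graduate School of Humanities) and Graduation Thesis Titles (Faculty of Letters), Academic Year 2005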

Author: Francesc Font-Clos (786125)
Rosalba Garcia-Millan (3094962)
Álvaro Corral (176368)
Publication venue: Chiba Daigaku Bungakubu (Chiba University Faculty of Letters)
Publication date: 01/01/2016

(a) Comparison of the exact probability of survival, ρ(L), given by Eq (17), with the approximations given by the scaling law, Eq (22), and by the scaling law with the first correction to scaling, Eq (40), for different m and L. (b) The same, with a logarithmic y-axis. (c) The same data, shown as the ratio between the approximation given by the scaling law, Eq (22), and the exact value of ρ(L). Larger values of L are included in this case. The program used to draw the figure is provided as S1 File (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0161586#pone.0161586.s001).

Crossref

Directory of Open Access Journals

PubMed Central

Diposit Digital de Documents de la UAB

FigShare

Number of texts with p-value near zero (p < 0.01) in different ranges of L divided by the number of texts in the same ranges, for the fits of distributions f1 and f2.

Author: Francesc Font-Clos (786125)
Isabel Moreno-Sánchez (5666560)
Álvaro Corral (176368)

Values of L denote the geometric mean of ranges containing 1000 texts each. The higher value for the fit of f1 (except for L below about 13000 tokens) indicates its worse performance.

FigShare

The lower cut-off for the frequency distribution of lemmas (al) versus the lower cut-off for the frequency distribution of word forms (aw).

Author: Gemma Boleda (5664379)
Ramon Ferrer-i-Cancho (252015)
Álvaro Corral (176368)

The line al = aw is also shown (solid line).

FigShare

The fit of a linear model for the relationship between exponents (γw and γl) and the relationship between cut-offs (aw and al).

Author: Gemma Boleda (5664379)
Ramon Ferrer-i-Cancho (252015)
Álvaro Corral (176368)

c1 and c3 stand for slopes and c2 and c4 stand for intercepts. The error bars correspond to one standard deviation. A Student's t-test is applied to test whether the slopes are significantly different from one and whether the intercepts are significantly different from zero. The resulting p-values indicate that in all cases the slopes are compatible with being equal to one. The intercepts are compatible with zero for the exponents, but seem to be incompatible with zero for the cut-offs.

FigShare
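
The slope and intercept tests described in the caption above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit; the function name, the returned keys, and the sample data are hypothetical and not taken from the paper. Only the t statistics are computed; converting them to p-values requires a Student's t distribution with n − 2 degrees of freedom.

```python
import math

def linear_fit_t_stats(x, y):
    """Least-squares fit y = slope * x + intercept (assumed model).

    Returns the fitted coefficients, their standard errors, and the
    t statistics for the null hypotheses slope = 1 and intercept = 0.
    Assumes the fit is not exact (non-zero residuals), otherwise the
    standard errors are zero and the t statistics are undefined.
    """
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least 3 paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "se_intercept": se_intercept,
        "t_slope_vs_1": (slope - 1.0) / se_slope,
        "t_intercept_vs_0": intercept / se_intercept,
    }

# Hypothetical data, for illustration only.
fit = linear_fit_t_stats([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1])
print(fit["slope"], fit["intercept"], fit["t_slope_vs_1"], fit["t_intercept_vs_0"])
```

A small absolute t statistic means the coefficient is compatible with the hypothesized value (one for the slope, zero for the intercept).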

Characteristics of the books analyzed.

Author: Gemma Boleda (5664379)
Ramon Ferrer-i-Cancho (252015)
Álvaro Corral (176368)

1 Clarissa: Or the History of a Young Lady.
2 Moby-Dick; or, The Whale.
3 El ingenioso hidalgo don Quijote de la Mancha (1605), title in English: The Ingenious Gentleman Don Quixote of La Mancha; including second part: El ingenioso caballero don Quijote de la Mancha (1615).
4 Artamène ou le Grand Cyrus (Artamène, or Cyrus the Great).
5 Le Vicomte de Bragelonne ou Dix ans plus tard (The Vicomte of Bragelonne: Ten Years Later).
6 Seven Brothers.
7 Spring and the Untimely Return of Winter.
8 The Story of my Parents.
9 Madeleine and Georges de Scudéry.
The length of each book L is measured in millions of tokens.

FigShare

Same as Fig 1a, but replacing the order parameter ρ(L) by ρ(L)/[1 − ρ(L)].

Author: Francesc Font-Clos (786125)
Rosalba Garcia-Millan (3094962)
Álvaro Corral (176368)

The exact behavior is given by Eq (41), and the scaling law with the first correction to scaling is given by Eq (45). The finite-size scaling law clearly performs even better here than for ρ(L), in particular for m > 1. The program used to draw the figure is provided as S1 File (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0161586#pone.0161586.s001).

FigShare

Analysis of the association between random variables using Pearson and Spearman correlations as statistics.

Author: Gemma Boleda (5664379)
Ramon Ferrer-i-Cancho (252015)
Álvaro Corral (176368)

ρ is the value of the correlation statistic and p is the p-value of a two-sided test with null hypothesis ρ = 0, calculated through permutations of one of the variables (the results can be different if p is calculated from a t-test). The sample size is

FigShare
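
The permutation procedure mentioned in the caption above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names, the number of permutations, the fixed seed, and the "+1" correction in the p-value are all choices made here for the example.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_p_value(x, y, statistic=pearson, n_perm=2000, seed=0):
    """Two-sided permutation test of the null hypothesis rho = 0.

    One variable (y) is shuffled; the p-value is the fraction of shuffles
    whose |statistic| is at least the observed one. The +1 in numerator
    and denominator is an assumed convention that avoids p = 0.
    """
    rng = random.Random(seed)
    observed = statistic(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(statistic(x, shuffled)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical data, for illustration only.
x = list(range(10))
y = [2 * v + 1 for v in x]
print(permutation_p_value(x, y, statistic=spearman))
```

Passing `statistic=pearson` or `statistic=spearman` switches between the two correlation statistics named in the caption.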

Estimated probability density of β for fits with p ≥ 0.05, in different length ranges.

Author: Francesc Font-Clos (786125)
Isabel Moreno-Sánchez (5666560)
Álvaro Corral (176368)

We have divided both groups of accepted texts into four quantile groups according to L. As in the previous figure, the normal kernel smoothing method is applied. (a) For distribution f1. (b) For distribution f2.

FigShare
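
Normal kernel smoothing, as named in the caption above, estimates a density by placing a Gaussian on each sample. The sketch below is a generic illustration; the bandwidth rule (the normal reference rule of thumb, 1.06 · s · n^(−1/5)), the function name, and the sample values of β are assumptions made here and are not taken from the paper.

```python
import math
import statistics

def gaussian_kde(samples, bandwidth=None):
    """Return a function estimating the probability density of `samples`.

    Each sample contributes a normal kernel of width `bandwidth`.
    If no bandwidth is given, the normal reference rule of thumb is
    used (an assumption of this sketch).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least 2 samples")
    if bandwidth is None:
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical exponent values, for illustration only.
beta_values = [1.8, 1.9, 2.0, 2.1, 2.2]
density = gaussian_kde(beta_values)
print(density(2.0), density(3.0))
```

Applying the function separately to the β values in each length range would give one smoothed curve per range.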

Zipf’s Law for Word Frequencies: Word Forms versus Lemmas in Long Texts - Fig 3

Author: Gemma Boleda (5664379)
Ramon Ferrer-i-Cancho (252015)
Álvaro Corral (176368)

(a) Probability mass functions f(n) of the absolute frequencies n of words and lemmas in La Regenta, together with their fits, under rescaling of both axes. The collapse of the tails indicates the compatibility of both power-law exponents. (b) The same for, from top to bottom, Artamène, Bragelonne (both in French), Seitsemän v., Kevät ja t., and Vanhempieni r. (all three in Finnish). The rescaled distributions are additionally multiplied by factors 1, 10^−2, etc., for clearer visualization.

FigShare

Power-law fitting results for words and lemmas, denoted respectively by subindices w and l.

Author: Gemma Boleda (5664379)
Ramon Ferrer-i-Cancho (252015)
Álvaro Corral (176368)

V is the number of types (vocabulary size), nm is the maximum frequency of the distribution, Na is the number of types in the power-law tail, i.e., with n ≥ a, a is the minimum value for which the power-law fit holds, and γ and σ are the power-law exponent and its standard deviation, respectively. Twice the standard deviation σd, 2σd, is also given. σd is the standard deviation of γl − γw assuming independence, which is $\sigma_d = \sqrt{\sigma_w^2 + \sigma_l^2}$. The last column provides ℓ1, the number of lemmas associated with only one word form. Notice that the lemma exponent is very close to the one found in Ref. [29] for the tail of a double power-law fitting, except for Moby-Dick and Ulysses.

FigShare
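
The standard deviation of the exponent difference defined in the caption above can be computed as in the following sketch. The numerical values are hypothetical, and treating |γl − γw| ≤ 2σd as a compatibility check is an assumption made here, suggested only by the fact that the table reports 2σd.

```python
import math

def exponent_difference(gamma_w, sigma_w, gamma_l, sigma_l):
    """Difference of lemma and word exponents with its standard deviation.

    sigma_d = sqrt(sigma_w**2 + sigma_l**2), assuming the two exponent
    estimates are independent. The returned flag compares |difference|
    with 2 * sigma_d (an assumed criterion, not stated in the source).
    """
    difference = gamma_l - gamma_w
    sigma_d = math.hypot(sigma_w, sigma_l)
    return difference, sigma_d, abs(difference) <= 2 * sigma_d

# Hypothetical values, for illustration only.
print(exponent_difference(gamma_w=1.90, sigma_w=0.03, gamma_l=1.95, sigma_l=0.04))
```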
