10 research outputs found

    The challenges of statistical patterns of language: the case of Menzerath's law in genomes

    The importance of statistical patterns of language has been debated for decades. Although Zipf's law is perhaps the most popular case, Menzerath's law has recently entered the debate. Menzerath's law manifests in language, music and genomes as a tendency of the mean size of the parts to decrease as the number of parts increases in many situations. In genomes, for instance, species with more chromosomes tend to have a smaller mean chromosome size. It has been argued that the instantiation of this law in genomes is not indicative of any parallel between language and genomes because (a) the law is inevitable and (b) non-coding DNA dominates genomes. Here mathematical, statistical and conceptual challenges of these criticisms are discussed. Two major conclusions are drawn: the law is not inevitable and languages also have a correlate of non-coding DNA. However, the wide range of manifestations of the law in and outside genomes suggests that the striking similarities between non-coding DNA and certain linguistic units could be anecdotal for understanding the recurrence of that statistical law.

    When is Menzerath-Altmann law mathematically trivial? A new approach

    Menzerath’s law, the tendency of Z (the mean size of the parts) to decrease as X (the number of parts) increases, is found in language, music and genomes. Recently, it has been argued that the presence of the law in genomes is an inevitable consequence of the fact that Z = Y/X, which would imply that Z scales with X as Z~1/X. That scaling is a very particular case of Menzerath-Altmann law that has been rejected by means of a correlation test between X and Y in genomes, where X is the number of chromosomes of a species, Y its genome size in bases and Z the mean chromosome size. Here we review the statistical foundations of that test and consider three non-parametric tests based upon different correlation metrics and one parametric test to evaluate if Z~1/X in genomes. The most powerful test is a new non-parametric one based upon the correlation ratio, which is able to reject Z~1/X in nine out of 11 taxonomic groups and detect a borderline group. Rather than a fact, Z~1/X is a baseline that real genomes do not meet. The view of Menzerath-Altmann law as inevitable is seriously flawed.
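
    The logic of a correlation test between X and Y can be sketched as follows. If Z~1/X held on average, then Y = XZ would show no trend with X, so a significant correlation between X and Y counts as evidence against that scaling. The sketch below is only an illustration under stated assumptions: it uses a generic permutation test on the Pearson correlation, not the correlation-ratio test of the abstract, and the function names and the toy data are hypothetical, not taken from the paper.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Two-sided permutation p-value for the null 'Y is unrelated to X'.

    Shuffling Y destroys any dependence on X, which is what Z ~ 1/X
    (i.e. Y = X * Z without trend in X) would predict.
    """
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)


# Hypothetical toy data (NOT real genomes): X = chromosome number,
# Y = genome size in arbitrary units.
chromosomes = [2, 4, 6, 8, 10, 12, 14, 16]
genome_size = [10, 14, 19, 22, 30, 33, 41, 44]

p = permutation_p_value(chromosomes, genome_size)
print("reject Z ~ 1/X at the 5% level:", p < 0.05)
```

    In this toy sample Y grows with X, so the test rejects the baseline; with real data the choice of correlation metric matters, which is the point of comparing several tests.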

    Tackling the Toolkit. Plotting Poetry through Computational Literary Studies

    In Tackling the Toolkit, we focus on the methodological innovations, challenges, obstacles and even shortcomings associated with applying quantitative methods to poetry specifically and poetics more broadly. Using tools including natural language processing, web ontologies, similarity detection devices and machine learning, our contributors explore not only metres, stanzas, stresses and rhythms but also genres, subgenres, lexical material and cognitive processes. Whether they are testing old theories and laws, making complex concepts machine-readable or developing new lines of textual analysis, their works challenge standard descriptions of norms and variations.

    Quantifying Interpreting Types: Language Sequence Mirrors Cognitive Load Minimization in Interpreting Tasks

    Most interpreting theories claim that different interpreting types should involve varied processing mechanisms and procedures. However, few studies have examined their underlying differences. Even though some previous results based on quantitative approaches show that different interpreting types yield outputs of varying lexical and syntactic features, the grammatical parsing approach is limited. Language sequences that form without relying on parsing or processing with a specific linguistic approach or grammar outperform other quantitative approaches at revealing the sequential behavior of language production. As a non-grammatically-bound unit of language sequences, the frequency motif can visualize the local distribution of content and function words, and can also statistically classify languages and identify text types. Thus, the current research investigates the distribution, length and position-dependent properties of frequency motifs across different interpreting outputs in order to characterize their sequential generation behaviors. It is found that the distribution, the length and certain position-dependent properties of these language sequences differ significantly between simultaneous interpreting and consecutive interpreting output. The features of frequency motifs show that both types of interpreting output are produced in a manner that abides by the least effort principle. The current research suggests that interpreting types can be differentiated through this type of sequential language unit and offers evidence for how the different task features mediate the sequential organization of interpreting output under different demands to achieve cognitive load minimization.

    Lengths and L-motifs of rhythmical units in formal British speech

    The lengths of rhythmical units (as defined by Karl Marbe in 1904) were identified, and their frequencies counted, in twelve complete texts from the Aix-MARSEC database of formal spoken British English. The texts all belonged to the genre of current affairs commentary. L-motifs (i.e. maximal monotone non-decreasing sequences) of the rhythmical unit lengths were also identified, and the frequencies of the different L-motif lengths were counted. The frequencies of both rhythmical unit lengths and L-motif lengths were modelled using a continuous approach with the Zipf-Alekseev function. Good qualities of fit were obtained for both kinds of unit on all texts. The parameters a and b of the Zipf-Alekseev function for the rhythmical unit lengths (though not for the L-motif lengths) were also found to be related in the form of a further Zipf-Alekseev function. Further research should aim to extend the application of the motif approach to rhythmical units.
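
    The definition of an L-motif as a maximal monotone non-decreasing sequence can be illustrated with a short sketch. It assumes the rhythmical unit lengths are already available as a list of integers; the function name and the sample values are hypothetical and are not drawn from the Aix-MARSEC texts.

```python
from collections import Counter


def l_motifs(lengths):
    """Split a sequence of unit lengths into L-motifs.

    An L-motif is a maximal run in which each value is greater than or
    equal to the previous one; a strictly smaller value starts a new motif.
    """
    motifs = []
    current = []
    for value in lengths:
        if current and value < current[-1]:
            motifs.append(current)
            current = []
        current.append(value)
    if current:
        motifs.append(current)
    return motifs


# Hypothetical sequence of rhythmical unit lengths.
unit_lengths = [2, 3, 3, 1, 4, 2, 2, 5]
motifs = l_motifs(unit_lengths)

# Frequency of each L-motif length, the quantity that would then be modelled.
motif_length_counts = Counter(len(m) for m in motifs)
print(motifs)
print(dict(motif_length_counts))
```

    Here the sample splits into [2, 3, 3], [1, 4] and [2, 2, 5], giving one motif of length 2 and two of length 3.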

    Melodic segmentation: structure, cognition, algorithms

    Segmentation of melodies into smaller units (phrases, themes, motifs, etc.) is an important process in both music analysis and music cognition. Segmentation is also a necessary preprocessing step for various tasks in music information retrieval. Several algorithms for automatic segmentation have been proposed, based on different music-theoretical backgrounds and computing approaches. Rule-based models operate on a given set of logical conditions. Learning-based models, originating in linguistics, compute segmentation criteria on the basis of statistical parameters of a training corpus and/or of the given composition. The aim of this preliminary study is to propose and describe a new segmentation algorithm that is rule-based, parsimonious, and unambiguous.

    On Musical Self-Similarity : Intersemiosis as Synecdoche and Analogy

    Self-similarity, a concept borrowed from mathematics, is gradually becoming a keyword in musicology. Although a polysemic term, self-similarity often refers to the multi-scalar feature repetition in a set of relationships, and it is commonly valued as an indication for musical ‘coherence’ and ‘consistency’. In this study, Gabriel Pareyon presents a theory of musical meaning formation in the context of intersemiosis, that is, the translation of meaning from one cognitive domain to another cognitive domain (e.g. from mathematics to music, or to speech or graphic forms). From this perspective, the degree of coherence of a musical system relies on a synecdochic intersemiosis: a system of related signs within other comparable and correlated systems. The author analyzes the modalities of such correlations, exploring their general and particular traits, and their operational bounds. Accordingly, the notion of analogy is used as a rich concept through its two definitions quoted by the Classical literature—proportion and paradigm, enormously valuable in establishing measurement, likeness and affinity criteria. At the same time, original arguments by Benoît B. Mandelbrot (1924–2010) are revised, alongside a systematic critique of the literature on the subject. In fact, connecting Charles S. Peirce’s ‘synechism’ with Mandelbrot’s ‘fractality’ is one of the main developments of the present study

    The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE)
