78 research outputs found

    Alien symbols for alien language: Iterated learning in a unique, novel signal space


    Evolving linguistic divergence on polarizing social media

    Language change is influenced by many factors, but often starts from synchronic variation, where multiple linguistic patterns or forms coexist, or where different speech communities use language in increasingly different ways. Besides regional or economic reasons, communities may form and segregate based on political alignment. The latter, referred to as political polarization, is of growing societal concern across the world. Here we map and quantify linguistic divergence across the partisan left-right divide in the United States, using social media data. We develop a general methodology to delineate (social) media users by their political preference, based on which (potentially biased) news media accounts they do and do not follow on a given platform. Our data consists of 1.5M short posts by 10k users (about 20M words) from the social media platform Twitter (now "X"). Delineating this sample involved mining the platform for the lists of followers (n=422M) of 72 large news media accounts. We quantify divergence in topics of conversation and word frequencies, messaging sentiment, and lexical semantics of words and emoji. We find signs of linguistic divergence across all these aspects, especially in topics and themes of conversation, in line with previous research. While US American English remains largely intelligible within its large speech community, our findings point at areas where miscommunication may eventually arise given ongoing polarization and therefore potential linguistic divergence. Our methodology - combining data mining, lexicostatistics, machine learning, large language models and a systematic human annotation approach - is largely language and platform agnostic. In other words, while we focus here on US political divides and US English, the same approach is applicable to other countries, languages, and social media platforms

    Mappings between linguistic sound and motion

    This paper provides an overview of the possible function of non-arbitrary mappings between linguistic form and meaning, and presents new empirical evidence showing that shared cross-modal associations may underlie motion sound-symbolism in particular. In terms of function, several lines of empirical and theoretical evidence suggest that non-arbitrary form-meaning connections could have played a crucial role in lexical emergence during language evolution. Furthermore, the persistence of such non-arbitrariness in some areas of modern language may also be highly functional, as recent data has shown that non-arbitrary forms may help to bootstrap learning in children (Imai, Kita, Nagumo, and Okada, 2008) and adults (Nielsen and Rendall, 2012). Given the functional role of these non-arbitrary mappings between linguistic form and meaning, this paper describes new experimental data demonstrating shared mappings between nonsense words and visual motion using a direct matching task. Participants were given nonsense words that varied in terms of their voicing, reduplication, and vowel quality, and asked to change the movement of a ball to match a given word. Results show that back vowels are mapped onto slower speeds, and consonant reduplication with vowel alternation is mapped onto faster speeds. These results show a shared cross-modal association between linguistic sound and motion, which is likely leveraged in sound-symbolic systems found in natural language

    Shared cross-modal associations and the emergence of the lexicon

    This thesis centres around a sensory theory of protolanguage emergence, or STP. The STP proposes that shared biases to make associations between sensory modalities provided the basis for the emergence of a shared protolinguistic lexicon. Crucially, this lexicon would have been grounded in our perceptual systems, and thus fundamentally non-arbitrary. The foundation of such a lexicon lies in shared cross-modal associations: biases shared among language users to map properties in one modality (e.g., visual size) onto another (e.g., vowel sounds). While there is broad evidence that we make associations between a variety of modalities (Spence, 2011), this thesis focuses specifically on associations involving linguistic sound, arguing that these associations would have been most important in language emergence. Early linguistic utterances, by virtue of their grounding in shared cross-modal associations, could be formed and understood with high mutual intelligibility. The first chapter of the thesis will outline this theory in detail, addressing the nature of the proposed protolanguage system, arguing for the utility of non-arbitrariness at the point of language emergence, and proposing evidence for the likely transition from a non-arbitrary protolanguage to the predominantly arbitrary language systems we observe today. The remainder of the thesis will focus on providing empirical evidence to support this theory in two ways: (i) presenting experimental data showing evidence of shared associations between linguistic sound and other modalities, and (ii) providing evidence that such associations are evident cross-linguistically, despite the predominantly arbitrary nature of modern languages. Chapter two will examine well-documented associations between vowel quality and physical size (e.g., /i/ is small, and /a/ is large; Sapir, 1929).
This chapter presents a new experimental approach which fails to find robust associations between vowel quality and size absent the use of a forced choice paradigm. Chapter three turns to associations between linguistic sound and shape angularity, taking a critical perspective on the classic takete/maluma experiment (Kohler, 1929). New empirical evidence shows that the acquisition of visual word forms plays a highly influential role in mediating associations between linguistic sound and angularity, but that associations between linguistic sound and visual form also play a minor role in auditory tasks. Chapter four will examine a relatively unexplored modality: taste. A simple survey which asks participants to choose non-words to match representative tastes shows that certain linguistic sounds are preferred for certain food items. In a more detailed study, we use a more direct perceptual matching task with actual tastants and synthesised speech sounds, further showing that people make robust shared associations between linguistic sound and taste. Chapter five returns to the visual modality, considering previously unexamined associations between linguistic sound and motion, specifically the feature of speed. This study demonstrates that people do make robust associations between the two modalities, particularly for vowel quality. Chapter six will aim to take a different empirical approach, considering non-arbitrariness in natural language. Motivated by the experimental data from the previous chapters, we turn to corpus analyses to assess the presence of non-arbitrariness in natural language which concurs with behavioural data showing linguistic cross-modal associations. First, a corpus analysis of taste synonyms in English shows small but significant correlations between form and meaning.
With the goal of addressing the universality of specific sound-meaning associations, we examine cross-linguistic corpora of taste and motion terms, showing that particular phonological features tend to connect to certain tastes and types of motion across genetically and geographically distinct languages. Lastly, the thesis will conclude by considering the STP in light of the empirical evidence presented, and suggesting possible future empirical directions to explore the theory more broadly

    Double-blind reviewing and gender biases at EvoLang conferences

    A previous study of reviewing at the Evolution of Language conferences found effects that suggested that gender bias against female authors was alleviated under double-blind review at EvoLang 11. We update this analysis in two specific ways. First, we add data from the most recent EvoLang 12 conference, providing a comprehensive picture of the conference over five iterations. Like EvoLang 11, EvoLang 12 used double-blind review, but EvoLang 12 showed no significant difference in review scores between genders. We discuss potential explanations for why there was a strong effect in EvoLang 11, which is largely absent in EvoLang 12. These include testing whether readability differs between genders, though we find no evidence to support this. Although gender differences seem to have declined for EvoLang 12, we suggest that double-blind review provides a more equitable evaluation process

    Cross-modal associations and synaesthesia: Categorical perception and structure in vowel-colour mappings in a large online sample

    We report associations between vowel sounds, graphemes, and colours collected online from over 1000 Dutch speakers. We provide open materials including a Python implementation of the structure measure, and code for a single page web application to run simple cross-modal tasks. We also provide a full dataset of colour-vowel associations from 1164 participants, including over 200 synaesthetes identified using consistency measures. Our analysis reveals salient patterns in cross-modal associations, and introduces a novel measure of isomorphism in cross-modal mappings. We find that while acoustic features of vowels significantly predict certain mappings (replicating prior work), both vowel phoneme category and grapheme category are even better predictors of colour choice. Phoneme category is the best predictor of colour choice overall, pointing to the importance of phonological representations in addition to acoustic cues. Generally, high/front vowels are lighter, more green, and more yellow than low/back vowels. Synaesthetes respond more strongly on some dimensions, choosing lighter and more yellow colours for high and mid front vowels than non-synaesthetes. We also present a novel measure of cross-modal mappings adapted from ecology, which uses a simulated distribution of mappings to measure the extent to which participants' actual mappings are structured isomorphically across modalities. Synaesthetes have mappings that tend to be more structured than non-synaesthetes, and more consistent colour choices across trials correlate with higher structure scores. Nevertheless, the large majority (~70%) of participants produce structured mappings, indicating that the capacity to make isomorphically structured mappings across distinct modalities is shared to a large extent, even if the exact nature of mappings varies across individuals. 
Overall, this novel structure measure suggests a distribution of structured cross-modal association in the population, with synaesthetes on one extreme and participants with unstructured associations on the other

    The regularity game: Investigating linguistic rule dynamics in a population of interacting agents

    Rules are an efficient feature of natural languages which allow speakers to use a finite set of instructions to generate a virtually infinite set of utterances. Yet, for many regular rules, there are irregular exceptions. There has been lively debate in cognitive science about how individual learners acquire rules and exceptions; for example, how they learn that the past tense of preach is preached, but for teach it is taught. However, for most population or language-level models of language structure, particularly from the perspective of language evolution, the goal has generally been to examine how languages evolve stable structure, and such models neglect the fact that in many cases, languages exhibit exceptions to structural rules. We examine the dynamics of regularity and irregularity across a population of interacting agents to investigate how, for example, the irregular teach coexists beside the regular preach in a dynamic language system. Models show that in the absence of individual biases towards either regularity or irregularity, the outcome of a system is determined entirely by the initial condition. On the other hand, in the presence of individual biases, rule systems exhibit frequency dependent patterns in regularity reminiscent of patterns found in natural language. We implement individual biases towards regularity in two ways: through ‘child’ agents who have a preference to generalise using the regular form, and through a memory constraint wherein an agent can only remember an irregular form for a finite time period. We provide theoretical arguments for the prediction of a critical frequency below which irregularity cannot persist in terms of the duration of the finite time period which constrains agent memory. Further, within our framework we also find stable irregularity, arguably a feature of most natural languages not accounted for in many other cultural models of language structure
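
The link between verb frequency and a finite memory window can be illustrated with a toy calculation. This is a rough sketch, not the model from the paper: it assumes (hypothetically) that an agent hears a given verb independently with probability `f` on each time step and loses the irregular form if `T` consecutive steps pass without hearing it. The names `forget_probability`, `f`, and `T` are illustrative only.

```python
def forget_probability(f: float, T: int) -> float:
    """Chance that a verb heard with per-step probability f is NOT heard
    at all during a memory window of T steps (toy assumption: steps are
    independent)."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be a probability between 0 and 1")
    if T < 0:
        raise ValueError("T must be non-negative")
    return (1.0 - f) ** T


if __name__ == "__main__":
    # Rarer verbs are far more likely to slip out of a fixed memory window.
    for f in (0.001, 0.01, 0.1):
        print(f, forget_probability(f, T=50))
```

Under these assumptions, lowering `f` or shortening `T` raises the chance that the irregular form is forgotten, which is the intuition behind a frequency threshold for irregularity.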

    General three state model with biased population replacement: Analytical solution and application to language dynamics

    Empirical evidence shows that the rate of irregular usage of English verbs exhibits discontinuity as a function of their frequency: the most frequent verbs tend to be totally irregular. We aim to qualitatively understand the origin of this feature by studying simple agent-based models of language dynamics, where each agent adopts an inflectional state for a verb and may change it upon interaction with other agents. At the same time, agents are replaced at some rate by new agents adopting the regular form. In models with only two inflectional states (regular and irregular), we observe that either all verbs regularize irrespective of their frequency, or a continuous transition occurs between a low frequency state where the lemma becomes fully regular, and a high frequency one where both forms coexist. Introducing a third (mixed) state, wherein agents may use either form, we find that a third, qualitatively different behavior may emerge, namely, a discontinuous transition in frequency. We introduce and solve analytically a very general class of three-state models that allows us to fully understand these behaviors in a unified framework. Realistic sets of interaction rules, including the well-known Naming Game (NG) model, result in a discontinuous transition, in agreement with recent empirical findings. We also point out that the distinction between speaker and hearer in the interaction has no effect on the collective behavior. The results for the general three-state model, although discussed in terms of language dynamics, are widely applicable. Comment: 14 pages, 6 figures. Final published version
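
The three-state setup described in this abstract can be sketched as a small simulation. This is an illustrative sketch only, assuming the usual minimal two-word Naming Game update (a shared word makes both agents collapse to it, otherwise the hearer adds it) and a hypothetical per-step `replace_rate` at which a random agent is swapped for a newcomer using the regular form. The function name, parameters, and defaults are assumptions, not taken from the paper.

```python
import random


def simulate(n_agents=200, steps=20000, replace_rate=0.01, seed=0):
    """Toy two-word Naming Game with biased population replacement.

    Each agent holds {"irr"}, {"reg"}, or both (the mixed state).
    Returns the final fraction of agents in each of the three states.
    """
    rng = random.Random(seed)
    agents = [{"irr"} for _ in range(n_agents)]  # start fully irregular
    for _ in range(steps):
        if rng.random() < replace_rate:
            # Replacement: a newcomer adopts the regular form.
            agents[rng.randrange(n_agents)] = {"reg"}
            continue
        speaker, hearer = rng.sample(range(n_agents), 2)
        word = rng.choice(sorted(agents[speaker]))
        if word in agents[hearer]:
            # Success: both agents keep only the shared word.
            agents[speaker] = {word}
            agents[hearer] = {word}
        else:
            # Failure: the hearer adds the word and becomes mixed.
            agents[hearer].add(word)
    counts = {"regular": 0, "irregular": 0, "mixed": 0}
    for inventory in agents:
        if len(inventory) == 2:
            counts["mixed"] += 1
        elif "reg" in inventory:
            counts["regular"] += 1
        else:
            counts["irregular"] += 1
    return {state: n / n_agents for state, n in counts.items()}


if __name__ == "__main__":
    for rate in (0.0, 0.01, 0.2):
        print(rate, simulate(replace_rate=rate))
```

In this sketch a lower `replace_rate` stands in for a more frequently used verb (more interactions per replacement); sweeping it is one way to explore how the final state depends on frequency.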