
    Empirical approaches for investigating the origins of structure in speech

    In language evolution research, the use of computational and experimental methods to investigate the emergence of structure in language is exploding. In this review, we look exclusively at work exploring the emergence of structure in speech, both on a categorical level (what drives the emergence of an inventory of individual speech sounds) and on a combinatorial level (how these individual speech sounds emerge and are reused as part of larger structures). We show that computational and experimental methods for investigating population-level processes can be used effectively to explore and measure the effects of learning, communication and transmission on the emergence of structure in speech. We also look at work on child language acquisition as a tool for generating and validating hypotheses about the emergence of speech categories. Further, we review the roles of noise, iconicity and production effects …
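    The population-level processes mentioned above are often explored with transmission-chain (iterated learning) simulations. As a purely illustrative, hedged sketch, and not a model drawn from this review, the toy Python simulation below transmits a one-dimensional "sound" inventory across generations: each learner infers categories from noisy productions of the previous generation and merges categories whose productions overlap, so the size and spacing of the inventory can be tracked over time. The parameter values and the simple gap-based clustering rule are assumptions made only for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def transmit(sounds, noise=0.02, merge_gap=0.05, n_examples=20):
        """Learn a new inventory from noisy productions of the previous generation."""
        productions = np.sort(np.concatenate(
            [rng.normal(s, noise, n_examples) for s in sounds]))
        clusters = [[productions[0]]]
        for p in productions[1:]:
            if p - clusters[-1][-1] > merge_gap:   # large gap: start a new sound category
                clusters.append([p])
            else:                                  # small gap: same category
                clusters[-1].append(p)
        return np.array([np.mean(c) for c in clusters])   # category means become the new inventory

    inventory = rng.uniform(0.0, 1.0, 10)          # initial, unstructured inventory
    for generation in range(20):
        inventory = transmit(inventory)
    print(len(inventory), np.round(np.sort(inventory), 2))

    Varying the noise level or the number of examples per generation (the learning bottleneck) changes how many categories survive transmission, which is the kind of effect the reviewed computational work measures in far richer settings.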

    Contrastive Multimodal Learning for Emergence of Graphical Sensory-Motor Communication

    In this paper, we investigate whether artificial agents can develop a shared language in an ecological setting where communication relies on a sensory-motor channel. To this end, we introduce the Graphical Referential Game (GREG), in which a speaker must produce a graphical utterance to name a visual referent object while a listener has to select the corresponding object among distractor referents, given the delivered message. The utterances are drawn images produced using dynamical motor primitives combined with a sketching library. To tackle GREG we present CURVES: a multimodal contrastive deep learning mechanism that represents the energy (alignment) between named referents and utterances, with utterances generated through gradient ascent on the learned energy landscape. We demonstrate that CURVES not only succeeds at solving GREG but also enables agents to self-organize a language that generalizes to feature compositions never seen during training. In addition to evaluating the communication performance of our approach, we also explore the structure of the emerging language. Specifically, we show that the resulting language forms a coherent lexicon shared between agents and that basic compositional rules on the graphical productions could not explain the compositional generalization.
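    As a minimal, hedged sketch of the contrastive, energy-based idea described above, and not the authors' CURVES implementation (which produces drawings with motor primitives and a sketching library), the PyTorch snippet below scores the alignment between referent images and utterance images with a learned energy and trains it with an InfoNCE-style contrastive loss over in-batch distractors. The encoder architecture, image sizes, temperature and placeholder tensors are all illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Encoder(nn.Module):
        """Maps an image (referent or graphical utterance) to a unit-norm embedding."""
        def __init__(self, dim=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, dim),
            )

        def forward(self, x):
            return F.normalize(self.net(x), dim=-1)

    referent_enc, utterance_enc = Encoder(), Encoder()

    def energy(referents, utterances):
        """Lower energy = better alignment between referents (rows) and utterances (columns)."""
        return -referent_enc(referents) @ utterance_enc(utterances).T

    # One contrastive step: diagonal pairs are matched, off-diagonal pairs act as distractors.
    referents = torch.randn(8, 1, 32, 32)    # placeholder referent images
    utterances = torch.randn(8, 1, 32, 32)   # placeholder graphical utterances
    logits = -energy(referents, utterances) / 0.1       # assumed temperature of 0.1
    loss = F.cross_entropy(logits, torch.arange(8))     # InfoNCE-style objective
    loss.backward()

    At evaluation time a listener of this kind would pick the candidate referent with the lowest energy to the delivered utterance, while a speaker could instead optimise drawing parameters by gradient ascent on the negative energy for a target referent, mirroring the production mechanism described in the abstract.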

    Are Emotional Displays an Evolutionary Precursor to Compositionality in Language?

    Compositionality is a basic property of language, spoken and signed, according to which the meaning of a complex structure is determined by the meanings of its constituents and the way they combine (e.g., Jackendoff, 2011 for spoken language; Sandler, 2012 for constituents conveyed by face and body signals in sign language; Kirby & Smith, 2012 for the emergence of compositionality). Here we seek the foundations of this property in a more basic, and presumably prior, form of communication: the spontaneous expression of emotion. To this end, we ask whether features of facial expressions and body postures are combined and recombined to convey different complex meanings in extreme displays of emotion. There is evidence that facial expressions are processed in a compositional fashion (Chen & Chen, 2010). In addition, facial components such as nose wrinkles or eye opening elicit systematic confusion when decoding facial expressions of disgust and anger, and of fear and surprise, respectively (Jack et al., 2014), suggesting that other co-occurring signals contribute to their interpretation. In spontaneous emotional displays of athletes, the body, and not the face, better predicts participants' correct assessments of victory and loss pictures as conveying positive or negative emotions (Aviezer et al., 2012), suggesting at least that face and body make different contributions to interpretations of the displays. Taken together, such studies lead to the hypothesis that emotional displays are compositional: each signal component, or possibly each specific cluster of components (Du et al., 2014), may have its own interpretation and contribute to the complex meaning of the whole. On the assumption that emotional displays are evolutionarily older than language, our research program aims to determine whether the crucial property of compositionality is indeed present in communicative displays of emotion.

    Towards Learning to Speak and Hear Through Multi-Agent Communication over a Continuous Acoustic Channel

    While multi-agent reinforcement learning has been used as an effective means to study emergent communication between agents, existing work has focused almost exclusively on communication with discrete symbols. Human communication often takes place (and emerged) over a continuous acoustic channel; human infants acquire language in large part through continuous signalling with their caregivers. We therefore ask: are we able to observe emergent language between agents with a continuous communication channel trained through reinforcement learning? And if so, what is the impact of channel characteristics on the emerging language? We propose an environment and training methodology as a means to carry out an initial exploration of these questions. We use a simple messaging environment in which a "speaker" agent needs to convey a concept to a "listener". The speaker is equipped with a vocoder that maps symbols to a continuous waveform; this waveform is passed over a lossy continuous channel, and the listener needs to map the received continuous signal back to the concept. Using deep Q-learning, we show that basic compositionality emerges in the learned language representations. We find that noise in the communication channel is essential for conveying unseen concept combinations, and we show that we can ground the emergent communication by introducing a caregiver predisposed to "hearing" or "speaking" English. Finally, we describe how our platform serves as a starting point for future work that uses a combination of deep reinforcement learning and multi-agent systems to study continuous signalling in language learning and emergence.
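    To make the speaker-channel-listener pipeline concrete, here is a hedged, toy NumPy sketch rather than the paper's learned agents: a stand-in "vocoder" maps each discrete symbol to a tone, additive Gaussian noise plays the role of the lossy continuous channel, and a spectral-peak decoder stands in for the listener. The sampling rate, tone frequencies and noise level are assumptions for illustration only.

    import numpy as np

    SAMPLE_RATE = 8000
    SYMBOL_FREQS = [400.0, 800.0, 1200.0, 1600.0]   # assumed: one tone per discrete symbol

    def speak(symbols, dur=0.05):
        """Toy vocoder: map a symbol sequence to a continuous waveform."""
        t = np.arange(int(SAMPLE_RATE * dur)) / SAMPLE_RATE
        return np.concatenate([np.sin(2 * np.pi * SYMBOL_FREQS[s] * t) for s in symbols])

    def channel(wave, noise_std=0.3):
        """Lossy continuous channel: additive Gaussian noise."""
        return wave + np.random.normal(0.0, noise_std, size=wave.shape)

    def listen(wave, n_symbols, dur=0.05):
        """Toy listener: recover each symbol from the dominant frequency of its segment."""
        seg_len = int(SAMPLE_RATE * dur)
        decoded = []
        for i in range(n_symbols):
            seg = wave[i * seg_len:(i + 1) * seg_len]
            freqs = np.fft.rfftfreq(seg_len, 1.0 / SAMPLE_RATE)
            peak = freqs[np.abs(np.fft.rfft(seg)).argmax()]
            decoded.append(int(np.argmin([abs(peak - f) for f in SYMBOL_FREQS])))
        return decoded

    message = [2, 0, 3]
    print(listen(channel(speak(message)), len(message)))   # typically recovers [2, 0, 3]

    In the paper, the mapping from concepts to waveforms and from waveforms back to concepts is learned with deep Q-learning rather than fixed by hand; this sketch only illustrates why channel noise and the shape of the continuous signal matter for what the listener can recover.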

    Evaluating the role of quantitative modeling in language evolution

    Models are a flourishing and indispensable area of research in language evolution. Here we highlight critical issues in using and interpreting models, and suggest viable approaches. First, contrasting models can explain the same data, and similar modelling techniques can lead to diverging conclusions. This should act as a reminder that the extreme malleability of modelling must be handled parsimoniously when interpreting results. Second, quantitative techniques similar to those used in modelling language evolution have proven inadequate in other disciplines. Cross-disciplinary fertilization is crucial to avoid repeating mistakes that have previously occurred in other areas. Finally, experimental validation is necessary both to sharpen models' hypotheses and to support their conclusions. Our belief is that models should be interpreted as quantitative demonstrations of logical possibilities, rather than as direct sources of evidence. Only an integration of theoretical principles, quantitative proofs and empirical validation can allow research in the evolution of language to progress.

    The Translocal Event and the Polyrhythmic Diagram

    This thesis identifies and analyses the key creative protocols in translocal performance practice, and ends with suggestions for new forms of transversal live and mediated performance practice, informed by theory. It argues that ontologies of emergence in dynamic systems nourish contemporary practice in the digital arts. Feedback in self-organised, recursive systems and organisms elicits change, and change transforms. The arguments trace concepts from chaos and complexity theory to virtual multiplicity, relationality, intuition and individuation (in the work of Bergson, Deleuze, Guattari, Simondon, Massumi, and other process theorists). It then examines the intersection of methodologies in philosophy, science and art and the radical contingencies implicit in the technicity of real-time, collaborative composition. Simultaneous forces or tendencies such as perception/memory, content/expression and instinct/intellect produce composites (experience, meaning, and intuition, respectively) that affect the sensation of interplay. The translocal event is itself a diagram: an interstice between the forces of the local and the global, between the tendencies of the individual and the collective. The translocal is a point of reference for exploring the distribution of affect, parameters of control and emergent aesthetics. Translocal interplay, enabled by digital technologies and network protocols, is ontogenetic and autopoietic; diagrammatic and synaesthetic; intuitive and transductive. KeyWorx is a software application developed for real-time, distributed, multimodal media processing. As a technological tool created by artists, KeyWorx supports this intuitive type of creative experience: a real-time, translocal "jamming" that transduces the lived experience of a "biogram," a synaesthetic hinge-dimension. The emerging aesthetics are processual: intuitive, diagrammatic and transversal.