
    Genetics and language: a neurobiological perspective on the missing link (-ing hypotheses)

    Get PDF
    The paper argues that both evolutionary and genetic approaches to studying the biological foundations of speech and language could benefit from fractionating the problem at a finer grain, aiming not to map genetics to “language”—or even subdomains of language such as “phonology” or “syntax”—but rather to link genetic results to the component formal operations that underlie the comprehension and production of linguistic representations. Neuroanatomic and neurophysiological research suggests that language processing is broken down in space (distributed functional anatomy along concurrent pathways) and time (concurrent processing on multiple time scales). These parallel neuronal pathways and their local circuits form the infrastructure of speech and language and are the actual targets of evolution/genetics. Therefore, investigating the mapping from gene to brain circuit to linguistic phenotype at the level of generic computational operations (subroutines actually executable in these circuits) stands to provide a new perspective on the biological foundations of speech and language in the healthy and challenged brain.

    Phonological (un)certainty weights lexical activation

    Full text link
    Spoken word recognition involves at least two basic computations. First is matching acoustic input to phonological categories (e.g. /b/, /p/, /d/). Second is activating words consistent with those phonological categories. Here we test the hypothesis that the listener's probability distribution over lexical items is weighted by the outcome of both computations: uncertainty about phonological discretisation and the frequency of the selected word(s). To test this, we record neural responses in auditory cortex using magnetoencephalography, and model this activity as a function of the size and relative activation of lexical candidates. Our findings indicate that towards the beginning of a word, the processing system indeed weights lexical candidates by both phonological certainty and lexical frequency; however, later into the word, activation is weighted by frequency alone. Comment: 6 pages, 4 figures, accepted at: Cognitive Modeling and Computational Linguistics (CMCL) 201
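    One way to make the hypothesis concrete (a minimal sketch, not the paper's model): the activation of each lexical candidate is proportional to the probability mass assigned to its initial phoneme (phonological uncertainty) multiplied by a frequency-based prior, renormalised over the cohort. The words, frequencies, and phoneme probabilities below are invented for illustration.

```python
# Posterior over the word-initial phoneme given the acoustic input (phonological uncertainty).
phoneme_prob = {"b": 0.6, "p": 0.3, "d": 0.1}

# Candidate words with their initial phoneme and (hypothetical) corpus frequencies.
candidates = {
    "bat": ("b", 5000),
    "pat": ("p", 3000),
    "dad": ("d", 8000),
}

# Activation = phonological evidence for the onset * relative lexical frequency.
total_freq = sum(freq for _, freq in candidates.values())
activation = {
    word: phoneme_prob[onset] * (freq / total_freq)
    for word, (onset, freq) in candidates.items()
}

# Renormalise so the activations form a probability distribution over the cohort.
z = sum(activation.values())
activation = {word: a / z for word, a in activation.items()}
print(activation)  # roughly {'bat': 0.64, 'pat': 0.19, 'dad': 0.17}
```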

    Mental Imagery of Speech and Movement Implicates the Dynamics of Internal Forward Models

    Get PDF
    The classical concept of efference copies in the context of internal forward models has stimulated productive research in cognitive science and neuroscience. There are compelling reasons to argue for such a mechanism, but finding direct evidence in the human brain remains difficult. Here we investigate the dynamics of internal forward models from an unconventional angle: mental imagery, assessed while recording high temporal resolution neuronal activity using magnetoencephalography. We compare overt and covert versions of two tasks; our covert, mental imagery tasks are unconfounded by overt input/output demands – but in turn necessitate the development of appropriate multi-dimensional topographic analyses. Finger tapping (studies 1 and 2) and speech experiments (studies 3–5) provide temporally constrained results that implicate the estimation of an efference copy. First, we suggest that one internal forward model over parietal cortex subserves the kinesthetic feeling in motor imagery. Second, auditory neural activity observed ~170 ms after motor estimation in the speech experiments (studies 3–5) demonstrates the anticipated auditory consequences of planned motor commands in a second internal forward model in imagery of speech production. Our results provide neurophysiological evidence from the human brain in favor of internal forward models deploying efference copies in somatosensory and auditory cortex, in finger tapping and speech production tasks, respectively, and also suggest the dynamics and sequential updating structure of internal forward models.
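    As a rough illustration of the forward-model logic at issue (a toy sketch, not the authors' model): an efference copy of a planned motor command is passed through an internal model to predict its sensory consequence before, or in the absence of, actual feedback; the numbers below are arbitrary.

```python
def forward_model(efference_copy, weight=0.9):
    """Map a (copied) motor command to a predicted sensory consequence (toy mapping)."""
    return weight * efference_copy

motor_command = 1.0                 # overt execution would produce feedback of about 1.0
efference_copy = motor_command      # internal copy issued at planning time
predicted = forward_model(efference_copy)

actual_feedback = 0.95              # sensory input arriving later (overt task only)
prediction_error = actual_feedback - predicted
print(predicted, prediction_error)  # in covert imagery, only `predicted` is available
```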

    Cortical Oscillations in Auditory Perception and Speech: Evidence for Two Temporal Windows in Human Auditory Cortex

    Get PDF
    Natural sounds, including vocal communication sounds, contain critical information at multiple time scales. Two essential temporal modulation rates in speech have been argued to be in the low gamma band (∼20–80 ms duration information) and the theta band (∼150–300 ms), corresponding to segmental and diphonic versus syllabic modulation rates, respectively. It has been hypothesized that auditory cortex implements temporal integration using time constants closely related to these values. The neural correlates of a proposed dual temporal window mechanism in human auditory cortex remain poorly understood. We recorded MEG responses from participants listening to non-speech auditory stimuli with different temporal structures, created by concatenating frequency-modulated segments of varied segment durations. We show that such non-speech stimuli with temporal structure matching speech-relevant scales (∼25 and ∼200 ms) elicit reliable phase tracking in the corresponding oscillatory frequencies (low gamma and theta bands). In contrast, stimuli with non-matching temporal structure do not. Furthermore, the topography of theta band phase tracking shows rightward lateralization while gamma band phase tracking occurs bilaterally. The results support the hypothesis that there exists multi-time resolution processing in cortex on discontinuous scales and provide evidence for an asymmetric organization of temporal analysis (asymmetrical sampling in time, AST). The data argue for a mesoscopic-level neural mechanism underlying multi-time resolution processing: the sliding and resetting of intrinsic temporal windows on privileged time scales.
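    One standard way to quantify band-limited phase tracking of this kind is inter-trial phase coherence (ITC). The sketch below illustrates that computation on simulated single-sensor data; it is a generic analysis, not the paper's MEG pipeline.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 1000                     # sampling rate in Hz (simulated data)
n_trials, n_samples = 50, 2000
rng = np.random.default_rng(0)

# Simulated single-sensor trials: a 5 Hz (theta-range) component phase-locked
# across trials, plus independent noise on every trial.
t = np.arange(n_samples) / fs
data = np.sin(2 * np.pi * 5 * t) + rng.standard_normal((n_trials, n_samples))

def itc(trials, band, fs):
    """Band-pass filter each trial, extract the instantaneous phase, and average the
    unit phase vectors across trials; the length of the mean vector is the ITC (0..1)."""
    sos = butter(4, band, btype="band", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, trials, axis=1)
    phase = np.angle(hilbert(filtered, axis=1))
    return np.abs(np.mean(np.exp(1j * phase), axis=0))

theta_itc = itc(data, (4, 8), fs)      # theta band, ~125-250 ms cycles
gamma_itc = itc(data, (25, 45), fs)    # low gamma band, ~20-40 ms cycles
print(theta_itc.mean(), gamma_itc.mean())  # theta ITC exceeds gamma ITC for these data
```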

    Analysis by Synthesis: A (Re-)Emerging Program of Research for Language and Vision

    Get PDF
    This contribution reviews (some of) the history of analysis by synthesis, an approach to perception and comprehension articulated in the 1950s. Whereas much research has focused on bottom-up, feed-forward, inductive mechanisms, analysis by synthesis as a heuristic model emphasizes a balance of bottom-up and knowledge-driven, top-down, predictive steps in speech perception and language comprehension. This idea aligns well with contemporary Bayesian approaches to perception (in language and other domains), which are illustrated with examples from different aspects of perception and comprehension. Results from psycholinguistics, the cognitive neuroscience of language, and visual object recognition suggest that analysis by synthesis can provide a productive way of structuring biolinguistic research. Current evidence suggests that such a model is theoretically well motivated, biologically sensible, and becomes computationally tractable by borrowing from Bayesian formalizations.
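    The Bayesian reading of analysis by synthesis can be summarized in a few lines: candidate analyses are scored by synthesising the input they predict (top-down) and comparing that prediction with the observed signal (bottom-up), with the posterior proportional to prior times likelihood. The toy quantities below are purely illustrative and not drawn from the reviewed work.

```python
import numpy as np

observed = np.array([0.9, 0.1, 0.8])  # toy feature vector for the incoming signal

hypotheses = {
    # hypothesis: (prior probability, feature vector it predicts when synthesised)
    "word_A": (0.7, np.array([1.0, 0.0, 1.0])),
    "word_B": (0.3, np.array([0.0, 1.0, 0.0])),
}

def likelihood(obs, pred, sigma=0.3):
    """Gaussian match between the observed input and the synthesised prediction."""
    return np.exp(-np.sum((obs - pred) ** 2) / (2 * sigma ** 2))

# Posterior over hypotheses: prior * likelihood, renormalised.
scores = {h: prior * likelihood(observed, pred) for h, (prior, pred) in hypotheses.items()}
z = sum(scores.values())
posterior = {h: s / z for h, s in scores.items()}
print(posterior)  # word_A dominates because its synthesised prediction fits the input
```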

    TopoToolbox: Using Sensor Topography to Calculate Psychologically Meaningful Measures from Event-Related EEG/MEG

    Get PDF
    The open-source toolbox “TopoToolbox” is a suite of functions that use sensor topography to calculate psychologically meaningful measures (similarity, magnitude, and timing) from multisensor event-related EEG and MEG data. Using a GUI and data visualization, TopoToolbox can be used to calculate and test the topographic similarity between different conditions (Tian and Huber, 2008). This topographic similarity indicates whether different conditions involve a different distribution of underlying neural sources. Furthermore, this similarity calculation can be applied at different time points to discover when a response pattern emerges (Tian and Poeppel, 2010). Because the topographic patterns are obtained separately for each individual, these patterns are used to produce reliable measures of response magnitude that can be compared across individuals using conventional statistics (Davelaar et al., submitted; Huber et al., 2008). TopoToolbox can be freely downloaded. It runs under MATLAB (The MathWorks, Inc.) and supports user-defined data structure as well as standard EEG/MEG data import using EEGLAB (Delorme and Makeig, 2004).
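    The core idea behind topographic similarity can be shown in a few lines: the angle (here, cosine similarity) between two sensor-space response patterns indicates whether two conditions engage a similar distribution of underlying sources. This is a generic NumPy sketch of that idea, not TopoToolbox's MATLAB interface; the sensor count and data are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sensors = 157  # e.g. a 157-channel MEG array (assumed for illustration)

# Sensor-space response patterns for two conditions at one latency (simulated).
pattern_a = rng.standard_normal(n_sensors)
pattern_b = pattern_a + 0.3 * rng.standard_normal(n_sensors)  # similar topography

def topo_similarity(x, y):
    """Cosine similarity between two topographies: 1 = same spatial pattern
    (possibly different magnitude), 0 = orthogonal patterns."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

print(topo_similarity(pattern_a, pattern_b))                        # close to 1
print(topo_similarity(pattern_a, rng.standard_normal(n_sensors)))   # near 0
```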

    Successes and critical failures of neural networks in capturing human-like speech recognition

    Full text link
    Natural and artificial audition can in principle evolve different solutions to a given problem. The constraints of the task, however, can nudge the cognitive science and engineering of audition to qualitatively converge, suggesting that a closer mutual examination would improve artificial hearing systems and process models of the mind and brain. Speech recognition - an area ripe for such exploration - is inherently robust in humans to a number of transformations at various spectrotemporal granularities. To what extent are these robustness profiles accounted for by high-performing neural network systems? We bring together experiments in speech recognition under a single synthesis framework to evaluate state-of-the-art neural networks as stimulus-computable, optimized observers. In a series of experiments, we (1) clarify how influential speech manipulations in the literature relate to each other and to natural speech, (2) show the granularities at which machines exhibit out-of-distribution robustness, reproducing classical perceptual phenomena in humans, (3) identify the specific conditions where model predictions of human performance differ, and (4) demonstrate a crucial failure of all artificial systems to perceptually recover where humans do, suggesting a key specification for theory and model building. These findings encourage a tighter synergy between the cognitive science and engineering of audition.
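    The evaluation logic, treating a recogniser as a stimulus-computable observer whose accuracy is traced across increasingly severe manipulations, can be sketched as below. The compression function, recogniser, and stimuli are placeholders for illustration, not the models or datasets used in the study.

```python
import numpy as np

def time_compress(waveform, factor):
    """Crude time compression by index subsampling (stand-in for a proper resampler)."""
    idx = np.arange(0, len(waveform), factor).astype(int)
    return waveform[idx]

def robustness_profile(recognise, stimuli, labels, factors):
    """Accuracy of `recognise` at each compression factor; the resulting curve can be
    compared against a human psychometric function over the same manipulations."""
    return {
        f: float(np.mean([recognise(time_compress(w, f)) == y
                          for w, y in zip(stimuli, labels)]))
        for f in factors
    }

# Toy usage with a dummy recogniser that ignores its input (illustration only).
stimuli = [np.random.randn(16000) for _ in range(10)]   # ten 1-second clips at 16 kHz
labels = ["word"] * 10

def dummy_recogniser(waveform):
    return "word"

print(robustness_profile(dummy_recogniser, stimuli, labels, factors=[1.0, 2.0, 4.0]))
```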