
    The phonetics of second language learning and bilingualism

    This chapter provides an overview of major theories and findings in the field of second language (L2) phonetics and phonology. Four main conceptual frameworks are discussed and compared: the Perceptual Assimilation Model-L2, the Native Language Magnet Theory, the Automatic Selection Perception Model, and the Speech Learning Model. These frameworks differ in terms of their empirical focus, including the type of learner (e.g., beginner vs. advanced) and target modality (e.g., perception vs. production), and in terms of their theoretical assumptions, such as the basic unit or window of analysis that is relevant (e.g., articulatory gestures, position-specific allophones). Despite the divergences among these theories, three recurring themes emerge from the literature reviewed. First, the learning of a target L2 structure (segment, prosodic pattern, etc.) is influenced by phonetic and/or phonological similarity to structures in the native language (L1). In particular, L1-L2 similarity exists at multiple levels and does not necessarily benefit L2 outcomes. Second, the role played by certain factors, such as acoustic phonetic similarity between close L1 and L2 sounds, changes over the course of learning, such that advanced learners may differ from novice learners with respect to the effect of a specific variable on observed L2 behavior. Third, the connection between L2 perception and production (insofar as the two are hypothesized to be linked) differs significantly from the perception-production links observed in L1 acquisition. In service of elucidating the predictive differences among these theories, this contribution discusses studies that have investigated L2 perception and/or production primarily at a segmental level. 
In addition to summarizing the areas in which there is broad consensus, the chapter points out a number of questions which remain a source of debate in the field today. https://drive.google.com/open?id=1uHX9K99Bl31vMZNRWL-YmU7O2p1tG2wH (Accepted manuscript)

    PRESENCE: A human-inspired architecture for speech-based human-machine interaction

    Recent years have seen steady improvements in the quality and performance of speech-based human-machine interaction, driven by a significant convergence in the methods and techniques employed. However, the quantity of training data required to improve state-of-the-art systems seems to be growing exponentially, and performance appears to be asymptotic to a level that may be inadequate for many real-world applications. This suggests that there may be a fundamental flaw in the underlying architecture of contemporary systems, as well as a failure to capitalize on the combinatorial properties of human spoken language. This paper addresses these issues and presents a novel architecture for speech-based human-machine interaction inspired by recent findings in the neurobiology of living systems. Called PRESENCE ("PREdictive SENsorimotor Control and Emulation"), this new architecture blurs the distinction between the core components of a traditional spoken language dialogue system and instead focuses on a recursive hierarchical feedback control structure. Cooperative and communicative behavior emerges as a by-product of an architecture that is founded on a model of interaction in which the system has in mind the needs and intentions of a user and a user has in mind the needs and intentions of the system.
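    The recursive hierarchical feedback control idea can be caricatured in a few lines: each layer tracks its input with a simple predictor-comparator loop and forwards only its prediction error upward. This is an illustrative sketch under stated assumptions (the class name, gains, and stacking scheme are invented here, not the PRESENCE architecture itself):

    ```python
    class PredictiveLayer:
        """Toy feedback-control layer: predicts its input, corrects
        itself by a fraction of the error, and passes the residual
        error to a higher (slower) layer."""

        def __init__(self, gain, higher=None):
            self.estimate = 0.0
            self.gain = gain
            self.higher = higher

        def step(self, observation):
            error = observation - self.estimate
            self.estimate += self.gain * error
            if self.higher is not None:
                # Upper layers track the slower residual structure.
                self.higher.step(error)
            return error

    # A two-layer stack tracking a constant signal: prediction error shrinks.
    top = PredictiveLayer(gain=0.1)
    bottom = PredictiveLayer(gain=0.5, higher=top)
    errors = [bottom.step(1.0) for _ in range(20)]
    assert abs(errors[-1]) < abs(errors[0])
    ```

    The point of the sketch is only the control-loop shape: behavior at each level is driven by the mismatch between prediction and observation rather than by a fixed pipeline of recognition components.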

    Recognizing Speech in a Novel Accent: The Motor Theory of Speech Perception Reframed

    The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than those of the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serves as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture, revisits claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory.
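    The core update described in the abstract, using word hypotheses to sharpen sound-to-phoneme mappings, can be sketched with a simple count-based probability table. Everything here (class name, counting scheme, the example "accented" token) is an illustrative assumption, not the paper's actual model:

    ```python
    from collections import defaultdict

    class AccentAdapter:
        """Toy listener: maps heard sounds to native phonemes, and
        strengthens a mapping whenever a word hypothesis aligns a
        sound with a phoneme."""

        def __init__(self, phonemes):
            # Uniform prior: every sound is equally linked to every phoneme.
            self.counts = defaultdict(lambda: {p: 1.0 for p in phonemes})

        def prob(self, sound, phoneme):
            row = self.counts[sound]
            return row[phoneme] / sum(row.values())

        def observe_word(self, sounds, hypothesized_phonemes):
            # The hypothesized word aligns each heard sound with a
            # native phoneme; strengthen those associations.
            for s, p in zip(sounds, hypothesized_phonemes):
                self.counts[s][p] += 1.0

    # Word context repeatedly resolves an accented [d]-like sound to /t/.
    adapter = AccentAdapter(phonemes=["t", "d"])
    before = adapter.prob("d_accented", "t")
    for _ in range(5):
        adapter.observe_word(["d_accented"], ["t"])
    after = adapter.prob("d_accented", "t")
    assert after > before  # later words containing this sound get easier
    ```

    As in the abstract's account, the update is agnostic about whether the rows of the table are motor or auditory representations; only the hypothesis-driven re-weighting matters.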

    Transfer Effect of Speech-sound Learning on Auditory-motor Processing of Perceived Vocal Pitch Errors

    Speech perception and production are intimately linked. There is evidence that speech motor learning results in changes to auditory processing of speech. Whether speech motor control benefits from perceptual learning in speech, however, remains unclear. This event-related potential study investigated whether speech-sound learning can modulate the processing of feedback errors during vocal pitch regulation. Mandarin speakers were trained to perceive five Thai lexical tones while learning to associate pictures with spoken words over 5 days. Before and after training, participants produced sustained vowel sounds while they heard their vocal pitch feedback unexpectedly perturbed. Compared to the pre-training session, the magnitude of vocal compensation significantly decreased for the control group, but remained consistent for the trained group at the post-training session. However, the trained group had smaller and faster N1 responses to pitch perturbations and exhibited enhanced P2 responses that correlated significantly with their learning performance. These findings indicate that the cortical processing of vocal pitch regulation can be shaped by learning new speech-sound associations, suggesting that perceptual learning in speech can produce transfer effects that facilitate the neural mechanisms underlying the online monitoring of auditory feedback during vocal production.

    Written sentence context effects on acoustic-phonetic perception: fMRI reveals cross-modal semantic-perceptual interactions

    Available online 3 October 2019. This study examines cross-modality effects of a semantically-biased written sentence context on the perception of an acoustically-ambiguous word target, identifying neural areas sensitive to interactions between sentential bias and phonetic ambiguity. Of interest is whether the locus or nature of the interactions resembles those previously demonstrated for auditory-only effects. fMRI results show significant interaction effects in right mid-middle temporal gyrus (RmMTG) and bilateral anterior superior temporal gyri (aSTG), regions along the ventral language comprehension stream that map sound onto meaning. These regions are more anterior than those previously identified for auditory-only effects; however, the same cross-over interaction pattern emerged, implying similar underlying computations at play. The findings suggest that the mechanisms that integrate information across modality and across sentence and phonetic levels of processing recruit amodal areas where reading and spoken lexical and semantic access converge. Taken together, results support interactive accounts of speech and language processing. This work was supported in part by the National Institutes of Health, NIDCD grant RO1 DC006220.

    Cortical representations for phonological quantity

    Different languages use temporal speech cues in different linguistic functions. In Finnish, speech-sound duration is used as the primary cue for the phonological quantity distinction ― i.e., a distinction between short and long phonemes. For the second-language (L2) learners of Finnish, quantity is often difficult to master if speech-sound duration plays a less important role in the phonology of their native language (L1). The present studies aimed to investigate the cortical representations for phonological quantity in native speakers and L2 users of Finnish by using behavioral and electrophysiological methods. Since long-term memory representations for different speech units have been previously shown to participate in the elicitation of the mismatch negativity (MMN) brain response, MMN was used to compare the neural representation for quantity between native speakers and L2 users of Finnish. The results of the studies suggested that native Finnish speakers' MMN response to quantity was determined by the activation of native-language phonetic prototypes rather than by phoneme boundaries. In addition, native speakers seemed to process phoneme quantity and quality independently from each other by separate brain representations. The cross-linguistic MMN studies revealed that, in native speakers of Finnish, the MMN response to duration or quantity-degree changes was enhanced in amplitude selectively in speech sounds, whereas this pattern was not observed in L2 users. Native speakers' MMN enhancement is suggested to be due to the pre-attentive activation of L1 prototypes for quantity. In L2 users, the activation of L2 prototypes or other L2 learning effects were not reflected in the MMN, with one exception. Even though L2 users failed to show native-like brain responses to duration changes in a vowel that was similar in L1 and L2, their duration MMN response was native-like for an L2 vowel with no counterpart in L1. 
Thus, the pre-attentive activation of L2 users' representations was determined by the degree of similarity of L2 sounds to L1 sounds. In addition, behavioral experiments suggested that the establishment of representations for L2 quantity may require several years of language exposure.

    The Resonant Dynamics of Speech Perception: Interword Integration and Duration-Dependent Backward Effects

    How do listeners integrate temporally distributed phonemic information into coherent representations of syllables and words? During fluent speech perception, variations in the durations of speech sounds and silent pauses can produce different perceived groupings. For example, increasing the silence interval between the words "gray chip" may result in the percept "great chip", whereas increasing the duration of fricative noise in "chip" may alter the percept to "great ship" (Repp et al., 1978). The ARTWORD neural model quantitatively simulates such context-sensitive speech data. In ARTWORD, sequential activation and storage of phonemic items in working memory provides bottom-up input to unitized representations, or list chunks, that group together sequences of items of variable length. The list chunks compete with each other as they dynamically integrate this bottom-up information. The winning groupings feed back to provide top-down support to their phonemic items. Feedback establishes a resonance which temporarily boosts the activation levels of selected items and chunks, thereby creating an emergent conscious percept. Because the resonance evolves more slowly than working memory activation, it can be influenced by information presented after relatively long intervening silence intervals. The same phonemic input can hereby yield different groupings depending on its arrival time. Processes of resonant transfer and competitive teaming help determine which groupings win the competition. Habituating levels of neurotransmitter along the pathways that sustain the resonant feedback lead to a resonant collapse that permits the formation of subsequent resonances. Air Force Office of Scientific Research (F49620-92-J-0225); Defense Advanced Research Projects Agency and Office of Naval Research (N00014-95-1-0409); National Science Foundation (IRI-97-20333); Office of Naval Research (N00014-92-J-1309, N00014-95-1-0657).
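    The competitive-grouping idea, in which slowly evolving chunk activations let later-arriving input override an earlier interpretation, can be caricatured with a toy two-chunk competition. This is a deliberate simplification and not the ARTWORD equations; the dynamics, parameters, and input schedules below are all illustrative:

    ```python
    def simulate(input_a, input_b, steps=200, dt=0.05, decay=0.2, inhib=1.0):
        """Two list chunks leakily integrate their bottom-up inputs
        (slower than the inputs themselves arrive) while mutually
        inhibiting each other; the stronger resonance wins."""
        a = b = 0.0
        for t in range(steps):
            a += dt * (-decay * a + input_a(t) - inhib * b * a)
            b += dt * (-decay * b + input_b(t) - inhib * a * b)
        return a, b

    # Chunk A gets early evidence; chunk B gets later but stronger
    # evidence. Because activations evolve slowly, the late evidence
    # can still capture the grouping.
    early = lambda t: 1.0 if t < 50 else 0.0
    late = lambda t: 1.5 if t >= 50 else 0.0
    a, b = simulate(early, late)
    assert b > a
    ```

    The sketch captures only the abstract's qualitative claim: because the slow "resonant" variables outlive the momentary input, the same early input can end up grouped differently depending on what arrives afterward.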

    A Mechanistic Approach to Cross-Domain Perceptual Narrowing in the First Year of Life

    Language and face processing develop in similar ways during the first year of life. Early in the first year of life, infants demonstrate broad abilities for discriminating among faces and speech. These discrimination abilities then become tuned to frequently experienced groups of people or languages. This process of perceptual development occurs between approximately 6 and 12 months of age and is largely shaped by experience. However, the mechanisms underlying perceptual development during this time, and whether they are shared across domains, remain largely unknown. Here, we highlight research findings across domains and propose a top-down/bottom-up processing approach as a guide for future research. It is hypothesized that perceptual narrowing and tuning in development is the result of a shift from primarily bottom-up processing to a combination of bottom-up and top-down influences. In addition, we propose word learning as an important top-down factor that shapes tuning in both the speech and face domains, leading to similar observed developmental trajectories across modalities. Importantly, we suggest that perceptual narrowing/tuning is the result of multiple interacting factors and not explained by the development of a single mechanism.

    Building phonetic categories: an argument for the role of sleep

    The current review provides specific predictions for the role of sleep-mediated memory consolidation in the formation of new speech sound representations. Specifically, this discussion will highlight selected literature on the different ideas concerning category representation in speech, followed by a broad overview of memory consolidation and how it relates to human behavior, as relevant to speech/perceptual learning. In combining behavioral and physiological accounts from animal models with insights from the human consolidation literature on auditory skill/word learning, we are in the early stages of understanding how the transfer of experiential information between brain structures during sleep manifests in changes to online perception. Arriving at the conclusion that this process is crucial in perceptual learning and the formation of novel categories, further speculation yields the adjacent claim that habitual disruption of this process leads to impoverished quality in the representation of speech sounds.