640 research outputs found

    Cortical tracking of speech in noise accounts for reading strategies in children

    Humans’ propensity to acquire literacy relates to several factors, including the ability to understand speech in noise (SiN). Still, the nature of the relation between reading and SiN perception abilities remains poorly understood. Here, we dissect the interplay between (1) reading abilities, (2) classical behavioral predictors of reading (phonological awareness, phonological memory, and rapid automatized naming), and (3) electrophysiological markers of SiN perception in 99 elementary school children (26 with dyslexia). We demonstrate that, in typical readers, cortical representation of the phrasal content of SiN relates to the degree of development of the lexical (but not sublexical) reading strategy. In contrast, classical behavioral predictors of reading abilities and the ability to benefit from visual speech to represent the syllabic content of SiN account for global reading performance (i.e., speed and accuracy of lexical and sublexical reading). In individuals with dyslexia, we found preserved integration of visual speech information to optimize processing of syntactic information but not to sustain acoustic/phonemic processing. Finally, within children with dyslexia, measures of cortical representation of the phrasal content of SiN were negatively related to reading speed and positively related to the compromise between reading precision and reading speed, potentially owing to compensatory attentional mechanisms. These results clarify the nature of the relation between SiN perception and reading abilities in typical child readers and children with dyslexia and identify novel electrophysiological markers of emergent literacy.
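
    The 'cortical representation' measures referred to above come from EEG-based neural tracking of the speech signal. As orientation only, here is a minimal sketch of one common way an envelope-tracking index is computed: correlating the EEG with the stimulus amplitude envelope over a range of lags. This is a generic illustration on surrogate data, not the authors' phrase-level pipeline, and all parameter choices (sampling rates, lag window) are assumptions.

```python
# Minimal sketch of a speech-envelope tracking index (illustrative, not the
# authors' pipeline): correlate the EEG with the stimulus amplitude envelope
# over a range of time lags and take the peak correlation.
import numpy as np
from scipy.signal import hilbert, resample

def envelope(audio, audio_fs, eeg_fs):
    """Broadband amplitude envelope of the stimulus, resampled to the EEG rate."""
    env = np.abs(hilbert(audio))
    return resample(env, int(round(len(audio) * eeg_fs / audio_fs)))

def tracking_index(eeg, env, eeg_fs, max_lag_s=0.4):
    """Peak envelope-EEG correlation over lags 0..max_lag_s (EEG lags stimulus)."""
    max_lag = int(max_lag_s * eeg_fs)
    rs = [np.corrcoef(env[:len(env) - lag], eeg[lag:len(env)])[0, 1]
          for lag in range(max_lag + 1)]
    return max(rs)

# Demo on surrogate data: "EEG" that weakly follows the envelope at ~150 ms.
fs_audio, fs_eeg = 16000, 128
rng = np.random.default_rng(0)
audio = rng.standard_normal(fs_audio * 10)       # stand-in for a speech recording
env = envelope(audio, fs_audio, fs_eeg)
eeg = 0.2 * np.roll(env, int(0.15 * fs_eeg)) + rng.standard_normal(len(env))
print(f"tracking index r = {tracking_index(eeg, env, fs_eeg):.3f}")
```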

    Dichotic integration of acoustic-phonetic information: Competition from extraneous formants increases the effect of second-formant attenuation on intelligibility

    Differences in ear of presentation and level do not prevent effective integration of concurrent speech cues such as formant frequencies. For example, presenting the higher formants of a consonant-vowel syllable in the opposite ear to the first formant protects them from upward spread of masking, allowing them to remain effective speech cues even after substantial attenuation. This study used three-formant (F1+F2+F3) analogues of natural sentences and extended the approach to include competitive conditions. Target formants were presented dichotically (F1+F3; F2), either alone or accompanied by an extraneous competitor for F2 (i.e., F1±F2C+F3; F2) that listeners must reject to optimize recognition. F2C was created by inverting the F2 frequency contour and using the F2 amplitude contour without attenuation. In experiment 1, F2C was always absent and intelligibility was unaffected until F2 attenuation exceeded 30 dB; F2 still provided useful information at 48-dB attenuation. In experiment 2, attenuating F2 by 24 dB caused considerable loss of intelligibility when F2C was present, but had no effect in its absence. Factors likely to contribute to this interaction include informational masking from F2C acting to swamp the acoustic-phonetic information carried by F2, and interaural inhibition from F2C acting to reduce the effective level of F2.
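
    The stimulus layout described above is simple to make concrete. The sketch below illustrates, under assumed conventions, the dichotic channel assignment and the dB attenuation applied to F2; `f1`, `f2`, `f3`, and `f2c` stand for hypothetical mono formant-analogue waveforms of equal length, and this is not the authors' synthesis code.

```python
# Sketch of the dichotic (F1±F2C+F3; F2) stimulus layout, with F2 attenuation
# in dB (illustrative assumptions, not the authors' synthesis code).
import numpy as np

def db_to_gain(att_db):
    """Convert attenuation in dB to a linear amplitude factor."""
    return 10.0 ** (-att_db / 20.0)

def build_dichotic(f1, f2, f3, f2c=None, f2_att_db=0.0):
    """Return a stereo array: F1+F3 (plus optional F2C) left, attenuated F2 right."""
    left = f1 + f3
    if f2c is not None:
        left = left + f2c               # competitor added without attenuation
    right = db_to_gain(f2_att_db) * f2  # target F2, attenuated by f2_att_db
    return np.stack([left, right], axis=1)

# e.g., the competitor-present, 24-dB condition of experiment 2:
# stereo = build_dichotic(f1, f2, f3, f2c=f2c, f2_att_db=24.0)
```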

    The Interplay Between Interference Control and L2 Proficiency in L2 Auditory Sentence Comprehension in the Presence of Verbal and Non-Verbal Masking

    Speech perception and comprehension in the presence of interfering auditory stimuli is a challenge for bilingual listeners (e.g., Ezzatian, Avivi-Reich, & Schneider, 2010; Krizman, Bradlow, Lam, & Kraus, 2017). How efficiently and skillfully listeners manage auditory interference may also be closely related to their ability to pay attention to a target and suppress irrelevant information. Based on Friedman and Miyake’s (2004) framework of interference control, this dissertation investigated the underlying mechanisms of late Korean-English bilingual individuals’ auditory interference control in the presence of auditory verbal and nonverbal masking and evaluated the potential interaction between L2 proficiency and interference control. Two groups of late bilingual listeners with high and mid L2 proficiency participated in three experiments. Experiment 1 investigated the interplay between interference control and L2 proficiency in bilingual listeners. Seventy Korean-English bilingual participants with high- and mid-L2 proficiency levels were recruited and tested with an L2 auditory sentence comprehension test. In this task, participants listened to English target sentences with and without masking (i.e., an auditory distractor) and judged the semantic plausibility of each sentence. Three masking conditions were presented: nonverbal speech-modulated noise, L1 verbal masking, and L2 verbal masking, to assess the effect of different types of auditory interference during L2 listening. The results of the plausibility task indicated that the effect of the verbal masking depended on the listeners’ L2 proficiency. The effects of the L1 and L2 masking did not differ significantly for the high-proficiency L2 listeners when they listened in L2. However, the L1 masking had a significantly greater interference effect than the L2 masking on L2 listening among the mid-proficiency listeners. This suggests an interaction between L2 proficiency and interference control. Experiment 2 examined whether there is an interference effect beyond the L2 proficiency effect found in Experiment 1. The participants from Experiment 1 engaged in an auditory sentence comprehension task, the word selection task, in which they listened to English target sentences presented with L1 or L2 verbal masking. The procedure and the type of stimuli were exactly the same as in Experiment 1; in particular, the participants were still asked to pay attention to the target sentence based on a given picture cue. However, instead of making plausibility judgments, they were shown a word list and asked to select all the words that they had heard. The word list included two words extracted from the target sentence, two words from the masking sentence, and four words that were not presented but had a semantic or phonological association with the other four words from the two presented sentences. The results showed an interaction between proficiency and interference effect. The high-proficiency group identified a similar number of target words in both L1 and L2 masking conditions, whereas the mid-proficiency group identified more target words in the L2 masking condition, suggesting a proficiency effect. On the other hand, both groups identified more content words from the non-target sentence in the L1 masking condition than in the L2 masking condition, suggesting a greater interference effect from L1 masking than from L2 masking. Interestingly, both groups identified a similar number of non-target words in the L1 masking condition. Nonetheless, the high-proficiency group identified more content words from the target sentence than the mid-proficiency group. These findings suggest that the high-proficiency L2 listeners have better attentional control over the target stimuli than the mid-proficiency group. The finding that L2 listeners identified content words from both the target and masking sentences, particularly under L1 masking, suggests that the major difference between the groups lies in their ability to divide their attention and orient it to the target signal. Experiment 3 employed a nonverbal auditory interference control task to investigate whether the group difference in the verbal task reflects primarily language-specific or domain-general cognitive-control systems. The same participant groups listened to and simultaneously counted two types of target animal sounds masked by various other animal sounds. The results showed that the high- and mid-proficiency groups performed similarly on this nonverbal task, unlike the verbal task in Experiment 2. This suggests that the suppression of nonverbal interference involves a domain-general interference control system, whereas verbal interference may require an additional domain-specific control ability above and beyond domain-general cognitive control. This study provides novel evidence that the effects of auditory interference in a bilingual’s two languages on L2 listening differ according to the listeners’ L2 proficiency. Second language listening with accompanying auditory interference requires interference control, an ability subserved, at least in part, by a different control system from the one for nonverbal materials. Sentence comprehension in L2 was more adversely affected by L1 interference than by L2 interference, particularly in bilingual individuals with mid-proficiency in L2. However, L2 listeners with high L2 proficiency exhibited better control in suppressing the L1 interference than the mid-proficiency listeners. These findings highlight the importance of considering language proficiency and interference control together when evaluating L2 listening comprehension for individuals who listen in an L2.

    Formant-frequency variation and informational masking of speech by extraneous formants: evidence against dynamic and speech-specific acoustical constraints

    How speech is separated perceptually from other speech remains poorly understood. Recent research indicates that the ability of an extraneous formant to impair intelligibility depends on the variation of its frequency contour. This study explored the effects of manipulating the depth and pattern of that variation. Three formants (F1+F2+F3) constituting synthetic analogues of natural sentences were distributed across the two ears, together with a competitor for F2 (F2C) that listeners must reject to optimize recognition (left = F1+F2C; right = F2+F3). The frequency contours of F1-F3 were each scaled to 50% of their natural depth, with little effect on intelligibility. Competitors were created either by inverting the frequency contour of F2 about its geometric mean (a plausibly speech-like pattern) or using a regular and arbitrary frequency contour (triangle wave, not plausibly speech-like) matched to the average rate and depth of variation for the inverted F2C. Adding a competitor typically reduced intelligibility; this reduction depended on the depth of F2C variation, being greatest for 100%-depth, intermediate for 50%-depth, and least for 0%-depth (constant) F2Cs. This suggests that competitor impact depends on overall depth of frequency variation, not depth relative to that of the target formants. The absence of tuning (i.e., no minimum in intelligibility for the 50% case) suggests that the ability to reject an extraneous formant does not depend on similarity in the depth of formant-frequency variation. Furthermore, triangle-wave competitors were as effective as their more speech-like counterparts, suggesting that the selection of formants from the ensemble also does not depend on speech-specific constraints.
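
    The three competitor manipulations (scaling contour depth, inverting the contour about its geometric mean, and substituting a rate- and depth-matched triangle wave) have compact forms if one assumes they operate on a log-frequency scale, as is conventional for formant contours. The sketch below is illustrative only, not the authors' code.

```python
# Sketch of the three F2C frequency-contour manipulations, assuming operations
# on a log-frequency scale about the geometric mean (illustrative only).
import numpy as np

def scale_depth(f_contour, depth):
    """Scale contour variation to `depth` of natural (0.5 = 50%; 0 = constant)."""
    gm = np.exp(np.mean(np.log(f_contour)))
    return gm * (f_contour / gm) ** depth

def invert(f_contour):
    """Mirror the contour about its geometric mean (plausibly speech-like F2C)."""
    gm = np.exp(np.mean(np.log(f_contour)))
    return gm ** 2 / f_contour

def triangle(f_contour, fs, rate_hz):
    """Triangle wave matched to the contour's geometric mean and log-range."""
    gm = np.exp(np.mean(np.log(f_contour)))
    half_range = (np.log(f_contour).max() - np.log(f_contour).min()) / 2
    t = np.arange(len(f_contour)) / fs
    tri = 2 * np.abs(2 * ((t * rate_hz) % 1) - 1) - 1   # ranges over [-1, 1]
    return gm * np.exp(half_range * tri)
```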

    Effects of spatial separation on across-frequency grouping in narrowband speech

    Thesis (M.S.)--Boston University. Understanding how we perceive speech in the face of competing sound sources coming from a variety of directions is an important goal in psychoacoustics. In everyday situations, noisy interference can obscure the content of a conversation and require listeners to integrate speech information across different frequency regions. Two studies are described that investigate the effects of spatial separation on the grouping of two spectrally separated, narrow bands of target speech with a variety of filler stimuli centered in between these bands. Target sentences taken from the IEEE corpus were broken into two 3/4-octave bands, with the lowest centered around 370 Hz and the highest centered around 6 kHz. The first study explored the spatial influences on spectral restoration. The primary experiment measured speech intelligibility of the speech bands (presented diotically) with a single band of noise between 700 Hz and 3 kHz used as the filler, and then with the same noise band modulated by the target speech envelope as the filler. These fillers were presented diotically as well as with an ITD of 600 μs leading to the left ear. Performance was worse for the unmodulated noise condition when the filler was separated spatially from the speech bands. Across-frequency grouping was not observed with the modulated noise conditions. The second study explored the effect of attention on intelligibility of speech bands presented from the left with related fillers. The filler objects used in this study were dual bands of vocoded or narrowband speech presented either from the left or right. The fillers were derived from either the same target speech token (matched) or an independent sentence (conflicting). In a key experimental block, listeners were instructed to attend to the target speech on the left while either conflicting bands or, infrequently, matched bands were presented on the right. The infrequently presented matching trials were physically identical to trials in another block where listeners were instructed to attend to both ears. Results showed that splitting the target and filler across the ears degraded intelligibility; however, directed spatial attention had no effect on performance. These results demonstrate that speech elements group together strongly, overcoming spatial attention, even for degraded speech.
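
    The band-splitting and filler construction described above can be sketched compactly. The code below assumes Butterworth band-pass filters and a Hilbert envelope, which are common choices but not necessarily those used in the thesis; `speech` and `fs` are hypothetical inputs.

```python
# Sketch of the 3/4-octave target bands and the two fillers (assumed filter
# and envelope choices, not the thesis code).
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def octave_band(x, fs, centre_hz, width_oct=0.75, order=4):
    """Band-pass `x` to a band `width_oct` octaves wide around `centre_hz`."""
    lo, hi = centre_hz * 2 ** (-width_oct / 2), centre_hz * 2 ** (width_oct / 2)
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def noise_filler(n, fs, seed=0):
    """Unmodulated filler: Gaussian noise band-limited to 700 Hz-3 kHz."""
    rng = np.random.default_rng(seed)
    sos = butter(4, [700.0, 3000.0], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, rng.standard_normal(n))

def modulated_filler(speech, fs, seed=0):
    """Same noise band, multiplied by the target's broadband amplitude envelope."""
    env = np.abs(hilbert(speech))
    return noise_filler(len(speech), fs, seed) * env / env.max()

# speech: a mono IEEE sentence sampled at fs (hypothetical input)
# low_band, high_band = octave_band(speech, fs, 370.0), octave_band(speech, fs, 6000.0)
```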

    The directional effect of target position on spatial selective auditory attention

    Spatial selective auditory attention plays a crucial role in listening in a mixture of competing speech sounds. Previous neuroimaging studies have reported alpha-band neural activity modulated by auditory attention, along with alpha lateralization corresponding to attentional focus. A greater cortical representation of the attended speech envelope compared to the ignored speech envelope was also found, a phenomenon known as 'neural speech tracking'. However, little is known about neural activity when attentional focus is directed to speech sounds from behind the listener, even though understanding speech from behind is a common and essential aspect of daily life. The objectives of this study are to investigate the impact of four distinct target positions (left, right, front, and, particularly, behind) on spatial selective auditory attention by concurrently assessing 1) spatial selective speech identification, 2) oscillatory alpha-band power, and 3) neural speech tracking. Fifteen young adults with normal hearing (NH) were enrolled in this study (M = 21.40 years, ages 18-29; 10 females). The selective speech identification task indicated that the target presented from behind was the most challenging condition, followed by the front condition, with the lateral conditions being the least demanding. The normalized alpha power was modulated by target position and was significantly lateralized when attention was directed to the left or right, but not to the front or back. Parieto-occipital alpha power in the front-back configuration was significantly lower than in the left-right configuration, and the normalized alpha power in the back condition was significantly higher than in the front condition. The speech tracking of the to-be-attended speech envelope was affected by the direction of the target stream. The behavioral outcome (selective speech identification) was correlated with both neural correlates of auditory attention, parieto-occipital alpha power and the neural speech tracking coefficient, but there was no significant correlation between alpha power and neural speech tracking. The results suggest that, in addition to existing mechanistic accounts, it may be necessary to consider how the brain responds to sounds at different locations in order to interpret neural correlates and behavioral outcomes meaningfully, and they point to a potential application of neural speech tracking in studies of spatial selective hearing.
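
    Two of the neural measures named above have widely used definitions that can be sketched briefly: parieto-occipital alpha power from Welch spectra, and an alpha lateralization index contrasting the two hemispheres. The definitions below are common conventions, not necessarily the thesis pipeline; channel indices and band limits are assumptions.

```python
# Sketch of parieto-occipital alpha power and an alpha lateralization index,
# under common definitions (assumed band limits and channel indices).
import numpy as np
from scipy.signal import welch

def alpha_power(eeg, fs, band=(8.0, 12.0)):
    """Mean alpha-band power per channel; `eeg` has shape (n_channels, n_samples)."""
    f, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
    sel = (f >= band[0]) & (f <= band[1])
    return psd[:, sel].mean(axis=1)

def lateralization_index(power, left_ch, right_ch):
    """(right - left) / (right + left) over parieto-occipital channel indices."""
    r, l = power[right_ch].mean(), power[left_ch].mean()
    return (r - l) / (r + l)

# e.g., with hypothetical parieto-occipital channels 0-3 (left) and 4-7 (right):
# li = lateralization_index(alpha_power(eeg, fs), [0, 1, 2, 3], [4, 5, 6, 7])
```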

    Decoding auditory attention and neural language processing in adverse conditions and different listener groups

    This thesis investigated subjective, behavioural and neurophysiological (EEG) measures of speech processing in various adverse conditions and with different listener groups. In particular, it focused on different neural processing stages and their relationship with auditory attention, effort, and measures of speech intelligibility. Study 1 set the groundwork by establishing a toolbox of neural measures for investigating online speech processing, from the frequency following response (FFR) and cortical measures of speech processing to the N400, a measure of lexico-semantic processing. Results showed that peripheral processing is heavily influenced by stimulus characteristics such as degradation, whereas central processing stages are more closely linked to higher-order phenomena such as speech intelligibility. In Study 2, a similar experimental paradigm was used to investigate differences in neural processing between a hearing-impaired and a normal-hearing group. Subjects were presented with short stories in different levels of multi-talker babble noise and with different settings on their hearing aids. Findings indicate that, particularly at lower noise levels, the hearing-impaired group showed much higher cortical entrainment than the normal-hearing group, despite similar levels of speech recognition. Intersubject correlation, another global neural measure of auditory attention, was, however, similarly affected by noise levels in both the hearing-impaired and the normal-hearing group. This finding indicates extra processing in the hearing-impaired group only at the level of the auditory cortex. Study 3, in contrast to Studies 1 and 2 (which both investigated the effects of bottom-up factors on neural processing), examined the links between entrainment and top-down factors, specifically motivation, as well as reasons for the higher entrainment found in hearing-impaired subjects in Study 2. Results indicated that, while behaviourally there was no difference between incentive and non-incentive conditions, neurophysiological measures of attention such as intersubject correlation were affected by the presence of an incentive to perform better. Moreover, using a specific degradation type resulted in subjects’ increased cortical entrainment under degraded conditions. These findings support the hypothesis that top-down factors such as motivation influence neurophysiological measures, and that higher entrainment to degraded speech might be triggered specifically by the reduced availability of spectral detail contained in speech.
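
    Intersubject correlation, used in Studies 2 and 3, has a particularly simple common form: the mean pairwise correlation between subjects' responses to the same story. The sketch below uses that definition, which may differ from the exact variant computed in the thesis.

```python
# Sketch of a simple intersubject-correlation (ISC) measure: the mean pairwise
# Pearson correlation between subjects' responses to the same story (one common
# definition; the thesis may use a different variant).
import numpy as np
from itertools import combinations

def isc(responses):
    """Mean pairwise correlation; `responses` has shape (n_subjects, n_samples)."""
    rs = [np.corrcoef(responses[i], responses[j])[0, 1]
          for i, j in combinations(range(responses.shape[0]), 2)]
    return float(np.mean(rs))

# Demo: ten "subjects" sharing a common component have positive ISC.
rng = np.random.default_rng(0)
shared = rng.standard_normal(1000)
subjects = 0.3 * shared + rng.standard_normal((10, 1000))
print(f"ISC = {isc(subjects):.3f}")
```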

    A microscopic analysis of consistent word misperceptions.

    Speech misperceptions have the potential to help us understand the mechanisms involved in human speech processing. Consistent misperceptions are especially helpful in this regard, eliminating the variability stemming from individual differences, which in turn makes it easier to analyse confusion patterns at higher levels of speech units such as the word. In this thesis, we conducted an analysis of consistent word misperceptions from a "microscopic" perspective. Starting with a large-scale elicitation experiment, we collected over 3200 consistent misperceptions from over 170 listeners. We investigated the obtained misperceptions from a signal-independent and a signal-dependent perspective. In the former, we analysed error trends between the target and misperceived words across multiple levels of speech units. We showed that the error patterns observed are highly dependent on the eliciting masker type and contrasted our results with previous findings. In the latter, we attempted to explain misperceptions based on the underlying speech-noise interaction. Using tools from automatic speech recognition, we conducted an automatic classification of confusions based on their origin and quantified the role that misallocation of speech fragments played in the generation of misperceptions. Finally, we introduced modifications to the original confusion-eliciting stimuli to try to recover the original utterance by providing release from either the masker's energetic or informational component. Listeners' percepts were re-evaluated in response to the modified stimuli, which revealed whether many confusions originated from the energetic or the informational component of the masker.

    Simultaneous Bilinguals’ Comprehension of Accented Speech

    L2-accented speech recognition has typically been studied with monolingual listeners or late L2-learners, but simultaneous bilinguals may have a different experience: their two phonologies offer flexibility in phonological-lexical mapping (Samuel and Larraza, 2015), which may be advantageous. On the other hand, the two languages cause greater lexical competition (Marian & Spivey, 2003), which may impede successful L2-accented speech recognition. The competition between a bilingual’s two languages is the oft-cited explanation, for example, as to why bilinguals underperform monolinguals in native-accented speech-in-noise tasks (Rogers et al., 2006). To investigate the effect of bilingualism on L2-accented speech recognition, the current studies compare monolingual and simultaneous bilingual listeners in three separate experiments. In the first study, both groups repeated sentences produced by speakers of Mandarin-accented English whose English proficiencies varied. In the second study, the stimuli were presented in varying levels and types of noise, and a native-accented speaker was included. In each of these first two studies, the sentences were semantically anomalous (i.e., nonsensical). In the third study, the stimuli were meaningful sentences, presented in a single noise condition, and spoken by either a native speaker or an L2-accented speaker. Mixed effects models revealed differences in L2-accented speech recognition measures driven by listeners’ language backgrounds only in Experiments 2 and 3; in Experiment 1, performance between groups was statistically indistinguishable. Results in Experiments 2 and 3 also replicated the prior finding that bilinguals perform worse for native-accented speech in noise. We propose that neither a flexible phonological-lexical mapping system nor increased lexical competition can alone sufficiently explain the deficit (relative to monolinguals) that simultaneous bilinguals exhibit when faced with L2-accented speech in real-world listening conditions. We discuss the possible implications of processing capacity and cognitive load, and suggest that these two factors are more likely to contribute to experimental outcomes. Future studies with pupillometry to explore these hypotheses are also discussed.
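
    As an indication of the kind of analysis reported, the sketch below fits a mixed-effects model of recognition accuracy with listener group and noise condition as fixed effects and random intercepts by listener. The model specification, variable names, and toy data are all assumptions for illustration, not the authors' exact model.

```python
# Sketch of a mixed-effects analysis of recognition accuracy (assumed model
# specification and toy data; not the authors' exact model).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400
listener = rng.integers(0, 40, n)
df = pd.DataFrame({
    "listener": listener.astype(str),
    "group": np.where(listener < 20, "monolingual", "bilingual"),  # between-listener
    "noise": rng.choice(["quiet", "noise"], n),                    # within-listener
})
# Toy accuracy with a small hypothetical group effect.
df["accuracy"] = rng.normal(0.8, 0.1, n) - 0.05 * (df["group"] == "bilingual")

# Fixed effects of group, noise, and their interaction; random intercepts by listener.
model = smf.mixedlm("accuracy ~ group * noise", data=df, groups=df["listener"])
print(model.fit().summary())
```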