176 research outputs found

    Acoustic Correlates and Adult Perceptions of Distress in Infant Speech-Like Vocalizations and Cries

    Prior research has not evaluated acoustic features contributing to perception of human infant vocal distress, or lack thereof, on a continuum. The present research evaluates perception of infant vocalizations along a continuum ranging from the most prototypical intensely distressful cry sounds (“wails”) to the most prototypical of infant sounds that typically express no distress (non-distress “vocants”). Wails are deemed little if at all related to speech, while vocants are taken to be clear precursors to speech. We selected prototypical exemplars of utterances representing the whole continuum from 0- and 1-month-olds. In this initial study of the continuum, our goals are to determine (1) listener agreement on level of vocal distress across the continuum, (2) acoustic parameters predicting ratings of distress, (3) the extent to which individual listeners maintain or change their acoustic criteria for distress judgments across the study, (4) the extent to which different listeners use similar or different acoustic criteria to make judgments, and (5) the role of short-term experience among the listeners in judgments of infant vocalization distress.
Results indicated that (1) both inter-rater and intra-rater listener agreement on degree of vocal distress was high, (2) the best predictors of vocal distress were the number of vibratory regimes within utterances, utterance duration, spectral ratio (spectral concentration) in vibratory regimes within utterances, and mean pitch, (3) individual listeners significantly modified their acoustic criteria for distress judgments across the 10 trial blocks, (4) different listeners, while showing overall similarities in ratings of the 42 stimuli, also showed significant differences in the acoustic criteria used in assigning ratings of vocal distress, and (5) listeners both experienced and inexperienced in coding infant vocalizations showed high agreement in rating level of distress, but differed in the extent to which they relied on the different acoustic cues in making the ratings. The study provides a clearer characterization of vocal distress expression in infants based on acoustic parameters and a new perspective on active adult perception of infant vocalizations. The results also highlight the importance of vibratory regime segmentation and analysis in acoustically based research on infant vocalizations and their perception.
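For intuition, rough versions of the acoustic predictors named above (utterance duration, mean pitch, and a low/high-band spectral ratio as a stand-in for spectral concentration) can be computed directly from a waveform. The sketch below is illustrative only, not the authors' measurement pipeline; the band edges, pitch-search range, and autocorrelation pitch tracker are all assumptions:

```python
import numpy as np

def acoustic_features(x, sr, low_band=(0, 2000), high_band=(2000, 4000)):
    """Rough estimates of three distress-related acoustic features."""
    duration = len(x) / sr  # utterance duration in seconds

    # Mean pitch via autocorrelation, searching a plausible infant F0
    # range (~200-800 Hz); real pipelines would track pitch frame by frame.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = int(sr / 800), int(sr / 200)
    lag = lo + np.argmax(ac[lo:hi + 1])
    mean_pitch = sr / lag

    # Spectral ratio: energy below vs. above a cut, a crude measure
    # of how concentrated energy is in the low-frequency region.
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1 / sr)
    low = spec[(freqs >= low_band[0]) & (freqs < low_band[1])].sum()
    high = spec[(freqs >= high_band[0]) & (freqs < high_band[1])].sum()
    spectral_ratio = low / (high + 1e-12)
    return duration, mean_pitch, spectral_ratio

# Synthetic 400 Hz tone as a stand-in for a recorded utterance.
sr = 8000
t = np.arange(int(0.5 * sr)) / sr
dur, f0, ratio = acoustic_features(np.sin(2 * np.pi * 400 * t), sr)
```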

    Transcranial Direct Current Stimulation Combined With Listening to Preferred Music Alters Cortical Speech Processing in Older Adults

    Emerging evidence suggests transcranial direct current stimulation (tDCS) can improve cognitive performance in older adults. Similarly, music listening may improve arousal and stimulate subsequent performance on memory-related tasks. We examined the synergistic effects of tDCS paired with music listening on auditory neurobehavioral measures to investigate causal evidence of short-term plasticity in speech processing among older adults. In a randomized sham-controlled crossover study, we measured how combined anodal tDCS over dorsolateral prefrontal cortex (DLPFC) paired with listening to autobiographically salient music alters neural speech processing in older adults compared to either music listening (sham stimulation) or tDCS alone. EEG assays included both frequency-following responses (FFRs) and auditory event-related potentials (ERPs) to trace neuromodulation-related changes at brainstem and cortical levels. Relative to music without tDCS (sham), we found tDCS alone (without music) modulates the early cortical neural encoding of speech in the time frame of ∼100–150 ms. Whereas tDCS by itself appeared to largely produce suppressive effects (i.e., reducing ERP amplitude), concurrent music with tDCS restored responses to those of the music+sham levels. However, the interpretation of this effect is somewhat ambiguous, as this neural modulation could be attributable to a true effect of tDCS or to the presence/absence of music. Still, the combined benefit of tDCS+music (above tDCS alone) was correlated with listeners’ education level, suggesting that the benefit of neurostimulation paired with music might depend on listener demographics. tDCS changes in speech-FFRs were not observed with DLPFC stimulation. Improvements in working memory from pre- to post-session were also associated with better speech-in-noise listening skills.
Our findings provide new causal evidence that combined tDCS+music relative to tDCS alone (i) modulates the early (100–150 ms) cortical encoding of speech and (ii) improves working memory, a cognitive skill which may indirectly bolster noise-degraded speech perception in older listeners.

    The role of the auditory brainstem in processing musically-relevant pitch

    Neuroimaging work has shed light on the cerebral architecture involved in processing the melodic and harmonic aspects of music. Here, recent evidence is reviewed illustrating that subcortical auditory structures contribute to the early formation and processing of musically-relevant pitch. Electrophysiological recordings from the human brainstem and population responses from the auditory nerve reveal that nascent features of tonal music (e.g., consonance/dissonance, pitch salience, harmonic sonority) are evident at early, subcortical levels of the auditory pathway. The salience and harmonicity of brainstem activity is strongly correlated with listeners’ perceptual preferences and perceived consonance for the tonal relationships of music. Moreover, the hierarchical ordering of pitch intervals/chords described by Western music practice and their perceptual consonance is well-predicted by the salience with which pitch combinations are encoded in subcortical auditory structures. While the neural correlates of consonance can be tuned and exaggerated with musical training, they persist even in the absence of musicianship or long-term enculturation. As such, it is posited that the structural foundations of musical pitch might result from innate processing performed by the central auditory system. A neurobiological predisposition for consonant, pleasant-sounding pitch relationships may be one reason why these pitch combinations have been favored by composers and listeners for centuries. It is suggested that important perceptual dimensions of music emerge well before the auditory signal reaches cerebral cortex and prior to attentional engagement. While cortical mechanisms are no doubt critical to the perception, production, and enjoyment of music, the contribution of subcortical structures implicates a more integrated, hierarchically organized network underlying music processing within the brain.

    Subcortical sources dominate the neuroelectric auditory frequency-following response to speech

    Frequency-following responses (FFRs) are neurophonic potentials that provide a window into the encoding of complex sounds (e.g., speech/music), auditory disorders, and neuroplasticity. While the neural origins of the FFR remain debated, controversy has reemerged after demonstration that FFRs recorded via magnetoencephalography (MEG) are dominated by cortical rather than brainstem structures as previously assumed. Here, we recorded high-density (64 ch) FFRs via EEG and applied state-of-the-art source imaging techniques to multichannel data (discrete dipole modeling, distributed imaging, independent component analysis, computational simulations). Our data confirm a mixture of generators localized to bilateral auditory nerve (AN), brainstem inferior colliculus (BS), and bilateral primary auditory cortex (PAC). However, frequency-specific scrutiny of source waveforms showed the relative contribution of these nuclei to the aggregate FFR varied across stimulus frequencies. Whereas AN and BS sources produced robust FFRs up to ∼700 Hz, PAC showed weak phase-locking with little FFR energy above the speech fundamental (100 Hz). Notably, CLARA imaging further showed PAC activation was eradicated for FFRs >150 Hz, above which only subcortical sources remained active. Our results show (i) the site of FFR generation varies critically with stimulus frequency; and (ii) opposite the pattern observed in MEG, subcortical structures make the largest contribution to electrically recorded FFRs (AN ≥ BS > PAC). We infer that the cortical dominance observed in previous neuromagnetic data is likely due to the bias of MEG toward superficial brain tissue, underestimating subcortical structures that drive most of the speech-FFR. Cleanly separating subcortical from cortical FFRs can be achieved by ensuring stimulus frequencies are >150–200 Hz, above the phase-locking limit of cortical neurons.
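The practical recommendation in the final sentence, isolating the subcortical contribution by restricting analysis to energy above the cortical phase-locking limit, can be sketched as a zero-phase high-pass filter applied to an FFR waveform. The cutoff and filter order below are assumptions for illustration, not values from the study:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def isolate_subcortical_ffr(ffr, fs, cutoff=200.0, order=4):
    """Zero-phase high-pass above the assumed cortical phase-locking limit."""
    sos = butter(order, cutoff, btype="highpass", fs=fs, output="sos")
    return sosfiltfilt(sos, ffr)  # filtfilt avoids phase distortion

# Toy FFR: a 100 Hz component (where cortex can contribute) plus a
# 300 Hz component (above the cortical limit, presumed subcortical).
fs = 2000
t = np.arange(fs) / fs
ffr = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)
filtered = isolate_subcortical_ffr(ffr, fs)
```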

    Objective information-theoretic algorithm for detecting brainstem-evoked responses to complex stimuli

    Background: The scalp-recorded frequency-following response (FFR), an auditory-evoked potential with putative neural generators in the rostral brainstem, provides a robust representation of the neurophysiologic encoding of complex stimuli. The FFR is rapidly becoming a valuable tool for understanding the neural transcription of speech and music, language-related processing disorders, and brain plasticity at initial stages of the auditory pathway. Despite its potential clinical and empirical utility, determining the presence of a response is still dependent on subjective interpretation by an experimenter/clinician. Purpose: The purpose of the present work was to develop and validate a fully objective procedure for the automatic detection of FFRs elicited by complex auditory stimuli, including speech. Research Design: Mutual information (MI) was computed between the spectrographic representation of neural FFRs and their evoking acoustic stimuli to quantify the amount of shared time-frequency information between electrophysiologic responses and stimulus acoustics. To remove the human subjectivity associated with typical response evaluation, FFRs were first simulated at known signal-to-noise ratios using a computational model of the auditory periphery. The MI value at which model FFRs contained a +3 dB signal-to-noise ratio was taken as the criterion threshold (θMI) for the presence of a response. θMI was then applied as a binary classifier on actual neurophysiologic responses recorded previously in human participants (n = 35). Sham recordings, in which no stimulus was presented to participants, allowed us to determine the receiver operating characteristics of the MI metric and the capabilities of the algorithm to segregate true evoked responses from sham recordings. Results: Results showed high overall accuracy (93%) in the metric's ability to identify true responses from sham recordings.
The metric's overall performance was considerably better than that of trained human observers who, on average, accurately identified only ≈75% of the true neural responses. Complementary results were found in the metric's receiver operating characteristic (ROC) performance, with a sensitivity and specificity of 97% and 85%, respectively. Additionally, MI increased monotonically and was asymptotic with increasing trials (i.e., sweeps) contributing to the averaged FFR and, thus, can be used as a stopping criterion for signal averaging. Conclusions: The present results demonstrate that the mutual information between a complex acoustic stimulus and its corresponding brainstem response can provide a completely objective and robust method for automated FFR detection. Application of the MI metric to evoked-potential speech audiometry testing may provide clinicians with a more robust tool to quantitatively evaluate the presence and quality of speech-evoked brainstem responses, ultimately minimizing subjective interpretation and human error.
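The general idea, quantifying shared time-frequency information between stimulus and response, can be sketched with a histogram-based MI estimate between two spectrograms. This is a hedged illustration only: the published algorithm's spectrogram settings, binning scheme, and θMI calibration against model FFRs are not reproduced here.

```python
import numpy as np
from scipy.signal import spectrogram

def spectrogram_mi(stim, resp, fs, bins=16, nperseg=128):
    """Histogram-based mutual information (bits) between the spectrogram
    magnitudes of a stimulus and a neural response."""
    _, _, S = spectrogram(stim, fs=fs, nperseg=nperseg)
    _, _, R = spectrogram(resp, fs=fs, nperseg=nperseg)
    # Discretize magnitudes, then estimate the joint distribution.
    s = np.digitize(S.ravel(), np.histogram(S, bins)[1][1:-1])
    r = np.digitize(R.ravel(), np.histogram(R, bins)[1][1:-1])
    joint, _, _ = np.histogram2d(s, r, bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1, keepdims=True), p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())

# A faithful (low-noise) response shares more information with the
# stimulus than a sham recording of pure noise.
fs = 8000
t = np.arange(fs) / fs
stim = np.sin(2 * np.pi * 100 * t)
resp_good = stim + 0.1 * np.random.default_rng(0).standard_normal(fs)
resp_sham = np.random.default_rng(1).standard_normal(fs)
mi_good = spectrogram_mi(stim, resp_good, fs)
mi_sham = spectrogram_mi(stim, resp_sham, fs)
```

In the study's framework, `mi_good` exceeding a calibrated threshold θMI would flag a present response, while sham-like values would not.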

    Relative contribution of envelope and fine structure to the subcortical encoding of noise-degraded speech

    Brainstem frequency-following responses (FFRs) were elicited to the speech token /ama/ in noise containing only envelope (ENV) or fine structure (TFS) cues to assess the relative contribution of these temporal features to the neural encoding of degraded speech. Successive cue removal weakened FFRs, with noise having the most deleterious effect on TFS coding. Neuro-acoustic and response-to-response correlations revealed that speech-FFRs are dominated by stimulus ENV for clean speech, with TFS making a stronger contribution at moderate noise levels. Results suggest that the relative weighting of temporal ENV and TFS cues in the neural transcription of speech depends critically on the degree of noise in the soundscape.
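The ENV/TFS distinction can be illustrated with a Hilbert-transform decomposition: the analytic signal's magnitude gives the envelope and its cosine phase gives the fine structure. This single-band sketch is a simplification; stimuli like those in the study are typically built with multi-band vocoder processing, which is not reproduced here.

```python
import numpy as np
from scipy.signal import hilbert

def env_tfs(x):
    """Split a signal into temporal envelope (ENV) and fine structure (TFS)."""
    analytic = hilbert(x)
    env = np.abs(analytic)            # slow amplitude modulations
    tfs = np.cos(np.angle(analytic))  # unit-amplitude carrier (fine structure)
    return env, tfs

# AM tone: 10 Hz envelope modulating a 500 Hz carrier.
fs = 8000
t = np.arange(fs) / fs
x = (1 + 0.5 * np.sin(2 * np.pi * 10 * t)) * np.sin(2 * np.pi * 500 * t)
env, tfs = env_tfs(x)
```

An ENV-only stimulus would apply `env` to a noise carrier; a TFS-only stimulus would present `tfs` with its envelope flattened.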

    Sonification of scalp-recorded frequency-following responses (FFRs) offers improved response detection over conventional statistical metrics

    Background: The human frequency-following response (FFR) is a neurophonic potential used to examine the brain's encoding of complex sounds (e.g., speech) and monitor neuroplastic changes in auditory processing. Given the FFR's low amplitude (on the order of nanovolts), current conventions in the literature recommend collecting several thousand trials to obtain a robust evoked response with adequate signal-to-noise ratio. New method: By exploiting the spectrotemporal fidelity of the response, we examined whether auditory playbacks (i.e., “sonifications”) of the neural FFR could be used to assess the quality of running recordings and provide a stopping rule for signal averaging. Results and comparison with existing methods: In a listening task over headphones, naïve listeners detected speech-evoked FFRs within ∼500 sweeps based solely on their perception of the presence/absence of a tonal quality in the response. Moreover, response detection based on aural sonifications offered performance similar to, and in some cases a 2–3× improvement over, objective statistical techniques proposed in the literature (i.e., MI, SNR, MSC, F-test, Corr). Conclusions: Our findings suggest that simply listening to FFR responses (sonifications) might offer a rapid technique to monitor real-time EEG recordings and provide a stopping rule to terminate signal averaging that performs comparably to or better than current approaches.
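The core idea, averaging the accumulated trials and rendering the result audible so a listener can judge whether a tonal quality has emerged, can be sketched as follows. The normalization, looping, and toy stimulus below are assumptions for illustration, not the study's playback procedure:

```python
import numpy as np
from scipy.io import wavfile

def sonify_average(trials, fs, out_path="ffr_sonification.wav", repeats=20):
    """Average FFR trials, normalize, and loop the result into an audible WAV."""
    avg = np.mean(trials, axis=0)
    avg = avg / (np.max(np.abs(avg)) + 1e-12)          # scale to +/-1
    audio = np.tile(avg, repeats).astype(np.float32)   # loop for audibility
    wavfile.write(out_path, fs, audio)
    return audio

# Toy recording: a weak 100 Hz FFR buried in noise on every trial;
# averaging 500 sweeps makes the tonal component audible.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(int(0.2 * fs)) / fs
signal = 0.1 * np.sin(2 * np.pi * 100 * t)
trials = signal + rng.standard_normal((500, len(t)))
audio = sonify_average(trials, fs)
```

As a stopping rule, the experimenter would re-sonify the running average every few hundred sweeps and stop once the pitch is clearly heard.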

    Induced neural beta oscillations predict categorical speech perception abilities

    Neural oscillations have been linked to various perceptual and cognitive brain operations. Here, we examined the role of these induced brain responses in categorical speech perception (CP), a phenomenon in which similar features are mapped to discrete, common identities despite their equidistant/continuous physical spacing. We recorded neuroelectric activity while participants rapidly classified sounds along a vowel continuum (/u/ to /a/). Time-frequency analyses applied to the EEG revealed distinct temporal dynamics in induced (non-phase-locked) oscillations; increased β (15–30 Hz) activity coded prototypical vowel sounds carrying well-defined phonetic categories, whereas increased γ (50–70 Hz) activity accompanied ambiguous tokens near the categorical boundary. Notably, changes in β activity were strongly correlated with the slope of listeners' psychometric identification functions, a measure of the steepness of their categorical percept. Our findings demonstrate that in addition to previously observed evoked (phase-locked) correlates of CP, induced brain activity in the β-band codes the ambiguity and strength of categorical speech percepts.
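Induced (non-phase-locked) band power of the kind analyzed here is commonly estimated by subtracting the trial-averaged evoked response from each trial before band-pass filtering and taking the Hilbert envelope. This is a generic sketch under assumed band edges and filter settings, not the paper's exact time-frequency pipeline:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def induced_band_power(trials, fs, band=(15, 30), order=4):
    """Induced power: strip the phase-locked (evoked) average from each
    trial, band-pass, then average Hilbert envelope power across trials."""
    residual = trials - trials.mean(axis=0)  # remove evoked component
    sos = butter(order, band, btype="bandpass", fs=fs, output="sos")
    filt = sosfiltfilt(sos, residual, axis=1)
    return (np.abs(hilbert(filt, axis=1)) ** 2).mean(axis=0)

# Toy EEG: 20 Hz (beta-band) bursts with random phase per trial, so the
# activity is induced (survives evoked subtraction) rather than evoked.
rng = np.random.default_rng(0)
fs, n_trials, n_samp = 500, 50, 500
t = np.arange(n_samp) / fs
phases = rng.uniform(0, 2 * np.pi, n_trials)[:, None]
trials = np.sin(2 * np.pi * 20 * t + phases) + 0.2 * rng.standard_normal((n_trials, n_samp))
beta_power = induced_band_power(trials, fs, band=(15, 30))
```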

    Multichannel recordings of the human brainstem frequency-following response: Scalp topography, source generators, and distinctions from the transient ABR

    Brainstem frequency-following responses (FFRs) probe the neural transcription of speech/music, auditory disorders, and plasticity in subcortical auditory function. Despite clinical and empirical interest, the response's neural basis remains poorly understood. The current study aimed to more fully characterize functional properties of the human FFR (topography, source locations, generation). Speech-evoked FFRs were recorded using a high-density (64-channel) electrode montage. Source dipole modeling and 3-channel Lissajous analysis were used to localize the most likely FFR generators and their orientation trajectories. Additionally, transient auditory brainstem responses (ABRs), recorded in the same listeners, were used to predict FFRs and test the long-held assumption that the sustained potential reflects a series of overlapping onset responses. Results showed that FFRs were maximal at frontocentral scalp locations, with obliquely oriented sources from putative generators in the midbrain (i.e., upper brainstem). Comparisons between derived and actual recordings revealed the FFR is not a series of repeated ABR wavelets and thus represents a functionally distinct brainstem response. FFRs recorded at temporal electrode sites showed larger amplitudes and contained higher-frequency components than vertex channels (Fz, Cz), suggesting that FFRs measured near the mastoid are generated more peripherally (auditory nerve) than measurements at frontocentral scalp locations. This underscores the importance of reference electrode choice for FFR interpretation. Our findings provide non-invasive evidence that (i) FFRs reflect sustained neural activity whose sources are consistent with rostral brainstem generators and (ii) FFRs are functionally distinct from the onset ABR response.
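The "derived FFR" comparison tests the repeated-onset-response hypothesis: if the FFR were just overlapping ABRs, then overlap-adding copies of a transient ABR at the stimulus period should reproduce it. A minimal sketch of that superposition (the damped toy wavelet standing in for an ABR is an assumption):

```python
import numpy as np

def derived_ffr_from_abr(abr, f0, fs, duration):
    """Overlap-add copies of a transient ABR at the stimulus period to
    synthesize the FFR predicted by the repeated-onset-response hypothesis."""
    n = int(duration * fs)
    period = int(round(fs / f0))
    derived = np.zeros(n + len(abr))
    for onset in range(0, n, period):   # one onset response per F0 cycle
        derived[onset:onset + len(abr)] += abr
    return derived[:n]

# Toy ABR: a 5 ms damped wavelet; derive the predicted FFR for a 100 Hz F0.
fs = 16000
t_abr = np.arange(int(0.005 * fs)) / fs
abr = np.exp(-t_abr / 0.001) * np.sin(2 * np.pi * 600 * t_abr)
derived = derived_ffr_from_abr(abr, f0=100, fs=fs, duration=0.1)
```

In the study's logic, a poor match between this prediction and the actual recorded FFR is evidence that the FFR is not a chain of onset responses.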

    Sensitivity of the cortical pitch onset response to height, time-variance, and directionality of dynamic pitch

    Event-related brain potentials (ERPs) demonstrate that human auditory cortical responses are sensitive to changes in static pitch as indexed by the pitch onset response (POR), a negativity generated at the initiation of acoustic periodicity. Yet, it is still unclear if this brain signature is sensitive to dynamic, time-varying properties of pitch more characteristic of those found in naturalistic speech and music. Neuroelectric PORs were recorded in response to contrastive pitch patterns differing in their pitch height, time-variance, and directionality (i.e., rise vs. fall). Broadband noise followed by contiguous iterated rippled noise (producing salient pitch sweeps) was used to temporally separate neural activity coding the onset of acoustic energy from the onset of time-varying pitch. Analysis of PORs revealed distinct modulations in response latency that distinguished static from time-varying pitch contours (steady-state < dynamic) and pitch height (high < low). However, PORs were insensitive to the direction of pitch sweeps (rise = fall). Our findings suggest that the POR signature provides a useful neural index of auditory cortical pitch processing for some, but not all, pitch-evoking stimuli.
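Iterated rippled noise itself is generated by a delay-and-add network: each iteration adds a delayed copy of the noise, producing a pitch at 1/delay while retaining a noisy timbre. The static-delay sketch below (delay, iteration count, and gain are assumed parameters) illustrates the principle; the study's sweep stimuli would instead use a time-varying delay.

```python
import numpy as np

def iterated_rippled_noise(duration, fs, delay_s, n_iter=8, gain=1.0, seed=0):
    """Delay-and-add noise: n_iter iterations each add a copy delayed by
    delay_s, yielding a pitch at 1/delay_s embedded in noise."""
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(int(duration * fs))
    d = int(round(delay_s * fs))
    for _ in range(n_iter):
        delayed = np.concatenate([np.zeros(d), y[:-d]])
        y = y + gain * delayed
    return y / np.max(np.abs(y))  # normalize for playback

# 8 ms delay -> a 125 Hz pitch percept.
fs = 16000
irn = iterated_rippled_noise(0.25, fs, delay_s=0.008)
```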