10 research outputs found

    Language specific listening of Japanese geminate consonants: a cross-linguistic study

    Various aspects of linguistic experience influence the way we segment, represent, and process speech signals. The Japanese phonetic and orthographic systems represent geminate consonants (double consonants, e.g., /ss/, /kk/) in a unique way compared to other languages: one abstract representation is used to characterize the first part of geminate consonants despite the acoustic difference between their two distinct realizations (silence in the case of, e.g., stop consonants and elongation in the case of fricative consonants). The current study tests whether this discrepancy between abstract representations and acoustic realizations influences how native speakers of Japanese perceive geminate consonants. The experiments used pseudowords containing either the geminate consonant /ss/ or a manipulated version in which the first part was replaced by silence, /_s/. The sound /_s/ is acoustically similar to /ss/, yet does not occur in everyday speech. Japanese listeners demonstrated a bias to group these two types into the same category, while Italian and Dutch listeners distinguished them. The results thus confirmed that distinguishing fricative geminate consonants with silence from those with sustained frication is not crucial for Japanese native listening. Based on this observation, we propose that native speakers of Japanese tend to segment geminate consonants into two parts and that the first portion of fricative geminates is perceptually similar to a silent duration. This representation is compatible with both Japanese orthography and phonology. Unlike previous studies, which were inconclusive about how native speakers segment geminate consonants, our study demonstrated a relatively strong effect of Japanese-specific listening. Thus, the current experimental methods may open up new lines of investigation into the relationship between the development of phonological representations, orthography, and speech perception.
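
    As a rough illustration of the stimulus manipulation described above, the following Python sketch overwrites the first portion of a recorded /ss/ token with silence to create a /_s/ version. The file names and segment boundaries are invented placeholders, not the study's actual materials.

        import numpy as np
        from scipy.io import wavfile

        # Read a hypothetical recording of a pseudoword containing /ss/.
        rate, signal = wavfile.read("pseudoword_ss.wav")

        # Assumed boundaries (in seconds) of the first half of the geminate.
        start, end = int(0.180 * rate), int(0.290 * rate)

        # Replace the sustained frication in that span with silence.
        manipulated = signal.copy()
        manipulated[start:end] = 0

        wavfile.write("pseudoword_silence.wav", rate, manipulated)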

    Large-scale network dynamics of beta-band oscillations underlie auditory perceptual decision-making

    Perceptual decisions vary in the speed at which we make them. Evidence suggests that translating sensory information into perceptual decisions relies on distributed interacting neural populations, with decision speed hinging on power modulations of neural oscillations. Yet the dependence of perceptual decisions on the large-scale network organization of coupled neural oscillations has remained elusive. We measured magnetoencephalographic signals in human listeners who judged acoustic stimuli composed of carefully titrated clouds of tone sweeps. These stimuli were used in two task contexts, in which the participants judged the overall pitch or direction of the tone sweeps. We traced the large-scale network dynamics of the source-projected neural oscillations on a trial-by-trial basis using power-envelope correlations and graph-theoretical network discovery. In both tasks, faster decisions were predicted by higher segregation and lower integration of coupled beta-band (∼16–28 Hz) oscillations. We also uncovered the brain network states that promoted faster decisions in either lower-order auditory or higher-order control brain areas. Specifically, decision speed in judging the tone sweep direction critically relied on the nodal network configurations of anterior temporal, cingulate, and middle frontal cortices. Our findings suggest that global network communication during perceptual decision-making is implemented in the human brain by large-scale couplings between beta-band neural oscillations. The speed at which we make perceptual decisions varies. This translation of sensory information into perceptual decisions hinges on dynamic changes in neural oscillatory activity. However, the large-scale neural-network embodiment supporting perceptual decision-making is unclear. We addressed this question in two auditory perceptual decision-making tasks. Using graph-theoretical network discovery, we traced the large-scale network dynamics of coupled neural oscillations to uncover the brain network states that support the speed of auditory perceptual decisions. We found that higher network segregation of coupled beta-band oscillations supports faster auditory perceptual decisions over trials. Moreover, when auditory perceptual decisions are relatively difficult, decision speed benefits from higher segregation of frontal cortical areas, but lower segregation and greater integration of auditory cortical areas.
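
    A minimal sketch of the kind of pipeline named in the abstract: beta-band power envelopes are extracted per source signal, their correlation matrix is thresholded into a graph, and segregation and integration are summarized with standard graph metrics. The band limits, threshold, data shapes, and the specific metrics (transitivity and global efficiency) are assumptions, not the study's exact choices.

        import numpy as np
        import networkx as nx
        from scipy.signal import butter, filtfilt, hilbert

        def power_envelope(x, fs, band=(16.0, 28.0)):
            # Beta-band power envelope via band-pass filtering and the Hilbert transform.
            b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="bandpass")
            return np.abs(hilbert(filtfilt(b, a, x))) ** 2

        fs = 250.0
        sources = np.random.randn(40, 5000)  # toy data: 40 source regions x time samples
        envelopes = np.array([power_envelope(s, fs) for s in sources])

        # Power-envelope correlation matrix, thresholded to keep the strongest couplings.
        corr = np.corrcoef(envelopes)
        np.fill_diagonal(corr, 0.0)
        adjacency = (corr > np.percentile(corr, 90)).astype(int)

        graph = nx.from_numpy_array(adjacency)
        segregation = nx.transitivity(graph)       # one common segregation index
        integration = nx.global_efficiency(graph)  # one common integration index
        print(segregation, integration)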

    Word identification using phonetic features : towards a method to support multivariate fMRI speech decoding

    Using state-of-the-art multivariate machine learning approaches, researchers are now able to classify brain states from brain data. One application of this technique is decoding the phonemes being produced from brain data in order to decode produced words. However, this approach has been only moderately successful. Decoding articulatory features from brain data may be more feasible instead. As a first step towards this approach, we propose a word decoding method that is based on the detection of articulatory features (words are identified from a sequence of articulatory class labels). In essence, we investigated how lexical ambiguity is reduced as a function of the confusion between articulatory features, and as a function of the confusion between phonemes after feature decoding. We created a number of models based on different combinations of articulatory features and tested word identification on an English corpus of approximately 70,000 words. The most promising model used only 11 classes and identified 71% of words correctly. The results confirmed that it is possible to decode words based on articulatory features, which opens up opportunities for multivariate fMRI speech decoding.
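
    A toy illustration of the identification scheme described above: each word is a phoneme sequence, each phoneme maps to an articulatory class label, and a word counts as identifiable only if no other word shares its label sequence. The lexicon and class mapping below are invented examples, not the study's 11-class model.

        # Hypothetical mapping from phonemes to articulatory class labels.
        PHONEME_TO_CLASS = {
            "p": "bilabial", "b": "bilabial", "m": "bilabial",
            "t": "alveolar", "d": "alveolar", "n": "alveolar",
            "a": "open-vowel", "i": "close-vowel",
        }

        # Hypothetical mini-lexicon: word -> phoneme sequence.
        LEXICON = {
            "pat": ["p", "a", "t"],
            "bat": ["b", "a", "t"],
            "tan": ["t", "a", "n"],
        }

        def label_sequence(phonemes):
            return tuple(PHONEME_TO_CLASS[p] for p in phonemes)

        # Group words by articulatory label sequence; singleton groups are
        # uniquely identifiable, larger groups measure residual lexical ambiguity.
        groups = {}
        for word, phonemes in LEXICON.items():
            groups.setdefault(label_sequence(phonemes), []).append(word)

        identifiable = [g[0] for g in groups.values() if len(g) == 1]
        print(f"uniquely identified: {len(identifiable)}/{len(LEXICON)} words")
        # "pat" and "bat" collapse onto the same label sequence, so only
        # "tan" is uniquely identifiable in this toy lexicon.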

    Classification across multiple trials for the 19 ms VOT condition.

    Multi-trial performance for individual participants in both the native-English and native-Dutch participant groups is shown using colored lines (sorted according to mean individual performance), while the average for each group is shown using a thick black line. On average, performance increased when additional trials were included. Participants with relatively high single-trial classification rates tended to show additional improvement when decisions were based on additional trials, while participants with low single-trial classification rates showed less benefit from the inclusion of additional trials.
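
    A hedged sketch of how multi-trial decisions could be formed from single-trial classifier outputs, in the spirit of this figure: signed decision values are averaged over random k-trial subsets and the sign of the mean is scored. The decision values are simulated; nothing here reproduces the study's data or classifier.

        import numpy as np

        rng = np.random.default_rng(0)
        n_trials = 70
        # Simulated decision values for one participant: positive values
        # favor the correct class, with single-trial accuracy above chance.
        decision_values = rng.normal(loc=0.2, scale=1.0, size=n_trials)

        def multi_trial_accuracy(values, k, n_draws=1000, rng=rng):
            # Accuracy of the sign of the mean over random k-trial subsets.
            draws = rng.choice(values, size=(n_draws, k))
            return np.mean(draws.mean(axis=1) > 0)

        for k in (1, 5, 15, 35):
            print(k, multi_trial_accuracy(decision_values, k))
        # Accuracy climbs with k, mirroring the improvement described above.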

    Group level behavioral and ERP responses.

    a) Mean behavioral identification scores for native and non-native speakers for the three deviant stimuli. b) Group-level ERPs for both the standard and deviant stimuli are presented in each of the three measurement conditions for both native-English and native-Dutch participants. Responses are averaged across nine fronto-central electrode locations, indicated by the large dots in the scalp map presented above (see also Table 1). In addition, difference waves have been derived for each language group by subtracting the grand-average responses to the standard stimulus from those to the deviant stimulus in each of the measurement conditions. c) Area under the ROC-curve scores for spatio-temporal features across the three deviant conditions for both native and non-native participants. The relative locations of four midline electrodes are indicated for reference.
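
    For reference, the difference-wave computation mentioned in the caption reduces to a subtraction of grand averages; the sketch below assumes toy epoch arrays of shape trials x channels x time.

        import numpy as np

        # Toy epochs: trials x 9 fronto-central channels x time samples.
        standard_epochs = np.random.randn(200, 9, 300)
        deviant_epochs = np.random.randn(40, 9, 300)

        # Grand-average ERPs, averaged over the fronto-central channels.
        standard_erp = standard_epochs.mean(axis=0).mean(axis=0)
        deviant_erp = deviant_epochs.mean(axis=0).mean(axis=0)

        # Difference wave: deviant minus standard.
        difference_wave = deviant_erp - standard_erp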

    Within-participant classification analyses.

    a) Classification rates for native and non-native participants for each of the three stimulus conditions, along with group averages (shown with error bars). Participants are sorted based on the averaged results of the three analyses, as indicated by the horizontal lines. Asterisk size indicates the significance level of the result in each of the three conditions. b) Scatter plot of classifier performance with respect to the mean amplitude of the MMN component of individual ERPs measured in the study by Brandmeyer, Desain and McQueen [16]. c) Scatter plot of mean classifier decision rates per condition with respect to behavioral decisions in the identification task reported in that study.

    Experimental stimuli waveforms.

    A recording of the English CV syllable /pa/ with a voice onset time of 85 ms was used as the standard stimulus during EEG recordings. The three deviant stimuli were created by removing successive 22 ms portions of the aspirated period prior to voice onset in the original 85 ms VOT standard stimulus, and by inserting additional periods of voicing to preserve the duration of each stimulus. The onset of the initial plosive burst was preserved for all of the stimuli.

    Prediction of native language on the basis of electrophysiological and behavioral data.

    Five decoding analyses aimed at predicting the native language of a given participant on the basis of their measured data were carried out. Two analyses made use of concatenated single-trial EEG data from each of the three measurement conditions. The first of these analyses determined single-trial classification rates using this data set, while the second combined the single-trial predictions (70 total trials) for each participant's data obtained when it was used as a test set during the classification analysis. Two additional analyses made use of concatenated individual grand-averaged ERPs. One utilized both standard and deviant stimulus ERPs collected in all three measurement conditions, while the other included only the deviant ERPs measured using the 63 and 41 ms VOT stimuli. A final analysis was performed using a vector of seven mean behavioral identification scores collected for each participant in the original study by Brandmeyer et al. Significance levels are shown using asterisks and are based on the number of observations available for each of the five data sets. For the single-trial analysis, 1540 data points (70 per participant) were available, while for the remaining four analyses, 22 data points (one per participant) were available.
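
    A sketch of the between-group decoding idea described above: predict each participant's native language from one feature vector per participant (e.g., concatenated ERPs or behavioral scores) with leave-one-participant-out cross-validation. The feature dimensionality and the logistic-regression classifier are assumptions, not the study's exact pipeline.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import LeaveOneOut, cross_val_score

        n_participants = 22
        features = np.random.randn(n_participants, 120)  # one toy vector per participant
        labels = np.array([0] * 11 + [1] * 11)           # native vs. non-native

        # Each fold trains on 21 participants and tests on the held-out one.
        clf = LogisticRegression(max_iter=1000)
        scores = cross_val_score(clf, features, labels, cv=LeaveOneOut())
        print(f"leave-one-participant-out accuracy: {scores.mean():.2f}")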

    Cross-participant classification analyses.

    Classification rates for native and non-native participants for the two classifiers trained on cross-participant data sets using the 19 ms VOT deviant, along with individual rates from the within-participant classification analysis of the same deviant condition. Results for each of the three data sets are indicated using different colored bars. Participants are sorted based on the averaged results of the three analyses, as indicated by the horizontal lines. Group averages are also shown with error bars. Asterisk size indicates the significance level of a given individual result.