Native Language Identification on Text and Speech
This paper presents an ensemble system combining the output of multiple SVM
classifiers for native language identification (NLI). The system was submitted
to the NLI Shared Task 2017 fusion track, which featured student essays and
spoken responses, in the form of audio transcriptions and iVectors, by non-native
English speakers of eleven native languages. Our system competed in the
challenge under the team name ZCD and was based on an ensemble of SVM
classifiers trained on character n-grams, achieving 83.58% accuracy and ranking
3rd in the shared task.
Comment: Proceedings of the Workshop on Innovative Use of NLP for Building Educational Applications (BEA)
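The ensemble approach described above can be sketched as follows. This is a minimal illustration, not the ZCD team's actual system: the toy texts, the choice of `char_wb` features, the n-gram orders, and score summation as the fusion rule are all assumptions for the example.

```python
# Minimal sketch of an SVM ensemble over character n-grams for NLI.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Toy "essays" from two hypothetical native-language groups.
train_texts = ["der die das und", "das und der die", "le la les et", "et le la les"]
train_labels = ["DE", "DE", "FR", "FR"]

# One SVM per character n-gram order; the ensemble sums decision scores.
members = []
for n in (1, 2, 3):
    clf = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(n, n)),
        LinearSVC(C=1.0),
    )
    clf.fit(train_texts, train_labels)
    members.append(clf)

def ensemble_predict(texts):
    # Sum each member's signed distance to its hyperplane (binary case:
    # positive score -> classes_[1], negative -> classes_[0]).
    scores = sum(m.decision_function(texts) for m in members)
    classes = members[0].classes_
    return [classes[1] if s > 0 else classes[0] for s in scores]
```

Summing decision scores is one simple fusion strategy; majority voting over member predictions is an equally common alternative.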
Language and Culture
Language pervades social life. It is a primary means by which we gain access to the contents of others' minds and establish a shared understanding of reality. Meanwhile, there is an enormous amount of linguistic diversity among human populations. Depending on what counts as a language, there are 3,000 to 10,000 living languages in the world, although a quarter of the world's languages have fewer than 1,000 speakers and half have fewer than 10,000 (Crystal, 1997). Not surprisingly, a key question in culture and psychology research concerns the role of language in cultural processes. The present chapter focuses on two issues that have received by far the greatest amount of research attention from cultural researchers. First, how do language and human cultures co-evolve? Second, what are the non-linguistic cognitive effects of using a certain language? Does speaking different languages orient individuals to see and experience external reality differently? The scope of the present chapter does not permit a comprehensive review of all pertinent research; only a selected sample of studies will be used to illustrate its main ideas.
Comparison of word-, sentence-, and phoneme-based training strategies in improving the perception of spectrally-distorted speech
Purpose: To compare the effectiveness of three self-administered strategies for auditory training that might improve speech perception by adult users of cochlear implants. The strategies are based, respectively, on discriminating isolated words, words in sentences, and phonemes in nonsense syllables. Method: Participants were 18 normally-hearing adults who listened to speech processed by a noise-excited vocoder to simulate the information provided by a cochlear implant. They were assigned randomly to word-, sentence-, or phoneme-based training and underwent nine 20-minute training sessions on separate days over a 2- to 3-week period. The effectiveness of training was assessed as the improvement in accuracy of discriminating vowels and consonants, and identifying words in sentences, relative to participants' best performance in repeated tests prior to training. Results: Word- and sentence-based training led to improvements in the ability to identify words in sentences that were significantly larger than the improvements produced by phoneme-based training. There were no significant differences between the effectiveness of word- and sentence-based training. No significant improvements in consonant or vowel discrimination were found for the sentence- or phoneme-based training groups, but some improvements were found for the word-based training group. Conclusions: The word- and sentence-based training strategies were more effective than the phoneme-based strategy at improving the perception of spectrally-distorted speech.
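The noise-excited vocoder used to simulate cochlear-implant hearing can be sketched as below. This is an illustrative channel vocoder, not the study's actual processing chain: the channel count, filter order, band edges, and test signal are assumptions.

```python
# Minimal sketch of a noise-excited (channel) vocoder: split the signal into
# bands, keep each band's amplitude envelope, and use it to modulate
# band-limited noise, discarding the spectral fine structure.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=8, lo=100.0, hi=5000.0):
    rng = np.random.default_rng(0)
    edges = np.geomspace(lo, hi, n_channels + 1)  # log-spaced band edges (Hz)
    noise = rng.standard_normal(len(signal))
    out = np.zeros_like(signal)
    for f1, f2 in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f1, f2], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        env = np.abs(hilbert(band))         # amplitude envelope of this band
        carrier = sosfiltfilt(sos, noise)   # noise limited to the same band
        out += env * carrier
    return out

# Example: vocode one second of a synthetic vowel-like tone complex.
fs = 16000
t = np.arange(fs) / fs
speechlike = sum(np.sin(2 * np.pi * f * t) for f in (220, 440, 880))
vocoded = noise_vocode(speechlike, fs)
```

Fewer channels degrade intelligibility further; studies of this kind typically vary `n_channels` to control how much spectral detail survives.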
Engaging the articulators enhances perception of concordant visible speech movements
PURPOSE
This study aimed to test whether (and how) somatosensory feedback signals from the vocal tract affect concurrent unimodal visual speech perception.
METHOD
Participants discriminated pairs of silent visual utterances of vowels under 3 experimental conditions: (a) normal (baseline) and while holding either (b) a bite block or (c) a lip tube in their mouths. To test the specificity of somatosensory-visual interactions during perception, we assessed discrimination of vowel contrasts optically distinguished based on their mandibular (English /ɛ/-/æ/) or labial (English /u/-French /u/) postures. In addition, we assessed perception of each contrast using dynamically articulating videos and static (single-frame) images of each gesture (at vowel midpoint).
RESULTS
Engaging the jaw selectively facilitated perception of the dynamic gestures optically distinct in terms of jaw height, whereas engaging the lips selectively facilitated perception of the dynamic gestures optically distinct in terms of their degree of lip compression and protrusion. Thus, participants perceived visible speech movements in relation to the configuration and shape of their own vocal tract (and possibly their ability to produce covert vowel production-like movements). In contrast, engaging the articulators had no effect when the speaking faces did not move, suggesting that the somatosensory inputs affected perception of time-varying kinematic information rather than changes in target (movement end point) mouth shapes.
CONCLUSIONS
These findings suggest that orofacial somatosensory inputs associated with speech production prime premotor and somatosensory brain regions involved in the sensorimotor control of speech, thereby facilitating perception of concordant visible speech movements.
SUPPLEMENTAL MATERIAL
https://doi.org/10.23641/asha.9911846
R01 DC002852 - NIDCD NIH HHS
Accepted manuscript
Learning unfamiliar words and perceiving non-native vowels in a second language: Insights from eye tracking
One of the challenges in second-language learning is learning unfamiliar word forms, especially when this involves novel phoneme contrasts. The present study examines how real-time processing of newly-learned words and phonemes in a second language is impacted by the structure of learning (discrimination training) and whether asking participants to complete the same task after a 16–21 h delay favours subsequent word recognition. Specifically, using a visual world eye tracking paradigm, we assessed how English listeners processed newly-learned words containing non-native French front-rounded [y] compared to native-sounding vowels, both immediately after training and the following day. Some learners were forced to discriminate between vowels that are perceptually similar for English listeners, [y]-[u], while others were not. We found significantly better word-level processing on a variety of indices after an overnight delay. We also found that training [y] words paired with [u] words (vs. [y]-Control pairs) led to a greater decrease in reaction times during the word recognition task over the two testing sessions. Discrimination training using perceptually similar sounds had facilitative effects on second language word learning with novel phonemic information, and real-time processing measures such as eye tracking provided valuable insights into how individuals learn words and phonemes in a second language.
Can children with speech difficulties process an unfamiliar accent?
This study explores the hypothesis that children identified as having phonological processing problems may have particular difficulty in processing a different accent. Children with speech difficulties (n = 18) were compared with matched controls on four measures of auditory processing. First, an accent auditory lexical decision task was administered. In one condition, the children made lexical decisions about stimuli presented in their own accent (London). In the second condition, the stimuli were spoken in an unfamiliar accent (Glaswegian). The results showed that the children with speech difficulties had a specific deficit on the unfamiliar accent. Performance on the other auditory discrimination tasks revealed additional deficits at lower levels of input processing. The wider clinical implications of the findings are considered.
Thai lexical tone perception in native speakers of Thai, English and Mandarin Chinese: An event-related potentials training study
Background: Tone languages such as Thai and Mandarin Chinese use differences in fundamental frequency (F0, pitch) to distinguish lexical meaning. Previous behavioral studies have shown that native speakers of a non-tone language have difficulty discriminating among tone contrasts and are sensitive to different F0 dimensions than speakers of a tone language. The aim of the present ERP study was to investigate the effect of language background and training on the non-attentive processing of lexical tones. EEG was recorded from 12 adult native speakers of Mandarin Chinese, 12 native speakers of American English, and 11 Thai speakers while they were watching a movie and were presented with multiple tokens of low-falling, mid-level and high-rising Thai lexical tones. High-rising or low-falling tokens were presented as deviants among mid-level standard tokens, and vice versa. EEG data and data from a behavioral discrimination task were collected before and after a two-day perceptual categorization training task.
Results: Behavioral discrimination improved after training in both the Chinese and the English groups. Low-falling tone deviants versus standards elicited a mismatch negativity (MMN) in all language groups. Before, but not after, training, the English speakers showed a larger MMN compared to the Chinese, even though the English speakers performed worst in the behavioral tasks. The MMN was followed by a late negativity, which became smaller with improved discrimination. The high-rising deviants versus standards elicited a late negativity, which was left-lateralized only in the English and Chinese groups.
Conclusion: Results showed that native speakers of English, Chinese and Thai recruited largely similar mechanisms when non-attentively processing Thai lexical tones. However, native Thai speakers differed from the Chinese and English speakers with respect to the processing of late F0 contour differences (high-rising versus mid-level tones). In addition, native speakers of a non-tone language (English) were initially more sensitive to F0 onset differences (low-falling versus mid-level contrast), which was suppressed as a result of training. This result converges with results from previous behavioral studies and supports the view that attentive as well as non-attentive processing of F0 contrasts is affected by language background, but is malleable even in adult learners.
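The mismatch negativity reported above is, computationally, a deviant-minus-standard difference wave. The sketch below illustrates that computation on synthetic single-trial epochs; trial counts, sampling assumptions, and the injected deflection are all illustrative, not the study's data.

```python
# Minimal sketch of an MMN difference wave: average the EEG epochs for
# deviant and standard tokens separately, then subtract the averages.
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_samples = 200, 300          # e.g. 300 samples ~ 600 ms at 500 Hz

# Simulated single-trial epochs: deviants carry an extra negativity
# over a hypothetical 150-250 sample window.
standard = rng.standard_normal((n_trials, n_samples))
deviant = rng.standard_normal((n_trials, n_samples))
deviant[:, 150:250] -= 1.0

# ERP(deviant) - ERP(standard): trial averaging suppresses the noise,
# leaving the negative deflection that indexes pre-attentive change detection.
mmn = deviant.mean(axis=0) - standard.mean(axis=0)
peak_idx = int(np.argmin(mmn))          # the MMN is a negative deflection
```

In practice the epochs are baseline-corrected and artifact-rejected first, and the MMN amplitude is quantified over a predefined latency window rather than at a single peak sample.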
Speaker-normalized sound representations in the human auditory cortex
The acoustic dimensions that distinguish speech sounds (like the vowel differences in “boot” and “boat”) also differentiate speakers’ voices. Therefore, listeners must normalize across speakers without losing linguistic information. Past behavioral work suggests an important role for auditory contrast enhancement in normalization: preceding context affects listeners’ perception of subsequent speech sounds. Here, using intracranial electrocorticography in humans, we investigate whether and how such context effects arise in auditory cortex. Participants identified speech sounds that were preceded by phrases from two different speakers whose voices differed along the same acoustic dimension as target words (the lowest resonance of the vocal tract). In every participant, target vowels evoke a speaker-dependent neural response that is consistent with the listener’s perception, and which follows from a contrast enhancement model. Auditory cortex processing thus displays a critical feature of normalization, allowing listeners to extract meaningful content from the voices of diverse speakers.
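The contrast-enhancement idea can be made concrete with a toy calculation. This is a schematic of the general principle, not the study's fitted model: the frequencies, the context values, and the gain `k` are all invented for illustration.

```python
# Minimal sketch of contrast enhancement: the effective value of a target
# sound is pushed away from the mean of the preceding acoustic context, so
# the same target reads as "high" after a low-voiced speaker and vice versa.
import numpy as np

def contrast_enhance(target_f, context_f, k=0.5):
    """Shift the target frequency away from the preceding context mean."""
    return target_f + k * (target_f - np.mean(context_f))

# The same ambiguous 500 Hz target sounds relatively high after a
# low-resonance context...
after_low = contrast_enhance(500.0, [350.0, 400.0])
# ...and relatively low after a high-resonance context.
after_high = contrast_enhance(500.0, [600.0, 650.0])
```

The sign of the shift is what matters: identical acoustics yield different effective values depending on the speaker heard just before, which is the behavioral signature the abstract describes.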
Structural Stability of Lexical Semantic Spaces: Nouns in Chinese and French
Many studies in the neurosciences have dealt with the semantic processing of
words or categories, but few have looked into the semantic organization of the
lexicon thought as a system. The present study was designed to try to move
towards this goal, using both electrophysiological and corpus-based data, and
to compare two languages from different families: French and Mandarin Chinese.
We conducted an EEG-based semantic-decision experiment using 240 words from
eight categories (clothing, parts of a house, tools, vehicles,
fruits/vegetables, animals, body parts, and people) as the material. A
data-analysis method (correspondence analysis) commonly used in computational
linguistics was applied to the electrophysiological signals.
The present cross-language comparison indicated stability for the following
aspects of the languages' lexical semantic organizations: (1) the
living/nonliving distinction, which showed up as a main factor for both
languages; (2) greater dispersion of the living categories as compared to the
nonliving ones; (3) prototypicality of the "animals" category within the
living categories, and with respect to the living/nonliving distinction; and
(4) the existence of a person-centered reference gradient. Our
electrophysiological analysis indicated stability of the networks at play in
each of these processes. Stability was also observed in the data taken from
word usage in the languages (synonyms and associated words obtained from
textual corpora).
Comment: 17 pages, 4 figures
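The correspondence analysis mentioned above can be sketched in a few lines via an SVD of standardized residuals. The word-by-feature count matrix here is synthetic (real inputs would be derived from the EEG or corpus data), and reporting only row coordinates is a simplification.

```python
# Minimal sketch of correspondence analysis: decompose the departures of a
# nonnegative matrix from its row/column-independence model, yielding
# low-dimensional coordinates in which similar row profiles sit close together.
import numpy as np

def correspondence_analysis(N, n_dims=2):
    P = N / N.sum()
    r = P.sum(axis=1)                     # row masses
    c = P.sum(axis=0)                     # column masses
    # Standardized residuals from the independence model r c^T.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal row coordinates: scale left singular vectors by the
    # singular values and the inverse square-root row masses.
    rows = (U * sv) / np.sqrt(r)[:, None]
    return rows[:, :n_dims]

# Toy matrix: 4 items x 3 features, with two clearly distinct row profiles.
N = np.array([[30., 5., 5.], [28., 6., 4.], [5., 30., 7.], [4., 29., 8.]])
coords = correspondence_analysis(N)
```

On this toy input the first axis separates the two row groups, mirroring how the study's first factor separated living from nonliving categories.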