9,640 research outputs found
English-learning infants' perception of word stress patterns
Adult speakers of different free stress languages (e.g., English, Spanish) differ both in their sensitivity to lexical stress and in their processing of suprasegmental and vowel quality cues to stress. In a head-turn preference experiment with a familiarization phase, both 8-month-old and 12-month-old English-learning infants discriminated between initial stress and final stress among lists of Spanish-spoken disyllabic nonwords that were segmentally varied (e.g. [ˈnila, ˈtuli] vs [luˈta, puˈki]). This is evidence that English-learning infants are sensitive to lexical stress patterns, instantiated primarily by suprasegmental cues, during the second half of the first year of life.
Spoken content retrieval: A survey of techniques and technologies
Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.
Lexical stress constrains English-learning infants' segmentation in a non-native language.
Infants' ability to segment words in fluent speech is affected by their language experience. In this study we investigated the conditions under which infants can segment words in a non-native language. Using the Head-turn Preference Procedure, we found that monolingual English-learning 8-month-olds can segment bisyllabic words in Spanish (trochees and iambs) but not French (iambs). Our results are incompatible with accounts that rely on distributional learning, language rhythm similarity, or target word prosodic shape alone. Instead, we show that monolingual English-learning infants are able to segment words in a non-native language as long as words have stress, as is the case in English. More specifically, we show that even in a rhythmically different non-native language, English-learning infants can find words by detecting stressed syllables and treating them as word onsets or offsets.
Simulating vocal learning of spoken language: Beyond imitation
Computational approaches have an important role to play in understanding the complex process of speech acquisition, in general, and have recently been popular in studies of vocal learning in particular. In this article we suggest that two significant problems associated with imitative vocal learning of spoken language, the speaker normalisation and phonological correspondence problems, can be addressed by linguistically grounded auditory perception. In particular, we show how the articulation of consonant-vowel syllables may be learnt from auditory percepts that can represent either individual utterances by speakers with different vocal tract characteristics or ideal phonetic realisations. The result is an optimisation-based implementation of vocal exploration, incorporating semantic, auditory, and articulatory signals, that can serve as a basis for simulating vocal learning beyond imitation.
Rhythmic unit extraction and modelling for automatic language identification
This paper deals with an approach to Automatic Language Identification based on rhythmic modelling. Besides phonetics and phonotactics, rhythm is one of the most promising features to be considered for language identification, even if its extraction and modelling are not a straightforward issue. Indeed, one of the main problems to address is what to model. In this paper, a rhythm extraction algorithm is described: using a vowel detection algorithm, rhythmic units related to syllables are segmented. Several parameters are extracted (consonantal and vowel duration, cluster complexity) and modelled with a Gaussian mixture. Experiments are performed on read speech for 7 languages (English, French, German, Italian, Japanese, Mandarin and Spanish); results reach up to 86 ± 6% correct discrimination between stress-timed, mora-timed, and syllable-timed classes of languages, and 67 ± 8% correct language identification on average for the 7 languages with utterances of 21 seconds. These results are discussed and compared with those obtained with a standard acoustic Gaussian mixture modelling approach (88 ± 5% correct identification on the 7-language identification task).
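The classification scheme in the abstract above (per-language generative models over rhythmic-unit features, with identification by maximum likelihood) can be illustrated with a minimal NumPy sketch. Note the simplifications: the paper fits Gaussian mixtures, whereas this sketch uses a single diagonal-covariance Gaussian per language, and the feature vectors and language means here are synthetic placeholders, not values from the paper.

```python
import numpy as np

def fit_gaussian(X):
    """Fit one diagonal-covariance Gaussian to rhythmic-unit features
    (rows = units; columns e.g. consonant duration, vowel duration,
    cluster complexity). The paper uses a Gaussian *mixture* instead."""
    mu = X.mean(axis=0)
    var = X.var(axis=0) + 1e-6  # variance floor to avoid degenerate dims
    return mu, var

def avg_log_likelihood(X, mu, var):
    """Mean per-unit log-likelihood of the utterance's units under the model."""
    ll = -0.5 * (np.log(2 * np.pi * var) + (X - mu) ** 2 / var)
    return ll.sum(axis=1).mean()

def identify(X, models):
    """Return the language whose model scores the utterance highest."""
    return max(models, key=lambda lang: avg_log_likelihood(X, *models[lang]))

rng = np.random.default_rng(0)
# Hypothetical training features, loosely evoking stress-timed vs
# syllable-timed rhythm; real features would come from vowel detection.
english = rng.normal([0.12, 0.09, 2.0], 0.02, size=(200, 3))
spanish = rng.normal([0.08, 0.11, 1.2], 0.02, size=(200, 3))

models = {"English": fit_gaussian(english), "Spanish": fit_gaussian(spanish)}

# A test utterance drawn from the "English" regime is classified as such.
test_utt = rng.normal([0.12, 0.09, 2.0], 0.02, size=(40, 3))
print(identify(test_utt, models))
```

Averaging the per-unit log-likelihood (rather than summing) keeps scores comparable across utterances with different numbers of rhythmic units, which matters when test utterances vary in length.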
Speech Communication
Contains reports on five research projects.
C.J. LeBel Fellowship
Kurzweil Applied Intelligence
National Institutes of Health (Grant 5 T32 NS07040)
National Institutes of Health (Grant 5 R01 NS04332)
National Science Foundation (Grant 1ST 80-17599)
Systems Development Foundation
U.S. Navy - Office of Naval Research (Contract N00014-82-K-0727)
- …