This study utilizes phonotactic and pitch pattern modeling for automatic assessment of toddlers' language background from short vocalization segments. The experiments are conducted on audio recordings of twelve 25–31-month-old US-born and Shanghainese toddlers. Each recording captures a full-day soundtrack of an ordinary day in the toddler's life, spent in their natural environment. In a preliminary study, we observed that despite the limited linguistic content of early-age child vocalizations, certain phonotactic and prosodic patterns were correlated with the child's language background. In the current effort, we analyze to what extent these language-salient cues can be leveraged for automatic language background classification. In addition to traditional parallel phone recognition with statistical language modeling (PPRLM) and phone recognition with support vector machines (PRSVM), a novel scheme that utilizes pitch patterns (PPSVM) is proposed. The classification results on very short vocalizations (on average less than 3 seconds long) confirm that both phonotactic and prosodic features capture language-specific content, reaching equal error rates (EER) of 32.45% for PRSVM, 31.33% for PPSVM, and 29.97% for a fusion of the PRSVM and PPSVM systems. The competitive performance of PPSVM suggests that pitch contours carry a significant portion of the language-specific information in toddlers' vocalizations.
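For readers unfamiliar with the metric, the equal error rate is the operating point at which the rate of missed target trials equals the rate of false alarms. The following is a minimal sketch of how such a figure can be estimated from classifier scores. The function name, the example scores, and the simple threshold sweep are illustrative assumptions, not the authors' evaluation code.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a decision threshold over all observed scores.

    Hypothetical helper for illustration: a trial is accepted when its
    score is greater than or equal to the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # Miss rate: fraction of target trials rejected at this threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: fraction of non-target trials accepted.
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest,
            # and report their average as the EER estimate.
            best_gap = gap
            eer = (miss + false_alarm) / 2
    return eer


# Made-up scores, used only to exercise the function.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))
```

With only a handful of trials the two error rates may never coincide exactly, which is why the sketch averages them at the closest threshold. Evaluation toolkits typically interpolate along the error curve instead.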