Learning the surface structure of wh-questions in English and French with a non-parametric Bayesian model

Abstract

The overt structure of wh-questions varies across and within languages. How does a child learn the number of wh-question types that are present in her language and the surface properties of each type? We propose a non-parametric Bayesian model of this aspect of language acquisition, focusing on discrete morphosyntactic properties of questions such as displacement and continuous prosodic properties such as wh-word duration, and apply it to data based on child-directed speech in English and French. The model successfully infers that English has fewer wh-question types than French, identifies the properties of the main question types in each language, and achieves reasonable classification accuracy on naturalistic test utterances. Non-parametric Bayesian inference is a promising method for addressing cross-linguistic and language-internal syntactic variation.
