Getting the measure of derivational morphology in adult speech: a corpus analysis using MorphoQuantics

Abstract

This paper describes the methodology used to compile a corpus called MorphoQuantics that contains a comprehensive set of 17,943 complex word types extracted from the spoken component of the British National Corpus (BNC). The categorisation of these complex words was derived primarily from the classification of Prefixes, Suffixes and Combining Forms proposed by Stein (2007). The MorphoQuantics corpus has been made available on a website of the same name; it lists 554 word-initial and 281 word-final morphemes in English, their etymology and meaning, and records the type and token frequencies of all the associated complex words containing these morphemes from the spoken element of the BNC, together with their Part of Speech. The results show that, although the number of word-initial affixes is nearly double that of word-final affixes, the relative number of each observed in the BNC is very similar; however, word-final affixes are more productive in that, on average, the frequency with which they attach to different bases is three times that of word-initial affixes. Finally, this paper considers how linguists, psycholinguists and psychologists may use MorphoQuantics to support their empirical work in first and second language acquisition, and in clinical and educational research.
