A spoken corpus of Cameroon Pidgin English: pilot study
This resource is a 240,000-word corpus of spoken Cameroon Pidgin English (CPE), a widely-used yet stigmatised and largely uncodified pidgin/creole variety.
The corpus consists of transcriptions of private and public dialogues and monologues, with mark-up and POS-tagging, together with accompanying sound files. The recordings were conducted in five different locations in Cameroon (Bamenda, Buea, Douala, Kumba and Yaounde), allowing some insights into regional variation. Text categories and the proportions of monologue and dialogue are guided by those of the International Corpus of English (ICE) project, which makes the corpus immediately comparable with existing corpora of post-colonial varieties of English.
• Spelling: since there is no standardised orthography for CPE, the orthography adopted for this project is based on that developed by Ayafor (2014), which was kept under review during the course of the project.
• Annotation was added to the transcriptions based on ICE guidelines for the annotation of spoken texts: standard mark-up symbols were used to denote text unit, speaker identification, overlapping speech, unclear words, uncertain transcriptions, anthropophonics, editorial comments, foreign words and indigenous language words.
• Tagging: a tagset for CPE was devised based on CLAWS 5. Initially tagging was conducted manually, and then by means of TreeTagger. A third of the corpus has been post-checked, with accuracy rates at 94%.
The corpus is aimed at providing a resource for linguistic description and comparison. It allows linguists to identify and describe recurring grammatical patterns, as well as the phonology of the language (given the availability of sound files deposited with the text files). It also allows comparison of CPE with other pidgin/creole languages, other Cameroonian and West African languages, and other varieties of post-colonial English. Furthermore, the corpus provides an exceptional resource for the study of general/theoretical linguistics, creolistics, typology, language contact and change, sociolinguistics and discourse analysis.
The corpus contains 80 sound recordings of monologues (scripted and unscripted) and dialogues (public and private). Each sound file (in .wav format) is 10-15 minutes in length. These recordings have been transcribed (each approximately 3,000 words in length) and annotated. Transcriptions are submitted in two formats: (a) plain transcription (with basic markup indicating speaker turns, overlaps, etc.), and (b) a POS-tagged version, which adds POS-tags to the plain version of the transcription.
The language of the monologues is Cameroon Pidgin English, with codeswitching into English, French, and indigenous Cameroonian languages.
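As a quick consistency check on the figures quoted in this description, the sketch below multiplies the number of recordings by the approximate transcript length and compares the result with the stated corpus size. It also estimates the total audio duration from the stated 10-15 minute range. The constant and function names are illustrative only and are not part of the corpus documentation. Because the per-transcript length is given as approximate, the result is an estimate rather than an exact word count.

```python
# Figures quoted in the corpus description; all names here are illustrative.
NUM_RECORDINGS = 80
WORDS_PER_TRANSCRIPT = 3_000   # "approximately 3,000 words" per transcription
MINUTES_PER_FILE = (10, 15)    # each .wav file is 10-15 minutes long
STATED_CORPUS_WORDS = 240_000  # size given at the start of the description


def estimated_words(recordings=NUM_RECORDINGS, words_each=WORDS_PER_TRANSCRIPT):
    """Approximate corpus size implied by the per-transcript figure."""
    return recordings * words_each


def audio_hours_range(recordings=NUM_RECORDINGS, minutes=MINUTES_PER_FILE):
    """Lower and upper bounds on total audio duration, in hours."""
    low, high = minutes
    return (recordings * low / 60, recordings * high / 60)


if __name__ == "__main__":
    print("Estimated words:", estimated_words())
    print("Matches stated size:", estimated_words() == STATED_CORPUS_WORDS)
    low, high = audio_hours_range()
    print(f"Total audio: roughly {low:.1f} to {high:.1f} hours")
```

Under these assumptions the estimate agrees with the stated 240,000 words, and the sound files amount to somewhere between about 13 and 20 hours of audio.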
Teenage literacy problems in Northern Ireland: causes and possible remedies
EThOS - Electronic Theses Online Service, GB, United Kingdom
Cameroon Pidgin English: a comprehensive grammar
Cameroon Pidgin English (CPE) is an English-lexified Atlantic expanded pidgin/creole spoken in some form by an estimated 50% of Cameroon’s population, primarily in the anglophone west regions, but also in urban centres throughout the country. Primarily a spoken language, CPE enjoys a vigorous oral presence in Cameroon, and the linguistic examples illustrating this description are drawn from a spoken corpus consisting of a range of text types, including oral narratives, radio broadcasts and spontaneous conversation. The authors’ typologically-framed investigation of the features of the language, from its phonetics, phonology and lexicon to its syntax and discourse structure, allows the reader a clear view of the linguistic character of CPE, offering a comprehensive description of the language that will be of interest to creolists as well as linguists interested in African languages, contact linguistics and comparative linguistics.