Factors in the persistence or decline of ethnic group mobilisation: a conceptual review and case study of cultural group responses among Afrikaners in post-apartheid South Africa
The candidate has two major linked interests. One is to reconcile competing explanations of ethnicity, and the other is to explore the factors underlying ethnicity in the light of a case study of the rise and decline of ethnic mobilisation among white Afrikaners in South Africa. For many observers the recent apparent "decomposition" of Afrikaner nationalist mobilisation has been surprising, and the factors associated with this trend were expected to contain insights relevant to the theoretical debate. The first part of the thesis is a review of key aspects of literature which offers alternative explanations of ethnic attachments and mobilisation. It commences with a theme-setting example of a reconciliation of alternative viewpoints. At the end of the literature review a series of propositions is offered, suggesting the utility of an integration of alternative perspectives. The case study of Afrikaner ethnic mobilisation commences with a historical overview of the emergence of Afrikaner ethnic nationalism, from the early colonial settlement up to the present. Thereafter a wide range of empirical, survey-based evidence is presented, including exploratory factor analyses, covering patterns in the cultural, racial, socio-economic and political attitudes of Afrikaners, comparing their responses with those of other South Africans. An account of recent political change and the responses of Afrikaners to the events is given. In the final chapter conclusions drawn from the evidence are presented as further propositions in a broader theoretical context. The fragmentation of Afrikaner ethnic nationalism is found to be associated with the bureaucratization of ethnicity during the period of apartheid rule, ambivalence on group boundaries, the usurpation of cultural identity by race, and a breakdown of internal coordination processes which ethnic mobilisation appears to require. 
At the same time a core of ethnic commitment, substantially independent of its material and political utility, is found to persist, surrounded by a wider compound of racial, cultural and political consciousness. Alternative scenarios of probable future developments are tentatively offered. The analysis appears to support the initial argument that ethnic mobilisation involves full combinations of the processes which competing theories usually pit against one another. The process of ethnic mobilisation involves a variable incorporation of elements of class, group status and honour, and political activation, in which identity commitment, co-ordinating agencies and ethnic boundary-construction interact as defining and integrating elements.
CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models
While many languages possess processes of joining two or more words to create
compound words, previous studies have been typically limited only to languages
with excessively productive compound formation (e.g., German, Dutch) and there
is no public dataset containing compound and non-compound words across a large
number of languages. In this work, we systematically study decompounding, the
task of splitting compound words into their constituents, at a wide scale. We
first address the data gap by introducing a dataset of 255k compound and
non-compound words across 56 diverse languages obtained from Wiktionary. We
then use this dataset to evaluate an array of Large Language Models (LLMs) on
the decompounding task. We find that LLMs perform poorly, especially on words
which are tokenized unfavorably by subword tokenization. We thus introduce a
novel methodology to train dedicated models for decompounding. The proposed
two-stage procedure relies on a fully self-supervised objective in the first
stage, while the second, supervised learning stage optionally fine-tunes the
model on the annotated Wiktionary data. Our self-supervised models outperform
the prior best unsupervised decompounding models by 13.9% accuracy on average.
Our fine-tuned models outperform all prior (language-specific) decompounding
tools. Furthermore, we use our models to leverage decompounding during the
creation of a subword tokenizer, which we refer to as CompoundPiece.
CompoundPiece tokenizes compound words more favorably on average, leading to
improved performance on decompounding over an otherwise equivalent model using
SentencePiece tokenization. Comment: EMNLP 202
Compounding in Namagowab and English: (exploring meaning creation in compounds)
This essay investigates compounding in Namagowab and English, which belong to two widely divergent groups of languages, the Khoesan and Indo-European, respectively. The first motive is to investigate how and why new words are created from existing ones. The reading and data interpretation seek an understanding of word formation and an overview of semantic compositionality, structure and productivity, within the broad context of cognitive, lexicalist and distributed morphology paradigms. This, coupled with historical reading about the languages and their speakers, is used to speculate about why compounds feature in lexical creation. Compounding is prevalent in both languages, and their distance in terms of phylogenetic relationships should allow limited generalizing about these processes of formation. Word lists taken from dictionaries in both languages were analyzed by entering the words in Excel spreadsheets so that various attributes of these words, such as word type, compound class (Noun, Verb, Preposition, Adjective and Adverb) and constituent class, could be counted and described with formulae, and compound and constituent meaning analyzed. The conclusion was that socio-historical factors such as language contact, and aspects of cognition such as memory and transparency, account for compounding in a language in addition to typology.
Multilingualism and the structure of code-mixing
Non peer reviewed
Recognition, Regulation, Revitalisation
Recognition, Regulation, Revitalisation: Place Names and Indigenous Languages is a selection of double-blind peer-reviewed papers from the 5th International Symposium on Place Names that took place 18-20 September 2020 in Clarens, South Africa. The symposium celebrated 2019 as the International Year of Indigenous Languages as declared by the United Nations.