    A critical analysis of multilingual dictionaries

    This article evaluates the lexicographic value of multilingual dictionaries, taking dictionaries that cover three or more languages spoken in South Africa as a case in point. It reflects on their merits and shortcomings as reference works and learning tools, but the focus is on presumed shortcomings in the macro- and microstructures of such dictionaries, with special attention to the lemmatisation of common words, the quality of the dictionary articles and consistency in presentation.
    Supported by a grant from the German Ministry for Education and Research and in part by the National Research Foundation of South Africa (grant-specific unique reference number (UID) 8576). http://lexikos.journals.ac.za

    A writing assistant en route to a full computational grammar for Sepedi

    A detailed user study and observations by lecturers indicate that correctly compiling sentences in the eight verbal moods, as well as a number of similarly complicated grammatical constructions in Sepedi, poses a challenge in any text production situation. Feedback from target users indicated a need for a computational writing assistant for the compilation and verification of correct constructions. To address this need, an extended computational sentence builder for verbal moods, adjectives and possessive constructions in Sepedi was designed and built to assist in text production and to serve as a model for other African languages. This article introduces a prototype of this sentence builder. The emphasis throughout is on the grammatical complexity of Sepedi and how the Sepedi Helper can assist users to produce correct sentences. In contrast to typical traditional grammars of Sepedi, the tool also provides the required cognitive information through basic clickable help screens. The Sepedi Helper is a dynamic, lightweight tool aimed at combining user knowledge with a text production tool, i.e. user-involved, step-by-step production of Sepedi phrases. The emphasis in the design is on simplicity.
    https://www.tandfonline.com/loi/rjal20

    Beryl T. (Sue) Atkins leksikograaf van A tot Z

    No abstract available.
    http://lexikos.journals.ac.za

    Korpusgebaseerde leksikografie vir hulpbronbeperkte tale - die maksimalisering van die beperkte korpus

    This article focuses on lesser-resourced languages for which only very limited corpora are available, and on how such relatively small and often unbalanced, raw corpora could be maximally utilized for lexicographic purposes to obtain results similar to those from bigger corpora. Sepedi and Afrikaans will be studied in this regard. The aim is to determine to what extent enlarging a corpus from, for example, one million to 10 million words, and from 10 million to 100 million words, enhances its potential for (a) macrostructure compilation, (b) sourcing information on the most important microstructural aspects and (c) the creation of lexicographic tools. It will be argued that valuable and even sufficient data for the compilation of a specific dictionary can be extracted from a relatively small corpus of approximately one million words, but that "bigger" in some instances indeed means "better".
    Supported by a grant from the German Ministry for Education and Research and in part by the National Research Foundation of South Africa (grant-specific unique reference number (UID) 85763). http://lexikos.journals.ac.za http://www.wat.co.za/index.php/en/publications/lexikos

    A critical examination of dictionaries with amalgamated lemmalists

    The recently published Groot Woordeboek (Afrikaans en Nederlands), also known as ANNA, is the first dictionary with an amalgamated lemmalist based on the model of Martin and Gouws (2000). ANNA also paves the way for a similar approach for other closely related languages such as the Sotho languages and the Nguni languages of South Africa. The advantages and disadvantages of five aspects, namely (a) comparison and contrast, (b) user-friendliness, (c) space saving, (d) ordering of senses and (e) provision of a separate grammatical compendium, are critically evaluated. The principles of amalgamation, the relevant lemma types and certain characteristics of the model are briefly discussed beforehand.
    This work is based on research supported in part by the National Research Foundation of South Africa (grant-specific unique reference number (UID) 85763). http://lexikos.journals.ac.za

    Kritiese evaluering van die paradigmabenadering tot Sepedi-lemmatisering - die Groot Noord-Sotho Woordeboek as voorbeeld

    This article gives a critical evaluation of the paradigm approach of the Groot Noord-Sotho Woordeboek to the lemmatisation of verbs and nouns derived from verbs. The verb stem -roba 'break', with its complicated system of derivations, will be taken as a case in point. The paradigm presented for -roba will be evaluated in terms of structure, occurrence in Sepedi corpora and dictionaries, actual use by mother-tongue speakers, user-friendliness, contextualisation versus decontextualisation in relation to the cross-referencing system, and space utilisation. Bringing together and lexicographically treating all these forms for a single verb is surely a lexicographic achievement. The question, however, is to what extent such an approach is useful in respect of forms likely to be looked up by dictionary users, whether all of these forms actually exist, how user-friendly the approach and presentation are, whether the comment on semantics is sufficient and consistent, and whether such a lumping approach actually saves space in contrast to entering derivations as main lemmas in a splitting approach.
    This article was the basis of a shortened version presented as a paper at the Nineteenth Annual International Conference of the African Association for Lexicography (AFRILEX), which was hosted by the Research Unit for Language and Literature in the SA Context, North-West University, Potchefstroom Campus, Potchefstroom, South Africa, 1–3 July 2014. Supported by (a) a grant from the German Ministry for Education and Research, administered by the DAAD, and (b) in part by the National Research Foundation of South Africa (grant-specific unique reference number (UID) 85763). http://lexikos.journals.ac.za

    The lexicographical treatment of kinship terminology in Sepedi

    Kinship terminology in Sepedi is extensive and forms a complex system. In contrast to languages such as Afrikaans, English and German, the lexicographer faces greater challenges in respect of the identification of kinship terms and the lemmatisation and treatment thereof in Sepedi dictionaries. Preparatory studies of the kinship terminology system are a prerequisite for successful user guidance. The Sepedi lexicographer is the mediator, especially between the inexperienced dictionary user and this complicated kinship terminology system, and therefore has to provide for effective lemmatisation and sufficient treatment of kinship terms. The nature and extent of kinship terminology in Sepedi are analysed and it is indicated that kinship terminology in Sepedi is problematic, especially in respect of the lemmatisation of compounds and, in particular, derived single-word forms and phrases such as multiple possessive constructions. In order to ease access to kinship terms in dictionaries, a lexicographic convention for the lemmatisation of kinship terms is suggested. Corpus occurrences of kinship terms are indicated and space is allocated for a critical evaluation of the lemmatisation and treatment of kinship terms in Sepedi dictionaries.
    http://lexikos.journals.ac.za/

    Lexicographic treatment of salient features and challenges in the creation of paper and electronic dictionaries

    This paper focuses on the need for lexicographers to study and to treat the salient features of languages satisfactorily and the challenges faced by lexicographers. The focus is on the challenges facing compilers of African language dictionaries and the lack of dictionaries for these languages. It will be argued that lexicographers are expected to fulfil the role of mediators between complicated grammatical structures, on the one hand, and the target users’ needs and expectations, on the other. Dictionaries are expected to be inclusive, e.g., providing for and fulfilling user expectations by giving all the required information in the dictionary in order to reduce the need for consultation of external sources. Expectations for future compilation of paper and electronic dictionaries are discussed. It is expected that paper dictionaries will be used in Africa for many years to come but that paper and electronic dictionaries of high lexicographic quality should be compiled simultaneously. The discussion is presented against the background of the transition of African lexicography from Euro-centred dictionary compilation to Afro-centric compilation. African language dictionaries are continuously compiled in Africa, by Africans for Africans.
    https://euralex.org/publications

    Leksikografiese hantering van verwantskapsterme in 'n Engels/Sepedi-Setswana-Sesotho-woordeboek met 'n geamalgameerde lemmalys

    This article describes the lemmatisation and treatment of kinship terms in a proposed English-Sotho, Sotho-English dictionary with an amalgamated lemmalist. The first requirement is to build a list of kinship terminology for the Sotho languages. Secondly, given space restrictions, it is necessary to determine the most frequently used forms to be lemmatised in such a dictionary. Thirdly, the macrostructure and microstructure of the dictionary should be planned in terms of an amalgamated approach. A short explanation of the amalgamated model will be presented and a schematic illustration of the paternal family tree structure in the Sotho languages is given in the appendix. Specific attention is given to the compilation of the amalgamated lemmalist, focusing on absolute cognates and absolute cognates with a difference in form. Finally, where the reduction of huge quantities of terms, e.g. all derived forms of a specific term in all three Sotho languages, is at stake, a lexicographic convention will be suggested to sensibly reduce the number of lemmas and to combat redundancy.
    Supported by (a) a grant from the German Ministry for Education and Research, administered by the DAAD, and (b) in part by the National Research Foundation of South Africa (grant-specific unique reference number (UID) 85763). http://lexikos.journals.ac.za

    Bilingual dictionary for a specific user group: supporting Setswana speakers in the production and reception of English

    The aim of this article is to discuss the design of a new English to Setswana dictionary for two narrowly defined target user groups of Setswana-speaking learners, namely Upper Primary (10 to 12 years old) and Junior Secondary (13 to 15 years old). The dictionary is intended to be a guide to text and speech production in the foreign language L2 (English) and the reception of English text and speech in L1 (Setswana). A general consideration concerns the relationship between the treatment of English and that of Setswana. As English is to be treated as a priority, many more data types will be made available for English than for Setswana. Furthermore, we will assess the possibility of producing a bilingual learners’ dictionary with rather imbalanced parts: for production in (and translation into) English, a detailed description of individual items is needed, whereas for reception (and translation) from English, a large list of treatment units is necessary, albeit with less elaborate descriptive detail. The design also aims at strong guidance through the mother tongue, Setswana. Two possible scenarios are considered, namely separate dictionaries for the two target groups or a single dictionary to serve the lexicographic needs of both target groups. The socio-economic circumstances of most of the learners are such that buying more than one dictionary is not a realistic option, and they find themselves in a pre-dictionary culture with the resultant lack of dictionary-using skills. The dictionary or dictionaries will be paper dictionaries. The focus is on bilingualized (BLD) and extended bilingual (EBL) dictionaries as basic design options. We envisage an imbalanced design where guidance in terms of both production and reception is focused on English. Our design aims at the maximum utilization of the physical space in a paper dictionary with coverage of at least 90% of both English and Setswana.
    http://www.alasa.org.za/