6 research outputs found

    Evaluation of a Runyankore grammar engine for healthcare messages

    Natural Language Generation (NLG) can be used to generate personalized health information, which is especially useful when provided in one's own language. However, templates, the NLG technique widely used across domains and languages, were shown to be inapplicable to Bantu languages because of their characteristic agglutinative structure. We present here our use of the grammar engine NLG technique to generate text in Runyankore, a Bantu language indigenous to Uganda. Our grammar engine adds to previous work in this field with new rules for cardinality constraints, prepositions in roles, the passive, and phonological conditioning. We evaluated the generated text with linguists and non-linguists, who regarded most text as grammatically correct and understandable; over 60% of them regarded all the text generated by our system to have been authored by a human being.

    Ontology verbalization in agglutinating Bantu languages: a study of Runyankore and its generalizability

    Natural Language Generation (NLG) systems have been developed to generate text in multiple domains, including personalized patient information. However, their application is limited in Africa because they generate text in English, yet indigenous languages are still predominantly spoken throughout the continent, especially in rural areas. The existing healthcare NLG systems cannot be reused for Bantu languages due to their complex grammatical structure, nor can the generated text be used in machine translation systems for Bantu languages because these languages are computationally under-resourced. This research aimed to verbalize ontologies in agglutinating Bantu languages. We had four research objectives: (1) noun pluralization and verb conjugation in Runyankore; (2) Runyankore verbalization patterns for the selected description logic constructors; (3) combining the pluralization, conjugation, and verbalization components to form a Runyankore grammar engine; and (4) generalizing the Runyankore and isiZulu approaches to ontology verbalization to other agglutinating Bantu languages. We used an approach that combines morphology with syntax and semantics to develop a noun pluralizer for Runyankore, and used Context-Free Grammars (CFGs) for verb conjugation. We developed verbalization algorithms for eight constructors in a description logic. We then combined these components into a grammar engine developed as a Protégé 5.x plugin. The investigation into generalizability used the bootstrap approach, and investigated bootstrapping for languages in the same language zone (intra-zone bootstrappability) and languages across language zones (inter-zone bootstrappability). We obtained verbalization patterns for Luganda and isiXhosa, in the same zones as Runyankore and isiZulu respectively, and chiShona, Kikuyu, and Kinyarwanda from different zones, and used the bootstrap metric that we developed to identify the most efficient source-target bootstrap pair.
    By regrouping Meinhof’s noun class system we were able to eliminate non-determinism during computation, and this led to the development of a generic noun pluralizer. We also showed that CFGs can conjugate verbs in the five additional languages. Finally, we proposed the architecture for an API that could be used to generate text in agglutinating Bantu languages. Our research provides a method for surface realization for an under-resourced and grammatically complex family of languages, Bantu languages. We leave the development of a complete NLG system based on the Runyankore grammar engine and of the API as areas for future work.

    Natural Language Generation Requirements for Social Robots in Sub-Saharan Africa

    Robots are deployed in Africa mainly in manufacturing, yet they may assist in society as future-oriented technologies as well. They may ameliorate, e.g., service delivery issues and skills shortages. In this discussion paper, several uses and use cases relevant to Sub-Saharan Africa are described and requirements identified. We zoom in on human-robot interaction in Niger-Congo B (‘Bantu’) languages. Use cases for healthcare and education elucidate specific requirements for the natural language generation component of robots in society. In contrast to typical generation systems, this component demands i) combining data-to-text and knowledge-to-text in one system, ii) generating different types of sentences so as to switch between written and spoken language, and iii) processing non-trivial numbers.

    Surface realization architecture for low-resourced African languages

    There has been growing interest in building surface realization systems to support the automatic generation of text in African languages. Such tools focus on converting abstract representations of meaning to text. Since African languages are low-resourced, economical use of resources and general maintainability are key considerations. However, there is no existing surface realizer architecture that possesses most of the maintainability characteristics (e.g., modularity, reusability, and analyzability) that will lead to maintainable software that can be used for the languages. Moreover, there is no consensus surface realization architecture created for other languages that can be adapted for the languages in question. In this work, we address this by creating a novel surface realizer architecture suitable for low-resourced African languages that abides by the features of maintainable software. Its design comes after a granular analysis, classification, and comparison of the architectures used by 77 existing NLG systems. We compare our architecture to existing architectures and show that it supports the most features of a maintainable software product.
    Funding: Hasso Plattner Institute for Digital Engineering through the HPI Research School at UCT and the National Research Foundation (NRF) of South Africa.
    https://dl.acm.org/journal/tallip

    6th Critical Approaches to Discourse Analysis across Disciplines Conference (CADAAD 2016) - Book of Abstracts

    Book of Abstracts of the 6th Critical Approaches to Discourse Analysis across Disciplines Conference (CADAAD 2016).

    Evaluation of a Runyankore grammar engine for healthcare messages
