
    Surface realization architecture for low-resourced African languages

    There has been growing interest in building surface realization systems to support the automatic generation of text in African languages. Such tools focus on converting abstract representations of meaning to text. Since African languages are low-resourced, economical use of resources and general maintainability are key considerations. However, no existing surface realizer architecture possesses most of the maintainability characteristics (e.g., modularity, reusability, and analyzability) needed to produce maintainable software for these languages. Moreover, there is no consensus surface realization architecture created for other languages that can be adapted for the languages in question. In this work, we address this gap by creating a novel surface realizer architecture that is suitable for low-resourced African languages and abides by the features of maintainable software. Its design follows a granular analysis, classification, and comparison of the architectures used by 77 existing NLG systems. We compare our architecture to existing architectures and show that it supports the most features of a maintainable software product. Hasso Plattner Institute for Digital Engineering through the HPI Research School at UCT and the National Research Foundation (NRF) of South Africa. https://dl.acm.org/journal/tallip Informatic

    OntoVerbal-M: A multilingual verbaliser for SNOMED CT

    OntoVerbal-M is an ontology verbaliser that transforms OWL into fluent natural language paragraphs in multiple languages. We describe the application of OntoVerbal-M to SNOMED CT, whereby SNOMED CT classes are presented as textual paragraphs in both English and Mandarin through the use of natural language generation. SNOMED CT is a large description-logic-based terminology for recording in electronic health records. Often, neither the labels nor the description logic definitions in SNOMED CT are easy for users to understand. Furthermore, information is increasingly being recorded not just using individual SNOMED CT concepts, but using dynamically created description logic expressions (“post-coordinated” concepts). Such post-coordinated expressions can have no pre-assigned labels. In this context, automatic verbalisation into multiple languages will be useful both for understanding and quality assurance of SNOMED CT definitions, and for helping users who speak different languages to understand and share post-coordinated expressions.