15 research outputs found

    Tone labelling algorithm for Sesotho

    M.Sc., Faculty of Science, University of the Witwatersrand, 2011. Studies have shown that text-to-speech systems need detailed prosodic models of a language in order to sound natural to native speakers. A text-to-speech system developed for Sesotho must implement tone, since Sesotho is a tonal language that uses pitch variations to distinguish lexical and/or grammatical meaning. To implement tone for a language such as Sesotho, a tone modeling algorithm must receive as input the tone labels of the syllables of a word, which allows it to predict the appropriate intonation of the word. The aim of our study is to improve a basic tone labeling algorithm that predicts tone labels using three Sesotho tonal rules and whose application is restricted to polysyllabic verb stems. The study implements an extended tone labeling algorithm that adds four Sesotho tonal rules and extends its application to all the other parts of speech. The results show that the extended algorithm significantly improves on the basic algorithm by increasing the number of matched tone labels. Furthermore, our study provides the basic step toward tone modeling for languages such as Sesotho, which do not mark tone in their orthography.

    Development of isiXhosa text-to-speech modules to support e-Services in marginalized rural areas

    Information and Communication Technology (ICT) projects are being initiated and deployed in marginalized areas to help improve the standard of living of community members. This has led to a new field, called Information and Communication Technology for Development (ICT4D), which is concerned with information processing and knowledge development in rural areas. An ICT4D project has been implemented in a marginalized area called Dwesa, a rural area situated on the Wild Coast of the former homeland of Transkei, in the Eastern Cape Province of South Africa. In this rural community, e-Service projects have been developed and deployed to support the existing ICT infrastructure. These projects include an e-Commerce platform, an e-Judiciary service, e-Health and an e-Government portal. Although these projects are deployed in the area, community members face a language and literacy barrier because the services are typically accessed through English textual interfaces. This is a challenge because their language of communication is isiXhosa and some community members are illiterate. Many people in rural areas can speak isiXhosa but cannot read or write it. This problem of illiteracy in rural areas affects both the youth and the elderly. This research seeks to design, develop and implement software modules that convert isiXhosa text into natural-sounding isiXhosa speech. Such an application is called a Text-to-Speech (TTS) system. The main objective of this research is to improve the usability of ICT4D eServices through the development of an isiXhosa Text-to-Speech system. The research is undertaken within the context of the Siyakhula Living Lab (SLL), an ICT4D intervention aimed at improving the lives of rural communities in South Africa in an attempt to bridge the digital divide. The developed TTS modules were subsequently tested to determine their applicability to improving eServices usability. The results show acceptable levels of usability for the audio utterances produced by the isiXhosa Text-to-Speech system for marginalized areas.

    UmobiTalk: Ubiquitous Mobile Speech Based Learning Language Translator for Sesotho Language

    Published Thesis. The need to conserve under-resourced languages is becoming more urgent as some of them are becoming extinct; natural language processing can be used to redress this. Currently, most initiatives around language processing technologies focus on Western languages such as English and French, yet resources for such languages are already available. Sesotho is one of the under-resourced Bantu languages; it is mostly spoken in the Free State province of South Africa and in Lesotho. Like other parts of South Africa, the Free State has received a high number of migrants and non-Sesotho speakers from neighboring provinces and countries. These people face serious language barriers, especially in informal settlements where everyone tends to speak only Sesotho. Non-Sesotho speakers here refers to groups such as Xhosas, Zulus, Coloureds, Whites and others whose native language is not Sesotho. As a solution, we developed a parallel corpus with English as the source language and Sesotho as the target language and packaged it in UmobiTalk, a ubiquitous mobile speech-based learning translator. UmobiTalk is a mobile-based tool with which English speakers can learn Sesotho. The development of this tool was based on a combination of automatic speech recognition, machine translation and speech synthesis.

    Spoken language identification in resource-scarce environments

    South Africa has eleven official languages, ten of which are considered “resource-scarce”. For these languages, even the basic linguistic resources required for the development of speech technology systems can be difficult or impossible to obtain. In this thesis, the process of developing Spoken Language Identification (S-LID) systems in resource-scarce environments is investigated. A Parallel Phoneme Recognition followed by Language Modeling (PPR-LM) architecture is utilized and three specific scenarios are investigated: (1) incomplete resources, including the lack of audio transcriptions and/or pronunciation dictionaries; (2) inconsistent resources, including the use of speech corpora that are unmatched with regard to domain or channel characteristics; and (3) poor quality resources, such as wrongly labeled or poorly transcribed data. Each situation is analysed, techniques are defined to mitigate the effect of limited or poor quality resources, and the effectiveness of these techniques is evaluated experimentally. Techniques evaluated include the development of orthographic tokenizers, bootstrapping of transcriptions, filtering of low quality audio, diarization and channel normalization techniques, and the human verification of misclassified utterances. The knowledge gained from this research is used to develop the first S-LID system able to distinguish between all South African languages. The system performs well: it differentiates among the eleven languages with an accuracy above 67%, and among the six primary South African language families with an accuracy above 80%, on segments of speech between 2s and 10s in length. Dissertation (MEng)--University of Pretoria, 2010. Electrical, Electronic and Computer Engineering.

    A speaker classification framework for non-intrusive user modeling : speech-based personalization of in-car services

    Speaker Classification, i.e. the automatic detection of certain characteristics of a person based on his or her voice, has a variety of applications in modern computer technology and artificial intelligence: as a non-intrusive source for user modeling, it can be employed for personalization of human-machine interfaces in numerous domains. This dissertation presents a principled approach to the design of a novel Speaker Classification system for automatic age and gender recognition which meets these demands. Based on literature studies, methods and concepts dealing with the underlying pattern recognition task are developed. The final system consists of an incremental GMM-SVM supervector architecture with several optimizations. An extensive data-driven experiment series explores the parameter space and serves as an evaluation of the component. Further experiments investigate the language-independence of the approach. As an essential part of this thesis, a framework is developed that implements all tasks associated with the design and evaluation of Speaker Classification in an integrated development environment that is able to generate efficient runtime modules for multiple platforms. Applications from the automotive field and other domains demonstrate the practical benefit of the technology for personalization, e.g. by increasing the lead time of local danger warnings for elderly drivers.

    ANALYSING THE FRAMES OF A BIBLE. The Case of the Setswana Translations of the Book of Ruth

    This volume of BiAS is on the theory and practice of biblical translation. In 1857, Setswana became the first Bantu language to have a complete Bible. The second Setswana Bible was published in 1908 and the third in 1970. In each case, different circumstances, factors or contextual frames of reference dictated the need for such a Bible version. The frames converged to cause differences between the meanings of the Setswana Bibles and their sources. This book demonstrates how specific socio-cultural, organisational, linguistic, textual and communicational frames probably influenced the outcome of each Bible. It subsequently presents a framework for analysing existing Bibles and for minimising the occurrence of shifts in prospective translations. The book of Ruth is used as an example, while Biblia Hebraica Stuttgartensia (BHS) is treated as an ideal source text.

    A critical investigation of deaf comprehension of signed tv news interpretation

    This study investigates factors hampering comprehension of sign language interpretations rendered on South African TV news bulletins in terms of Deaf viewers’ expectancy norms and corpus analysis of authentic interpretations. The research fills a gap in the emerging discipline of Sign Language Interpreting Studies, specifically with reference to corpus studies. The study presents a new model for translation/interpretation evaluation based on the introduction of Grounded Theory (GT) into a reception-oriented model. The research question is addressed holistically in terms of target audience competencies and expectations, aspects of the physical setting, interpreters’ use of language and interpreting choices. Members of the South African Deaf community are incorporated as experts into the assessment process, thereby empirically grounding the research within the socio-dynamic context of the target audience. Triangulation in data collection and analysis was provided by applying multiple mixed data collection methods, namely questionnaires, interviews, eye-tracking and corpus tools. The primary variables identified by the study are the small picture size and the use of dialect. Secondary variables identified include inconsistent or inadequate use of non-manual features, incoherent or non-simultaneous mouthing, careless or incorrect sign execution, excessively fast signing, loss of visibility against skin or clothing, omission of vital elements of sentence structure, adherence to source language structures, meaningless additions, incorrect referencing, oversimplification and violations of Deaf norms of restructuring, information transfer, gatekeeping and third person interpreting. The identification of these factors allows the construction of a series of testable hypotheses, thereby providing a broad platform for further research. Apart from pioneering corpus-driven sign language interpreting research, the study makes significant contributions to present knowledge of evaluative models, interpreting strategies and norms, and systems of transcription and annotation. Linguistics and Modern Languages. Thesis (D. Litt. et Phil.) (Linguistics)

    Information communication technologies as a support mechanism for learners experiencing reading difficulties

    Reading difficulties are of concern worldwide, as evidenced by a number of studies, including those of the Association for the Development of Education in Africa (ADEA), the Centre for Evaluation & Assessment (CEA), and the Progress in International Reading Literacy Study (PIRLS). In South Africa’s Gauteng Province, in which this study was conducted, the Department of Education (DoE) launched campaigns such as Foundations for Learning (FFL) and Annual National Assessment (ANA) to address this problem. The purpose of this study was to explore, explain and describe the use of Information Communication Technologies (ICTs) to support learners experiencing reading difficulties in two public primary schools. The study was influenced by Vygotsky’s socio-cultural theory of human learning, which describes learning as a social process and locates the origin of human intelligence in society or culture. It comprised skills, assumptions and practices that the researcher used when moving from paradigm to the empirical world. A qualitative approach was used to gain a first-hand, holistic understanding of the use of ICTs to support learners experiencing reading difficulties, with data collected using focus group interviews, individual interviews and observations. Participants were 18 members of the School Based Support Team (SBST) and two Learning Support Educators (LSEs) of the two selected primary schools. The use of ICTs as a support mechanism was explored, with a detailed view presented of how teachers used ICTs during teaching and learning activities and how they supported learners experiencing reading difficulties. From the research findings, factors affecting learners experiencing reading difficulties were identified, including a lack of resources (specifically ICTs) and a lack of guidelines on identifying and supporting such learners. Based on the findings, conclusions and recommendations were made, and the researcher developed guidelines that teachers could use to provide ICT support for learners with reading difficulties. Educational Studies