4 research outputs found

    How speaker tongue and name source language affect the automatic recognition of spoken names

    This paper examines, in an experimental set-up, the automatic recognition of person names and geographical names uttered by native and non-native speakers. The major aim was to improve our understanding of how well, and under which circumstances, previously proposed methods of multilingual pronunciation modeling and multilingual acoustic modeling contribute to better name recognition in a cross-lingual context. To arrive at a meaningful interpretation of the results, we categorized each language according to the amount of exposure a native speaker is expected to have had to it. After interpreting our results, we also tried to answer the question of how much further improvement might be attainable with a more advanced pronunciation modeling technique that we plan to develop.

    Evaluating the pronunciation of proper names by four French grapheme-to-phoneme converters

    International Speech Communication Association (ISCA) - International Astronautical Federation. ISBN-13: 9781604234480. This article reports on the results of a cooperative evaluation of grapheme-to-phoneme (GP) conversion for proper names in French. This work was carried out within the framework of a general evaluation campaign of various speech and language processing devices, including text-to-speech synthesis. The corpus and the methodology are described. The results of four systems are analysed: with word error rates of 12-20% on a list of 8,000 proper names, they give a fairly accurate picture of the progress achieved, the state of the art and the problems still to be solved in the domain of GP conversion in French. In addition, the resources and collected data will be made available to the scientific and industrial community so that they can be reused in future benchmarks.

    Development of isiXhosa text-to-speech modules to support e-Services in marginalized rural areas

    Information and Communication Technology (ICT) projects are being initiated and deployed in marginalized areas to help improve the standard of living of community members. This has led to a new field, responsible for information processing and knowledge development in rural areas, called Information and Communication Technology for Development (ICT4D). An ICT4D project has been implemented in a marginalized area called Dwesa, a rural area situated on the Wild Coast of the former homeland of Transkei, in the Eastern Cape Province of South Africa. In this rural community, e-Service projects have been developed and deployed to support the existing ICT infrastructure. These projects include an e-Commerce platform, an e-Judiciary service, e-Health and an e-Government portal. Although these projects are deployed in the area, community members face a language and literacy barrier because the services are typically accessed through English textual interfaces. This is a challenge because their language of communication is isiXhosa and some community members are illiterate. Most rural areas are populated largely by people who can speak isiXhosa but cannot read or write it. This problem of illiteracy in rural areas affects both the youth and the elderly. This research seeks to design, develop and implement software modules that can be used to convert isiXhosa text into natural-sounding isiXhosa speech. Such an application is called a Text-to-Speech (TTS) system. The main objective of this research is to improve ICT4D eServices’ usability through the development of an isiXhosa Text-to-Speech system. The research is undertaken within the context of the Siyakhula Living Lab (SLL), an ICT4D intervention aimed at improving the lives of rural communities in South Africa in an attempt to bridge the digital divide. The developed TTS modules were subsequently tested to determine whether they improve eServices usability. The results show acceptable levels of usability, with the modules producing audio utterances for the isiXhosa Text-to-Speech system for marginalized areas.