
    The influence of orthography on spoken word recognition in Bangla

    The lexical representation of a word comprises phonological, orthographic and semantic information, all of which is accessed together even when a task demands only one of these aspects. The role of orthography in word recognition tasks has been validated, though its influence on phonological tasks is less well known. Recent studies in psycholinguistics have begun to investigate the possible influences of orthography on the auditory processing of words. The present paper reviews studies that have looked at orthographic influence on phonological tasks, and reports findings from a rhyme-monitoring task in Bangla that examines the role of orthography in auditory processing.

    Development of a Yoruba Text-to-Speech System Using Festival

    This paper presents a Text-to-Speech (TTS) synthesis system for the Yorùbá language using the open-source Festival TTS engine. Yorùbá, like most African languages, is resource-scarce, which presents a major challenge to conventional speech synthesis approaches, since these typically require large corpora for training. Speech data were recorded in a quiet environment with a noise-cancelling microphone on a typical multimedia computer system using the Speech Filing System (SFS) software, then analysed and annotated using the PRAAT speech processing software. The system was evaluated for intelligibility and naturalness using mean opinion scores. The results show that intelligibility and naturalness at the word level are 55.56% and 50% respectively, but the system performs poorly on both the intelligibility and naturalness tests at the sentence level. Hence, further research is needed to improve the quality of the synthesized speech. Keywords: Text-to-Speech, Festival, Yorùbá, Syllabl

    A hypothesize-and-verify framework for Text Recognition using Deep Recurrent Neural Networks

    Deep LSTM is an ideal candidate for text recognition. However, text recognition involves some initial image processing steps, such as segmentation of lines and words, which can introduce errors into the recognition system. Without segmentation, learning very long range context is difficult and becomes computationally intractable. Therefore, alternative soft decisions are needed at the pre-processing level. This paper proposes a hybrid text recognizer that uses a deep recurrent neural network with multiple layers of abstraction and long range context, along with a language model to verify the output of the deep neural network. We construct a multi-hypotheses tree architecture whose branches hold candidate segments of line sequences produced by different segmentation algorithms. The deep neural network is trained on perfectly segmented data and tests each of the candidate segments, generating Unicode sequences. In the verification step, these Unicode sequences are validated using a sub-string match with the language model, and best-first search is used to find the best possible combination of alternative hypotheses from the tree structure. The verification framework thus uses language models to eliminate wrong segmentation outputs and filter recognition errors.
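
    The verification step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon lookup stands in for the paper's language-model sub-string match, the cost function is an assumption, and the names `lexicon_score` and `best_first_verify` are hypothetical. Each segment has several candidate transcriptions (one per segmentation algorithm), and a best-first search picks one candidate per segment so that the total mismatch against the lexicon is smallest.

```python
import heapq


def lexicon_score(text, lexicon):
    """Fraction of whitespace-separated tokens found in the lexicon.

    Assumption: a plain set lookup replaces the paper's language model.
    """
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(tok in lexicon for tok in tokens) / len(tokens)


def best_first_verify(hypotheses, lexicon):
    """Pick one candidate per segment with the lowest total mismatch.

    hypotheses: list with one entry per segment; each entry is a list of
    candidate transcriptions for that segment.
    Returns (chosen candidates, total cost).
    """
    # Frontier entries are (accumulated cost, depth in the tree, choices so far).
    frontier = [(0.0, 0, ())]
    while frontier:
        cost, depth, chosen = heapq.heappop(frontier)
        if depth == len(hypotheses):
            # Step costs are non-negative, so the first complete path
            # popped from the queue has the lowest total cost.
            return list(chosen), cost
        for cand in hypotheses[depth]:
            step = 1.0 - lexicon_score(cand, lexicon)
            heapq.heappush(frontier, (cost + step, depth + 1, chosen + (cand,)))
    # Reached only if some segment has no candidates at all.
    return [], float("inf")


if __name__ == "__main__":
    lexicon = {"the", "cat", "sat"}
    hypotheses = [["th ecat", "the cat"], ["s at", "sat"]]
    print(best_first_verify(hypotheses, lexicon))
```

    Because the search always expands the cheapest partial combination first, badly segmented candidates are only explored when every better-scoring branch has been exhausted.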

    Production of Bangla stops by native English speakers learning Bangla: An acoustic analysis

    Differences in the phonetic and phonological systems of Bangla and English result in negative transfer in the Bangla stop productions of native English speakers. The phonetic realizations of Voice and Aspiration and their interactions with each other are the key factors in this. A production study was carried out focusing on sixteen of the twenty Bangla stops that are distinguished by a four-way voice/aspiration contrast at four different places of articulation, providing a contrastive acoustic analysis of the pronunciation of L1 and L2 adult speakers. Data containing these stops in an intervocalic environment in word-initial, word-medial, and word-final positions was elicited by digital recording from twelve native Bangla speakers and twelve native English speakers. The data from the L1 speakers was analyzed to investigate production characteristics related to the following acoustic variables: vowel voicing onset time, closure duration, closure voicing, preceding vowel duration, and duration of aspiration noise. The data from the L2 speakers was then analyzed using the same variables. The primary acoustic correlates of Voice and Aspiration in Bangla were found to be closure voicing and vowel voicing onset time, respectively, and the interaction of these two variables made a clear distinction between the four stop classes of Bangla: voiceless unaspirated, voiceless aspirated, voiced unaspirated, and voiced aspirated. Evidence was found supporting the work of various researchers who have suggested that a [breathy voice] feature is not necessary for a phonological description of the Indo-Aryan languages. The stop productions of the native English speakers indicated a conceptual awareness of the four stop classes, but it was also clear that they lacked a native-like control of the Voice and Aspiration features and their specific interactions with each other. 
    The degree to which the L2 productions of the four stop classes differed from those of the L1 was directly correlated with each class’s similarity to English phonological patterns, providing evidence of certain predictable aspects of L1 transfer. To fully apply the results of this study in a pronunciation acquisition context, perceptual studies will be needed to identify the salience of these acoustic variables for both L1 and L2 speakers. Perceptual studies involving L1 speakers may also contribute to the ongoing discussion of the best phonological description of the four-way stop systems of the Indo-Aryan languages.

    RFID Technology in Intelligent Tracking Systems in Construction Waste Logistics Using Optimisation Techniques

    Construction waste disposal is an urgent issue for protecting our environment. This paper proposes a waste management system and illustrates the work process using plasterboard waste as an example: plasterboard creates a hazardous gas when landfilled with household waste, and its recycling rate in the UK is less than 10%. The proposed system integrates RFID technology, Rule-Based Reasoning, Ant Colony optimization and knowledge technology for auditing and tracking plasterboard waste, guiding the operation staff, arranging vehicles, planning schedules, and providing evidence to verify disposal. It relies on RFID equipment for collecting logistical data and uses digital imaging equipment to give further evidence; the reasoning core in the third layer is responsible for generating schedules, route plans and guidance, and the last layer delivers the result to inform users. The paper first introduces the current plasterboard disposal situation and addresses the logistical problem that is now the main barrier to a higher recycling rate, followed by a discussion of the proposed system in terms of both system-level structure and process structure. Finally, an example scenario is given to illustrate the system’s utilization.

    How long is long? Word length effects in reading correspond to minimal graphemic units: An MEG study in Bangla.

    This paper presents a magnetoencephalography (MEG) study on reading in Bangla, an east Indo-Aryan language predominantly written in an abugida script. The study aims to uncover how visual stimuli are processed and mapped onto abstract linguistic representations in the brain. Specifically, we investigate the neural responses that correspond to word length in Bangla, a language with a unique orthography that introduces multiple ways to measure word length. Our results show that MEG signals localised in the anterior left fusiform gyrus, at around 130 ms, are highly correlated with word length when measured in terms of the number of minimal graphemic units in the word rather than independent graphemic units (akśar) or phonemes. Our findings suggest that minimal graphemic units could serve as a suitable metric for measuring word length in non-alphabetic orthographies such as Bangla.

    Marathi Speech Synthesis: A Review

    This paper surveys the various aspects of Marathi speech synthesis. It reviews research developments in international as well as Indian languages, then focuses on developments in Marathi relative to other Indian languages. It is anticipated that this work will encourage further exploration of speech synthesis for the Marathi language. DOI: 10.17762/ijritcc2321-8169.15064