89 research outputs found
Minimizing word error rate in a dyslexic reading-oriented ASR engine using phoneme refinement and alternative pronunciation
Little attention has been given to detecting miscues in text read by dyslexic children using an automatic speech recognition (ASR) engine. In an ASR system, miscues are represented by word error rate (WER) and miscue detection rate (MDR). WER must be kept low and MDR high at all times to achieve better recognition. This paper focuses on minimizing WER by formulating a better model for a perspicuous representation of the input data. This representation takes into account phoneme refinement and alternative pronunciation for particular Bahasa Melayu (BM) speech data uttered by dyslexic children. Based on the literature, a few other optimal models of input data and their recognition results were compared. It is found that phoneme refinement and alternative pronunciation produced better recognition results, as evidenced by the performance metrics (lower WER and higher MDR), which are 25% and 80.77% respectively
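For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the reference text and the recognizer output, divided by the number of reference words. The sketch below shows that standard definition only; it is not the scoring tool used in the paper, and the function name and sample words are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Standard WER: word-level edit distance / number of reference words.

    Illustrative helper, not the paper's own scoring code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from output
            insertion = curr[j - 1] + 1   # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Invented sample: one substituted word out of four reference words.
print(word_error_rate("saya suka baca buku", "saya suka bata buku"))
```

Because insertions count as errors, WER computed this way can exceed 1.0 when the output is much longer than the reference.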
Dyslexic children's reading pattern as input for ASR: Data, analysis, and pronunciation model
To realize an automatic speech recognition (ASR) model that is able to recognize the Bahasa Melayu reading difficulties of dyslexic children, a language corpus has to be generated beforehand. For this purpose, data collection was performed in two public schools and involved ten dyslexic children aged between seven and fourteen. A total of 114 Bahasa Melayu words, representing 23 consonant-vowel patterns in the spelling system of the language, served as the stimuli. The patterns range from simple to somewhat complex formations of consonant-vowel pairs in words listed in a level one primary school syllabus. An analysis was performed to identify the most frequent errors made by these dyslexic children when reading aloud and to describe the emerging reading pattern of dyslexic children in general. This paper hence provides an overview of the entire process, from data collection to analysis to modeling the pronunciations of words that will serve as the active lexicon for the ASR model. This paper also highlights the challenges of collecting data from dyslexic children who are reading aloud, and other factors that contribute to the complex nature of the data collected
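As a rough illustration of what grouping stimuli by consonant-vowel pattern can look like, the sketch below maps each word to a string of C and V symbols and groups words that share a pattern. It is a simplification under stated assumptions: every letter is classified on its own, so digraphs such as "ng" count as two consonants, which may differ from the scheme the authors used. The function names and sample words are hypothetical and are not taken from the paper's 114-word list.

```python
from collections import defaultdict

VOWELS = set("aeiou")  # assumption: the five vowel letters of BM spelling


def cv_pattern(word):
    """Return the letter-by-letter consonant/vowel pattern, e.g. 'CVCV'."""
    return "".join(
        "V" if ch in VOWELS else "C"
        for ch in word.lower()
        if ch.isalpha()
    )


def group_by_pattern(words):
    """Group words that share the same consonant-vowel pattern."""
    groups = defaultdict(list)
    for word in words:
        groups[cv_pattern(word)].append(word)
    return dict(groups)


# Invented sample words, used only to show the grouping.
print(group_by_pattern(["buku", "api", "mata", "bunga"]))
```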
Management of information through an interactive voice response system the prospect of religious domain
This paper explores the potential of using an interactive voice response (IVR) system to manage information in institutions of the religious domain. We first explain the terminology involved, such as interactive voice response system, computer telephony integration, information management, and religious domain. Then we look at how information is managed through an interactive voice response system. An IVR application supports two kinds of communication: real-time communication and message-type communication. Users can choose either kind as needed. Real-time communication includes accessing databases to obtain instant, up-to-date information while the call is in progress. Message-type communication includes voice mail applications that use telephone recording. The suitability of interactive voice response for institutions of the religious domain is studied, and we also describe a few options for implementing the system
Strategies in designing user interface for a microsoft multipoint-based courseware
Designing interaction in courseware has been an ongoing challenge worldwide, especially now, after the introduction of the Microsoft Multipoint SDK (Multipoint). This new feature enables multiple USB mouse devices, up to 250, to work simultaneously on one computer. Previous courseware interface designs take only one mouse into account. Interface designers now have to consider 1 to 250 mouse devices, which in turn results in a similar number of pointers or cursors scattered all over the display. This paper presents the impact of the Multipoint feature on courseware interface design and suggests ways to handle multiple pointers on screen while still complying with the user interface guidelines. Using a Multipoint-based courseware that we developed internally, we offer three strategies that can be adopted for groups of 2-4 pointers, 5-10 pointers, and more than ten pointers. These strategies were tested by a group of university students, and we found that the strategies help provide smooth interaction and collaboration for up to 15 users and 15 pointers on a single display. Details of the results are reported in the paper
Digital graphic novels: Technology enhanced narrative for learning
The effectiveness of digital graphic novels (DGN) has never been measured from the perspective of interaction design (IxD), although previous studies regarded DGN as having materiality and multisensoriality characteristics which were crucial for most purposes and territories, especially for students' learning. Teaching a dry graduate course with most of the traditional methods has been too dull to reinforce concepts taught in the classroom. This paper proposes DGN for learning difficult subjects such as 'research methods' and 'Islamic values'. The DGN, however, must have well-crafted content based on abstractions of real-life events. Methods of abstracting the content and enhancing the narratives in producing a DGN are provided. The DGN prototypes were developed and tested with real graduate students. The results reveal a number of interesting facts about IxD in DGN and the desirability of DGN as a promising learning tool
A virtual repository approach to departmental information sharing
Realising the difficulties of information sharing among academicians, this paper introduces an alternative information sharing model that supports information management features. The model is designed to facilitate information distribution and information sharing at minimum effort and cost. A centralised database approach is used in the model, enabling any educator in the higher learning institution to participate in and manage the database as though it were their very own 'personal library'. This model is then implemented in an application given the persona name ViRepo, a web-based internal virtual repository. The process of analysing, designing, and implementing the model in ViRepo is reported in this paper. ViRepo showed that the model allows information access, information sharing, information management, and partnership enhancement, and gives each participating educator full control over the repository
A retrospective view on the promise on machine translation for Bahasa Melayu-English
Research and development activities for machine translation systems from English to other languages are more advanced than those in the opposite direction. It has been more than 30 years since machine translation was introduced, and yet a Malay language or Bahasa Melayu (BM) to English machine translation engine is not available. Consequently, many translation systems have been developed for the world's top 10 languages in terms of native speakers, but none for BM, although the language is used by more than 200 million speakers around the world. This paper seeks possible reasons why such a situation occurs. A summative overview showing progress, challenges, and future work on MT is presented. Issues faced by researchers and system developers in modeling and developing a machine translation engine are also discussed. The study of previous translation systems (from other languages to English) reveals that an accuracy level of up to 85% can be achieved. The figure suggests that the translation system is not reliable if it is to be used in a serious translation activity. The most prominent difficulties are the complexity of grammar rules and ambiguity problems in the source language. Thus, we hypothesize that including a 'semantic' property in the translation rules may produce a better quality BM-English MT engine
A systematic method towards generating the Malaysian folktale classification system
This paper explicates two methods used in a study conducted to propose a Malaysian folktale classification system. Fundamentally, three substantial folktale classification systems exist, and each system classifies based on one of three distinctive folktale units: type, motif, and function. Type and motif classify based on the content of a folktale, while function classifies based on its structure. The study aims to generate the Malaysian folktale classification system based on an amalgamation of these three renowned units of folktale. To classify, the method selected is structural-semantic analysis, which encompasses three levels of classification. It classifies sequentially according to the content and structure of the Malaysian folktale. Nonetheless, prior to classification, the Malaysian folktales must be identified. This task is steered by two qualifying factors: formal features in an operational definition developed in the study, and ownership of the folktales. These factors function as a filter that safeguards the study from contamination by forms of folklore other than those the study intends to examine. Additionally, they assist in discriminating between modern folktales and those with embedded cultural values of different generations. In a nutshell, this paper reveals the methods to identify and classify the Malaysian folktales
Pronunciation variations and context-dependent model to improve ASR performance for dyslexic children’s read speech
Focusing on the key element of an ASR-based application for dyslexic children reading isolated words in Bahasa Melayu, this paper provides evidence of the need for a carefully designed acoustic model to reach a satisfactory recognition accuracy of 79.17% on the test dataset. Pronunciation variations and a context-dependent model are the two main components of such an acoustic model. The model adopts the most frequent errors in reading selected vocabulary, which are obtained from primary data collection and analysis. The analysis shows that the most frequent spelling and reading error is vowel substitution, which accounts for over 20% of the total errors made
ASR technology for immediate intervention to support reading for dyslexic children
Reading is an essential skill for literacy development, and help should therefore be provided so that children can master the skill at an early age. For dyslexic children, mastering the skill is a challenge. It has been widely agreed that the theory behind such reading difficulties among dyslexic children lies in phonological-core deficits. Support has been given to dyslexic children in many ways to teach them to read, from teaching with various multi-sensory methods to using computer-based applications that include animated characters and text-to-speech (TTS) technology. Such applications, although stimulating, require the children to call for help by pressing custom-made buttons on the computer screen. Often, such an application requires the dyslexic children to be aware of their mistakes and to be able to judge when help is needed. They are also reluctant to ask the computer for help. Hence, such technology does not provide immediate intervention to correct any reading failure. It is therefore worth looking at the promising automatic speech recognition (ASR) technology to provide such intervention. This paper gives an overview of the use of ASR to facilitate immediate reading intervention, which is the key element of remediating reading among dyslexic children. For such intervention to work, data on reading mistakes and patterns are observed and collected in audio format. The data serve as training and testing samples for an ASR system. An observation was carried out in two public schools that participated in the study to record dyslexic children's reading in Bahasa Melayu (BM) and to observe error patterns and their behaviour toward reading. A total of 10 dyslexic children were involved, and a total of 6384 utterances from a set of selected words were gathered and analysed. Data are grouped into error type categories, and the analysis gives 'vowel substitution' as the most frequent error made (20%). The findings may be of interest to special education teachers or parents in devising and using a suitable approach to correct reading mistakes often made by dyslexic children. The findings also contribute to the development of a suitable and well-tuned ASR model focusing on dyslexic children reading aloud in BM