INTELLIGENT VOICE-BASED E-EDUCATION SYSTEM: A FRAMEWORK AND EVALUATION
Voice-based web e-Education is a technology-supported learning paradigm that gives learners telephone access to web-based e-Learning applications. These applications are designed mainly for the visually impaired. They are, however, lacking in adaptive and reusable learning objects, which are emerging requirements for applications in this domain. This paper presents a framework for developing intelligent voice-based applications in the context of e-Education. The framework supports intelligent components such as adaptation and recommendation services. A prototype Intelligent Voice-based E-Education System (iVEES) was developed and tested by visually impaired users. A usability study was carried out using the International Organization for Standardization (ISO) 9241-11 specification to determine the levels of effectiveness, efficiency and user satisfaction. Our findings show that the application is of immense benefit, owing to the system's inherent capacity for taking autonomous decisions that adapt to users' requests.
Transcoding multilingual and non-standard web content to VoiceXML
Transcoding systems redesign and reformat existing web interfaces into other formats, such as audio or sign language, so that they are available to other audiences. The benefit of such systems is that less work is needed to meet the needs of different audiences. This thesis describes the design and implementation of a transcoding system called Dinaco. Dinaco is targeted at converting HTML web pages created with Extensible Markup Language (XML) technologies into speech interfaces. The differentiating feature of Dinaco is that it uses separated annotations during its transcoding process, whereas previous transcoding systems use HTML-dependent annotations. These separated annotations enable Dinaco to pre-normalize non-standard words and to generate VoiceXML interfaces that carry the semantics of the content. The semantics help Text-to-Speech (TTS) tools to read multilingual text and to perform text normalization. The results from experiments indicate that pre-normalizing non-standard words and appending semantics enable Dinaco to generate VoiceXML interfaces that are more usable than those generated by transcoding systems that use HTML-dependent annotations. The thesis uses the design of Dinaco to demonstrate how separating annotations makes it possible to write descriptions of content that cannot be written using external HTML-dependent annotations, and how separating annotations makes annotations easy to write, maintain, re-use and share.
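The kind of semantic markup described above can be illustrated with a small VoiceXML prompt fragment (a hypothetical sketch for illustration, not Dinaco's actual output): a pre-normalized non-standard token such as a date is tagged with SSML `say-as` so the TTS engine expands it correctly, and an `xml:lang` switch guides pronunciation of multilingual text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form>
    <block>
      <prompt>
        Your appointment is on
        <!-- say-as tells the TTS how to expand the non-standard word -->
        <say-as interpret-as="date" format="dmy">3/5/2010</say-as>.
        <!-- switching xml:lang guides pronunciation of multilingual text -->
        <voice xml:lang="fr-FR">Bonne journée.</voice>
      </prompt>
    </block>
  </form>
</vxml>
```

Without the `say-as` element, a TTS engine is free to read "3/5/2010" as a fraction or digit string; with it, the intended expansion is unambiguous.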
A Portable, Server-Side Dialog Framework for VoiceXML
We describe a spoken dialog application framework that combines the power and flexibility of server-side Java Servlets and Java Server Pages (JSPs) with the deployment portability, reliability and scalability of standard web (HTTP) servers and VoiceXML clients. Applications are developed by extending a framework of Java classes in order to define dialogs through lower-level actions such as speech recognition, audio prompting, speech synthesis, and backend data access. The framework delegates session data management to servlets, embedding frame-based representations of the application's global and session data. Dialog flow is controlled through general constructions such as loops, conditionals, and scoped sub-dialogs, along with scoped command, error, and exception handling. Prompting and grammars are configured through simple JSP templates that generate the VoiceXML instructions for the server to return to the client. The framework is designed to be extensible, as demonstrated by the implementation of customizable backup and repeat commands integrated with session data, command handling and grammar scoping.
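A page such a server-side template might return to the VoiceXML client could look like the following (an illustrative sketch; the form name, grammar file and servlet URL are invented, not taken from the paper): the client collects one field and posts the recognized value back to the server, which holds all session state.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="getAccount">
    <field name="account">
      <prompt>Please say your account type.</prompt>
      <grammar src="account.grxml" type="application/srgs+xml"/>
      <catch event="noinput nomatch">
        <prompt>Sorry, I did not get that.</prompt>
        <reprompt/>
      </catch>
      <filled>
        <!-- session state lives on the server; the recognized value is posted back -->
        <submit next="http://example.com/dialog/next" namelist="account" method="post"/>
      </filled>
    </field>
  </form>
</vxml>
```

Because each turn ends in a `<submit>` back to the server, the dialog logic stays in the servlet layer and the VoiceXML client remains a thin, stateless interpreter.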
Improving Speech Interaction in Vehicles Using Context-Aware Information through A SCXML Framework
Speech technologies can provide important benefits for the development of more usable and safe in-vehicle human-machine interactive systems (HMIs). However, mainly due to robustness issues, the use of spoken interaction can entail important distractions for the driver. In this challenging scenario, while speech technologies are evolving, further research is necessary to explore how they can be complemented both with other modalities (multimodality) and with information from the increasing number of available sensors (context-awareness). The perceived quality of speech technologies can be significantly increased by implementing such policies, which simply try to make the best use of all the available resources, and the in-vehicle scenario is an excellent test-bed for this kind of initiative. In this contribution we propose an event-based HMI design framework which combines context modelling and multimodal interaction using a W3C XML language known as SCXML. SCXML provides a general process-control mechanism that is being considered by the W3C to improve both voice interaction (VoiceXML) and multimodal interaction (MMI). In our approach we try to anticipate and extend these initiatives, presenting a flexible SCXML-based approach for the design of a wide range of multimodal, context-aware in-vehicle HMI interfaces. The proposed framework for HMI design and specification has been implemented on an automotive OSGi service platform, and it is being used and tested in the Spanish research project MARTA for the development of several in-vehicle interactive applications.
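The event-based control that SCXML contributes here can be sketched with a minimal state chart (a hypothetical illustration; the state and event names are invented, not the MARTA project's): context sensors and the voice modality both raise events, and the chart decides when spoken dialogue is allowed and when the interface falls back to a restricted mode.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="idle">
  <state id="idle">
    <!-- a context sensor raises an event when driving load is high -->
    <transition event="context.highLoad" target="restricted"/>
    <transition event="user.speech" target="dialogue"/>
  </state>
  <state id="dialogue">
    <onentry>
      <!-- delegate the spoken exchange to a voice dialogue component -->
      <send event="vxml.start"/>
    </onentry>
    <transition event="vxml.done" target="idle"/>
  </state>
  <state id="restricted">
    <!-- while distraction risk is high, suppress open-ended spoken interaction -->
    <transition event="context.normalLoad" target="idle"/>
  </state>
</scxml>
```

The point of the state-chart formalism is that context events and modality events are handled uniformly, so adding a new sensor or modality means adding transitions rather than rewriting dialogue code.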
Model based development of speech recognition grammar for VoiceXML
Speech interaction is a natural form of interaction between humans and devices. This interaction technology is currently in high demand but often has limitations. A limited form of interaction is therefore used to achieve the best possible efficiency: direct commands instead of complete natural language. VoiceXML is a W3C (World Wide Web Consortium) recommended web-based language for developing speech-interaction applications, and it performs the dialogue management. It has been used as the base language for the case study in this thesis.
VoiceXML uses a grammar to recognise user utterances as commands. This thesis applies a model-based development approach to create a hierarchical data model for the speech grammar. Further, the grammar has been separated from the interaction code. The MetaEdit+ tool has been used for developing the grammar model and for generating the grammar file.
The approach is then compared with other grammar models. In conclusion, the applied approach is found suitable for grammar modelling in VoiceXML application development.
Descriptors: Data Model, Grammar Model, MetaEdit+, VoiceXML, Grammar generation, Hierarchical Model, Speech interaction
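A hierarchical grammar of the kind such a model-based generator would emit can be sketched in the W3C Speech Recognition Grammar Specification (SRGS) XML form (an illustrative example; the rule names and phrases are invented, not taken from the thesis): the root rule composes lower-level rules by reference, mirroring the hierarchy of the data model.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" root="command" mode="voice">
  <!-- root rule: a command is an action followed by a device -->
  <rule id="command" scope="public">
    <ruleref uri="#action"/>
    <ruleref uri="#device"/>
  </rule>
  <!-- lower-level rules; a generator would emit one rule per model node -->
  <rule id="action">
    <one-of>
      <item>turn on</item>
      <item>turn off</item>
    </one-of>
  </rule>
  <rule id="device">
    <one-of>
      <item>the lights</item>
      <item>the radio</item>
    </one-of>
  </rule>
</grammar>
```

Keeping each model node in its own rule is what makes the grammar maintainable separately from the VoiceXML interaction code: regenerating one rule does not disturb the others.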
On the Development of Adaptive and User-Centred Interactive Multimodal Interfaces
Multimodal systems have attracted increased attention in recent years, which has made possible important improvements in the technologies for the recognition, processing, and generation of multimodal information. However, there are still many issues related to multimodality which are not clear, for example, the principles that make it possible to resemble human-human multimodal communication. This chapter focuses on some of the most important challenges that researchers have recently envisioned for future multimodal interfaces. It also describes current efforts to develop intelligent, adaptive, proactive, portable and affective multimodal interfaces.