164 research outputs found

    Design and evaluation of acceleration strategies for speeding up the development of dialog applications

    In this paper, we describe a complete development platform that features different innovative acceleration strategies, not included in any other current platform, that simplify and speed up the definition of the different elements required to design a spoken dialog service. The proposed accelerations are mainly based on using the information from the backend database schema and contents, as well as cumulative information produced throughout the different steps in the design. Thanks to these accelerations, the interaction between the designer and the platform is improved, and in most cases the design is reduced to simple confirmations of the “proposals” that the platform dynamically provides at each step. In addition, the platform provides several other accelerations such as configurable templates that can be used to define the different tasks in the service or the dialogs to obtain or show information to the user, automatic proposals for the best way to request slot contents from the user (i.e. using mixed-initiative forms or directed forms), an assistant that offers the set of most probable actions required to complete the definition of the different tasks in the application, and another assistant for solving specific modality details such as confirmations of user answers or how to present the lists of retrieved results to users after querying the backend database. Additionally, the platform also allows the creation of speech grammars and prompts, database access functions, and the possibility of using mixed-initiative and over-answering dialogs. In the paper we also describe in detail each assistant in the platform, emphasizing the different kinds of methodologies followed to facilitate the design process in each one. Finally, we describe the results obtained in both a subjective and an objective evaluation with different designers that confirm the viability, usefulness, and functionality of the proposed accelerations. Thanks to the accelerations, the design time is reduced by more than 56% and the number of keystrokes by 84%.

    A FRAMEWORK FOR INTELLIGENT VOICE-ENABLED E-EDUCATION SYSTEMS

    Although the Internet has received significant attention in recent years, voice is still the most convenient and natural way of communicating between humans or between humans and computers. In voice applications, users may have different needs, which require the system to reason, make decisions, be flexible, and adapt to requests during interaction. These needs have placed new requirements on voice application development, such as the use of advanced models, techniques, and methodologies that take into account the needs of different users and environments. The ability of a system to behave in a way close to human reasoning is often mentioned as one of the major requirements for the development of voice applications. In this paper, we present a framework for an intelligent voice-enabled e-Education application and an adaptation of the framework for the development of a prototype Course Registration and Examination (CourseRegExamOnline) module. This study is a preliminary report of an ongoing e-Education project containing the following modules: enrollment, course registration and examination, enquiries/information, messaging/collaboration, e-Learning, and library. The CourseRegExamOnline module was developed using VoiceXML for the voice user interface (VUI), PHP for the web user interface (WUI), Apache as the middleware, and a MySQL database as the back-end. The system would offer dual access modes using the VUI and WUI. The framework would serve as a reference model for developing voice-based e-Education applications. The e-Education system, when fully developed, would meet the needs of students who are normal users and of those with certain forms of disabilities, such as visual impairment or repetitive strain injury (RSI), that make reading and writing difficult.

    Vox et praeterea nihil

    This paper explores design issues when creating a VoiceXML application, with special emphasis on grammar design and dialog flow. The paper can serve as a starting point for learning the basics of VoiceXML. The test application is an automatic directory assistance service. Keywords: VoiceXML, vxml, Automatic Directory Assistance, Automated Speech Recognition, ASR, Text to Speech, TTS, dialog flow, Voice User Interface, VUI, VSP, Voice Service Provider

    Real time speech translator

    This document reports on my work on the Real Time Voice Translator Project, which I carried out as my final thesis project during the academic year 2007‐2008. During this period I worked in the Research and Development Center (RDC) for Mobile Applications, a department of the Czech Technical University (CTU) in Prague. In the RDC I was a member of the Automatic Call Center Project (ACC Project) team, and within it I was assigned to carry out the Real Time Voice Translator Project. The ACC Project, now renamed the Voice2Web Project, is carried out by the RDC. The RDC is a department inside the Electro Technical Faculty of the CTU that carries out research and development projects regarding information technologies (IT). Some of its partners are IBM, Vodafone, and Ericsson, for whom the RDC carries out projects. The ACC Project began in 2007, and its aim is to develop voice applications, within the agreement between IBM and the RDC, using IBM voice technologies and any open standards or open-source software. IBM is an ACC Project partner and provides financing for it. It also provides hardware and software licenses to the ACC Project and gives us support. The members of the ACC Project are developing several voice applications at the same time, all of them following the ACC Project's purposes. Although this document is focused on the Real Time Voice Translator Project, its introduction also explains some aspects of the ACC Project. This is because the Real Time Voice Translator Project has many points in common with the ACC Project, and understanding it well requires understanding some points of the ACC Project too.

    Application of backend database contents and structure to the design of spoken dialog services

    Current development platforms for designing spoken dialog services feature different kinds of strategies to help designers build, test, and deploy their applications. In general, these platforms are made up of several assistants that handle the different design stages (e.g. definition of the dialog flow, prompt and grammar definition, database connection, or debugging and testing the running application). In spite of all the advances in this area, designing spoken dialog services is in general a time-consuming task that needs to be accelerated. In this paper we describe a complete development platform that reduces the design time by using different types of acceleration strategies based on information from the data model structure and database contents, as well as cumulative information obtained throughout the successive steps in the design. Thanks to these accelerations, the interaction with the platform is simplified and the design is reduced, in most cases, to simple confirmations of the “proposals” that the platform automatically provides at each stage. Different kinds of proposals are available to complete the application flow, such as the possibility of selecting which information slots should be requested from the user together, predefined templates for common dialogs, the most probable actions that make up each state defined in the flow, and different solutions to specific speech-modality problems such as the presentation of the lists of retrieved results after querying the backend database. The platform also includes accelerations for creating speech grammars and prompts, and the SQL queries for accessing the database at runtime. Finally, we describe the setup and results of simultaneous summative, subjective, and objective evaluations with different designers, used to test the usability of the proposed accelerations as well as their contribution to reducing the design time and interaction.

    Staging Transformations for Multimodal Web Interaction Management

    Multimodal interfaces are becoming increasingly ubiquitous with the advent of mobile devices, accessibility considerations, and novel software technologies that combine diverse interaction media. In addition to improving access and delivery capabilities, such interfaces enable flexible and personalized dialogs with websites, much like a conversation between humans. In this paper, we present a software framework for multimodal web interaction management that supports mixed-initiative dialogs between users and websites. A mixed-initiative dialog is one where the user and the website take turns changing the flow of interaction. The framework supports the functional specification and realization of such dialogs using staging transformations -- a theory for representing and reasoning about dialogs based on partial input. It supports multiple interaction interfaces, and offers sessioning, caching, and co-ordination functions through the use of an interaction manager. Two case studies are presented to illustrate the promise of this approach. Comment: Describes framework and software architecture for multimodal web interaction management

    VXML: AN ALTERNATIVE SOLUTION TO ACCESSING WEBSITE'S CONTENTS

    Career Center Phone-based Application (CCPA) is a support tool for a career website, developed using VoiceXML technology, that allows users to access the contents of the website via a phone call. Because VoiceXML technology connects callers to the application via the Public Switched Telephone Network (PSTN), the phone-based application is accessible from any type of telephone, anywhere around the globe. CCPA works in the same way as an SMS career tool, providing an alternative way to receive and update the website's content without an Internet connection. However, CCPA provides more than just receiving job alerts and applying for a job via SMS. With CCPA, callers have a new experience that is like "talking" with the content of the website. Callers may retrieve the company's profile, submit voice inquiries, authenticate for login, retrieve the latest 10 job opportunities available on the website, and apply for a job, all via a phone call. This is a new environment for career-based applications that allows commands and output to be presented in speech format, and it will be the first VoiceXML-based application in Malaysia. A User Centric Design (UCD) approach was selected to develop CCPA, as it focuses on users' requirements and preferences, while the BeVocal Café was chosen as the ASP to develop, test, and host the voice application. The working CCPA has been tested using Black Box Testing on Vocal Scripter, which simulates a real telephone. Vocal Scripter is used because it is free and effective, and the application is considered working when it runs perfectly on Vocal Scripter. Some testing using real telephones has also been conducted. As a result of this development, five main modules are implemented: the general section (covering the welcome message, main menu, global help, and menu links), the voice inquiry section, the user authentication section, the job post retrieval section, and the job application section. In the near future, CCPA should be improved with a Mixed Initiative Dialog approach that will provide a great call experience, extended to support the Malay language, and personalized to provide a different and unique way of entertaining each caller. In conclusion, this project will definitely initiate and encourage the development of VoiceXML applications in Malaysia.

    Constructing a low-cost, open-source, VoiceXML

    Voice-enabled applications, applications that interact with a user via an audio channel, are used extensively today. Their use is growing as speech-related technologies improve, as speech is one of the most natural methods of interaction. They can provide customer support as IVRs, can be used as an assistive technology, or can become an aural interface to the Internet. Given that the telephone is used extensively throughout the globe, the number of potential users of voice-enabled applications is very high. VoiceXML is a popular, open, high-level, standard means of creating voice-enabled applications which was designed to bring the benefits of web-based development to services. While VoiceXML is an ideal language for creating these applications, VoiceXML gateways, the hardware and software responsible for interpreting VoiceXML applications and interfacing with the PSTN, are still expensive, and so there is a need for a low-cost gateway. Asterisk, an open-source TDM/VoIP telephony platform, can be used as a low-cost PSTN interface. This thesis investigates adding a VoiceXML service to Asterisk, creating a low-cost VoiceXML prototype gateway which is able to render voice-enabled applications. Following the Component-Based Software Engineering (CBSE) paradigm, the VoiceXML gateway is divided into a set of components which are sourced from the open-source community and integrated to create the gateway. The browser requires a VoiceXML interpreter (OpenVXI), a Text-To-Speech engine (Festival), and a speech recognition engine (Sphinx 4). The integration of the components results in a low-cost, open-source VoiceXML gateway. System tests show that the integration of the components was successful, and that the system can handle concurrent calls. A fully compliant version of the gateway can be used in the real world to render voice-enabled applications at a low cost.

    VOICE AUTHENTICATION USING VOICEXML

    User authentication through voice is one of the methods of protecting sensitive data over the Internet. In this research, the author explores the technology to acquire a clear understanding of VoiceXML, its concept, and its architecture. Building on that understanding, this research examines three levels of security using VoiceXML capabilities as a solution for validating users. The identified solutions lead to the development of a VoiceXML prototype using an available VoiceXML application development tool. This project has two main objectives. The first objective is to understand the VoiceXML technology architecture and learn to design and develop a VoiceXML application. The second objective is to observe three levels of security as a solution for validating users. The project involved two approaches: performing research on VoiceXML technology, and developing a prototype that concludes the findings of that research. In the development of the prototype, the Voice Application Life Cycle methodology is used, which includes four phases: Planning, Prototyping and Iteration, Development, and Launch. As a result, the prototype is expected to take full advantage of current VoiceXML technology. The prototype can be used as a template by developers so that their voice applications achieve the goal of user authentication, which is to make the right information reliably and securely available to the right people. The final product of the project is a prototype called Voice Authentication through Speech Recognition, which focuses on different levels of security using voice authentication with VoiceXML.