5 research outputs found
Study of the VoiceXML Standard for Telephone Data Access. Case Study: Prototype for Telephone Bill Queries
The objective of this research is to study the VoiceXML standard, a tag-based programming language for building applications that rely on voice navigation, playback of pre-recorded audio files, and speech recognition, and that present users with dynamic information extracted from database engines and rendered as speech. For this research, a telephone exchange was implemented; the work describes the installation and configuration of the test platform for the VoiceXML standard with all the elements needed for it to function correctly. Speech recognition and speech synthesis engines were used, which allow the standard to function very well. The application consists of a menu that presents options to the user by voice. The user chooses an option with a voice command, and the application recognizes the input and proceeds to a new activity, such as navigating to another option or accessing the system database to present information to the user as speech. After the data-access prototype was implemented with the VoiceXML standard, it was determined that VoiceXML applications use 17% of resources compared with 83% for Web applications, which indicates that accessing information with VoiceXML requires minimal resources. The use of this standard is recommended for future voice applications.
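The voice menu described in the abstract above can be illustrated with a minimal sketch. It is not the authors' prototype: the function name `build_menu`, the prompt wording, and the target file names are hypothetical, and the only assumption about the standard is the VoiceXML 2.0 `<menu>`, `<prompt>`, `<choice>`, `<noinput>`, `<nomatch>`, and `<reprompt>` elements.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_menu(prompt, options):
    """Return a VoiceXML 2.0 document containing one spoken menu.

    `options` maps a phrase the caller may say to the URI of the
    dialog that should handle it (hypothetical file names here).
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    menu = ET.SubElement(vxml, "menu", {"id": "main"})
    ET.SubElement(menu, "prompt").text = prompt

    # One <choice> per option: the element text is the phrase to
    # recognize, and `next` is where the interpreter goes on a match.
    for phrase, target in options.items():
        ET.SubElement(menu, "choice", {"next": target}).text = phrase

    # Replay the prompt when the caller says nothing or is not understood.
    for event, message in (("noinput", "I did not hear anything."),
                           ("nomatch", "I did not understand.")):
        handler = ET.SubElement(menu, event)
        handler.text = message
        ET.SubElement(handler, "reprompt")

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_menu("Say balance or history.",
                     {"balance": "balance.vxml", "history": "history.vxml"}))
```

In a deployment like the one described, each target document would typically be generated on the server from a database query, so the caller hears current data read back by the speech synthesis engine.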
Voice Portal with VoiceXML Platform
Import 05/08/2014. This master's thesis deals with the design and implementation of a web and voice portal on the VoiceXML platform.
The first chapters introduce the VoiceXML language and selected voice platforms. An introduction to the related technologies for speech synthesis (TTS) and automatic speech recognition (ASR) follows.
The subsequent chapters describe the design and implementation of the web and voice interfaces of an information system for flight ticket reservations. The configuration of the voice portal is also described.
Load tests of the designed voice portal were carried out as part of the thesis.
440 - Katedra telekomunikační techniky (Department of Telecommunications). Grade: excellent (výborně)
Constructing a low-cost, open-source, VoiceXML
Voice-enabled applications, which interact with a user via an audio channel, are used extensively today. Their use is growing as speech-related technologies improve, since speech is one of the most natural methods of interaction. They can provide customer support as IVRs, serve as an assistive technology, or become an aural interface to the Internet. Given that the telephone is used extensively throughout the globe, the number of potential users of voice-enabled applications is very high. VoiceXML is a popular, open, high-level, standard means of creating voice-enabled applications, designed to bring the benefits of web-based development to services. While VoiceXML is an ideal language for creating these applications, VoiceXML gateways, the hardware and software responsible for interpreting VoiceXML applications and interfacing with the PSTN, are still expensive, so there is a need for a low-cost gateway. Asterisk, an open-source TDM/VoIP telephony platform, can be used as a low-cost PSTN interface. This thesis investigates adding a VoiceXML service to Asterisk, creating a low-cost prototype VoiceXML gateway that is able to render voice-enabled applications. Following the Component-Based Software Engineering (CBSE) paradigm, the VoiceXML gateway is divided into a set of components that are sourced from the open-source community and integrated to create the gateway. The browser requires a VoiceXML interpreter (OpenVXI), a Text-To-Speech engine (Festival), and a speech recognition engine (Sphinx 4). The integration of the components results in a low-cost, open-source VoiceXML gateway. System tests show that the integration of the components was successful and that the system can handle concurrent calls. A fully compliant version of the gateway can be used in the real world to render voice-enabled applications at a low cost.
Embedded automatic speech recognition platform to assist people with reduced mobility
The pursuit of greater independence and autonomy for people with disabilities has proven to be a decisive factor in improving the quality of life of these individuals through the use of assistive technologies. Speech is the most basic, common, and efficient form of communication between human beings, so voice command input can be an alternative that lets people with reduced mobility, who retain good speech ability, control a computer or other devices. The objective of this work is to develop a voice command interface, based on automatic speech recognition, that can be easily adapted and incorporated into systems and tools that assist in controlling the home environment (home automation). To this end, two development approaches were carried out. The first was a pilot experiment conducted to build an initial knowledge base for developing applications that use voice command recognition. This stage relied on a dedicated hardware module that receives voice commands directly through a microphone, forming a speaker-dependent system capable of recognizing isolated-word commands to control the lights of an RGB LED. The second approach integrates open hardware components with free and open-source software, with voice commands supplied to the system through a smartphone configured with a VoIP (Voice over IP) softphone. In this case, the softphone registers with the Asterisk communication server, which implements a telephone exchange with an interactive voice response (IVR) unit. The Julius speech recognition tool is integrated with the server. These components are embedded in the low-cost Beaglebone Black platform. The system is speaker-dependent and can recognize three-word phrases to control lighting, television, and door access in a hypothetical home environment consisting of a living room, kitchen, bedroom, bathroom, and outdoor area. The test results indicate accuracy rates of 95.9% and 94.77% for the interfaces developed in the first and second approaches, respectively. These figures suggest that the recognition modules developed are viable for implementing assistive technology solutions.
Mobile phones interaction techniques for second economy people
Second economy people in developing countries are people living in communities that are underserved in terms of basic amenities and social services. Due to literacy challenges and user accessibility problems in rural communities, it is often difficult to design user interfaces that conform to the capabilities and cultural experiences of low-literacy rural community users. Rural community users are technologically illiterate and lack knowledge of the potential of information and communication technologies. In order to embrace new technology, users need to perceive the user interface and application as useful and easy to interact with. This requires a proper understanding of the users and their socio-cultural environment, which enables the interfaces and interactions to conform to their behaviours, motivations, cultural experiences, and preferences, and thus enhances usability and user experience. Mobile phones have the potential to increase access to information and provide a platform for economic development in rural communities. Rural communities have economic potential in terms of agriculture and micro-enterprises. Information technology can be used to enhance socio-economic activities and improve rural livelihoods. We conducted a study to design user interfaces for a mobile commerce application for micro-entrepreneurs in a rural community in South Africa. The aim of the study was to design mobile interfaces and interaction techniques that are easy to use and meet the cultural preferences and experiences of users who have little to no previous experience of mobile commerce technology. A further aim was to explore the potential of information technologies for rural community users and to bring mobile value-added services to rural micro-entrepreneurs.
We applied a user-centred design approach in the Dwesa community and used qualitative and quantitative research methods to collect data for the design of the user interfaces (graphical user interface and voice user interface) and the mobile commerce application. We identified and used several interface elements to design and finally evaluate the graphical user interface. The statistical analysis of the evaluation results shows that users in the community have a positive perception of the usefulness of the application, its ease of use, and their intention to use it. Community users with no prior experience with this technology were able to learn and understand the interface, and recorded minimal errors and a high level of precision during task performance when they interacted with the shop-owner graphical user interface. The voice user interface designed in this study comes in two flavours for rural users: dual tone multi-frequency input and voice input. The evaluation results show that community users recorded higher task success and fewer errors with the dual tone multi-frequency input interface than with the voice-only input interface. A higher percentage of users also prefer the dual tone multi-frequency input interface. The t-test performed on the task completion times and error rates shows a statistically significant difference between the dual tone multi-frequency input interface and the voice input interface. The interfaces were easy to learn, understand, and use. Properly designed user interfaces that match the experience and capabilities of low-literacy users in rural areas will improve usability and users' experiences. Adapting interfaces to users' culture and preferences will enhance the accessibility of information services among different user groups in different regions. This will promote technology acceptance in rural communities for socio-economic benefits.
The user interfaces presented in this study can be adapted to different cultures to provide similar services for marginalised communities in developing countries.