
    A Pilot Study of Speech and Pen User Interface For Graphical Editing

    As computer size continues to decrease and new user interface technologies become more ubiquitous, the conventional keyboard and mouse input interfaces are becoming harder to design into newer machines and less practical for some applications. The pen is one input technology better suited to the upcoming generation of smaller computers using direct manipulation interfaces. However, a pen-only user interface relies on continuous gesture and handwriting recognizers that are often slow, inaccurate, and error-prone for command and text entry. Speech recognition is an input modality that can enter commands quickly and potentially serve as a fast text entry mechanism, but it lacks the capability for direct object manipulation and suffers from recognition errors. The combination of pen and voice input should therefore be complementary for direct graphic manipulation applications. This thesis compares the speed, usability, user-friendliness, and accuracy of a pen-only graphical editor against a pen-with-speech graphical editor. Two versions of a graphical editor with identical functionality were developed: one controlled by pen input alone and the other by both pen and speech input. The pen-only editor used a toolbar for command entry and character handwriting recognition for text entry. The pen-with-speech editor used speech recognition for both command and text entry. In a pilot study using both editors, 13 computer science graduate students were asked to draw a Petri net, a state diagram, a flowchart, and a dataflow diagram. Shape entry was facilitated by automatic shape recognition that transformed continuous drawing input into a perfected shape. Experimental results comparing the editors' user interfaces were then analysed. Results show that the addition of speech made the editor slightly faster. Experimental subjects rated this editor as more usable, perceived it to be faster, and preferred to use it. About half of the subjects, however, did not find the editor with speech more user-friendly than the pen-only editor. The accuracy of character recognition in the pen-with-speech editor was significantly inferior to the pen-only editor's handwriting recognition; the low accuracy was caused by the speech recognizer's inability to distinguish between similar-sounding letters.
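
    Below is a minimal sketch, in Python, of the kind of automatic shape "perfection" step described above: classify a closed pen stroke as a circle or a rectangle and replace it with an idealised shape. The radius-variance heuristic and its threshold are illustrative assumptions, not the thesis's actual recognizer.

    import math

    # Classify a hand-drawn closed stroke and return an idealised shape.
    # The 5% relative-spread threshold is an assumed tuning value.
    def perfect_shape(points):
        """points: list of (x, y) samples from a closed pen stroke."""
        cx = sum(x for x, _ in points) / len(points)
        cy = sum(y for _, y in points) / len(points)
        radii = [math.hypot(x - cx, y - cy) for x, y in points]
        mean_r = sum(radii) / len(radii)
        spread = sum((r - mean_r) ** 2 for r in radii) / len(radii)
        if spread < (0.05 * mean_r) ** 2:   # radii nearly constant: a circle
            return ("circle", (cx, cy), mean_r)
        xs = [x for x, _ in points]         # otherwise snap to bounding box
        ys = [y for _, y in points]
        return ("rectangle", (min(xs), min(ys)), (max(xs), max(ys)))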

    Ontology driven voice-based interaction in mobile environment

    The paper deals with a new approach to spoken dialogue handling in a mobile environment. The goal of our project is to allow the user to retrieve information from a knowledge base defined by an ontology, using speech in a mobile environment. This environment has specific features that should be taken into account when speech recognition and synthesis are performed. On the one hand, it limits the size of the language that can be understood by speech recognizers; on the other hand, it allows us to use information about the user's context. Our approach is to use this knowledge and user context to allow the user to speak freely to the system. Our research has been performed in the framework of the EU-funded project MUMMY, which targets the use of mobile devices on building sites. This fact determines the approach to the solution of the problem: the main issue is the user context in which the interaction takes place. As the application domain (construction sites) is rather specific, it is possible to use knowledge related to this particular application during the speech recognition process. Up to now, voice-based user interfaces have relied on techniques that usually contain constraints limiting the communication context to a strictly predefined application domain. The main idea behind our solution is the use of an ontology that represents the knowledge related to our particular application in a specific user context. The knowledge acquired from the ontology allows the user to communicate in a mobile environment because the analysis of user input is heavily simplified. The crucial step in our solution was the design of a proper system architecture that allows the system to access the knowledge in the ontology and use it to enhance the recognition process. The model of the environment in which recognition is performed has several parts:
    - domain ontology (construction sites in general)
    - instance of the domain ontology (a specific construction site)
    - conversation history plus specific user context (location, type of mobile device, etc.)
    The key part of the model is the access mechanism that extracts particular knowledge in a specific context. This access mechanism is controlled by a dialogue automaton that governs the course of the dialogue. The acquired knowledge is used in the speech recognizer to generate a specific grammar defining the possible speech inputs at a particular moment of the dialogue; in the next state, the ontology is accessed again in a different context, producing a grammar that defines the new possible inputs. The same access mechanism is also used to produce a response to the user's input in natural language. A pilot implementation of the voice-based user interface exists; it has been tested in various situations and the results obtained are very encouraging.
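
    A minimal Python sketch of the grammar-generation loop described above: each dialogue state queries the ontology in its own context, and the retrieved instances become the only utterances the recognizer will accept next. The toy ontology, state names, and vocabulary are illustrative assumptions, not the MUMMY system's actual data.

    # state -> (ontology concept to query, successor state)
    DIALOGUE_AUTOMATON = {
        "ask_material": ("Material", "ask_location"),
        "ask_location": ("Location", "ask_material"),
    }

    # toy stand-in for the domain ontology and its instances
    ONTOLOGY = {
        "Material": ["concrete", "steel", "timber"],
        "Location": ["ground floor", "second floor", "roof"],
    }

    def grammar_for(state):
        """Build the set of utterances the recognizer accepts in this state."""
        concept, _ = DIALOGUE_AUTOMATON[state]
        return ONTOLOGY[concept]

    state = "ask_material"
    for heard in ("steel", "roof"):          # simulated recognizer output
        assert heard in grammar_for(state), "utterance outside active grammar"
        _, state = DIALOGUE_AUTOMATON[state] # advance the dialogue automaton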

    Reactive Statistical Mapping: Towards the Sketching of Performative Control with Data

    This paper presents the results of our participation in the ninth eNTERFACE workshop on multimodal user interfaces. Our target for this workshop was to bring some technologies currently used in speech recognition and synthesis to a new level, i.e. making them the core of a new HMM-based mapping system. The idea of statistical mapping has been investigated, more precisely how to use Gaussian Mixture Models and Hidden Markov Models for realtime and reactive generation of new trajectories from input labels and for realtime regression in a continuous-to-continuous use case. As a result, we have developed several proofs of concept, including an incremental speech synthesiser, software for exploring stylistic spaces for gait and facial motion in realtime, a reactive audiovisual laughter synthesiser, and a prototype demonstrating the realtime reconstruction of lower-body gait motion strictly from upper-body motion, with conservation of its stylistic properties. This project has been an opportunity to formalise HMM-based mapping, integrate several of these innovations into the Mage library, and explore the development of a realtime gesture recognition tool.
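
    A minimal Python sketch of Gaussian Mixture Regression, the continuous-to-continuous mapping the abstract refers to: a joint GMM over (input, output) is conditioned on the observed input to produce an output in realtime. The two-component mixture parameters below are made-up toy values, not models trained in the project.

    import numpy as np

    # per component: weight, mean [in, out], covariance [[ii, io], [oi, oo]]
    weights = np.array([0.5, 0.5])
    means = np.array([[0.0, 1.0], [2.0, -1.0]])
    covs = np.array([[[1.0, 0.6], [0.6, 1.0]],
                     [[1.0, -0.4], [-0.4, 1.0]]])

    def gmr(x):
        """Expected output E[out | in = x] under the joint mixture."""
        # responsibility of each component for the observed input
        resp = np.array([w * np.exp(-0.5 * (x - m[0]) ** 2 / c[0, 0])
                         / np.sqrt(2 * np.pi * c[0, 0])
                         for w, m, c in zip(weights, means, covs)])
        resp /= resp.sum()
        # per-component conditional mean (one linear regression per Gaussian)
        cond = [m[1] + c[1, 0] / c[0, 0] * (x - m[0])
                for m, c in zip(means, covs)]
        return float(resp @ np.array(cond))

    print(gmr(1.0))   # mapped output for input 1.0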

    Multimodal agent interfaces and system architectures for health and fitness companions

    Multimodal conversational spoken dialogues using physical and virtual agents provide a potential interface for motivating and supporting users in the domain of health and fitness. In this paper we present how such multimodal conversational Companions can be implemented to support their owners in various pervasive and mobile settings. In particular, we focus on different forms of multimodality and on system architectures for such interfaces.

    Dialogue-based interfaces for universal access

    Conversation provides an excellent means of communication for almost all people. Consequently, a conversational interface is an excellent mechanism for allowing people to interact with systems. Conversational systems are an active research area, but a wide range of systems can already be developed with current technology. More sophisticated interfaces can take considerable effort, but simple interfaces can be developed quite rapidly. This paper gives an introduction to the current state of the art of conversational systems and interfaces. It describes a methodology for developing conversational interfaces and gives an example of an interface for a state benefits web site. The paper discusses how this interface could improve access for a wide range of people, and how further development of the interface would allow a larger range of people to use the system and give them more functionality.
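
    As an illustration of how rapidly a simple conversational interface can be built, here is a minimal frame-based dialogue sketch in Python: one prompt per empty slot and a loop that fills the slots from user replies. The slot names for a benefits enquiry are hypothetical, not taken from the paper's system.

    # prompts for a hypothetical benefits-eligibility enquiry
    PROMPTS = {
        "age": "How old are you?",
        "employment": "Are you currently employed?",
        "dependants": "Do you have any dependants?",
    }

    def run_dialogue(answers):
        """answers: scripted user replies, one per prompt, in order."""
        frame = {}
        replies = iter(answers)
        for slot, prompt in PROMPTS.items():
            print(prompt)                 # system turn
            frame[slot] = next(replies)   # user turn fills the slot
        return frame

    print(run_dialogue(["34", "no", "yes"]))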

    Web-based haptic applications for blind people to create virtual graphs

    Haptic technology has great potential in many applications. This paper introduces our work on delivering haptic information via the Web. A multimodal tool has been developed to allow blind people to create virtual graphs independently. Multimodal interaction during graph creation and exploration is provided through a low-cost haptic device, the Logitech WingMan Force Feedback Mouse, and Web audio. The Web-based tool also gives blind people the convenience of receiving information at home. In this paper, we present the development of the tool and the evaluation results. Issues related to the design of similar Web-based haptic applications are also discussed.
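
    One common way such tools render a graph line haptically is a spring force that pulls the cursor toward the nearest point on the line, so it can be traced by touch. The Python sketch below illustrates that idea only; the stiffness value is an assumed tuning parameter, and the WingMan mouse's actual driver API is not shown.

    def groove_force(cursor, a, b, stiffness=0.8):
        """Spring force pulling cursor (x, y) toward segment a-b."""
        (cx, cy), (ax, ay), (bx, by) = cursor, a, b
        dx, dy = bx - ax, by - ay
        t = ((cx - ax) * dx + (cy - ay) * dy) / (dx * dx + dy * dy)
        t = max(0.0, min(1.0, t))            # clamp to the segment
        nx, ny = ax + t * dx, ay + t * dy    # nearest point on the segment
        return (stiffness * (nx - cx), stiffness * (ny - cy))

    # a cursor above a horizontal edge is pulled straight down onto it
    print(groove_force((3.0, 4.0), (0.0, 0.0), (10.0, 0.0)))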

    Design and implementation of a user-oriented speech recognition interface: the synergy of technology and human factors

    The design and implementation of a user-oriented speech recognition interface are described. The interface enables the use of speech recognition in so-called interactive voice response systems, which can be accessed via a telephone connection. In the design of the interface, a synergy of technology and human factors is achieved. This synergy is very important for making speech interfaces a natural and acceptable form of human-machine interaction. Important concepts such as interfaces, human factors, and speech recognition are discussed. Additionally, an indication is given of how the synergy of human factors and technology can be realised, by way of a sketch of the interface's implementation. An explanation is also provided of how the interface might be fruitfully integrated into different applications.