14 research outputs found

    A modified approach of POMDP-based dialogue management

    This thesis applies the theory of history information space to a thorough study of dialogue management across the major approaches, ranging from the classical approach based on finite state machines to the most recent approach using the partially observable Markov decision process (POMDP). While most approaches use various techniques to estimate the system state, the POMDP-based approach avoids state estimation and uses a belief state for decision making. In addition, it provides a mechanism for modeling uncertainty and allows for error recovery. POMDP-based dialogue management demonstrates clear advantages over all other approaches in handling input uncertainty. However, applying the Markov assumption over the belief-state space in current POMDP models discards valuable information in the dialogue history, leading to unreliable recognition of the user's intention. To improve the performance of POMDP-based dialogue management, this thesis introduces belief history into the planning process and uses not only the current but also the previous belief state to determine actions. In the new approach, every change of belief state is validated against domain constraints, and an invalid change results in a modification of the actions provided by the POMDP solver. Experiments show that this new approach can handle uncertainty caused by the user's lack of domain knowledge and by practical constraints, and is therefore more accurate in intention recognition.
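    The abstract describes the mechanism only in prose; as a minimal sketch of the belief-history idea, assuming hypothetical policy, constraints, and belief-state representations (none of these names come from the thesis), the Python below validates each belief-state transition against domain constraints before accepting the POMDP solver's action.

        # Minimal sketch, not the thesis implementation. `policy` maps a belief
        # state to the action a POMDP solver would propose; `constraints` is a
        # domain predicate over (previous_belief, new_belief) transitions.
        def select_action(policy, constraints, belief_history, new_belief):
            proposed = policy(new_belief)
            if belief_history and not constraints(belief_history[-1], new_belief):
                # Invalid transition: override the solver's proposal with an
                # action grounded in the last valid belief (the abstract calls
                # this "a modification to the actions provided by the solver").
                proposed = policy(belief_history[-1])
            belief_history.append(new_belief)
            return proposed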

    Enhancing computer-human interaction with animated facial expressions

    Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Architecture, 1991. Includes bibliographical references (leaves 87-93). By Brent Cabot James Britton. M.S.

    Coverbal iconic gesture in human-computer interaction

    Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1993. Includes bibliographical references (leaves 60-63). By Carlton James Sparrell. M.S.

    A feature-based approach to continuous-gesture analysis

    Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1994. Includes bibliographical references (p. 95-98) and index. By Alan Daniel Wexelblat. M.S.

    Improved Intention Discovery with Classified Emotions in A Modified POMDP

    Emotions are among the most actively studied topics in psychology, and a source of vigorous debate among thinkers from the earliest philosophers to the present day. Human emotion classification using machine learning techniques has been an active area of research over the last decade. This investigation discusses a new approach for virtual agents to better understand and interact with the user. Our research focuses on deducing the belief state of a user who interacts with a single agent, using emotions recognized from text- or speech-based input. We built a customized decision tree that recognizes six primary emotional states from different sets of inputs. The belief state at each time slice is then inferred by constructing a belief network over the recognized emotions and computing the state of belief with a POMDP (Partially Observable Markov Decision Process) solver. The existing POMDP model is thus customized to incorporate emotions as observations for finding possible user intentions, which overcomes limitations of present methods in recognizing the belief state. The new approach also allows us to analyze human emotional behaviour in uncertain environments and helps generate effective interaction between the human and the computer.
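    The abstract gives no formulas; the sketch below shows, under assumed transition and emission matrices, the standard discrete belief update that "emotion as observation" implies: predict the next intention distribution, weight it by the probability of the observed emotion, and renormalize. The six emotion labels and all matrix values are illustrative assumptions, not the paper's model.

        import numpy as np

        EMOTIONS = ["joy", "sadness", "anger", "fear", "surprise", "disgust"]

        def update_belief(belief, T, O, emotion_idx):
            # belief: (S,) distribution over hidden user intentions
            # T: (S, S) transition matrix, T[s, s2] = P(s2 | s)
            # O: (S, E) observation matrix, O[s2, e] = P(emotion e | s2)
            predicted = belief @ T                   # predict next intention state
            updated = predicted * O[:, emotion_idx]  # weight by observed emotion
            return updated / updated.sum()           # renormalize to a distribution

        # Example: two candidate intentions, uniform prior, user shows "anger".
        b = np.array([0.5, 0.5])
        T = np.array([[0.9, 0.1], [0.2, 0.8]])
        O = np.random.dirichlet(np.ones(6), size=2)  # placeholder emission model
        b = update_belief(b, T, O, EMOTIONS.index("anger"))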

    Designing multimodal interaction for the visually impaired

    Although multimodal computer input is believed to have advantages over unimodal input, little has been done to understand how to design a multimodal input mechanism to facilitate visually impaired users' information access. This research investigates sighted and visually impaired users' multimodal interaction choices when given an interaction grammar that supports speech and touch input modalities. It investigates whether task type, working memory load, or the prevalence of errors in a given modality affects a user's choice. Theories of human memory and attention are used to explain the users' coordination of speech and touch input. Among the many findings from this research, the following are the most important in guiding system design: (1) Multimodal input is likely to be used when it is available. (2) Users select input modalities based on the type of task undertaken: they prefer touch input for navigation operations but speech input for non-navigation operations. (3) When errors occur, users prefer to stay in the failing modality rather than switch to another modality for error correction. (4) Despite the common multimodal usage patterns, there is still a high degree of individual difference in modality choices. Additional findings include: (1) Modality switching becomes more prevalent when lower working memory and attentional resources are required for other concurrent tasks. (2) Higher error rates increase modality switching, but only under duress. (3) Training order affects modality usage: teaching a modality first rather than second increases its use in users' task performance. In addition to uncovering the multimodal interaction patterns above, this research contributes to the field of human-computer interaction design by: (1) presenting a design for an eyes-free multimodal information browser, and (2) presenting a Wizard of Oz method for working with visually impaired users in order to observe their multimodal interaction. The overall contribution of this work is that of one of the early investigations into how speech and touch might be combined into a non-visual multimodal system that can be used effectively for eyes-free tasks.
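    As a loose illustration of design finding (2), a system might default each operation to the modality users preferred for it; the sketch below is a hypothetical dispatcher, with operation names invented for the example and no connection to the browser built in this research.

        # Hypothetical defaults per finding (2): touch for navigation
        # operations, speech for everything else. All names are invented.
        NAVIGATION_OPS = {"next_item", "previous_item", "scroll", "go_back"}

        def default_modality(operation):
            return "touch" if operation in NAVIGATION_OPS else "speech"

        assert default_modality("scroll") == "touch"
        assert default_modality("search_query") == "speech"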

    Giving directions to computers via two-handed gesture, speech, and gaze

    Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Architecture, 1992. Includes bibliographical references (leaves 71-74). By Edward Joseph Herranz. M.S.

    Visual classification of co-verbal gestures for gesture understanding

    Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2001. Includes bibliographical references (leaves 86-92). A person's communicative intent can be better understood, by either a human or a machine, if the person's gestures are understood. This thesis project demonstrates an expansion of both the range of co-verbal gestures a machine can identify and the range of communicative intents the machine can infer. We develop an automatic system that uses real-time video as sensory input and then segments, classifies, and responds to co-verbal gestures made by users in real time as they converse with a synthetic character known as REA, which is being developed in parallel by Justine Cassell and her students at the MIT Media Lab. A set of 670 natural gestures, videotaped and visually tracked in the course of conversational interviews and then hand-segmented and annotated according to a widely used gesture classification scheme, is used in an offline training process that trains Hidden Markov Model (HMM) classifiers. A number of feature sets are extracted and tested in the offline training process, and the best performer is employed in an online HMM segmenter and classifier that requires no encumbering attachments to the user. Modifications made to the REA system enable REA to respond to the user's beat and deictic gestures as well as turn-taking requests the user may convey in gesture. The recognition results obtained are far above chance, but too low for use in a production recognition system. The results provide a measure of validity for the gesture categories chosen, and they provide positive evidence for an appealing but difficult-to-prove proposition: to the extent that a machine can recognize and use these categories of gestures to infer information not present in the words spoken, there is exploitable complementary information in the gesture stream. By Lee Winston Campbell. Ph.D.
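    The thesis code is not reproduced here, so the following is only a sketch of the standard pattern the abstract implies: train one Hidden Markov Model per gesture class offline, then classify a feature sequence by scoring it under each model with the forward algorithm and choosing the best-scoring class. All matrices and class names are assumptions for illustration.

        import numpy as np

        def log_forward(obs, log_pi, log_A, log_B):
            # Log-likelihood of a discrete observation sequence under one HMM.
            # log_pi: (S,) initial; log_A: (S, S) transition; log_B: (S, V) emission.
            alpha = log_pi + log_B[:, obs[0]]
            for o in obs[1:]:
                alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
            return np.logaddexp.reduce(alpha)

        def classify(obs, models):
            # models maps a gesture label (e.g. "beat", "deictic") to a
            # (log_pi, log_A, log_B) triple; pick the best-scoring label.
            return max(models, key=lambda g: log_forward(obs, *models[g]))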

    The Conductor Interaction Method: Interacting Using Hand Gestures and Gaze

    Over the past thirty years computers have increasingly become part of our everyday lives. Most humans have become computer users in one way or another, since many activities involve either the direct use of a computer or are supported by one. This has prompted research into developing methods and mechanisms to assist humans in interacting with computers, known as Human Computer Interaction (HCI). This research has produced a number of interaction techniques, some of which are quite old but still in use, while others are more recent and still evolving. Many of these interaction techniques, however, are not natural in their use and typically require the user to learn a new means of interaction. Inconsistencies within these techniques, and the restrictions they impose on user creativity, can also make them difficult to use, especially for novice users. This thesis proposes an alternative interaction method, the Conductor Interaction Method, which aims to provide a more natural and easier-to-learn interaction technique. This novel interaction method extends existing Human Computer Interaction methods by drawing upon techniques found in human-human interaction. It is argued that a two-phased multimodal interaction mechanism, using gaze for selection and gesture for manipulation, incorporated within a metaphor-based environment, can provide a viable alternative for interacting with a computer, especially for novice users. The model for the Conductor Interaction Method is presented, along with an architecture and implementation for a system that realises it. The effectiveness of the Conductor Interaction Method is demonstrated via a number of studies in which users, of mixed computer experience, made use of the developed system.
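    The abstract describes the two phases only at a high level; as a rough sketch under assumed event and method names (nothing here comes from the thesis), the Python below separates gaze-driven selection from gesture-driven manipulation.

        class ConductorStyleInteraction:
            # Phase 1: gaze fixations select a target object.
            # Phase 2: hand gestures manipulate the current selection.
            def __init__(self):
                self.selected = None

            def on_gaze(self, target):
                self.selected = target

            def on_gesture(self, gesture):
                if self.selected is None:
                    return  # nothing selected yet; ignore the gesture
                if gesture == "grab":
                    self.selected.start_move()   # start_move/stop_move are
                elif gesture == "release":       # assumed target methods
                    self.selected.stop_move()
                    self.selected = None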