1,244 research outputs found

    Practical aspects of designing and developing a multimodal embodied agent

    Get PDF
    2021 Spring.Includes bibliographical references.This thesis reviews key elements that went into the design and construction of the CSU CwC Embodied agent, also known as the Diana System. The Diana System has been developed over five years by a joint team of researchers at three institutions – Colorado State University, Brandeis University and the University of Florida. Over that time, I contributed to this overall effort and in this thesis, I present a practical review of key elements involved in designing and constructing the system. Particular attention is paid to Diana's multimodal capabilities that engage asynchronously and concurrently to support realistic interactions with the user. Diana can communicate in visual as well as auditory modalities. She can understand a variety of hand gestures for object manipulation, deixis, etc. and can gesture in return. Diana can also hold a conversation with the user in spoken and/or written English. Gestures and speech are often at play simultaneously, supplementing and complementing each other. Diana conveys her attention through several non-verbal cues like slower blinking when inattentive, keeping her gaze on the subject of her attention, etc. Finally, her ability to express emotions with facial expressions adds another crucial human element to any user interaction with the system. Central to Diana's capabilities is a blackboard architecture coordinating a hierarchy of modular components, each controlling a part of Diana's perceptual, cognitive, and motor abilities. The modular design facilitates contributions from multiple disciplines, namely VoxSim/VoxML with Text-to-speech/Automatic Speech Recognition systems for natural language understanding, deep neural networks for gesture recognition, 3D computer animation systems, etc. – all integrated within the Unity game engine to create an embodied, intelligent agent that is Diana. The primary contribution of this thesis is to provide a detailed explanation of Diana's internal working along with a thorough background of the research that supports these technologies

    Embodied interaction: Learning Chinese characters through body movements

    Get PDF
    This experimental study examined the design and effectiveness of embodied interactions for learning. The researchers designed a digital learning environment integrating body joint mapping sensors to teach novice learners Chinese characters, and examined whether the embodied interaction would lead to greater knowledge acquisition in language learning compared to the conventional mouse-based interaction. Fifty-three adult learners were randomly assigned to experimental and control groups. The study adopted a pretest, an immediate posttest, and a delayed posttest on knowledge acquisition. Although higher scores were found for the embodied interaction group in both posttests, only the delayed posttest showed a statistically significant group difference. The findings suggested that active embodied actions lead to better knowledge retention compared with the passive visual embodiment. The body-moving process works as an alternative and complementary encoding strategy for character understanding and memorization by associating the semantic meaning of a character with the construction of a body posture

    Designing Human-Centered Collective Intelligence

    Get PDF
    Human-Centered Collective Intelligence (HCCI) is an emergent research area that seeks to bring together major research areas like machine learning, statistical modeling, information retrieval, market research, and software engineering to address challenges pertaining to deriving intelligent insights and solutions through the collaboration of several intelligent sensors, devices and data sources. An archetypal contextual CI scenario might be concerned with deriving affect-driven intelligence through multimodal emotion detection sources in a bid to determine the likability of one movie trailer over another. On the other hand, the key tenets to designing robust and evolutionary software and infrastructure architecture models to address cross-cutting quality concerns is of keen interest in the “Cloud” age of today. Some of the key quality concerns of interest in CI scenarios span the gamut of security and privacy, scalability, performance, fault-tolerance, and reliability. I present recent advances in CI system design with a focus on highlighting optimal solutions for the aforementioned cross-cutting concerns. I also describe a number of design challenges and a framework that I have determined to be critical to designing CI systems. With inspiration from machine learning, computational advertising, ubiquitous computing, and sociable robotics, this literature incorporates theories and concepts from various viewpoints to empower the collective intelligence engine, ZOEI, to discover affective state and emotional intent across multiple mediums. The discerned affective state is used in recommender systems among others to support content personalization. I dive into the design of optimal architectures that allow humans and intelligent systems to work collectively to solve complex problems. I present an evaluation of various studies that leverage the ZOEI framework to design collective intelligence

    Kinect vs. low-cost inertial sensing for gesture recognition

    Get PDF
    In this paper, we investigate efficient recognition of human gestures / movements from multimedia and multimodal data, including the Microsoft Kinect and translational and rotational acceleration and velocity from wearable inertial sensors. We firstly present a system that automatically classifies a large range of activities (17 different gestures) using a random forest decision tree. Our system can achieve near real time recognition by appropriately selecting the sensors that led to the greatest contributing factor for a particular task. Features extracted from multimodal sensor data were used to train and evaluate a customized classifier. This novel technique is capable of successfully classifying various gestures with up to 91 % overall accuracy on a publicly available data set. Secondly we investigate a wide range of different motion capture modalities and compare their results in terms of gesture recognition accuracy using our proposed approach. We conclude that gesture recognition can be effectively performed by considering an approach that overcomes many of the limitations associated with the Kinect and potentially paves the way for low-cost gesture recognition in unconstrained environments

    Gesture Recognition System Application to early childhood education

    Get PDF
    One of the most socially and culturally advantageous uses of human-computer interaction is enhancing playing and learning for children. In this study, gesture interactive game-based learning (GIGL) is tested to see if these kinds of applications are suitable to stimulate working memory (WM) and basic mathematical skills (BMS) in early childhood (5-6 years old) using a hand gesture recognition system. Hand gesture is being performed by the user and to control a computer system by that incoming information. We can conclude that the children who used GIGL technology showed a significant increase in their learning performance in WM and BMS, surpassing those who did normal school activities

    Multimodality with Eye tracking and Haptics: A New Horizon for Serious Games?

    Get PDF
    The goal of this review is to illustrate the emerging use of multimodal virtual reality that can benefit learning-based games. The review begins with an introduction to multimodal virtual reality in serious games and we provide a brief discussion of why cognitive processes involved in learning and training are enhanced under immersive virtual environments. We initially outline studies that have used eye tracking and haptic feedback independently in serious games, and then review some innovative applications that have already combined eye tracking and haptic devices in order to provide applicable multimodal frameworks for learning-based games. Finally, some general conclusions are identified and clarified in order to advance current understanding in multimodal serious game production as well as exploring possible areas for new applications

    Finger-stylus for non touch-enable systems

    Get PDF
    Since computer was invented, people are using many devices to interact with computer. Initially there were keyboard, mouse etc. but with advancement of technology, new ways are being discovered that are quite common and natural to the humans like stylus for touch-enabled systems. In the current age of technology, the user is expected to touch the machine interface to give input. Hand gesture is used in such a way to interact with machines where natural bare hand is used to communicate without touching machine interface. It gives a feeling to the user that he is interacting in a natural way with some human, not with traditional machines. This paper presents a technique where the user need not touch the machine interface to draw on the screen. Here hand finger draws shapes on monitor like stylus, without touching the monitor. This method can be used in many applications including games. The finger is used as an input device that acts like a paint-brush or finger-stylus and is used to make shapes in front of the camera. Fingertip extraction and motion tracking were done in Matlab with real time constraints. This work is an early attempt to replace stylus with the natural finger without touching the screen

    Video interaction using pen-based technology

    Get PDF
    Dissertação para obtenção do Grau de Doutor em InformáticaVideo can be considered one of the most complete and complex media and its manipulating is still a difficult and tedious task. This research applies pen-based technology to video manipulation, with the goal to improve this interaction. Even though the human familiarity with pen-based devices, how they can be used on video interaction, in order to improve it, making it more natural and at the same time fostering the user’s creativity is an open question. Two types of interaction with video were considered in this work: video annotation and video editing. Each interaction type allows the study of one of the interaction modes of using pen-based technology: indirectly, through digital ink, or directly, trough pen gestures or pressure. This research contributes with two approaches for pen-based video interaction: pen-based video annotations and video as ink. The first uses pen-based annotations combined with motion tracking algorithms, in order to augment video content with sketches or handwritten notes. It aims to study how pen-based technology can be used to annotate a moving objects and how to maintain the association between a pen-based annotations and the annotated moving object The second concept replaces digital ink by video content, studding how pen gestures and pressure can be used on video editing and what kind of changes are needed in the interface, in order to provide a more familiar and creative interaction in this usage context.This work was partially funded by the UTAustin-Portugal, Digital Media, Program (Ph.D. grant: SFRH/BD/42662/2007 - FCT/MCTES); by the HP Technology for Teaching Grant Initiative 2006; by the project "TKB - A Transmedia Knowledge Base for contemporary dance" (PTDC/EAT/AVP/098220/2008 funded by FCT/MCTES); and by CITI/DI/FCT/UNL (PEst-OE/EEI/UI0527/2011
    corecore