10 research outputs found

    Prosody Based Co-analysis for Continuous Recognition of Coverbal Gestures

    Although speech and gesture recognition have been studied extensively, all successful attempts to combine them in a unified framework have been semantically motivated, e.g., by keyword-gesture co-occurrence. Such formulations inherit the complexity of natural language processing. This paper presents a Bayesian formulation that exploits the articulatory co-occurrence of gesture and speech to improve the accuracy of automatic recognition of continuous coverbal gestures. Prosodic features from the speech signal were co-analyzed with the visual signal to learn the prior probability of co-occurrence of prominent spoken segments with particular kinematic phases of gestures. This co-analysis was found to help detect and disambiguate visually small gestures, which in turn improves the rate of continuous gesture recognition. The efficacy of the proposed approach was demonstrated on a large database collected from Weather Channel broadcasts. This formulation opens new avenues for bottom-up frameworks of multimodal integration.
    Comment: for an alternative version, see http://vision.cse.psu.edu/kettebek/academ/publications.ht
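
    A minimal sketch of the kind of Bayesian fusion described above, assuming a prosody-conditioned prior over gesture phases learned from co-occurrence counts; the phase inventory and all probabilities below are illustrative, not taken from the paper.

        # Hedged sketch: combine a visual gesture-phase likelihood with a prior
        # learned from co-occurrence of prosodically prominent speech segments
        # and gesture phases. All numbers are illustrative.
        PHASES = ["preparation", "stroke", "hold", "retraction"]

        # P(phase | prominent speech segment), as learned from co-occurrence counts
        prosody_prior = {"preparation": 0.10, "stroke": 0.60, "hold": 0.20, "retraction": 0.10}

        def posterior(visual_likelihood, speech_is_prominent):
            """visual_likelihood: dict mapping phase -> P(visual features | phase)."""
            prior = prosody_prior if speech_is_prominent else {p: 1.0 / len(PHASES) for p in PHASES}
            unnorm = {p: visual_likelihood[p] * prior[p] for p in PHASES}
            z = sum(unnorm.values())
            return {p: v / z for p, v in unnorm.items()}

        # A visually ambiguous (small) gesture: the prosodic prior breaks the tie.
        print(posterior({"preparation": 0.3, "stroke": 0.3, "hold": 0.2, "retraction": 0.2}, True))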

    Guidelines for digital storytelling for Arab children

    Children are increasingly exposed to various technologies in teaching and learning, and various types of learning materials have been designed, including interactive digital storytelling. In Malaysia, local children are already familiar with story-based learning materials. The situation is somewhat different for Arab children: their number in Malaysia is increasing as they follow parents pursuing higher education, and they must familiarize themselves with the local learning scenario. Accordingly, this study was initiated to identify their acceptance of story-based learning materials, specifically interactive digital storytelling. The study began by approaching Arab children for feedback on whether they have any desire for interactive digital storytelling; through a series of interviews, it found a strong desire and tendency. The following objectives were then stated: (1) to determine the components of interactive digital storytelling for Arab children, (2) to design and develop a prototype of the interactive digital storytelling, and (3) to observe how Arab children experience the interactive digital storytelling. A user-centered design (UCD) approach was followed to ensure that the objectives were achieved. The components of the interactive digital storytelling were determined by directly involving Arab children and their teachers from three preschools in Changlun and Sintok, and the same applied to determining the content and interface design, through to prototype development. With the prototype ready, user testing was carried out to explore how Arab children experience it. All processes involved various techniques, including observation, interviews, and note-taking. The user testing specifically involved qualitative and empirical data: qualitative data were gathered through observation, while empirical data were gathered using the Computer System Usability Questionnaire (CSUQ). The findings show that Arab children are highly satisfied with the prototype. Scientifically, the developed prototype mirrors the guidelines obtained through the UCD seminars; hence, positive acceptance of the prototype reflects positive acceptance of the guidelines, the main contribution of this study. Besides the guidelines, the developed prototype is also a valuable contribution to the Arab children and their teachers, who will use it as part of their teaching and learning materials.
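
    Since satisfaction was measured with the CSUQ, a minimal scoring sketch may help; it assumes Lewis's 19-item, seven-point version of the questionnaire and its published subscale groupings, and the response values below are illustrative.

        # Hedged CSUQ scoring sketch, assuming Lewis's 19-item, 7-point version
        # (1 = strongly agree ... 7 = strongly disagree, so lower is better).
        from statistics import mean

        def score_csuq(responses):
            """responses: list of 19 item ratings in the range 1..7."""
            assert len(responses) == 19
            return {
                "OVERALL":   mean(responses),         # items 1-19
                "SYSUSE":    mean(responses[0:8]),    # items 1-8: system usefulness
                "INFOQUAL":  mean(responses[8:15]),   # items 9-15: information quality
                "INTERQUAL": mean(responses[15:18]),  # items 16-18: interface quality
            }

        print(score_csuq([2, 1, 2, 2, 1, 2, 1, 2, 3, 2, 2, 3, 2, 2, 2, 1, 2, 1, 2]))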

    Prosody and Kinesics Based Co-analysis Towards Continuous Gesture Recognition

    The aim of this study is to develop a multimodal co-analysis framework for continuous gesture recognition by exploiting the prosodic and kinesic manifestations of natural communication. Using this framework, a co-analysis pattern between correlating components is obtained. The co-analysis pattern is clustered using K-means clustering to determine how well the pattern distinguishes the gestures. Features that differentiate the proposed approach from other models are its lower susceptibility to idiosyncrasies, its scalability, and its simplicity. The experiment was performed on the Multimodal Annotated Gesture Corpus (MAGEC) that we created for the research community studying non-verbal communication, particularly gestures.
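
    A minimal sketch of the clustering step, assuming co-analysis feature vectors have already been extracted; the feature dimensionality, cluster count, and data are stand-ins, not MAGEC.

        # Hedged sketch: cluster prosody-kinesics co-analysis vectors with K-means
        # and inspect how well clusters align with annotated gesture classes.
        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 6))   # stand-in co-analysis feature vectors
        kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
        labels = kmeans.labels_         # cluster assignment per observation
        # Comparing `labels` with gesture annotations (e.g., via the adjusted
        # Rand index) indicates how well the pattern distinguishes the gestures.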

    Hand Gesture Interaction with Human-Computer

    Hand gestures are an important modality for human-computer interaction. Compared to many existing interfaces, hand gestures have the advantages of being easy to use, natural, and intuitive. Successful applications of hand gesture recognition include computer game control, human-robot interaction, and sign language recognition, to name a few. Vision-based recognition systems can give computers the capability of understanding and responding to hand gestures. The paper gives an overview of the field of hand gesture interaction with computers and describes the early stages of a project about gestural command sets, an issue that has often been neglected. We have built a first prototype for exploring the use of pie and marking menus in gesture-based interaction. The purpose is to study whether such menus, with practice, could support the development of autonomous gestural command sets. The scenario is remote control of home appliances, such as TV sets and DVD players, which in the future could be extended to the more general scenario of ubiquitous computing in everyday situations. Some early observations are reported, mainly concerning problems with user fatigue and precision of gestures. Future work is discussed, such as introducing flow menus for reducing fatigue and control menus for continuous control functions. The computer vision algorithms will also have to be developed further.
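
    A minimal sketch of pie-menu selection in a gesture interface, assuming hand position is already tracked; the eight-item appliance command set and dead-zone value are illustrative, not from the paper.

        # Hedged sketch: select a pie-menu item from the hand's displacement
        # relative to the menu centre; a dead zone ignores jitter.
        import math

        ITEMS = ["TV on", "TV off", "vol +", "vol -", "ch +", "ch -", "DVD play", "DVD stop"]

        def pie_select(dx, dy, dead_zone=0.05):
            """Return the item the hand points at, or None inside the dead zone."""
            if math.hypot(dx, dy) < dead_zone:
                return None
            angle = math.atan2(dy, dx) % (2 * math.pi)   # 0..2*pi, counter-clockwise
            slice_width = 2 * math.pi / len(ITEMS)
            return ITEMS[int(angle // slice_width)]

        print(pie_select(0.2, 0.1))   # -> 'TV on' (hand to the upper right)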

    Conceptual model for usable multi-modal mobile assistance during Umrah

    Performing Umrah is very demanding and takes place in very crowded environments. In response, many efforts have been initiated to overcome the difficulties faced by pilgrims. However, those efforts focus on acquiring an initial perspective and background knowledge before going to Mecca, and findings of a preliminary study show that they do not support multimodality in user interaction. Nowadays the computational capabilities of mobile phones enable them to serve people in various aspects of daily life, and mobile phone penetration has increased dramatically in the last decade. Hence, this study aims to propose a comprehensive conceptual model for usable multimodal mobile assistance during Umrah, called Multimodal Mobile Assistance during Umrah (MMA-U). Four (4) supporting objectives were formulated, and the Design Science Research Methodology was adopted. For the usability of MMA-U, a Systematic Literature Review (SLR) indicates ten (10) attributes: usefulness, error rate, simplicity, reliability, ease of use, safety, flexibility, accessibility, attitude, and acceptability. Meanwhile, content and comparative analysis result in five (5) components that construct the conceptual model of MMA-U (structural, content composition, design principles, development approach, and technology), together with the design and usability theories. The MMA-U was reviewed and well accepted by 15 experts, and was then incorporated into a prototype called Personal Digital Mutawwif (PDM), developed for the purpose of a user test in the field. The findings indicate that PDM facilitates the execution of Umrah and successfully meets pilgrims' needs and expectations. The pilgrims were satisfied, felt that they needed to have PDM, and would recommend PDM to their friends, which means that the use of PDM is safe and suitable while performing Umrah. In conclusion, the theoretical contribution, the conceptual model of MMA-U, provides guidelines for developing multimodal mobile content applications for use during Umrah.

    Toward an intelligent multimodal interface for natural interaction

    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 73-76). By Ying Yin.
    Advances in technology are enabling novel approaches to human-computer interaction (HCI) in a wide variety of devices and settings (e.g., the Microsoft® Surface, the Nintendo® Wii, iPhone®, etc.). While many of these devices have been commercially successful, the use of multimodal interaction technology is still not well understood from a more principled system design or cognitive science perspective. The long-term goal of our research is to build an intelligent multimodal interface for natural interaction that can serve as a testbed for formulating a more principled system design framework for multimodal HCI. This thesis focuses on the gesture input modality. Using a new hand tracking technology capable of tracking 3D hand postures in real time, we developed a recognition system for continuous natural gestures. By natural gestures, we mean those encountered in spontaneous interaction, rather than a set of artificial gestures designed for the convenience of recognition. To date we have achieved 96% accuracy on isolated gesture recognition, and a 74% correct rate on continuous gesture recognition, with data from different users and twelve gesture classes. We connected the gesture recognition system with Google Earth, enabling gestural control of a 3D map. In particular, users can tilt the map in 3D using a non-touch-based gesture, which is more intuitive than touch-based ones. We also carried out an exploratory user study to observe natural behavior in an urban search and rescue scenario with a large tabletop display. The qualitative results from the study provide good starting points for understanding how users naturally gesture and how to integrate different modalities. This thesis has set the stage for further development towards our long-term goal.
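
    A minimal sketch of the kind of non-touch map tilting described, assuming the tracker already reports hand pitch; the angle ranges and the linear mapping are illustrative assumptions, not the thesis implementation.

        # Hedged sketch: map tracked hand pitch to a 3D-map camera tilt command.
        def hand_pitch_to_tilt(pitch_deg, tilt_min=0.0, tilt_max=75.0):
            """Map hand pitch in [-45, +45] degrees onto a camera tilt range."""
            clamped = max(-45.0, min(45.0, pitch_deg))
            t = (clamped + 45.0) / 90.0          # normalize to [0, 1]
            return tilt_min + t * (tilt_max - tilt_min)

        print(hand_pitch_to_tilt(0.0))   # level hand -> 37.5 degrees of tilt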

    HUMAN ROBOT INTERACTION THROUGH SEMANTIC INTEGRATION OF MULTIPLE MODALITIES, DIALOG MANAGEMENT, AND CONTEXTS

    The hypothesis for this research is that applying the human-computer interaction (HCI) concepts of multiple modalities, dialog management, context, and semantics to human-robot interaction (HRI) will improve the performance of Instruction Based Learning (IBL) compared to using speech alone. We tested the hypothesis by simulating a domestic robot that can be taught to clean a house using a multimodal interface. We used a method of semantically integrating the inputs from multiple modalities and contexts that multiplies a confidence score for each input by a Fusion Weight, sums the products, and then uses the input with the highest product sum. We developed an algorithm for determining the Fusion Weights. We concluded that different modalities, contexts, and modes of dialog management impact human-robot interaction; however, which combination is better depends on the relative importance of the accuracy of learning what is taught versus the succinctness of the dialog between the user and the robot.
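
    A minimal sketch of the fusion rule as described above; the modality weights, the candidate interpretations, and the choice to sum products per interpretation are illustrative assumptions.

        # Hedged sketch: weight each input's confidence by its modality's Fusion
        # Weight, sum the products per candidate interpretation, pick the best.
        FUSION_WEIGHTS = {"speech": 0.5, "gesture": 0.3, "context": 0.2}  # illustrative

        def fuse(inputs):
            """inputs: list of (modality, interpretation, confidence) tuples."""
            scores = {}
            for modality, interpretation, confidence in inputs:
                weighted = confidence * FUSION_WEIGHTS[modality]
                scores[interpretation] = scores.get(interpretation, 0.0) + weighted
            return max(scores, key=scores.get)

        # Weak speech evidence is overridden by agreeing gesture and context.
        print(fuse([("speech", "clean kitchen", 0.4),
                    ("gesture", "clean lounge", 0.8),
                    ("context", "clean lounge", 0.5)]))   # -> 'clean lounge'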

    Interacção gestual sem superfícies de apoio (Gestural interaction without supporting surfaces)

    Master's thesis in Informatics Engineering (Information Systems), presented to the Universidade de Lisboa through the Faculdade de Ciências, 2011.
    Input peripherals are no longer the only way to convey intentions to the machine; it is now possible to do so with one's own body. Devices that allow gestural interaction without intermediary peripherals are becoming more common, mainly in the area of video games. This trend raises several questions to be investigated in the field of human-machine interaction. The simplistic approach of transferring interaction concepts from the classic WIMP paradigm, based on the traditional input devices of mouse and keyboard, quickly leads to unexpected problems. The characteristics of an interface designed for gestural interaction, in which there is no contact with any input device, will not suit the paradigm used over the last 40 years. We are thus in a position to explore how gestural interaction, with or without voice, can help minimize the problems of the classic WIMP paradigm in contact-free interaction. This work explores the field of gestural interaction, with and without voice. Through a set of applications, it conducts several studies of virtual object manipulation based on computer vision. Object manipulation is performed with two interaction modes (gesture and voice), which may or may not be integrated. The aim is to analyze whether gestural interaction is appealing to users for some types of applications and actions, while for other types gestures may not be the preferred interaction modality.

    Human haptic perception in virtual environments: An investigation of the interrelationship between physical stiffness and perceived roughness.

    Research in the area of haptics and how we perceive the sensations that come from haptic interaction started almost a century ago, yet there is little fundamental knowledge as to how, and whether, a change in the physical values of one characteristic can alter the perception of another. The increasing availability of haptic interaction through the development of force-feedback devices opens new possibilities, allowing accurate real-time changes of the physical attributes of virtual objects in order to test how the human user's haptic perception changes. An experiment was carried out to ascertain whether a change in the stiffness value would have a noticeable effect on the perceived roughness of a virtual object. Participants were presented with a textured surface and were asked to estimate how rough it felt compared to a standard. What the participants did not know was that the simulated texture on both surfaces remained constant, and the only physical attribute changing in every trial was the comparison object's surface stiffness. The results showed a strong relationship between physical stiffness and perceived roughness that can be accurately described by a power function: the magnitude estimations of roughness increased with increasing stiffness values. The conclusion is that changes in the physical stiffness of a virtual object can change how rough it is perceived to be in a very clear and predictable way. Extending this study can lead to an investigation of how other physical attributes affect one or more perceived haptic dimensions; the insights could subsequently be used to construct something like a haptic palette for haptic display designers, where altering one physical attribute changes a whole array of perceived haptic dimensions in a clear and predictable way.
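
    A minimal sketch of fitting the reported power-function relationship R = k * S^a between stiffness S and perceived roughness R, via linear regression in log-log space; the data points below are made up, not the study's.

        # Hedged sketch: fit perceived roughness = k * stiffness^a by regressing
        # log(roughness) on log(stiffness). All data values are illustrative.
        import numpy as np

        stiffness = np.array([0.2, 0.4, 0.6, 0.8, 1.0])   # e.g., N/mm (made up)
        roughness = np.array([0.8, 1.3, 1.7, 2.0, 2.3])   # magnitude estimates (made up)

        a, log_k = np.polyfit(np.log(stiffness), np.log(roughness), 1)
        k = np.exp(log_k)
        print(f"perceived roughness ~ {k:.2f} * stiffness^{a:.2f}")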

    Gesture and Speech in Interaction - 4th edition (GESPIN 4)

    The fourth edition of Gesture and Speech in Interaction (GESPIN) was held in Nantes, France. With more than 40 papers, these proceedings show just what a flourishing field of enquiry gesture studies continues to be. The keynote speeches of the conference addressed three different aspects of multimodal interaction: gesture and grammar, gesture acquisition, and gesture and social interaction. In a talk entitled Qualities of event construal in speech and gesture: Aspect and tense, Alan Cienki presented an ongoing research project on narratives in French, German and Russian, a project that focuses especially on the verbal and gestural expression of grammatical tense and aspect in narratives in the three languages. Jean-Marc Colletta's talk, entitled Gesture and Language Development: towards a unified theoretical framework, described the joint acquisition and development of speech and early conventional and representational gestures. In Grammar, deixis, and multimodality between code-manifestation and code-integration, or why Kendon's Continuum should be transformed into a gestural circle, Ellen Fricke proposed a revisited grammar of noun phrases that integrates gestures as part of the semiotic and typological codes of individual languages. From a pragmatic and cognitive perspective, Judith Holler explored the use of gaze and hand gestures as means of organizing turns at talk as well as establishing common ground, in a presentation entitled On the pragmatics of multi-modal face-to-face communication: Gesture, speech and gaze in the coordination of mental states and social interaction.
    Among the talks and posters presented at the conference, the vast majority of topics related, quite naturally, to gesture and speech in interaction, understood both in terms of the mapping of units in different semiotic modes and of the use of gesture and speech in social interaction. Several presentations explored the effects of impairments (such as diseases or the natural ageing process) on gesture and speech. The communicative relevance of gesture and speech and audience design in natural interactions, as well as in more controlled settings like television debates and reports, was another topic addressed during the conference. Some participants also presented research on first and second language learning, while others discussed the relationship between gesture and intonation. While most participants presented research on gesture and speech from an observer's perspective, be it in semiotics or pragmatics, some nevertheless focused on another important aspect: the cognitive processes involved in language production and perception. Last but not least, participants also presented talks and posters on the computational analysis of gestures, whether involving external devices (e.g. mocap, Kinect) or concerning the use of specially designed computer software for the post-processing of gestural data. Importantly, new links were made between semiotics and mocap data.