34 research outputs found

    Prosody Based Co-analysis for Continuous Recognition of Coverbal Gestures

    Although speech and gesture recognition have been studied extensively, successful attempts to combine them in a unified framework have been semantically motivated, e.g., through keyword-gesture co-occurrence. Such formulations inherit the complexity of natural language processing. This paper presents a Bayesian formulation that exploits the phenomenon of gesture and speech articulation to improve the accuracy of automatic recognition of continuous coverbal gestures. Prosodic features from the speech signal were co-analyzed with the visual signal to learn the prior probability that prominent spoken segments co-occur with particular kinematical phases of gestures. This co-analysis was found to help detect and disambiguate visually small gestures, which in turn improves the rate of continuous gesture recognition. The efficacy of the proposed approach was demonstrated on a large database collected from Weather Channel broadcasts. The formulation opens new avenues for bottom-up frameworks of multimodal integration.
    Comment: Alternative see: http://vision.cse.psu.edu/kettebek/academ/publications.ht
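    The Bayesian idea in this abstract can be illustrated with a toy sketch (this is not the paper's implementation; the function name and all probability values are made-up assumptions): a learned prosody-based prior is fused multiplicatively with a visual likelihood, so that a visually ambiguous small gesture is disambiguated when it co-occurs with a prominent spoken segment.

```python
# Illustrative sketch only: Bayes-style fusion of a visual gesture likelihood
# with a prosody co-occurrence prior. All numbers below are toy values.

def posterior(visual_likelihood, prosody_prior):
    """Fuse per-gesture visual likelihoods with prosody-based priors,
    then renormalize so the result is a proper distribution."""
    unnorm = {g: visual_likelihood[g] * prosody_prior[g] for g in visual_likelihood}
    z = sum(unnorm.values())
    return {g: p / z for g, p in unnorm.items()}

# Visually, a small "point" gesture is ambiguous with "rest"...
visual = {"point": 0.4, "rest": 0.6}
# ...but a prominent spoken segment co-occurs far more often with a stroke phase.
prior = {"point": 0.8, "rest": 0.2}

fused = posterior(visual, prior)
# "point" now dominates: 0.32 / (0.32 + 0.12) = 8/11
```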

    Multimodal interaction on english testing academic assessment

    [EN] Multimodal interaction methods applied to English-language learning environments will be a line of future research built on adapted mobile phones or PDAs. Today's mobile devices allow synchronized data access and entry through different channels. At the academic level, we carried out a first analysis of English-language learning on an experimental multimodal platform. The research will evaluate the impact of use by college students, with a view to future online applications aimed at improving language skills through self-learning. (C) 2012 Published by Elsevier Ltd. Selection and/or peer review under responsibility of Prof. Dr. Huseyin Uzunboylu.
    This work was carried out within the project "Analysis and verification of adaptation and accessible multimodal interaction for language examinations on mobile devices", funded by the Universitat Politècnica de València under the program "Projects for new lines in multidisciplinary research PID-05-10".
    Magal Royo, T.; Giménez López, JL.; García Laborda, J. (2012). Multimodal interaction on English testing academic assessment. Procedia - Social and Behavioral Sciences. 46:5824-5827. https://doi.org/10.1016/j.sbspro.2012.06.522

    User Error Handling Strategies on a Non Visual Multimodal Interface: Preliminary Results from an Exploratory Study

    The present study addresses two questions about a non-visual multimodal interface for browsing textual information: (1) how prevalent is input-modality switching as an error-handling strategy, and (2) how much must an input modality fail before modality switching occurs? The results indicate that although switching input modalities to correct errors is an expected practice on multimodal GUIs, it is not the prevalent strategy on non-visual multimodal interfaces. We believe that users are more likely to diversify their error-handling strategies within a modality when different strategies are possible, but we have not found conclusive evidence for this belief. However, our analysis suggests that the failure to switch modalities when errors occur may be due, in part, to the availability of alternative error-handling strategies within a given input modality: the user prefers to stay in the same modality rather than assume the cognitive load of a switch.

    An information assistant system for the prevention of tunnel vision in crisis management

    In crisis management, tunnel vision is a set of biases in decision makers' cognitive processes that often leads to an incorrect understanding of the real crisis situation, biased perception of information, and improper decisions. The phenomenon is a consequence both of the challenges of the task and of natural limitations in human cognition. An information assistant system is proposed with the purpose of preventing tunnel vision. The system serves as a platform for monitoring the ongoing crisis event: all information passes through the system before it arrives at the user. The system enhances data quality, reduces data quantity, and presents crisis information in a manner that prevents or repairs the user's cognitive overload. While working with such a system, the users (crisis managers) are expected to be more likely to stay aware of the actual situation, remain open-minded to possibilities, and make proper decisions.

    Cognitive Principles in Robust Multimodal Interpretation

    Multimodal conversational interfaces provide a natural means for users to communicate with computer systems through multiple modalities such as speech and gesture. To build effective multimodal interfaces, automated interpretation of user multimodal inputs is important. Inspired by previous investigations of cognitive status in multimodal human-machine interaction, we have developed a greedy algorithm for interpreting user referring expressions (i.e., multimodal reference resolution). The algorithm incorporates the cognitive principles of Conversational Implicature and the Givenness Hierarchy and applies constraints from various sources (e.g., temporal, semantic, and contextual) to resolve references. Our empirical results show the advantage of this algorithm in efficiently resolving a variety of user references. Because of its simplicity and generality, the approach has the potential to improve the robustness of multimodal input interpretation.
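    The flavor of such a greedy, constraint-driven resolver can be sketched as follows (a toy illustration under our own assumptions: the class, ranks, and scoring below are invented for exposition, not the authors' algorithm). Candidates carry a givenness rank (lower means more "in focus", e.g. just gestured at), a semantic type, and a recency timestamp; the resolver filters by semantic and temporal constraints, then greedily picks the most given candidate.

```python
# Toy sketch of greedy multimodal reference resolution; all names and
# thresholds here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    givenness_rank: int   # 0 = in focus (e.g. gestured at), 1 = activated, ...
    sem_type: str         # semantic type expected by the referring expression
    age: float            # seconds since the entity was last mentioned/gestured

def resolve(expected_type, candidates, max_age=10.0):
    """Greedily pick the most 'given', most recent candidate that passes
    the semantic-type and temporal constraints; None if nothing qualifies."""
    viable = [c for c in candidates
              if c.sem_type == expected_type and c.age <= max_age]
    if not viable:
        return None
    return min(viable, key=lambda c: (c.givenness_rank, c.age))

cands = [
    Candidate("red house", 1, "house", 2.0),
    Candidate("blue house", 0, "house", 1.0),  # just pointed at -> in focus
    Candidate("old road", 0, "road", 0.5),
]
ref = resolve("house", cands)  # "this one" + pointing gesture -> blue house
```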

    Multimodal Interactivity in the Foreign Language Section of the Spanish University Admission Examination

    [ES] This paper presents the considerable possibilities that the concept of multimodality offers in interactive digital environments applied to the foreign-language section of the Spanish university entrance examination, developed in recent years across several research projects. On the management side, research arising from the telematic automation of the entrance examination has addressed the problems of online access to the tests, which in the future would allow the examination itself to be administered efficiently and its marking to be organized semi-automatically over the Internet. In parallel, cross-platform and/or cross-browser applications have been explored that can meet the challenges of technical and functional accessibility validation required for universal access. Multimodality applied to navigation methods during the telematic test is made possible by ubiquitous devices that allow simultaneous interaction while the student enters data. The aim of this paper is to demonstrate the technical feasibility of the examination process, as a function of the technological and formal variables of navigation, considering the currently available examination on ubiquitous devices, which will advance computer-assisted language learning (CALL).
    The research has produced a prototype that will be used to analyze and validate the functionality of this kind of interactive technology adapted to the university entrance examinations.
    We thank the Ministerio de Ciencia e Innovación (MICINN) for funding the research project (co-financed by FEDER) under the National R&D&I Plan «Orientación, propuestas y enseñanza para la sección de inglés en la Prueba de Acceso a la Universidad», reference FFI2011-22442. Part of this work was also developed through the project «Análisis y verificación de la adaptación e interacción multimodal accesible en la realización de exámenes de aprendizaje de idiomas sobre dispositivos móviles», funded by the Universidad Politécnica de Valencia under the program «Proyectos de nuevas líneas de investigación multidisciplinares. PAID-05-10».
    Magal Royo, T.; Giménez López, JL. (2012). La interactividad multimodal en la sección de lengua extranjera de la Prueba de Acceso a la Universidad en España. Revista de Educación. (357):163-177. http://hdl.handle.net/10251/29888

    ForcePhone: new prototype with pressure and thermal feedback

    The project described here goes by the name 'ForcePhone II'. It follows in the footsteps of the previous 'ForcePhone' project developed by Nokia Research and the Helsinki Institute of Information Technology [HSH12]. The original ForcePhone took pressure input from a caller squeezing a phone (a Nokia N900 with a built-in pressure sensor) and sent the receiver vibrations of varying strength based on the strength of the squeeze: a stronger squeeze delivered a stronger vibration. Our task with ForcePhone II is to investigate how other forms of haptic input and feedback can be used to increase communication bandwidth. To do this we built a prototype, 'ForcePhone II', that allows us to experiment with alternative forms of haptic feedback. The prototype includes a heating pad on the back of the device, a rope on the side that tightens over the user's hand, and a vibrating motor inside. In our experiment we test users' reactions to the outputs of pressure, vibration, and heat, in order to answer our central research question: is pressure and/or heat a useful alternative to the existing 'silent' stimuli provided by today's phones?
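    The squeeze-to-vibration mapping the original ForcePhone used can be sketched minimally (this is our own illustrative assumption, not the actual ForcePhone code or its parameters): a normalized pressure reading is clamped and quantized into a discrete vibration level, so a stronger squeeze yields a stronger vibration.

```python
# Illustrative sketch: quantize a squeeze pressure reading into a vibration
# intensity level. Ranges and level count are invented for the example.

def squeeze_to_vibration(pressure, p_min=0.0, p_max=1.0, levels=4):
    """Map a pressure reading (clamped to [p_min, p_max]) to a
    discrete vibration level in 1..levels."""
    p = min(max(pressure, p_min), p_max)
    frac = (p - p_min) / (p_max - p_min)
    return max(1, min(levels, 1 + int(frac * levels)))

squeeze_to_vibration(0.1)  # light squeeze -> level 1
squeeze_to_vibration(0.9)  # strong squeeze -> level 4
```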