Search CORE

1,857 research outputs found

The Multimodal Tutor: Adaptive Feedback from Multimodal Experiences

Author: Di Mitri D.
Publication venue: Open Universiteit
Publication date: 04/09/2020
Field of study

This doctoral thesis describes the journey of ideation, prototyping and empirical testing of the Multimodal Tutor, a system designed for providing digital feedback that supports psychomotor skills acquisition using learning and multimodal data capturing. The feedback is given in real-time with machine-driven assessment of the learner's task execution. The predictions are tailored by supervised machine learning models trained with human annotated samples. The main contributions of this thesis are: a literature survey on multimodal data for learning, a conceptual model (the Multimodal Learning Analytics Model), a technological framework (the Multimodal Pipeline), a data annotation tool (the Visual Inspection Tool) and a case study in Cardiopulmonary Resuscitation training (CPR Tutor). The CPR Tutor generates real-time, adaptive feedback using kinematic and myographic data and neural networks

Open University of the Netherlands Research Portal

Multimodal Data Analytics and Fusion for Data Science

Author: Tian Haiman
Publication venue: FIU Digital Commons
Publication date: 01/01/2019
Field of study

Advances in technologies have rapidly accumulated a zettabyte of “new” data every two years. The huge amount of data have a powerful impact on various areas in science and engineering and generates enormous research opportunities, which calls for the design and development of advanced approaches in data analytics. Given such demands, data science has become an emerging hot topic in both industry and academia, ranging from basic business solutions, technological innovations, and multidisciplinary research to political decisions, urban planning, and policymaking. Within the scope of this dissertation, a multimodal data analytics and fusion framework is proposed for data-driven knowledge discovery and cross-modality semantic concept detection. The proposed framework can explore useful knowledge hidden in different formats of data and incorporate representation learning from data in multimodalities, especial for disaster information management. First, a Feature Affinity-based Multiple Correspondence Analysis (FA-MCA) method is presented to analyze the correlations between low-level features from different features, and an MCA-based Neural Network (MCA-NN) ispro- posedto capture the high-level features from individual FA-MCA models and seamlessly integrate the semantic data representations for video concept detection. Next, a genetic algorithm-based approach is presented for deep neural network selection. Furthermore, the improved genetic algorithm is integrated with deep neural networks to generate populations for producing optimal deep representation learning models. Then, the multimodal deep representation learning framework is proposed to incorporate the semantic representations from data in multiple modalities efficiently. At last, fusion strategies are applied to accommodate multiple modalities. In this framework, cross-modal mapping strategies are also proposed to organize the features in a better structure to improve the overall performance

A survey of technologies supporting design of a multimodal interactive robot for military communication

Author: Sheuli Paul
Publication venue: Emerald Publishing
Publication date: 01/11/2023
Field of study

Purpose – This paper presents a survey of research into interactive robotic systems for the purpose of identifying the state of the art capabilities as well as the extant gaps in this emerging field. Communication is multimodal. Multimodality is a representation of many modes chosen from rhetorical aspects for its communication potentials. The author seeks to define the available automation capabilities in communication using multimodalities that will support a proposed Interactive Robot System (IRS) as an AI mounted robotic platform to advance the speed and quality of military operational and tactical decision making. Design/methodology/approach – This review will begin by presenting key developments in the robotic interaction field with the objective of identifying essential technological developments that set conditions for robotic platforms to function autonomously. After surveying the key aspects in Human Robot Interaction (HRI), Unmanned Autonomous System (UAS), visualization, Virtual Environment (VE) and prediction, the paper then proceeds to describe the gaps in the application areas that will require extension and integration to enable the prototyping of the IRS. A brief examination of other work in HRI-related fields concludes with a recapitulation of the IRS challenge that will set conditions for future success. Findings – Using insights from a balanced cross section of sources from the government, academic, and commercial entities that contribute to HRI a multimodal IRS in military communication is introduced. Multimodal IRS (MIRS) in military communication has yet to be deployed. Research limitations/implications – Multimodal robotic interface for the MIRS is an interdisciplinary endeavour. This is not realistic that one can comprehend all expert and related knowledge and skills to design and develop such multimodal interactive robotic interface. In this brief preliminary survey, the author has discussed extant AI, robotics, NLP, CV, VDM, and VE applications that is directly related to multimodal interaction. Each mode of this multimodal communication is an active research area. Multimodal human/military robot communication is the ultimate goal of this research. Practical implications – A multimodal autonomous robot in military communication using speech, images, gestures, VST and VE has yet to be deployed. Autonomous multimodal communication is expected to open wider possibilities for all armed forces. Given the density of the land domain, the army is in a position to exploit the opportunities for human–machine teaming (HMT) exposure. Naval and air forces will adopt platform specific suites for specially selected operators to integrate with and leverage this emerging technology. The possession of a flexible communications means that readily adapts to virtual training will enhance planning and mission rehearsals tremendously. Social implications – Interaction, perception, cognition and visualization based multimodal communication system is yet missing. Options to communicate, express and convey information in HMT setting with multiple options, suggestions and recommendations will certainly enhance military communication, strength, engagement, security, cognition, perception as well as the ability to act confidently for a successful mission. Originality/value – The objective is to develop a multimodal autonomous interactive robot for military communications. This survey reports the state of the art, what exists and what is missing, what can be done and possibilities of extension that support the military in maintaining effective communication using multimodalities. There are some separate ongoing progresses, such as in machine-enabled speech, image recognition, tracking, visualizations for situational awareness, and virtual environments. At this time, there is no integrated approach for multimodal human robot interaction that proposes a flexible and agile communication. The report briefly introduces the research proposal about multimodal interactive robot in military communication

Directory of Open Access Journals

On data-driven systems analyzing, supporting and enhancing users’ interaction and experience

Author: Cruz-Benito Juan
Publication venue
Publication date: 01/01/2018
Field of study

[EN]The research areas of Human-Computer Interaction and Software Architectures have been traditionally treated separately, but in the literature, many authors made efforts to merge them to build better software systems. One of the common gaps between software engineering and usability is the lack of strategies to apply usability principles in the initial design of software architectures. Including these principles since the early phases of software design would help to avoid later architectural changes to include user experience requirements. The combination of both fields (software architectures and Human-Computer Interaction) would contribute to building better interactive software that should include the best from both the systems and user-centered designs. In that combination, the software architectures should enclose the fundamental structure and ideas of the system to offer the desired quality based on sound design decisions. Moreover, the information kept within a system is an opportunity to extract knowledge about the system itself, its components, the software included, the users or the interaction occurring inside. The knowledge gained from the information generated in a software environment can be used to improve the system itself, its software, the users’ experience, and the results. So, the combination of the areas of Knowledge Discovery and Human-Computer Interaction offers ideal conditions to address Human-Computer-Interaction-related challenges. The Human-Computer Interaction focuses on human intelligence, the Knowledge Discovery in computational intelligence, and the combination of both can raise the support of human intelligence with machine intelligence to discover new insights in a world crowded of data. This Ph.D. Thesis deals with these kinds of challenges: how approaches like data-driven software architectures (using Knowledge Discovery techniques) can help to improve the users' interaction and experience within an interactive system. Specifically, it deals with how to improve the human-computer interaction processes of different kind of stakeholders to improve different aspects such as the user experience or the easiness to accomplish a specific task. Several research actions and experiments support this investigation. These research actions included performing a systematic literature review and mapping of the literature that was aimed at finding how the software architectures in the literature have been used to support, analyze or enhance the human-computer interaction. Also, the actions included work on four different research scenarios that presented common challenges in the Human- Computer Interaction knowledge area. The case studies that fit into the scenarios selected were chosen based on the Human-Computer Interaction challenges they present, and on the authors’ accessibility to them. The four case studies were: an educational laboratory virtual world, a Massive Open Online Course and the social networks where the students discuss and learn, a system that includes very large web forms, and an environment where programmers develop code in the context of quantum computing. The development of the experiences involved the review of more than 2700 papers (only in the literature review phase), the analysis of the interaction of 6000 users in four different contexts or the analysis of 500,000 quantum computing programs. As outcomes from the experiences, some solutions are presented regarding the minimal software artifacts to include in software architectures, the behavior they should exhibit, the features desired in the extended software architecture, some analytic workflows and approaches to use, or the different kinds of feedback needed to reinforce the users’ interaction and experience. The results achieved led to the conclusion that, despite this is not a standard practice in the literature, the software environments should embrace Knowledge Discovery and datadriven principles to analyze and respond appropriately to the users’ needs and improve or support the interaction. To adopt Knowledge Discovery and data-driven principles, the software environments need to extend their software architectures to cover also the challenges related to Human-Computer Interaction. Finally, to tackle the current challenges related to the users’ interaction and experience and aiming to automate the software response to users’ actions, desires, and behaviors, the interactive systems should also include intelligent behaviors through embracing the Artificial Intelligence procedures and techniques

Interactive maps: What we know and what we need to know

Author: Roth Robert E.
Publication venue: DigitalCommons@UMaine
Publication date: 01/06/2013
Field of study

This article provides a review of the current state of science regarding cartographic interaction a complement to the traditional focus within cartography on cartographic representation. Cartographic interaction is defined as the dialog between a human and map mediated through a computing device and is essential to the research into interactive cartography geovisualization and geovisual analytics. The review is structured around six fundamental questions facing a science of cartographic interaction: (1) what is cartographic interaction (e.g. digital versus analog interactions interaction versus interfaces stages of interaction interactive maps versus mapping systems versus map mash-ups); (2) why provide cartographic interaction (e.g. visual thinking geographic insight the stages of science the cartographic problematic); (3) when should cartographic interaction be provided (e.g. static versus interactive maps interface complexity the productivity paradox flexibility versus constraint work versus enabling interactions); (4) who should be provided with cartographic interaction (e.g. user-centered design user ability expertise and motivation adaptive cartography and geocollaboration); (5) where should cartographic interaction be provided (e.g. input capabilities bandwidth and processing power display capabilities mobile mapping and location-based services); and (6) how should cartographic interaction be provided (e.g. interaction primitives objective-based versus operator-based versus operand-based taxonomies interface styles interface design)? The article concludes with a summary of research questions facing cartographic interaction and offers an outlook for cartography as a field of study moving forward

University of Maine

Directory of Open Access Journals

Spatio-Temporal Multimedia Big Data Analytics Using Deep Neural Networks

Author: Pouyanfar Samira
Publication venue: FIU Digital Commons
Publication date: 01/01/2019
Field of study

With the proliferation of online services and mobile technologies, the world has stepped into a multimedia big data era, where new opportunities and challenges appear with the high diversity multimedia data together with the huge amount of social data. Nowadays, multimedia data consisting of audio, text, image, and video has grown tremendously. With such an increase in the amount of multimedia data, the main question raised is how one can analyze this high volume and variety of data in an efficient and effective way. A vast amount of research work has been done in the multimedia area, targeting different aspects of big data analytics, such as the capture, storage, indexing, mining, and retrieval of multimedia big data. However, there is insufficient research that provides a comprehensive framework for multimedia big data analytics and management. To address the major challenges in this area, a new framework is proposed based on deep neural networks for multimedia semantic concept detection with a focus on spatio-temporal information analysis and rare event detection. The proposed framework is able to discover the pattern and knowledge of multimedia data using both static deep data representation and temporal semantics. Specifically, it is designed to handle data with skewed distributions. The proposed framework includes the following components: (1) a synthetic data generation component based on simulation and adversarial networks for data augmentation and deep learning training, (2) an automatic sampling model to overcome the imbalanced data issue in multimedia data, (3) a deep representation learning model leveraging novel deep learning techniques to generate the most discriminative static features from multimedia data, (4) an automatic hyper-parameter learning component for faster training and convergence of the learning models, (5) a spatio-temporal deep learning model to analyze dynamic features from multimedia data, and finally (6) a multimodal deep learning fusion model to integrate different data modalities. The whole framework has been evaluated using various large-scale multimedia datasets that include the newly collected disaster-events video dataset and other public datasets