Sharing Human-Generated Observations by Integrating HMI and the Semantic Sensor Web
Current "Internet of Things" concepts point to a future where connected objects gather meaningful information about their environment and share it with other objects and people. In particular, objects embedding Human Machine Interaction (HMI), such as mobile devices and, increasingly, connected vehicles, home appliances, urban interactive infrastructures, etc., may be conceived not only as sources of sensor information but also, through interaction with their users, as producers of highly valuable context-aware human-generated observations. We believe that the great promise offered by combining and sharing all of the different sources of information available can be realized through the integration of HMI and Semantic Sensor Web technologies. This paper presents a technological framework that harmonizes two of the most influential HMI and Sensor Web initiatives: the W3C's Multimodal Architecture and Interfaces (MMI) and the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) with its semantic extension, respectively. Although the proposed framework is general enough to be applied to a variety of connected objects integrating HMI, a particular development is presented for a connected-car scenario where drivers' observations about the traffic or their environment are shared across the Semantic Sensor Web. For implementation and evaluation purposes, an on-board OSGi (Open Services Gateway Initiative) architecture was built, integrating several available HMI, Sensor Web and Semantic Web technologies. A technical performance test and a conceptual validation of the scenario with potential users are reported, with results suggesting the approach is sound.
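As a rough illustration of the kind of payload such a framework might exchange, the sketch below assembles a driver observation in the spirit of OGC Observations & Measurements (O&M). All field names, URNs, and values here are hypothetical, not taken from the paper's OSGi implementation:

```python
import json
from datetime import datetime, timezone

def build_observation(procedure, observed_property, feature, result):
    """Assemble a minimal O&M-style observation as a JSON-serializable dict.

    The structure loosely mirrors O&M's core slots (procedure, observedProperty,
    featureOfInterest, phenomenonTime, result); it is an illustrative sketch,
    not the schema used by the paper's framework.
    """
    return {
        "type": "Observation",
        "procedure": procedure,                 # who/what produced it (here: a driver via HMI)
        "observedProperty": observed_property,  # e.g. traffic state
        "featureOfInterest": feature,           # e.g. a road segment
        "phenomenonTime": datetime.now(timezone.utc).isoformat(),
        "result": result,
    }

obs = build_observation(
    procedure="urn:example:driver-hmi:42",
    observed_property="http://example.org/prop/trafficDensity",
    feature="urn:example:roadSegment:A8-km12",
    result={"value": "congested", "reportedVia": "voice"},
)
print(json.dumps(obs, indent=2))
```

A real deployment would encode this as SWE/SOS XML or JSON and register the HMI-equipped object as a sensor, but the slot structure stays the same.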
SGGNet: Speech-Scene Graph Grounding Network for Speech-guided Navigation
Spoken language serves as an accessible and efficient interface, enabling
non-experts and disabled users to interact with complex assistant robots.
However, accurately grounding language utterances poses a significant challenge
due to the acoustic variability in speakers' voices and environmental noise. In
this work, we propose a novel speech-scene graph grounding network (SGGNet)
that robustly grounds spoken utterances by leveraging the acoustic similarity
between correctly recognized and misrecognized words obtained from automatic
speech recognition (ASR) systems. To incorporate the acoustic similarity, we
extend our previous grounding model, the scene-graph-based grounding network
(SGGNet), with the ASR model from NVIDIA NeMo. We accomplish this by feeding
the latent vector of speech pronunciations into the BERT-based grounding
network within SGGNet. We evaluate the effectiveness of using latent vectors of
speech commands in grounding through qualitative and quantitative studies. We
also demonstrate the capability of SGGNet in a speech-based navigation task
using a real quadruped robot, RBQ-3, from Rainbow Robotics.
Comment: 7 pages, 6 figures. Paper accepted for the Special Session at the
2023 International Symposium on Robot and Human Interactive Communication
(RO-MAN). [Dohyun Kim, Yeseung Kim, Jaehwi Jang, and Minjae Song] contributed
equally to this work.
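The fusion step the abstract describes, feeding an ASR latent vector into a BERT-based grounding head, can be pictured roughly as follows. This is a toy stand-in in plain PyTorch (the paper uses NVIDIA NeMo's ASR encoder and BERT; the dimensions and layer shapes here are invented for illustration):

```python
import torch
import torch.nn as nn

class SpeechTextFusion(nn.Module):
    """Toy stand-in for speech-text grounding fusion: concatenate a speech
    latent vector with a pooled text embedding, then score candidate
    scene-graph nodes. Dimensions and architecture are assumptions only."""

    def __init__(self, text_dim=768, speech_dim=256, num_candidates=10):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(text_dim + speech_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_candidates),  # one score per candidate object
        )

    def forward(self, text_emb, speech_latent):
        fused = torch.cat([text_emb, speech_latent], dim=-1)
        return self.scorer(fused)  # logits over candidate groundings

model = SpeechTextFusion()
text_emb = torch.randn(4, 768)       # e.g. BERT [CLS] embeddings for 4 utterances
speech_latent = torch.randn(4, 256)  # e.g. pooled ASR encoder states
logits = model(text_emb, speech_latent)
print(logits.shape)  # torch.Size([4, 10])
```

The point of the concatenation is that acoustically similar but misrecognized words still carry a nearby speech latent, so the grounding head can recover from ASR errors that would derail a text-only model.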
Voice Interaction for Augmented Reality Navigation Interfaces with Natural Language Understanding
Voice interaction with natural language understanding (NLU) has been extensively explored on desktop computers, on handheld devices, and in human-robot interaction. However, there is limited research into voice interaction with NLU in augmented reality (AR). Voice interaction offers benefits in AR, such as high naturalness and hands-free operation. In this project, we introduce VOARLA, an NLU-powered AR voice interface that guides a courier driver through delivering a package. A user study was completed to evaluate VOARLA against an AR voice interface without NLU, to investigate the effectiveness of NLU in an AR navigation interface. We evaluated three aspects: accuracy, productivity, and the command learning curve. Results show that using NLU in AR increases the accuracy of the interface by 15%. However, higher accuracy did not correlate with an increase in productivity. Results suggest that NLU helped users remember the commands on the first run, when they were unfamiliar with the system. This suggests that using NLU in a hands-free AR application can ease the learning curve for new users.
VISION-BASED URBAN NAVIGATION PROCEDURES FOR VERBALLY INSTRUCTED ROBOTS
The work presented in this thesis is part of a project in instruction based learning (IBL) for mobile
robots where a robot is designed that can be instructed by its users through unconstrained natural
language. The robot uses vision guidance to follow route instructions in a miniature town model.
The aim of the work presented here was to determine the functional vocabulary of the robot in the
form of "primitive procedures". In contrast to previous work in the field of instructable robots this
was done following a "user-centred" approach where the main concern was to create primitive
procedures that can be directly associated with natural language instructions. To achieve this, a corpus
of human-to-human natural language instructions was collected and analysed. A set of primitive
actions was found with which the collected corpus could be represented. These primitive actions were
then implemented as robot-executable procedures.
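The mapping from instruction verbs to robot-executable primitives can be pictured as a simple dispatch table. The procedure names below are invented for illustration and are not the functional vocabulary derived in the thesis:

```python
# Hypothetical primitives a verbally instructed robot might expose.
def turn(direction):
    return f"turning {direction}"

def follow_road():
    return "following road"

def take_exit(n):
    return f"taking exit {n}"

# Dispatch from the verb extracted from an instruction to its primitive.
PRIMITIVES = {
    "turn": turn,
    "follow": follow_road,
    "exit": take_exit,
}

def execute(action, *args):
    """Look up a primitive procedure by verb and run it with its arguments."""
    if action not in PRIMITIVES:
        raise ValueError(f"no primitive for '{action}'")
    return PRIMITIVES[action](*args)

print(execute("turn", "left"))  # turning left
```

The corpus analysis described above essentially determines which keys this table needs so that each natural-language instruction maps directly onto one primitive.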
Natural language instructions are under-specified when they are to be executed by a robot. This is
because instructors omit information that they consider "commonsense" and rely on the listener's
sensory-motor capabilities to determine the details of the task execution. In this thesis the under-specification
problem is solved by determining the missing information, either during the learning of
new routes or during their execution by the robot. During learning, the missing information is
determined by imitating the commonsense approach human listeners take to achieve the same
purpose. During execution, missing information, such as the location of road layout features
mentioned in route instructions, is determined from the robot's view by using image template
matching. The original contribution of this thesis, in both these methods, lies in the fact that they are
driven by the natural language examples found in the corpus collected for the IBL project.
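Image template matching of the kind mentioned above, locating a road-layout feature in the robot's view, can be sketched with normalized cross-correlation. This brute-force NumPy version is a minimal illustration, not the thesis implementation:

```python
import numpy as np

def match_template(image, template):
    """Slide `template` over `image` and return the top-left position with
    the highest normalized cross-correlation score (brute-force sketch)."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best_score, best_pos = -np.inf, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            if denom == 0:
                continue  # flat patch: correlation undefined, skip
            score = (p * t).sum() / denom
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos, best_score

# Embed a small pattern in a flat image and recover its location.
image = np.zeros((20, 20))
image[5:8, 9:12] = np.arange(9).reshape(3, 3)
template = np.arange(9).reshape(3, 3).astype(float)
pos, score = match_template(image, template)
print(pos)  # (5, 9)
```

Production systems typically use an optimized routine (e.g. OpenCV's `matchTemplate`) rather than this double loop, but the scoring idea is the same.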
During the testing phase, a high success rate for primitive calls, when these were considered
individually, showed that the under-specification problem had, overall, been solved. A novel method for testing the
primitive procedures, as part of complete route descriptions, is also proposed in this thesis. This was
done by comparing the performance of human subjects when driving the robot, following route
descriptions, with the performance of the robot when executing the same route descriptions. The
results obtained from this comparison clearly indicated where errors occur from the time when a
human speaker gives a route description to the time when the task is executed by a human listener or
by the robot.
Finally, a software speed controller is proposed in this thesis in order to control the wheel speeds of
the robot used in this project. The controller employs PI (Proportional and Integral) and PID
(Proportional, Integral and Differential) control and provides a good alternative to expensive hardware controllers.
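A discrete-time PID speed controller of the kind proposed can be sketched as below. The gains, time step, and toy wheel model are placeholder values for illustration, not those tuned in the thesis:

```python
class PID:
    """Minimal discrete PID controller: u = Kp*e + Ki*∫e dt + Kd*de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt              # accumulate error (I term)
        derivative = (error - self.prev_error) / self.dt  # error slope (D term)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Drive a toy first-order wheel model toward a target speed of 1.0.
pid = PID(kp=0.8, ki=0.5, kd=0.05, dt=0.05)
speed = 0.0
for _ in range(500):
    u = pid.update(setpoint=1.0, measured=speed)
    speed += (u - speed) * 0.05  # crude dynamics: speed relaxes toward command
print(round(speed, 2))
```

Dropping the derivative term (kd=0) gives the PI variant mentioned above; the integral term is what removes the steady-state speed error that a purely proportional controller would leave.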
Machine Analysis of Facial Expressions
No abstract
Computational Intelligence and Human- Computer Interaction: Modern Methods and Applications
The present book contains all of the articles that were accepted and published in the Special Issue of MDPI's journal Mathematics titled "Computational Intelligence and Human–Computer Interaction: Modern Methods and Applications". This Special Issue covered a wide range of topics connected to the theory and application of different computational intelligence techniques in the domain of human–computer interaction, such as automatic speech recognition, speech processing and analysis, virtual reality, emotion-aware applications, digital storytelling, natural language processing, smart cars and devices, and online learning. We hope that this book will be interesting and useful for those working in various areas of artificial intelligence, human–computer interaction, and software engineering, as well as for those who are interested in how these domains are connected in real-life situations.
Integration of Assistive Technologies into 3D Simulations: Exploratory Studies
Virtual worlds and environments have many purposes, ranging from games to scientific research. However, universal accessibility features in such virtual environments are limited. As the impairment prevalence rate increases yearly, so does research interest in the field of assistive technologies. This work introduces research in assistive technologies and presents three software developments that explore the integration of assistive technologies within virtual environments, with a strong focus on Brain-Computer Interfaces. An accessible gaming system, a hands-free navigation software system, and a Brain-Computer Interaction plugin have been developed to study the capabilities of accessibility features within virtual 3D environments. Details of the specification, design, and implementation of these software applications are presented in the thesis. Observations and preliminary results, as well as directions for future work, are also included.