Underwater Gesture Recognition Using Classical Computer Vision and Deep Learning Techniques
Underwater gesture recognition is a challenging task because conditions that are normally not an issue in gesture recognition on land must be considered, including low visibility, low contrast, and unequal spectral propagation. In this work, we explore the underwater gesture recognition problem using the recently released Cognitive Autonomous Diving Buddy (CADDY) Underwater Gestures dataset. The contributions of this paper are as follows: (1) use traditional computer vision techniques along with classical machine learning to perform gesture recognition on the CADDY dataset; (2) apply deep learning using a convolutional neural network (CNN) to solve the same problem; (3) perform confusion matrix analysis to determine which types of gestures are relatively difficult to recognize, and understand why; (4) compare the performance of the methods above in terms of accuracy and inference speed. We achieve up to 97.06% accuracy with our CNN. To the best of our knowledge, our work is among the earliest attempts, if not the first, to apply computer vision and machine learning techniques to gesture recognition on this dataset. As such, we hope this work will serve as a benchmark for future work on the CADDY dataset.
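The confusion-matrix analysis in contribution (3) amounts to counting, for each true gesture class, how predictions are distributed across all classes; the class with the lowest recall is the hardest to recognize. A minimal sketch, with toy labels for three hypothetical gesture classes (not classes from the CADDY dataset):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Count how often each true class (rows) is predicted as each class (columns)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def hardest_class(cm):
    """Per-class recall = diagonal / row sums; the lowest-recall class
    is the gesture the classifier struggles with most."""
    recall = np.diag(cm) / cm.sum(axis=1)
    return int(np.argmin(recall)), recall

# Toy predictions: class 1 is frequently confused with class 2.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 0, 1, 2, 2, 2, 2, 2]
cm = confusion_matrix(y_true, y_pred, 3)
worst, recall = hardest_class(cm)
```

Off-diagonal cells such as `cm[1, 2]` directly quantify which gesture pairs are mutually confusable, which is the "understand why" part of the analysis.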
Gesture Recognition Wristband Device with Optimised Piezoelectric Energy Harvesters
Wearable devices can be used to monitor vital human physiological signs and to interact with computers. Due to the limited lifetime of batteries, these devices require novel energy harvesting solutions to ensure uninterrupted, autonomous operation. We therefore developed a wearable wristband device with piezoelectric transducers that serve a hybrid function: both energy harvesting and sensing. We demonstrate that gestures can be classified using the electricity these transducers generate in response to tendon movements around the wrist. We also demonstrate how a multi-physics simulation model was used to maximize the amount of energy harvestable from the piezoelectric transducers.
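Classifying gestures from transducer voltages typically starts with a simple per-channel feature such as RMS amplitude. The sketch below invents a two-channel recording and signal statistics purely for illustration; it is not the device's actual data or pipeline:

```python
import numpy as np

def rms_features(signals):
    """RMS voltage per transducer channel -- a simple feature for
    tendon-driven voltage bursts picked up by piezoelectric elements."""
    return np.sqrt(np.mean(np.square(signals), axis=-1))

rng = np.random.default_rng(0)
# Hypothetical 2-channel recording of one gesture: the gesture excites
# channel 0 (std 1.0 V) much more strongly than channel 1 (std 0.1 V).
gesture = rng.normal(0.0, [[1.0], [0.1]], size=(2, 200))
feat = rms_features(gesture)
# A nearest-centroid rule over such per-channel features is one
# minimal classifier for distinguishing gestures.
```

The same voltages that carry this information can be rectified and stored, which is what makes the hybrid harvesting/sensing use of the transducers possible.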
Diver Interest via Pointing in Three Dimensions: 3D Pointing Reconstruction for Diver-AUV Communication
This paper presents Diver Interest via Pointing in Three Dimensions (DIP-3D),
a method to relay an object of interest from a diver to an autonomous
underwater vehicle (AUV) by pointing that includes three-dimensional distance
information to discriminate between multiple objects in the AUV's camera image.
Traditional dense stereo vision for distance estimation underwater is
challenging because of the relative lack of saliency of scene features and
degraded lighting conditions. Yet, including distance information is necessary
for robotic perception of diver pointing when multiple objects appear within
the robot's image plane. We subvert the challenges of underwater distance
estimation by using sparse reconstruction of keypoints to perform pose
estimation on both the left and right images from the robot's stereo camera.
Triangulated pose keypoints, along with a classical object detection method,
enable DIP-3D to infer the location of an object of interest when multiple
objects are in the AUV's field of view. By allowing the scuba diver to point at
an arbitrary object of interest and enabling the AUV to autonomously decide
which object the diver is pointing to, this method will permit more natural
interaction between AUVs and human scuba divers in underwater-human robot
collaborative tasks.
Comment: Under review, International Conference on Robotics and Automation 202
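The depth recovery underlying this kind of stereo keypoint triangulation follows the standard rectified-stereo relation Z = f·b/d, where d is the horizontal disparity between matched left/right keypoints. A minimal sketch, with intrinsics and pixel coordinates invented for illustration (not DIP-3D's calibration):

```python
import numpy as np

def triangulate(pl, pr, f, baseline, cx, cy):
    """Recover a 3D point in the camera frame from a matched keypoint pair
    (pl, pr) of a rectified stereo rig: depth Z = f * b / disparity,
    then back-project through the left camera's pinhole model."""
    disparity = pl[0] - pr[0]          # horizontal shift in pixels
    Z = f * baseline / disparity       # depth in meters
    X = (pl[0] - cx) * Z / f
    Y = (pl[1] - cy) * Z / f
    return np.array([X, Y, Z])

# Hypothetical rig: f = 700 px, baseline = 0.12 m, principal point (320, 240).
p = triangulate((340.0, 240.0), (330.0, 240.0), 700.0, 0.12, 320.0, 240.0)
```

Triangulating sparse pose keypoints this way (e.g., wrist and fingertip) yields a 3D pointing ray, which is what lets distance disambiguate between multiple objects in the image plane.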
Underwater Image Super-Resolution using Deep Residual Multipliers
We present a deep residual network-based generative model for single image
super-resolution (SISR) of underwater imagery for use by autonomous underwater
robots. We also provide an adversarial training pipeline for learning SISR from
paired data. In order to supervise the training, we formulate an objective
function that evaluates the \textit{perceptual quality} of an image based on
its global content, color, and local style information. Additionally, we
present USR-248, a large-scale dataset of three sets of underwater images of
'high' (640x480) and 'low' (80x60, 160x120, and 320x240) spatial resolution.
USR-248 contains paired instances for supervised training of 2x, 4x, or 8x SISR
models. Furthermore, we validate the effectiveness of our proposed model
through qualitative and quantitative experiments and compare the results with
several state-of-the-art models' performances. We also analyze its practical
feasibility for applications such as scene understanding and attention modeling
in noisy visual conditions.
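The objective described above (global content, color, and local style terms) can be caricatured as a weighted composite loss. The weights below, and the use of raw image channels in place of deep features for the style term, are assumptions made for this sketch, not the paper's formulation:

```python
import numpy as np

def composite_loss(sr, hr, w_content=1.0, w_color=0.5, w_style=0.25):
    """Weighted sum standing in for a perceptual SISR objective:
    content = pixel-wise MSE (global content),
    color   = squared difference of per-channel means,
    style   = Gram-matrix difference (raw channels as a stand-in
              for deep feature maps)."""
    content = np.mean((sr - hr) ** 2)
    color = np.mean((sr.mean(axis=(0, 1)) - hr.mean(axis=(0, 1))) ** 2)

    def gram(x):
        flat = x.reshape(-1, x.shape[-1])           # (pixels, channels)
        return flat.T @ flat / flat.shape[0]        # channel correlations

    style = np.mean((gram(sr) - gram(hr)) ** 2)
    return w_content * content + w_color * color + w_style * style

hr = np.zeros((8, 8, 3))
sr = np.ones((8, 8, 3))
perfect = composite_loss(hr, hr)   # identical images -> zero loss
bad = composite_loss(sr, hr)
```

In the adversarial pipeline, a term like this supervises the generator alongside the discriminator's realism signal.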
Integrating Affective Expressions into Robot-Assisted Search and Rescue to Improve Human-Robot Communication
Unexplained or ambiguous behaviours of rescue robots can lead to inefficient collaboration between humans and robots in robot-assisted search and rescue (SAR) teams. To date, rescue robots lack the ability to interact with humans on a social level, which is believed to be an essential ability for improving the quality of interactions. This thesis research proposes to bring affective robot expressions into the SAR context to provide rescue robots with social capabilities.
The first experiment, presented in Chapter 3, investigates whether there is consensus in mapping emotions to messages/situations in Urban Search and Rescue (USAR) scenarios, where the efficiency and effectiveness of interactions are crucial to success. We studied mappings between 10 specific messages, presented in two different communication styles, reflecting common situations that might happen during search and rescue missions, and the emotions exhibited by robots in those situations. The data were obtained through a Mechanical Turk study with 78 participants. The findings support the feasibility of using emotions as an additional communication channel to improve multi-modal human-robot interaction for urban search and rescue robots, and suggest that these mappings are robust, i.e., not affected by the robot’s communication style.
The second experiment was also conducted on Amazon Mechanical Turk, with 223 participants. We used Affect Control Theory (ACT) as a method for deriving the mappings between situations and emotions (similar to those in the first experiment) and as an alternative way of obtaining mappings that can be adjusted for different emotion sets (Chapter 4). The results suggested that the choice of emotions for a robot to show in different situations was consistent across the two methods used in the first and second experiments, indicating the feasibility of using emotions as an additional modality in SAR robots.
After validating the feasibility of bringing emotions into the SAR context based on the findings from the first two experiments, we created affective expressions based on the Evaluation, Potency, and Activity (EPA) dimensions of ACT with the help of LED lights on a rescue robot called Husky. We evaluated the effect of emotions on rescue workers’ situational awareness through an online Amazon Mechanical Turk study with 151 participants (Chapter 5). Findings indicated that participants who saw Husky with affective expressions (conveyed through lights) perceived the situation in the disaster scene more accurately than participants who saw videos of the Husky robot without any affective lights. In other words, Husky with affective lights improved participants’ situational awareness.