10 research outputs found

    Underwater Gesture Recognition Using Classical Computer Vision and Deep Learning Techniques

    Get PDF
    Underwater gesture recognition is a challenging task because conditions that are normally not an issue in gesture recognition on land, such as low visibility, low contrast, and unequal spectral propagation, must be taken into account. In this work, we explore the underwater gesture recognition problem using the recently released Cognitive Autonomous Diving Buddy (CADDY) Underwater Gestures dataset. The contributions of this paper are as follows: (1) we use traditional computer vision techniques together with classical machine learning to perform gesture recognition on the CADDY dataset; (2) we apply deep learning with a convolutional neural network (CNN) to solve the same problem; (3) we perform confusion matrix analysis to determine which types of gestures are relatively difficult to recognize and to understand why; (4) we compare the performance of the above methods in terms of accuracy and inference speed. We achieve up to 97.06% accuracy with our CNN. To the best of our knowledge, our work is one of the earliest attempts, if not the first, to apply computer vision and machine learning techniques for gesture recognition on this dataset. As such, we hope this work will serve as a benchmark for future work on the CADDY dataset.
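
    The abstract above reports a confusion-matrix analysis of the gesture classes. Below is a minimal sketch of what such an analysis step could look like; the predictions, labels, and class names are placeholders, not the paper's actual CADDY results or code.

```python
# Hedged sketch of a confusion-matrix analysis step (not the authors' code).
# y_true / y_pred / class_names are placeholders for the CADDY gesture labels.
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

def analyse_predictions(y_true, y_pred, class_names, top_k=5):
    """Print overall accuracy and the most frequently confused gesture pairs."""
    cm = confusion_matrix(y_true, y_pred)
    print(f"accuracy: {accuracy_score(y_true, y_pred):.4f}")
    off_diag = cm.copy()
    np.fill_diagonal(off_diag, 0)                      # keep only misclassifications
    flat = np.argsort(off_diag, axis=None)[::-1][:top_k]
    for true_idx, pred_idx in zip(*np.unravel_index(flat, cm.shape)):
        if off_diag[true_idx, pred_idx] > 0:
            print(f"{class_names[true_idx]} -> {class_names[pred_idx]}: "
                  f"{off_diag[true_idx, pred_idx]} errors")
```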

    Gesture Recognition Wristband Device with Optimised Piezoelectric Energy Harvesters

    Get PDF
    Wearable devices can be used to monitor vital human physiological signs and to interact with computers. Due to the limited lifetime of batteries, these devices require novel energy harvesting solutions to ensure uninterrupted, autonomous operation. We therefore developed a wearable wristband device with piezoelectric transducers that serve a hybrid function: they are used both for energy harvesting and for sensing. We demonstrate that gestures can be classified using the electricity these transducers generate in response to tendon movements around the wrist. In this paper, we also show how a multi-physics simulation model was used to maximize the amount of energy harvestable from the piezoelectric transducers.
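
    As an illustration of the sensing side described above, the sketch below extracts a few simple features from piezoelectric voltage traces and trains an off-the-shelf classifier. The feature set, classifier, and synthetic data are assumptions made for illustration, not the authors' pipeline.

```python
# Illustrative only: classify gestures from piezoelectric voltage traces.
# Features, classifier, and synthetic data are assumptions, not the paper's method.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def piezo_features(trace):
    """Summary statistics of one 1-D voltage trace from a wrist transducer."""
    return np.array([
        trace.max(), trace.min(), trace.std(),
        np.abs(trace).sum(),                            # rough energy proxy
        np.count_nonzero(np.diff(np.sign(trace))),      # zero crossings
    ])

# Placeholder data: 40 traces of 500 samples each, 4 hypothetical gesture classes.
rng = np.random.default_rng(0)
X = np.stack([piezo_features(rng.standard_normal(500)) for _ in range(40)])
y = rng.integers(0, 4, size=40)
clf = RandomForestClassifier(n_estimators=100).fit(X, y)
```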

    Diver Interest via Pointing in Three Dimensions: 3D Pointing Reconstruction for Diver-AUV Communication

    Full text link
    This paper presents Diver Interest via Pointing in Three Dimensions (DIP-3D), a method for relaying an object of interest from a diver to an autonomous underwater vehicle (AUV) by pointing, using three-dimensional distance information to discriminate between multiple objects in the AUV's camera image. Traditional dense stereo vision for underwater distance estimation is challenging because of the relative lack of salient scene features and degraded lighting conditions. Yet distance information is necessary for robotic perception of diver pointing when multiple objects appear within the robot's image plane. We circumvent the challenges of underwater distance estimation by using sparse reconstruction of keypoints to perform pose estimation on both the left and right images from the robot's stereo camera. Triangulated pose keypoints, together with a classical object detection method, enable DIP-3D to infer the location of an object of interest when multiple objects are in the AUV's field of view. By allowing the scuba diver to point at an arbitrary object of interest and enabling the AUV to autonomously decide which object the diver is pointing at, this method will permit more natural interaction between AUVs and human scuba divers in underwater human-robot collaborative tasks.
    Comment: Under review, International Conference on Robotics and Automation 202
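
    The core geometric step described above is triangulating sparse keypoints from a calibrated stereo pair to recover 3-D positions. The sketch below shows that step in isolation; the camera calibration, baseline, and keypoint correspondences are placeholder values, not those used in the paper.

```python
# Sketch of sparse stereo triangulation (illustrative; not the DIP-3D implementation).
# The projection matrices below assume a made-up calibration and a 12 cm baseline.
import cv2
import numpy as np

def triangulate_keypoints(P_left, P_right, pts_left, pts_right):
    """pts_*: 2xN arrays of matched pixel coordinates; returns an Nx3 array of 3-D points."""
    pts_h = cv2.triangulatePoints(P_left, P_right, pts_left, pts_right)  # 4xN homogeneous
    return (pts_h[:3] / pts_h[3]).T

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])                                   # assumed intrinsics
P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([np.eye(3), np.array([[-0.12], [0.0], [0.0]])])

# Matched keypoints (placeholder pixel coordinates), one point per column.
pts_left = np.array([[300.0, 350.0], [200.0, 220.0]])             # row 0: x, row 1: y
pts_right = np.array([[280.0, 330.0], [200.0, 220.0]])
xyz = triangulate_keypoints(P_left, P_right, pts_left, pts_right)
```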

    Underwater Image Super-Resolution using Deep Residual Multipliers

    Full text link
    We present a deep residual network-based generative model for single image super-resolution (SISR) of underwater imagery for use by autonomous underwater robots. We also provide an adversarial training pipeline for learning SISR from paired data. To supervise the training, we formulate an objective function that evaluates the perceptual quality of an image based on its global content, color, and local style information. Additionally, we present USR-248, a large-scale dataset containing three sets of underwater images at 'high' (640x480) and 'low' (80x60, 160x120, and 320x240) spatial resolutions. USR-248 contains paired instances for supervised training of 2x, 4x, or 8x SISR models. Furthermore, we validate the effectiveness of our proposed model through qualitative and quantitative experiments and compare its results with those of several state-of-the-art models. We also analyze its practical feasibility for applications such as scene understanding and attention modeling in noisy visual conditions.
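
    The objective described above combines global content, color, and local style terms. The sketch below shows one way such a composite loss could be assembled; the specific terms, the Gram-matrix style statistic, and the weights are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative composite SISR loss (assumed form, not the paper's exact objective).
import torch
import torch.nn.functional as F

def gram_matrix(x):
    """Per-image Gram matrix, a common cheap proxy for 'local style' statistics."""
    b, c, h, w = x.shape
    f = x.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def sisr_loss(pred, target, w_content=1.0, w_color=0.5, w_style=0.1):
    content = F.l1_loss(pred, target)                                     # global content
    color = F.mse_loss(pred.mean(dim=(2, 3)), target.mean(dim=(2, 3)))    # per-channel color
    style = F.mse_loss(gram_matrix(pred), gram_matrix(target))            # local style
    return w_content * content + w_color * color + w_style * style

# Example call with random tensors standing in for generated and ground-truth images.
loss = sisr_loss(torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256))
```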

    Integrating Affective Expressions into Robot-Assisted Search and Rescue to Improve Human-Robot Communication

    Get PDF
    Unexplained or ambiguous behaviours of rescue robots can lead to inefficient collaboration between humans and robots in robot-assisted search and rescue (SAR) teams. To date, rescue robots cannot interact with humans on a social level, an ability believed to be essential for improving the quality of interactions. This thesis research proposes to bring affective robot expressions into the SAR context to give rescue robots social capabilities.

    The first experiment, presented in Chapter 3, investigates whether there is consensus in mapping emotions to messages/situations in Urban Search and Rescue (USAR) scenarios, where the efficiency and effectiveness of interactions are crucial to success. We studied mappings between 10 specific messages, presented in two different communication styles and reflecting common situations that might happen during search and rescue missions, and the emotions exhibited by robots in those situations. The data were obtained through a Mechanical Turk study with 78 participants. The findings support the feasibility of using emotions as an additional communication channel to improve multi-modal human-robot interaction for urban search and rescue robots, and suggest that these mappings are robust, i.e., not affected by the robot's communication style.

    The second experiment was also conducted on Amazon Mechanical Turk, with 223 participants. We used Affect Control Theory (ACT) as a method for deriving mappings between situations and emotions (similar to those in the first experiment) and as an alternative way of obtaining mappings that can be adjusted for different emotion sets (Chapter 4). The results suggest that the emotions chosen for a robot to show in different situations are consistent between the two methods used in the first and second experiments, indicating the feasibility of using emotions as an additional modality in SAR robots.

    After validating the feasibility of bringing emotions into the SAR context based on the findings from the first two experiments, we created affective expressions based on the Evaluation, Potency and Activity (EPA) dimensions of ACT, conveyed through LED lights on a rescue robot called Husky. We evaluated the effect of these emotions on rescue workers' situational awareness through an online Amazon Mechanical Turk study with 151 participants (Chapter 5). Findings indicated that participants who saw Husky with affective expressions (conveyed through lights) perceived the situation in the disaster scene more accurately than participants who saw videos of the Husky robot without affective lights. In other words, Husky with affective lights improved participants' situational awareness.