
    Feature extraction using MPEG-CDVS and Deep Learning with application to robotic navigation and image classification

    The main contributions of this thesis are the evaluation of the MPEG Compact Descriptor for Visual Search (CDVS) in the context of indoor robotic navigation and the introduction of a new method for training Convolutional Neural Networks, with applications to object classification. The choice of image descriptor in a visual navigation system is not straightforward. Visual descriptors must be distinctive enough to allow correct localisation while still offering low matching complexity and a short descriptor size for real-time applications. MPEG CDVS is a low-complexity image descriptor that offers several levels of compromise between descriptor distinctiveness and size. In this work, we describe how these trade-offs can be used for efficient loop detection in a typical indoor environment. We first describe a probabilistic approach to loop detection based on the standard's suggested similarity metric. We then evaluate the performance of the CDVS compression modes in terms of matching speed, feature extraction, and storage requirements, and compare them with the state-of-the-art SIFT descriptor for five different types of indoor floors. In the second part of this thesis we focus on the new paradigm in machine learning and computer vision called Deep Learning. Under this paradigm, visual features are no longer extracted using fine-grained, highly engineered feature extractors, but rather using a Convolutional Neural Network (CNN) that extracts hierarchical features learned directly from data, at the cost of long training periods. In this context, we propose a method for speeding up the training of CNNs by exploiting the spatial scaling property of convolutions. This is done by first pre-training a CNN with kernels of smaller resolution for a few epochs, then rescaling its kernels to the target's original dimensions and continuing training at full resolution. We show that the overall training time of a target CNN architecture can be reduced by exploiting the spatial scaling property of convolutions during the early stages of learning. Moreover, by rescaling the kernels at different epochs, we identify a trade-off between total training time and maximum obtainable accuracy. Finally, we propose a method for choosing when to rescale kernels and evaluate our approach on recent architectures, showing savings in training time of nearly 20% while test set accuracy is preserved.
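    The kernel-rescaling idea lends itself to a short illustration. The sketch below upsamples the weights of a convolutional layer trained at a small kernel resolution to a larger kernel size so that training can continue at full resolution; PyTorch, bilinear interpolation and the 3x3-to-5x5 sizes are assumptions made for illustration, not the thesis's exact procedure.

```python
# Minimal sketch (assumptions: PyTorch, bilinear interpolation, 3x3 -> 5x5;
# the thesis does not prescribe these choices here).
import torch
import torch.nn as nn
import torch.nn.functional as F

def rescale_conv_kernels(src: nn.Conv2d, target_kernel_size: int) -> nn.Conv2d:
    """Copy a conv layer trained with small kernels into a layer with larger
    kernels by spatially interpolating its weights."""
    dst = nn.Conv2d(src.in_channels, src.out_channels,
                    kernel_size=target_kernel_size,
                    stride=src.stride,
                    padding=target_kernel_size // 2,
                    bias=src.bias is not None)
    with torch.no_grad():
        # Weights are (out_ch, in_ch, kh, kw); interpolate the two spatial dims.
        dst.weight.copy_(F.interpolate(src.weight,
                                       size=(target_kernel_size, target_kernel_size),
                                       mode="bilinear", align_corners=False))
        if src.bias is not None:
            dst.bias.copy_(src.bias)
    return dst

# Pre-train with 3x3 kernels for a few epochs, then rescale to 5x5 and
# continue training at full resolution.
small = nn.Conv2d(3, 16, kernel_size=3, padding=1)
full = rescale_conv_kernels(small, target_kernel_size=5)
```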

    Control strategies for cleaning robots in domestic applications: A comprehensive review

    Service robots are built and developed for various applications to support humans as companions, caretakers, or domestic support. As the number of elderly people grows, service robots will be in increasing demand. In particular, one of the main tasks performed by elderly people, and others, is the complex task of cleaning. Therefore, cleaning tasks such as sweeping floors, washing dishes, and wiping windows have been addressed in the domestic environment using service robots or robot manipulators with several control approaches. This article is primarily focused on the control methodology used for cleaning tasks; specifically, it discusses classical control and learning-based control methods. The classical control approaches, which consist of position control, force control, and impedance control, are commonly used for cleaning purposes in highly controlled environments. However, classical control methods do not generalize to cluttered environments, so learning-based control methods can be an alternative solution. Learning-based control methods for cleaning tasks encompass three approaches: learning from demonstration (LfD), supervised learning (SL), and reinforcement learning (RL). These control approaches have their own capabilities to generalize cleaning tasks to new environments. For example, LfD, which many research groups have used for cleaning tasks, can generate complex cleaning trajectories based on human demonstrations. SL can support the prediction of dirt areas and cleaning motions using large data sets. Finally, RL allows the robot itself to learn cleaning actions and interact with a new environment. In this context, this article aims to provide a general overview of robotic cleaning tasks based on different types of control methods using manipulators. It also suggests future directions for cleaning tasks based on the evaluation of the control approaches.
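    As a point of reference for the classical approaches surveyed above, the snippet below sketches a Cartesian impedance law of the spring-damper form commonly used for contact-rich cleaning motions such as wiping; the gains, dimensions and numbers are illustrative assumptions rather than values taken from any reviewed controller.

```python
# Sketch of a Cartesian impedance law (spring-damper around a desired wiping
# trajectory); all gains and numbers are illustrative assumptions.
import numpy as np

def impedance_force(x, x_dot, x_des, x_des_dot, K, D):
    """Commanded end-effector force so the tool complies with the surface."""
    return K @ (x_des - x) + D @ (x_des_dot - x_dot)

K = np.diag([400.0, 400.0, 150.0])  # stiffer in the plane, softer along the normal
D = np.diag([40.0, 40.0, 25.0])     # damping gains
f_cmd = impedance_force(x=np.array([0.30, 0.00, 0.12]),
                        x_dot=np.zeros(3),
                        x_des=np.array([0.32, 0.00, 0.10]),
                        x_des_dot=np.array([0.05, 0.00, 0.00]),
                        K=K, D=D)
print(f_cmd)  # force pulling the tool toward the desired contact point
```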

    Learning Algorithm Design for Human-Robot Skill Transfer

    In this research, we develop an intelligent learning scheme for performing human-robot skill transfer. Techniques adopted in the scheme include the Dynamic Movement Primitive (DMP) method with Dynamic Time Warping (DTW), the Gaussian Mixture Model (GMM) with Gaussian Mixture Regression (GMR), and Radial Basis Function Neural Networks (RBFNNs). A series of experiments is conducted on a Baxter robot, a NAO robot and a KUKA iiwa robot to verify the effectiveness of the proposed design.

    During the design of the intelligent learning scheme, an online tracking system is developed to control the arm and head movement of the NAO robot using a Kinect sensor. The NAO robot is a humanoid robot with 5 degrees of freedom (DOF) for each arm. The joint motions of the operator's head and arm are captured by a Kinect V2 sensor, and this information is then transferred into the workspace via the forward and inverse kinematics. In addition, to improve the tracking performance, a Kalman filter is employed to fuse motion signals from the operator sensed by the Kinect V2 sensor and a pair of MYO armbands, so as to teleoperate the Baxter robot. In this regard, a new strategy is developed using a vector approach to accomplish a specific motion capture task. For instance, the arm motion of the operator is captured by a Kinect sensor and programmed through processing software. Two MYO armbands with embedded inertial measurement units are worn by the operator to aid the robots in detecting and replicating the operator's arm movements; the armbands help to recognize and calculate the precise velocity of the operator's arm motion. Additionally, a neural-network-based adaptive controller is designed and implemented on the Baxter robot to validate its teleoperation.

    Subsequently, an enhanced teaching interface has been developed for the robot using DMP and GMR. Motion signals are collected from a human demonstrator via the Kinect V2 sensor, and the data is sent to a remote PC for teleoperating the Baxter robot. At this stage, the DMP is utilized to model and generalize the movements. In order to learn from multiple demonstrations, DTW is used for preprocessing the data recorded on the robot platform, and GMM is employed for the evaluation of DMP to generate multiple patterns after the completion of the teaching process. Next, we apply the GMR algorithm to generate a synthesized trajectory that minimizes position errors in three-dimensional (3D) space. This approach has been tested by performing tasks on a KUKA iiwa and a Baxter robot, respectively.

    Finally, an optimized DMP is added to the teaching interface. A character recombination technology based on DMP segmentation, driven by verbal commands, has also been developed and incorporated in the Baxter robot platform. To imitate the recorded motion signals produced by the demonstrator, the operator trains the Baxter robot by physically guiding it to complete the given task. This is repeated five times, and the generated training data set is utilized via the playback system. Subsequently, DTW is employed to preprocess the experimental data. For modelling and overall movement control, DMP is chosen. The GMM is used to generate multiple patterns after implementing the teaching process. Next, we employ the GMR algorithm to reduce position errors in 3D space after a synthesized trajectory has been generated.
    The Baxter robot, remotely controlled via the user datagram protocol (UDP) from a PC, records and reproduces every trajectory. Additionally, Dragon NaturallySpeaking software is adopted to transcribe the voice data. The proposed approach has been verified by enabling the Baxter robot to perform a writing task in which the robot is taught to write a single character.
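    For readers unfamiliar with DMPs, the following sketch rolls out a one-dimensional discrete DMP (canonical phase, spring-damper transformation system, Gaussian-basis forcing term). It is a minimal illustration of the building block named above, not the thesis's implementation, which additionally couples DMP with DTW, GMM/GMR and RBFNNs on real robots; all constants below are assumed.

```python
# One-dimensional discrete DMP rollout; constants and basis layout are
# illustrative assumptions, not the thesis's configuration.
import numpy as np

def dmp_rollout(y0, goal, weights, centers, widths,
                alpha_z=25.0, beta_z=6.25, alpha_x=8.0, tau=1.0, dt=0.01):
    """Integrate tau*dz = alpha_z*(beta_z*(g - y) - z) + f(x), tau*dy = z,
    with canonical phase tau*dx = -alpha_x*x and a Gaussian-basis forcing term."""
    y, z, x = y0, 0.0, 1.0
    traj = [y]
    for _ in range(int(1.0 / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)          # basis activations
        f = x * (goal - y0) * (psi @ weights) / (psi.sum() + 1e-10)
        z += dt / tau * (alpha_z * (beta_z * (goal - y) - z) + f)
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)
        traj.append(y)
    return np.array(traj)

# With zero forcing weights the rollout is a plain point-to-point reach to the
# goal; fitting the weights to demonstrations (e.g. via GMR) shapes the motion.
path = dmp_rollout(y0=0.0, goal=1.0,
                   weights=np.zeros(10),
                   centers=np.linspace(0.0, 1.0, 10),
                   widths=np.full(10, 25.0))
```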

    Spatial representation for planning and executing robot behaviors in complex environments

    Robots are already improving our well-being and productivity in different applications such as industry, health care and indoor service applications. However, we are still far from developing (and releasing) a fully functional robotic agent that can autonomously survive in tasks that require human-level cognitive capabilities. Robotic systems on the market, in fact, are designed to address specific applications, and can only run pre-defined behaviors to robustly repeat a few tasks (e.g., assembling object parts, vacuum cleaning). Their internal representation of the world is usually constrained to the task they are performing, and does not allow for generalization to other scenarios. Unfortunately, such a paradigm only applies to a very limited set of domains, where the environment can be assumed to be static and its dynamics can be handled before deployment. Additionally, robots configured in this way will eventually fail if their "handcrafted" representation of the environment does not match the external world. Hence, to enable more sophisticated cognitive skills, we investigate how to design robots that properly represent the environment and behave accordingly. To this end, we formalize a representation of the environment that enhances the robot's spatial knowledge to explicitly include a representation of its own actions. Spatial knowledge constitutes the core of the robot's understanding of the environment; however, it is not sufficient to represent what the robot is capable of doing in it. To overcome this limitation, we formalize SK4R, a spatial knowledge representation for robots that enhances spatial knowledge with a novel, "functional" point of view that explicitly models robot actions. To this end, we exploit the concept of affordances, introduced to express the opportunities (actions) that objects offer to an agent. To encode affordances within SK4R, we define an "affordance semantics" of actions that is used to annotate an environment and to represent to which extent robot actions support goal-oriented behaviors. We demonstrate the benefits of a functional representation of the environment in multiple robotic scenarios that traverse and contribute to different research topics relating to robot knowledge representations, social robotics, multi-robot systems, and robot learning and planning. We show how a domain-specific representation that explicitly encodes affordance semantics provides the robot with a more concrete understanding of the environment and of the effects that its actions have on it. The goal of our work is to design an agent that will no longer execute an action as a mere pre-defined routine; rather, it will execute an action because it "knows" that the resulting state leads one step closer to success in its task.
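    As a purely illustrative aside, the fragment below shows one way an environment element could be annotated with an affordance-style semantics and queried during planning; the structure, names, scores and threshold are assumptions for illustration and do not reproduce the SK4R formalism defined in the thesis.

```python
# Hypothetical affordance annotation of a spatial element; names, scores and
# threshold are illustrative only (not the SK4R formalism).
from dataclasses import dataclass, field

@dataclass
class SpatialElement:
    label: str                                        # e.g. "kitchen_table"
    affordances: dict = field(default_factory=dict)   # action -> support score in [0, 1]

    def supports(self, action: str, threshold: float = 0.5) -> bool:
        """Does this element support the action well enough to plan with it?"""
        return self.affordances.get(action, 0.0) >= threshold

table = SpatialElement("kitchen_table", {"place_object": 0.9, "vacuum_under": 0.3})
print(table.supports("place_object"))   # True
print(table.supports("vacuum_under"))   # False
```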

    Multimodal human hand motion sensing and analysis - a review


    Asynchronous Collaborative Autoscanning with Mode Switching for Multi-Robot Scene Reconstruction

    When conducting autonomous scanning for the online reconstruction of unknown indoor environments, robots have to be competent at exploring scene structure and reconstructing objects with high quality. Our key observation is that different tasks demand specialized scanning properties of robots: rapid movement and far vision for global exploration, and slow movement and narrow vision for local object reconstruction, which we refer to as two different scanning modes: explorer and reconstructor, respectively. When multiple robots are required to collaborate for efficient exploration and fine-grained reconstruction, the questions of when to generate those tasks and how to assign them must be carefully answered. Therefore, we propose a novel asynchronous collaborative autoscanning method with mode switching, which generates two kinds of scanning tasks with associated scanning modes, i.e., exploration tasks with explorer mode and reconstruction tasks with reconstructor mode, and assigns them to the robots to execute in an asynchronous collaborative manner, greatly boosting scanning efficiency and reconstruction quality. The task assignment is optimized by solving a modified Multi-Depot Multiple Traveling Salesman Problem (MDMTSP). Moreover, to further enhance collaboration and increase efficiency, we propose a task-flow model that activates the task generation and assignment process immediately when any robot finishes all of its tasks, with no need to wait for all other robots to complete the tasks assigned in the previous iteration. Extensive experiments have been conducted to show the importance of each key component of our method and its superiority over previous methods in scanning efficiency and reconstruction quality.
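    To make the assignment step concrete, the toy sketch below shows the shape of the problem (robots at depots, scanning tasks with positions, one ordered task list per robot) using a greedy nearest-task heuristic; the paper instead optimizes a modified MDMTSP, so this stand-in is an assumption for illustration only.

```python
# Toy stand-in for the assignment step: the paper solves a modified MDMTSP;
# this greedy nearest-task heuristic only illustrates the inputs and outputs.
import numpy as np

def greedy_assign(robot_positions, task_positions):
    """Repeatedly give each robot its nearest unassigned scanning task and
    return one ordered task list (route) per robot."""
    robots = np.asarray(robot_positions, dtype=float)
    tasks = [np.asarray(t, dtype=float) for t in task_positions]
    routes = [[] for _ in robots]
    unassigned = list(range(len(tasks)))
    while unassigned:
        for r in range(len(robots)):
            if not unassigned:
                break
            nearest = min(unassigned, key=lambda t: np.linalg.norm(tasks[t] - robots[r]))
            routes[r].append(nearest)
            robots[r] = tasks[nearest]   # the robot "moves" to the task it just took
            unassigned.remove(nearest)
    return routes

print(greedy_assign([(0, 0), (5, 5)], [(1, 0), (4, 5), (2, 2), (6, 6)]))
```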

    Vision-Guided Robot Hearing

    Natural human-robot interaction (HRI) in complex and unpredictable environments is important and has many potential applications. While vision-based HRI has been thoroughly investigated, robot hearing and audio-based HRI are emerging research topics in robotics. In typical real-world scenarios, humans are at some distance from the robot, and hence the sensory (microphone) data are strongly impaired by background noise, reverberations and competing auditory sources. In this context, the detection and localization of speakers plays a key role and enables several tasks, such as improving the signal-to-noise ratio for speech recognition, speaker recognition, speaker tracking, etc. In this paper we address the problem of how to detect and localize people that are both seen and heard. We introduce a hybrid deterministic/probabilistic model. The deterministic component allows us to map 3D visual data onto a 1D auditory space. The probabilistic component of the model enables the visual features to guide the grouping of the auditory features in order to form audiovisual (AV) objects. The proposed model and the associated algorithms are implemented in real time (17 FPS) using a stereoscopic camera pair and two microphones embedded in the head of the humanoid robot NAO. We perform experiments with (i) synthetic data, (ii) publicly available data gathered with an audiovisual robotic head, and (iii) data acquired using the NAO robot. The results validate the approach and encourage further investigation of how vision and hearing could be combined for robust HRI.
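    As a rough illustration of projecting visual information into an auditory cue space, the sketch below maps a known 3D source position to an interaural time difference for a two-microphone head; the geometry and the far-field simplification are illustrative assumptions and do not reproduce the paper's actual deterministic visual-to-auditory mapping.

```python
# Far-field illustration of mapping a 3D position (from vision) to a 1D
# auditory cue (interaural time difference); geometry is assumed, and this is
# not the paper's actual deterministic mapping.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def itd_from_position(source_xyz, mic_left, mic_right):
    """Time difference of arrival between the two microphones; positive when
    the source is closer to the left microphone."""
    src = np.asarray(source_xyz, dtype=float)
    d_left = np.linalg.norm(src - np.asarray(mic_left, dtype=float))
    d_right = np.linalg.norm(src - np.asarray(mic_right, dtype=float))
    return (d_right - d_left) / SPEED_OF_SOUND

# A speaker one metre in front and slightly to the left of a 12 cm microphone pair.
print(itd_from_position([-0.2, 1.0, 0.0], [-0.06, 0.0, 0.0], [0.06, 0.0, 0.0]))
```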

    Explainable shared control in assistive robotics

    Shared control plays a pivotal role in designing assistive robots to complement human capabilities during everyday tasks. However, traditional shared control relies on users forming an accurate mental model of the expected robot behaviour. Without this accurate mental image, users may encounter confusion or frustration whenever their actions do not elicit the intended system response, creating a misalignment between the respective internal models of the robot and the human. The Explainable Shared Control paradigm introduced in this thesis attempts to resolve such model misalignment by jointly considering assistance and transparency. Explainable Shared Control considers transparency from two perspectives: the human's and the robot's. Augmented reality is presented as an integral component that addresses the human viewpoint by visually unveiling the robot's internal mechanisms. The robot perspective, in turn, requires an awareness of human "intent", so a clustering framework built around a deep generative model is developed for human intention inference. Both transparency constructs are implemented atop a real assistive robotic wheelchair and tested with human users. An augmented reality headset is incorporated into the robotic wheelchair, and different interface options are evaluated across two user studies to explore their influence on mental model accuracy. Experimental results indicate that this setup facilitates transparent assistance by improving recovery times from adverse events associated with model misalignment. As for human intention inference, the clustering framework is applied to a dataset collected from users operating the robotic wheelchair. Findings from this experiment demonstrate that the learnt clusters are interpretable and meaningful representations of human intent. This thesis serves as a first step in the interdisciplinary area of Explainable Shared Control. The contributions to shared control, augmented reality and representation learning contained within this thesis are likely to help future research advance the proposed paradigm, and thus bolster the prevalence of assistive robots.