37,054 research outputs found

    A real-time human-robot interaction system based on gestures for assistive scenarios

    Get PDF
    Natural and intuitive human interaction with robotic systems is a key point to develop robots assisting people in an easy and effective way. In this paper, a Human Robot Interaction (HRI) system able to recognize gestures usually employed in human non-verbal communication is introduced, and an in-depth study of its usability is performed. The system deals with dynamic gestures such as waving or nodding which are recognized using a Dynamic Time Warping approach based on gesture specific features computed from depth maps. A static gesture consisting in pointing at an object is also recognized. The pointed location is then estimated in order to detect candidate objects the user may refer to. When the pointed object is unclear for the robot, a disambiguation procedure by means of either a verbal or gestural dialogue is performed. This skill would lead to the robot picking an object in behalf of the user, which could present difficulties to do it by itself. The overall system — which is composed by a NAO and Wifibot robots, a KinectTM v2 sensor and two laptops — is firstly evaluated in a structured lab setup. Then, a broad set of user tests has been completed, which allows to assess correct performance in terms of recognition rates, easiness of use and response times.Postprint (author's final draft

    Robust Spoken Language Understanding for House Service Robots

    Get PDF
    Service robotics has been growing significantly in thelast years, leading to several research results and to a numberof consumer products. One of the essential features of theserobotic platforms is represented by the ability of interactingwith users through natural language. Spoken commands canbe processed by a Spoken Language Understanding chain, inorder to obtain the desired behavior of the robot. The entrypoint of such a process is represented by an Automatic SpeechRecognition (ASR) module, that provides a list of transcriptionsfor a given spoken utterance. Although several well-performingASR engines are available off-the-shelf, they operate in a generalpurpose setting. Hence, they may be not well suited in therecognition of utterances given to robots in specific domains. Inthis work, we propose a practical yet robust strategy to re-ranklists of transcriptions. This approach improves the quality of ASRsystems in situated scenarios, i.e., the transcription of roboticcommands. The proposed method relies upon evidences derivedby a semantic grammar with semantic actions, designed tomodel typical commands expressed in scenarios that are specificto human service robotics. The outcomes obtained throughan experimental evaluation show that the approach is able toeffectively outperform the ASR baseline, obtained by selectingthe first transcription suggested by the AS

    Interactive Robot Learning of Gestures, Language and Affordances

    Full text link
    A growing field in robotics and Artificial Intelligence (AI) research is human-robot collaboration, whose target is to enable effective teamwork between humans and robots. However, in many situations human teams are still superior to human-robot teams, primarily because human teams can easily agree on a common goal with language, and the individual members observe each other effectively, leveraging their shared motor repertoire and sensorimotor resources. This paper shows that for cognitive robots it is possible, and indeed fruitful, to combine knowledge acquired from interacting with elements of the environment (affordance exploration) with the probabilistic observation of another agent's actions. We propose a model that unites (i) learning robot affordances and word descriptions with (ii) statistical recognition of human gestures with vision sensors. We discuss theoretical motivations, possible implementations, and we show initial results which highlight that, after having acquired knowledge of its surrounding environment, a humanoid robot can generalize this knowledge to the case when it observes another agent (human partner) performing the same motor actions previously executed during training.Comment: code available at https://github.com/gsaponaro/glu-gesture

    Do (and say) as I say: Linguistic adaptation in human-computer dialogs

    Get PDF
    © Theodora Koulouri, Stanislao Lauria, and Robert D. Macredie. This article has been made available through the Brunel Open Access Publishing Fund.There is strong research evidence showing that people naturally align to each other’s vocabulary, sentence structure, and acoustic features in dialog, yet little is known about how the alignment mechanism operates in the interaction between users and computer systems let alone how it may be exploited to improve the efficiency of the interaction. This article provides an account of lexical alignment in human–computer dialogs, based on empirical data collected in a simulated human–computer interaction scenario. The results indicate that alignment is present, resulting in the gradual reduction and stabilization of the vocabulary-in-use, and that it is also reciprocal. Further, the results suggest that when system and user errors occur, the development of alignment is temporarily disrupted and users tend to introduce novel words to the dialog. The results also indicate that alignment in human–computer interaction may have a strong strategic component and is used as a resource to compensate for less optimal (visually impoverished) interaction conditions. Moreover, lower alignment is associated with less successful interaction, as measured by user perceptions. The article distills the results of the study into design recommendations for human–computer dialog systems and uses them to outline a model of dialog management that supports and exploits alignment through mechanisms for in-use adaptation of the system’s grammar and lexicon

    A bank of unscented Kalman filters for multimodal human perception with mobile service robots

    Get PDF
    A new generation of mobile service robots could be ready soon to operate in human environments if they can robustly estimate position and identity of surrounding people. Researchers in this field face a number of challenging problems, among which sensor uncertainties and real-time constraints. In this paper, we propose a novel and efficient solution for simultaneous tracking and recognition of people within the observation range of a mobile robot. Multisensor techniques for legs and face detection are fused in a robust probabilistic framework to height, clothes and face recognition algorithms. The system is based on an efficient bank of Unscented Kalman Filters that keeps a multi-hypothesis estimate of the person being tracked, including the case where the latter is unknown to the robot. Several experiments with real mobile robots are presented to validate the proposed approach. They show that our solutions can improve the robot's perception and recognition of humans, providing a useful contribution for the future application of service robotics

    Single camera pose estimation using Bayesian filtering and Kinect motion priors

    Full text link
    Traditional approaches to upper body pose estimation using monocular vision rely on complex body models and a large variety of geometric constraints. We argue that this is not ideal and somewhat inelegant as it results in large processing burdens, and instead attempt to incorporate these constraints through priors obtained directly from training data. A prior distribution covering the probability of a human pose occurring is used to incorporate likely human poses. This distribution is obtained offline, by fitting a Gaussian mixture model to a large dataset of recorded human body poses, tracked using a Kinect sensor. We combine this prior information with a random walk transition model to obtain an upper body model, suitable for use within a recursive Bayesian filtering framework. Our model can be viewed as a mixture of discrete Ornstein-Uhlenbeck processes, in that states behave as random walks, but drift towards a set of typically observed poses. This model is combined with measurements of the human head and hand positions, using recursive Bayesian estimation to incorporate temporal information. Measurements are obtained using face detection and a simple skin colour hand detector, trained using the detected face. The suggested model is designed with analytical tractability in mind and we show that the pose tracking can be Rao-Blackwellised using the mixture Kalman filter, allowing for computational efficiency while still incorporating bio-mechanical properties of the upper body. In addition, the use of the proposed upper body model allows reliable three-dimensional pose estimates to be obtained indirectly for a number of joints that are often difficult to detect using traditional object recognition strategies. Comparisons with Kinect sensor results and the state of the art in 2D pose estimation highlight the efficacy of the proposed approach.Comment: 25 pages, Technical report, related to Burke and Lasenby, AMDO 2014 conference paper. Code sample: https://github.com/mgb45/SignerBodyPose Video: https://www.youtube.com/watch?v=dJMTSo7-uF
    corecore