1,943 research outputs found

    A real-time human-robot interaction system based on gestures for assistive scenarios

    Get PDF
    Natural and intuitive human interaction with robotic systems is a key point to develop robots assisting people in an easy and effective way. In this paper, a Human Robot Interaction (HRI) system able to recognize gestures usually employed in human non-verbal communication is introduced, and an in-depth study of its usability is performed. The system deals with dynamic gestures such as waving or nodding which are recognized using a Dynamic Time Warping approach based on gesture specific features computed from depth maps. A static gesture consisting in pointing at an object is also recognized. The pointed location is then estimated in order to detect candidate objects the user may refer to. When the pointed object is unclear for the robot, a disambiguation procedure by means of either a verbal or gestural dialogue is performed. This skill would lead to the robot picking an object in behalf of the user, which could present difficulties to do it by itself. The overall system — which is composed by a NAO and Wifibot robots, a KinectTM v2 sensor and two laptops — is firstly evaluated in a structured lab setup. Then, a broad set of user tests has been completed, which allows to assess correct performance in terms of recognition rates, easiness of use and response times.Postprint (author's final draft

    Tracking and modeling focus of attention in meetings [online]

    Get PDF
    Abstract This thesis addresses the problem of tracking the focus of attention of people. In particular, a system to track the focus of attention of participants in meetings is developed. Obtaining knowledge about a person\u27s focus of attention is an important step towards a better understanding of what people do, how and with what or whom they interact or to what they refer. In meetings, focus of attention can be used to disambiguate the addressees of speech acts, to analyze interaction and for indexing of meeting transcripts. Tracking a user\u27s focus of attention also greatly contributes to the improvement of human­computer interfaces since it can be used to build interfaces and environments that become aware of what the user is paying attention to or with what or whom he is interacting. The direction in which people look; i.e., their gaze, is closely related to their focus of attention. In this thesis, we estimate a subject\u27s focus of attention based on his or her head orientation. While the direction in which someone looks is determined by head orientation and eye gaze, relevant literature suggests that head orientation alone is a su#cient cue for the detection of someone\u27s direction of attention during social interaction. We present experimental results from a user study and from several recorded meetings that support this hypothesis. We have developed a Bayesian approach to model at whom or what someone is look­ ing based on his or her head orientation. To estimate head orientations in meetings, the participants\u27 faces are automatically tracked in the view of a panoramic camera and neural networks are used to estimate their head orientations from pre­processed images of their faces. Using this approach, the focus of attention target of subjects could be correctly identified during 73% of the time in a number of evaluation meet­ ings with four participants. In addition, we have investigated whether a person\u27s focus of attention can be pre­dicted from other cues. Our results show that focus of attention is correlated to who is speaking in a meeting and that it is possible to predict a person\u27s focus of attention based on the information of who is talking or was talking before a given moment. We have trained neural networks to predict at whom a person is looking, based on information about who was speaking. Using this approach we were able to predict who is looking at whom with 63% accuracy on the evaluation meetings using only information about who was speaking. We show that by using both head orientation and speaker information to estimate a person\u27s focus, the accuracy of focus detection can be improved compared to just using one of the modalities for focus estimation. To demonstrate the generality of our approach, we have built a prototype system to demonstrate focus­aware interaction with a household robot and other smart appliances in a room using the developed components for focus of attention tracking. In the demonstration environment, a subject could interact with a simulated household robot, a speech­enabled VCR or with other people in the room, and the recipient of the subject\u27s speech was disambiguated based on the user\u27s direction of attention. Zusammenfassung Die vorliegende Arbeit beschäftigt sich mit der automatischen Bestimmung und Ver­folgung des Aufmerksamkeitsfokus von Personen in Besprechungen. Die Bestimmung des Aufmerksamkeitsfokus von Personen ist zum Verständnis und zur automatischen Auswertung von Besprechungsprotokollen sehr wichtig. So kann damit beispielsweise herausgefunden werden, wer zu einem bestimmten Zeitpunkt wen angesprochen hat beziehungsweise wer wem zugehört hat. Die automatische Bestim­mung des Aufmerksamkeitsfokus kann desweiteren zur Verbesserung von Mensch-Maschine­Schnittstellen benutzt werden. Ein wichtiger Hinweis auf die Richtung, in welche eine Person ihre Aufmerksamkeit richtet, ist die Kopfstellung der Person. Daher wurde ein Verfahren zur Bestimmung der Kopfstellungen von Personen entwickelt. Hierzu wurden künstliche neuronale Netze benutzt, welche als Eingaben vorverarbeitete Bilder des Kopfes einer Person erhalten, und als Ausgabe eine Schätzung der Kopfstellung berechnen. Mit den trainierten Netzen wurde auf Bilddaten neuer Personen, also Personen, deren Bilder nicht in der Trainingsmenge enthalten waren, ein mittlerer Fehler von neun bis zehn Grad für die Bestimmung der horizontalen und vertikalen Kopfstellung erreicht. Desweiteren wird ein probabilistischer Ansatz zur Bestimmung von Aufmerksamkeits­zielen vorgestellt. Es wird hierbei ein Bayes\u27scher Ansatzes verwendet um die A­posterior iWahrscheinlichkeiten verschiedener Aufmerksamkteitsziele, gegeben beobachteter Kopfstellungen einer Person, zu bestimmen. Die entwickelten Ansätze wurden auf mehren Besprechungen mit vier bis fünf Teilnehmern evaluiert. Ein weiterer Beitrag dieser Arbeit ist die Untersuchung, inwieweit sich die Blickrich­tung der Besprechungsteilnehmer basierend darauf, wer gerade spricht, vorhersagen läßt. Es wurde ein Verfahren entwickelt um mit Hilfe von neuronalen Netzen den Fokus einer Person basierend auf einer kurzen Historie der Sprecherkonstellationen zu schätzen. Wir zeigen, dass durch Kombination der bildbasierten und der sprecherbasierten Schätzung des Aufmerksamkeitsfokus eine deutliche verbesserte Schätzung erreicht werden kann. Insgesamt wurde mit dieser Arbeit erstmals ein System vorgestellt um automatisch die Aufmerksamkeit von Personen in einem Besprechungsraum zu verfolgen. Die entwickelten Ansätze und Methoden können auch zur Bestimmung der Aufmerk­samkeit von Personen in anderen Bereichen, insbesondere zur Steuerung von comput­erisierten, interaktiven Umgebungen, verwendet werden. Dies wird an einer Beispielapplikation gezeigt

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]

    Get PDF
    No abstract available

    Robust 3D IMU-LIDAR Calibration and Multi Sensor Probabilistic State Estimation

    Get PDF
    Autonomous robots are highly complex systems. In order to operate in dynamic environments, adaptability in their decision-making algorithms is a must. Thus, the internal and external information that robots obtain from sensors is critical to re-evaluate their decisions in real time. Accuracy is key in this endeavor, both from the hardware side and the modeling point of view. In order to guarantee the highest performance, sensors need to be correctly calibrated. To this end, some parameters are tuned so that the particular realization of a sensor best matches a generalized mathematical model. This step grows in complexity with the integration of multiple sensors, which is generally a requirement in order to cope with the dynamic nature of real world applications. This project aims to deal with the calibration of an inertial measurement unit, or IMU, and a Light Detection and Ranging device, or LiDAR. An offline batch optimization procedure is proposed to optimally estimate the intrinsic and extrinsic parameters of the model. Then, an online state estimation module that makes use of the aforementioned parameters and the fusion of LiDAR-inertial data for local navigation is proposed. Additionally, it incorporates real time corrections to account for the time-varying nature of the model, essential to deal with exposure to continued operation and wear and tear. Keywords: sensor fusion, multi-sensor calibration, factor graphs, batch optimization, Gaussian Processes, state estimation, LiDAR-inertial odometry, Error State Kalman Filter, Normal Distributions Transform

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    Get PDF
    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-opera- tive morphology and motion of soft-tissues. This information is prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilites by observ- ing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted in- struments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D opti- cal imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions

    Grasping, Perching, And Visual Servoing For Micro Aerial Vehicles

    Get PDF
    Micro Aerial Vehicles (MAVs) have seen a dramatic growth in the consumer market because of their ability to provide new vantage points for aerial photography and videography. However, there is little consideration for physical interaction with the environment surrounding them. Onboard manipulators are absent, and onboard perception, if existent, is used to avoid obstacles and maintain a minimum distance from them. There are many applications, however, which would benefit greatly from aerial manipulation or flight in close proximity to structures. This work is focused on facilitating these types of close interactions between quadrotors and surrounding objects. We first explore high-speed grasping, enabling a quadrotor to quickly grasp an object while moving at a high relative velocity. Next, we discuss planning and control strategies, empowering a quadrotor to perch on vertical surfaces using a downward-facing gripper. Then, we demonstrate that such interactions can be achieved using only onboard sensors by incorporating vision-based control and vision-based planning. In particular, we show how a quadrotor can use a single camera and an Inertial Measurement Unit (IMU) to perch on a cylinder. Finally, we generalize our approach to consider objects in motion, and we present relative pose estimation and planning, enabling tracking of a moving sphere using only an onboard camera and IMU

    Comparison of interaction modalities for mobile indoor robot guidance : direct physical interaction, person following, and pointing control

    Get PDF
    © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other worksThree advanced natural interaction modalities for mobile robot guidance in an indoor environment were developed and compared using two tasks and quantitative metrics to measure performance and workload. The first interaction modality is based on direct physical interaction requiring the human user to push the robot in order to displace it. The second and third interaction modalities exploit a 3-D vision-based human-skeleton tracking allowing the user to guide the robot by either walking in front of it or by pointing toward a desired location. In the first task, the participants were asked to guide the robot between different rooms in a simulated physical apartment requiring rough movement of the robot through designated areas. The second task evaluated robot guidance in the same environment through a set of waypoints, which required accurate movements. The three interaction modalities were implemented on a generic differential drive mobile platform equipped with a pan-tilt system and a Kinect camera. Task completion time and accuracy were used as metrics to assess the users’ performance, while the NASA-TLX questionnaire was used to evaluate the users’ workload. A study with 24 participants indicated that choice of interaction modality had significant effect on completion time (F(2,61)=84.874, p<0.001), accuracy (F(2,29)=4.937, p=0.016), and workload (F(2,68)=11.948, p<0.001). The direct physical interaction required less time, provided more accuracy and less workload than the two contactless interaction modalities. Between the two contactless interaction modalities, the person-following interaction mod- lity was systematically better than the pointing-control one: The participants completed the tasks faster with less workloadPeer ReviewedPostprint (author's final draft
    • …
    corecore