
    Human Operator Tracking System for Safe Industrial Collaborative Robotics

    With the advent of the Industry 4.0 paradigm, manufacturing is shifting from mass production towards customisable production lines. While robots excel at reliably executing repeating tasks in a fast and precise manner, they lack the now desired versatility of humans. Human-robot collaboration (HRC) seeks to address this issue by allowing human operators to work together with robots in close proximity, leveraging the strengths of both agents to increase adaptability and productivity. Safety is critical to user acceptance and the success of collaborative robots (cobots) and is thus a focus of research. Typical approaches provide the cobot with information such as operator pose estimates or higher-level motion predictions to facilitate adaptive planning of trajectory or action. Therefore, locating the operator in the shared workspace is a key feature. This dissertation seeks to kickstart the development of a human operator tracking system that provides a three-dimensional pose estimate and, in turn, ensures safety. State-of-the-art methods for human pose estimation in two-dimensional RGB images are tested with a custom dataset and evaluated. The results are then analysed considering real-time capability in the use case of a single operator performing industrial assembly tasks in a collaborative robotic cell equipped with a robotic arm. The resulting observations enable future work such as the fusion of depth information.
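    As a rough illustration of the kind of real-time check described above, the sketch below times a 2D pose estimator on recorded frames. MediaPipe Pose stands in here only as a placeholder for the evaluated state-of-the-art methods, and the video path is a hypothetical recording of the collaborative cell.

```python
# Minimal sketch: per-frame latency of a 2D pose estimator on a recorded video,
# the kind of measurement needed to judge real-time suitability.
import time

import cv2
import mediapipe as mp

pose = mp.solutions.pose.Pose(static_image_mode=False, model_complexity=1)
cap = cv2.VideoCapture("assembly_task.mp4")  # hypothetical recording of the cell

latencies = []
while True:
    ok, frame_bgr = cap.read()
    if not ok:
        break
    frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    t0 = time.perf_counter()
    _ = pose.process(frame_rgb)  # 2D keypoint inference (landmarks unused here)
    latencies.append(time.perf_counter() - t0)

cap.release()
pose.close()
if latencies:
    mean_ms = 1000 * sum(latencies) / len(latencies)
    fps = len(latencies) / sum(latencies)
    print(f"mean latency: {mean_ms:.1f} ms (~{fps:.1f} FPS over {len(latencies)} frames)")
```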

    Deep-learning feature descriptor for tree bark re-identification

    The ability to visually re-identify objects is a fundamental capability of vision systems. Oftentimes, it relies on collections of visual signatures based on descriptors such as SIFT or SURF. However, these traditional descriptors were designed for a certain domain of surface appearances and geometries (limited relief). Consequently, highly textured surfaces such as tree bark pose a challenge to them. In turn, this makes it more difficult to use trees as identifiable landmarks for navigational purposes (robotics) or to track felled lumber along a supply chain (logistics). We thus propose to use data-driven descriptors, trained on bark images, for tree surface re-identification. To this effect, we collected a large dataset containing 2,400 bark images with strong illumination changes, annotated by surface and with the ability to pixel-align them. We used this dataset to sample more than 2 million 64x64 pixel patches to train our novel local descriptors DeepBark and SqueezeBark. Our DeepBark method showed a clear advantage over the hand-crafted descriptors SIFT and SURF. For instance, we demonstrated that DeepBark can reach a mAP of 87.2% when retrieving the 11 relevant bark images, i.e. those corresponding to the same physical surface, for a query against 7,900 images. Our work thus suggests that re-identifying tree surfaces in a challenging illumination context is possible. We also make our dataset public, so it can be used to benchmark surface re-identification techniques.
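    For reference, mean average precision (mAP), the retrieval metric quoted above, can be computed as in the following sketch. The descriptor arrays and surface labels are hypothetical inputs, not the DeepBark pipeline itself.

```python
# Minimal sketch: mean average precision for descriptor-based image retrieval.
# `queries`/`gallery` are (N, D) global descriptors; `q_ids`/`g_ids` are surface labels.
import numpy as np

def mean_average_precision(queries, gallery, q_ids, g_ids):
    # Pairwise Euclidean distances between every query and every gallery descriptor.
    dists = np.linalg.norm(queries[:, None, :] - gallery[None, :, :], axis=2)
    aps = []
    for qi in range(len(queries)):
        order = np.argsort(dists[qi])              # nearest gallery images first
        relevant = (g_ids[order] == q_ids[qi])     # same physical surface?
        if not relevant.any():
            continue
        hits = np.cumsum(relevant)
        precision_at_k = hits / (np.arange(len(relevant)) + 1)
        aps.append((precision_at_k * relevant).sum() / relevant.sum())
    return float(np.mean(aps))
```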

    The attentive robot companion: learning spatial information from observation and verbal interaction

    Ziegler L. The attentive robot companion: learning spatial information from observation and verbal interaction. Bielefeld: Universität Bielefeld; 2015. This doctoral thesis investigates how a robot companion can gain a certain degree of situational awareness through observation and interaction with its surroundings. The focus lies on the representation of the spatial knowledge gathered constantly over time in an indoor environment. Against the background of research on an interactive service robot, methods for deployment in inference and verbal communication tasks are presented. The design and application of the models are guided by the requirements of referential communication. The approach here involves the analysis of the dynamic properties of structures in the robot's field of view, allowing it to distinguish objects of interest from other agents and background structures. The use of multiple persistent models representing these dynamic properties enables the robot to track changes in multiple scenes over time to establish spatial and temporal references. This work includes building a coherent representation considering allocentric and egocentric aspects of spatial knowledge for these models. Spatial analysis is extended with a semantic interpretation of objects and regions. This top-down approach for generating additional context information enhances the grounding process in communication. A holistic, boosting-based classification approach using a wide range of 2D and 3D visual features anchored in the spatial representation allows the system to identify room types. The process of grounding referential descriptions from a human interlocutor in the spatial representation is evaluated through referencing furniture. This method uses a probabilistic network for handling ambiguities in the descriptions and employs a strategy for resolving conflicts. In order to confirm the real-world applicability of these approaches, the system was deployed on the mobile robot BIRON in a realistic apartment scenario involving observation and verbal interaction with an interlocutor.
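    One small building block touched on above, relating allocentric (map-frame) and egocentric (robot-frame) coordinates, can be illustrated as follows. The 2D simplification and all names are assumptions for illustration, not the thesis implementation.

```python
# Minimal sketch: express an allocentric (map-frame) point in the robot's egocentric frame.
import numpy as np

def to_egocentric(obj_xy_map, robot_xy_map, robot_yaw):
    """Map-frame point -> robot-local frame (x forward, y left), 2D case."""
    c, s = np.cos(robot_yaw), np.sin(robot_yaw)
    delta = np.asarray(obj_xy_map) - np.asarray(robot_xy_map)
    return np.array([c * delta[0] + s * delta[1],
                     -s * delta[0] + c * delta[1]])

# Example: robot at (2, 1) facing +x; a table at (4, 1) lies 2 m straight ahead.
print(to_egocentric([4.0, 1.0], [2.0, 1.0], 0.0))  # -> [2. 0.]
```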

    Creating a virtual slide map from sputum smear images for region-of-interest localisation in automated microscopy

    Automated microscopy for the detection of tuberculosis (TB) in sputum smears seeks to address the strain on technicians in busy TB laboratories and to achieve faster diagnosis in countries with a heavy TB burden. As a step in the development of an automated microscope, the project described here was concerned with microscope auto-positioning; this primarily involves generating a point of reference on a slide, which can be used to automatically bring desired fields on the slide to the field-of-view of the microscope for re-examination. The study was carried out using a conventional microscope and Ziehl-Neelsen (ZN) stained sputum smear slides. All images were captured at 40x magnification. A digital replication of an actual slide, the virtual slide map, was constructed by combining the manually acquired images of the different fields of the slide. The geometric hashing scheme was found to be suitable for auto-stitching a large number of images (over 300) to form a virtual slide map. An object recognition algorithm, also based on the geometric hashing technique, was used to localise a query image (the current field-of-view) on the virtual slide map. This localised field-of-view then served as the point of reference. The true positive rate (correct localisation of a query image on the virtual slide map) achieved by the algorithm was above 88%, even for noisy query images captured at slide orientations up to 26°. The image registration error, computed as the average mean square error, was less than 14 pixel² (corresponding to 1.02 μm² and 0.001% error in an image measuring 1030 x 1300 pixels), i.e. a root mean square registration error of 3.7 pixels. Superior image registration accuracy was obtained, at the expense of time, using the scale-invariant feature transform (SIFT), with an image registration error of 1 pixel² (0.07 μm²). The object recognition algorithm is inherently robust to changes in slide orientation and placement, which are likely to occur in practice as it is impossible to place the slide in exactly the same position on the microscope at different times. Moreover, the algorithm showed high tolerance to illumination changes and robustness to noise.
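    A minimal sketch of the geometric hashing idea underlying the virtual slide map is given below, assuming small 2D keypoint sets extracted from the slide fields. The quantisation step, the similarity-invariant basis construction and the exhaustive pairing of basis points are illustrative choices, not the parameters used in the project; the winning basis pair would then seed an alignment whose residual corresponds to the registration error reported above.

```python
# Minimal sketch of 2D geometric hashing for locating a query field on a larger point map.
from collections import defaultdict
from itertools import permutations

import numpy as np

def _to_basis(points, p0, p1):
    """Express points in a similarity-invariant frame defined by the ordered pair (p0, p1)."""
    origin = (p0 + p1) / 2.0
    ex = p1 - p0
    scale = np.linalg.norm(ex)
    ex = ex / scale
    ey = np.array([-ex[1], ex[0]])
    rel = points - origin
    return np.stack([rel @ ex, rel @ ey], axis=1) / scale

def build_table(model_pts, step=0.05):
    """Hash every model point's quantised coordinates under every basis pair."""
    table = defaultdict(list)
    for i, j in permutations(range(len(model_pts)), 2):
        coords = _to_basis(model_pts, model_pts[i], model_pts[j])
        for key in map(tuple, np.round(coords / step).astype(int)):
            table[key].append((i, j))
    return table

def locate(query_pts, table, step=0.05):
    """Vote for the model basis pair that best explains the query field-of-view."""
    best = None
    for qi, qj in permutations(range(len(query_pts)), 2):
        votes = defaultdict(int)
        coords = _to_basis(query_pts, query_pts[qi], query_pts[qj])
        for key in map(tuple, np.round(coords / step).astype(int)):
            for model_basis in table.get(key, []):
                votes[model_basis] += 1
        if votes:
            basis, count = max(votes.items(), key=lambda kv: kv[1])
            if best is None or count > best[2]:
                best = (basis, (qi, qj), count)
    return best  # ((model_i, model_j), (query_i, query_j), votes) or None
```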

    Intelligent Sensors for Human Motion Analysis

    The book, "Intelligent Sensors for Human Motion Analysis," contains 17 articles published in the Special Issue of the Sensors journal. These articles deal with many aspects related to the analysis of human movement. New techniques and methods for pose estimation, gait recognition, and fall detection have been proposed and verified. Some of them will trigger further research, and some may become the backbone of commercial systems

    Robotic Goal-Based Semi-Autonomous Algorithms Improve Remote Operator Performance

    The focus of this research was to determine whether reliable goal-based semi-autonomous algorithms can improve remote operator performance. Two semi-autonomous algorithms were examined: visual servoing, which uses computer vision techniques to generate movement commands, and visual dead reckoning, which combines internal properties of the camera with sensor data to estimate the robot's current position from its previous position. This research shows that the semi-autonomous algorithms developed increased performance in a measurable way. An analysis of tracking algorithms for visual servoing was conducted and the tracking algorithms were enhanced to make them as robust as possible. The developed algorithms were implemented on a currently fielded military robot and a human-in-the-loop experiment was conducted to measure performance.
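    For context, classical image-based visual servoing maps the error between current and desired image features to a camera velocity command via the pseudo-inverse of an interaction matrix. The sketch below shows that standard formulation with assumed example values; it is not the fielded robot's implementation.

```python
# Minimal sketch of the classical IBVS control law: v = -lambda * L^+ * (s - s*).
import numpy as np

def point_interaction_matrix(x, y, Z):
    """Interaction matrix of one normalised image point at depth Z (standard formulation)."""
    return np.array([[-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
                     [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x]])

def ibvs_velocity(features, desired, interaction_matrix, gain=0.5):
    """Camera velocity screw (vx, vy, vz, wx, wy, wz) driving the feature error to zero."""
    error = features - desired  # s - s*
    return -gain * np.linalg.pinv(interaction_matrix) @ error

# Example with two tracked points at an assumed depth of 1.0 m (normalised coordinates).
L = np.vstack([point_interaction_matrix(0.1, 0.0, 1.0),
               point_interaction_matrix(-0.1, 0.05, 1.0)])
s = np.array([0.1, 0.0, -0.1, 0.05])        # current features
s_star = np.array([0.0, 0.0, -0.2, 0.0])    # desired features
print(ibvs_velocity(s, s_star, L))
```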

    Monocular 3D Object Recognition

    Object recognition is one of the fundamental tasks of computer vision. Recent advances in the field enable reliable 2D detections from a single cluttered image. However, many challenges still remain. Object detection needs timely response for real-world applications. Moreover, we are genuinely interested in estimating the 3D pose and shape of an object or human for the sake of robotic manipulation and human-robot interaction. In this thesis, a suite of solutions to these challenges is presented. First, Active Deformable Part Models (ADPM) are proposed for fast part-based object detection. ADPM dramatically accelerates detection by dynamically scheduling the part evaluations and efficiently pruning image locations. Second, we unleash the power of marrying discriminative 2D parts with an explicit 3D geometric representation. Several methods following this scheme are proposed for recovering rich 3D information of both rigid and non-rigid objects from monocular RGB images. (1) The accurate 3D pose of an object instance is recovered from cluttered images using only the CAD model. (2) A globally optimal solution for simultaneous 2D part localization, 3D pose and shape estimation is obtained by optimizing a unified convex objective function; both appearance and geometric compatibility are jointly maximized. (3) 3D human pose estimation from an image sequence is realized via an Expectation-Maximization algorithm; the 2D joint location uncertainties are marginalized out during inference and 3D pose smoothness is enforced across frames. By bridging the gap between 2D and 3D, our methods provide an end-to-end solution to 3D object recognition from images. We demonstrate a range of interesting applications using only a single image or a monocular video, including autonomous robotic grasping from a single image, 3D object image pop-up and a monocular human MoCap system. We also show empirical state-of-the-art results on a number of benchmarks for 2D detection and 3D pose and shape estimation.
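    As a generic illustration of recovering a 6-DoF pose once 2D detections are matched to 3D model points, a PnP solver can be used as sketched below. The toy points, intrinsics and the use of OpenCV's solver are assumptions for illustration, not the unified convex optimisation proposed in the thesis.

```python
# Minimal sketch: recover an object's rotation and translation from 2D-3D correspondences.
import cv2
import numpy as np

# Toy 3D model points (e.g. CAD vertices), deliberately non-coplanar.
object_pts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0],
                       [0, 1, 0], [0, 0, 1], [1, 0, 1]], dtype=np.float32)
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float32)  # assumed intrinsics

# Synthesize image observations from a known pose, then recover that pose with PnP.
rvec_true = np.array([[0.1], [0.2], [0.3]], dtype=np.float32)
tvec_true = np.array([[0.0], [0.0], [4.0]], dtype=np.float32)
image_pts, _ = cv2.projectPoints(object_pts, rvec_true, tvec_true, K, None)

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None)
R, _ = cv2.Rodrigues(rvec)       # rotation matrix; (R, tvec) is the recovered 6-DoF pose
print(ok, tvec.ravel())          # tvec ≈ [0, 0, 4]
```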

    Task-adaptable, Pervasive Perception for Robots Performing Everyday Manipulation

    Intelligent robotic agents that help us in our day-to-day chores have been an aspiration of robotics researchers for decades. More than fifty years since the creation of the first intelligent mobile robotic agent, robots are still struggling to perform seemingly simple tasks, such as setting or cleaning a table. One of the reasons for this is that the unstructured environments these robots are expected to work in impose demanding requirements on a robot's perception system. Depending on the manipulation task the robot is required to execute, different parts of the environment need to be examined, the objects in them found and their functional parts identified. This is a challenging task, since the visual appearance of the objects and the variety of scenes they are found in are large. This thesis proposes to treat robotic visual perception for everyday manipulation tasks as an open question-answering problem. To this end, RoboSherlock, a framework for creating task-adaptable, pervasive perception systems, is presented. Using the framework, robot perception is addressed from a system's perspective and contributions to the state of the art are proposed that introduce several enhancements which scale robot perception toward the needs of human-level manipulation. The contributions of the thesis center around task-adaptability and pervasiveness of perception systems. A perception task-language and a language interpreter that generates task-relevant perception plans are proposed. The task-language and task-interpreter leverage the power of knowledge representation and knowledge-based reasoning in order to enhance the question-answering capabilities of the system. Pervasiveness, a seamless integration of past, present and future percepts, is achieved through three main contributions: a novel way of recording, replaying and inspecting perceptual episodic memories; a new perception component that enables pervasive operation and maintains an object belief state; and a novel prospection component that enables robots to relive their past experiences and anticipate possible future scenarios. The contributions are validated through several real-world robotic experiments that demonstrate how the proposed system enhances robot perception.
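    Purely as an illustration of the task-language idea, the toy sketch below expands a declarative perception query into an ordered pipeline of annotators. The query keys and annotator names are invented for illustration and do not reflect RoboSherlock's actual interface.

```python
# Toy sketch: a declarative perception query planned into an annotator pipeline.
QUERY = {"detect": {"type": "cup", "color": "red", "location": "on table"}}

# Hypothetical mapping from query keys to the annotators able to answer them.
ANNOTATORS = {
    "type": ["PlaneSegmenter", "ClusterExtractor", "ShapeClassifier"],
    "color": ["ColorAnnotator"],
    "location": ["SupportingPlaneAnnotator"],
}

def plan_pipeline(query):
    """Collect the annotators needed for each key of the query, without duplicates."""
    pipeline = []
    for key in query["detect"]:
        for annotator in ANNOTATORS.get(key, []):
            if annotator not in pipeline:
                pipeline.append(annotator)
    return pipeline

print(plan_pipeline(QUERY))
```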

    Optical and hyperspectral image analysis for image-guided surgery
