309 research outputs found

    Mapping and Semantic Perception for Service Robotics

    Get PDF
    Para realizar una tarea, los robots deben ser capaces de ubicarse en el entorno. Si un robot no sabe dónde se encuentra, es imposible que sea capaz de desplazarse para alcanzar el objetivo de su tarea. La localización y construcción de mapas simultánea, llamado SLAM, es un problema estudiado en la literatura que ofrece una solución a este problema. El objetivo de esta tesis es desarrollar técnicas que permitan a un robot comprender el entorno mediante la incorporación de información semántica. Esta información también proporcionará una mejora en la localización y navegación de las plataformas robóticas. Además, también demostramos cómo un robot con capacidades limitadas puede construir de forma fiable y eficiente los mapas semánticos necesarios para realizar sus tareas cotidianas.El sistema de construcción de mapas presentado tiene las siguientes características: En el lado de la construcción de mapas proponemos la externalización de cálculos costosos a un servidor en nube. Además, proponemos métodos para registrar información semántica relevante con respecto a los mapas geométricos estimados. En cuanto a la reutilización de los mapas construidos, proponemos un método que combina la construcción de mapas con la navegación de un robot para explorar mejor un entorno y disponer de un mapa semántico con los objetos relevantes para una misión determinada.En primer lugar, desarrollamos un algoritmo semántico de SLAM visual que se fusiona los puntos estimados en el mapa, carentes de sentido, con objetos conocidos. Utilizamos un sistema monocular de SLAM basado en un EKF (Filtro Extendido de Kalman) centrado principalmente en la construcción de mapas geométricos compuestos únicamente por puntos o bordes; pero sin ningún significado o contenido semántico asociado. El mapa no anotado se construye utilizando sólo la información extraída de una secuencia de imágenes monoculares. La parte semántica o anotada del mapa -los objetos- se estiman utilizando la información de la secuencia de imágenes y los modelos de objetos precalculados. Como segundo paso, mejoramos el método de SLAM presentado anteriormente mediante el diseño y la implementación de un método distribuido. La optimización de mapas y el almacenamiento se realiza como un servicio en la nube, mientras que el cliente con poca necesidad de computo, se ejecuta en un equipo local ubicado en el robot y realiza el cálculo de la trayectoria de la cámara. Los ordenadores con los que está equipado el robot se liberan de la mayor parte de los cálculos y el único requisito adicional es una conexión a Internet.El siguiente paso es explotar la información semántica que somos capaces de generar para ver cómo mejorar la navegación de un robot. La contribución en esta tesis se centra en la detección 3D y en el diseño e implementación de un sistema de construcción de mapas semántico.A continuación, diseñamos e implementamos un sistema de SLAM visual capaz de funcionar con robustez en entornos poblados debido a que los robots de servicio trabajan en espacios compartidos con personas. El sistema presentado es capaz de enmascarar las zonas de imagen ocupadas por las personas, lo que aumenta la robustez, la reubicación, la precisión y la reutilización del mapa geométrico. Además, calcula la trayectoria completa de cada persona detectada con respecto al mapa global de la escena, independientemente de la ubicación de la cámara cuando la persona fue detectada.Por último, centramos nuestra investigación en aplicaciones de rescate y seguridad. Desplegamos un equipo de robots en entornos que plantean múltiples retos que implican la planificación de tareas, la planificación del movimiento, la localización y construcción de mapas, la navegación segura, la coordinación y las comunicaciones entre todos los robots. La arquitectura propuesta integra todas las funcionalidades mencionadas, asi como varios aspectos de investigación novedosos para lograr una exploración real, como son: localización basada en características semánticas-topológicas, planificación de despliegue en términos de las características semánticas aprendidas y reconocidas, y construcción de mapas.In order to perform a task, robots need to be able to locate themselves in the environment. If a robot does not know where it is, it is impossible for it to move, reach its goal and complete the task. Simultaneous Localization and Mapping, known as SLAM, is a problem extensively studied in the literature for enabling robots to locate themselves in unknown environments. The goal of this thesis is to develop and describe techniques to allow a service robot to understand the environment by incorporating semantic information. This information will also provide an improvement in the localization and navigation of robotic platforms. In addition, we also demonstrate how a simple robot can reliably and efficiently build the semantic maps needed to perform its quotidian tasks. The mapping system as built has the following features. On the map building side we propose the externalization of expensive computations to a cloud server. Additionally, we propose methods to register relevant semantic information with respect to the estimated geometrical maps. Regarding the reuse of the maps built, we propose a method that combines map building with robot navigation to better explore a room in order to obtain a semantic map with the relevant objects for a given mission. Firstly, we develop a semantic Visual SLAM algorithm that merges traditional with known objects in the estimated map. We use a monocular EKF (Extended Kalman Filter) SLAM system that has mainly been focused on producing geometric maps composed simply of points or edges but without any associated meaning or semantic content. The non-annotated map is built using only the information extracted from an image sequence. The semantic or annotated parts of the map –the objects– are estimated using the information in the image sequence and the precomputed object models. As a second step we improve the EKF SLAM presented previously by designing and implementing a visual SLAM system based on a distributed framework. The expensive map optimization and storage is allocated as a service in the Cloud, while a light camera tracking client runs on a local computer. The robot’s onboard computers are freed from most of the computation, the only extra requirement being an internet connection. The next step is to exploit the semantic information that we are able to generate to see how to improve the navigation of a robot. The contribution of this thesis is focused on 3D sensing which we use to design and implement a semantic mapping system. We then design and implement a visual SLAM system able to perform robustly in populated environments due to service robots work in environments where people are present. The system is able to mask the image regions occupied by people out of the rigid SLAM pipeline, which boosts the robustness, the relocation, the accuracy and the reusability of the geometrical map. In addition, it estimates the full trajectory of each detected person with respect to the scene global map, irrespective of the location of the moving camera at the point when the people were imaged. Finally, we focus our research on rescue and security applications. The deployment of a multirobot team in confined environments poses multiple challenges that involve task planning, motion planning, localization and mapping, safe navigation, coordination and communications among all the robots. The architecture integrates, jointly with all the above-mentioned functionalities, several novel features to achieve real exploration: localization based on semantic-topological features, deployment planning in terms of the semantic features learned and recognized, and map building.<br /

    Toward Real-Time Video-Enhanced Augmented Reality for Medical Visualization and Simulation

    Get PDF
    In this work we demonstrate two separate forms of augmented reality environments for use with minimally-invasive surgical techniques. In Chapter 2 it is demonstrated how a video feed from a webcam, which could mimic a laparoscopic or endoscopic camera used during an interventional procedure, can be used to identify the pose of the camera with respect to the viewed scene and augment the video feed with computer-generated information, such as rendering of internal anatomy not visible beyond the image surface, resulting in a simple augmented reality environment. Chapter 3 details our implementation of a similar system to the one previously mentioned, albeit with an external tracking system. Additionally, we discuss the challenges and considerations for expanding this system to support an external tracking system, specifically the Polaris Spectra optical tracker. Because of the relocation of the tracking origin to a point other than the camera center, there is an additional registration step necessary to establish the position of all components within the scene. This modification is expected to increase accuracy and robustness of the system

    A performance evaluation method to compare the multi-view point cloud data registration based on ICP algorithm and reference marker

    Get PDF
    Registration of range images of surfaces is a fundamental problem in three-dimensional modelling. This process is performed by finding a rotation matrix and translation vector between two sets of data points requiring registration. Many techniques have been developed to solve the registration problem. Therefore, it is important to understand the accuracy of various registration techniques when we decide which technique will be selected to perform registration task. This paper presents a new approach to test and compare registration techniques in terms of accuracy. Among various registration methods, iterative closest point-based algorithms and reference marker methods are two types of commonly applied methods which are used to accomplish this task because they are easy to implement and relatively low cost. These two methods have been selected to perform a comprehensively quantitative evaluation by using the proposed method and the registration results are verified using the calibrated NPL freeform standard

    Object Tracking

    Get PDF
    Object tracking consists in estimation of trajectory of moving objects in the sequence of images. Automation of the computer object tracking is a difficult task. Dynamics of multiple parameters changes representing features and motion of the objects, and temporary partial or full occlusion of the tracked objects have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both, state of the art of object tracking methods and also the new trends in research are described in this book. Fourteen chapters are split into two sections. Section 1 presents new theoretical ideas whereas Section 2 presents real-life applications. Despite the variety of topics contained in this monograph it constitutes a consisted knowledge in the field of computer object tracking. The intention of editor was to follow up the very quick progress in the developing of methods as well as extension of the application

    Measuring skeletal kinematics with accelerometers on the skin surface

    Get PDF
    The most common motion analysis method uses cameras to track the position of markers on bodily surfaces over time. Although each species has a common skeletal frame to reference recorded motions, the soft tissue covering each is not rigid. Markers, therefore, experience motion relative to the bone and do not accurately portray underlying bone activity. This limits clinical use of motion studies and the understanding of joint motion. Use of MEMS accelerometers for removing soft tissue artifact, motion relative to the bone, from surface measurements and determining the position of the underlying bone was investigated. An animal limb was modeled experimentally as a double pendulum with soft tissue as sprung masses with motions perpendicular to the pendulums. Horizontal motion was cycled at the top joint with a 25 cm stroke. Position data obtained from the mass with a Codamotion™ system and integrated accelerometer data were combined in a Kalman filter to determine global position. Acceleration data in the sensor coordinate system determined tissue artifact and were compared to measurements using CODA markers on the mass and pendulum. Removing artifact from mass position estimated pendulum position over time. In determining mass position, integrated accelerometer data experienced drift, deviating from reasonable values and were determined impractical for Kalman filter input. This led to using only the CODA-determined position as the true position. Accelerometer artifacts resulted in mean differences with the CODA markers of less than 1 mm over 3 cm displacements excluding a mass with mechanical difficulties. The largest mean difference across four tests was 0.66 mm, which is 96.17 percent accurate. Mean differences between base positions collected from accelerometers and CODA markers were found for the global x and y directions. Maximum deviations were 1.64 mm and 4.45 mm, respectively, which are 99.56 and 99.63 percent accurate. Results show the effectiveness of this procedure in calculating the location of the bases of sprung masses in two dimensions. The basis of this research contributes to the determination of bone position over time that will increase the potential of understanding fundamental, rigid body and joint motions in a clinical setting using noninvasive methods

    Visionary Ophthalmics: Confluence of Computer Vision and Deep Learning for Ophthalmology

    Get PDF
    Ophthalmology is a medical field ripe with opportunities for meaningful application of computer vision algorithms. The field utilizes data from multiple disparate imaging techniques, ranging from conventional cameras to tomography, comprising a diverse set of computer vision challenges. Computer vision has a rich history of techniques that can adequately meet many of these challenges. However, the field has undergone something of a revolution in recent times as deep learning techniques have sprung into the forefront following advances in GPU hardware. This development raises important questions regarding how to best leverage insights from both modern deep learning approaches and more classical computer vision approaches for a given problem. In this dissertation, we tackle challenging computer vision problems in ophthalmology using methods all across this spectrum. Perhaps our most significant work is a highly successful iris registration algorithm for use in laser eye surgery. This algorithm relies on matching features extracted from the structure tensor and a Gabor wavelet – a classically driven approach that does not utilize modern machine learning. However, drawing on insight from the deep learning revolution, we demonstrate successful application of backpropagation to optimize the registration significantly faster than the alternative of relying on finite differences. Towards the other end of the spectrum, we also present a novel framework for improving RANSAC segmentation algorithms by utilizing a convolutional neural network (CNN) trained on a RANSAC-based loss function. Finally, we apply state-of-the-art deep learning methods to solve the problem of pathological fluid detection in optical coherence tomography images of the human retina, using a novel retina-specific data augmentation technique to greatly expand the data set. Altogether, our work demonstrates benefits of applying a holistic view of computer vision, which leverages deep learning and associated insights without neglecting techniques and insights from the previous era

    Capturing Hand-Object Interaction and Reconstruction of Manipulated Objects

    Get PDF
    Hand motion capture with an RGB-D sensor gained recently a lot of research attention, however, even most recent approaches focus on the case of a single isolated hand. We focus instead on hands that interact with other hands or with a rigid or articulated object. Our framework successfully captures motion in such scenarios by combining a generative model with discriminatively trained salient points, collision detection and physics simulation to achieve a low tracking error with physically plausible poses. All components are unified in a single objective function that can be optimized with standard optimization techniques. We initially assume a-priori knowledge of the object’s shape and skeleton. In case of unknown object shape there are existing 3d reconstruction methods that capitalize on distinctive geometric or texture features. These methods though fail for textureless and highly symmetric objects like household articles, mechanical parts or toys. We show that extracting 3d hand motion for in-hand scanning e↵ectively facilitates the reconstruction of such objects and we fuse the rich additional information of hands into a 3d reconstruction pipeline. Finally, although shape reconstruction is enough for rigid objects, there is a lack of tools that build rigged models of articulated objects that deform realistically using RGB-D data. We propose a method that creates a fully rigged model consisting of a watertight mesh, embedded skeleton and skinning weights by employing a combination of deformable mesh tracking, motion segmentation based on spectral clustering and skeletonization based on mean curvature flow

    New Mechatronic Systems for the Diagnosis and Treatment of Cancer

    Get PDF
    Both two dimensional (2D) and three dimensional (3D) imaging modalities are useful tools for viewing the internal anatomy. Three dimensional imaging techniques are required for accurate targeting of needles. This improves the efficiency and control over the intervention as the high temporal resolution of medical images can be used to validate the location of needle and target in real time. Relying on imaging alone, however, means the intervention is still operator dependent because of the difficulty of controlling the location of the needle within the image. The objective of this thesis is to improve the accuracy and repeatability of needle-based interventions over conventional techniques: both manual and automated techniques. This includes increasing the accuracy and repeatability of these procedures in order to minimize the invasiveness of the procedure. In this thesis, I propose that by combining the remote center of motion concept using spherical linkage components into a passive or semi-automated device, the physician will have a useful tracking and guidance system at their disposal in a package, which is less threatening than a robot to both the patient and physician. This design concept offers both the manipulative transparency of a freehand system, and tremor reduction through scaling currently offered in automated systems. In addressing each objective of this thesis, a number of novel mechanical designs incorporating an remote center of motion architecture with varying degrees of freedom have been presented. Each of these designs can be deployed in a variety of imaging modalities and clinical applications, ranging from preclinical to human interventions, with an accuracy of control in the millimeter to sub-millimeter range

    Smart Technologies for Precision Assembly

    Get PDF
    This open access book constitutes the refereed post-conference proceedings of the 9th IFIP WG 5.5 International Precision Assembly Seminar, IPAS 2020, held virtually in December 2020. The 16 revised full papers and 10 revised short papers presented together with 1 keynote paper were carefully reviewed and selected from numerous submissions. The papers address topics such as assembly design and planning; assembly operations; assembly cells and systems; human centred assembly; and assistance methods in assembly
    corecore