32 research outputs found

    Augmented Reality and Artificial Intelligence in Image-Guided and Robot-Assisted Interventions

    In minimally invasive orthopedic procedures, the surgeon places wires, screws, and surgical implants through the muscles and bony structures under image guidance. These interventions require alignment of the pre- and intra-operative patient data, the intra-operative scanner, surgical instruments, and the patient. Suboptimal interaction with patient data and the difficulty of mastering 3D anatomy from ill-posed 2D interventional images are central concerns in image-guided therapies. State-of-the-art approaches often support the surgeon with external navigation systems or ill-conditioned image-based registration methods, both of which have drawbacks. Augmented reality (AR) has been introduced into operating rooms over the last decade; however, in image-guided interventions it has often been considered only as a visualization device that improves traditional workflows. Consequently, the technology is only beginning to gain the maturity it needs to redefine procedures, user interfaces, and interactions. This dissertation investigates the applications of AR, artificial intelligence, and robotics in interventional medicine. Our solutions were applied to a broad spectrum of problems and tasks, namely improving imaging and acquisition, image computing and analytics for registration and image understanding, and enhancing interventional visualization. The benefits of these approaches were also observed in robot-assisted interventions. We show how exemplary workflows are redefined via AR by taking full advantage of head-mounted displays that are co-registered with the imaging systems and the environment at all times. The proposed AR landscape is enabled by co-localizing the users and the imaging devices via the operating room environment and exploiting all involved frustums to move spatial information between different bodies.
The system's awareness of the geometric and physical characteristics of X-ray imaging allows the exploration of different human-machine interfaces. We also leveraged the principles governing image formation and combined them with deep learning and RGBD sensing to fuse images and reconstruct interventional data. We hope that our holistic approach to improving the interface of surgery and enhancing the usability of interventional imaging augments not only the surgeon's capabilities but also the surgical team's experience in carrying out an effective intervention with reduced complications.

    Visual Perception System for Aerial Manipulation: Methods and Implementations

    Technology is advancing quickly, and autonomous systems are becoming a reality. Companies increasingly demand robotic solutions to improve the efficiency of their operations, and aerial robots are no exception. Their unique ability to move freely through the air makes them well suited to many tasks that are tedious or even dangerous for human operators. Today, the wide range of available sensors and commercial drones makes them highly appealing. However, considerable manual effort is still required to customize existing solutions for each task because of the number of possible environments, robot designs, and missions. Researchers usually design different vision algorithms, hardware devices, and sensor setups to tackle specific tasks. Aerial manipulation is currently being studied intensively to extend the range of applications aerial robots can perform, such as inspection, maintenance, or even operating valves and other machines. This thesis presents an aerial manipulation system and a set of perception algorithms for automating aerial manipulation tasks. The complete design of the system is presented, along with modular frameworks that facilitate the development of this kind of operation. First, research on object analysis for manipulation and grasp planning with different object models is presented. Depending on the object model, different state-of-the-art grasp analysis algorithms are reviewed, and planning algorithms for both single and dual manipulators are shown. Second, perception algorithms for object detection and pose estimation are developed. They allow the system to identify many kinds of objects in any scene and locate them in order to perform manipulation tasks. These algorithms produce the information needed by the manipulation analysis described above. Third, vision algorithms are presented that localize the robot in the environment while building a local map, which benefits the manipulation tasks. These maps are enriched with semantic information from the detection algorithms. Finally, the thesis presents the hardware of the aerial platform, which includes lightweight manipulators and a novel tool that allows the aerial robot to operate in contact with rigid, static surfaces and serves as an estimator of the robot's position. All the techniques presented in this thesis have been validated through extensive experimentation on real aerial robotic platforms.

    Fruit Detection and Tree Segmentation for Yield Mapping in Orchards

    Accurate information gathering and processing is critical for precision horticulture, as growers aim to optimise their farm management practices. An accurate inventory of the crop that details its spatial distribution along with health and maturity can help farmers efficiently target processes such as chemical and fertiliser spraying, crop thinning, harvest management, labour planning and marketing. Growers have traditionally obtained this information using manual sampling techniques, which tend to be labour intensive, spatially sparse, expensive, inaccurate and prone to subjective biases. Recent advances in sensing and automation for field robotics allow key measurements to be made for individual plants throughout an orchard in a timely and accurate manner. Farmer-operated machines or unmanned robotic platforms can be equipped with a range of sensors to capture a detailed representation over large areas. Robust and accurate data processing techniques are therefore required to extract the high-level information needed by the grower to support precision farming. This thesis focuses on yield mapping in orchards using image and light detection and ranging (LiDAR) data captured by an unmanned ground vehicle (UGV). The contribution is a framework and algorithmic components for orchard mapping and yield estimation that are applicable to different fruit types and orchard configurations. The framework includes detection of fruits in individual images and tracking them over subsequent frames. The fruit counts are then associated with individual trees, which are segmented from image and LiDAR data, resulting in a structured spatial representation of yield. The first contribution of this thesis is the development of a generic and robust fruit detection algorithm. Images captured in the outdoor environment are susceptible to highly variable external factors that lead to significant appearance variations.
Specifically in orchards, variability is caused by changes in illumination, target pose, tree types, etc. The proposed techniques address these issues by using state-of-the-art feature learning approaches for image classification, while investigating the utility of orchard domain knowledge for fruit detection. Detection is performed using both pixel-wise classification of images followed by instance segmentation, and bounding-box regression approaches. The experimental results illustrate the versatility of complex deep learning approaches over a multitude of fruit types. The second contribution of this thesis is a tree segmentation approach to detect the individual trees that serve as a standard unit for structured orchard information systems. The work focuses on trellised trees, which present unique challenges for segmentation algorithms due to their intertwined nature. LiDAR data are used to segment the trellis face and to generate proposals for individual tree trunks. Additional trunk proposals are provided using pixel-wise classification of the image data. The multi-modal observations are refined by modelling trunk locations using a hidden semi-Markov model (HSMM), within which prior knowledge of tree spacing is incorporated. The final component of this thesis addresses the visual occlusion of fruit within geometrically complex canopies by using a multi-view detection and tracking approach. Single-image fruit detections are tracked over a sequence of images and associated with individual trees or farm rows, with the spatial distribution of the fruit counts forming a yield map over the farm. The results show the advantage of using multi-view imagery (instead of single-view analysis) for fruit counting and yield mapping. This thesis includes extensive experimentation in almond, apple and mango orchards, with data captured by a UGV spanning a total of 5 hectares of farm area, over 30 km of vehicle traversal and more than 7,000 trees.
The validation of the different processes is performed using manual annotations, which include fruit and tree locations in image and LiDAR data respectively. Additional evaluation of yield mapping is performed by comparison against fruit counts on trees at the farm and counts made by the growers post-harvest. The framework developed in this thesis is demonstrated to be accurate compared to ground truth at all scales of the pipeline, including fruit detection and tree mapping, leading to accurate yield estimation, per tree and per row, for the different crops. Through the multitude of field experiments conducted over multiple seasons and years, the thesis presents key practical insights necessary for commercial development of an information gathering system in orchards.

    Semi-Autonomous Control of an Exoskeleton using Computer Vision

    Learning to understand the world in 3D

    3D computer vision is a research topic gathering ever-increasing attention thanks to the increasingly widespread availability of off-the-shelf depth sensors and large-scale 3D datasets. The main purpose of 3D computer vision is to understand the geometry of objects in order to interact with them. Recently, the success of deep neural networks for processing images has fostered a data-driven approach to solving 3D vision problems. Inspired by the potential of this field, in this thesis we address two main problems: (a) how to leverage machine/deep learning techniques to build a robust and effective pipeline to establish correspondences between surfaces, and (b) how to obtain a reliable 3D reconstruction of an object from RGB images sparsely acquired from different points of view by means of deep neural networks. At the heart of many 3D computer vision applications lies surface matching, an effective paradigm aimed at finding correspondences between points belonging to different shapes. To this end, it is essential to first identify the characteristic points of an object and then create an adequate representation of them. We refer to these two steps as keypoint detection and keypoint description, respectively. As a first contribution (a) of this Ph.D. thesis, we propose data-driven solutions to tackle the problems of keypoint detection and description. As a further direction of research, we investigate the problem of 3D object reconstruction from RGB data only (b). Whereas in the past this application was addressed by SLAM and structure-from-motion (SfM) techniques, this has changed radically in recent years thanks to the rise of deep learning. Following this trend, we introduce a novel approach that combines traditional computer vision techniques with deep learning to perform viewpoint-variant 3D object reconstruction from non-overlapping RGB views.

    Higher level techniques for the artistic rendering of images and video

    EThOS - Electronic Theses Online Service, GB, United Kingdom

    Contemporary Robotics

    This book is a collection of 18 chapters written by internationally recognized experts and well-known professionals in the field. The chapters contribute to diverse facets of contemporary robotics and autonomous systems. The volume is organized in four thematic parts according to the main subjects, covering recent advances in contemporary robotics. The first part of the book is devoted to theoretical issues. This includes the development of algorithms for automatic trajectory generation using a redundancy resolution scheme, intelligent algorithms for robotic grasping, a modelling approach for reactive mode handling of flexible manufacturing, and the design of an advanced controller for robot manipulators. The second part of the book deals with different aspects of robot calibration and sensing. This includes a geometric and threshold calibration of a multiple robotic line-vision system, robot-based inline 2D/3D quality monitoring using picture-giving and laser triangulation, and a study on prospective polymer composite materials for flexible tactile sensors. The third part addresses issues of mobile robots and multi-agent systems, including SLAM of mobile robots based on fusion of odometry and visual data, configuration of a localization system by a team of mobile robots, development of a generic real-time motion controller for differential mobile robots, control of fuel cells of mobile robots, modelling of omni-directional wheeled-based robots, building of hunter- hybrid tracking environment, as well as design of cooperative control in a distributed population-based multi-agent approach. The fourth part presents recent approaches and results in humanoid and bio-inspired robotics.
It deals with the design of adaptive control of anthropomorphic biped gait, the building of a dynamics-based simulation for humanoid robot walking, the building of a controller for the perceptual motor control dynamics of humans, and a biomimetic approach to controlling a mechatronic structure using smart materials.

    NASA Tech Briefs, September 2010

    Topics covered include: Instrument for Measuring Thermal Conductivity of Materials at Low Temperatures; Multi-Axis Accelerometer Calibration System; Pupil Alignment Measuring Technique and Alignment Reference for Instruments or Optical Systems; Autonomous System for Monitoring the Integrity of Composite Fan Housings; A Safe, Self-Calibrating, Wireless System for Measuring Volume of Any Fuel at Non-Horizontal Orientation; Adaptation of the Camera Link Interface for Flight-Instrument Applications; High-Performance CCSDS Encapsulation Service Implementation in FPGA; High-Performance CCSDS AOS Protocol Implementation in FPGA; Advanced Flip Chips in Extreme Temperature Environments; Diffuse-Illumination Systems for Growing Plants; Microwave Plasma Hydrogen Recovery System; Producing Hydrogen by Plasma Pyrolysis of Methane; Self-Deployable Membrane Structures; Reactivation of a Tin-Oxide-Containing Catalyst; Functionalization of Single-Wall Carbon Nanotubes by Photo-Oxidation; Miniature Piezoelectric Macro-Mass Balance; Acoustic Liner for Turbomachinery Applications; Metering Gas Strut for Separating Rocket Stages; Large-Flow-Area Flow-Selective Liquid/Gas Separator; Counterflowing Jet Subsystem Design; Water Tank with Capillary Air/Liquid Separation; True Shear Parallel Plate Viscometer; Focusing Diffraction Grating Element with Aberration Control; Universal Millimeter-Wave Radar Front End; Mode Selection for a Single-Frequency Fiber Laser; Qualification and Selection of Flight Diode Lasers for Space Applications; Plenoptic Imager for Automated Surface Navigation; Maglev Facility for Simulating Variable Gravity; Hybrid AlGaN-SiC Avalanche Photodiode for Deep-UV Photon Detection; High-Speed Operation of Interband Cascade Lasers; 3D GeoWall Analysis System for Shuttle External Tank Foreign Object Debris Events; Charge-Spot Model for Electrostatic Forces in Simulation of Fine Particulates; Hidden Statistics Approach to Quantum Simulations; Reconstituted Three-Dimensional Interactive Imaging; Determining Atmospheric-Density Profile of Titan; Digital Microfluidics Sample Analyzer; Radiation Protection Using Carbon Nanotube Derivatives; Process to Selectively Distinguish Viable from Non-Viable Bacterial Cells; and TEAMS Model Analyzer.

    Videos in Context for Telecommunication and Spatial Browsing

    The research presented in this thesis explores the use of videos embedded in panoramic imagery to transmit spatial and temporal information describing remote environments and their dynamics. Virtual environments (VEs) through which users can explore remote locations are rapidly emerging as a popular medium of presence and remote collaboration. However, capturing a visual representation of locations to be used in VEs is usually a tedious process that requires either manual modelling of environments or the use of specific hardware. Capturing environment dynamics is not straightforward either, and it is usually performed with specific tracking hardware. Similarly, browsing large unstructured video collections with available tools is difficult, as the abundance of spatial and temporal information makes them hard to comprehend. On a spectrum between 3D VEs and 2D images, panoramas lie in between, as they offer the accessibility of 2D images while preserving the surrounding representation of 3D virtual environments. For this reason, panoramas are an attractive basis for videoconferencing and browsing tools, as they can relate several videos temporally and spatially. This research explores methods to acquire, fuse, render and stream data coming from heterogeneous cameras, with the help of panoramic imagery. Three distinct but interrelated questions are addressed. First, the thesis considers how spatially localised video can be used to increase the spatial information transmitted during video-mediated communication, and whether this improves the quality of communication. Second, the research asks whether videos in panoramic context can be used to convey spatial and temporal information about a remote place and the dynamics within it, and whether this improves users' performance in tasks that require spatio-temporal thinking. Finally, the thesis considers whether display type has an impact on reasoning about events within videos in panoramic context.
These research questions were investigated over three experiments, covering scenarios common to computer-supported cooperative work and video browsing. To support the investigation, two distinct video+context systems were developed. The first experiment, on telecommunication, compared our videos-in-context interface with fully panoramic video and conventional webcam video conferencing in an object placement scenario. The second experiment investigated the impact of videos in panoramic context on the quality of spatio-temporal thinking during localization tasks. To support the experiment, a novel interface to video collections in panoramic context was developed and compared with common video-browsing tools. The final experimental study investigated the impact of display type on reasoning about events. The study explored three adaptations of our video-collection interface to three display types. The overall conclusion is that videos in panoramic context offer a valid solution to spatio-temporal exploration of remote locations. Our approach presents a richer visual representation in terms of space and time than standard tools, showing that providing panoramic contexts to video collections makes spatio-temporal tasks easier. Videos in context are thus a suitable alternative to more difficult and often more expensive solutions. These findings are beneficial to many applications, including teleconferencing, virtual tourism and remote assistance.