137 research outputs found

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    Get PDF
    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-opera- tive morphology and motion of soft-tissues. This information is prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilites by observ- ing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted in- struments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D opti- cal imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions

    Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade

    Get PDF
    Camera pose estimation is an important problem in computer vision. Common techniques either match the current image against keyframes with known poses, directly regress the pose, or establish correspondences between keypoints in the image and points in the scene to estimate the pose. In recent years, regression forests have become a popular alternative to establish such correspondences. They achieve accurate results, but have traditionally needed to be trained offline on the target scene, preventing relocalisation in new environments. Recently, we showed how to circumvent this limitation by adapting a pre-trained forest to a new scene on the fly. The adapted forests achieved relocalisation performance that was on par with that of offline forests, and our approach was able to estimate the camera pose in close to real time. In this paper, we present an extension of this work that achieves significantly better relocalisation performance whilst running fully in real time. To achieve this, we make several changes to the original approach: (i) instead of accepting the camera pose hypothesis without question, we make it possible to score the final few hypotheses using a geometric approach and select the most promising; (ii) we chain several instantiations of our relocaliser together in a cascade, allowing us to try faster but less accurate relocalisation first, only falling back to slower, more accurate relocalisation as necessary; and (iii) we tune the parameters of our cascade to achieve effective overall performance. These changes allow us to significantly improve upon the performance our original state-of-the-art method was able to achieve on the well-known 7-Scenes and Stanford 4 Scenes benchmarks. As additional contributions, we present a way of visualising the internal behaviour of our forests and show how to entirely circumvent the need to pre-train a forest on a generic scene.Comment: Tommaso Cavallari, Stuart Golodetz, Nicholas Lord and Julien Valentin assert joint first authorshi

    Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes

    Full text link
    Long-term camera re-localization is an important task with numerous computer vision and robotics applications. Whilst various outdoor benchmarks exist that target lighting, weather and seasonal changes, far less attention has been paid to appearance changes that occur indoors. This has led to a mismatch between popular indoor benchmarks, which focus on static scenes, and indoor environments that are of interest for many real-world applications. In this paper, we adapt 3RScan - a recently introduced indoor RGB-D dataset designed for object instance re-localization - to create RIO10, a new long-term camera re-localization benchmark focused on indoor scenes. We propose new metrics for evaluating camera re-localization and explore how state-of-the-art camera re-localizers perform according to these metrics. We also examine in detail how different types of scene change affect the performance of different methods, based on novel ways of detecting such changes in a given RGB-D frame. Our results clearly show that long-term indoor re-localization is an unsolved problem. Our benchmark and tools are publicly available at waldjohannau.github.io/RIO10Comment: ECCV 2020, project website https://waldjohannau.github.io/RIO1

    Medical SLAM in an autonomous robotic system

    Get PDF
    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft-tissues. This information is prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This thesis addresses the ambitious goal of achieving surgical autonomy, through the study of the anatomical environment by Initially studying the technology present and what is needed to analyze the scene: vision sensors. A novel endoscope for autonomous surgical task execution is presented in the first part of this thesis. Which combines a standard stereo camera with a depth sensor. This solution introduces several key advantages, such as the possibility of reconstructing the 3D at a greater distance than traditional endoscopes. Then the problem of hand-eye calibration is tackled, which unites the vision system and the robot in a single reference system. Increasing the accuracy in the surgical work plan. In the second part of the thesis the problem of the 3D reconstruction and the algorithms currently in use were addressed. In MIS, simultaneous localization and mapping (SLAM) can be used to localize the pose of the endoscopic camera and build ta 3D model of the tissue surface. Another key element for MIS is to have real-time knowledge of the pose of surgical tools with respect to the surgical camera and underlying anatomy. Starting from the ORB-SLAM algorithm we have modified the architecture to make it usable in an anatomical environment by adding the registration of the pre-operative information of the intervention to the map obtained from the SLAM. Once it has been proven that the slam algorithm is usable in an anatomical environment, it has been improved by adding semantic segmentation to be able to distinguish dynamic features from static ones. All the results in this thesis are validated on training setups, which mimics some of the challenges of real surgery and on setups that simulate the human body within Autonomous Robotic Surgery (ARS) and Smart Autonomous Robotic Assistant Surgeon (SARAS) projects

    Application of augmented reality and robotic technology in broadcasting: A survey

    Get PDF
    As an innovation technique, Augmented Reality (AR) has been gradually deployed in the broadcast, videography and cinematography industries. Virtual graphics generated by AR are dynamic and overlap on the surface of the environment so that the original appearance can be greatly enhanced in comparison with traditional broadcasting. In addition, AR enables broadcasters to interact with augmented virtual 3D models on a broadcasting scene in order to enhance the performance of broadcasting. Recently, advanced robotic technologies have been deployed in a camera shooting system to create a robotic cameraman so that the performance of AR broadcasting could be further improved, which is highlighted in the paper

    Medical SLAM in an autonomous robotic system

    Get PDF
    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft-tissues. This information is prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This thesis addresses the ambitious goal of achieving surgical autonomy, through the study of the anatomical environment by Initially studying the technology present and what is needed to analyze the scene: vision sensors. A novel endoscope for autonomous surgical task execution is presented in the first part of this thesis. Which combines a standard stereo camera with a depth sensor. This solution introduces several key advantages, such as the possibility of reconstructing the 3D at a greater distance than traditional endoscopes. Then the problem of hand-eye calibration is tackled, which unites the vision system and the robot in a single reference system. Increasing the accuracy in the surgical work plan. In the second part of the thesis the problem of the 3D reconstruction and the algorithms currently in use were addressed. In MIS, simultaneous localization and mapping (SLAM) can be used to localize the pose of the endoscopic camera and build ta 3D model of the tissue surface. Another key element for MIS is to have real-time knowledge of the pose of surgical tools with respect to the surgical camera and underlying anatomy. Starting from the ORB-SLAM algorithm we have modified the architecture to make it usable in an anatomical environment by adding the registration of the pre-operative information of the intervention to the map obtained from the SLAM. Once it has been proven that the slam algorithm is usable in an anatomical environment, it has been improved by adding semantic segmentation to be able to distinguish dynamic features from static ones. All the results in this thesis are validated on training setups, which mimics some of the challenges of real surgery and on setups that simulate the human body within Autonomous Robotic Surgery (ARS) and Smart Autonomous Robotic Assistant Surgeon (SARAS) projects

    Real-time RGB-D camera pose estimation in novel scenes using a relocalisation cascade

    Get PDF
    Camera pose estimation is an important problem in computer vision. Common techniques either match the current image against keyframes with known poses, directly regress the pose, or establish correspondences between keypoints in the image and points in the scene to estimate the pose. In recent years, regression forests have become a popular alternative to establish such correspondences. They achieve accurate results, but have traditionally needed to be trained offline on the target scene, preventing relocalisation in new environments. Recently, we showed how to circumvent this limitation by adapting a pre-trained forest to a new scene on the fly. The adapted forests achieved relocalisation performance that was on par with that of offline forests, and our approach was able to estimate the camera pose in close to real time. In this paper, we present an extension of this work that achieves significantly better relocalisation performance whilst running fully in real time. To achieve this, we make several changes to the original approach: (i) instead of accepting the camera pose hypothesis without question, we make it possible to score the final few hypotheses using a geometric approach and select the most promising; (ii) we chain several instantiations of our relocaliser together in a cascade, allowing us to try faster but less accurate relocalisation first, only falling back to slower, more accurate relocalisation as necessary; and (iii) we tune the parameters of our cascade to achieve effective overall performance. These changes allow us to significantly improve upon the performance our original state-of-the-art method was able to achieve on the well-known 7-Scenes and Stanford 4 Scenes benchmarks. As additional contributions, we present a way of visualising the internal behaviour of our forests and show how to entirely circumvent the need to pre-train a forest on a generic scene

    Indoor Mapping and Reconstruction with Mobile Augmented Reality Sensor Systems

    Get PDF
    Augmented Reality (AR) ermöglicht es, virtuelle, dreidimensionale Inhalte direkt innerhalb der realen Umgebung darzustellen. Anstatt jedoch beliebige virtuelle Objekte an einem willkürlichen Ort anzuzeigen, kann AR Technologie auch genutzt werden, um Geodaten in situ an jenem Ort darzustellen, auf den sich die Daten beziehen. Damit eröffnet AR die Möglichkeit, die reale Welt durch virtuelle, ortbezogene Informationen anzureichern. Im Rahmen der vorliegenen Arbeit wird diese Spielart von AR als "Fused Reality" definiert und eingehend diskutiert. Der praktische Mehrwert, den dieses Konzept der Fused Reality bietet, lässt sich gut am Beispiel seiner Anwendung im Zusammenhang mit digitalen Gebäudemodellen demonstrieren, wo sich gebäudespezifische Informationen - beispielsweise der Verlauf von Leitungen und Kabeln innerhalb der Wände - lagegerecht am realen Objekt darstellen lassen. Um das skizzierte Konzept einer Indoor Fused Reality Anwendung realisieren zu können, müssen einige grundlegende Bedingungen erfüllt sein. So kann ein bestimmtes Gebäude nur dann mit ortsbezogenen Informationen augmentiert werden, wenn von diesem Gebäude ein digitales Modell verfügbar ist. Zwar werden größere Bauprojekt heutzutage oft unter Zuhilfename von Building Information Modelling (BIM) geplant und durchgeführt, sodass ein digitales Modell direkt zusammen mit dem realen Gebäude ensteht, jedoch sind im Falle älterer Bestandsgebäude digitale Modelle meist nicht verfügbar. Ein digitales Modell eines bestehenden Gebäudes manuell zu erstellen, ist zwar möglich, jedoch mit großem Aufwand verbunden. Ist ein passendes Gebäudemodell vorhanden, muss ein AR Gerät außerdem in der Lage sein, die eigene Position und Orientierung im Gebäude relativ zu diesem Modell bestimmen zu können, um Augmentierungen lagegerecht anzeigen zu können. Im Rahmen dieser Arbeit werden diverse Aspekte der angesprochenen Problematik untersucht und diskutiert. Dabei werden zunächst verschiedene Möglichkeiten diskutiert, Indoor-Gebäudegeometrie mittels Sensorsystemen zu erfassen. Anschließend wird eine Untersuchung präsentiert, inwiefern moderne AR Geräte, die in der Regel ebenfalls über eine Vielzahl an Sensoren verfügen, ebenfalls geeignet sind, als Indoor-Mapping-Systeme eingesetzt zu werden. Die resultierenden Indoor Mapping Datensätze können daraufhin genutzt werden, um automatisiert Gebäudemodelle zu rekonstruieren. Zu diesem Zweck wird ein automatisiertes, voxel-basiertes Indoor-Rekonstruktionsverfahren vorgestellt. Dieses wird außerdem auf der Grundlage vierer zu diesem Zweck erfasster Datensätze mit zugehörigen Referenzdaten quantitativ evaluiert. Desweiteren werden verschiedene Möglichkeiten diskutiert, mobile AR Geräte innerhalb eines Gebäudes und des zugehörigen Gebäudemodells zu lokalisieren. In diesem Kontext wird außerdem auch die Evaluierung einer Marker-basierten Indoor-Lokalisierungsmethode präsentiert. Abschließend wird zudem ein neuer Ansatz, Indoor-Mapping Datensätze an den Achsen des Koordinatensystems auszurichten, vorgestellt

    Interactive Remote Collaboration Using Augmented Reality

    Get PDF
    With the widespread deployment of fast data connections and availability of a variety of sensors for different modalities, the potential of remote collaboration has greatly increased. While the now ubiquitous video conferencing applications take advantage of some of these capabilities, the use of video between remote users is limited to passively watching disjoint video feeds and provides no means for interaction with the remote environment. However, collaboration often involves sharing, exploring, referencing, or even manipulating the physical world, and thus tools should provide support for these interactions.We suggest that augmented reality is an intuitive and user-friendly paradigm to communicate information about the physical environment, and that integration of computer vision and augmented reality facilitates more immersive and more direct interaction with the remote environment than what is possible with today's tools.In this dissertation, we present contributions to realizing this vision on several levels. First, we describe a conceptual framework for unobtrusive mobile video-mediated communication in which the remote user can explore the live scene independent of the local user's current camera movement, and can communicate information by creating spatial annotations that are immediately visible to the local user in augmented reality. Second, we describe the design and implementation of several, increasingly more flexible and immersive user interfaces and system prototypes that implement this concept. Our systems do not require any preparation or instrumentation of the environment; instead, the physical scene is tracked and modeled incrementally using monocular computer vision. The emerging model then supports anchoring of annotations, virtual navigation, and synthesis of novel views of the scene. Third, we describe the design, execution and analysis of three user studies comparing our prototype implementations with more conventional interfaces and/or evaluating specific design elements. Study participants overwhelmingly preferred our technology, and their task performance was significantly better compared with a video-only interface, though no task performance difference was observed compared with a ``static marker'' interface. Last, we address a particular technical limitation of current monocular tracking and mapping systems which was found to be impeding and present a conceptual solution; namely, we describe a concept and proof-of-concept implementation for automatic model selection which allows tracking and modeling to cope with both parallax-inducing and rotation-only camera movements.We suggest that our results demonstrate the maturity and usability of our systems, and, more importantly, the potential of our approach to improve video-mediated communication and broaden its applicability

    Serious Games and Mixed Reality Applications for Healthcare

    Get PDF
    Virtual reality (VR) and augmented reality (AR) have long histories in the healthcare sector, offering the opportunity to develop a wide range of tools and applications aimed at improving the quality of care and efficiency of services for professionals and patients alike. The best-known examples of VR–AR applications in the healthcare domain include surgical planning and medical training by means of simulation technologies. Techniques used in surgical simulation have also been applied to cognitive and motor rehabilitation, pain management, and patient and professional education. Serious games are ones in which the main goal is not entertainment, but a crucial purpose, ranging from the acquisition of knowledge to interactive training.These games are attracting growing attention in healthcare because of their several benefits: motivation, interactivity, adaptation to user competence level, flexibility in time, repeatability, and continuous feedback. Recently, healthcare has also become one of the biggest adopters of mixed reality (MR), which merges real and virtual content to generate novel environments, where physical and digital objects not only coexist, but are also capable of interacting with each other in real time, encompassing both VR and AR applications.This Special Issue aims to gather and publish original scientific contributions exploring opportunities and addressing challenges in both the theoretical and applied aspects of VR–AR and MR applications in healthcare
    • …
    corecore