137 research outputs found
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for the registration of multi-modal patient-specific data, both for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion of technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade
Camera pose estimation is an important problem in computer vision. Common
techniques either match the current image against keyframes with known poses,
directly regress the pose, or establish correspondences between keypoints in
the image and points in the scene to estimate the pose. In recent years,
regression forests have become a popular alternative to establish such
correspondences. They achieve accurate results, but have traditionally needed
to be trained offline on the target scene, preventing relocalisation in new
environments. Recently, we showed how to circumvent this limitation by adapting
a pre-trained forest to a new scene on the fly. The adapted forests achieved
relocalisation performance that was on par with that of offline forests, and
our approach was able to estimate the camera pose in close to real time. In
this paper, we present an extension of this work that achieves significantly
better relocalisation performance whilst running fully in real time. To achieve
this, we make several changes to the original approach: (i) instead of
accepting the camera pose hypothesis without question, we make it possible to
score the final few hypotheses using a geometric approach and select the most
promising; (ii) we chain several instantiations of our relocaliser together in
a cascade, allowing us to try faster but less accurate relocalisation first,
only falling back to slower, more accurate relocalisation as necessary; and
(iii) we tune the parameters of our cascade to achieve effective overall
performance. These changes allow us to significantly improve upon the
performance our original state-of-the-art method was able to achieve on the
well-known 7-Scenes and Stanford 4 Scenes benchmarks. As additional
contributions, we present a way of visualising the internal behaviour of our
forests and show how to entirely circumvent the need to pre-train a forest on a
generic scene.
Comment: Tommaso Cavallari, Stuart Golodetz, Nicholas Lord and Julien Valentin
assert joint first authorship.
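The cascade idea in (ii) can be illustrated generically: try a fast but less accurate relocaliser first, and fall back to slower, more accurate stages only when the pose hypothesis fails a geometric quality check. The class names, score convention and threshold below are illustrative assumptions, not the authors' actual API:

```python
# Sketch of a relocalisation cascade: cheap stages are tried first, and
# more expensive ones only as a fallback. All names here are illustrative;
# the paper's actual implementation differs in detail.

class Relocaliser:
    """A single cascade stage: relocalise() returns (pose, score)."""
    def __init__(self, name, estimate_fn):
        self.name = name
        self.estimate_fn = estimate_fn

    def relocalise(self, frame):
        return self.estimate_fn(frame)


def cascade_relocalise(stages, frame, min_score):
    """Run stages in order of increasing cost; accept the first pose whose
    geometric verification score clears the threshold."""
    best = (None, 0.0)
    for stage in stages:
        pose, score = stage.relocalise(frame)
        if pose is not None and score >= min_score:
            return pose, score, stage.name  # fast path: stop early
        if score > best[1]:
            best = (pose, score)
    # No stage cleared the threshold: return the best hypothesis seen.
    return best[0], best[1], None
```

The early-exit structure is what makes the cascade run in real time on average: most frames are handled by the cheap stage, and the expensive stages only pay their cost when actually needed.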
Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes
Long-term camera re-localization is an important task with numerous computer
vision and robotics applications. Whilst various outdoor benchmarks exist that
target lighting, weather and seasonal changes, far less attention has been paid
to appearance changes that occur indoors. This has led to a mismatch between
popular indoor benchmarks, which focus on static scenes, and indoor
environments that are of interest for many real-world applications. In this
paper, we adapt 3RScan - a recently introduced indoor RGB-D dataset designed
for object instance re-localization - to create RIO10, a new long-term camera
re-localization benchmark focused on indoor scenes. We propose new metrics for
evaluating camera re-localization and explore how state-of-the-art camera
re-localizers perform according to these metrics. We also examine in detail how
different types of scene change affect the performance of different methods,
based on novel ways of detecting such changes in a given RGB-D frame. Our
results clearly show that long-term indoor re-localization is an unsolved
problem. Our benchmark and tools are publicly available at
waldjohannau.github.io/RIO10.
Comment: ECCV 2020; project website: https://waldjohannau.github.io/RIO10
Medical SLAM in an autonomous robotic system
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for the registration of multi-modal patient-specific data, both for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This thesis addresses the ambitious goal of achieving surgical autonomy through the study of the anatomical environment, starting from the technology needed to analyse the scene: vision sensors. The first part of this thesis presents a novel endoscope for autonomous surgical task execution, which combines a standard stereo camera with a depth sensor. This solution introduces several key advantages, such as the possibility of reconstructing the 3D surface at a greater distance than traditional endoscopes. The problem of hand-eye calibration is then tackled, uniting the vision system and the robot in a single reference frame and increasing the accuracy of the surgical work plan. The second part of the thesis addresses the problem of 3D reconstruction and the algorithms currently in use. In MIS, simultaneous localization and mapping (SLAM) can be used to localize the pose of the endoscopic camera and build a 3D model of the tissue surface. Another key element for MIS is to have real-time knowledge of the pose of surgical tools with respect to the surgical camera and the underlying anatomy. Starting from the ORB-SLAM algorithm, we modified the architecture to make it usable in an anatomical environment by registering the pre-operative information of the intervention to the map obtained from SLAM.
Once the SLAM algorithm had been proven usable in an anatomical environment, it was improved by adding semantic segmentation to distinguish dynamic features from static ones. All the results in this thesis are validated on training setups that mimic some of the challenges of real surgery, and on setups that simulate the human body, within the Autonomous Robotic Surgery (ARS) and Smart Autonomous Robotic Assistant Surgeon (SARAS) projects.
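The final step, using semantic segmentation to discard features on dynamic objects (e.g. moving instruments) before they corrupt the SLAM map, amounts to a mask test on keypoint locations. The mask convention and function name below are assumptions for illustration, not the thesis's actual implementation:

```python
# Sketch: reject ORB-style keypoints that fall on dynamically-labelled
# pixels of a per-frame segmentation mask. Here dynamic_mask[y][x] is True
# where a pixel belongs to a dynamic class (e.g. a surgical instrument);
# this convention is an assumption for illustration.

def filter_static_keypoints(keypoints, dynamic_mask):
    """Keep only keypoints lying on static (non-dynamic) pixels.

    keypoints    -- iterable of (x, y) pixel coordinates
    dynamic_mask -- 2D nested list of booleans, True = dynamic pixel
    """
    h = len(dynamic_mask)
    w = len(dynamic_mask[0]) if h else 0
    static = []
    for x, y in keypoints:
        xi, yi = int(round(x)), int(round(y))
        # Discard points outside the image or on dynamic regions.
        if 0 <= xi < w and 0 <= yi < h and not dynamic_mask[yi][xi]:
            static.append((x, y))
    return static
```

Only the surviving static keypoints would then be passed to tracking and mapping, so that instrument motion does not get baked into the tissue map.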
Application of augmented reality and robotic technology in broadcasting: A survey
As an innovative technique, Augmented Reality (AR) has been gradually deployed in the broadcast, videography and cinematography industries. Virtual graphics generated by AR are dynamic and overlaid on the surface of the environment, so that the original appearance can be greatly enhanced in comparison with traditional broadcasting. In addition, AR enables broadcasters to interact with augmented virtual 3D models on a broadcasting scene in order to enhance the performance of broadcasting. Recently, advanced robotic technologies have been deployed in camera shooting systems to create robotic cameramen, so that the performance of AR broadcasting can be further improved, which is highlighted in this paper.
Indoor Mapping and Reconstruction with Mobile Augmented Reality Sensor Systems
Augmented Reality (AR) makes it possible to display virtual, three-dimensional content directly within the real environment. Rather than showing arbitrary virtual objects at an arbitrary location, however, AR technology can also be used to display geodata in situ, at the very place the data refer to. AR thus opens up the possibility of enriching the real world with virtual, location-based information. In this thesis, this variant of AR is defined as "Fused Reality" and discussed in depth.

The practical value offered by this concept of Fused Reality can be well demonstrated by its application to digital building models, where building-specific information, for example the routing of pipes and cables inside the walls, can be displayed in its correct position on the real object. To realise the outlined concept of an indoor Fused Reality application, some basic conditions must be met. A given building can only be augmented with location-based information if a digital model of that building is available. While larger construction projects nowadays are often planned and executed with the help of Building Information Modelling (BIM), so that a digital model is created together with the real building, digital models are usually not available for older existing buildings. Creating a digital model of an existing building manually is possible, but very laborious. Given a suitable building model, an AR device must furthermore be able to determine its own position and orientation in the building relative to this model in order to display augmentations in the correct position.

This thesis examines and discusses various aspects of the problems outlined above. First, different possibilities for capturing indoor building geometry with sensor systems are discussed. Subsequently, an investigation is presented into how far modern AR devices, which typically also feature a multitude of sensors, are likewise suited for use as indoor mapping systems. The resulting indoor mapping datasets can then be used to reconstruct building models automatically. For this purpose, an automated, voxel-based indoor reconstruction method is presented, which is quantitatively evaluated on the basis of four datasets captured for this purpose together with corresponding reference data. Furthermore, different possibilities for localising mobile AR devices within a building and the corresponding building model are discussed. In this context, the evaluation of a marker-based indoor localisation method is also presented. Finally, a new approach for aligning indoor mapping datasets with the coordinate axes is presented.
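The voxel-based reconstruction step can be illustrated in miniature: points from an indoor mapping dataset are binned into a regular voxel grid, and the set of occupied voxels gives a coarse volumetric approximation of the captured geometry. The voxel size and point format below are illustrative assumptions; the thesis's actual method is considerably more involved.

```python
# Minimal sketch of voxelising a 3D point cloud: each point is mapped to
# the integer index of the voxel containing it. The 0.1 m voxel size is
# an arbitrary illustrative choice.

import math

def voxelise(points, voxel_size=0.1):
    """Return the set of (i, j, k) indices of voxels containing points."""
    occupied = set()
    for x, y, z in points:
        occupied.add((math.floor(x / voxel_size),
                      math.floor(y / voxel_size),
                      math.floor(z / voxel_size)))
    return occupied
```

From such an occupancy grid, planar structures like walls, floors and ceilings can then be extracted as a step towards an actual building model.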
Interactive Remote Collaboration Using Augmented Reality
With the widespread deployment of fast data connections and the availability of a variety of sensors for different modalities, the potential of remote collaboration has greatly increased. While the now ubiquitous video conferencing applications take advantage of some of these capabilities, the use of video between remote users is limited to passively watching disjoint video feeds and provides no means for interaction with the remote environment. However, collaboration often involves sharing, exploring, referencing, or even manipulating the physical world, and thus tools should provide support for these interactions. We suggest that augmented reality is an intuitive and user-friendly paradigm to communicate information about the physical environment, and that the integration of computer vision and augmented reality facilitates more immersive and more direct interaction with the remote environment than what is possible with today's tools. In this dissertation, we present contributions to realizing this vision on several levels. First, we describe a conceptual framework for unobtrusive mobile video-mediated communication in which the remote user can explore the live scene independently of the local user's current camera movement, and can communicate information by creating spatial annotations that are immediately visible to the local user in augmented reality. Second, we describe the design and implementation of several increasingly flexible and immersive user interfaces and system prototypes that implement this concept. Our systems do not require any preparation or instrumentation of the environment; instead, the physical scene is tracked and modeled incrementally using monocular computer vision. The emerging model then supports anchoring of annotations, virtual navigation, and synthesis of novel views of the scene.
Third, we describe the design, execution and analysis of three user studies comparing our prototype implementations with more conventional interfaces and/or evaluating specific design elements. Study participants overwhelmingly preferred our technology, and their task performance was significantly better compared with a video-only interface, though no task performance difference was observed compared with a "static marker" interface. Last, we address a particular technical limitation of current monocular tracking and mapping systems which was found to be impeding, and present a conceptual solution; namely, we describe a concept and proof-of-concept implementation for automatic model selection which allows tracking and modeling to cope with both parallax-inducing and rotation-only camera movements. We suggest that our results demonstrate the maturity and usability of our systems and, more importantly, the potential of our approach to improve video-mediated communication and broaden its applicability.
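The core mechanism behind such spatial annotations, anchoring an annotation to a 3D point in the incrementally built scene model and reprojecting it into the local user's current view, can be sketched with a standard pinhole camera model. The intrinsics and matrix layout below are illustrative assumptions, not values from the dissertation:

```python
# Sketch: reproject a 3D-anchored annotation into the current camera view
# using a pinhole model. world_to_cam is a 3x4 [R|t] matrix and K holds
# illustrative intrinsics (fx, fy, cx, cy); a real system obtains both
# from its tracker.

def project_annotation(anchor_xyz, world_to_cam, K):
    """Return (u, v) pixel coordinates of a 3D anchor, or None if behind camera."""
    x, y, z = anchor_xyz
    # Transform into the camera frame: p_cam = R * p_world + t.
    p = [sum(world_to_cam[r][c] * v for c, v in enumerate((x, y, z)))
         + world_to_cam[r][3] for r in range(3)]
    if p[2] <= 0:
        return None  # anchor is behind the camera, not visible
    fx, fy, cx, cy = K
    return (fx * p[0] / p[2] + cx, fy * p[1] / p[2] + cy)
```

Because the anchor lives in the scene model rather than in any single frame, the annotation stays pinned to the physical object as either user's viewpoint moves.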
Serious Games and Mixed Reality Applications for Healthcare
Virtual reality (VR) and augmented reality (AR) have long histories in the healthcare sector, offering the opportunity to develop a wide range of tools and applications aimed at improving the quality of care and the efficiency of services for professionals and patients alike. The best-known examples of VR–AR applications in the healthcare domain include surgical planning and medical training by means of simulation technologies. Techniques used in surgical simulation have also been applied to cognitive and motor rehabilitation, pain management, and patient and professional education. Serious games are games whose main goal is not entertainment but a crucial purpose, ranging from the acquisition of knowledge to interactive training. These games are attracting growing attention in healthcare because of their several benefits: motivation, interactivity, adaptation to user competence level, flexibility in time, repeatability, and continuous feedback. Recently, healthcare has also become one of the biggest adopters of mixed reality (MR), which merges real and virtual content to generate novel environments where physical and digital objects not only coexist, but are also capable of interacting with each other in real time, encompassing both VR and AR applications. This Special Issue aims to gather and publish original scientific contributions exploring opportunities and addressing challenges in both the theoretical and applied aspects of VR–AR and MR applications in healthcare.