
    Smart Localization Using a New Sensor Association Framework for Outdoor Augmented Reality Systems

    Augmented Reality (AR) aims at enhancing the real world by adding fictitious elements that are not naturally perceptible, such as computer-generated images, virtual objects, text, symbols, graphics, sounds, and smells. The quality of the real/virtual registration depends mainly on the accuracy of the 3D camera pose estimation. In this paper, we present an original real-time localization system for outdoor AR which combines three heterogeneous sensors: a camera, a GPS receiver, and an inertial sensor. The proposed system is subdivided into two modules: the main module is vision-based and estimates the user's location using a markerless tracking method. When visual tracking fails, the system switches automatically to the secondary localization module, composed of the GPS and the inertial sensor.
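
    A minimal sketch of the vision-first control flow described above, assuming hypothetical tracker, GPS, and IMU interfaces: visual tracking is attempted each frame, and the system falls back to the GPS + inertial module when the tracker reports failure.

```python
# Illustrative sketch of the two-module localization loop; all class and
# method names (tracker, gps, imu) are hypothetical, not from the paper.
from dataclasses import dataclass

@dataclass
class Pose:
    position: tuple      # (x, y, z) in a world frame
    orientation: tuple   # quaternion (w, x, y, z)

def localize(frame, tracker, gps, imu):
    """Return the camera pose, preferring the vision-based module."""
    pose = tracker.track(frame)          # markerless visual tracking
    if pose is not None:
        return pose, "vision"
    # Visual tracking failed: switch to the GPS + inertial module.
    position = gps.read()                # absolute position from GPS
    orientation = imu.orientation()      # attitude from the inertial sensor
    return Pose(position, orientation), "gps+imu"
```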

    Intraoperative Endoscopic Augmented Reality in Third Ventriculostomy

    In neurosurgery, as a result of brain-shift, the preoperative patient models used as an intraoperative reference change during the operation. A meaningful use of the preoperative virtual models during the operation therefore requires a model update. The NEAR project (Neuroendoscopy towards Augmented Reality) describes a new camera calibration model for highly distorted lenses and introduces the concept of active endoscopes endowed with navigation, camera calibration, augmented reality, and triangulation modules.
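
    The NEAR project defines its own calibration model for highly distorted lenses; as a rough stand-in, the Python sketch below calibrates a wide-angle endoscope with OpenCV's generic fisheye model from checkerboard detections. The data layout and termination criteria are assumptions, not the project's method.

```python
# Stand-in calibration for a highly distorted lens using OpenCV's fisheye
# model; NEAR's actual calibration model differs.
import cv2
import numpy as np

def calibrate_fisheye(object_points, image_points, image_size):
    """object_points / image_points: lists of (N,1,3) / (N,1,2) float arrays,
    one per calibration image; image_size: (width, height) in pixels."""
    K = np.zeros((3, 3))   # intrinsic matrix, estimated in place
    D = np.zeros((4, 1))   # fisheye distortion coefficients k1..k4
    rms, K, D, rvecs, tvecs = cv2.fisheye.calibrate(
        object_points, image_points, image_size, K, D,
        flags=cv2.fisheye.CALIB_RECOMPUTE_EXTRINSIC,
        criteria=(cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 1e-6),
    )
    return rms, K, D   # rms reprojection error, intrinsics, distortion
```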

    Accurate 3D-reconstruction and -navigation for high-precision minimal-invasive interventions

    The current lateral skull base surgery is largely invasive, since it requires wide exposure and direct visualization of anatomical landmarks to avoid damaging critical structures. A multi-port approach aiming to reduce this invasiveness has recently been investigated, whereby three canals are drilled from the skull surface to the surgical region of interest: the first canal for the instrument, the second for the endoscope, and the third for material removal or an additional instrument. The transition to minimally invasive approaches in lateral skull base surgery requires sub-millimeter accuracy and high outcome predictability, which places high requirements on image acquisition as well as on navigation. Computed tomography (CT) is a non-invasive imaging technique allowing the visualization of internal patient organs. Planning optimal drill channels based on patient-specific models requires highly accurate three-dimensional (3D) CT images. This thesis focuses on the reconstruction of high-quality CT volumes. To this end, two conventional imaging systems are investigated: spiral CT scanners and C-arm cone-beam CT (CBCT) systems. Spiral CT scanners acquire volumes with typically anisotropic resolution, i.e. the voxel spacing in the slice-selection direction is larger than the in-plane spacing. A new super-resolution reconstruction approach is proposed to recover images with high isotropic resolution from two orthogonal low-resolution CT volumes. C-arm CBCT systems offer CT-like 3D imaging capabilities while being appropriate for interventional suites. A main drawback of these systems is the CT artifacts commonly encountered due to several limitations of the imaging system, such as mechanical inaccuracies. This thesis contributes new methods to enhance CBCT reconstruction quality by addressing two main reconstruction artifacts: misalignment artifacts caused by mechanical inaccuracies, and metal artifacts caused by the presence of metal objects in the scanned region. CBCT scanners are appropriate for intra-operative image-guided navigation; for instance, they can be used to control the drilling process based on intra-operatively acquired 2D fluoroscopic images. For successful navigation, an accurate estimate of the C-arm pose relative to the patient anatomy and the associated surgical plan is required. A new algorithm has been developed to fulfill this task with high precision. The performance of the introduced methods is demonstrated on simulated and real data.
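
    As a naive baseline for the orthogonal-volume super-resolution idea (not the proposed method), the Python sketch below resamples two orthogonal anisotropic volumes to a common isotropic grid and averages them; the thesis's approach would replace the averaging with a regularized reconstruction. Alignment of both volumes in a common patient frame is assumed.

```python
# Naive orthogonal-volume fusion baseline; the thesis's super-resolution
# reconstruction is more sophisticated than simple averaging.
import numpy as np
from scipy.ndimage import zoom

def fuse_orthogonal(vol_a, spacing_a, vol_b, spacing_b, iso_spacing=0.5):
    """vol_a, vol_b: 3D arrays already aligned in the same patient frame;
    spacing_*: (z, y, x) voxel sizes in mm; iso_spacing: target spacing."""
    up_a = zoom(vol_a, [s / iso_spacing for s in spacing_a], order=3)
    up_b = zoom(vol_b, [s / iso_spacing for s in spacing_b], order=3)
    # Crop both volumes to their common extent before averaging.
    shape = np.minimum(up_a.shape, up_b.shape)
    up_a = up_a[:shape[0], :shape[1], :shape[2]]
    up_b = up_b[:shape[0], :shape[1], :shape[2]]
    return 0.5 * (up_a + up_b)
```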

    An Augmented Reality User Interface for Semi-Autonomous Work Machines

    Forest machines are being automated today. However, the challenging environment and the complexity of the work make the task difficult. A forest machine operator needs easily interpretable input from the machine in order to supervise and control it. Hence, a device that shows digital information as part of the real environment is desired. The goal of this thesis is to implement a real-time augmented reality display for forest machines. The main task is to estimate the pose of the user's head, because the virtual data should be aligned with real objects. The digital content, and how it is visualized, also has to be considered. A machine vision camera and inertial measurements are used for pose estimation. Visual markers are utilized to obtain a pose estimate of the camera, and orientation is estimated from inertial measurements using an extended Kalman filter. To obtain the final estimate, the orientations from the two devices are fused. Furthermore, the virtual data comes mainly from an on-board lidar. A 3D point cloud and a wireframe model of a forestry crane are augmented onto a live video on a PC. The implemented system proved to work outdoors with actual hardware in real time. Although there are some identifiable errors in the pose estimate, the initial results are encouraging. Further improvements should target the accuracy of marker detection and the development of a comprehensive sensor fusion algorithm.
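
    The fusion step described above blends the marker-based camera orientation with the EKF orientation from the inertial unit. As a rough illustration of that final blending (not the thesis's EKF), here is a minimal quaternion SLERP complementary filter in Python; the weighting constant and quaternion conventions are assumptions.

```python
# Simple complementary fusion of two orientation estimates via SLERP;
# stands in for the thesis's EKF-based sensor fusion.
from scipy.spatial.transform import Rotation, Slerp

def fuse_orientations(q_camera, q_imu, alpha=0.98):
    """Blend the IMU orientation (fast but drifting) with the marker-based
    orientation (slower but absolute). alpha close to 1 trusts the IMU
    more. Quaternions are (x, y, z, w), as scipy expects."""
    rots = Rotation.from_quat([q_camera, q_imu])
    slerp = Slerp([0.0, 1.0], rots)     # interpolate between the two poses
    return slerp(alpha).as_quat()       # fused orientation quaternion
```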

    Joint optimization of manifold learning and sparse representations for face and gesture analysis

    Face and gesture understanding algorithms are powerful enablers in intelligent vision systems for surveillance, security, entertainment, and smart spaces. In the future, complex networks of sensors and cameras may dispense directions to lost tourists, perform directory lookups in the office lobby, or contact the proper authorities in case of an emergency. To be effective, these systems will need to embrace human subtleties while interacting with people in their natural conditions. Computer vision and machine learning techniques have recently become adept at solving face and gesture tasks using posed datasets in controlled conditions. However, spontaneous human behavior under unconstrained conditions, or in the wild, is more complex and is subject to considerable variability from one person to the next. Uncontrolled conditions such as lighting, resolution, noise, occlusions, pose, and temporal variations complicate the matter further. This thesis advances the field of face and gesture analysis by introducing a new machine learning framework based upon dimensionality reduction and sparse representations that is shown to be robust in posed as well as natural conditions. Dimensionality reduction methods take complex objects, such as facial images, and attempt to learn lower-dimensional representations embedded in the higher-dimensional data. These alternate feature spaces are computationally more efficient and often more discriminative. The performance of various dimensionality reduction methods on geometric and appearance-based facial attributes is studied, leading to robust facial pose and expression recognition models. The parsimonious nature of sparse representations (SR) has successfully been exploited for the development of highly accurate classifiers for various applications. Despite the successes of SR techniques, large dictionaries and high-dimensional data can make these classifiers computationally demanding. Further, sparse classifiers are subject to the adverse effects of a phenomenon known as coefficient contamination, where, for example, variations in pose may affect identity and expression recognition. This thesis analyzes the interaction between dimensionality reduction and sparse representations to present a unified sparse representation classification framework that addresses both issues of computational complexity and coefficient contamination. Semi-supervised dimensionality reduction is shown to mitigate the coefficient contamination problems associated with SR classifiers. The combination of semi-supervised dimensionality reduction with SR systems forms the cornerstone for a new face and gesture framework called Manifold-based Sparse Representations (MSR). MSR is shown to deliver state-of-the-art facial understanding capabilities. To demonstrate the applicability of MSR to new domains, MSR is expanded to include temporal dynamics. The joint optimization of dimensionality reduction and SRs for classification purposes is a relatively new field. The combination of both concepts into a single objective function produces a problem that is neither convex nor directly solvable. This thesis studies this problem and introduces a new jointly optimized framework. This framework, termed LGE-KSVD, utilizes variants of the Linear extension of Graph Embedding (LGE) along with modified K-SVD dictionary learning to jointly learn the dimensionality reduction matrix, sparse representation dictionary, sparse coefficients, and sparsity-based classifier. By injecting LGE concepts directly into the K-SVD learning procedure, this research removes the support constraints K-SVD imposes on dictionary element discovery. Results are shown for facial recognition, facial expression recognition, and human activity analysis; with the addition of a concept called active difference signatures, the framework delivers robust gesture recognition from Kinect or similar depth cameras.
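
    As context for the sparse-representation side of the framework, the Python sketch below shows the classification stage of a generic sparse representation classifier (SRC): a test sample is coded over a dictionary of labeled training atoms with orthogonal matching pursuit, and the class with the smallest reconstruction residual wins. This is the textbook SRC recipe, not LGE-KSVD itself; the dictionary, labels, and sparsity level are assumed inputs.

```python
# Generic SRC classification stage; LGE-KSVD additionally learns the
# projection, dictionary, and classifier jointly.
import numpy as np
from sklearn.linear_model import orthogonal_mp

def src_predict(D, labels, x, n_nonzero=10):
    """D: (d, n) dictionary with unit-norm columns; labels: (n,) class of
    each atom; x: (d,) test sample. Returns the predicted class."""
    coef = orthogonal_mp(D, x, n_nonzero_coefs=n_nonzero)  # sparse code
    residuals = {}
    for c in np.unique(labels):
        coef_c = np.where(labels == c, coef, 0.0)  # keep class-c coefficients
        residuals[c] = np.linalg.norm(x - D @ coef_c)
    return min(residuals, key=residuals.get)       # smallest residual wins
```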

    Map-Based Localization for Unmanned Aerial Vehicle Navigation

    Unmanned Aerial Vehicles (UAVs) require precise pose estimation when navigating in indoor and GNSS-denied / GNSS-degraded outdoor environments. The possibility of crashing in these environments is high, as spaces are confined and contain many moving obstacles. There are many solutions for localization in GNSS-denied environments, using many different technologies. Common solutions involve setting up or using existing infrastructure, such as beacons, Wi-Fi, or surveyed targets. These solutions were avoided because the cost should be proportional to the number of users, not the coverage area. Heavy and expensive sensors, for example a high-end IMU, were also avoided. Given these requirements, a camera-based localization solution was selected for sensor pose estimation. Several camera-based localization approaches were investigated. Map-based localization methods were shown to be the most efficient because they close loops using a pre-existing map; thus the amount of data, and the time spent collecting it, are reduced, as there is no need to re-observe the same areas multiple times. This dissertation proposes a solution to the task of fully localizing a monocular camera onboard a UAV with respect to a known environment (i.e., it is assumed that a 3D model of the environment is available) for the purpose of UAV navigation in structured environments. Incremental map-based localization involves tracking a map through an image sequence. When the map is a 3D model, this task is referred to as model-based tracking. A by-product of the tracker is the relative 3D pose (position and orientation) between the camera and the object being tracked. State-of-the-art solutions advocate that tracking geometry is more robust than tracking image texture, because edges are more invariant to changes in object appearance and lighting. However, model-based trackers have been limited to tracking small, simple objects in small environments. An assessment was performed of tracking larger, more complex building models in larger environments. A state-of-the-art model-based tracker called ViSP (Visual Servoing Platform) was applied to tracking outdoor and indoor buildings using a UAV's low-cost camera. The assessment revealed weaknesses at large scales. Specifically, ViSP failed when tracking was lost and needed to be manually re-initialized. Failure occurred when there was a lack of model features in the camera's field of view, and because of rapid camera motion. Experiments revealed that ViSP achieved positional accuracies similar to single point positioning solutions obtained from single-frequency (L1) GPS observations, with standard deviations around 10 metres. These errors were considered large, given that the geometric accuracy of the 3D model used in the experiments was 10 to 40 cm. The first contribution of this dissertation proposes to increase the performance of the localization system by combining ViSP with map-building incremental localization, also referred to as simultaneous localization and mapping (SLAM). Experimental results in both indoor and outdoor environments show that sub-metre positional accuracies were achieved, while reducing the number of tracking losses throughout the image sequence. It is shown that by integrating model-based tracking with SLAM, not only does SLAM improve model tracking performance, but the model-based tracker alleviates the computational expense of SLAM's loop-closing procedure, improving runtime performance. Experiments also revealed that ViSP was unable to handle occlusions when a complete 3D building model was used, resulting in large errors in its pose estimates. The second contribution of this dissertation is a novel map-based incremental localization algorithm that improves tracking performance and increases pose estimation accuracy over ViSP. The novelty of this algorithm is an efficient matching process that identifies corresponding linear features between the UAV's RGB image data and a large, complex, and untextured 3D model. The proposed model-based tracker improved positional accuracies from 10 m (obtained with ViSP) to 46 cm in outdoor environments, and from an unattainable result with ViSP to 2 cm positional accuracies in large indoor environments. The main disadvantage of any incremental algorithm is that it requires the camera pose of the first frame; initialization is often a manual process. The third contribution of this dissertation is a map-based absolute localization algorithm that automatically estimates the camera pose when no prior pose information is available. The method benefits from vertical line matching to accomplish a registration procedure of the reference model views with a set of initial input images via geometric hashing. Results demonstrate that sub-metre positional accuracies were achieved, and a proposed enhancement of conventional geometric hashing produced more correct matches: 75% of the correct matches were identified, compared to 11%. Further, the number of incorrect matches was reduced by 80%.
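
    The absolute localization contribution registers model views to input images via geometric hashing on vertical line features. As a rough illustration of the underlying idea (not the dissertation's enhanced variant), the Python sketch below hashes 2D point features by their coordinates in every ordered basis pair offline, then votes online for the model basis that best explains a scene; the quantization step and the use of point features are assumptions.

```python
# Classic 2D geometric hashing: offline table construction and online
# voting. A stand-in for the dissertation's line-based variant.
import numpy as np
from collections import defaultdict
from itertools import permutations

def build_table(model_pts, q=0.25):
    """Index every model point by its coordinates in each ordered basis
    pair (i, j); q is the hash-bin quantization step."""
    table = defaultdict(list)
    for i, j in permutations(range(len(model_pts)), 2):
        o, e = model_pts[i], model_pts[j]
        u = e - o
        v = np.array([-u[1], u[0]])              # orthogonal basis axis
        B = np.linalg.inv(np.stack([u, v], 1))   # world -> basis coords
        for k, p in enumerate(model_pts):
            if k in (i, j):
                continue
            key = tuple(np.round(B @ (p - o) / q).astype(int))
            table[key].append((i, j))
    return table

def vote(table, scene_pts, q=0.25):
    """Vote with one scene basis (first two points, for brevity) and
    return the most supported model basis, or None."""
    votes = defaultdict(int)
    o, e = scene_pts[0], scene_pts[1]
    u = e - o
    v = np.array([-u[1], u[0]])
    B = np.linalg.inv(np.stack([u, v], 1))
    for p in scene_pts[2:]:
        key = tuple(np.round(B @ (p - o) / q).astype(int))
        for basis in table.get(key, ()):
            votes[basis] += 1
    return max(votes, key=votes.get) if votes else None
```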

    Augmented Reality and Intraoperative C-Arm Cone-Beam Computed Tomography for Image-Guided Robotic Surgery

    Minimally invasive robotic-assisted surgery is a rapidly growing alternative to traditional open and laparoscopic procedures; nevertheless, challenges remain. The standard of care derives surgical strategies from preoperative volumetric data (i.e., computed tomography (CT) and magnetic resonance (MR) images) that benefit from the ability of multiple modalities to delineate different anatomical boundaries. However, preoperative images may not reflect a possibly highly deformed perioperative setup or intraoperative deformation. Additionally, in current clinical practice, the correspondence of preoperative plans to the surgical scene is established as a mental exercise; thus, the accuracy of this practice is highly dependent on the surgeon's experience and therefore subject to inconsistencies. To address these fundamental limitations in minimally invasive robotic surgery, this dissertation combines a high-end robotic C-arm imaging system and a modern robotic surgical platform into an integrated intraoperative image-guided system. We performed deformable registration of preoperative plans to a perioperative cone-beam computed tomography (CBCT) scan, acquired after the patient is positioned for intervention. From the registered surgical plans, we overlaid critical information onto the primary intraoperative visual source, the robotic endoscope, using augmented reality. Guidance afforded by this system not only uses augmented reality to fuse virtual medical information, but also provides tool localization and other dynamically updated intraoperative behavior in order to present enhanced depth feedback and information to the surgeon. These techniques for guided robotic surgery required a streamlined approach to creating intuitive and effective human-machine interfaces, especially in visualization. Our software design principles create an inherently information-driven modular architecture incorporating robotics and intraoperative imaging through augmented reality. The system's performance is evaluated using phantoms and preclinical in-vivo experiments for multiple applications, including transoral robotic surgery, robot-assisted thoracic interventions, and cochleostomy for cochlear implantation. The resulting functionality, proposed architecture, and implemented methodologies can be further generalized to other C-arm-based image guidance systems for additional extensions in robotic surgery.
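
    As an illustration of the final overlay step only (the deformable registration itself is out of scope here), the Python sketch below projects registered 3D plan points into the endoscope image with OpenCV's pinhole model; the endoscope pose (rvec, tvec) and intrinsics (K, dist) are assumed to come from the system's calibration and tracking.

```python
# Project registered 3D plan points into the endoscope frame and draw
# them; a generic AR overlay step, not the dissertation's full pipeline.
import cv2
import numpy as np

def overlay_plan(image, plan_pts_3d, rvec, tvec, K, dist):
    """Draw registered plan points (N, 3) onto a BGR endoscope frame."""
    pts_2d, _ = cv2.projectPoints(plan_pts_3d.astype(np.float64),
                                  rvec, tvec, K, dist)
    for (u, v) in pts_2d.reshape(-1, 2):
        if 0 <= u < image.shape[1] and 0 <= v < image.shape[0]:
            cv2.circle(image, (int(u), int(v)), 3, (0, 255, 0), -1)
    return image
```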

    Developing Ultrasound-Guided Intervention Technologies Enabled by Sensing Active Acoustic and Photoacoustic Point Sources

    Image-guided therapy is a central part of modern medicine. By incorporating medical imaging into the planning, surgical, and evaluation process, image-guided therapy has helped surgeons perform less invasive and more precise procedures. Among the most commonly used medical imaging modalities, ultrasound imaging offers a unique combination of cost-effectiveness, safety, and mobility. Advanced ultrasound-guided interventional systems often require calibration and tracking technologies to enable all of their capabilities. Many of these technologies rely on localizing point-based fiducials to accomplish their task. In this thesis, I investigate how sensing and localizing active acoustic and photoacoustic point sources can have a substantial impact on intraoperative ultrasound. The goals of these methods are (1) to improve localization and visualization for point targets that are not easily distinguished under conventional ultrasound, and (2) to track and register ultrasound sensors using active point sources as non-physical fiducials or markers. We applied these methods to three main research topics. The first is an ultrasound calibration framework that utilizes an active acoustic source as the phantom to aid in in-plane segmentation as well as out-of-plane estimation. The second is an interventional photoacoustic surgical system that utilizes the photoacoustic effect to create markers for tracking ultrasound transducers; we demonstrate variations of this idea to track a wide range of ultrasound transducers (three-dimensional, two-dimensional, bi-planar). The third is a set of interventional tool tracking methods combining acoustic elements embedded in the tool with photoacoustic markers.
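
    At the core of tracking and registering ultrasound sensors from localized point sources is rigid registration of corresponding 3D point sets. The Python sketch below implements the standard SVD-based (Arun/Kabsch) solution that such systems commonly build on; it is a generic building block, not the thesis's specific calibration framework.

```python
# Rigid registration of corresponding 3D point sets via SVD; a common
# building block for fiducial-based tracking and calibration.
import numpy as np

def rigid_register(src, dst):
    """Find R, t minimizing ||R @ src_i + t - dst_i|| over corresponding
    (N, 3) point sets, e.g. photoacoustic markers seen by two trackers."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)                    # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])   # avoid reflections
    R = Vt.T @ S @ U.T
    t = mu_d - R @ mu_s
    return R, t
```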

    Automatic Reconstruction of Textured 3D Models

    Three-dimensional modeling and visualization of environments is an increasingly important problem. This work addresses the problem of automatic 3D reconstruction, and we present a system for unsupervised reconstruction of textured 3D models in the context of modeling indoor environments. We present solutions to all aspects of the modeling process and an integrated system for the automatic creation of large-scale 3D models.

    Augmented Reality and Artificial Intelligence in Image-Guided and Robot-Assisted Interventions

    In minimally invasive orthopedic procedures, the surgeon places wires, screws, and surgical implants through the muscles and bony structures under image guidance. These interventions require alignment of the pre- and intra-operative patient data, the intra-operative scanner, the surgical instruments, and the patient. Suboptimal interaction with patient data and the challenge of mastering 3D anatomy from ill-posed 2D interventional images are essential concerns in image-guided therapies. State-of-the-art approaches often support the surgeon with external navigation systems or ill-conditioned image-based registration methods, both of which have certain drawbacks. Augmented reality (AR) has been introduced into operating rooms in the last decade; however, in image-guided interventions it has often been considered only as a visualization device improving traditional workflows. Consequently, the technology is only now gaining the maturity it requires to redefine procedures, user interfaces, and interactions. This dissertation investigates the applications of AR, artificial intelligence, and robotics in interventional medicine. Our solutions were applied to a broad spectrum of problems and tasks, namely improving imaging and acquisition, image computing and analytics for registration and image understanding, and enhancing interventional visualization. The benefits of these approaches were also demonstrated in robot-assisted interventions. We revealed how exemplary workflows are redefined via AR by taking full advantage of head-mounted displays when they are entirely co-registered with the imaging systems and the environment at all times. The proposed AR landscape is enabled by co-localizing the users and the imaging devices via the operating room environment and exploiting all involved frustums to move spatial information between different bodies. The system's awareness of the geometric and physical characteristics of X-ray imaging allows the exploration of different human-machine interfaces. We also leveraged the principles governing image formation, combining them with deep learning and RGBD sensing to fuse images and reconstruct interventional data. We hope that our holistic approaches towards improving the interface of surgery and enhancing the usability of interventional imaging not only augment the surgeon's capabilities but also improve the surgical team's experience in carrying out an effective intervention with reduced complications.
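
    Co-localizing the head-mounted display and the imaging devices via the operating room environment amounts to chaining rigid transforms through a shared world frame. Below is a minimal Python sketch under that assumption; the frame names are illustrative, not from the dissertation.

```python
# Chain rigid 4x4 transforms through a shared world (room) frame to
# express the imaging device's pose in the HMD frame. Illustrative only.
import numpy as np

def inv(T):
    """Invert a 4x4 rigid transform without a general matrix inverse."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti

def carm_in_hmd(T_world_hmd, T_world_carm):
    """Pose of the C-arm in the HMD frame, via the shared world frame:
    T_hmd_carm = inv(T_world_hmd) @ T_world_carm."""
    return inv(T_world_hmd) @ T_world_carm
```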