    Recent Developments and Future Challenges in Medical Mixed Reality

    As AR technology matures, we have seen many applications emerge in entertainment, education and training. However, the use of AR is not yet common in medical practice, despite the great potential of this technology to help not only learning and training in medicine, but also in assisting diagnosis and surgical guidance. In this paper, we present recent trends in the use of AR across all medical specialties and identify challenges that must be overcome to narrow the gap between academic research and practical use of AR in medicine. A database of 1403 relevant research papers published over the last two decades has been reviewed using a novel research trend analysis method based on a text mining algorithm. We semantically identified 10 topics, covering a variety of technologies and applications, based on the unbiased and impersonal clustering results from the Latent Dirichlet Allocation (LDA) model, and analysed the trend of each topic from 1995 to 2015. The statistical results reveal a taxonomy that best describes the development of medical AR research during the two decades, and the trend analysis provides a higher-level view of how the taxonomy has changed and where the focus will go. Finally, based on these results, we provide an insightful discussion of the current limitations, challenges and future directions in the field. Our objective is to help researchers focus on the application areas in medical AR that are most needed, as well as to provide medical practitioners with the latest technology advancements.
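
The per-topic trend analysis described in this abstract can be pictured with a small sketch. It assumes that each paper already has a publication year and a vector of topic weights (for example, the document-topic distribution from an LDA model). The function name `yearly_topic_share` and the input layout are hypothetical and are not taken from the paper, which does not describe its implementation here.

```python
from collections import defaultdict


def yearly_topic_share(docs):
    """Average topic weights per publication year.

    docs: iterable of (year, topic_weights) pairs, where topic_weights is a
    list of non-negative weights (one per topic) for a single paper.
    Returns {year: [mean weight of topic 0, mean weight of topic 1, ...]}.
    This input layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: defaultdict(float))  # year -> topic -> summed weight
    counts = defaultdict(int)                         # year -> number of papers
    for year, weights in docs:
        counts[year] += 1
        for topic, weight in enumerate(weights):
            totals[year][topic] += weight
    return {
        year: [totals[year][t] / counts[year] for t in range(len(totals[year]))]
        for year in sorted(totals)
    }


if __name__ == "__main__":
    # Toy data: two topics, three papers (not real results).
    toy = [(1995, [1.0, 0.0]), (1995, [0.5, 0.5]), (2015, [0.2, 0.8])]
    for year, shares in yearly_topic_share(toy).items():
        print(year, shares)
```

Plotting each topic's mean weight against the year would then give one trend line per topic; the toy numbers above are illustrative only.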

    Medical SLAM in an autonomous robotic system

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This thesis addresses the ambitious goal of achieving surgical autonomy through the study of the anatomical environment, initially studying the technology present and what is needed to analyze the scene: vision sensors. A novel endoscope for autonomous surgical task execution, which combines a standard stereo camera with a depth sensor, is presented in the first part of this thesis. This solution introduces several key advantages, such as the possibility of reconstructing the 3D scene at a greater distance than traditional endoscopes. Then the problem of hand-eye calibration is tackled, which unites the vision system and the robot in a single reference system, increasing the accuracy in the surgical work plan. In the second part of the thesis, the problem of 3D reconstruction and the algorithms currently in use are addressed. In MIS, simultaneous localization and mapping (SLAM) can be used to localize the pose of the endoscopic camera and build a 3D model of the tissue surface. Another key element for MIS is to have real-time knowledge of the pose of surgical tools with respect to the surgical camera and underlying anatomy. Starting from the ORB-SLAM algorithm, we have modified the architecture to make it usable in an anatomical environment by adding the registration of the pre-operative information of the intervention to the map obtained from the SLAM.
Once the SLAM algorithm had been shown to be usable in an anatomical environment, it was improved by adding semantic segmentation to distinguish dynamic features from static ones. All the results in this thesis are validated on training setups that mimic some of the challenges of real surgery and on setups that simulate the human body, within the Autonomous Robotic Surgery (ARS) and Smart Autonomous Robotic Assistant Surgeon (SARAS) projects.
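
For context, hand-eye calibration is commonly posed as solving a homogeneous transform equation. The abstract does not say which formulation the thesis uses, so the following is the standard textbook form, with symbols chosen here for illustration: $A_i$ is the relative motion of the robot end-effector between two poses, $B_i$ is the corresponding relative motion of the camera, and $X$ is the unknown fixed transform between camera and end-effector.

```latex
A_i X = X B_i, \qquad i = 1, \dots, n
% Writing each transform as a rotation R and a translation t,
% the equation splits into a rotation part and a translation part:
R_{A_i} R_X = R_X R_{B_i}, \qquad
R_{A_i} t_X + t_{A_i} = R_X t_{B_i} + t_X
```

Solving for $X$ from several motion pairs is what places the camera measurements and the robot kinematics in a single reference frame.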

    Angular variation as a monocular cue for spatial perception

    Monocular cues are spatial sensory inputs picked up exclusively by one eye that aid the perception of distance or space. Most of them are static features that provide depth information, and they are used extensively in graphic art to create realistic representations of a scene. Since the spatial information contained in these cues is picked up from the retinal image, a link between it and the theory of direct perception can be conveniently assumed. According to this theory, spatial information about an environment is directly contained in the optic array. This assumption therefore makes it possible to model visual perception processes through computational approaches. In this thesis, angular variation is considered as a monocular cue, and the concept of direct perception is adopted by a computer vision approach that treats it as a suitable principle from which innovative techniques to calculate spatial information can be developed. The spatial information expected from this monocular cue is the position and orientation of an object with respect to the observer, which in computer vision is a well-known field of research called 2D-3D pose estimation. In this thesis, the attempt to establish angular variation as a monocular cue, and thus to achieve a computational approach to direct perception, is carried out through the development of a set of pose estimation methods. Starting from conventional strategies for solving the pose estimation problem, a first approach imposes constraint equations to relate object and image features. In this sense, two algorithms based on a simple line rotation motion analysis were developed. These algorithms successfully provide pose information; however, they depend strongly on scene data conditions. To overcome this limitation, a second approach inspired by the biological processes performed by the human visual system was developed.
It is based on the content of the image itself and defines a computational approach to direct perception. The set of developed algorithms analyzes the visual properties provided by angular variations. The aim is to gather valuable data from which spatial information can be obtained and used to emulate a visual perception process by establishing a 2D-3D metric relation. Since this relation is considered fundamental to visual-motor coordination, and consequently essential for interacting with the environment, applying the developed computational approach in technology-mediated environments produces a significant cognitive effect. In this work, this cognitive effect is demonstrated by an experimental study in which a number of participants were asked to complete an action-perception task. The main purpose of the study was to analyze visually guided behavior in teleoperation and the cognitive effect caused by the addition of 3D information. The results showed a significant influence of the 3D aid on skill improvement, as well as an enhancement of the sense of presence.

    Autonomous Eye Tracking in Octopus bimaculoides

    The importance of the position of cephalopods, and particularly octopuses, as the most intelligent group of invertebrates is becoming increasingly appreciated by the neuroscience research community. Cephalopods are the most distantly related species to humans that possess advanced cognitive abilities; as their intelligence evolved independently from vertebrates, comparative analyses reveal trends in the evolution of nervous systems and the foundations of intelligence itself. Vision is an especially important area of cephalopod cognition to research because cephalopods are predominantly visual creatures, like humans, and the rapid transduction of visual signals allows the inner workings of octopus cognition to be revealed in real time. While octopuses can be conditioned to indicate what they see through responses to conditioned visual stimuli, no system as of yet provides a non-invasive means of determining what an octopus is looking at without training. This thesis introduces an automated methodological framework to predict the direction of an octopus's gaze for use in visual cognition research. The system uses deep learning models to track the eyes of octopuses, then predicts where an octopus is looking based on the orientation of its eyes and known anatomical traits that constrain where its vision could be directed. Data could not be collected this spring to train a model and test the tool in the experimental setting the system uses; however, analyses conducted on data not intended for this project suggest the approach is feasible for estimating an octopus's gaze and offer insights into how to do so most effectively.

    Learning-based depth and pose prediction for 3D scene reconstruction in endoscopy

    Colorectal cancer is the third most common cancer worldwide. Early detection and treatment of pre-cancerous tissue during colonoscopy is critical to improving prognosis. However, navigating within the colon and inspecting the endoluminal tissue comprehensively are challenging, and success in both varies based on the endoscopist's skill and experience. Computer-assisted interventions in colonoscopy show much promise in improving navigation and inspection. For instance, 3D reconstruction of the colon during colonoscopy could promote more thorough examinations and increase adenoma detection rates which are associated with improved survival rates. Given the stakes, this thesis seeks to advance the state of research from feature-based traditional methods closer to a data-driven 3D reconstruction pipeline for colonoscopy. More specifically, this thesis explores different methods that improve subtasks of learning-based 3D reconstruction. The main tasks are depth prediction and camera pose estimation. As training data is unavailable, the author, together with her co-authors, proposes and publishes several synthetic datasets and promotes domain adaptation models to improve applicability to real data. We show, through extensive experiments, that our depth prediction methods produce more robust results than previous work. Our pose estimation network trained on our new synthetic data outperforms self-supervised methods on real sequences. Our box embeddings allow us to interpret the geometric relationship and scale difference between two images of the same surface without the need for feature matches that are often unobtainable in surgical scenes. Together, the methods introduced in this thesis help work towards a complete, data-driven 3D reconstruction pipeline for endoscopy

    Recognition of Instrument Passing and Group Attention for Understanding Intraoperative State of Surgical Team

    Appropriate evaluation of the intraoperative state of a surgical team is essential for the improvement of teamwork and hence a safe surgical environment. Traditional methods to evaluate intraoperative team states, such as interviews and self-check questionnaires for each surgical team member, often require human effort, which is time-consuming and can be biased by individual recall. One effective solution is to analyze the surgical video and track the important team activities, such as whether the members are complying with the surgical procedure or are being distracted by unexpected events. However, due to the complexity of the situations in an operating room, identifying the team activities without any human effort remains challenging. In this work, we propose a novel approach that automatically recognizes and quantifies intraoperative activities from surgery videos. As a first step, we focus on recognizing two activities that especially involve multiple individuals: (a) passing of clean-packaged surgery instruments, which is a representative interaction between surgical technologists such as the circulating nurse and scrub nurse, and (b) group attention that may be attracted by unexpected events. We record surgical videos as input, and apply pose estimation and particle filters to extract each individual's face orientation, body orientation, and arm raise. These results, coupled with individual IDs, are then sent to an estimation model that provides the probability of each target activity. Simultaneously, a person model is generated and bound to each individual, which describes all the involved activities along the timeline. We tested our method using videos of simulated activities. The results showed that the system was able to recognize instrument passing and group attention with F1 = 0.95 and F1 = 0.66, respectively.
We also implemented a system with an interface that automatically annotated intraoperative activities along the video timeline, and invited feedback from surgical technologists. The results suggest that the quantified and visualized activities can help improve understanding of the intraoperative state of the surgical team
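
As a reminder of what the reported F1 values measure, the sketch below computes the F1 score from counts of true positives, false positives and false negatives using the standard definition (the harmonic mean of precision and recall). The function name and the example counts are illustrative assumptions; the abstract reports only the final scores, not the underlying counts.

```python
def f1_score(tp, fp, fn):
    """F1 score from true-positive, false-positive and false-negative counts.

    Returns 0.0 when precision or recall is undefined, or when both are zero.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(f1_score(tp=8, fp=2, fn=2))
```

Because F1 combines precision and recall, a lower score such as the one reported for group attention can stem from missed detections, false alarms, or both.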

    An Investigation of Skill Acquisition under Conditions of Augmented Reality

    Augmented reality is a virtual environment that integrates rendered content with the experience of the real world. There is evidence suggesting that augmented reality provides for important spatial constancy of objects relative to the real world coordinate system and that this quality contributes to rapid skill acquisition. The qualities of simulation, through the use of augmented reality, may be incorporated into actual job activities to produce a condition of just-in-time learning. This may make possible the rapid acquisition of information and reliable completion of novel or infrequently performed tasks by individuals possessing a basic skill-set. The purpose of this research has been to investigate the degree to which the acquisition of a skill is enhanced through the use of an augmented reality training device