Deep Reinforcement Learning in Surgical Robotics: Enhancing the Automation Level
Surgical robotics is a rapidly evolving field that is transforming the
landscape of surgeries. Surgical robots have been shown to enhance precision,
minimize invasiveness, and alleviate surgeon fatigue. One promising area of
research in surgical robotics is the use of reinforcement learning to enhance
the automation level. Reinforcement learning is a type of machine learning that
involves training an agent to make decisions based on rewards and punishments.
This literature review aims to comprehensively analyze existing research on
reinforcement learning in surgical robotics. The review identified various
applications of reinforcement learning in surgical robotics, including
pre-operative, intra-body, and percutaneous procedures, listed the typical
studies, and compared their methodologies and results. The findings show that
reinforcement learning has great potential to improve the autonomy of surgical
robots. Reinforcement learning can teach robots to perform complex surgical
tasks, such as suturing and tissue manipulation. It can also improve the
accuracy and precision of surgical robots, making them more effective at
performing surgeries.
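The reward-and-punishment training loop described in the abstract can be pictured with a minimal tabular Q-learning agent. This is a generic RL illustration on a made-up toy task, not any surgical-robotics method from the survey; all states, rewards, and hyperparameters below are hypothetical.

```python
import random

# Minimal tabular Q-learning on a toy 1-D "reach the target" task.
# States 0..4; the agent moves left/right and is rewarded at state 4.
# Purely illustrative of reward-driven training of an agent.
N_STATES, ACTIONS = 5, (-1, +1)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else -0.01  # small punishment per move
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
for _ in range(200):                               # training episodes
    s, done = 0, False
    while not done:
        a = random.choice(ACTIONS) if random.random() < EPSILON \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# After training, the greedy policy moves right toward the reward.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)
```

The same reward-shaping idea scales, in the surveyed work, to rewarding task progress (e.g., needle placement accuracy) instead of a toy target state.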
Co-Attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery
Medical students and junior surgeons often rely on senior surgeons and
specialists to answer their questions when learning surgery. However, experts
are often busy with clinical and academic work, and have little time to give
guidance. Meanwhile, existing deep learning (DL)-based surgical Visual Question
Answering (VQA) systems can only provide simple answers without the location of
the answers. In addition, vision-language (ViL) embedding remains a less
explored research direction in these tasks. Therefore, a surgical Visual
Question Localized-Answering (VQLA) system would be helpful for medical
students and junior surgeons to learn and understand from recorded surgical
videos. We propose an end-to-end Transformer with Co-Attention gaTed
Vision-Language (CAT-ViL) for VQLA in surgical scenarios, which does not
require feature extraction through detection models. The CAT-ViL embedding
module is designed to fuse heterogeneous features from visual and textual
sources. The fused embedding then feeds a standard Data-Efficient Image
Transformer (DeiT) module, before the parallel classifier and detector for
joint prediction. We conduct the experimental validation on public surgical
videos from MICCAI EndoVis Challenge 2017 and 2018. The experimental results
highlight the superior performance and robustness of our proposed model
compared to the state-of-the-art approaches. Ablation studies further prove the
outstanding performance of all the proposed components. The proposed method
provides a promising solution for surgical scene understanding, and opens up a
primary step in the Artificial Intelligence (AI)-based VQLA system for surgical
training. Our code is publicly available.
Comment: To appear in MICCAI 2023. Code availability: https://github.com/longbai1006/CAT-Vi
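The gated fusion of heterogeneous visual and textual features can be sketched, in miniature, as a per-dimension gate that decides how much of each modality to keep. The vectors and gate weights below are made up for illustration; this is the generic gating idea, not the actual CAT-ViL embedding module.

```python
import math

# Toy sketch of gated fusion of a visual and a textual embedding:
# a gate value in (0, 1) per dimension mixes the two modalities.
# All numbers here are hypothetical.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fuse(visual, textual, gate_logits):
    assert len(visual) == len(textual) == len(gate_logits)
    fused = []
    for v, t, g in zip(visual, textual, gate_logits):
        w = sigmoid(g)                     # gate in (0, 1)
        fused.append(w * v + (1.0 - w) * t)
    return fused

visual  = [0.2, -1.0, 0.5]
textual = [1.0,  0.4, -0.3]
gate    = [10.0, -10.0, 0.0]               # ~visual, ~textual, even mix
fused = gated_fuse(visual, textual, gate)
print([round(x, 3) for x in fused])
```

In a trained model the gate logits would themselves be produced by the co-attention between the two modalities rather than fixed by hand.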
Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery
The visual-question localized-answering (VQLA) system can serve as a
knowledgeable assistant in surgical education. Besides providing text-based
answers, the VQLA system can highlight the region of interest for better
surgical scene understanding. However, deep neural networks (DNNs) suffer from
catastrophic forgetting when learning new knowledge. Specifically, when DNNs
learn on incremental classes or tasks, their performance on old tasks drops
dramatically. Furthermore, due to medical data privacy and licensing issues, it
is often difficult to access old data when updating continual learning (CL)
models. Therefore, we develop a non-exemplar continual surgical VQLA framework,
to explore and balance the rigidity-plasticity trade-off of DNNs in a
sequential learning paradigm. We revisit the distillation loss in CL tasks, and
propose rigidity-plasticity-aware distillation (RP-Dist) and self-calibrated
heterogeneous distillation (SH-Dist) to preserve the old knowledge. The weight
aligning (WA) technique is also integrated to adjust the weight bias between
old and new tasks. We further establish a CL framework on three public surgical
datasets, in surgical settings with overlapping classes between old and new
surgical VQLA tasks. With extensive experiments, we demonstrate that our
proposed method reconciles learning and forgetting on continual surgical VQLA
better than conventional CL methods. Our
code is publicly accessible.
Comment: To appear in MICCAI 2023. Code availability: https://github.com/longbai1006/CS-VQL
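Distillation losses of the kind revisited here typically compare temperature-softened output distributions of the old (teacher) and new (student) models. A minimal sketch of plain knowledge distillation follows; the logits are invented, and this is the standard soft-target KL loss, not the proposed RP-Dist or SH-Dist variants.

```python
import math

# Generic soft-target distillation: KL divergence between the old
# model's and the new model's temperature-softened output
# distributions. T > 1 smooths the logits so information in the
# non-target classes is preserved. Logits below are hypothetical.
def softmax(logits, T=1.0):
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_kl(teacher_logits, student_logits, T=2.0):
    p = softmax(teacher_logits, T)     # old-model targets
    q = softmax(student_logits, T)     # new-model predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.0, 1.0, 0.1]
print(distill_kl(teacher, teacher))               # identical outputs -> 0.0
print(distill_kl(teacher, [0.1, 1.0, 2.0]) > 0)   # mismatch -> positive loss
```

The paper's contribution lies in how this basic loss is reweighted and calibrated across old and new tasks, which the sketch does not attempt to reproduce.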
Sim-to-Real Segmentation in Robot-assisted Transoral Tracheal Intubation
Robotic-assisted tracheal intubation requires the robot to distinguish
anatomical features as an experienced physician would, using deep-learning
techniques. However, real datasets of oropharyngeal organs are limited due to
patient privacy issues, making it challenging to train deep-learning models for
accurate image segmentation. We hereby consider generating a new data modality
through a virtual environment to assist the training process. Specifically,
this work introduces a virtual dataset generated by the Simulation Open
Framework Architecture (SOFA) framework to overcome the limited availability of
actual endoscopic images. We also propose a domain adaptive Sim-to-Real method
for oropharyngeal organ image segmentation, which employs an image blending
strategy called IoU-Ranking Blend (IRB) and style-transfer techniques to
address discrepancies between datasets. Experimental results demonstrate the
superior performance of the proposed approach with domain adaptive models,
improving segmentation accuracy and training stability. In the practical
application, the trained segmentation model holds great promise for
robot-assisted intubation surgery and intelligent surgical navigation.
Comment: Extended abstract in IEEE ICRA 2023 Workshop (New Evolutions in Surgical Robotics: Embracing Multimodal Imaging Guidance, Intelligence, and Bio-inspired Mechanisms)
Domain Adaptive Sim-to-Real Segmentation of Oropharyngeal Organs
Video-assisted transoral tracheal intubation (TI) necessitates using an
endoscope that helps the physician insert a tracheal tube into the glottis
instead of the esophagus. The growing trend of robotic-assisted TI would
require a medical robot to distinguish anatomical features like an experienced
physician, a capability that can be imitated using supervised deep-learning
techniques. However, real datasets of oropharyngeal organs are often
inaccessible due to limited open-source data and patient privacy. In this work,
we propose a domain adaptive Sim-to-Real framework called IoU-Ranking
Blend-ArtFlow (IRB-AF) for image segmentation of oropharyngeal organs. The
framework includes an image blending strategy called IoU-Ranking Blend (IRB)
and style-transfer method ArtFlow. Here, IRB alleviates the problem of poor
segmentation performance caused by significant domain differences between datasets;
while ArtFlow is introduced to reduce the discrepancies between datasets
further. A virtual oropharynx image dataset generated by the SOFA framework is
used as the learning subject for semantic segmentation to deal with the limited
availability of actual endoscopic images. We adapted IRB-AF with the
state-of-the-art domain adaptive segmentation models. The results demonstrate
the superior performance of our approach in further improving the segmentation
accuracy and training stability.
Comment: The manuscript is accepted by Medical & Biological Engineering & Computing. Code and dataset: https://github.com/gkw0010/EISOST-Sim2Real-Dataset-Releas
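The "IoU-Ranking" part of IRB can be illustrated with a plain mask-IoU computation: candidate blends are scored by how well their masks overlap a reference, then ranked. This is a simplified sketch with made-up masks; the actual IRB selection logic in the paper may differ in detail.

```python
# Simplified illustration of ranking candidate blends by mask IoU.
# Each mask is a set of foreground pixel coordinates; candidates with
# higher overlap against a reference mask rank first.
def mask_iou(a, b):
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def rank_by_iou(reference, candidates):
    return sorted(candidates,
                  key=lambda item: mask_iou(reference, item[1]),
                  reverse=True)

reference = [(0, 0), (0, 1), (1, 0), (1, 1)]
candidates = [
    ("blend_a", [(0, 0), (0, 1)]),             # IoU = 2/4
    ("blend_b", [(0, 0), (0, 1), (1, 0)]),     # IoU = 3/4
    ("blend_c", [(5, 5)]),                     # IoU = 0
]
ranked = [name for name, _ in rank_by_iou(reference, candidates)]
print(ranked)
```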
Privacy-Preserving Synthetic Continual Semantic Segmentation for Robotic Surgery
Deep Neural Network (DNN)-based semantic segmentation of robotic instruments and tissues can enhance the precision of surgical activities in robot-assisted surgery. However, unlike biological learning, DNNs cannot learn incremental tasks over time and exhibit catastrophic forgetting: a sharp decline in performance on previously learned tasks after learning a new one. Specifically, when data are scarce, the model shows a rapid drop in performance on previously learned instruments after learning new data with new instruments. The problem worsens when privacy concerns prevent releasing the old instruments' dataset for the old model, and when data for new or updated instrument versions are unavailable to the continual learning model. We therefore develop a privacy-preserving synthetic continual semantic segmentation framework by blending and harmonizing (i) open-source old-instrument foregrounds onto synthesized backgrounds, without revealing real patient data in public, and (ii) new-instrument foregrounds onto extensively augmented real backgrounds. To boost balanced logit distillation from the old model to the continual learning model, we design overlapping class-aware temperature normalization (CAT) by controlling model learning utility. We also introduce multi-scale shifted-feature distillation (SD) to maintain long- and short-range spatial relationships among semantic objects, where conventional short-range spatial features with limited information reduce the power of feature distillation. We demonstrate the effectiveness of our framework on the EndoVis 2017 and 2018 instrument segmentation datasets in a generalized continual learning setting. Code is available at https://github.com/XuMengyaAmy/Synthetic_CAT_SD
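The blending step of such synthetic-data pipelines can be pictured as basic alpha compositing of an instrument foreground onto a background image under a mask. The toy grayscale grids below are hypothetical, and the paper's harmonization is far more involved; this only shows the elementary composite `out = alpha * fg + (1 - alpha) * bg` per masked pixel.

```python
# Toy alpha compositing of an instrument "foreground" onto a
# background "image" (tiny grayscale grids). Purely illustrative;
# the real framework adds harmonization on top of the composite.
def composite(bg, fg, mask, alpha=1.0):
    out = [row[:] for row in bg]                  # copy background
    for i, row in enumerate(mask):
        for j, m in enumerate(row):
            if m:                                 # instrument pixel
                out[i][j] = alpha * fg[i][j] + (1 - alpha) * bg[i][j]
    return out

bg   = [[10, 10], [10, 10]]
fg   = [[200, 200], [200, 200]]
mask = [[1, 0], [0, 1]]                           # diagonal instrument
print(composite(bg, fg, mask))                    # fg pasted on the diagonal
```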
Generalizing Surgical Instruments Segmentation to Unseen Domains with One-to-Many Synthesis
Despite their impressive performance in various surgical scene understanding
tasks, deep learning-based methods are frequently hindered from deploying to
real-world surgical applications for various reasons. In particular, data
collection, annotation, and domain shift between sites and patients are the
most common obstacles. In this work, we mitigate data-related issues by
efficiently leveraging minimal source images to generate synthetic surgical
instrument segmentation datasets and achieve outstanding generalization
performance on unseen real domains. Specifically, in our framework, only one
background tissue image and at most three images of each foreground instrument
are taken as the seed images. These source images are extensively transformed
and employed to build up the foreground and background image pools, from which
randomly sampled tissue and instrument images are composed with multiple
blending techniques to generate new surgical scene images. Besides, we
introduce hybrid training-time augmentations to diversify the training data
further. Extensive evaluation on three real-world datasets, i.e., Endo2017,
Endo2018, and RoboTool, demonstrates that our one-to-many synthetic surgical
instruments datasets generation and segmentation framework can achieve
encouraging performance compared with training with real data. Notably, on the
RoboTool dataset, where a larger domain gap exists, our framework demonstrates
superior generalization by a considerable margin. We expect these results to
attract research attention to improving model generalization through data
synthesis.
Comment: First two authors contributed equally. Accepted by IROS202
The Quantitative Diagnosis Method of Rubbing Rotor System
The dynamics of the rubbing rotor system are analyzed by applying the harmonic balance method. The relationship between the harmonic components in the response of the rubbing rotor system and the dynamic stiffness matrix of the fault-free rotor system is revealed, based on which a new model-based method for rubbing identification is presented. With this method, the fault location and rubbing forces of a single-rubbing rotor system can be identified from vibration data of only two nodes, and the rubbing locations and forces of a double-rubbing rotor system from vibration data of three nodes. Numerical simulations and experiments on a rotor test rig verify the efficiency of the present method.
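The harmonic-balance relation the abstract alludes to can be sketched in generic rotordynamics notation (the symbols here are standard textbook quantities, not necessarily the paper's): expand the steady-state response in harmonics of the rotation speed, and each harmonic of the unknown rub force then follows from the fault-free dynamic stiffness acting on the measured response.

```latex
x(t) = \sum_{k} X_k \, e^{\mathrm{i} k \omega t}, \qquad
Z_k(\omega) = K - (k\omega)^2 M + \mathrm{i}\, k\omega\, C, \qquad
F_k^{\mathrm{rub}} = Z_k(\omega)\, X_k - F_k^{\mathrm{ext}}
```

Here $M$, $C$, $K$ are the fault-free mass, damping, and stiffness matrices, $Z_k$ the dynamic stiffness at harmonic $k$, and $F_k^{\mathrm{ext}}$ the known excitation (e.g., unbalance); nonzero residual entries of $F_k^{\mathrm{rub}}$ indicate the rubbing location. This is a generic sketch of the technique, not the paper's exact formulation.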