Investigation of shale gas flows under confinement using a self-consistent multiscale approach
This report summarises our recent findings on non-ideal gas flow characteristics in shale nanopores, obtained with our strongly-inhomogeneous kinetic model. Because a significant portion of shale pores are at the nanoscale, the gas molecule size is comparable to both the gas mean free path and the characteristic length of the flow domain, so various factors, including fluid-solid and fluid-fluid interactions, pore confinement, surface roughness, and non-equilibrium effects, need to be considered in a self-consistent manner. These factors are accounted for either in the governing equation, following dense gas and mean-field theories, or through the boundary condition, based on molecular dynamics simulations. Our kinetic model results agree with molecular dynamics data at the molecular scale and converge to continuum predictions as pores become large. The model thus serves as an accurate tool for investigating multiscale transport of shale gas and supports upscaling from the microscopic to the continuum level on a firm theoretical basis.

Cited as: Shan, B., Ju, L., Guo, Z., Zhang, Y. Investigation of shale gas flows under confinement using a self-consistent multiscale approach. Advances in Geo-Energy Research, 2022, 6(6): x-x. https://doi.org/10.46690/ager.2022.06.1
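To make the modelling idea concrete, here is a minimal schematic of the kind of governing equation such a model builds on: a kinetic equation with a mean-field force term derived from the fluid-fluid and fluid-solid potentials. The specific collision operator and force closure used by the authors are not given in the abstract, so this form is an assumption for illustration only.

```latex
% Schematic kinetic equation for the velocity distribution f(x, xi, t):
% streaming plus a mean-field force from fluid-fluid/fluid-solid potentials,
% balanced by a collision operator Omega(f) (e.g., an Enskog-type term
% capturing dense-gas effects). Illustrative form only.
\begin{equation}
  \frac{\partial f}{\partial t}
  + \boldsymbol{\xi} \cdot \nabla_{\boldsymbol{x}} f
  + \frac{\boldsymbol{F}_{\mathrm{mf}}}{m} \cdot \nabla_{\boldsymbol{\xi}} f
  = \Omega(f),
  \qquad
  \boldsymbol{F}_{\mathrm{mf}} = -\nabla_{\boldsymbol{x}}
  \bigl( \phi_{\mathrm{ff}} + \phi_{\mathrm{fs}} \bigr)
\end{equation}
```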
Visual-Kinematics Graph Learning for Procedure-agnostic Instrument Tip Segmentation in Robotic Surgeries
Accurate segmentation of the surgical instrument tip is an important task for enabling downstream applications in robotic surgery, such as surgical skill assessment, tool-tissue interaction and deformation modeling, as well as surgical autonomy. However, this task is very challenging due to the small size of surgical instrument tips and the significant variance of surgical scenes across different procedures. Although much effort has been devoted to vision-based methods, existing segmentation models still suffer from low robustness and are thus not usable in practice. Fortunately, kinematics data from the robotic system can provide a reliable prior for instrument location that is consistent across surgery types. To make use of such multi-modal information, we propose a novel visual-kinematics graph learning framework to accurately segment the instrument tip across various surgical procedures. Specifically, a graph learning framework is proposed to encode relational features of instrument parts from both images and kinematics. Next, a cross-modal contrastive loss is designed to transfer a robust geometric prior from kinematics to images for tip segmentation. We have conducted experiments on a private paired visual-kinematics dataset covering multiple procedures, i.e., prostatectomy, total mesorectal excision, fundoplication and distal gastrectomy on cadaver, and distal gastrectomy on porcine. Leave-one-procedure-out cross-validation demonstrated that our proposed multi-modal segmentation method significantly outperformed current image-based state-of-the-art approaches, exceeding them by 11.2% in Dice on average.

Comment: Accepted to IROS 202
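Since the gains are reported in Dice, a minimal sketch of the Dice coefficient on binary masks may help; this is NumPy-based, and the function name and smoothing term are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for binary segmentation masks.

    `pred` and `target` are boolean (or {0,1}) arrays of the same shape;
    `eps` avoids division by zero when both masks are empty.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))
```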
AutoLaparo: A New Dataset of Integrated Multi-tasks for Image-guided Surgical Automation in Laparoscopic Hysterectomy
Computer-assisted minimally invasive surgery has great potential to benefit modern operating theatres. The video data streamed from the endoscope provides rich information to support context-awareness for next-generation intelligent surgical systems. To achieve accurate perception and automatic manipulation during the procedure, learning-based techniques are a promising way forward, having enabled advanced image analysis and scene understanding in recent years. However, learning such models relies heavily on large-scale, high-quality, multi-task labelled data. This is currently a bottleneck for the topic, as publicly available datasets in the field of CAI are still extremely limited. In this paper, we present and release the first integrated dataset (named AutoLaparo) with multiple image-based perception tasks to facilitate learning-based automation in hysterectomy surgery. Our AutoLaparo dataset is developed from full-length videos of entire hysterectomy procedures. Specifically, three different yet highly correlated tasks are formulated in the dataset: surgical workflow recognition, laparoscope motion prediction, and instrument and key anatomy segmentation. In addition, we provide experimental results with state-of-the-art models as reference benchmarks for further model development and evaluation on this dataset. The dataset is available at https://autolaparo.github.io.

Comment: Accepted at MICCAI 202
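As a rough illustration only, a multi-task sample bundling the three annotation types might be organized as below; all field names and shapes are hypothetical, and the actual AutoLaparo format is documented on the project page.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class LaparoSample:
    """Hypothetical per-frame record bundling the three task labels."""
    frame: np.ndarray     # RGB image, shape (H, W, 3)
    phase_label: int      # workflow recognition: surgical phase index
    motion_label: int     # laparoscope motion prediction: next-move class
    seg_mask: np.ndarray  # per-pixel instrument/anatomy ids, shape (H, W)
```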
Distilled Visual and Robot Kinematics Embeddings for Metric Depth Estimation in Monocular Scene Reconstruction
Estimating precise metric depth and reconstructing the scene from monocular endoscopy is a fundamental task for surgical navigation in robotic surgery. However, traditional stereo matching relies on binocular images to perceive depth, which is difficult to transfer to soft-robotics-based surgical systems that use a monocular endoscope. In this paper, we present a novel framework that combines robot kinematics and monocular endoscope images with deep unsupervised learning into a single network for metric depth estimation, and then achieves 3D reconstruction of complex anatomy. Specifically, we first obtain relative depth maps of surgical scenes by leveraging a brightness-aware monocular depth estimation method. Then, the corresponding endoscope poses are computed by non-linear optimization of geometric and photometric reprojection residuals. Afterwards, we develop a Depth-driven Sliding Optimization (DDSO) algorithm to extract the scaling coefficient from the kinematics and the calculated poses offline. By coupling the metric scale with the relative depth data, we form a robust ensemble that represents metric, consistent depth. Next, we treat the ensemble as supervisory labels to train a metric depth estimation network for surgeries (i.e., MetricDepthS-Net) that distills the embeddings from the robot kinematics, endoscopic videos, and poses. With accurate metric depth estimation, we utilize a dense visual reconstruction method to recover the 3D structure of the whole surgical site. We have extensively evaluated the proposed framework on the public SCARED dataset and achieved performance comparable to stereo-based depth estimation methods. Our results demonstrate the feasibility of the proposed approach to recover metric depth and 3D structure from monocular inputs.
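The core idea of recovering a metric scale from kinematics can be illustrated with a least-squares alignment between up-to-scale and kinematics-derived translation magnitudes. This is a simplified stand-in for the paper's DDSO algorithm, whose details are not given in the abstract; the function and variable names are assumptions.

```python
import numpy as np

def recover_scale(rel_translations: np.ndarray, kin_translations: np.ndarray) -> float:
    """Least-squares scale s minimizing ||s * d_rel - d_kin||^2.

    `rel_translations`: per-step camera translation magnitudes from the
    up-to-scale visual odometry; `kin_translations`: the corresponding
    metric magnitudes derived from robot forward kinematics.
    Closed form: s = <d_kin, d_rel> / <d_rel, d_rel>.
    """
    d_rel = np.asarray(rel_translations, dtype=float)
    d_kin = np.asarray(kin_translations, dtype=float)
    return float(d_kin @ d_rel / (d_rel @ d_rel))

# Metric depth is then the recovered scale times the relative depth map:
# depth_metric = recover_scale(d_rel, d_kin) * depth_relative
```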
Stereo Dense Scene Reconstruction and Accurate Localization for Learning-Based Navigation of Laparoscope in Minimally Invasive Surgery
Objective: The computation of anatomical information and the laparoscope position is a fundamental building block of surgical navigation in Minimally Invasive Surgery (MIS). Recovering a dense 3D structure of the surgical scene using visual cues remains a challenge, and online laparoscope tracking primarily relies on external sensors, which increases system complexity. Methods: Here, we propose a learning-driven framework in which image-guided laparoscope localization is obtained together with 3D reconstructions of complex anatomical structures. To reconstruct the 3D structure of the whole surgical environment, we first fine-tune a learning-based stereoscopic depth perception method, which is robust to texture-less and deformable soft tissues, for depth estimation. Then, we develop a dense visual reconstruction algorithm that represents the scene by surfels, estimates the laparoscope poses, and fuses the depth maps into a unified reference coordinate frame for tissue reconstruction. To estimate the poses of new laparoscope views, we devise a coarse-to-fine localization method that incorporates our reconstructed 3D model. Results: We evaluate the reconstruction method and the localization module on three datasets, namely, the stereo correspondence and reconstruction of endoscopic data (SCARED), ex-vivo phantom and tissue data collected with a Universal Robot (UR) and a Karl Storz laparoscope, and an in-vivo DaVinci robotic surgery dataset. The reconstructed 3D structures have rich surface-texture details with an accuracy error under 1.71 mm, and the localization module can accurately track the laparoscope with only images as input. Conclusions: Experimental results demonstrate the superior performance of the proposed method in 3D anatomy reconstruction and laparoscopic localization. Significance: The proposed framework can potentially be extended to current surgical navigation systems.
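Fusing depth maps into a unified reference frame boils down to back-projecting each pixel with the camera intrinsics and transforming the resulting points by the estimated pose. Below is a minimal NumPy sketch of that single step; the surfel bookkeeping of the actual method is omitted, and all names are illustrative.

```python
import numpy as np

def backproject_to_world(depth: np.ndarray, K: np.ndarray, T_wc: np.ndarray) -> np.ndarray:
    """Lift a depth map to 3D points in a common world frame.

    `depth`: (H, W) metric depth map; `K`: 3x3 camera intrinsics;
    `T_wc`: 4x4 camera-to-world pose of this laparoscope view.
    Returns an (H*W, 3) point cloud expressed in the world frame.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # (3, H*W)
    rays = np.linalg.inv(K) @ pix                # back-projected pixel rays
    pts_cam = rays * depth.reshape(1, -1)        # scale rays by depth
    pts_h = np.vstack([pts_cam, np.ones((1, pts_cam.shape[1]))])  # homogeneous
    return (T_wc @ pts_h)[:3].T
```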
Comparative validation of machine learning algorithms for surgical workflow and skill analysis with the HeiChole benchmark
Purpose: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance, or improve the training of surgeons via data-driven feedback. In surgical workflow analysis, up to 91% average precision has been reported for phase recognition on an open-data, single-center video dataset. In this work, we investigated the generalizability of phase recognition algorithms in a multicenter setting, including more difficult recognition tasks such as surgical action and surgical skill.
Methods: To achieve this goal, a dataset with 33 laparoscopic cholecystectomy videos from three surgical centers with a total operation time of 22 h was created. Labels included framewise annotation of seven surgical phases with 250 phase transitions, 5514 occurrences of four surgical actions, 6980 occurrences of 21 surgical instruments from seven instrument categories, and 495 skill classifications in five skill dimensions. The dataset was used in the 2019 international Endoscopic Vision challenge, sub-challenge for surgical workflow and skill analysis. Here, 12 research teams trained and submitted their machine learning algorithms for phase, action and instrument recognition and/or skill assessment.
Results: F1-scores between 23.9% and 67.7% were achieved for phase recognition (n = 9 teams) and between 38.5% and 63.8% for instrument presence detection (n = 8 teams), but only between 21.8% and 23.3% for action recognition (n = 5 teams). The average absolute error for skill assessment was 0.78 (n = 1 team).
Conclusion: Surgical workflow and skill analysis are promising technologies to support the surgical team, but there is still room for improvement, as shown by our comparison of machine learning algorithms. This novel HeiChole benchmark can be used for the comparable evaluation and validation of future work. In future studies, it is of utmost importance to create more open, high-quality datasets in order to allow the development of artificial intelligence and cognitive robotics in surgery.
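For reference, the frame-wise F1-score used for phase recognition can be computed per phase and macro-averaged; the sketch below is illustrative, not the challenge's official evaluation script.

```python
import numpy as np

def macro_f1(pred: np.ndarray, target: np.ndarray, num_phases: int) -> float:
    """Macro-averaged frame-wise F1 over surgical phases.

    `pred` and `target` are 1D arrays of per-frame phase indices in
    [0, num_phases). Per phase, F1 = 2*TP / (2*TP + FP + FN); the
    per-phase scores are then averaged.
    """
    scores = []
    for phase in range(num_phases):
        tp = np.sum((pred == phase) & (target == phase))
        fp = np.sum((pred == phase) & (target != phase))
        fn = np.sum((pred != phase) & (target == phase))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))
```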