Search CORE

638 research outputs found

Temporal Action Segmentation: An Analysis of Modern Techniques

Author: Ding Guodong
Sener Fadime
Yao Angela
Publication venue
Publication date: 12/08/2023
Field of study

Temporal action segmentation (TAS) in videos aims at densely identifying video frames in minutes-long videos with multiple action classes. As a long-range video understanding task, researchers have developed an extended collection of methods and examined their performance using various benchmarks. Despite the rapid growth of TAS techniques in recent years, no systematic survey has been conducted in these sectors. This survey analyzes and summarizes the most significant contributions and trends. In particular, we first examine the task definition, common benchmarks, types of supervision, and prevalent evaluation measures. In addition, we systematically investigate two essential techniques of this topic, i.e., frame representation and temporal modeling, which have been studied extensively in the literature. We then conduct a thorough review of existing TAS works categorized by their levels of supervision and conclude our survey by identifying and emphasizing several research gaps. In addition, we have curated a list of TAS resources, which is available at https://github.com/nus-cvml/awesome-temporal-action-segmentation.Comment: 19 pages, 9 figures, 8 table

arXiv.org e-Print Archive

CataNet: Predicting remaining cataract surgery duration

Author: Gallardo Mathias
Hayoz Michel
Marafioti Andrés
Neila Pablo Márquez
Sznitman Raphael
Wolf Sebastian
Zinkernagel Martin
Publication venue
Publication date: 21/06/2021
Field of study

Cataract surgery is a sight saving surgery that is performed over 10 million times each year around the world. With such a large demand, the ability to organize surgical wards and operating rooms efficiently is critical to delivery this therapy in routine clinical care. In this context, estimating the remaining surgical duration (RSD) during procedures is one way to help streamline patient throughput and workflows. To this end, we propose CataNet, a method for cataract surgeries that predicts in real time the RSD jointly with two influential elements: the surgeon's experience, and the current phase of the surgery. We compare CataNet to state-of-the-art RSD estimation methods, showing that it outperforms them even when phase and experience are not considered. We investigate this improvement and show that a significant contributor is the way we integrate the elapsed time into CataNet's feature extractor.Comment: Accepted at MICCAI 202

arXiv.org e-Print Archive

Bern Open Repository and Information System (BORIS)

Automatic Detection of Out-of-body Frames in Surgical Videos for Privacy Protection Using Self-supervised Learning and Minimal Labels

Author: Jarc Anthony
Liu Xi
Perreault Conor
Wang Ziheng
Publication venue
Publication date: 31/03/2023
Field of study

Endoscopic video recordings are widely used in minimally invasive robot-assisted surgery, but when the endoscope is outside the patient's body, it can capture irrelevant segments that may contain sensitive information. To address this, we propose a framework that accurately detects out-of-body frames in surgical videos by leveraging self-supervision with minimal data labels. We use a massive amount of unlabeled endoscopic images to learn meaningful representations in a self-supervised manner. Our approach, which involves pre-training on an auxiliary task and fine-tuning with limited supervision, outperforms previous methods for detecting out-of-body frames in surgical videos captured from da Vinci X and Xi surgical systems. The average F1 scores range from 96.00 to 98.02. Remarkably, using only 5% of the training labels, our approach still maintains an average F1 score performance above 97, outperforming fully-supervised methods with 95% fewer labels. These results demonstrate the potential of our framework to facilitate the safe handling of surgical video recordings and enhance data privacy protection in minimally invasive surgery.Comment: A 15-page journal article submitted to Journal of Medical Robotics Research (JMRR

arXiv.org e-Print Archive

SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge

Surgical tool segmentation and action recognition are fundamental building blocks in many computer-assisted intervention applications, ranging from surgical skills assessment to decision support systems. Nowadays, learning-based action recognition and segmentation approaches outperform classical methods, relying, however, on large, annotated datasets. Furthermore, action recognition and tool segmentation algorithms are often trained and make predictions in isolation from each other, without exploiting potential cross-task relationships. With the EndoVis 2022 SAR-RARP50 challenge, we release the first multimodal, publicly available, in-vivo, dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robotic Assisted Radical Prostatectomy (RARP). The aim of the challenge is twofold. First, to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain. Second, to further explore the potential of multitask-based learning approaches and determine their comparative advantage against their single-task counterparts. A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation. The complete SAR-RARP50 dataset is available at: https://rdr.ucl.ac.uk/projects/SARRARP50_Segmentation_of_surgical_instrumentation_and_Action_Recognition_on_Robot-Assisted_Radical_Prostatectomy_Challenge/19109

arXiv.org e-Print Archive

Generic Object Detection and Segmentation for Real-World Environments

Author: Johansen Anders Skaarup
Publication venue: Aalborg Universitetsforlag
Publication date: 01/01/2023
Field of study

VBN

Unveiling healthcare data archiving: Exploring the role of artificial intelligence in medical image analysis

Author: F. VILLANI
Publication venue
Publication date: 01/01/2024
Field of study

Gli archivi sanitari digitali possono essere considerati dei moderni database progettati per immagazzinare e gestire ingenti quantità di informazioni mediche, dalle cartelle cliniche dei pazienti, a studi clinici fino alle immagini mediche e a dati genomici. I dati strutturati e non strutturati che compongono gli archivi sanitari sono oggetto di scrupolose e rigorose procedure di validazione per garantire accuratezza, affidabilità e standardizzazione a fini clinici e di ricerca. Nel contesto di un settore sanitario in continua e rapida evoluzione, l’intelligenza artificiale (IA) si propone come una forza trasformativa, capace di riformare gli archivi sanitari digitali migliorando la gestione, l’analisi e il recupero di vasti set di dati clinici, al fine di ottenere decisioni cliniche più informate e ripetibili, interventi tempestivi e risultati migliorati per i pazienti. Tra i diversi dati archiviati, la gestione e l’analisi delle immagini mediche in archivi digitali presentano numerose sfide dovute all’eterogeneità dei dati, alla variabilità della qualità delle immagini, nonché alla mancanza di annotazioni. L’impiego di soluzioni basate sull’IA può aiutare a risolvere efficacemente queste problematiche, migliorando l’accuratezza dell’analisi delle immagini, standardizzando la qualità dei dati e facilitando la generazione di annotazioni dettagliate. Questa tesi ha lo scopo di utilizzare algoritmi di IA per l’analisi di immagini mediche depositate in archivi sanitari digitali. Il presente lavoro propone di indagare varie tecniche di imaging medico, ognuna delle quali è caratterizzata da uno specifico dominio di applicazione e presenta quindi un insieme unico di sfide, requisiti e potenziali esiti. In particolare, in questo lavoro di tesi sarà oggetto di approfondimento l’assistenza diagnostica degli algoritmi di IA per tre diverse tecniche di imaging, in specifici scenari clinici: i) Immagini endoscopiche ottenute durante esami di laringoscopia; ciò include un’esplorazione approfondita di tecniche come la detection di keypoints per la stima della motilità delle corde vocali e la segmentazione di tumori del tratto aerodigestivo superiore; ii) Immagini di risonanza magnetica per la segmentazione dei dischi intervertebrali, per la diagnosi e il trattamento di malattie spinali, così come per lo svolgimento di interventi chirurgici guidati da immagini; iii) Immagini ecografiche in ambito reumatologico, per la valutazione della sindrome del tunnel carpale attraverso la segmentazione del nervo mediano. Le metodologie esposte in questo lavoro evidenziano l’efficacia degli algoritmi di IA nell’analizzare immagini mediche archiviate. I progressi metodologici ottenuti sottolineano il notevole potenziale dell’IA nel rivelare informazioni implicitamente presenti negli archivi sanitari digitali

Archivio istituzionale della ricerca - Università di Macerata

Deep Retinal Optical Flow: From Synthetic Dataset Generation to Framework Creation and Evaluation

Author: Ravasio Claudio S.
Publication venue: UCL (University College London)
Publication date: 28/10/2022
Field of study

Sustained delivery of regenerative retinal therapies by robotic systems requires intra-operative tracking of the retinal fundus. This thesis presents a supervised convolutional neural network to densely predict optical flow of the retinal fundus, using semantic segmentation as an auxiliary task. Retinal flow information missing due to occlusion by surgical tools or other effects is implicitly inpainted, allowing for the robust tracking of surgical targets. As manual annotation of optical flow is infeasible, a flexible algorithm for the generation of large synthetic training datasets on the basis of given intra-operative retinal images and tool templates is developed. The compositing of synthetic images is approached as a layer-wise operation implementing a number of transforms at every level which can be extended as required, mimicking the various phenomena visible in real data. Optical flow ground truth is calculated from motion transforms with the help of oflib, an open-source optical flow library available from the Python Package Index. It enables the user to manipulate, evaluate, and combine flow fields. The PyTorch version of oflib is fully differentiable and therefore suitable for use in deep learning methods requiring back-propagation. The optical flow estimation from the network trained on synthetic data is evaluated using three performance metrics obtained from tracking a grid and sparsely annotated ground truth points. The evaluation benchmark consists of a series of challenging real intra-operative clips obtained from an extensive internally acquired dataset encompassing representative surgical cases. The deep learning approach clearly outperforms variational baseline methods and is shown to generalise well to real data showing scenarios routinely observed during vitreoretinal procedures. This indicates complex synthetic training datasets can be used to specifically guide optical flow estimation, laying the foundation for a robust system which can assist with intra-operative tracking of moving surgical targets even when occluded

UCL Discovery

Artificial Intelligence for Emerging Technology in Surgery: Systematic Review and Validation

Author: Anyanwu Tobenna
Gao Bin
Nwoye Ephraim
Woo Wai Lok
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/06/2022
Field of study

Surgery is a high-risk procedure of therapy and is associated to post trauma complications of longer hospital stay, estimated blood loss and long duration of surgeries. Reports have suggested that over 2.5% patients die during and post operation. This paper is aimed at systematic review of previous research on artificial intelligence (AI) in surgery, analyzing their results with suitable software to validate their research by obtaining same or contrary results. Six published research articles have been reviewed across three continents. These articles have been re-validated using software including SPSS and MedCalc to obtain the statistical features such as the mean, standard deviation, significant level, and standard error. From the significant values, the experiments are then classified according to the null (p0.05) hypotheses. The results obtained from the analysis have suggested significant difference in operating time, docking time, staging time, and estimated blood loss but show no significant difference in length of hospital stay, recovery time and lymph nodes harvested between robotic assisted surgery using AI and normal conventional surgery. From the evaluations, this research suggests that AI-assisted surgery improves over the conventional surgery as safer and more efficient system of surgery with minimal or no complications

Northumbria University Research Portal