5,073 research outputs found

Estimating position & velocity in 3D space from monocular video sequences using a deep neural network

Author: Casals Gelpi Alicia
Fernández Ruzafa José
Marbán González Arturo
Samek Wojciech
Srinivasan Vignesh
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

This work describes a regression model based on Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks for tracking objects from monocular video sequences. The target application is Vision-Based Sensor Substitution (VBSS). In particular, the tool-tip position and velocity in 3D space of a pair of surgical robotic instruments (SRI) are estimated for three surgical tasks, namely suturing, needle-passing and knot-tying. The CNN extracts features from individual video frames, and the LSTM network processes these features over time and continuously outputs a 12-dimensional vector with the estimated position and velocity values. A series of analyses and experiments is carried out on the regression model to reveal the benefits and drawbacks of different design choices. First, the impact of the loss function is investigated by appropriately weighting the Root Mean Squared Error (RMSE) and Gradient Difference Loss (GDL), using the VGG16 neural network for feature extraction. Second, this analysis is extended to a Residual Neural Network designed for feature extraction, which has fewer parameters than the VGG16 model, resulting in a reduction of ~96.44% in the neural network size. Third, the impact of the number of time steps used to model the temporal information processed by the LSTM network is investigated. Finally, the capability of the regression model to generalize to data from "unseen" surgical tasks (unavailable in the training set) is evaluated. These analyses are experimentally validated on the public dataset JIGSAWS. They provide some guidelines for the design of a regression model in the context of VBSS, specifically when the objective is to estimate a set of 1D time series signals from video sequences. Peer Reviewed. Postprint (author's final draft)
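The weighted RMSE and GDL objective mentioned in the abstract above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact GDL formulation or the weights, so everything here is an assumption: the GDL is taken as the mean absolute difference between the absolute first-order temporal differences of the target and predicted signals, and the names `rmse`, `gradient_difference_loss`, `combined_loss`, `w_rmse` and `w_gdl` are hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error over all time steps and signal dimensions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def gradient_difference_loss(y_true, y_pred):
    """Assumed GDL: compares absolute temporal differences of both signals.

    Inputs have shape (T, D): T time steps, D signals (e.g. D = 12).
    """
    d_true = np.abs(np.diff(y_true, axis=0))  # shape (T - 1, D)
    d_pred = np.abs(np.diff(y_pred, axis=0))
    return float(np.mean(np.abs(d_true - d_pred)))


def combined_loss(y_true, y_pred, w_rmse=1.0, w_gdl=1.0):
    """Weighted sum of RMSE and GDL; the weights are illustrative only."""
    return (w_rmse * rmse(y_true, y_pred)
            + w_gdl * gradient_difference_loss(y_true, y_pred))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(50, 12))  # synthetic stand-in for 12 signals
    estimate = target + 0.1 * rng.normal(size=(50, 12))
    print(combined_loss(target, estimate, w_rmse=1.0, w_gdl=0.5))
```

In this sketch a constant offset between prediction and target is penalized only by the RMSE term, while the GDL term penalizes mismatches in how the signals change from one time step to the next.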

UPCommons. Portal del coneixement obert de la UPC

Fraunhofer-ePrints

Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation

Author: Abbeel Pieter
Agrawal Pulkit
Chen Dian
Isola Phillip
Levine Sergey
Malik Jitendra
Nair Ashvin
Publication date: 06/03/2017

Manipulation of deformable objects, such as ropes and cloth, is an important but challenging problem in robotics. We present a learning-based system where a robot takes as input a sequence of images of a human manipulating a rope from an initial to a goal configuration, and outputs a sequence of actions that can reproduce the human demonstration, using only monocular images as input. To perform this task, the robot learns a pixel-level inverse dynamics model of rope manipulation directly from images in a self-supervised manner, using about 60K interactions with the rope collected autonomously by the robot. The human demonstration provides a high-level plan of what to do, and the low-level inverse model is used to execute the plan. We show that by combining the high- and low-level plans, the robot can successfully manipulate a rope into a variety of target shapes using only a sequence of human-provided images for direction. Comment: 8 pages, accepted to International Conference on Robotics and Automation (ICRA) 201
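The division of labour described above, where a demonstration supplies the sub-goals and an inverse model supplies the actions, can be sketched in a toy setting. This is not the authors' system: the states here are 2D points rather than rope images, and `toy_inverse_model`, `apply_action`, `follow_demonstration`, `max_step` and `actions_per_subgoal` are hypothetical names and parameters introduced only for illustration.

```python
import numpy as np


def toy_inverse_model(current, goal, max_step=0.5):
    """Stand-in for a learned inverse model: returns an action that moves
    `current` towards `goal`, limited to `max_step` in length."""
    delta = goal - current
    norm = np.linalg.norm(delta)
    if norm > max_step:
        delta = delta * (max_step / norm)
    return delta


def apply_action(state, action):
    """Toy environment dynamics: the action is added to the state."""
    return state + action


def follow_demonstration(start, demo, inverse_model, step_fn,
                         actions_per_subgoal=3):
    """Treat each demonstration frame as a sub-goal (high-level plan) and
    query the inverse model for actions to reach it (low-level execution)."""
    state = np.asarray(start, dtype=float)
    actions = []
    for subgoal in demo:
        subgoal = np.asarray(subgoal, dtype=float)
        for _ in range(actions_per_subgoal):
            action = inverse_model(state, subgoal)
            state = step_fn(state, action)
            actions.append(action)
    return state, actions


if __name__ == "__main__":
    demo = [(1.0, 0.0), (1.0, 1.0)]  # toy "demonstration frames"
    final_state, actions = follow_demonstration(
        (0.0, 0.0), demo, toy_inverse_model, apply_action)
    print(final_state, len(actions))
```

In the system the abstract describes, the states would be monocular images and the inverse model would be learned from the robot's own interaction data; the loop structure is the only part this toy example is meant to convey.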

arXiv.org e-Print Archive

Crossref

Ergonomics of the Operative Field in Paediatric Minimal Access Surgery

Author: Lee Alex Chi Hang
Publication date: 01/01/2009

Imperial Users only

Spiral - Imperial College Digital Repository

Automated pick-up of suturing needles for robotic surgical assistance

Author: Chadebecq Francois
D'Ettorre Claudia
De Momi Elena
Du Xiaofei
Dwyer George
Stoyanov Danail
Vasconcelos Francisco
Publication date: 09/04/2018

Robot-assisted laparoscopic prostatectomy (RALP) is a treatment for prostate cancer that involves complete or nerve-sparing removal of prostate tissue that contains cancer. After removal, the bladder neck is successively sutured directly to the urethra. The procedure is called urethrovesical anastomosis and is one of the most dexterity-demanding tasks during RALP. Two suturing instruments and a pair of needles are used in combination to perform a running stitch during urethrovesical anastomosis. While robotic instruments provide enhanced dexterity to perform the anastomosis, it is still highly challenging and difficult to learn. In this paper, we present a vision-guided needle grasping method for automatically grasping the needle that has been inserted into the patient prior to anastomosis. We aim to automatically grasp the suturing needle in a position that avoids hand-offs and immediately enables the start of suturing. The full grasping process can be broken down into: a needle detection algorithm; an approach phase where the surgical tool moves closer to the needle based on visual feedback; and a grasping phase through path planning based on observed surgical practice. Our experimental results show examples of successful autonomous grasping that has the potential to simplify and decrease the operational time in RALP by assisting a small component of urethrovesical anastomosis.

arXiv.org e-Print Archive

Crossref

UCL Discovery

Data-driven robotic manipulation of cloth-like deformable objects : the present, challenges and future prospects

Author: Kadi Halid A.
Terzić Kasim
Publication venue: 'MDPI AG'
Publication date: 21/02/2023

Manipulating cloth-like deformable objects (CDOs) is a long-standing problem in the robotics community. CDOs are flexible (non-rigid) objects that do not show a detectable level of compression strength when two points on the article are pushed towards each other; they include objects such as ropes (1D), fabrics (2D) and bags (3D). In general, CDOs’ many degrees of freedom (DoF) introduce severe self-occlusion and complex state–action dynamics as significant obstacles to perception and manipulation systems. These challenges exacerbate existing issues of modern robotic control methods such as imitation learning (IL) and reinforcement learning (RL). This review focuses on the application details of data-driven control methods on four major task families in this domain: cloth shaping, knot tying/untying, dressing and bag manipulation. Furthermore, we identify specific inductive biases in these four domains that present challenges for more general IL and RL algorithms. Publisher PDF. Peer reviewed

University of St. Andrews - Pure

St Andrews Research Repository

Learning to tie the knot: The acquisition of functional object representations by physical and observational experience

Author: Cohen NR
Cross ES
Grafton ST
Hamilton AFDC
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 12/10/2017

Here we examined neural substrates for physically and observationally learning to construct novel objects, and characterized brain regions associated with each kind of learning using fMRI. Each participant was assigned a training partner, and for five consecutive days practiced tying one group of knots (“tied” condition) or watched their partner tie different knots (“watched” condition) while a third set of knots remained untrained. Functional MRI was obtained prior to and immediately following the week of training while participants performed a visual knot-matching task. After training, a portion of left superior parietal lobule demonstrated a training-by-scan-session interaction. That is, this parietal region responded selectively to knots that participants had physically learned to tie in the post-training scan session but not in the pre-training scan session. A conjunction analysis on the post-training scan data showed that right intraparietal sulcus and right dorsal premotor cortex responded when participants viewed images of knots from the tied and watched conditions, compared to knots that were untrained during the post-training scan session. This suggests that these brain areas track both physical and observational learning. Together, the data provide preliminary evidence of engagement of brain regions associated with hand-object interactions when viewing objects associated with physical experience, and with observational experience without concurrent physical practice.

UCL Discovery