Search CORE

2,386 research outputs found

Dynamic Face Video Segmentation via Reinforcement Learning

Author: Cheng Shiyang
Dong Mingzhi
Pantic Maja
Shen Jie
Wang Yujiang
Wu Yang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/02/2021
Field of study

For real-time semantic video segmentation, most recent works utilised a dynamic framework with a key scheduler to make online key/non-key decisions. Some works used a fixed key scheduling policy, while others proposed adaptive key scheduling methods based on heuristic strategies, both of which may lead to suboptimal global performance. To overcome this limitation, we model the online key decision process in dynamic video segmentation as a deep reinforcement learning problem and learn an efficient and effective scheduling policy from expert information about decision history and from the process of maximising global return. Moreover, we study the application of dynamic video segmentation on face videos, a field that has not been investigated before. By evaluating on the 300VW dataset, we show that the performance of our reinforcement key scheduler outperforms that of various baselines in terms of both effective key selections and running speed. Further results on the Cityscapes dataset demonstrate that our proposed method can also generalise to other scenarios. To the best of our knowledge, this is the first work to use reinforcement learning for online key-frame decision in dynamic video segmentation, and also the first work on its application on face videos.Comment: CVPR 2020. 300VW with segmentation labels is available at: https://github.com/mapleandfire/300VW-Mas

arXiv.org e-Print Archive

Crossref

SurReal: enhancing Surgical simulation Realism using style transfer

Author: Flouty Evangello
Giataganas Petros
Luengo Imanol
Nehme Jean
Stoyanov Danail
Wisanuvej Piyamate
Publication venue
Publication date: 06/09/2018
Field of study

Surgical simulation is an increasingly important element of surgical education. Using simulation can be a means to address some of the significant challenges in developing surgical skills with limited time and resources. The photo-realistic fidelity of simulations is a key feature that can improve the experience and transfer ratio of trainees. In this paper, we demonstrate how we can enhance the visual fidelity of existing surgical simulation by performing style transfer of multi-class labels from real surgical video onto synthetic content. We demonstrate our approach on simulations of cataract surgery using real data labels from an existing public dataset. Our results highlight the feasibility of the approach and also the powerful possibility to extend this technique to incorporate additional temporal constraints and to different applications

arXiv.org e-Print Archive

UCL Discovery

Temporal Interpolation via Motion Field Prediction

Author: Karani Neerav
Konukoglu Ender
Tanner Christine
Zhang Lin
Publication venue
Publication date: 01/01/2018
Field of study

Navigated 2D multi-slice dynamic Magnetic Resonance (MR) imaging enables high contrast 4D MR imaging during free breathing and provides in-vivo observations for treatment planning and guidance. Navigator slices are vital for retrospective stacking of 2D data slices in this method. However, they also prolong the acquisition sessions. Temporal interpolation of navigator slices an be used to reduce the number of navigator acquisitions without degrading specificity in stacking. In this work, we propose a convolutional neural network (CNN) based method for temporal interpolation via motion field prediction. The proposed formulation incorporates the prior knowledge that a motion field underlies changes in the image intensities over time. Previous approaches that interpolate directly in the intensity space are prone to produce blurry images or even remove structures in the images. Our method avoids such problems and faithfully preserves the information in the image. Further, an important advantage of our formulation is that it provides an unsupervised estimation of bi-directional motion fields. We show that these motion fields can be used to halve the number of registrations required during 4D reconstruction, thus substantially reducing the reconstruction time.Comment: Submitted to 1st Conference on Medical Imaging with Deep Learning (MIDL 2018), Amsterdam, The Netherland

arXiv.org e-Print Archive

Repository for Publications and Research Data