Dynamic Face Video Segmentation via Reinforcement Learning
For real-time semantic video segmentation, most recent works utilised a
dynamic framework with a key scheduler to make online key/non-key decisions.
Some works used a fixed key scheduling policy, while others proposed adaptive
key scheduling methods based on heuristic strategies, both of which may lead to
suboptimal global performance. To overcome this limitation, we model the online
key decision process in dynamic video segmentation as a deep reinforcement
learning problem and learn an efficient and effective scheduling policy from
expert information about decision history and from the process of maximising
global return. Moreover, we study the application of dynamic video segmentation
on face videos, a field that has not been investigated before. Evaluating on
the 300VW dataset, we show that our reinforcement key scheduler outperforms
various baselines in terms of both effective key selections and running
speed. Further results on the Cityscapes dataset
demonstrate that our proposed method can also generalise to other scenarios. To
the best of our knowledge, this is the first work to use reinforcement learning
for online key-frame decision in dynamic video segmentation, and also the first
work on its application on face videos.
Comment: CVPR 2020. 300VW with segmentation labels is available at:
https://github.com/mapleandfire/300VW-Mas
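To make the key/non-key decision concrete, here is a minimal sketch of the two baseline families the abstract mentions: a fixed key scheduling policy and an adaptive heuristic one. Everything in it is an illustrative assumption rather than the authors' code: the function names, the per-frame `change_scores` input, and the accumulate-and-threshold heuristic are hypothetical. A learned policy, as proposed in the paper, would take the place of the `decide` callable.

```python
from typing import Callable, List, Sequence

# A scheduler maps (frames since last key, accumulated change) to a key decision.
Scheduler = Callable[[int, float], bool]


def fixed_scheduler(interval: int) -> Scheduler:
    """Hypothetical fixed policy: pick a key frame every `interval` frames."""
    def decide(frames_since_key: int, accumulated_change: float) -> bool:
        return frames_since_key >= interval
    return decide


def threshold_scheduler(threshold: float) -> Scheduler:
    """Hypothetical heuristic policy: pick a key frame once the change
    accumulated since the last key exceeds `threshold`."""
    def decide(frames_since_key: int, accumulated_change: float) -> bool:
        return accumulated_change > threshold
    return decide


def schedule(change_scores: Sequence[float], decide: Scheduler) -> List[bool]:
    """Return one key/non-key decision per frame, made online.

    `change_scores[i]` is an assumed scalar measuring how much frame i
    differs from frame i - 1. The first frame is always a key frame.
    """
    decisions: List[bool] = []
    frames_since_key = 0
    accumulated = 0.0
    for i, change in enumerate(change_scores):
        frames_since_key += 1
        accumulated += change
        is_key = i == 0 or decide(frames_since_key, accumulated)
        if is_key:
            # A key frame resets the state tracked since the last key.
            frames_since_key = 0
            accumulated = 0.0
        decisions.append(is_key)
    return decisions


if __name__ == "__main__":
    scores = [0.0, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1]
    print(schedule(scores, fixed_scheduler(3)))
    print(schedule(scores, threshold_scheduler(0.5)))
```

In this toy sequence the fixed policy spends a key frame on a slowly changing stretch, while the threshold policy only reacts to the large change at the fourth frame. Neither rule optimises a global objective, which is the limitation the abstract attributes to both families.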
SurReal: enhancing Surgical simulation Realism using style transfer
Surgical simulation is an increasingly important element of surgical
education. Using simulation can be a means to address some of the significant
challenges in developing surgical skills with limited time and resources. The
photo-realistic fidelity of simulations is a key feature that can improve the
experience and transfer ratio of trainees. In this paper, we demonstrate how we
can enhance the visual fidelity of existing surgical simulation by performing
style transfer of multi-class labels from real surgical video onto synthetic
content. We demonstrate our approach on simulations of cataract surgery using
real data labels from an existing public dataset. Our results highlight the
feasibility of the approach and also the powerful possibility to extend this
technique to incorporate additional temporal constraints and to different
applications.
Temporal Interpolation via Motion Field Prediction
Navigated 2D multi-slice dynamic Magnetic Resonance (MR) imaging enables high
contrast 4D MR imaging during free breathing and provides in-vivo observations
for treatment planning and guidance. Navigator slices are vital for
retrospective stacking of 2D data slices in this method. However, they also
prolong the acquisition sessions. Temporal interpolation of navigator slices can
be used to reduce the number of navigator acquisitions without degrading
specificity in stacking. In this work, we propose a convolutional neural
network (CNN) based method for temporal interpolation via motion field
prediction. The proposed formulation incorporates the prior knowledge that a
motion field underlies changes in the image intensities over time. Previous
approaches that interpolate directly in the intensity space are prone to
produce blurry images or even remove structures in the images. Our method
avoids such problems and faithfully preserves the information in the image.
Further, an important advantage of our formulation is that it provides an
unsupervised estimation of bi-directional motion fields. We show that these
motion fields can be used to halve the number of registrations required during
4D reconstruction, thus substantially reducing the reconstruction time.
Comment: Submitted to 1st Conference on Medical Imaging with Deep Learning
(MIDL 2018), Amsterdam, The Netherlands
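The contrast between interpolating in intensity space and interpolating along a motion field can be illustrated with a toy 1D example. This is a sketch under strong assumptions, not the paper's method: the signal is one-dimensional, the motion is a single known integer displacement rather than a dense field predicted by a CNN, and all function names are hypothetical.

```python
from typing import List, Sequence


def average_interpolate(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Intensity-space interpolation: the midpoint frame is the pixel-wise mean."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def warp(signal: Sequence[float], shift: int) -> List[float]:
    """Shift `signal` right by `shift` samples (left if negative),
    replicating the border values."""
    n = len(signal)
    return [signal[min(max(i - shift, 0), n - 1)] for i in range(n)]


def motion_interpolate(a: Sequence[float], b: Sequence[float],
                       displacement: int) -> List[float]:
    """Motion-based interpolation: warp `a` forward by half the displacement,
    warp `b` backward by the remaining half, then blend the two estimates."""
    half = displacement // 2
    forward = warp(a, half)
    backward = warp(b, -(displacement - half))
    return [(x + y) / 2 for x, y in zip(forward, backward)]


if __name__ == "__main__":
    # A sharp edge that moves 4 samples to the right between two frames.
    frame_a = [0, 0, 1, 1, 1, 1, 1, 1]
    frame_b = [0, 0, 0, 0, 0, 0, 1, 1]
    print(average_interpolate(frame_a, frame_b))
    print(motion_interpolate(frame_a, frame_b, displacement=4))
```

Averaging turns the edge into a half-intensity plateau spanning both edge positions, which is the kind of blur the abstract describes. Warping along the forward and backward displacement keeps a single sharp edge halfway between the two positions.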