Recycle-GAN: Unsupervised Video Retargeting
We introduce a data-driven approach for unsupervised video retargeting that
translates content from one domain to another while preserving the style native
to a domain, i.e., if contents of John Oliver's speech were to be transferred
to Stephen Colbert, then the generated content/speech should be in Stephen
Colbert's style. Our approach combines both spatial and temporal information
along with adversarial losses for content translation and style preservation.
In this work, we first study the advantages of using spatiotemporal constraints
over spatial constraints for effective retargeting. We then demonstrate the
proposed approach for the problems where information in both space and time
matters such as face-to-face translation, flower-to-flower, wind and cloud
synthesis, sunrise and sunset.
Comment: ECCV 2018; Please refer to project webpage for videos -
http://www.cs.cmu.edu/~aayushb/Recycle-GA
Improving Surgical Training Phantoms by Hyperrealism: Deep Unpaired Image-to-Image Translation from Real Surgeries
Current 'dry lab' surgical phantom simulators are a valuable tool that
allows surgeons to improve their dexterity and skill with surgical
instruments. These phantoms mimic the haptics and shape of organs of interest,
but lack a realistic visual appearance. In this work, we present an innovative
application in which representations learned from real intraoperative
endoscopic sequences are transferred to a surgical phantom scenario. The term
hyperrealism is introduced in this field, which we regard as a novel subform of
surgical augmented reality for approaches that involve real-time object
transfigurations. For related tasks in the computer vision community, unpaired
cycle-consistent Generative Adversarial Networks (GANs) have shown excellent
results on still RGB images. However, applying this approach to continuous
video frames can result in flickering, which turned out to be especially
prominent for this application. Therefore, we propose an extension of
cycle-consistent GANs, named tempCycleGAN, to improve temporal consistency. The
novel method is evaluated on captures of a silicone phantom for training
endoscopic reconstructive mitral valve procedures. Synthesized videos show
highly realistic results with regard to 1) replacement of the silicone
appearance of the phantom valve by intraoperative tissue texture, while 2)
explicitly keeping crucial features in the scene, such as instruments, sutures
and prostheses. Compared to the original CycleGAN approach, tempCycleGAN
efficiently removes flickering between frames. The overall approach is expected
to change the future design of surgical training simulators since the generated
sequences clearly demonstrate the feasibility of a considerably more
realistic training experience for minimally-invasive procedures.
Comment: 8 pages, accepted at MICCAI 2018, supplemental material at
https://youtu.be/qugAYpK-Z4
Temporal Interpolation via Motion Field Prediction
Navigated 2D multi-slice dynamic Magnetic Resonance (MR) imaging enables high
contrast 4D MR imaging during free breathing and provides in-vivo observations
for treatment planning and guidance. Navigator slices are vital for
retrospective stacking of 2D data slices in this method. However, they also
prolong the acquisition sessions. Temporal interpolation of navigator slices can
be used to reduce the number of navigator acquisitions without degrading
specificity in stacking. In this work, we propose a convolutional neural
network (CNN) based method for temporal interpolation via motion field
prediction. The proposed formulation incorporates the prior knowledge that a
motion field underlies changes in the image intensities over time. Previous
approaches that interpolate directly in the intensity space are prone to
produce blurry images or even remove structures in the images. Our method
avoids such problems and faithfully preserves the information in the image.
Further, an important advantage of our formulation is that it provides an
unsupervised estimation of bi-directional motion fields. We show that these
motion fields can be used to halve the number of registrations required during
4D reconstruction, thus substantially reducing the reconstruction time.
Comment: Submitted to 1st Conference on Medical Imaging with Deep Learning
(MIDL 2018), Amsterdam, The Netherland
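The abstract above contrasts interpolating directly in intensity space with interpolating along a motion field. The following is a minimal 1D sketch of that contrast, not the authors' method: the motion field is assumed to be known and uniform here, whereas the paper predicts it with a CNN. The function names (`intensity_interpolate`, `warp`, `motion_interpolate`) and the toy frames are hypothetical and chosen for illustration only.

```python
def intensity_interpolate(frame_a, frame_b):
    """Interpolate in intensity space: average the two frames pixel by pixel."""
    return [(a + b) / 2 for a, b in zip(frame_a, frame_b)]


def warp(frame, displacement):
    """Backward-warp a 1D frame.

    Output pixel i samples the input at position i - displacement[i],
    using linear interpolation and clamping at the borders.
    """
    n = len(frame)
    out = []
    for i in range(n):
        src = min(max(i - displacement[i], 0.0), n - 1.0)
        lo = int(src)
        hi = min(lo + 1, n - 1)
        w = src - lo
        out.append((1 - w) * frame[lo] + w * frame[hi])
    return out


def motion_interpolate(frame_a, frame_b, flow_ab):
    """Estimate the midpoint frame from a motion field.

    flow_ab[i] is the displacement from frame A to frame B (ASSUMED known
    here; in the paper it is predicted by a network). Frame A is moved half
    way forward, frame B half way backward, and the two results are averaged.
    """
    from_a = warp(frame_a, [0.5 * d for d in flow_ab])
    from_b = warp(frame_b, [-0.5 * d for d in flow_ab])
    return [(a + b) / 2 for a, b in zip(from_a, from_b)]


# Toy example: a single bright structure moves from index 2 to index 6.
frame_a = [0, 0, 1, 0, 0, 0, 0, 0, 0]
frame_b = [0, 0, 0, 0, 0, 0, 1, 0, 0]
flow_ab = [4.0] * len(frame_a)  # hypothetical uniform motion of 4 pixels

blurred = intensity_interpolate(frame_a, frame_b)
sharp = motion_interpolate(frame_a, frame_b, flow_ab)
```

In this toy case, averaging intensities yields two half-height peaks at the start and end positions, while moving both frames half way along the motion field yields a single full-height peak at the midpoint, which is the kind of structure preservation the abstract describes.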