9,193 research outputs found
MedGAN: Medical Image Translation using GANs
Image-to-image translation is considered a new frontier in the field of
medical image analysis, with numerous potential applications. However, a large
portion of recent approaches offers individualized solutions based on
specialized task-specific architectures or require refinement through
non-end-to-end training. In this paper, we propose a new framework, named
MedGAN, for medical image-to-image translation which operates on the image
level in an end-to-end manner. MedGAN builds upon recent advances in the field
of generative adversarial networks (GANs) by merging the adversarial framework
with a new combination of non-adversarial losses. We utilize a discriminator
network as a trainable feature extractor which penalizes the discrepancy
between the translated medical images and the desired modalities. Moreover,
style-transfer losses are utilized to match the textures and fine-structures of
the desired target images to the translated images. Additionally, we present a
new generator architecture, titled CasNet, which enhances the sharpness of the
translated medical outputs through progressive refinement via encoder-decoder
pairs. Without any application-specific modifications, we apply MedGAN on three
different tasks: PET-CT translation, correction of MR motion artefacts and PET
image denoising. Perceptual analysis by radiologists and quantitative
evaluations illustrate that the MedGAN outperforms other existing translation
approaches.Comment: 16 pages, 8 figure
Self-imitating Feedback Generation Using GAN for Computer-Assisted Pronunciation Training
Self-imitating feedback is an effective and learner-friendly method for
non-native learners in Computer-Assisted Pronunciation Training. Acoustic
characteristics in native utterances are extracted and transplanted onto
learner's own speech input, and given back to the learner as a corrective
feedback. Previous works focused on speech conversion using prosodic
transplantation techniques based on PSOLA algorithm. Motivated by the visual
differences found in spectrograms of native and non-native speeches, we
investigated applying GAN to generate self-imitating feedback by utilizing
generator's ability through adversarial training. Because this mapping is
highly under-constrained, we also adopt cycle consistency loss to encourage the
output to preserve the global structure, which is shared by native and
non-native utterances. Trained on 97,200 spectrogram images of short utterances
produced by native and non-native speakers of Korean, the generator is able to
successfully transform the non-native spectrogram input to a spectrogram with
properties of self-imitating feedback. Furthermore, the transformed spectrogram
shows segmental corrections that cannot be obtained by prosodic
transplantation. Perceptual test comparing the self-imitating and correcting
abilities of our method with the baseline PSOLA method shows that the
generative approach with cycle consistency loss is promising
Adaptive transfer functions: improved multiresolution visualization of medical models
The final publication is available at Springer via http://dx.doi.org/10.1007/s00371-016-1253-9Medical datasets are continuously increasing in size. Although larger models may be available for certain research purposes, in the common clinical practice the models are usually of up to 512x512x2000 voxels. These resolutions exceed the capabilities of conventional GPUs, the ones usually found in the medical doctors’ desktop PCs. Commercial solutions typically reduce the data by downsampling the dataset iteratively until it fits the available target specifications. The data loss reduces the visualization quality and this is not commonly compensated with other actions that might alleviate its effects. In this paper, we propose adaptive transfer functions, an algorithm that improves the transfer function in downsampled multiresolution models so that the quality of renderings is highly improved. The technique is simple and lightweight, and it is suitable, not only to visualize huge models that would not fit in a GPU, but also to render not-so-large models in mobile GPUs, which are less capable than their desktop counterparts. Moreover, it can also be used to accelerate rendering frame rates using lower levels of the multiresolution hierarchy while still maintaining high-quality results in a focus and context approach. We also show an evaluation of these results based on perceptual metrics.Peer ReviewedPostprint (author's final draft
- …