8 research outputs found
3D Face Style Transfer with a Hybrid Solution of NeRF and Mesh Rasterization
Style transfer for human faces has been widely researched in recent years.
Most existing approaches work in the 2D image domain and suffer from 3D
inconsistency when applied to different viewpoints of the same face. In
this paper, we tackle the problem of 3D face style transfer which aims at
generating stylized novel views of a 3D human face with multi-view consistency.
We propose to use a neural radiance field (NeRF) to represent the 3D human face
and combine it with 2D style transfer to stylize the 3D face. We find that
directly training a NeRF on stylized images from 2D style transfer introduces 3D
inconsistency and causes blurriness. On the other hand, training a NeRF jointly
with 2D style transfer objectives converges poorly due to the identity and head
pose gap between the style image and the content image. It is also costly in
training time and memory, because applying style transfer loss functions
requires volume rendering of the full image. We therefore propose a hybrid
framework of NeRF and mesh rasterization that combines the high-fidelity
geometry reconstruction of NeRF with the fast rendering speed of a mesh. Our
framework consists of three stages: 1. Training a NeRF model on input face
images to learn the 3D geometry; 2. Extracting a mesh from the trained NeRF
model and optimizing it with style transfer objectives via differentiable
rasterization; 3. Training a new color network in NeRF conditioned on a style
embedding to enable arbitrary style transfer to the 3D face. Experimental results
show that our approach generates high-quality face style transfer with great 3D
consistency, while also enabling flexible style control.
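The cost of full-image volume rendering mentioned in this abstract can be made concrete with a minimal sketch of the standard NeRF compositing rule along a single ray. This is an illustration of the general technique, not the authors' code; the function names, the 512x512 resolution, and the 128 samples per ray are assumptions chosen for the example.

```python
import math


def composite_ray(densities, colors, deltas):
    """Composite samples along one ray with the standard NeRF quadrature.

    densities: non-negative density at each sample along the ray
    colors:    (r, g, b) tuple at each sample
    deltas:    length of the ray segment covered by each sample
    """
    transmittance = 1.0  # fraction of light that still reaches the current sample
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        for channel in range(3):
            pixel[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha
    return tuple(pixel)


def samples_per_image(height, width, samples_per_ray):
    """Number of network evaluations needed to render one full image."""
    return height * width * samples_per_ray


# An empty first sample followed by a fully opaque green sample.
pixel = composite_ray([0.0, 1e9], [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [1.0, 1.0])

# Illustrative (assumed) resolution and sample count.
evaluations = samples_per_image(512, 512, 128)
```

Every pixel needs its own ray and every ray needs many samples, so a loss that is defined on the whole image (as style transfer losses are) requires tens of millions of network evaluations per training step under these assumed settings. A rasterized mesh avoids this per-sample cost.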
Brainomaly: Unsupervised Neurologic Disease Detection Utilizing Unannotated T1-weighted Brain MR Images
Harnessing the power of deep neural networks in the medical imaging domain is
challenging due to the difficulties in acquiring large annotated datasets,
especially for rare diseases, which involve high costs, time, and effort for
annotation. Unsupervised disease detection methods, such as anomaly detection,
can significantly reduce human effort in these scenarios. While anomaly
detection typically focuses on learning from images of healthy subjects only,
real-world situations often present unannotated datasets with a mixture of
healthy and diseased subjects. Recent studies have demonstrated that utilizing
such unannotated images can improve unsupervised disease and anomaly detection.
However, these methods do not utilize knowledge specific to registered
neuroimages, resulting in subpar performance in neurologic disease detection.
To address this limitation, we propose Brainomaly, a GAN-based image-to-image
translation method specifically designed for neurologic disease detection.
Brainomaly not only offers tailored image-to-image translation suitable for
neuroimages but also leverages unannotated mixed images to achieve superior
neurologic disease detection. Additionally, we address the issue of model
selection for inference without annotated samples by proposing a pseudo-AUC
metric, further enhancing Brainomaly's detection performance. Extensive
experiments and ablation studies demonstrate that Brainomaly outperforms
existing state-of-the-art unsupervised disease and anomaly detection methods by
significant margins in Alzheimer's disease detection using a publicly available
dataset and headache detection using an institutional dataset. The code is
available from https://github.com/mahfuzmohammad/Brainomaly.
Comment: Accepted in WACV 202
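The abstract does not define the pseudo-AUC metric, so the sketch below shows only the ordinary AUC that such a metric presumably builds on: the probability that a randomly chosen diseased subject receives a higher anomaly score than a randomly chosen healthy one. How labels would be obtained without annotations is not stated in the abstract and is not modeled here; the function name and the toy scores are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: anomaly score per subject (higher means more anomalous)
    labels: 1 for diseased, 0 for healthy
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
example = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

A model-selection rule based on such a score would pick the checkpoint with the highest value on held-out data.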
Rethinking CycleGAN: Improving Quality of GANs for Unpaired Image-to-Image Translation
An unpaired image-to-image (I2I) translation technique seeks to find a
mapping between two domains of data in a fully unsupervised manner. While the
initial solutions to the I2I problem were provided by generative
adversarial networks (GANs), diffusion models (DMs) currently hold the
state-of-the-art status on the I2I translation benchmarks in terms of FID. Yet,
they suffer from some limitations, such as not using data from the source
domain during training, or maintaining consistency of the source and
translated images only via simple pixel-wise errors. This work revisits the
classic CycleGAN model and equips it with recent advancements in model
architectures and model training procedures. The revised model is shown to
significantly outperform other advanced GAN- and DM-based competitors on a
variety of benchmarks. In the case of Male2Female translation of CelebA, the
model achieves over 40% improvement in FID score compared to the
state-of-the-art results. This work also demonstrates the ineffectiveness of
the pixel-wise I2I translation faithfulness metrics and suggests their
revision. The code and trained models are available at
https://github.com/LS4GAN/uvcgan
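As background for the classic CycleGAN model this abstract revisits, the following is a minimal sketch of the cycle-consistency term: translating to the other domain and back should reproduce the input. Images are flat lists of floats and the two "generators" are toy functions; these, and all names in the block, are assumptions for illustration and not the code in the linked repository.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(g, f, x_batch, y_batch):
    """Cycle loss for generators g: X -> Y and f: Y -> X.

    Penalizes |f(g(x)) - x| for source images and |g(f(y)) - y| for
    target images, each averaged over its batch.
    """
    forward = sum(l1_distance(f(g(x)), x) for x in x_batch) / len(x_batch)
    backward = sum(l1_distance(g(f(y)), y) for y in y_batch) / len(y_batch)
    return forward + backward


# Toy generators (hypothetical): g doubles intensities, f halves them,
# so the pair forms a perfect cycle and the loss is zero.
def g(image):
    return [2.0 * v for v in image]


def f(image):
    return [v / 2.0 for v in image]


loss = cycle_consistency_loss(g, f, [[0.25, 0.5]], [[1.0, 3.0]])
```

In full CycleGAN training this term is added to the adversarial losses of both generators.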
BioGAN: An unpaired GAN-based image to image translation model for microbiological images
Background and objective: A diversified dataset is crucial for training a well-generalized supervised computer vision algorithm. However, in the field of microbiology, generating and annotating a diverse dataset that includes field-taken images is time-consuming, costly, and in some cases impossible. Image to image translation frameworks allow us to diversify the dataset by transferring images from one domain to another. However, most existing image translation techniques require a paired dataset (original image and its corresponding image in the target domain), which poses a significant challenge in collecting such datasets. In addition, the application of these image translation frameworks in microbiology is rarely discussed. In this study, we aim to develop an unpaired GAN-based (generative adversarial network) image to image translation model for microbiological images, and study how it can improve the generalization ability of object detection models.
Methods: In this paper, we present an unpaired and unsupervised image translation model to translate laboratory-taken microbiological images to field images, building upon recent advances in GANs and the perceptual loss function. We propose a novel design for a GAN model, BioGAN, which uses adversarial and perceptual losses to transform high-level features of laboratory-taken images of Prototheca bovis into field images while keeping their spatial features.
Results: We studied the contribution of the adversarial and perceptual losses to the generation of realistic field images. We used the synthetic field images generated by BioGAN to train an object-detection framework, and compared the results with those of an object-detection framework trained with laboratory images; this resulted in up to 68.1% and 75.3% improvement in F1 score and mAP, respectively. We also present the results of a qualitative evaluation test, performed by experts, of the similarity of BioGAN synthetic images to field images.
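The abstract names adversarial and perceptual losses but does not give their exact form or weighting. The sketch below shows one common way such terms are combined: a non-saturating adversarial term plus a weighted mean squared distance between feature vectors. The loss forms, the `perceptual_weight` parameter, and the plain-list feature vectors (which in practice would come from a pretrained network) are all assumptions, not BioGAN's actual design.

```python
import math


def adversarial_loss(disc_prob_fake):
    """Non-saturating generator loss: -log D(G(x)).

    disc_prob_fake: discriminator's probability, in (0, 1], that the
    generated image is real.
    """
    return -math.log(disc_prob_fake)


def perceptual_loss(features_a, features_b):
    """Mean squared distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(features_a, features_b)) / len(features_a)


def generator_loss(disc_prob_fake, feats_generated, feats_source, perceptual_weight=1.0):
    """Weighted sum of the adversarial and perceptual terms (assumed form)."""
    return adversarial_loss(disc_prob_fake) + perceptual_weight * perceptual_loss(
        feats_generated, feats_source
    )


# Toy values: the discriminator is undecided and the features differ by 1.0
# in every dimension.
loss = generator_loss(0.5, [0.0, 0.0], [1.0, 1.0], perceptual_weight=2.0)
```

Raising `perceptual_weight` pushes the generator to preserve more of the source image's content, while lowering it favors realism in the target domain.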