
    Attribute Based Interpretable Evaluation Metrics for Generative Models

    When the training dataset comprises a 1:1 proportion of dogs to cats, a generative model that produces 1:1 dogs and cats better resembles the training species distribution than another model with 3:1 dogs and cats. Can we capture this phenomenon using existing metrics? Unfortunately, we cannot, because these metrics do not provide any interpretability beyond "diversity". In this context, we propose a new evaluation protocol that measures the divergence of a set of generated images from the training set with respect to the distribution of attribute strengths, as follows. Single-attribute Divergence (SaD) measures the divergence between the PDFs of individual attributes. Paired-attribute Divergence (PaD) measures the divergence between the joint PDFs of pairs of attributes. Together they reveal which attributes the models struggle with. To measure the attribute strengths of an image, we propose Heterogeneous CLIPScore (HCS), which measures the cosine similarity between image and text vectors with heterogeneous initial points. With SaD and PaD, we reveal the following about existing generative models. ProjectedGAN generates implausible attribute relationships, such as a baby with a beard, even though it achieves competitive scores on existing metrics. Diffusion models struggle to capture diverse colors in the datasets. Larger sampling timesteps of the latent diffusion model generate more minor objects, including earrings and necklaces. Stable Diffusion v1.5 captures the attributes better than v2.1. Our metrics lay a foundation for explainable evaluations of generative models.
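
    A minimal sketch of how SaD- and PaD-style scores could be computed once attribute strengths (e.g., from HCS) are available for the training and generated sets. The histogram-based KL divergence and all function names below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def histogram_kl(p, q, bins=20):
    """KL divergence between two 1D sample sets via shared-bin histograms."""
    lo, hi = min(p.min(), q.min()), max(p.max(), q.max())
    ph, _ = np.histogram(p, bins=bins, range=(lo, hi))
    qh, _ = np.histogram(q, bins=bins, range=(lo, hi))
    ph = ph / ph.sum() + 1e-8   # small epsilon avoids log(0)
    qh = qh / qh.sum() + 1e-8
    return float(np.sum(ph * np.log(ph / qh)))

def joint_histogram_kl(p, q, bins=20):
    """Same idea for a pair of attributes: 2D histograms over shared bins."""
    lo, hi = np.minimum(p.min(0), q.min(0)), np.maximum(p.max(0), q.max(0))
    rng = [(lo[0], hi[0]), (lo[1], hi[1])]
    ph, _, _ = np.histogram2d(p[:, 0], p[:, 1], bins=bins, range=rng)
    qh, _, _ = np.histogram2d(q[:, 0], q[:, 1], bins=bins, range=rng)
    ph = ph.ravel() / ph.sum() + 1e-8
    qh = qh.ravel() / qh.sum() + 1e-8
    return float(np.sum(ph * np.log(ph / qh)))

def sad(train_strengths, gen_strengths):
    """SaD-style score: mean divergence over single attributes (columns)."""
    return float(np.mean([histogram_kl(train_strengths[:, a], gen_strengths[:, a])
                          for a in range(train_strengths.shape[1])]))

def pad(train_strengths, gen_strengths):
    """PaD-style score: mean divergence over all attribute pairs."""
    k = train_strengths.shape[1]
    return float(np.mean([joint_histogram_kl(train_strengths[:, [i, j]],
                                             gen_strengths[:, [i, j]])
                          for i in range(k) for j in range(i + 1, k)]))
```

    Because each score is an average over named attributes (or pairs), the per-attribute terms can be inspected directly, which is what makes the protocol interpretable.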

    LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data

    Existing techniques for image-to-image translation commonly suffer from two critical problems: heavy reliance on per-sample domain annotation and/or an inability to handle multiple attributes per image. Recent truly unsupervised methods adopt clustering approaches to easily provide per-sample one-hot domain labels. However, they cannot account for the real-world setting in which one sample may have multiple attributes. In addition, the semantics of the clusters are not easily coupled to human understanding. To overcome these problems, we present a LANguage-driven Image-to-image Translation model, dubbed LANIT. We leverage easy-to-obtain candidate attributes given in texts for a dataset: the similarity between images and attributes indicates per-sample domain labels. This formulation naturally enables multi-hot labels, so that users can specify the target domain with a set of attributes in language. To account for the case where the initial prompts are inaccurate, we also present prompt learning. We further present a domain regularization loss that enforces translated images to be mapped to the corresponding domain. Experiments on several standard benchmarks demonstrate that LANIT achieves comparable or superior performance to existing models. Comment: Accepted to CVPR 2023. Project Page: https://ku-cvlab.github.io/LANIT
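
    A minimal sketch of the per-sample multi-hot labeling idea, assuming a CLIP-like encoder pair has already embedded the images and the candidate attribute prompts. The per-image normalization and threshold are illustrative assumptions, not LANIT's actual implementation.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def multi_hot_domain_labels(image_feats, text_feats, threshold=0.5):
    """Assign multi-hot domain labels from image/attribute similarities.

    image_feats: (N, D) embeddings of images from a vision encoder.
    text_feats:  (K, D) embeddings of K candidate attribute prompts
                 (e.g. "a photo of a person with blond hair").
    Returns an (N, K) {0, 1} tensor: 1 where an attribute is judged present.
    """
    img = F.normalize(image_feats, dim=-1)
    txt = F.normalize(text_feats, dim=-1)
    sim = img @ txt.t()                                   # cosine similarities, (N, K)
    # rescale each image's similarities to [0, 1] before thresholding,
    # so the cutoff is comparable across images (an illustrative choice)
    lo = sim.min(dim=1, keepdim=True).values
    hi = sim.max(dim=1, keepdim=True).values
    sim = (sim - lo) / (hi - lo + 1e-8)
    return (sim > threshold).float()
```

    The resulting labels are multi-hot rather than one-hot, so a single image can belong to several text-named domains at once, which is the setting the clustering-based methods cannot express.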

    BallGAN: 3D-aware Image Synthesis with a Spherical Background

    3D-aware GANs aim to synthesize realistic 3D scenes that can be rendered from arbitrary perspectives to produce images. Although previous methods produce realistic images, they suffer from unstable training or degenerate solutions in which the 3D geometry is unnatural. We hypothesize that the 3D geometry is underdetermined due to insufficient constraints, i.e., being classified as a real image by the discriminator is not enough. To solve this problem, we propose to approximate the background as a spherical surface and represent a scene as the union of a foreground placed inside the sphere and a thin spherical background. This reduces the degrees of freedom in the background field. Accordingly, we modify the volume rendering equation and incorporate dedicated constraints to design a novel 3D-aware GAN framework named BallGAN. BallGAN has multiple advantages as follows. 1) It produces more reasonable 3D geometry; the images of a scene across different viewpoints have better photometric consistency and fidelity than state-of-the-art methods. 2) The training becomes much more stable. 3) The foreground can be rendered separately on top of different arbitrary backgrounds. Comment: Project Page: https://minjung-s.github.io/ballga
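
    A rough sketch of the modified rendering described above: the foreground is volume-rendered inside the sphere, and whatever transmittance remains along each ray is assigned to a thin spherical background looked up where the ray exits the sphere. The function names, the far-intersection lookup, and the two-term compositing are assumptions for illustration, not BallGAN's exact equations.

```python
import torch

def ray_sphere_exit(origins, dirs, radius=1.0):
    """Distance along each ray (unit dirs) to the far intersection with a
    sphere of the given radius centered at the world origin."""
    b = (origins * dirs).sum(-1)                        # o . d
    c = (origins * origins).sum(-1) - radius ** 2       # |o|^2 - r^2
    disc = torch.clamp(b * b - c, min=0.0)
    return -b + torch.sqrt(disc)                        # far root

def composite_with_spherical_background(fg_rgb, fg_alpha, bg_radiance):
    """Render rays as foreground volume rendering plus a spherical background.

    fg_rgb:      (R, S, 3) per-sample foreground colors along R rays, S samples
    fg_alpha:    (R, S)    per-sample foreground opacities in [0, 1]
    bg_radiance: (R, 3)    background color where each ray exits the sphere
    """
    trans = torch.cumprod(1.0 - fg_alpha + 1e-10, dim=-1)
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=-1)
    weights = fg_alpha * trans                           # (R, S)
    fg = (weights.unsqueeze(-1) * fg_rgb).sum(dim=1)     # (R, 3)
    remaining = 1.0 - weights.sum(dim=1, keepdim=True)   # leftover transmittance
    return fg + remaining * bg_radiance                  # (R, 3)
```

    Constraining the background to a 2D surface in this way is what removes the extra volumetric degrees of freedom that otherwise let the background absorb foreground geometry.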

    AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks

    To deliver the artistic expression of a target style, recent studies exploit the attention mechanism owing to its ability to map local patches of the style image to the corresponding patches of the content image. However, because of the low semantic correspondence between arbitrary content and artworks, the attention module repeatedly abuses specific local patches from the style image, resulting in disharmonious and evident repetitive artifacts. To overcome this limitation and accomplish impeccable artistic style transfer, we focus on enhancing the attention mechanism and capturing the rhythm of the patterns that organize the style. In this paper, we introduce a novel metric, namely pattern repeatability, that quantifies the repetition of patterns in the style image. Based on pattern repeatability, we propose Aesthetic Pattern-Aware style transfer Networks (AesPA-Net) that discover the sweet spot between local and global style expressions. In addition, we propose a novel self-supervisory task to encourage the attention mechanism to learn precise and meaningful semantic correspondence. Lastly, we introduce a patch-wise style loss to transfer the elaborate rhythm of local patterns. Through qualitative and quantitative evaluations, we verify the reliability of the proposed pattern repeatability, which aligns with human perception, and demonstrate the superiority of the proposed framework. Comment: Accepted by ICCV 2023. Code is available at https://github.com/Kibeom-Hong/AesPA-Ne
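
    A small sketch of one way a pattern-repeatability-style quantity could be computed: the average self-similarity of local patches of a style feature map, where higher values mean the style is dominated by repeating local patterns. The exact definition used by AesPA-Net may differ, so treat this purely as an illustrative proxy.

```python
import torch
import torch.nn.functional as F

def patch_self_similarity(style_feat, patch_size=3):
    """Illustrative pattern-repeatability proxy for a style feature map.

    style_feat: (1, C, H, W) feature map of the style image (e.g. taken from
                a pretrained VGG layer). Returns a scalar in [-1, 1]; higher
                means local patches repeat more strongly across the image.
    """
    # unfold into non-overlapping local patches and flatten each to a vector
    patches = F.unfold(style_feat, kernel_size=patch_size, stride=patch_size)
    patches = patches.squeeze(0).t()                  # (num_patches, C*p*p)
    patches = F.normalize(patches, dim=-1)
    sim = patches @ patches.t()                       # pairwise cosine similarities
    n = sim.shape[0]
    off_diag = sim.sum() - sim.diagonal().sum()       # exclude self-matches
    return (off_diag / (n * (n - 1))).item()
```

    Such a score could then steer how much weight is placed on attention-based local patch transfer versus global statistics, which is the "sweet spot" trade-off the abstract refers to.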

    Comparative analysis of FBS containing media and serum free chemically defined media, CellCor for adipose derived stem cells production

    Background: As a result of the aging society, the average OECD life expectancy has grown to about 80 years, yet average healthy life expectancy still remains at only 65 years, leaving more than 15 years of life in an uncertain health state. Regenerative medicine is a new concept of medicine that combines cells and biomaterials to restore the functions of aged or damaged tissues or organs. It is also a good treatment for chronic and incurable diseases, and it is receiving attention as a new paradigm for treating disease. Problems: As the market for regenerative medicine grows, mass production of cells of consistent quality is required, and the culture medium is the most important factor in achieving it. However, the fetal bovine serum (FBS)-containing media currently in wide use have many problems, such as unidentified viral infection, immunogenicity, lot-to-lot variation, unstable supply, and ethical issues. To solve these problems and enable rapid progress in regenerative medicine, a high-performance serum-free chemically defined medium (CDM) is needed. Solution: CellCor is a serum-free CDM that provides excellent performance, safety, economy, and consistency in stem cell production. CellCor allows a higher cell production rate than current FBS-containing culture media (Figure 1). Compared to FBS-containing media, CellCor is able to maintain stem cell markers, higher population homogeneity, genetic stability, and excellent differentiation potency even at later passages.