14 research outputs found

    DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models

    Recent progress in diffusion-based text-to-image generation models has significantly expanded generative capabilities by conditioning on text descriptions. However, since relying solely on text prompts is still restrictive for fine-grained customization, we aim to extend the boundaries of conditional generation to incorporate diverse types of modalities, e.g., sketch, box, and style embedding, simultaneously. We thus design a multimodal text-to-image diffusion model, coined DiffBlender, that achieves this goal in a single model by training only a few small hypernetworks. DiffBlender facilitates convenient scaling of input modalities without altering the parameters of an existing large-scale generative model, thereby retaining its well-established knowledge. Furthermore, our study sets new standards for multimodal generation by conducting quantitative and qualitative comparisons with existing approaches. By diversifying the channels of conditioning modalities, DiffBlender faithfully reflects the provided information or, in its absence, creates imaginative generations.
    Comment: 18 pages, 16 figures, and 3 tables

    Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations

    Training deep generative models usually requires a large amount of data. To alleviate the data collection cost, the task of zero-shot GAN adaptation aims to reuse well-trained generators to synthesize images of an unseen target domain without any further training samples. Because no target-domain data are available, a textual description of the target domain and vision-language models, e.g., CLIP, are used to guide the generator. However, with only a single representative text feature instead of real images, the synthesized images gradually lose diversity as the model is optimized, a problem also known as mode collapse. To tackle the problem, we propose a novel method to find semantic variations of the target text in the CLIP space. Specifically, we explore diverse semantic variations based on the informative text feature of the target domain while regularizing the uncontrolled deviation of the semantic information. With the obtained variations, we design a novel directional moment loss that matches the first and second moments of image and text direction distributions. Moreover, we introduce elastic weight consolidation and a relation consistency loss to effectively preserve valuable content information from the source domain, e.g., appearances. Through extensive experiments, we demonstrate the efficacy of the proposed methods in ensuring sample diversity in various scenarios of zero-shot GAN adaptation. We also conduct ablation studies to validate the effect of each proposed component. Notably, our model achieves a new state of the art on zero-shot GAN adaptation in terms of both diversity and quality.
    Comment: Accepted to ICCV 2023 (poster)

    DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models

    Recent progress in large-scale text-to-image models has yielded remarkable accomplishments, finding various applications in the art domain. However, expressing the unique characteristics of an artwork (e.g., brushwork, color tone, or composition) with text prompts alone may encounter limitations due to the inherent constraints of verbal description. To this end, we introduce DreamStyler, a novel framework designed for artistic image synthesis, proficient in both text-to-image synthesis and style transfer. DreamStyler optimizes a multi-stage textual embedding with a context-aware text prompt, resulting in prominent image quality. In addition, with content and style guidance, DreamStyler exhibits the flexibility to accommodate a range of style references. Experimental results demonstrate its superior performance across multiple scenarios, suggesting its promising potential in artistic product creation.

    AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks

    To deliver the artistic expression of the target style, recent studies exploit the attention mechanism owing to its ability to map the local patches of the style image to the corresponding patches of the content image. However, because of the low semantic correspondence between arbitrary content and artworks, the attention module repeatedly overuses specific local patches from the style image, resulting in disharmonious and evidently repetitive artifacts. To overcome this limitation and accomplish impeccable artistic style transfer, we focus on enhancing the attention mechanism and capturing the rhythm of patterns that organize the style. In this paper, we introduce a novel metric, namely pattern repeatability, that quantifies the repetition of patterns in the style image. Based on the pattern repeatability, we propose Aesthetic Pattern-Aware style transfer Networks (AesPA-Net) that discover the sweet spot of local and global style expressions. In addition, we propose a novel self-supervisory task to encourage the attention mechanism to learn precise and meaningful semantic correspondence. Lastly, we introduce the patch-wise style loss to transfer the elaborate rhythm of local patterns. Through qualitative and quantitative evaluations, we verify the reliability of the proposed pattern repeatability, which aligns with human perception, and demonstrate the superiority of the proposed framework.
    Comment: Accepted by ICCV 2023. Code is available at https://github.com/Kibeom-Hong/AesPA-Ne

    Factors associated with patients’ choice of physician in the Korean population: Database analyses of a tertiary hospital

    This study aimed to determine the factors influencing patients’ choice of physician at the first visit through database analysis of a tertiary hospital in South Korea. We collected data on the first treatments performed by physicians who had treated patients for at least 3 consecutive years over 10 years (from 2003 to 2012) from the database of Seoul National University’s affiliated tertiary hospital. Ultimately, we obtained data on 524,012 first treatments of 319,004 patients performed by 115 physicians. Variables including physicians’ age and medical school and patients’ age were evaluated as influencing factors for the number of first treatments performed by each physician in each year using a Poisson regression through generalized estimating equations with a log link. The number of first treatments decreased over the study period. Notably, the relative risk for first treatments was lower among older physicians than among younger physicians (relative risk 0.96; 95% confidence interval 0.95 to 0.98). Physicians graduating from Seoul National University (SNU) also had a higher risk for performing first treatments than did those not from SNU (relative risk 1.58; 95% confidence interval 1.18 to 2.10). Finally, relative risk was also higher among older patients than among younger patients (relative risk 1.03; 95% confidence interval 1.01 to 1.04). This study systematically demonstrated that physicians’ age, whether the physician graduated from the highest-quality university, and patients’ age all related to patients’ choice of physician at the first visit in a tertiary university hospital. These findings might be due to Korean cultural factors.

    Oxaliplatin disrupts nucleolar function through biophysical disintegration

    Platinum (Pt) compounds such as oxaliplatin are among the most commonly prescribed anti-cancer drugs. Despite their considerable clinical impact, the molecular basis of platinum cytotoxicity and cancer specificity remains unclear. Here we show that oxaliplatin, a backbone for the treatment of colorectal cancer, causes liquid-liquid demixing of nucleoli at clinically relevant concentrations. Our data suggest that this biophysical defect leads to cell-cycle arrest, shutdown of Pol I-mediated transcription, and ultimately cell death. We propose that instead of targeting a single molecule, oxaliplatin preferentially partitions into nucleoli, where it modifies nucleolar RNA and proteins. This mechanism provides a general approach for drugging the increasing number of cellular processes linked to biomolecular condensates.