4 research outputs found

    Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models

    Recent advances in diffusion models enable many powerful instruments for image editing. One of these instruments is text-driven image manipulation: editing semantic attributes of an image according to the provided text description. Existing diffusion-based methods already achieve high-quality image manipulations for a broad range of text prompts. However, in practice, these methods require high computation costs even with a high-end GPU. This greatly limits potential real-world applications of diffusion-based image editing, especially when running on user devices. In this paper, we address the efficiency of recent text-driven editing methods based on unconditional diffusion models and develop a novel algorithm that learns image manipulations 4.5-10 times faster and applies them 8 times faster. We carefully evaluate the visual quality and expressiveness of our approach on multiple datasets using human annotators. Our experiments demonstrate that our algorithm achieves the quality of much more expensive methods. Finally, we show that our approach can adapt the pretrained model to the user-specified image and text description on the fly in just 4 seconds. In this setting, we notice that more compact unconditional diffusion models can be considered a rational alternative to the popular text-conditional counterparts.

    Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models

    Knowledge distillation methods have recently been shown to be a promising direction for speeding up the synthesis of large-scale diffusion models by requiring only a few inference steps. While several powerful distillation methods were recently proposed, the overall quality of student samples is typically lower than that of the teacher ones, which hinders their practical usage. In this work, we investigate the relative quality of samples produced by the teacher text-to-image diffusion model and its distilled student version. As our main empirical finding, we discover that a noticeable portion of student samples exhibit superior fidelity compared to the teacher ones, despite the "approximate" nature of the student. Based on this finding, we propose an adaptive collaboration between student and teacher diffusion models for effective text-to-image synthesis. Specifically, the distilled model produces the initial sample, and then an oracle decides whether it needs further improvement with a slow teacher model. Extensive experiments demonstrate that the designed pipeline surpasses state-of-the-art text-to-image alternatives for various inference budgets in terms of human preference. Furthermore, the proposed approach can be naturally used in popular applications such as text-guided image editing and controllable generation.
    Comment: CVPR2024 camera ready v
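The student-then-oracle-then-teacher flow described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `adaptive_generate`, the callables `student`, `teacher_refine` and `oracle_score`, and the idea of comparing a scalar score against a `threshold` are all assumptions made here for the example.

```python
from typing import Callable, Tuple, TypeVar

Sample = TypeVar("Sample")


def adaptive_generate(
    prompt: str,
    student: Callable[[str], Sample],                 # hypothetical fast distilled model
    teacher_refine: Callable[[str, Sample], Sample],  # hypothetical slow teacher refinement
    oracle_score: Callable[[str, Sample], float],     # hypothetical quality estimator
    threshold: float,                                 # assumed acceptance cutoff
) -> Tuple[Sample, bool]:
    """Return (sample, teacher_was_used) for one prompt."""
    # Step 1: the cheap student always produces the initial sample.
    draft = student(prompt)
    # Step 2: the oracle decides whether the draft is good enough.
    if oracle_score(prompt, draft) >= threshold:
        return draft, False
    # Step 3: otherwise spend extra compute on the teacher.
    return teacher_refine(prompt, draft), True
```

Under these assumptions, the average cost per prompt depends on how often the oracle rejects the student draft, which is what lets such a pipeline trade quality against an inference budget.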

    Surrogate-Assisted Evolutionary Generative Design Of Breakwaters Using Deep Convolutional Networks

    In this paper, a multi-objective evolutionary surrogate-assisted approach for the fast and effective generative design of coastal breakwaters is proposed. To approximate the computationally expensive objective functions, a deep convolutional neural network is used as a surrogate model. This model allows optimizing a configuration of breakwaters with a varying number of structures and segments. In addition to the surrogate, an assistant model was developed to estimate the confidence of predictions. The proposed approach was tested on a synthetic water area; the SWAN model was used to calculate the wave heights. The experimental results confirm that the proposed approach obtains more effective solutions (less expensive with better protective properties) than non-surrogate approaches in the same amount of time.

    Generative Design of Physical Objects using Modular Framework

    In recent years, generative design techniques have become firmly established in numerous applied fields, especially in engineering. These methods are growing rapidly owing to their promising outlook. However, existing approaches are limited by the specificity of the problem under consideration. In addition, they do not provide the desired flexibility. In this paper, we formulate a general approach to an arbitrary generative design problem and propose a novel framework called GEFEST (Generative Evolution For Encoded STructure) on its basis. The developed approach is based on three general principles: sampling, estimation and optimization. This leaves the method free to be adjusted to a particular generative design problem and therefore makes it possible to construct the most suitable one. A series of experimental studies was conducted to confirm the effectiveness of the GEFEST framework. It involved synthetic and real-world cases (coastal engineering, microfluidics, thermodynamics and oil field planning). The flexible structure of GEFEST makes it possible to obtain results that surpass baseline solutions.
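The three principles named in this abstract (sampling, estimation, optimization) can be illustrated with a generic evolutionary loop. The sketch below does not use or reproduce the GEFEST API: the function `generative_design`, its parameters, and the simple elitist mutate-and-select scheme are assumptions chosen only to show how the three roles can be kept as separate, swappable pieces.

```python
import random
from typing import Callable, List, TypeVar

Design = TypeVar("Design")


def generative_design(
    sample: Callable[[random.Random], Design],          # sampling: draw a random design
    estimate: Callable[[Design], float],                # estimation: cost, lower is better
    mutate: Callable[[Design, random.Random], Design],  # optimization: variation operator
    population_size: int,
    generations: int,
    rng: random.Random,
) -> Design:
    """Return the lowest-cost design found (hypothetical helper)."""
    # Sampling: build the initial population.
    population: List[Design] = [sample(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Optimization: create one mutated offspring per parent.
        offspring = [mutate(d, rng) for d in population]
        # Estimation: rank parents and offspring, keep the best (elitist selection).
        population = sorted(population + offspring, key=estimate)[:population_size]
    return population[0]


# Toy usage: find x close to 3 on the real line.
best = generative_design(
    sample=lambda r: r.uniform(-10.0, 10.0),
    estimate=lambda x: (x - 3.0) ** 2,
    mutate=lambda x, r: x + r.gauss(0.0, 0.5),
    population_size=8,
    generations=30,
    rng=random.Random(0),
)
```

Because each role is passed in as a callable, a different domain (for example a polygon-based design with a simulator as the estimator) would only need to replace those three functions, which is the kind of flexibility the abstract argues for.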