4 research outputs found
Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models
Recent advances in diffusion models enable many powerful tools for
image editing. One of these tools is text-driven image manipulation:
editing the semantic attributes of an image according to a provided text
description.
Existing diffusion-based methods already achieve high-quality image
manipulations for a broad range of text prompts. However, in practice, these
methods incur high computational costs even on a high-end GPU. This greatly
limits potential real-world applications of diffusion-based image editing,
especially when running on user devices.
In this paper, we address the efficiency of recent text-driven editing
methods based on unconditional diffusion models and develop a novel algorithm
that learns image manipulations 4.5-10 times faster and applies them 8 times
faster. We carefully evaluate the visual quality and expressiveness of our
approach on multiple datasets using human annotators. Our experiments
demonstrate that our algorithm matches the quality of much more expensive
methods. Finally, we show that our approach can adapt the pretrained model to
a user-specified image and text description on the fly in just 4 seconds. In
this setting, we observe that more compact unconditional diffusion models can
be considered a rational alternative to their popular text-conditional
counterparts.
Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models
Knowledge distillation has recently been shown to be a promising
direction for speeding up the synthesis of large-scale diffusion models by
requiring only a few inference steps. While several powerful distillation
methods have recently been proposed, the overall quality of student samples is
typically lower than that of the teacher, which hinders their practical usage. In this
work, we investigate the relative quality of samples produced by the teacher
text-to-image diffusion model and its distilled student version. As our main
empirical finding, we discover that a noticeable portion of student samples
exhibit superior fidelity compared to the teacher ones, despite the
"approximate" nature of the student. Based on this finding, we propose an
adaptive collaboration between student and teacher diffusion models for
effective text-to-image synthesis. Specifically, the distilled model produces
the initial sample, and then an oracle decides whether it needs further
improvements with a slow teacher model. Extensive experiments demonstrate that
the designed pipeline surpasses state-of-the-art text-to-image alternatives for
various inference budgets in terms of human preference. Furthermore, the
proposed approach can be naturally used in popular applications such as
text-guided image editing and controllable generation. Comment: CVPR 2024 camera-ready version
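The adaptive student-teacher routing described above can be sketched as follows. Note that the scoring oracle, the model stubs, and the threshold below are illustrative placeholders, not the paper's actual implementation:

```python
# Sketch of adaptive teacher-student sampling: a fast distilled student
# proposes a sample, and an oracle routes hard cases to the slow teacher.
# All components below are toy placeholders for illustration only.

def student_generate(prompt: str) -> dict:
    # Few-step distilled model (cheap). The "score" stands in for an
    # image-quality estimate; here it is a deterministic toy value.
    return {"prompt": prompt, "source": "student", "score": len(prompt) % 10 / 10}

def oracle_accepts(sample: dict, threshold: float = 0.5) -> bool:
    # Quality oracle: accept the student sample if it looks good enough.
    return sample["score"] >= threshold

def teacher_refine(sample: dict) -> dict:
    # Many-step teacher model (expensive): refines the rejected sample.
    return {**sample, "source": "teacher"}

def adaptive_generate(prompt: str, threshold: float = 0.5) -> dict:
    sample = student_generate(prompt)
    if oracle_accepts(sample, threshold):
        return sample              # cheap path: keep the student sample
    return teacher_refine(sample)  # expensive path: teacher improves it
```

Under such routing, the average inference cost interpolates between the student's and the teacher's, controlled by the oracle threshold.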
Surrogate-Assisted Evolutionary Generative Design Of Breakwaters Using Deep Convolutional Networks
In this paper, a multi-objective evolutionary surrogate-assisted approach for
the fast and effective generative design of coastal breakwaters is proposed. To
approximate the computationally expensive objective functions, a deep
convolutional neural network is used as a surrogate model. This model allows
optimizing breakwater configurations with varying numbers of structures and
segments. In addition to the surrogate, an assistant model was developed to
estimate the confidence of its predictions. The proposed approach was tested on
a synthetic water area; the SWAN model was used to calculate the wave heights.
The experimental results confirm that the proposed approach yields more
effective (less expensive, with better protective properties) solutions than
non-surrogate approaches in the same time.
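A minimal sketch of surrogate-assisted evaluation inside an evolutionary loop is given below. The toy objective, the noisy surrogate, and the confidence threshold are assumptions for illustration, not the paper's actual CNN or SWAN setup:

```python
import random

def swan_simulate(cand):
    # Stand-in for the expensive SWAN wave simulation; here the toy
    # objective is simply the squared norm of the design vector.
    return sum(x * x for x in cand)

def surrogate_predict(cand):
    # Stand-in for the CNN surrogate: a cheap noisy approximation,
    # plus a confidence value from the assistant model.
    return swan_simulate(cand) + random.uniform(-0.1, 0.1), random.random()

def evaluate(cand, conf_threshold=0.3):
    pred, conf = surrogate_predict(cand)
    if conf < conf_threshold:       # assistant flags low confidence ...
        return swan_simulate(cand)  # ... so fall back to the real simulator
    return pred

def evolve(pop_size=8, dim=4, generations=30):
    # Simple (mu + lambda) evolution using the surrogate-backed evaluator.
    pop = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        children = [[x + random.gauss(0, 0.1) for x in p] for p in pop]
        pop = sorted(pop + children, key=evaluate)[:pop_size]
    return min(pop, key=swan_simulate)
```

The design point is that most fitness evaluations hit the cheap surrogate, and the expensive simulator is invoked only where the assistant model distrusts the surrogate's prediction.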
Generative Design of Physical Objects using Modular Framework
In recent years, generative design techniques have become firmly established
in numerous applied fields, especially in engineering. These methods are
growing rapidly owing to their promising outlook. However, existing
approaches are limited by the specificity of the problem under consideration
and do not provide the desired flexibility. In this paper, we formulate a
general approach to an arbitrary generative design problem and propose a novel
framework called GEFEST (Generative Evolution For Encoded STructure) on its
basis. The developed approach rests on three general principles: sampling,
estimation, and optimization. This ensures the freedom to adjust the method to
a particular generative design problem and therefore makes it possible to
construct the most suitable one. A series of experimental studies was conducted
to confirm the effectiveness of the GEFEST framework, involving synthetic and
real-world cases (coastal engineering, microfluidics, thermodynamics, and oil
field planning). The flexible structure of GEFEST makes it possible to obtain
results that surpass baseline solutions.
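The three principles (sampling, estimation, optimization) could be wired together roughly as below; the function names and the toy objective are hypothetical, not GEFEST's actual API:

```python
import random

def sample(n, dim=2):
    # Sampling: propose encoded structures (toy: flat coordinate vectors).
    return [[random.uniform(0, 1) for _ in range(dim)] for _ in range(n)]

def estimate(structure):
    # Estimation: score a structure. A real problem would plug in a
    # physics model here; toy objective: squared distance from (0.5, 0.5).
    return sum((x - 0.5) ** 2 for x in structure)

def optimize(population, steps=50):
    # Optimization: simple (mu + 1) evolutionary refinement.
    for _ in range(steps):
        parent = min(population, key=estimate)
        child = [x + random.gauss(0, 0.05) for x in parent]
        worst = max(population, key=estimate)
        if estimate(child) < estimate(worst):
            population[population.index(worst)] = child
    return min(population, key=estimate)

best = optimize(sample(10))
```

Because each stage is a separate component, swapping the sampler, estimator, or optimizer adapts the pipeline to a new design problem, which is the kind of modularity the framework claims.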