Diffusion models have revolutionized text-to-image generation, but their
real-world applications are hampered by the extensive time needed for hundreds
of diffusion steps. Although progressive distillation has been proposed to
speed up diffusion sampling to 2-8 steps, it still falls short in one-step
generation, and necessitates training multiple student models, which is highly
parameter-intensive and time-consuming. To overcome these limitations, we
introduce High-frequency-Promoting Adaptation (HiPA), a parameter-efficient
approach to enable one-step text-to-image diffusion. Grounded in the insight
that high-frequency information is essential but severely lacking in one-step
diffusion, HiPA focuses on training one-step, low-rank adaptors to specifically
enhance the under-represented high-frequency abilities of advanced diffusion
models. The learned adaptors empower these diffusion models to generate
high-quality images in just a single step. Compared with progressive
distillation, HiPA achieves much better performance in one-step text-to-image
generation (37.3 → 23.8 in FID-5k on MS-COCO 2017) and a 28.6x
training speed-up (108.8 → 3.8 A100 GPU days), while requiring only 0.04%
of the training parameters (7,740 million → 3.3 million). We also
demonstrate HiPA's effectiveness in text-guided image editing, inpainting and
super-resolution tasks, where our adapted models consistently deliver
high-quality outputs in just one diffusion step. The source code will be
released.
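
The abstract's central mechanism, low-rank adaptors trained to restore the high-frequency detail that one-step sampling lacks, can be illustrated with a short PyTorch sketch. This is a minimal illustration under stated assumptions: the names LoRALinear, high_frequency, and hipa_step are hypothetical, and the FFT-based high-pass loss, the 0.25 cutoff, and the loss weighting are illustrative choices rather than the paper's actual objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """A frozen base linear layer plus a trainable low-rank residual."""
    def __init__(self, base: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)        # keep the pre-trained weights frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)     # adaptor starts as a no-op
        self.scale = scale

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

def high_frequency(x: torch.Tensor, cutoff: float = 0.25) -> torch.Tensor:
    """Keep only spatial frequencies above `cutoff` (fraction of Nyquist)."""
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    h, w = x.shape[-2:]
    yy = torch.linspace(-1, 1, h, device=x.device).view(-1, 1)
    xx = torch.linspace(-1, 1, w, device=x.device).view(1, -1)
    mask = ((yy ** 2 + xx ** 2).sqrt() > cutoff).to(spec.dtype)
    spec = spec * mask                     # zero out the low-frequency band
    return torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real

def hipa_step(student, target_image, noisy_latent, optimizer, hf_weight=1.0):
    """One hypothetical adaptor update: match a reference image and
    additionally penalise the high-frequency residual that one-step
    sampling tends to miss."""
    pred = student(noisy_latent)           # single-step prediction
    loss = F.mse_loss(pred, target_image)
    loss = loss + hf_weight * F.mse_loss(
        high_frequency(pred), high_frequency(target_image))
    optimizer.zero_grad()
    loss.backward()                        # gradients reach only the LoRA params
    optimizer.step()
    return loss.item()
```

In practice, adaptors of this kind are typically attached to the projection layers of a frozen pre-trained text-to-image model, which is consistent with the small trainable-parameter count (3.3 million) quoted above, though the exact placement is not specified in this section.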