Sketching is a universal mode of visual expression, but synthesizing realistic images from abstract sketches is highly challenging. Conventionally, a deep learning model for sketch-to-image synthesis must cope with distorted input sketches that lack visual detail, and requires collecting large-scale sketch-image datasets. We present the first study of this task based on diffusion models. Our model matches the input sketch through cross-domain constraints and uses a classifier to guide image synthesis more precisely. Extensive experiments confirm that our method not only remains faithful to users' input sketches but also preserves the diversity and imaginativeness of the synthesized images. Our model outperforms GAN-based methods in both generation quality and human evaluation, and does not rely on massive sketch-image datasets. Additionally, we present applications of our method to image editing and interpolation.
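The classifier guidance mentioned above can be illustrated with a toy example. The sketch below is not the paper's model; it is a minimal, hypothetical 1-D diffusion sampler (all function names and constants are assumptions) showing the core idea: at each reverse step, the denoiser's predicted mean is shifted along the gradient of a classifier's log-probability, steering samples toward the desired condition.

```python
import numpy as np

def denoiser_mean(x, t):
    """Stand-in unconditional denoiser: contracts x toward 0 as t decreases."""
    return x * (1.0 - 1.0 / t)

def classifier_log_prob_grad(x):
    """Gradient of log p(y=positive | x) for a logistic classifier sigmoid(x):
    d/dx log sigmoid(x) = 1 - sigmoid(x)."""
    return 1.0 - 1.0 / (1.0 + np.exp(-x))

def guided_sample(steps=50, guidance_scale=5.0, seed=0):
    """Reverse diffusion with classifier guidance (toy 1-D version)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal()  # start from pure noise
    for t in range(steps, 0, -1):
        mu = denoiser_mean(x, t + 1)
        sigma = np.sqrt(t / steps) * 0.1  # noise level shrinks over time
        # Classifier guidance: shift the mean along the classifier gradient,
        # scaled by the step variance and a user-chosen guidance strength.
        mu = mu + guidance_scale * sigma**2 * classifier_log_prob_grad(x)
        x = mu + sigma * rng.standard_normal()
    return x
```

With the same random seed, a guided trajectory ends strictly above its unguided counterpart (the classifier here always favors larger `x`), mirroring how guidance pulls generated images toward the target class without retraining the diffusion model.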