On a shutter press, modern handheld cameras capture multiple images in rapid
succession and merge them to generate a single image. However, individual
frames in a burst are misaligned due to inevitable motions and contain multiple
degradations. The challenge is to properly align the successive image shots and
merge their complementary information to achieve high-quality outputs. To this
end, we propose Burstormer: a novel transformer-based architecture
for burst image restoration and enhancement. In comparison to existing works,
our approach exploits multi-scale local and non-local features to achieve
improved alignment and feature fusion. Our key idea is to enable inter-frame
communication in the burst neighborhoods for information aggregation and
progressive fusion while modeling the burst-wide context. However, the input
burst frames need to be properly aligned before fusing their information.
Therefore, we propose an enhanced deformable alignment module for aligning
burst features with respect to the reference frame. Unlike existing methods,
the proposed alignment module not only aligns burst features but also exchanges
feature information and maintains focused communication with the reference
frame through the proposed reference-based feature enrichment mechanism, which
facilitates handling complex motions. After multi-level alignment and
enrichment, we re-emphasize inter-frame communication within the burst using a
cyclic burst sampling module. Finally, the inter-frame information is
aggregated using the proposed burst feature fusion module followed by
progressive upsampling. Our Burstormer outperforms state-of-the-art methods on
burst super-resolution, burst denoising, and burst low-light enhancement. Our
code and pretrained models are available at
https://github.com/akshaydudhane16/Burstormer
Comment: Accepted at CVPR 202