Sharing knowledge between information extraction tasks has always been a
challenge due to the diverse data formats and task variations. Meanwhile, this
divergence leads to information waste and increases difficulties in building
complex applications in real scenarios. Recent studies often formulate IE tasks
as a triplet extraction problem. However, such a paradigm does not support
multi-span and n-ary extraction, leading to weak versatility. To this end, we
reorganize IE problems into unified multi-slot tuples and propose a universal
framework for various IE tasks, namely Mirror. Specifically, we recast existing
IE tasks as a multi-span cyclic graph extraction problem and devise a
non-autoregressive graph decoding algorithm to extract all spans in a single
step. It is worth noting that this graph structure is incredibly versatile, and
it supports not only complex IE tasks, but also machine reading comprehension
and classification tasks. We manually construct a corpus containing 57 datasets
for model pretraining, and conduct experiments on 30 datasets across 8
downstream tasks. The experimental results demonstrate that our model has
decent compatibility and outperforms or reaches competitive performance with
SOTA systems under few-shot and zero-shot settings. The code, model weights,
and pretraining corpus are available at https://github.com/Spico197/Mirror .Comment: Accepted to EMNLP23 main conferenc