Current research in form understanding predominantly relies on large
pre-trained language models, necessitating extensive data for pre-training.
However, the importance of layout structure (i.e., the spatial relationships
among entity blocks in a visually rich document) to relation extraction
has been overlooked. In this paper, we propose REgion-Aware Relation Extraction
(RE2) that leverages region-level spatial structure among the entity blocks
to improve their relation prediction. We design an edge-aware graph attention
network to learn the interaction between entities while considering their
spatial relationship defined by their region-level representations. We also
introduce a constraint objective to regularize the model towards consistency
with the inherent constraints of the relation extraction task. Extensive
experiments across various datasets, languages, and domains demonstrate the
superiority of our proposed approach.

Comment: NAACL 202
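The edge-aware attention described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: node features stand in for entity-block representations, and per-pair edge features stand in for the region-level spatial relations; all shapes, names, and weight matrices (`Wq`, `Wk`, `We`) are hypothetical. The idea shown is that the attention score between two entities depends not only on their features but also on an embedding of their spatial relationship.

```python
import numpy as np

def edge_aware_attention(H, E, Wq, Wk, We):
    """Single-head attention where edge features modulate the keys.

    H:  (n, d)  node (entity-block) features -- hypothetical inputs
    E:  (n, n, de) edge features encoding spatial relations between blocks
    Wq, Wk: (d, d) query/key projections; We: (de, d) edge projection
    """
    Q = H @ Wq                      # (n, d) queries
    K = H @ Wk                      # (n, d) keys
    Ek = E @ We                     # (n, n, d) edge-dependent key offsets
    # Score(i, j) = Q_i . (K_j + Ek_ij) / sqrt(d)
    scores = np.einsum('id,ijd->ij', Q, K[None, :, :] + Ek)
    scores /= np.sqrt(Q.shape[-1])
    # Row-wise softmax over neighbors
    A = np.exp(scores - scores.max(axis=-1, keepdims=True))
    A /= A.sum(axis=-1, keepdims=True)
    return A @ H                    # updated node representations
```

Because the edge term `Ek` enters the score, two entity pairs with identical features but different spatial layouts receive different attention weights, which is the property the paper attributes to its region-level representations.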