Explicit Contextual Semantics for Text Comprehension
Who did what to whom is a major focus in natural language understanding,
which is precisely the aim of the semantic role labeling (SRL) task. Despite
sharing many processing characteristics and even a common purpose, these two
related tasks have surprisingly never been jointly considered in previous
work. This paper therefore makes the first attempt to let SRL enhance text
comprehension and inference through specifying verbal predicates and their
corresponding semantic roles. Concretely, our embeddings are enhanced with
explicit contextual semantic role labels for more fine-grained semantics. We
show that these salient labels can be conveniently added to existing models
and significantly improve deep learning models on challenging
text comprehension tasks. Extensive experiments on benchmark machine reading
comprehension and inference datasets verify that the proposed semantic learning
helps our system reach a new state of the art over strong baselines that have
been enhanced by well-pretrained language models from the latest progress.

Comment: Proceedings of the 33rd Pacific Asia Conference on Language,
Information and Computation (PACLIC 33)
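The core idea of the abstract above, enriching token representations with explicit SRL labels, can be sketched as a simple concatenation of a word embedding with a label embedding. The vocabularies, dimensions, and random embeddings below are illustrative placeholders; the paper's actual system uses pretrained embeddings and an SRL tagger to produce the label sequence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny vocabularies for illustration only.
word_vocab = {"the": 0, "cat": 1, "chased": 2, "mouse": 3}
srl_vocab = {"O": 0, "B-ARG0": 1, "B-V": 2, "B-ARG1": 3}

WORD_DIM, LABEL_DIM = 8, 4
word_emb = rng.normal(size=(len(word_vocab), WORD_DIM))
label_emb = rng.normal(size=(len(srl_vocab), LABEL_DIM))

def embed_with_srl(tokens, labels):
    """Concatenate each token's word embedding with its SRL label embedding."""
    rows = [np.concatenate([word_emb[word_vocab[t]], label_emb[srl_vocab[l]]])
            for t, l in zip(tokens, labels)]
    return np.stack(rows)  # shape: (seq_len, WORD_DIM + LABEL_DIM)

X = embed_with_srl(["the", "cat", "chased", "the", "mouse"],
                   ["O", "B-ARG0", "B-V", "O", "B-ARG1"])
print(X.shape)  # (5, 12)
```

The concatenated vectors can then feed any existing encoder unchanged, which is what makes the labels "conveniently added to existing models."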
Enhancing Visually-Rich Document Understanding via Layout Structure Modeling
In recent years, the use of multi-modal pre-trained Transformers has led to
significant advancements in visually-rich document understanding. However,
existing models have mainly focused on features such as text and vision while
neglecting the importance of the layout relationships between text nodes. In
this paper, we propose GraphLayoutLM, a novel document understanding model
that leverages the modeling of a layout structure graph to inject document
layout knowledge into the model. GraphLayoutLM utilizes a graph reordering algorithm
to adjust the text sequence based on the graph structure. Additionally, our
model uses a layout-aware multi-head self-attention layer to learn document
layout knowledge. The proposed model enables the understanding of the spatial
arrangement of text elements, improving document comprehension. We evaluate our
model on various benchmarks, including FUNSD, XFUND and CORD, and achieve
state-of-the-art results on these datasets. Our experimental results
demonstrate that our proposed method provides a significant improvement over
existing approaches and showcases the importance of incorporating layout
information into document understanding models. We also conduct an ablation
study to investigate the contribution of each component of our model. The
results show that both the graph reordering algorithm and the layout-aware
multi-head self-attention layer play a crucial role in achieving the best
performance.
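A layout-aware self-attention layer of the kind described above can be sketched as standard scaled dot-product attention plus an additive bias term that encodes the spatial relation between pairs of text nodes. This is a minimal single-head sketch; the dimensions and the hand-written bias matrix (favouring same-row neighbours) are illustrative assumptions, not the paper's learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layout_aware_attention(Q, K, V, rel_bias):
    """Single-head scaled dot-product attention with an additive bias.

    rel_bias[i, j] is a scalar for the layout relation between text
    nodes i and j (e.g. same-row, left-of, above); in the real model
    such biases would be learned per relation type."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d) + rel_bias  # (n, n)
    return softmax(scores) @ V

# Four text nodes; nodes 0-1 and 2-3 sit on the same row (hypothetical).
n, d = 4, 8
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
rel_bias = np.array([[0.0, 2.0, -1.0, -1.0],
                     [2.0, 0.0, -1.0, -1.0],
                     [-1.0, -1.0, 0.0, 2.0],
                     [-1.0, -1.0, 2.0, 0.0]])
out = layout_aware_attention(Q, K, V, rel_bias)
print(out.shape)  # (4, 8)
```

Because the bias is added before the softmax, spatially related nodes attend to each other more strongly while the mechanism otherwise remains ordinary self-attention.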
- …