Topological Sort for Sentence Ordering
Sentence ordering is the task of arranging the sentences of a given text in
the correct order. Recent work using deep neural networks for this task has
framed it as a sequence prediction problem. In this paper, we propose a new
framing of this task as a constraint solving problem and introduce a new
technique to solve it. Additionally, we propose a human evaluation for this
task. The results on both automatic and human metrics across four different
datasets show that this new technique is better at capturing coherence in
documents.
Comment: Will be published at the Proceedings of the 58th Annual Meeting of
the Association for Computational Linguistics (ACL) 202
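As a rough illustration of the constraint-solving framing described in the abstract above, pairwise "sentence a comes before sentence b" constraints can be resolved into a full ordering with a topological sort. The sketch below uses Kahn's algorithm and is not the authors' implementation. The function name `order_sentences` and the hard-coded constraint list are hypothetical; in practice the constraints would presumably come from some learned pairwise model, which is assumed here and not shown.

```python
from collections import deque


def order_sentences(num_sentences, constraints):
    """Order sentence indices with Kahn's topological sort.

    `constraints` is an iterable of (before, after) index pairs.
    Hypothetical helper for illustration only.
    """
    successors = {i: [] for i in range(num_sentences)}
    in_degree = [0] * num_sentences

    # Drop duplicate constraints while keeping their first-seen order.
    for before, after in dict.fromkeys(constraints):
        successors[before].append(after)
        in_degree[after] += 1

    # Start from sentences that nothing is required to precede.
    ready = deque(i for i in range(num_sentences) if in_degree[i] == 0)
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in successors[current]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)

    # If some sentence was never released, the constraints are cyclic.
    if len(order) != num_sentences:
        raise ValueError("constraints contain a cycle")
    return order


# Assumed constraints: sentence 2 precedes 0 and 1, and 0 precedes 1.
print(order_sentences(3, [(2, 0), (0, 1), (2, 1)]))
```

Contradictory constraints form a cycle and cannot be sorted, so this sketch raises an error in that case; a real system would need some policy for resolving such conflicts.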
Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals
The ability to sequence unordered events is an essential skill to comprehend
and reason about real world task procedures, which often requires thorough
understanding of temporal common sense and multimodal information, as these
procedures are often communicated through a combination of texts and images.
Such capability is essential for applications such as sequential task planning
and multi-source instruction summarization. While humans are capable of
reasoning about and sequencing unordered multimodal procedural instructions,
whether current machine learning models have such essential capability is still
an open question. In this work, we benchmark models' capability of reasoning
over and sequencing unordered multimodal instructions by curating datasets from
popular online instructional manuals and collecting comprehensive human
annotations. We find that models not only perform significantly worse than humans
but also seem incapable of efficiently utilizing the multimodal information. To
improve machines' performance on multimodal event sequencing, we propose
sequentiality-aware pretraining techniques that exploit the sequential
alignment properties of both texts and images, resulting in significant
improvements of more than 5%.
Comment: In Proceedings of the Conference of the 60th Annual Meeting of the
Association for Computational Linguistics (ACL), 202